After following Laurie Kirk down a rabbit hole on subnormal numbers in the IEEE 754 float specification, I stumbled upon other interesting properties of floating-point numbers, specifically how NaNs (Not a Number) are represented in binary. After more than 10 years of scientific computing and data science, I thought there was nothing about floats that could surprise me, but oh, was I wrong. Let’s see if I can surprise you. I’ve built the computer-science equivalent of a magic trick to showcase these properties.

The magic trick

The trick works in two stages:

  1. You choose a phrase of your liking. With a special Python function, you can convert it into a numpy array of NaNs. It’s a normal array. It’s normal NaNs. Your phrase is nowhere to be seen.
  2. You send the numpy array to an API endpoint at https://magicfloat.sauerburger.io/unravel. Using advanced magic (knowledge of IEEE 754), I can unravel your secrets by looking at the array of NaNs.

Step one: Enchanting your phrase

import numpy as np

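def enchant(phrase: str) -> np.ndarray:
    # Each UTF-8 byte is followed by the fixed bytes ff 80 7f, so every
    # 4-byte group reads as a float32 NaN on a little-endian machine.
    return np.frombuffer(b"".join([
        bytes([x]) + b"\xff\x80\x7f" for x in phrase.encode("utf-8")
    ]), dtype=np.float32)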

If you call that with "Computers are fun!", you get a numpy array of floats with no sign of the phrase. It seems the message is gone.

>>> box = enchant("Computers are fun!")
>>> box
array([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan], dtype=float32)

>>> box.shape
(18,)

>>> box.dtype
dtype('float32')
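
As a quick sanity check, the sketch below confirms that every element really is a NaN. It uses a variant of enchant with the byte order pinned to little-endian ("<f4"), which is my addition so the snippet behaves the same on any platform.

```python
import numpy as np

def enchant(phrase: str) -> np.ndarray:
    # Variant of enchant from above, with the byte order pinned to
    # little-endian ("<f4") so the result does not depend on the platform.
    raw = b"".join(bytes([x]) + b"\xff\x80\x7f" for x in phrase.encode("utf-8"))
    return np.frombuffer(raw, dtype="<f4")

box = enchant("Computers are fun!")

all_nan = bool(np.isnan(box).all())  # True if every element is a NaN
print(all_nan, box.size)
```

One float per UTF-8 byte: the phrase has 18 characters, all ASCII, hence 18 NaNs.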

Step two: Open the magic NaN array

I’m providing an API endpoint at https://magicfloat.sauerburger.io/unravel that takes the binary version of the numpy array and responds with your original phrase. The following function does the necessary encoding and request handling.

import requests

def unravel(box: np.ndarray) -> str:
    if box.dtype != np.float32:
        raise ValueError("Magic box must be float32.")
    # Send the raw bytes of the array, so the exact NaN bit patterns are kept.
    response = requests.post(
        "https://magicfloat.sauerburger.io/unravel",
        data=box.tobytes(),
        timeout=10,  # don't hang forever if the endpoint is unreachable
    )
    if not response.ok:
        raise RuntimeError("The planets don't seem to align: %s" % response.text)
    return response.text

If we continue the example from above, we get (drum roll):

>>> unravel(box)
'Computers are fun!'

How does it work?

Floating-point numbers are represented using three components,

  • the sign of the number,
  • the exponent used with base 2, and
  • the fractional part of the number, the mantissa.

For a 32-bit float, the bits are arranged as follows, from the most significant bit to the least significant. The order of the bytes in memory depends on the endianness of your platform.

x    | 1 1 1 1 1 1 1 1          | x x x x x x x x x x x x x x x x x x x x x x x
Sign | Biased exponent (8 bits) | Mantissa (23 bits)
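
To make the layout concrete, here is a minimal sketch that splits a float32 into its three bit fields. The helper name float32_fields is mine, not part of numpy. It reinterprets the float as a 32-bit unsigned integer and then uses shifts and masks.

```python
import numpy as np

def float32_fields(value: float) -> tuple:
    """Split a float32 into (sign, biased exponent, mantissa) bit fields."""
    # Reinterpret the 4 bytes of the float as one unsigned 32-bit integer.
    bits = int(np.array([value], dtype=np.float32).view(np.uint32)[0])
    sign = bits >> 31               # top bit
    exponent = (bits >> 23) & 0xFF  # next 8 bits, biased by 127
    mantissa = bits & 0x7FFFFF      # lowest 23 bits
    return sign, exponent, mantissa

print(float32_fields(1.0))   # (0, 127, 0)
print(float32_fields(-2.5))  # (1, 128, 2097152)
```

Since -2.5 is -1.25 × 2^1, the biased exponent is 127 + 1 = 128 and the mantissa stores the fraction 0.25 as 0.25 × 2^23 = 2097152.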

A few combinations of bits have a special meaning, such as +inf, -inf, and NaN. When the exponent is all-ones (as shown in the chart), it represents one of the aforementioned three cases.

Sign | Biased exponent     | Mantissa                  | Special meaning
0    | all ones: 1111 1111 | all zero                  | +inf
1    | all ones: 1111 1111 | all zero                  | -inf
any  | all ones: 1111 1111 | at least one bit not zero | NaN
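
The rows of this table can be checked by going the other way, building floats directly from bit patterns. This is a small sketch; from_bits is a hypothetical helper name, and the two NaN patterns are arbitrary picks.

```python
import numpy as np

def from_bits(bits: int) -> np.float32:
    """Reinterpret a 32-bit integer pattern as a float32."""
    return np.array([bits], dtype=np.uint32).view(np.float32)[0]

print(from_bits(0x7F800000))  # sign 0, exponent all ones, mantissa zero: +inf
print(from_bits(0xFF800000))  # sign 1, exponent all ones, mantissa zero: -inf
print(np.isnan(from_bits(0x7F800001)))  # non-zero mantissa: a NaN
print(np.isnan(from_bits(0xFFC12345)))  # sign bit set, still a NaN
```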

We observe that +inf and -inf each have a unique binary representation. However, for NaN, we have 2^24 - 2 possible binary representations: 2^23 - 1 non-zero mantissa values for each of the two signs. For my little magic trick, I pack one UTF-8 encoded byte in each 32-bit float number. I invite you to discover the details yourself.
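
Spoiler for the impatient: below is one way the unravelling could be done locally. It is a guess based on the bytes that enchant writes, not the actual code behind the endpoint. unravel_locally is a hypothetical name, and the enchant variant pins the byte order to little-endian ("<f4").

```python
import numpy as np

def enchant(phrase: str) -> np.ndarray:
    # Variant of enchant from above, with the byte order pinned to
    # little-endian ("<f4") so the result does not depend on the platform.
    raw = b"".join(bytes([x]) + b"\xff\x80\x7f" for x in phrase.encode("utf-8"))
    return np.frombuffer(raw, dtype="<f4")

def unravel_locally(box: np.ndarray) -> str:
    """Hypothetical local decoder; a guess, not the endpoint's actual code."""
    # Reinterpret the raw bytes as little-endian 32-bit unsigned integers.
    bits = np.frombuffer(box.tobytes(), dtype="<u4")
    # The lowest 8 mantissa bits of each NaN carry one UTF-8 byte.
    payload = (bits & 0xFF).astype(np.uint8)
    return payload.tobytes().decode("utf-8")

box = enchant("Computers are fun!")
first = int(np.frombuffer(box.tobytes(), dtype="<u4")[0])
print(hex(first))            # 0x7f80ff43
print(unravel_locally(box))  # Computers are fun!
```

The first pattern, 0x7f80ff43, has sign 0, an all-ones exponent, and the mantissa 0x00ff43. Its lowest byte, 0x43, is the letter "C".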