GFloat Basics
This notebook shows how to use decode_float to explore the properties of some float formats.
# Import packages
from pandas import DataFrame
import numpy as np
from gfloat import decode_float
from gfloat.formats import *
List all the values in a format
The first example shows how to list all values in a given format. We will choose the OCP E5M2 format.
The object format_info_ocp_e5m2 is from the gfloat.formats package, and describes the characteristics of that format:
format_info_ocp_e5m2
FormatInfo(name='ocp_e5m2', k=8, precision=3, emax=15, has_nz=True, has_infs=True, num_high_nans=3, has_subnormals=True, is_signed=True, is_twos_complement=False)
We shall use the format to decode all code points from 0 to 255, and gather them in a pandas DataFrame.
Note that decode_float returns a lot more than just the value: it also splits out the exponent, significand, and sign, and returns the FloatClass, which allows us to distinguish normal and subnormal numbers, as well as zero, infinity, and NaN.
fmt = format_info_ocp_e5m2
vals = [decode_float(fmt, i) for i in range(256)]
DataFrame(vals).set_index("code")
code | fval | exp | expval | significand | fsignificand | signbit | fclass |
---|---|---|---|---|---|---|---|
0 | 0.000000e+00 | 0 | -14 | 0 | 0.00 | 0 | FloatClass.ZERO |
1 | 1.525879e-05 | 0 | -14 | 1 | 0.25 | 0 | FloatClass.SUBNORMAL |
2 | 3.051758e-05 | 0 | -14 | 2 | 0.50 | 0 | FloatClass.SUBNORMAL |
3 | 4.577637e-05 | 0 | -14 | 3 | 0.75 | 0 | FloatClass.SUBNORMAL |
4 | 6.103516e-05 | 1 | -14 | 0 | 1.00 | 0 | FloatClass.NORMAL |
... | ... | ... | ... | ... | ... | ... | ... |
251 | -5.734400e+04 | 30 | 15 | 3 | 1.75 | 1 | FloatClass.NORMAL |
252 | -inf | 31 | 16 | 0 | 1.00 | 1 | FloatClass.INFINITE |
253 | NaN | 31 | 16 | 1 | 1.25 | 1 | FloatClass.NAN |
254 | NaN | 31 | 16 | 2 | 1.50 | 1 | FloatClass.NAN |
255 | NaN | 31 | 16 | 3 | 1.75 | 1 | FloatClass.NAN |
256 rows × 7 columns
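To make the columns in the table above concrete, here is a minimal hand-written sketch of the same decoding for E5M2 only, using nothing but the standard library. It is not how gfloat implements decode_float; the function name decode_e5m2 is hypothetical, and the field layout (1 sign bit, 5 exponent bits, 2 significand bits, bias 15) is assumed from the format description shown earlier.

```python
import math


def decode_e5m2(code: int) -> float:
    """Decode an 8-bit OCP E5M2 code point to a Python float (illustrative only)."""
    if not 0 <= code <= 0xFF:
        raise ValueError("code must fit in 8 bits")

    signbit = code >> 7          # 1 sign bit
    exp = (code >> 2) & 0x1F     # 5 exponent bits
    significand = code & 0x3     # 2 trailing significand bits
    sign = -1.0 if signbit else 1.0
    bias = 15                    # assumed exponent bias for E5M2

    if exp == 0x1F:
        # All-ones exponent: infinity if the significand is zero, NaN otherwise
        return sign * math.inf if significand == 0 else math.nan
    if exp == 0:
        # Zero and subnormals: no implicit leading 1, exponent fixed at 1 - bias
        return sign * (significand / 4) * 2.0 ** (1 - bias)
    # Normal numbers: implicit leading 1
    return sign * (1 + significand / 4) * 2.0 ** (exp - bias)


# Print a few of the code points that appear in the table
for code in (0, 1, 4, 251, 252, 253):
    print(code, decode_e5m2(code))
```

The subnormal branch corresponds to the rows where exp is 0 but expval is -14: the exponent is pinned at 1 - bias and the leading 1 is dropped, which is why fsignificand is below 1.0 for those rows.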
Additional format info: special values, min, max, dynamic range
In addition, FormatInfo can tell us about other characteristics of each format.
To reproduce some of the OCP spec’s tables 1 and 2:
def compute_dynamic_range(fi):
    return np.log2(fi.max / fi.smallest)


for prop, probe in (
    ("Format ", lambda fi: fi.name.replace("format_info_", "")),
    ("Max exponent (emax) ", lambda fi: fi.emax),
    ("Exponent bias ", lambda fi: fi.expBias),
    ("Infinities ", lambda fi: 2 * int(fi.has_infs)),
    ("Number of NaNs ", lambda fi: fi.num_nans),
    ("Number of zeros ", lambda fi: int(fi.has_zero) + int(fi.has_nz)),
    ("Max normal number ", lambda fi: fi.max),
    ("Min normal number ", lambda fi: fi.smallest_normal),
    ("Min subnormal number ", lambda fi: fi.smallest_subnormal),
    ("Dynamic range (binades)", lambda fi: round(compute_dynamic_range(fi))),
):
    print(
        f"{prop} {probe(format_info_ocp_e4m3):<20} {probe(format_info_ocp_e5m2):<20} {probe(format_info_p3109(3))}"
    )
Format                  ocp_e4m3             ocp_e5m2             p3109_p3
Max exponent (emax)     8                    15                   15
Exponent bias           7                    15                   16
Infinities              0                    2                    2
Number of NaNs          2                    6                    1
Number of zeros         2                    2                    1
Max normal number       448.0                57344.0              49152.0
Min normal number       0.015625             6.103515625e-05      3.0517578125e-05
Min subnormal number    0.001953125          1.52587890625e-05    7.62939453125e-06
Dynamic range (binades) 18                   32                   33
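As a quick sanity check on the last row, the dynamic range can be recomputed by hand from the printed maximum and minimum subnormal values. This sketch uses only the standard library and assumes that, for formats with subnormals, fi.smallest equals the minimum subnormal number; the dictionary name printed is just for illustration.

```python
import math

# (max, min subnormal) pairs copied from the printed table
printed = {
    "ocp_e4m3": (448.0, 0.001953125),
    "ocp_e5m2": (57344.0, 1.52587890625e-05),
    "p3109_p3": (49152.0, 7.62939453125e-06),
}

# Dynamic range in binades: log2(largest / smallest), rounded as in the table
binades = {name: round(math.log2(hi / lo)) for name, (hi, lo) in printed.items()}
print(binades)
```

Under that assumption, the rounded results agree with the "Dynamic range (binades)" row: 18, 32, and 33.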
How do subnormals affect dynamic range?
Most, if not all, low-precision formats include subnormal numbers, because they add representable values near zero and extend the dynamic range.
A natural question is “by how much?”. To answer this, we can create a hypothetical new format: a copy of e4m3, but with has_subnormals set to False.
import copy
e4m3_no_subnormals = copy.copy(format_info_ocp_e4m3)
e4m3_no_subnormals.has_subnormals = False
And now compute the dynamic range with and without:
dr_with = compute_dynamic_range(format_info_ocp_e4m3)
dr_without = compute_dynamic_range(e4m3_no_subnormals)
print(f"Dynamic range with subnormals = {dr_with}")
print(f"Dynamic range without subnormals = {dr_without}")
print(f"Ratio = {2**(dr_with - dr_without):.1f}")
Dynamic range with subnormals = 17.807354922057606
Dynamic range without subnormals = 15.637429920615292
Ratio = 4.5
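The printed numbers can be reproduced with plain arithmetic, which also suggests where the factor of 4.5 comes from. The sketch below assumes that, without subnormals, the smallest positive e4m3 value is the code with exponent field 0 and significand field 1 read as a normal number, i.e. 1.125 × 2^-7. That is an inference from the printed output, not something confirmed from the gfloat source.

```python
import math

max_value = 448.0                      # largest finite e4m3 value
smallest_with = 2.0 ** -9              # min subnormal, 0.001953125
# ASSUMPTION: with subnormals disabled, exponent field 0 and significand
# field 1 decode as a normal number, (1 + 1/8) * 2**(0 - 7)
smallest_without = 1.125 * 2.0 ** -7

dr_with = math.log2(max_value / smallest_with)
dr_without = math.log2(max_value / smallest_without)
ratio = smallest_without / smallest_with

print(f"Dynamic range with subnormals = {dr_with}")
print(f"Dynamic range without subnormals = {dr_without}")
print(f"Ratio = {ratio:.1f}")
```

Read this way, subnormals let e4m3 reach values 4.5 times closer to zero, which is about 2.17 extra binades of dynamic range.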