Numbers in Python

Why Understanding Numbers Matters

In programming, numbers do not always behave the same way they do in pure mathematics. Why? Because computers must store numbers in limited memory, using bits.

  • There is a limit to accuracy.
  • There is an upper and lower bound.
  • There is a tradeoff between accuracy and memory usage.

Example 1: 0.1 + 0.2 != 0.3

0.1 + 0.2
# 0.30000000000000004

This happens because 0.1 cannot be represented exactly in binary. That creates a small but important error.
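Because of this, floats should never be compared for exact equality. A small sketch using the standard library's math.isclose, which compares within a tolerance:

```python
import math

print(0.1 + 0.2 == 0.3)              # False: the stored binary values differ
print(math.isclose(0.1 + 0.2, 0.3))  # True: equal within a tolerance
```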

Example 2: Integer Overflow

Python's built-in int never overflows, but fixed-width integer types (such as NumPy's int32 or C's int) do. When you add 1 to the maximum value of a 32-bit integer,

2147483647

the result may become

-2147483648

Why does this happen? Because 2147483647 + 1 is outside the representable range of a signed 32-bit integer, so the value wraps around to the most negative one.
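Since Python's own integers never overflow, we can only simulate the 32-bit wraparound, by masking to 32 bits and reinterpreting the sign bit (a sketch of two's-complement behavior, not how Python itself computes):

```python
def wrap_int32(x):
    """Reduce x to the value a signed 32-bit integer would hold."""
    x &= 0xFFFFFFFF                                    # keep the low 32 bits
    return x - 0x100000000 if x >= 0x80000000 else x   # reinterpret the sign bit

print(wrap_int32(2147483647 + 1))  # -2147483648
```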

Example 3: Using Floats in Financial Applications Is Dangerous

0.1 + 0.1 + 0.1
# 0.30000000000000004

If this error accumulates, it can cause serious problems. That is why decimal types are often used in finance.
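Python's standard library provides such a type in the decimal module; it stores decimal digits exactly, as long as you construct values from strings rather than floats:

```python
from decimal import Decimal

# Construct from strings: Decimal("0.1") is exact, Decimal(0.1) is not.
total = Decimal("0.1") + Decimal("0.1") + Decimal("0.1")
print(total)                    # 0.3
print(total == Decimal("0.3"))  # True
```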

Example 4: float32 Is Preferred in ML

It uses half the memory of float64, and many GPUs are optimized for float32 throughput. In machine learning, small numerical errors are often acceptable.
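You can see float32's reduced precision from pure Python by round-tripping a value through the struct module's 32-bit float format (a sketch; NumPy's float32 loses precision the same way):

```python
import struct

def to_float32(x):
    """Round-trip x through a 32-bit float, discarding the extra precision."""
    return struct.unpack("f", struct.pack("f", x))[0]

print(to_float32(0.1))  # 0.10000000149011612
```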

Example 5: Quantization in LLMs

We can reduce model size through quantization. This changes the number type, for example from float32 or float16 to int8 or even 4-bit values. This is a tradeoff between model accuracy and model size.
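A minimal sketch of the idea behind symmetric int8 quantization, assuming a single per-tensor scale; real LLM quantizers are far more sophisticated, and all names here are my own:

```python
def quantize_int8(weights):
    """Map floats into the signed 8-bit range [-127, 127] with one scale."""
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    return [v * scale for v in q]

q, scale = quantize_int8([0.03, -0.8, 0.51])
print(q)                     # small integers, 1 byte each instead of 4 or 8
print(dequantize(q, scale))  # close to, but not exactly, the originals
```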

This is why it is important to choose the right number type based on your application requirements and hardware constraints.

Numbers Inside a Computer: Bits

Inside a computer, numbers are represented as bits.

For example:

01000001

Depending on the context, this can be interpreted as 65 or 'A'. In other words, numbers are one way to interpret bits.
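In Python you can see both interpretations of the same bit pattern:

```python
bits = "01000001"
value = int(bits, 2)  # interpret the bits as an integer
print(value)          # 65
print(chr(value))     # 'A': the same bits read as a character code
```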

Integers

Integers are straightforward. They are represented as sums of powers of a base.

In decimal:

345
= 3×10² + 4×10¹ + 5×10⁰

In binary:

1011
= 1×2³ + 0×2² + 1×2¹ + 1×2⁰
= 8 + 2 + 1
= 11
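The same expansion, computed digit by digit (Python's built-in int(s, 2) does exactly this for us):

```python
digits = "1011"
# Weight each digit by the power of 2 for its position, rightmost first.
value = sum(int(d) * 2 ** i for i, d in enumerate(reversed(digits)))
print(value)           # 11
print(int(digits, 2))  # 11, the built-in equivalent
```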

How Fractions Work

The basic idea is the same for fractional numbers.

In decimal:

0.375
= 3×10⁻¹ + 7×10⁻² + 5×10⁻³

In binary:

0.101₂
= 1×2⁻¹ + 0×2⁻² + 1×2⁻³
= 0.5 + 0 + 0.125
= 0.625
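Computing the fractional expansion the same way, with negative powers of 2:

```python
frac_digits = "101"  # the digits after the binary point in 0.101
value = sum(int(d) * 2 ** -(i + 1) for i, d in enumerate(frac_digits))
print(value)  # 0.625
```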

In decimal, we can represent these numbers with a finite number of digits:

1/2 = 0.5
1/4 = 0.25
1/5 = 0.2
1/8 = 0.125

This is because each denominator can be scaled up to a power of 10.

1/4 = 25/100
1/5 = 2/10
1/8 = 125/1000

In other words, a fraction in lowest terms has a terminating decimal expansion only if the denominator's prime factors are also prime factors of 10. For example:

10 = 2 × 5

Since 2 and 5 are the prime factors of 10, we can represent 1/10 with a finite number of decimal digits.

In binary, the condition for a finite representation is analogous: a fraction in lowest terms has a terminating binary expansion only if its denominator is a power of 2.

Now consider 1/10. Since 10 = 2 × 5, and the factor 5 never divides any power of 2, its binary representation is infinite.

0.1 = 2⁻⁴ + 2⁻⁵ + 2⁻⁸ + 2⁻⁹ + 2⁻¹² + 2⁻¹³ + ⋯

So in a computer, 0.1 must be stored with limited precision by cutting off the infinite expansion. That is why 0.1 + 0.2 is not exactly 0.3.
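Both termination conditions can be checked with the fractions module, and the decimal module can reveal the exact value Python actually stores for 0.1 (a sketch; terminates_in_base is my own helper name):

```python
from decimal import Decimal
from fractions import Fraction

def terminates_in_base(frac, base):
    """A fraction terminates in `base` iff, after cancelling, every prime
    factor of the reduced denominator also divides the base."""
    d = frac.denominator
    for p in range(2, base + 1):
        if base % p == 0:        # p is a factor of the base
            while d % p == 0:
                d //= p          # cancel that factor out of the denominator
    return d == 1

print(terminates_in_base(Fraction(1, 10), 10))  # True: 1/10 = 0.1 exactly
print(terminates_in_base(Fraction(1, 10), 2))   # False: infinite in binary
print(Decimal(0.1))  # the exact float64 value actually stored for 0.1
```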

Floating-Point Numbers

Consider a number like this:

0.00000000000000000000000101₂

If we store numbers like this directly, they take more memory and computation becomes less efficient. What can we do?

We use scientific notation. In decimal, we can write:

123000000 = 1.23 × 10⁸
0.00000123 = 1.23 × 10⁻⁶

So we only need to store the significant part (1.23) and the exponent (8 or -6).

The same idea works in binary.

x = (−1)^s × 1.m × 2^e

| Part | Role |
| --- | --- |
| s | sign (+ or −) |
| m | mantissa / fractional part |
| e | exponent |

Example: convert 10.75 to binary.

Integer part:

10 = 1010₂

Fractional part:

0.75 = 0.11₂

Combine them:

1010.11₂

Scientific notation:

1.01011₂ × 2³
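This conversion can be sketched in code: the integer part via the built-in bin(), the fractional part by repeatedly doubling and peeling off the integer digit (frac_to_binary is my own helper name):

```python
def frac_to_binary(frac, max_bits=23):
    """Binary digits of a fraction in [0, 1), found by repeated doubling."""
    bits = []
    while frac and len(bits) < max_bits:
        frac *= 2
        bits.append("1" if frac >= 1 else "0")  # the digit that crossed 1
        frac -= int(frac)                       # drop it and continue
    return "".join(bits)

print(bin(10)[2:] + "." + frac_to_binary(0.75))  # 1010.11
```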

In memory, a float32 is stored like this:

sign | exponent | mantissa
 1bit |   8bit   |  23bit

It is called floating point because the position of the decimal point "floats" depending on the exponent. You can also see that float32 stores only 23 bits of mantissa precision, which is why rounding errors occur.
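You can inspect the three fields directly by packing 10.75 as a float32 and slicing the raw bits (a sketch using the standard struct module):

```python
import struct

bits = struct.unpack(">I", struct.pack(">f", 10.75))[0]  # the raw 32 bits
sign = bits >> 31
exponent = (bits >> 23) & 0xFF  # stored with a bias of 127
mantissa = bits & 0x7FFFFF      # 23 bits; the leading "1." is implicit
print(sign, exponent - 127, format(mantissa, "023b"))
# 0 3 01011000000000000000000  ->  (+1) x 1.01011 x 2^3
```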

Fixed-Point Numbers

With floating-point numbers, precision varies with the magnitude of the number. Fixed-point numbers are a way to guarantee precision up to a fixed decimal place. The implementation is simple: multiply the value by a scale factor (for example 100 to keep two decimal places) so that it becomes an integer, and store that integer. When you want the original value back, divide by the same scale factor.
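A minimal fixed-point sketch with two decimal places (scale factor 100); all names here are my own:

```python
SCALE = 100  # two fixed decimal places

def to_fixed(x):
    return round(x * SCALE)  # store as an integer count of hundredths

def from_fixed(n):
    return n / SCALE

total = to_fixed(0.1) + to_fixed(0.2)  # integer addition: exact
print(from_fixed(total))               # 0.3, no floating-point drift
```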

Money and cryptocurrency calculations often use fixed-point representations because the smallest unit is predefined. For Japanese yen, for example, that unit would be one sen.

How Should You Choose?

The choice is a tradeoff between precision and memory usage. If the number of decimal places is fixed and precision is critical, fixed-point is often better. If some loss of precision is acceptable, as in machine learning, floating-point is usually the better choice.

Differences Between Languages

Different languages have different default numeric types. Python integers use arbitrary precision, so they cannot overflow; the tradeoff is extra overhead to manage that flexibility. Rust uses fixed-size integer types such as i32 or i64, so overflow is possible if you exceed the range. In debug builds, Rust panics on integer overflow, while release builds wrap around by default.
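The contrast is easy to demonstrate from Python itself:

```python
# Python's built-in int grows as needed instead of overflowing.
print(2147483647 + 1)  # 2147483648, not -2147483648
print(2 ** 100)        # 1267650600228229401496703205376
```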