Floating Points –Rounding Errors in Algebraic Processes

Share on FacebookShare on Google+Tweet about this on TwitterShare on LinkedIn

The floating point representation of numbers is dangerous. With floating points, the computer stores the digits and the position of the point. Imagine a "6-digit decimal computer". This is a computer that uses base 10 numbers, like we human, but can only store 6 digits, the position of the dot, and the power of 10 used in scientific notation.

Such computer would represent the number 1/3 as:

1/3 = 0.333333

That means the number is rounded down by about one millionth. But it might get much worse. Have a look at the following operation:

a = 1 + 1/30,000 = 1.00003[3333...]
b = a - 1

It leads to b = 3\cdot 10^{-5}, which means the computer produced a rounding error of about 10%! Now you should feel sorry you did not write:

a = 1 - 1
b = a + 1/30,000

which would have given  b = 3.33333\cdot 10^{-5}.

Normally, this is not a big deal. But in probability computation it is. Take the way the disjunction (union) of probabilities is computed. Given N events whose probability is p_i. The probability that at least one of these events happens is:

[1] P(p_1 \lor p_2 \lor ...\lor p_N) = 1 - (1 - p_1)\cdot(1 - p_2)...\cdot(1 -p_N)

(the symbols A \lor B means A or B is true, or both). It is immediately clear that if all p_i are very small, and will be rounded like the 1/3 above, the error can become soon big enough to spoil your computation.

And this is actually a pitty! Because:

[2] (1 - p_1)\cdot(1 - p_2)...\cdot(1 -p_N) = 1 - \sum_i p_i + \sum_{i\neq j} p_i\cdot p_j - \sum_{i\neq j \neq k} p_i\cdot p_j\cdot p_k ...

where \sum_i means "sum" for all values of i, in this case 1 to N, \sum_{i\neq j} all values of i and j not equal between themselves, etc.

Equation [2] tells us that if we solve [1] algebraically, subtract 1 and only after substitute the values in the various p_i, the error would be smaller.

Let's make an extreme example. If we had two events with p_1 = p_2 = 10^{-6}:

[3] P(p_1 \lor p_2) = 1 - (1-10^-6)^2 = 2\cdot 10^{-6}-10^{-12} = 1.99999\cdot 10^{-6}

Imagine you have a "3-digit decimal computer" (3 digits plus the position of the point and the power of 10). If you solve equation [3] using the formula in [2], even with such lousy computer the result would be a decent 1.99\cdot 10^{-6}. This because both 2\cdot 10^{-6} and 10^{-12} do not need more than 3 digits. On the contrary, a naive computation, where the products are computed through equation [1] substituting the value of the variables, would give:

[1'] 1 - 0.999\cdot 0.999 = 1.99\cdot 10^{-3}

A terribly wrong result. The short script below, in Python, solve the equation algebraically:

def probabilities_disjunction(probabilities: list):
    n = len(probabilities)
    # tree is a list of strings whose length is len(probabilities) with all possible 0/1 without the 0s.
    # e.g. for three probabilities: 001, 010, 011, 100, 101, 110, 111 
    tree = []
    for i in range(1, 2**n):
        tree.append(str(bin(i))[2:].zfill(n))
    v = []
    for p in probabilities:
        v.append([1., -p])
    prod = 0
    # multiply all probabilities*(-1) according to tree
    for branch in tree:
        inter_prod = 1.
        for i, leaf in enumerate(branch):
            inter_prod *= v[i][int(leaf)]
        prod += inter_prod
    return -prod

It is clearly non optimised, and the number of computations grows exponentially with the number of probabilities.

Interested in the topic? Below two books on Python (I read the High Performance Python, which IMO should be read by anyone writing scientific code) plus two on the subject (which I admit I haven't read).

Share on FacebookShare on Google+Tweet about this on TwitterShare on LinkedIn

Leave a Reply

Your email address will not be published. Required fields are marked *