Floating Points –Rounding Errors in Algebraic Processes

Share on FacebookShare on Google+Tweet about this on TwitterShare on LinkedIn

The floating point representation of numbers is dangerous. With floating points, the computer stores the digits and the position of the point. Imagine a "6-digit decimal computer". This is a computer that uses base 10 numbers, like we human, but can only store 6 digits, the position of the dot, and the power of 10 used in scientific notation.

Such computer would represent the number 1/3 as:

1/3 = 0.333333

That means the number is rounded down by about one millionth. But it might get much worse. Have a look at the following operation:

a = 1 + 1/30,000 = 1.00003[3333...]
b = a - 1

It leads to b = 3\cdot 10^{-5}, which means the computer produced a rounding error of about 10%! Now you should feel sorry you did not write:

a = 1 - 1
b = a + 1/30,000

which would have given  b = 3.33333\cdot 10^{-5}.

Normally, this is not a big deal. But in probability computation it is. Take the way the disjunction (union) of probabilities is computed. Given N events whose probability is p_i. The probability that at least one of these events happens is:

[1] P(p_1 \lor p_2 \lor ...\lor p_N) = 1 - (1 - p_1)\cdot(1 - p_2)...\cdot(1 -p_N)

(the symbols A \lor B means A or B is true, or both). It is immediately clear that if all p_i are very small, and will be rounded like the 1/3 above, the error can become soon big enough to spoil your computation.

And this is actually a pitty! Because:

[2] (1 - p_1)\cdot(1 - p_2)...\cdot(1 -p_N) = 1 - \sum_i p_i + \sum_{i\neq j} p_i\cdot p_j - \sum_{i\neq j \neq k} p_i\cdot p_j\cdot p_k ...

where \sum_i means "sum" for all values of i, in this case 1 to N, \sum_{i\neq j} all values of i and j not equal between themselves, etc.

Equation [2] tells us that if we solve [1] algebraically, subtract 1 and only after substitute the values in the various p_i, the error would be smaller.

Let's make an extreme example. If we had two events with p_1 = p_2 = 10^{-6}:

[3] P(p_1 \lor p_2) = 1 - (1-10^-6)^2 = 2\cdot 10^{-6}-10^{-12} = 1.99999\cdot 10^{-6}

Imagine you have a "3-digit decimal computer" (3 digits plus the position of the point and the power of 10). If you solve equation [3] using the formula in [2], even with such lousy computer the result would be a decent 1.99\cdot 10^{-6}. This because both 2\cdot 10^{-6} and 10^{-12} do not need more than 3 digits. On the contrary, a naive computation, where the products are computed through equation [1] substituting the value of the variables, would give:

[1'] 1 - 0.999\cdot 0.999 = 1.99\cdot 10^{-3}

A terribly wrong result. The short script below, in Python, solve the equation algebraically:

def probabilities_disjunction(probabilities: list):
    n = len(probabilities)
    # tree is a list of strings whose length is len(probabilities) with all possible 0/1 without the 0s.
    # e.g. for three probabilities: 001, 010, 011, 100, 101, 110, 111 
    tree = []
    for i in range(1, 2**n):
        tree.append(str(bin(i))[2:].zfill(n))
    v = []
    for p in probabilities:
        v.append([1., -p])
    prod = 0
    # multiply all probabilities*(-1) according to tree
    for branch in tree:
        inter_prod = 1.
        for i, leaf in enumerate(branch):
            inter_prod *= v[i][int(leaf)]
        prod += inter_prod
    return -prod

It is clearly non optimised, and the number of computations grows exponentially with the number of probabilities.

Share on FacebookShare on Google+Tweet about this on TwitterShare on LinkedIn

Leave a Reply

Your email address will not be published. Required fields are marked *