The floating point representation of numbers is dangerous. With floating points, the computer stores the digits and the position of the point. Imagine a "6-digit decimal computer". This is a computer that uses base 10 numbers, like we human, but can only store 6 digits, the position of the dot, and the power of 10 used in scientific notation.

Such computer would represent the number as:

That means the number is rounded down by about one millionth. But it might get much worse. Have a look at the following operation:

It leads to , which means the computer produced a rounding error of about 10%! Now you should feel sorry you did not write:

which would have given .

Normally, this is not a big deal. But in probability computation it is. Take the way the disjunction (union) of probabilities is computed. Given events whose probability is . The probability that at least one of these events happens is:

[1]

(the symbols means or is true, or both). It is immediately clear that if all are very small, and will be rounded like the 1/3 above, the error can become soon big enough to spoil your computation.

And this is actually a pitty! Because:

[2]

where means "sum" for all values of , in this case 1 to N, all values of and not equal between themselves, etc.

Equation [2] tells us that if we solve [1] algebraically, subtract 1 and only after substitute the values in the various , the error would be smaller.

Let's make an extreme example. If we had two events with :

[3]

Imagine you have a "3-digit decimal computer" (3 digits plus the position of the point and the power of 10). If you solve equation [3] using the formula in [2], even with such lousy computer the result would be a decent . This because both and do not need more than 3 digits. On the contrary, a naive computation, where the products are computed through equation [1] substituting the value of the variables, would give:

[1']

A terribly wrong result. The short script below, in Python, solve the equation algebraically:

def probabilities_disjunction(probabilities: list): n = len(probabilities) # tree is a list of strings whose length is len(probabilities) with all possible 0/1 without the 0s. # e.g. for three probabilities: 001, 010, 011, 100, 101, 110, 111 tree = [] for i in range(1, 2**n): tree.append(str(bin(i))[2:].zfill(n)) v = [] for p in probabilities: v.append([1., -p]) prod = 0 # multiply all probabilities*(-1) according to tree for branch in tree: inter_prod = 1. for i, leaf in enumerate(branch): inter_prod *= v[i][int(leaf)] prod += inter_prod return -prod

It is clearly non optimised, and the number of computations grows exponentially with the number of probabilities.

Interested in the topic? Below two books on Python (I read the High Performance Python, which IMO should be read by anyone writing scientific code) plus two on the subject (which I admit I haven't read).

[amazon_link asins='1449361595,0486679993,B0000CM0MD,0471778648' template='ProductGrid' store='visualab-20' marketplace='US' link_id='cb3d5c13-8d9c-11e7-8525-fb3bd5b60163']