# How big can an error be when we estimate something?

Often, when you make an estimate based on many assumptions, people say: "There might be errors in all your assumptions, and the error on the result, being the sum of all these errors, is going to be huge."

In reality, errors compensate for each other. You might overestimate one variable, but you will underestimate the next one. Unless you are biased, the error will grow like a drunken wanderer.

Say we want to estimate the number $N$ of something. Number of candies eaten by children in the world. Or piano tuners in Chicago. Or whatever.

To estimate $N$, we multiply estimated values $e_i$ of the factors that contribute to $N$, whose real (unknown) values are $a_i$. For the candies, the factors might be the number of people in the world, the fraction of children, sugar-producing crops, and so on.

$N = a_1 \cdot a_2 \cdots \approx e_1 \cdot e_2 \cdots$

In the end, we will compute our estimate by multiplying all the $e_i$.

Now, let's say that you are really bad at estimating, and you never get the right value. All $e_i$ are wrong by a factor of 2: sometimes your estimate $e_i$ is double the actual value $a_i$, sometimes it is one half of it.

Now you do what any good engineer would have done, before the advent of the pocket calculator, when they had to multiply numbers. You sum logarithms:

$\log(N) = \sum_i \log(a_i) \approx \sum_i \log(e_i)$

($\sum$ means "sum".) But we said your estimates are

$e_i = a_i \cdot random(2, 0.5)$

Or

$log(e_i) = log(a_i) + random(+1, -1)$

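Here we take logarithms in base 2, so that $\log(2)$ is exactly $1$ and $\log(0.5)$ is exactly $-1$. (In any other base the steps are $\pm \log(2)$ instead of $\pm 1$, which only rescales what follows.)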

This allows us to separate the errors from the estimates and write

$\sum_i \log(e_i) = \sum_i \log(a_i) + \sum_i random(+1, -1)$

$\sum_i \log(e_i) = \log(N) + \log(\sigma_{final})$

where $\sigma_{final}$ is the error factor you'll get at the end of the estimate, that is, the ratio between your estimate and the true $N$.
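To see the separation of errors from estimates concretely, here is a minimal Python sketch. The factor values and the helper name `noisy_estimate` are made up for illustration; the only thing taken from the text is the rule that every factor is randomly doubled or halved.

```python
import math
import random

def noisy_estimate(true_factors, rng):
    """Return (estimate, steps): each factor is doubled (+1) or halved (-1) at random."""
    steps = [rng.choice((+1, -1)) for _ in true_factors]
    estimate = math.prod(a * 2.0 ** s for a, s in zip(true_factors, steps))
    return estimate, steps

rng = random.Random(42)
true_factors = [8e9, 0.25, 100.0, 0.5]  # hypothetical values, for illustration only
true_n = math.prod(true_factors)
estimate, steps = noisy_estimate(true_factors, rng)

# log2 of the final error factor is just the sum of the +1/-1 steps
log_error = math.log2(estimate / true_n)
print(steps, sum(steps), log_error)
```

Whatever the random draws are, the last two printed numbers agree: the log of the final error is the sum of the individual steps.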

The logarithm of the final error, $\log(\sigma_{final})$, actually diffuses quite slowly. It behaves like a drunken wanderer who can only walk along a line: one step in one direction, then two steps in the opposite direction, and so on. After $S$ steps, about 70% of such drunken wanderers are no more than $\sqrt{S}$ steps away from their starting point.
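The drunken-wanderer claim can be checked with a small simulation. This is only a sketch: the function name, the number of wanderers and the seed are arbitrary choices, not part of the argument. Because positions are whole numbers, the fraction for small $S$ tends to come out somewhat above 70%.

```python
import math
import random

def fraction_within_sqrt(steps, walkers=5000, seed=0):
    """Fraction of random walkers that end within sqrt(steps) of the start."""
    rng = random.Random(seed)
    limit = math.sqrt(steps)
    inside = 0
    for _ in range(walkers):
        # each step is +1 or -1 with equal probability
        position = sum(rng.choice((-1, 1)) for _ in range(steps))
        if abs(position) <= limit:
            inside += 1
    return inside / walkers

print(fraction_within_sqrt(steps=100))
```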

This means that about 70% of the time the log of your estimation error is no bigger than $\sqrt{S} \cdot \log(\sigma)$, where $\sigma$ is the average (estimated) error factor, which we initially assumed to be 2. Or

$\sigma_{final} = \sigma^{\sqrt{S}}$

Here $S$ is the number of assumptions you made, $\sigma$ is the average error factor for each assumption, and $\sigma_{final}$ is the final estimation error factor.
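As a worked example, the sketch below plugs numbers into the formula and compares it with the pessimistic case from the opening objection, where every error points in the same direction. Both function names are hypothetical, and the "all errors add up" case is my reading of the objection rather than something derived above.

```python
import math

def final_error_factor(sigma, assumptions):
    """Typical final error factor: sigma ** sqrt(S)."""
    return sigma ** math.sqrt(assumptions)

def all_errors_add_up(sigma, assumptions):
    """Pessimistic case: every factor is off in the same direction, sigma ** S."""
    return sigma ** assumptions

# 9 assumptions, each off by a factor of 2
print(final_error_factor(2.0, 9))  # 2 ** 3
print(all_errors_add_up(2.0, 9))   # 2 ** 9
```

With nine assumptions that are each off by a factor of 2, the typical final error is a factor of 8, not the factor of 512 you would get if all the errors lined up.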