Author

Peter Reilly is a member of the Bailey Network, a group of former analysts and investors who are now consulting in the reporting space.

Most people are terrible at assessing personal risk. We tend to overestimate the risk of something rare but scary – such as getting eaten by a shark – and underestimate the mundane – such as crossing the road while using a mobile phone. Some of this is just human nature, but I think there is a deeper factor at work: many people find statistics quite confusing.

This is especially true of probability, or the likelihood of an event occurring. Probability can sometimes be calculated accurately – the outcome of a coin toss, for example – but it’s often just an estimate, based on lots of small assumptions. In the real world, almost nothing is truly random. Almost everything that happens is caused, directly or indirectly, by a combination of previous events.

Numbers are not equal

Rather surprisingly, this observation also applies to numbers themselves. It seems obvious that the figures in, say, an annual report should be evenly distributed, with leading ‘ones’ appearing as often as leading ‘nines’. In reality, leading digits do not occur with equal frequency, and this has subtle but important implications.

Consider a company with revenues of €100m, growing at a steady 5% each year. It will take about 14 years for the revenue to double to €200m. It will then take eight years to reach €300m and six more to reach €400m. Growing from €800m to €900m takes less than three years. If you look back at the company’s annual revenues over a very long period, you will notice that the figure starts with a one much more often than with a nine.
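A short simulation makes the pattern visible. The sketch below (in Python, using a hypothetical company rather than any real one) grows revenue of €100m at a steady 5% a year and tallies the leading digit of each year’s figure:

```python
from collections import Counter

revenue = 100.0   # EUR millions, hypothetical starting point
counts = Counter()

# Grow at a steady 5% a year over a very long period and
# record the leading digit of each year's revenue figure.
for year in range(500):
    counts[str(revenue)[0]] += 1
    revenue *= 1.05

total = sum(counts.values())
for digit in "123456789":
    print(f"leading digit {digit}: {counts[digit] / total:.1%}")
```

Roughly 30% of the simulated figures start with a one, and only around 5% with a nine – exactly the lopsidedness described above.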

The underlying explanation is that going from 100 to 200 is an increase of 100%, while going from 800 to 900 is an increase of just 12.5%, so growth lingers far longer at the lower end. It’s an inevitable corollary of using decimal numbering. Small digits are just more common than large ones.
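This distribution has a formal name, Benford’s law: the probability that a figure’s leading digit is d works out to log₁₀(1 + 1/d). The expected frequencies take three lines to compute:

```python
import math

# Benford's law: expected share of figures with leading digit d.
for d in range(1, 10):
    print(f"leading digit {d}: {math.log10(1 + 1/d):.1%}")
```

A one leads about 30.1% of the time; a nine, only 4.6%.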

Security concern

All this probably sounds arcane, but there are some interesting real-world implications. The first is fraud detection. Large databases should reflect the non-random nature of numbers. If humans start to manipulate the figures, they will probably change their relative distribution. If the distortion is large enough, it can be spotted by an appropriate analytical program. This will not by itself be firm evidence of fraud, but it should prompt further investigation.
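As a sketch of what such an analytical program might look like (the function below is illustrative, not any real audit tool), one common approach is to compare the leading digits of a set of amounts against the Benford frequencies using a chi-squared test:

```python
import math
from collections import Counter

def digits_look_distorted(amounts, critical=15.51):
    """Return True when leading digits stray suspiciously far from
    Benford's law. 15.51 is the 5% chi-squared critical value for
    8 degrees of freedom. Illustrative sketch: a True result is a
    prompt for further investigation, not evidence of fraud."""
    leading = [str(abs(a)).lstrip("0.")[0] for a in amounts if a]
    n = len(leading)
    observed = Counter(leading)
    chi2 = 0.0
    for d in "123456789":
        expected = n * math.log10(1 + 1 / int(d))
        chi2 += (observed[d] - expected) ** 2 / expected
    return chi2 > critical
```

Fed a year’s worth of invoice amounts, say, it flags a digit distribution distorted enough to deserve a closer look.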

The second implication is for cryptography and digital passwords. Generating truly random numbers turns out to be very challenging, even for sophisticated computers. Buried inside the computer will be an algorithm, and an algorithm by definition follows a fixed routine, which means its output is not truly random. The vaunted random-number generator turns out not to be random after all.
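The determinism is easy to see. Seed Python’s standard pseudorandom generator twice with the same value and it obligingly produces the same ‘random’ sequence both times:

```python
import random

# Two generators with the same seed follow the same routine,
# so their "random" output is identical and fully predictable.
a = random.Random(42)
b = random.Random(42)
print([a.randint(0, 9) for _ in range(10)])
print([b.randint(0, 9) for _ in range(10)])  # the same ten digits
```

Anyone who knows, or can recover, the seed can replay the entire sequence – which is why security-sensitive code draws on operating-system entropy instead (Python’s secrets module, for example).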

Many readers will have used a token to access their IT networks remotely. The token displays a number that changes every 30 seconds or so and appears to offer perfect security. But the number is generated by an algorithm, and if that algorithm – or its secret seed – is compromised, the security is suddenly worthless. This exact situation occurred in 2011 with RSA’s SecurID tokens, and the vendor was initially less than transparent about the breach.
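RSA’s own algorithm is proprietary, but the open standard followed by most modern tokens and authenticator apps, TOTP (RFC 6238), works the same way in spirit and shows the problem plainly: the ever-changing number is just a deterministic function of a shared secret and the clock. A minimal sketch:

```python
import hashlib, hmac, struct, time

def totp(secret: bytes, period: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238), minimal sketch.
    Anyone holding `secret` can compute the same code, which is
    why a stolen seed makes the token worthless."""
    counter = int(time.time()) // period            # 30-second time step
    msg = struct.pack(">Q", counter)                # 8-byte big-endian counter
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                      # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)

# Hypothetical seed, for illustration only.
print(totp(b"hypothetical-shared-seed"))  # a six-digit code, new every 30 seconds
```

In the RSA breach, the attackers reportedly made off with seed records – giving them exactly this ability to compute the same codes as the legitimate tokens.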

Cryptographers have known about the difficulty of generating truly random numbers for many decades. But when was the last time you talked to one (and understood what they were saying)?

The last implication is speculative. Artificial intelligence (AI) seems to have replaced quantum computing as the next big thing. It is not far-fetched to imagine a world where AI can both crack all existing codes, by exploiting the non-random nature of the numbers behind them, and create new, uncrackable ones built on genuinely random keys. Now, that is a scary prospect.
