How to Generate Better Random Numbers At The Bash Command Line


Generating random numbers in Bash seems simple using the $RANDOM variable, but is the variable truly that random? Find out what may be preventing you from generating high-quality random numbers, and more!

Random Numbers At The Terminal

It seems simple to generate a random number in Bash:
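As a minimal sketch of that basic usage: each expansion of the built-in $RANDOM variable yields a new integer, which the Bash manual documents as lying between 0 and 32767.

```bash
# Print one pseudo-random integer (documented range: 0 to 32767).
echo $RANDOM

# Every expansion of $RANDOM draws a new value.
echo $RANDOM $RANDOM $RANDOM
```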

But is the number truly random?
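One way to check is sketched below: assign the same value to RANDOM twice (assigning to RANDOM seeds the generator) and compare the numbers that follow. The seed value 1 and the variable names first_run and second_run are arbitrary choices for illustration.

```bash
# Seed the generator by assigning to RANDOM, then draw three numbers.
RANDOM=1
first_run="$RANDOM $RANDOM $RANDOM"

# Re-seed with the same value and draw three more.
RANDOM=1
second_run="$RANDOM $RANDOM $RANDOM"

# The two lines printed here are identical.
echo "first:  $first_run"
echo "second: $second_run"
```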

Not really. The random number generator in Bash depends on a seed, a value passed to the random number generation function. Provided the seed is the same, the generator will always produce the same sequence of ‘random’ numbers.

We can initialize the random number generator with a seed value by setting the RANDOM variable to the desired seed value. So perhaps we can provide a random number as the seed to the random generator?
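A minimal sketch of that idea follows. The first two commands are the attempt itself; the rest repeats the same chain twice from a fixed starting seed (1, chosen arbitrarily) so the result can be compared. The names chain_a and chain_b are illustrative only.

```bash
# The attempt itself: seed the generator with one of its own numbers.
RANDOM=$RANDOM
echo $RANDOM

# Repeat the same chain twice from a fixed starting seed (1 is arbitrary).
RANDOM=1; RANDOM=$RANDOM; chain_a=$RANDOM
RANDOM=1; RANDOM=$RANDOM; chain_b=$RANDOM

# Both chains end on the same number: the extra layer is still
# fully determined by the starting seed.
echo "chain_a=$chain_a chain_b=$chain_b"
```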

It seems to work for a little while: each time we want to generate a random number, we pre-seed the random number generator with a randomly generated number. But all we did was trick ourselves; we just added one extra layer of depth, and the outcome is about the same. The numbers are not random, and they are still determined by a fixed seed provided earlier.

This is a problem of ‘random entropy’ generation. The more entropy we can generate, the better our random numbers will be. The problem is not limited to Bash; it exists in all basic computer systems which try to generate random numbers. Random is thus never really random. Some other systems combine, for example, mouse movements, keystrokes and other semi-random input to increase the complexity of the random entropy pool.

So, how can we generate a random number ‘good enough’ to be named truly random?

For this, we would, as a source and seed, require something which is truly, or near to truly, random. We could think about using the date of today, but that isn’t very random on second thoughts. How about the seconds since 00:00:00 UTC on 1 January 1970 (generally called ‘epoch’ in Linux circles)? Maybe, but all one needs is a log file somewhere and the epoch can be reconstructed.

A better solution is using the least significant digits of the nanosecond precision timer:
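A minimal sketch, assuming GNU date (the %N format, nanoseconds as nine digits, is not available in every date implementation). The variable name seed and the explicit base-10 conversion are additions for illustration; the conversion guards against a seed with a leading zero being read as an octal number.

```bash
# Keep bytes 4 to 9 of the nine-digit nanosecond counter,
# i.e. its six least significant digits.
seed=$(date +%N | cut -b4-9)

# Force base 10 so a leading zero is never interpreted as octal.
RANDOM=$((10#$seed))

echo "seed=$seed random=$RANDOM"
```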

In principle, even this is not perfect. It may fall under the header of ‘better random number generation’ as per the title of this article, but the entropy is not perfect by definition. Let’s look at this a little closer.

In the example, we take bytes 4 to 9, that is, the last 6 digits of the nanosecond counter as printed by date +%N, from the output of the subshell initiated by $(…). This means our minimum seed is 0 and our maximum seed is 999999. This is only a range of 1 million numbers.

In principle, this system could still be ‘hacked’: one could simply cycle through all 1 million seeds and grab the random number sequences generated from each. It would surely be a very poor solution for encryption key generation, for example!
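The cycling described above can be sketched as follows. The function name find_seeds, the ‘victim’ seed 4242 and the reduced search limit of 9999 are all hypothetical choices that keep the demonstration quick; the full seed space discussed here would be 0 to 999999.

```bash
# Hypothetical helper: print every seed in 0..max whose first three
# $RANDOM values match the three observed values.
find_seeds() {
  local a=$1 b=$2 c=$3 max=$4 seed
  for ((seed = 0; seed <= max; seed++)); do
    RANDOM=$seed
    [[ $RANDOM -eq $a ]] || continue
    [[ $RANDOM -eq $b ]] || continue
    [[ $RANDOM -eq $c ]] || continue
    echo "$seed"
  done
}

# Simulate a victim that seeded the generator and revealed three numbers.
RANDOM=4242
observed=("$RANDOM" "$RANDOM" "$RANDOM")

# Search a reduced range for speed; this prints the recovered seed.
find_seeds "${observed[@]}" 9999
```

Once the seed is recovered, every later number in the sequence can be reproduced by seeding a generator of the same Bash version with it.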

If we select fewer digits, the risk of this gets larger. If we select more, the risk becomes smaller, but the added leading digits change slowly and predictably, so the ‘random seed’ becomes less random too. This can be exemplified by including the seconds since epoch:
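A minimal sketch, again assuming GNU date and a ten-digit seconds-since-epoch value (true for current dates). The byte range 10-19 and the array name samples are assumptions for illustration: they keep the last digit of the seconds followed by the nine nanosecond digits.

```bash
# Take three samples one second apart. Each sample is the last digit
# of the epoch seconds followed by the nine nanosecond digits.
samples=()
for i in 1 2 3; do
  samples+=("$(date +%s%N | cut -b10-19)")
  sleep 1
done

# The leading digit of each line advances predictably with the clock.
printf '%s\n' "${samples[@]}"
```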

We can see the seconds ticking over! Note the initial 6 > 7 > 8 etc.

For standard random number generation purposes, for example in test software which varies its testing approach based on a random seed provided to it, the nanosecond-based solution is sufficient. For other uses which need even better quality random numbers, an external hardware-based solution may be required.

True random number generation is not a straightforward matter. There are hardware-based solutions which may come close to, or achieve, true random entropy and/or random number generation. In particular, devices which are not only hardware based may be the key to generating that perfect random number.