Secure random number generation in JAVA
“Random numbers” means numbers which are random in practice (i.e. unpredictable and non – reproducible). As simple this term looks when you hear it for the first time, it is more difficult to reproduce. It is a bit different when we talk about single random numbers or random numbers in sequence. A single random number is the one drawn from set of possible values each with equal probability, like tossing of a coin, while in case of sequence of random numbers, each number in the sequence should be statistically independent of each other.
We say that A is statistically independent of B whether B happens or not; it makes no difference how often A happens. Statistical independence means one event conveys no information about the other; statistical dependence means there is some information.
11 courses, 8+ hours of training
- http://search.dilbert.com/comic/Random%20Number%20Generator
- As shown in Dilbert comic, you can never be sure if the random numbers generated are truly random or not. It might be the case that when Dilbert arrived, that was the first time the generator was producing this sequence.
In the case of generating random numbers using a computer, this might look like a “black box” behaving as a “Random Number Generator.” However, there are two main approaches to generating the random number: the Pseudo-Random Number Generator (PRNG) and theTrue-Random Number Generator (TRNG). Both have them have their own pros and cons and are useful in different situations.
Pseudo-random number generator:
As the word ‘pseudo’ suggests, they appear random but are not random in the way you might expect them to be. These use some mathematical formulae like Linear Congruential Formula (explained later) or some pre-calculated tables to produce the random numbers.
As stated by John Von Neumann, 1951:-
"Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin."
This might be interpreted against PRNGs but the true motive behind saying this was to warn against the randomness of such random number generators.
True-random number generator:
TRNG typically use some external physical phenomena to generate random numbers and hence are more random than PRNG’s.
It’s easy to demonstrate the basic difference between both of them by using the roll a dice example. With PRNG, it will eventually repeat itself and given a starting point (seeding), it is easy to reproduce the sequence. While TRNG will make use of some hardware phenomena or atmospheric events like keyboard event, interrupts etc. to generate the sequence.
There are lots of characteristic differences between both the generators which make them useful in different scenarios.
Random number generation in Java:-
Java provides mainly two sets of API/classes to generate Random numbers: Random and SecureRandom.
Random API:
This is used to generate a stream of pseudorandom numbers. It uses a 48-bit seed (the initial data) which is then modified using a linear congruential formula.
Linear congruential formula works by computing each successive random number from the previous number. It is very simple to understand and is defined by a recurrence relation:
Xi+1 = (A*Xi + C) mod M
where Xi is the sequence of pseudorandom values and Xo, 0 <= Xo <=m the "seed" or "start value"
Even though the generators using this formula are fast and require minimal memory (typically 32 or 64 bits), these generators can't be used when randomness is a critical issue, like generating keys, session ids etc. The problem with these generators is that once you know the starting seed or the previous value, it’s very easy to find out the next value. At any point of time and given a value, the next random value will always be the same, which is obviously not desirable when you are using it for session ids or passwords.
To explain this, we have an example :
import java.util.Random;
public class RandomGenerator {;
public static void main(String[] args) {
Random rand = new Random(123456);
for (int j = 0; j<5; j++)
{
int nextRandom = rand.nextInt(5);
System.out.print(nextRandom+" ");
}
}
}
No matter where, at what time and who runs this, the sequence of random number generated will always be:
3 2 2 0 3
Or in the words of Java Documentation:
"If two instances of Random are created with the same seed, and the same sequence of method calls is made for each, they will generate and return identical sequences of numbers."
And that happens because all the other methods like nextInt(), nextLong() makes use of definition of next() method and the way next() calculates random number is:
" The method next is implemented by class Random by atomically updating the seed to
(seed * 0x5DEECE66DL + 0xBL) & ((1L << 48) - 1)
and returning
(int)(seed >>> (48 - bits)).
This is a linear congruential pseudorandom number generator, as defined by D. H. Lehmer and described by Donald E. Knuth in The Art of Computer Programming, Volume 3: Seminumerical Algorithms, section 3.2.1.” - Random API Documentation.
By default, the seed for the Random algorithm is the system time since January 1, 1970, measured in milliseconds. Hence, if the person knows the running time of the application, it is not difficult to guess the random number generated.
SecureRandom vs. random:
If you have been using java.util.Random API of Java to generate random numbers in places desiring good security, then you might consider java.util.SecureRandom, because if there is insufficient randomness in the random numbers generated by your generator, it compromises the security and protection of your system.
This differs from the Random class because it produces cryptographically strong random numbers using cryptographically strong pseudo random number generator (CSPRNG).
The generation of random numbers in CSPRNGs uses entropy, which is nothing but an unpredictable input (true random source). It might be a hardware random number generator or possibly some unpredictable system process, such as the timings events, interrupts etc. .
A true random source is a theoretical random number generator that is perfect in every way. The next value produced is completely uncorrelated with all previous values.
SecureRandom implementation:
The java.security.SecureRandom class does not actually implement a pseudorandom number generator (PRNG) itself. It uses PRNG implementations in other classes to generate random numbers. Hence, the randomness of the random numbers, security and performance of SecureRandom depends on the algorithm chosen. If you want truly "cryptographically strong" randomness, then you need a strong entropy source. Entropy here refers to nothing but the randomness collected by an operating system or application. The entropy source is one which collects random data and supplies to destination.
The file which controls the configuration of the SecureRandom API is located at: $JAVA_HOME/lib/security/java.security
We can select the source of seed data for SecureRandom by using the entropy gathering device specified in securerandom.source property in java.security file.
E.g. securerandom.source=file:/dev/urandom
OR securerandom.source=file:/dev/random
On Solaris and Linux systems, if file:/dev/urandom is specified and it exists, "NativePRNG" reads random bytes directly from /dev/urandom while in case of Windows systems, the URLs file:/dev/random and file:/dev/urandom enables use of the Microsoft CryptoAPI seed functionality.
Specifying the system property -Djava.security.egd=file:/dev/urandom can override the use of securerandom.source property and has a higher precedence. Apart from the entropy source, another important consideration in case of CSPRNG is the provider. The term "provider" refers to a package or set of packages that supply a concrete implementation of a subset of the cryptography aspects of the Java Security API. A provider may, for example, implement one or more digital signature algorithms or message digest algorithms.
E.g. To register the SUN Provider:-
security.provider.1=sun.security.provider.Sun
Looking closely at the java implementation of SecureRandom, setSeed method in SecureRandom is defined as:
public void setSeed(byte[] seed)
{
secureRandomSpi.engineSetSeed(seed);
isSeeded = true;
}
SecureRandomSpi defines the Service Provider Interface (SPI) for the SecureRandom class. All three abstract methods (engineSetSeed,engineGenerateSeed,and engineNextBytes) in this class must be implemented by each service provider who wishes to supply the implementation of a cryptographically strong pseudo-random number generator. The better the implementation of cryptographically strong pseudo random number generator, the more secure the random numbers generated would be.
On Linux, the default implementation for SecureRandom is "NativePRNG," while on Windows, the default is "SHA1PRNG" which you can also use on Linux if you explicitly specify it.
From here on, we will discuss NativePRNG implementation and issues.
NativePRNG is one such class in Linux which extends secureRandomSPI.
public final class NativePRNG extends SecureRandomSpi
NativePRNG interacts with /dev/random and /dev/urandom, so it is only available if those files are present.
The important consideration here is that the given seed in the setSeed function supplements the existing seed instead of replacing it. Hence , unlike Random() function , repeated calls to the function based on the seed don’t yield the same sequence; this, in turn, ensures randomness.
setSeed() -> engineSetSeed() -> implSetSeed()
implSetSeed() sends random bytes to OS and tries to write to /dev/(u)random . Since getSeed() and setSeed() directly read/write /dev/random and /dev/random is only writable by root in many configurations, there is a SHA1PRNG instance around in parallel in NativePRNG. This instance is used so that seed bytes sent via setSeed() are not ignored and are taken into consideration in the algorithm. Also , when entropy pool runs out of random data , SHA1PRNG instance is used to generate random numbers.
The other advantage NativePRNG has over SHA1PRNG and many other random number generator algorithms is that it continuously receives entropy from the operating system (by reading from /dev/(u)random). The other PRNGs do not acquire any additional entropy after seeding.
Here we can actually conclude that performance of SecureRandom depends on the base algorithm i.e. the source of the randomness . In case of Linux, it’s about NativePRNG which in turn is about /dev/random or /dev/urandom behaviour.
Random and Urandom are both devices in /dev in Linux to generate random data, but have a few differences . These devices are like special files that use some source of entropy to generate random data. A kernel gathers information from external sources (i.e. some unpredictable sources like interrupts , keyboard events , mouse movements) to generate input to the entropy pool . This input is nothing but some bits with extremely strong random properties. The entropy pool acts as source of input for randomness for both the devices mentioned above.
Read from /dev/random device is blocked if the entropy pool runs low on the bits until there is sufficient entropy gathered. While it’s not the same case with dev/urandom device. urandom reads from the entropy pool till sufficient entropy is available. Once the entropy pool is exhausted ,it uses SHA cryptographic hash algorithm to generate strong random numbers.
Let’s consider the example below to see how much random data both the devices generate over span of 30 seconds with almost the same disk intensive work :
[parul@pargarg-mc ~]$ dd if=/dev/random of=/tmp/parul
0+29 records in
1+0 records out
512 bytes (512 B) copied, 36.2981 seconds, 0.0 kB/s
[parul@pargarg-mc ~]$ dd if=/dev/urandom of=/tmp/parul
653417+0 records in
653416+0 records out
334548992 bytes (335 MB) copied, 37.1233 seconds, 9.0 MB/s
Its 512 B ( random ) VS 335 MB (urandom), hence if we want more random numbers to be generated in short span of time, /dev/urandom is always a better option. Since /dev/random gets blocked and affects the performance of the application due to low entropy, it should be used only when it is strongly needed i.e. in cases of long-lived GPG/SSL/SSH keys [ http://www.kernel.org/doc/man-pages/online/pages/man4/random.4.html].
People often overestimate the randomness they require and underestimate the randomness of /dev/urandom device, leading to over usage of /dev/random. As Ted Ts'o, who wrote the kernel RNG, say: -
“Past a certain point /dev/urandom will start returning results which are cryptographically random. At that point, you are depending on the strength of the SHA hash algorithm, and actually being able to not just to find hash collisions, but being able to trivially find all or most possible pre-images for a particular SHA hash algorithm. If that were to happen, it's highly likely that all digital signatures and openssh would be totally broken”
SecureRandom Usage:
OWASP Enterprise Security API (ESAPI) provides a security control library for helping programmers to integrate security into their applications. ESAPI aims to provide a “best practice” kind of list for security issues. It provides a Randomizer interface which is then implemented by DefaultRandomizer which defines a set of methods for creating cryptographically random numbers and strings.
As we can see from Randomizer implementation, it internally makes use of SecureRandom API and not Random:
public class DefaultRandomizer implements org.owasp.esapi.Randomizer {
/** The sr. */
private SecureRandom secureRandom = null;
Similar to this, OWASP has another project CSRFGuard which is a library that implements a variant of the synchronizer token pattern to mitigate the risk of Cross-Site Request Forgery (CSRF) attacks. It makes use of a unique per-session token verification pattern using a JavaEE filter to mitigate the risk of CSRF attacks. When an HttpSession is first instantiated, CSRFGuard will generate a cryptographically random token using the SecureRandom class to be stored in the session.
11 courses, 8+ hours of training
Conclusion
It’s not only about just the usage of SecureRandom in Java but using it securely, too. The trade-off is between performance and security, so choose wisely.