/* stuff for /dev/random * random.c -- A strong random number generator * * Version 1.05, last modified 17-Apr-99 * * Copyright Theodore Ts'o, 1994, 1995, 1996, 1997, 1998, 1999. All * rights reserved. * * This routine gathers environmental noise from device drivers, etc., * and returns good random numbers, suitable for cryptographic use. * Besides the obvious cryptographic uses, these numbers are also good * for seeding TCP sequence numbers, and other places where it is * desirable to have numbers which are not only random, but hard to * predict by an attacker. * * Theory of operation * =================== * * Computers are very predictable devices. Hence it is extremely hard * to produce truly random numbers on a computer --- as opposed to * pseudo-random numbers, which can easily generated by using a * algorithm. Unfortunately, it is very easy for attackers to guess * the sequence of pseudo-random number generators, and for some * applications this is not acceptable. So instead, we must try to * gather "environmental noise" from the computer's environment, which * must be hard for outside attackers to observe, and use that to * generate random numbers. In a Unix environment, this is best done * from inside the kernel. * * Sources of randomness from the environment include inter-keyboard * timings, inter-interrupt timings from some interrupts, and other * events which are both (a) non-deterministic and (b) hard for an * outside observer to measure. Randomness from these sources are * added to an "entropy pool", which is mixed using a CRC-like function. * This is not cryptographically strong, but it is adequate assuming * the randomness is not chosen maliciously, and it is fast enough that * the overhead of doing it on every interrupt is very reasonable. * As random bytes are mixed into the entropy pool, the routines keep * an *estimate* of how many bits of randomness have been stored into * the random number generator's internal state. * * When random bytes are desired, they are obtained by taking the SHA * hash of the contents of the "entropy pool". The SHA hash avoids * exposing the internal state of the entropy pool. It is believed to * be computationally infeasible to derive any useful information * about the input of SHA from its output. Even if it is possible to * analyze SHA in some clever way, as long as the amount of data * returned from the generator is less than the inherent entropy in * the pool, the output data is totally unpredictable. For this * reason, the routine decreases its internal estimate of how many * bits of "true randomness" are contained in the entropy pool as it * outputs random numbers. * * If this estimate goes to zero, the routine can still generate * random numbers; however, an attacker may (at least in theory) be * able to infer the future output of the generator from prior * outputs. This requires successful cryptanalysis of SHA, which is * not believed to be feasible, but there is a remote possibility. * Nonetheless, these numbers should be useful for the vast majority * of purposes. * * Exported interfaces ---- output * =============================== * * There are three exported interfaces; the first is one designed to * be used from within the kernel: * * void get_random_bytes(void *buf, int nbytes); * * This interface will return the requested number of random bytes, * and place it in the requested buffer. * * The two other interfaces are two character devices /dev/random and * /dev/urandom. /dev/random is suitable for use when very high * quality randomness is desired (for example, for key generation or * one-time pads), as it will only return a maximum of the number of * bits of randomness (as estimated by the random number generator) * contained in the entropy pool. * * The /dev/urandom device does not have this limit, and will return * as many bytes as are requested. As more and more random bytes are * requested without giving time for the entropy pool to recharge, * this will result in random numbers that are merely cryptographically * strong. For many applications, however, this is acceptable. * * Exported interfaces ---- input * ============================== * * The current exported interfaces for gathering environmental noise * from the devices are: * * void add_keyboard_randomness(unsigned char scancode); * void add_mouse_randomness(__u32 mouse_data); * void add_interrupt_randomness(int irq); * void add_blkdev_randomness(int irq); * * add_keyboard_randomness() uses the inter-keypress timing, as well as the * scancode as random inputs into the "entropy pool". * * add_mouse_randomness() uses the mouse interrupt timing, as well as * the reported position of the mouse from the hardware. * * add_interrupt_randomness() uses the inter-interrupt timing as random * inputs to the entropy pool. Note that not all interrupts are good * sources of randomness! For example, the timer interrupts is not a * good choice, because the periodicity of the interrupts is to * regular, and hence predictable to an attacker. Disk interrupts are * a better measure, since the timing of the disk interrupts are more * unpredictable. * * add_blkdev_randomness() times the finishing time of block requests. * * All of these routines try to estimate how many bits of randomness a * particular randomness source. They do this by keeping track of the * first and second order deltas of the event timings. * * Ensuring unpredictability at system startup * ============================================ * * When any operating system starts up, it will go through a sequence * of actions that are fairly predictable by an adversary, especially * if the start-up does not involve interaction with a human operator. * This reduces the actual number of bits of unpredictability in the * entropy pool below the value in entropy_count. In order to * counteract this effect, it helps to carry information in the * entropy pool across shut-downs and start-ups. To do this, put the * following lines an appropriate script which is run during the boot * sequence: * * echo "Initializing random number generator..." * random_seed=/var/run/random-seed * # Carry a random seed from start-up to start-up * # Load and then save 512 bytes, which is the size of the entropy pool * if [ -f $random_seed ]; then * cat $random_seed >/dev/urandom * fi * dd if=/dev/urandom of=$random_seed count=1 * chmod 600 $random_seed * * and the following lines in an appropriate script which is run as * the system is shutdown: * * # Carry a random seed from shut-down to start-up * # Save 512 bytes, which is the size of the entropy pool * echo "Saving random seed..." * random_seed=/var/run/random-seed * dd if=/dev/urandom of=$random_seed count=1 * chmod 600 $random_seed * * For example, on most modern systems using the System V init * scripts, such code fragments would be found in * /etc/rc.d/init.d/random. On older Linux systems, the correct script * location might be in /etc/rcb.d/rc.local or /etc/rc.d/rc.0. * * Effectively, these commands cause the contents of the entropy pool * to be saved at shut-down time and reloaded into the entropy pool at * start-up. (The 'dd' in the addition to the bootup script is to * make sure that /etc/random-seed is different for every start-up, * even if the system crashes without executing rc.0.) Even with * complete knowledge of the start-up activities, predicting the state * of the entropy pool requires knowledge of the previous history of * the system. * * Configuring the /dev/random driver under Linux * ============================================== * * The /dev/random driver under Linux uses minor numbers 8 and 9 of * the /dev/mem major number (#1). So if your system does not have * /dev/random and /dev/urandom created already, they can be created * by using the commands: * * mknod /dev/random c 1 8 * mknod /dev/urandom c 1 9 * * Acknowledgements: * ================= * * Ideas for constructing this random number generator were derived * from Pretty Good Privacy's random number generator, and from private * discussions with Phil Karn. Colin Plumb provided a faster random * number generator, which speed up the mixing function of the entropy * pool, taken from PGPfone. Dale Worley has also contributed many * useful ideas and suggestions to improve this driver. * * Any flaws in the design are solely my responsibility, and should * not be attributed to the Phil, Colin, or any of authors of PGP. * * The code for SHA transform was taken from Peter Gutmann's * implementation, which has been placed in the public domain. * The code for MD5 transform was taken from Colin Plumb's * implementation, which has been placed in the public domain. The * MD5 cryptographic checksum was devised by Ronald Rivest, and is * documented in RFC 1321, "The MD5 Message Digest Algorithm". * * Further background information on this topic may be obtained from * RFC 1750, "Randomness Recommendations for Security", by Donald * Eastlake, Steve Crocker, and Jeff Schiller. */ /* * The pool is stirred with a primitive polynomial of degree POOLWORDS * over GF(2). The taps for various sizes are defined below. They are * chosen to be evenly spaced (minimum RMS distance from evenly spaced; * the numbers in the comments are a scaled squared error sum) except * for the last tap, which is 1 to get the twisting happening as fast * as possible. */ /* * For the purposes of better mixing, we use the CRC-32 polynomial as * well to make a twisted Generalized Feedback Shift Reigster * * (See M. Matsumoto & Y. Kurita, 1992. Twisted GFSR generators. ACM * Transactions on Modeling and Computer Simulation 2(3):179-194. * Also see M. Matsumoto & Y. Kurita, 1994. Twisted GFSR generators * II. ACM Transactions on Mdeling and Computer Simulation 4:254-266) * * Thanks to Colin Plumb for suggesting this. * We have not analyzed the resultant polynomial to prove it primitive; * in fact it almost certainly isn't. Nonetheless, the irreducible factors * of a random large-degree polynomial over GF(2) are more than large enough * that periodicity is not a concern. * * The input hash is much less sensitive than the output hash. All that * we want of it is that it be a good non-cryptographic hash; i.e. it * not produce collisions when fed "random" data of the sort we expect * to see. As long as the pool state differs for different inputs, we * have preserved the input entropy and done a good job. The fact that an * intelligent attacker can construct inputs that will produce controlled * alterations to the pool's state is not important because we don't * consider such inputs to contribute any randomness. * The only property we need with respect to them is * that the attacker can't increase his/her knowledge of the pool's state. * Since all additions are reversible (knowing the final state and the * input, you can reconstruct the initial state), if an attacker has * any uncertainty about the initial state, he/she can only shuffle that * uncertainty about, but never cause any collisions (which would * decrease the uncertainty). * * The chosen system lets the state of the pool be (essentially) the input * modulo the generator polymnomial. Now, for random primitive polynomials, * this is a universal class of hash functions, meaning that the chance * of a collision is limited by the attacker's knowledge of the generator * polynomail, so if it is chosen at random, an attacker can never force * a collision. Here, we use a fixed polynomial, but we *can* assume that * ###--> it is unknown to the processes generating the input entropy. <-### * Because of this important property, this is a good, collision-resistant * hash; hash collisions will occur no more often than chance. */ /* * This function adds a byte into the entropy "pool". It does not * update the entropy estimate. The caller must do this if appropriate. * * This function is tuned for speed above most other considerations. * * The pool is stirred with a primitive polynomial of the appropriate degree, * and then twisted. We twist by three bits at a time because it's * cheap to do so and helps slightly in the expected case where the * entropy is concentrated in the low-order bits. */ /* * This function adds entropy to the entropy "pool" by using timing * delays. It uses the timer_rand_state structure to make an estimate * of how many bits of entropy this call has added to the pool. * * The number "num" is also added to the pool - it should somehow describe * the type of event which just happened. This is currently 0-255 for * keyboard scan codes, and 256 upwards for interrupts. * On the i386, this is assumed to be at most 16 bits, and the high bits * are used for a high-resolution timer. * * * The function hashes the input data to produce a digest in the first * HASH_BUFFER_SIZE words of the digest[] array, and uses HASH_EXTRA_SIZE * more words for internal purposes. (This buffer is exported so the * caller can wipe it once rather than this code doing it each call, * and tacking it onto the end of the digest[] array is the quick and * dirty way of doing it.) * * It so happens that MD5 and SHA share most of the initial vector * used to initialize the digest[] array before the first call: * 1) 0x67452301 * 2) 0xefcdab89 * 3) 0x98badcfe * 4) 0x10325476 * 5) 0xc3d2e1f0 (SHA only) * * For /dev/random purposes, the length of the data being hashed is * fixed in length (at POOLWORDS words), so appending a bit count in * the usual way is not cryptographically necessary. * As we hash the pool, we mix intermediate values of * the hash back into the pool. This eliminates * backtracking attacks (where the attacker knows * the state of the pool plus the current outputs, and * attempts to find previous ouputs), unless the hash * function can be inverted. * TCP initial sequence number picking. This uses the random number * generator to pick an initial secret value. This value is hashed * along with the TCP endpoint information to provide a unique * starting point for each pair of TCP endpoints. This defeats * attacks which rely on guessing the initial TCP sequence number. * This algorithm was suggested by Steve Bellovin. * * Using a very strong hash was taking an appreciable amount of the total * TCP connection establishment time, so this is a weaker hash, * compensated for by changing the secret periodically.