Why is rand()%6 biased?
When rand() is used to generate random numbers and the modulo operation is applied to obtain a value in the range 0 to 5, the result can be biased. The bias arises from a mismatch between the number of values rand() can return and the modulus.

rand() returns an integer in the range [0, RAND_MAX], where RAND_MAX is an implementation-defined constant (for example, 32767 on many systems). Computing rand() % 6 folds those values into the range 0 to 5.

The issue is that rand() can return RAND_MAX + 1 = 32768 distinct values (assuming RAND_MAX is 32767), and 32768 is not divisible by 6: 32768 = 6 × 5461 + 2. Consequently, some results in the range 0 to 5 are produced by one more input value than others.

Specifically, the values 0 through 32765 form 5461 complete cycles of the remainders 0, 1, 2, 3, 4, 5, so each result is produced by 5461 of them. The two leftover values, 32766 and 32767, give 32766 % 6 = 0 and 32767 % 6 = 1, so the results 0 and 1 are each produced by 5462 values.

As a result, rand() % 6 is not uniformly distributed even if rand() itself is. The results 0 and 1 each have probability 5462/32768, slightly higher than the 5461/32768 of the other results (2, 3, 4, 5).

To obtain a more uniform distribution, the following methods can be used:

- Use a better random number generator together with a proper distribution, such as the Mersenne Twister (available in C++ through the <random> header as std::mt19937) combined with std::uniform_int_distribution. Taking the output of a better generator modulo 6 would still leave a small bias, because its range is not a multiple of 6 either.
- Use rejection sampling, i.e., compute the modulo only when rand() returns a value inside a range whose size is divisible by 6. For example, with RAND_MAX equal to 32767, compute rand() % 6 only when rand() returns a value less than 32766 (the largest multiple of 6 that does not exceed 32768), and draw again otherwise.

With these methods, the non-uniformity caused by the modulo operation can be avoided, giving more uniformly distributed random numbers.
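The counting argument and both remedies can be sketched in C++ as follows. This is a minimal illustration, not a definitive implementation: the function names (count_mod6_outcomes, roll_0_to_5, roll_with_rand, roll_with_mt19937) are hypothetical, and the rejection sampler assumes that the raw source really is uniform over [0, max_value] and that max_value + 1 does not overflow unsigned long.

```cpp
#include <array>
#include <cassert>
#include <cstdlib>
#include <random>

// Count how many raw values in [0, max_value] map to each result of value % 6.
// For max_value = 32767 the counts are 5462, 5462, 5461, 5461, 5461, 5461.
std::array<unsigned long, 6> count_mod6_outcomes(unsigned long max_value) {
    std::array<unsigned long, 6> counts{};  // zero-initialized
    for (unsigned long v = 0; v <= max_value; ++v) {
        ++counts[v % 6];
    }
    return counts;
}

// Rejection sampling on top of any callable `next_raw` that is assumed to
// return uniformly distributed integers in [0, max_value].
template <class Source>
int roll_0_to_5(Source&& next_raw, unsigned long max_value) {
    assert(max_value >= 5);  // need at least six distinct raw values
    const unsigned long range = max_value + 1;      // number of distinct raw values
    const unsigned long limit = range - range % 6;  // largest multiple of 6 <= range
    unsigned long r;
    do {
        r = static_cast<unsigned long>(next_raw());
    } while (r >= limit);  // discard the leftover values at the top and redraw
    return static_cast<int>(r % 6);
}

// Rejection sampling applied to std::rand() and RAND_MAX.
int roll_with_rand() {
    return roll_0_to_5([] { return std::rand(); }, RAND_MAX);
}

// Alternative using <random>: the distribution object handles the range
// mapping, so no manual modulo is needed.
int roll_with_mt19937(std::mt19937& engine) {
    std::uniform_int_distribution<int> dist(0, 5);
    return dist(engine);
}
```

With RAND_MAX equal to 32767, the rejection loop discards only 2 of the 32768 possible values, so a redraw is rare. To use roll_with_mt19937, an engine such as std::mt19937 seeded from std::random_device can be created once and reused across calls.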