Skip to content Skip to sidebar Skip to footer

Random Number In The Range 1 To Sys.maxsize Is Always 1 Mod 2^10

I am trying to find the statistical properties of the PRNGs available in Python (2.7.10) by using the frequency test, runs test and the chi squared test. For carrying out the frequ

Solution 1:

@roeland hinted at the cause: in Python 2, sample() uses int(random.random() * n) repeatedly. Look at the source code (in your Python's Lib/random.py) for full details. In short, random.random() returns no more than 53 significant (non-zero) leading bits; then int() fills the rest of the low-order bits with zeroes (you're obviously on a machine where sys.maxsize == 2**63 - 1); then indexing your base (xrange(1, sys.maxsize)) by an even integer with "a lot" of of low-order 0 bits always returns an odd integer with the same number of low-order 0 bits (except for the last).

In Python 3 none of that happens - random in Python 3 uses stronger algorithms, and only falls back to random.random() when necessary. For example, here under Python 3.4.3:

>>> hex(random.randrange(10**70))
'0x91fc11ed768be3a454bd66f593c218d8bbfa3b99f6285291e1d9f964a9'>>> hex(random.randrange(10**70))
'0x7b07ff02b6676801e33094fca2fcca7f6e235481c479c521643b1acaf4'

EDIT

Here's a more directly relevant example, under 3.4.3 on a 64-bit box:

>>>import random, sys>>>sys.maxsize == 2**63 - 1
True
>>>for i in random.sample(range(1, sys.maxsize), 6):...print(bin(i))
0b10001100101001001111110110011111000100110100111001100000010110
0b100111100110110100111101001100001100110001110010000101101000101
0b1100000001110000110100111101101010110001100110101111011100111
0b111110100001111100101001001001101101100100011001001010100001110
0b1100110100000011100010000011010010100100110111001111100110100
0b10011010000110101010101110001000101110111100100001111101110111

Python 3 doesn't invoke random.random() at all in this case, but instead iteratively grabs chunks of 32 bits from the underlying Mersenne Twister (32-bit unsigned ints are "the natural" outputs from this implementation of MT) , pasting them together to build a suitable index. So, in Python 3, platform floats have nothing to do with it; in Python 2, quirks of float behavior have everything to do with it.

Solution 2:

That depends on a lot of things, like how exactly the RNG is implemented, how much bits of state it uses, and how exactly the sample function is implemented.

Here's what the documentation says:

Almost all module functions depend on the basic function random(), which generates a random float uniformly in the semi-open range [0.0, 1.0). Python uses the Mersenne Twister as the core generator. It produces 53-bit precision floats and has a period of 2**19937-1.

So if the sample indeed uses random() under the hood, then you should only expect 53 bits of meaningful bits in your result.

Solution 3:

It certainly looks like rounding error in random.sample.

The bottom 4 or so bits are always zero after the multiplication by the spread of the range (maxsize -1) then when the start of the range (1) is added they are always 1

if the multiplication was working correctly, given that the spread is not a power of two, and given that the random number only has 53 varying bits I'd expect to see varying values in the rightmost bits too.

Post a Comment for "Random Number In The Range 1 To Sys.maxsize Is Always 1 Mod 2^10"