Random Number In The Range 1 To Sys.maxsize Is Always 1 Mod 2^10
Solution 1:
@roeland hinted at the cause: in Python 2, sample()
uses int(random.random() * n)
repeatedly. Look at the source code (in your Python's Lib/random.py
) for full details. In short, random.random()
returns no more than 53 significant (non-zero) leading bits; then int()
fills the rest of the low-order bits with zeroes (you're obviously on a machine where sys.maxsize == 2**63 - 1
); then indexing your base (xrange(1, sys.maxsize)
) by an even integer with "a lot" of of low-order 0 bits always returns an odd integer with the same number of low-order 0 bits (except for the last).
In Python 3 none of that happens - random
in Python 3 uses stronger algorithms, and only falls back to random.random()
when necessary. For example, here under Python 3.4.3:
>>> hex(random.randrange(10**70))
'0x91fc11ed768be3a454bd66f593c218d8bbfa3b99f6285291e1d9f964a9'>>> hex(random.randrange(10**70))
'0x7b07ff02b6676801e33094fca2fcca7f6e235481c479c521643b1acaf4'
EDIT
Here's a more directly relevant example, under 3.4.3 on a 64-bit box:
>>>import random, sys>>>sys.maxsize == 2**63 - 1
True
>>>for i in random.sample(range(1, sys.maxsize), 6):...print(bin(i))
0b10001100101001001111110110011111000100110100111001100000010110
0b100111100110110100111101001100001100110001110010000101101000101
0b1100000001110000110100111101101010110001100110101111011100111
0b111110100001111100101001001001101101100100011001001010100001110
0b1100110100000011100010000011010010100100110111001111100110100
0b10011010000110101010101110001000101110111100100001111101110111
Python 3 doesn't invoke random.random()
at all in this case, but instead iteratively grabs chunks of 32 bits from the underlying Mersenne Twister (32-bit unsigned ints are "the natural" outputs from this implementation of MT) , pasting them together to build a suitable index. So, in Python 3, platform floats have nothing to do with it; in Python 2, quirks of float behavior have everything to do with it.
Solution 2:
That depends on a lot of things, like how exactly the RNG is implemented, how much bits of state it uses, and how exactly the sample
function is implemented.
Here's what the documentation says:
Almost all module functions depend on the basic function random(), which generates a random float uniformly in the semi-open range [0.0, 1.0). Python uses the Mersenne Twister as the core generator. It produces 53-bit precision floats and has a period of 2**19937-1.
So if the sample
indeed uses random()
under the hood, then you should only expect 53 bits of meaningful bits in your result.
Solution 3:
It certainly looks like rounding error in random.sample.
The bottom 4 or so bits are always zero after the multiplication by the spread of the range (maxsize -1
) then when the start of the range (1
) is added they are always 1
if the multiplication was working correctly, given that the spread is not a power of two, and given that the random number only has 53 varying bits I'd expect to see varying values in the rightmost bits too.
Post a Comment for "Random Number In The Range 1 To Sys.maxsize Is Always 1 Mod 2^10"