Published by Mildred Merritt. Modified over 7 years ago.
1
Random Numbers: When you need a little bit of chaos in your application.
Pedro Giffuni, pfg @ apache.org, Apache OpenOffice committer
2
[How this started]
AOO issue: Calc's RAND() behaves poorly on most platforms.
  RAND() uses a bad pseudo-random generator for its "random" numbers. Even with as few as 1000 records, the "random" numbers repeat several times. (Version 3.2 of OpenOffice has the same "feature", so it is not a new one.)
OpenBSD’s “random in the wild” blog post (tedunangst), based on Theo’s “random hunt” posting.
Many applications tend to implement their own:
  PlayStation: uses FreeBSD’s libc but carries an additional implementation (Mersenne Twister).
  Monte Carlo analysis and simulators (Spice) tend to use their own code.
3
According to a random definition
A numeric sequence is said to be statistically random when it contains no recognizable patterns or regularities; sequences such as the results of an ideal dice roll, or the digits of π, exhibit statistical randomness (from Wikipedia).
Code to generate such sequences is ubiquitous: ANSI C, C++, Boost, OpenOffice Calc RAND().
Encryption requires its own number generators: encrypted text is meant to look like PRNG output (ZFS talk).
4
So how to generate pseudo-random numbers: back to π
π is the constant value of the division C/d (a circle's circumference over its diameter).
π is an irrational number, meaning that it cannot be written as the ratio of two integers. …
5
Some notes on the Pi number method
Pi is not the only irrational number; it just happens to be a natural example, and independent of accidental “physics”. Near to fractals?
Computationally a no-go: numerically intensive, still stuff for supercomputers.
There was a time when you would buy books with tables of random numbers.
We are actually doing a division, and the result comes from the modulus operation. Perhaps this can be done faster in some way, even if the sequence is not completely irrational?
Orthogonal to real “noise”, which is not necessarily random. See RFC 4086.
6
Random number generators in libc
FreeBSD version lives in lib/libc/stdlib/rand.c
Based on "Random number generators: good ones are hard to find", Park and Miller, Communications of the ACM, vol. 31, no. 10, October 1988:
  /*
   * Compute x = (7^5 * x) mod (2^31 - 1)
   * without overflowing 31 bits:
   *      (2^31 - 1) = 127773 * (7^5) + 2836
   * …
   */
The basic algorithm was created by D. H. Lehmer; it is also known as the Linear Congruential Generator (LCG).
Subject to the Hull-Dobell theorem for maximizing periods.
Stores state in 32 or 64 bits. Maximum period is 2^31.
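The overflow-free step described in the comment above can be sketched as follows. This is a minimal illustration of the decomposition (often called Schrage's method), not the FreeBSD source; the names `park_miller_next` and `park_miller_skip` are hypothetical, and the quotient and remainder are computed by the preprocessor from the two constants on the slide.

```c
#include <stdint.h>

#define PM_A 16807            /* 7^5, the multiplier */
#define PM_M 2147483647       /* 2^31 - 1, the modulus */
#define PM_Q (PM_M / PM_A)    /* quotient of the decomposition */
#define PM_R (PM_M % PM_A)    /* remainder of the decomposition */

/* One step of x = (7^5 * x) mod (2^31 - 1) for 1 <= x < PM_M.
 * Splitting x by PM_Q keeps every intermediate value within 31 bits. */
static int32_t park_miller_next(int32_t x)
{
    int32_t hi = x / PM_Q;
    int32_t lo = x % PM_Q;
    int32_t t = PM_A * lo - PM_R * hi;
    return t > 0 ? t : t + PM_M;
}

/* Hypothetical helper: advance the generator n steps from a seed. */
static int32_t park_miller_skip(int32_t seed, int n)
{
    int32_t x = seed;
    for (int i = 0; i < n; i++)
        x = park_miller_next(x);
    return x;
}
```

Park and Miller give a check value for this generator: starting from 1, the state after 10000 steps should be 1043618065, which is a convenient self-test for any reimplementation.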
7
Problems of the LCG
While fast, not very good quality: there is a period.
No restitution (not necessarily a problem).
Period not easy to determine: depends on parameters.
Common error: use of non-prime numbers.
Passes only basic tests: no Monte Carlo analysis, not crypto-ready.
Seed is limited to 32 bits: in the case of libc, this is a standards-imposed limitation.
Seed happens to be important: may need “warm-up”.
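How strongly the period depends on the parameters can be seen on a toy generator. The sketch below is an illustration only: `lcg_period` is a hypothetical helper, the modulus 16 is chosen to keep the cycles countable by hand, and the function assumes the multiplier is coprime to the modulus so that the seed lies on its own cycle.

```c
#include <stdint.h>

/* Count the steps of x -> (a*x + c) mod m until the seed reappears.
 * Assumes gcd(a, m) == 1, so the map is a permutation and the walk
 * returns to the seed; the steps < m guard only bounds the loop. */
static uint32_t lcg_period(uint32_t a, uint32_t c, uint32_t m, uint32_t seed)
{
    uint32_t start = seed % m;
    uint32_t x = start;
    uint32_t steps = 0;
    do {
        x = (uint32_t)(((uint64_t)a * x + c) % m);
        steps++;
    } while (x != start && steps < m);
    return steps;
}
```

With m = 16 and c = 3, the multiplier a = 5 satisfies the Hull-Dobell conditions and visits all 16 values, while a = 7 (where a - 1 is not divisible by 4) falls into a cycle of length 4 from seed 0.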
8
Theo de Raadt’s hunt for seed values
This is a rough sorting of the popular 'constant' seeds people like to use.
   1 srand (42)
   1 srand(0x1234)
   1 srand(0xabad1dea)
   1 srand(1)
   1 srand(100)
   1 srand(121212)
   1 srand(14878)
   1 srand(1729)
   1 srand(1806)
   1 srand(180673)
   1 srand(1969)
   1 srand(301)
   1 srand(9)
   1 srandom(0)
   1 srandom(10)
   1 srandom(12346)
   1 srandom(4)
   1 srandom(7)
   1 srandom(8)
   2 srand (0x31DF840C)
   2 srand( 0 )
   2 srand(42)
   3 srand (0)
   3 srandom(0xDEADBEEF)
   3 srandom(1)
   5 srand( L)
   7 srand( )
   7 srandom(27)
  17 srand(380843)
  21 srand(0)
Beyond that, the "improvements" are pretty much time(NULL) and getpid() constructs.
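Why a constant seed matters can be shown in a few lines: under ANSI C semantics, the same argument to srand() replays the same sequence from rand(). The helper names below are hypothetical, and the sketch assumes a libc that follows the standard here; according to the closing slide, OpenBSD deliberately departs from this behaviour.

```c
#include <stdlib.h>

/* Store the first n values returned by rand() after srand(seed). */
static void first_values(unsigned seed, int *out, int n)
{
    srand(seed);
    for (int i = 0; i < n; i++)
        out[i] = rand();
}

/* Return 1 when two runs with the same seed produce identical values. */
static int same_seed_same_sequence(unsigned seed)
{
    int a[8], b[8];
    first_values(seed, a, 8);
    first_values(seed, b, 8);
    for (int i = 0; i < 8; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

Anyone who knows or guesses the constant can therefore reproduce every "random" value the program will ever draw.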
9
Mersenne-Twister (1997)
Widely available: included in the PlayStation 4, standard in the C++ STL.
Huge period: 2^19937 − 1.
Relatively fast: accelerated versions available.
Great for Monte Carlo analysis.
Passes *almost* all the tests.
Not crypto-grade: used in challenges for “newbie” hackers.
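For reference, a compact sketch of the 32-bit Mersenne Twister (MT19937) is given below. It follows the published recurrence and tempering constants, but the function names are hypothetical and it keeps a single global state for brevity; production code should use an existing implementation such as the one in Boost or the C++ standard library.

```c
#include <stdint.h>

#define MT_N 624   /* state size in 32-bit words */
#define MT_M 397   /* offset used by the recurrence */

static uint32_t mt_state[MT_N];
static int mt_index = MT_N;

/* Fill the state from a single 32-bit seed. Call this before mt_next(). */
static void mt_seed(uint32_t seed)
{
    mt_state[0] = seed;
    for (int i = 1; i < MT_N; i++) {
        uint32_t prev = mt_state[i - 1];
        mt_state[i] = 1812433253u * (prev ^ (prev >> 30)) + (uint32_t)i;
    }
    mt_index = MT_N;  /* force a refill on the next call */
}

/* Regenerate all 624 state words in place. */
static void mt_refill(void)
{
    for (int k = 0; k < MT_N; k++) {
        uint32_t y = (mt_state[k] & 0x80000000u)
                   | (mt_state[(k + 1) % MT_N] & 0x7fffffffu);
        uint32_t next = mt_state[(k + MT_M) % MT_N] ^ (y >> 1);
        if (y & 1u)
            next ^= 0x9908b0dfu;
        mt_state[k] = next;
    }
    mt_index = 0;
}

/* Return the next tempered 32-bit output. */
static uint32_t mt_next(void)
{
    if (mt_index >= MT_N)
        mt_refill();
    uint32_t y = mt_state[mt_index++];
    y ^= y >> 11;
    y ^= (y << 7) & 0x9d2c5680u;
    y ^= (y << 15) & 0xefc60000u;
    y ^= y >> 18;
    return y;
}
```

The tempering step is a fixed, invertible bit transformation, so observing 624 consecutive outputs is enough to recover the full state; this is why the generator is unsuitable for cryptographic use.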
10
Xorshift randomizers
Discovered by George Marsaglia (2003): "Xorshift RNGs", Journal of Statistical Software, July 2003.
Mostly equivalent to LCGs: also easily computable.
Many interesting variants.
Like the LCGs, they don’t pass all the statistical tests and are not crypto-quality.
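A minimal 32-bit xorshift step, using the (13, 17, 5) shift triple from Marsaglia's paper, might look like the sketch below. The function name is hypothetical and the state is passed explicitly instead of being kept in a global.

```c
#include <stdint.h>

/* One xorshift step with shifts 13, 17, 5.
 * The state must be non-zero: zero maps to zero forever. */
static uint32_t xorshift32(uint32_t x)
{
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return x;
}
```

Three shifts and three XORs per output make this about as cheap as an LCG step; with a non-zero seed the sequence cycles through all 2^32 − 1 non-zero 32-bit values.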
11
What we did for Apache OpenOffice
The problem is clearly a dependency on the system-provided PRNG.
Calc’s RAND() and RANDBETWEEN() are affected.
Generating seeds with SAL’s rtl_random() seems reasonable.
Much more needed to be done *now*.
Implemented (copy/paste) the 2006 PRNG variant designed by B. A. Wichmann and I. D. Hill (r ). It appeared to be used by Microsoft Office, according to an online link.
Replaced WH2006 with the Mersenne Twister from Boost (r ).
Replaced MT with KISS from George Marsaglia (r ).
Enhanced seeding with rtl_random().
12
KISS
// Congruential
#define CNG ( ScCNG = … * ScCNG )
// Xorshift
#define XS ( ScXS ^= (ScXS << 13), ScXS ^= (ScXS >> 17), ScXS ^= (ScXS << 5) )
#define KISS (b32MWC() + CNG + XS)

void ScInterpreter::ScRandom()
{
    RTL_LOGFILE_CONTEXT_AUTHOR( aLogger, "sc", "pfg", "ScInterpreter::ScRandom" );
    static sal_Bool SqSeeded = sal_False;
    static sal_uInt32 ScCNG, ScXS = …;
    // Seeding for the PRNG
    if (SqSeeded == sal_False) {
        rtlRandomPool aPool = rtl_random_createPool();
        rtl_random_getBytes(aPool, &ScCNG, sizeof(ScCNG));
        rtl_random_getBytes(aPool, &nScRandomQ, sizeof(nScRandomQ[0]) * SCRANDOMQ_SIZE);
        rtl_random_destroyPool(aPool);
        SqSeeded = sal_True;
    }
    PushDouble(static_cast<double>(KISS) / SAL_MAX_UINT32);
}
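The last line of the slide maps a 32-bit integer to a floating-point value by dividing by SAL_MAX_UINT32. The sketch below, written in plain C with hypothetical function names, shows the effect of the divisor under the assumption that SAL_MAX_UINT32 equals 2^32 − 1: dividing by the maximum value yields the closed interval [0, 1], whereas dividing by 2^32 would keep the result strictly below 1.

```c
#include <stdint.h>

/* Divide by 2^32 - 1: the result lies in [0, 1] and can be exactly 1.0. */
static double to_unit_closed(uint32_t r)
{
    return (double)r / 4294967295.0;
}

/* Divide by 2^32: the result lies in [0, 1) and never reaches 1.0. */
static double to_unit_half_open(uint32_t r)
{
    return (double)r / 4294967296.0;
}
```

Which interval is wanted depends on the function being implemented; code that later computes something like `floor(x * n)` to pick one of n buckets usually needs the half-open form to avoid an out-of-range index.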
13
Lessons learned
You are very probably using the wrong PRNG, and you are using it wrongly too.
OpenOffice is widely general-purpose, but some of the above may apply to other function implementations as well.
Still not a complete solution: there are OS-dependent issues, but the problem has been moved to SAL.
Amusement with OpenBSD’s “solution”, breaking ANSI C. Perhaps also a layering violation: the OS can’t know better than the application.