alt.hn

2/18/2026 at 7:02:18 PM

What Every Experimenter Must Know About Randomization

https://spawn-queue.acm.org/doi/pdf/10.1145/3778029

by underscoreF

2/19/2026 at 1:57:36 AM

I'm no statistician, but the part about halfway through that says not to use PRNGs for random assignment into bins seems wrong to me?

Sure I can understand why for a research trial you might just want to be totally safe and use a source of true randomness, but for all practical purposes a decent PRNG used for sorting balls into buckets is totally indistinguishable from true randomness, is it not?

I was half expecting this to have been written a few decades ago when really bad PRNGs were in common usage, but the article seems to be timestamped 2025.

by p1necone

2/19/2026 at 3:06:15 AM

Plenty of previously well regarded PRNGs are distinguishable in quite surprising ways.

Perhaps you could say otherwise for a CSPRNG.

by nullc

2/19/2026 at 8:36:36 AM

In PRNGs there is a compromise between speed and the quality of their statistical properties.

So you must choose a PRNG wisely, depending on the intended purpose.

There are PRNGs good enough for any application, including those that use cryptographic mixing functions, but in many cases people prefer the fastest PRNGs.

The problems appear when the fastest PRNGs are used in applications for which they are not good enough, so the PRNG choice must be done carefully, whenever it is likely to matter.

With recent CPUs, the PRNG choice is much simpler than in the past. They can produce high quality random numbers by using AES at a rate only a few times lower than they can fill memory.

Because of this, the speed gap between the fastest PRNGs and good PRNGs has become much narrower than in the past. Therefore, if you choose a very good PRNG you do not lose much speed, so you can make this choice much more often.

Many kinds of non-cryptographic PRNGs have become obsolete, i.e. all those that are slower than the PRNGs using AES, SHA-2 or SHA-1, which use the dedicated hardware included in modern CPUs.

The non-cryptographic PRNGs that remain useful, due to superior speed, contain a linear congruential generator or a Galois field counter, which guarantee maximum period and allow sequence jumps and the ability to generate multiple independent random streams, together with some non-linear mixing function for the output, which improves the statistical properties.

by adrian_b

2/19/2026 at 1:33:17 PM

Note this doesn't apply to GPUs, among other things. There, counter-based PRNGs such as Philox, which employ a weakened cryptographic function, are useful.

by fc417fc802

2/19/2026 at 3:26:45 AM

For almost all practical purposes, a decent PRNG is just as good as a CSPRNG. Cryptographic security only becomes relevant in an adversarial situation. Otherwise, you would need whatever weakness exists in the PRNG to be somehow correlated with what you are trying to measure.

by gizmo686

2/19/2026 at 6:40:18 AM

If practical purposes include simulating physical processes, then the problems with PRNGs become quite important.

by contubernio

2/19/2026 at 7:11:30 AM

What is an example there?

by RandomLensman

2/19/2026 at 12:15:36 PM

It should be noted that while the article recommends TRNGs with radioactive substances, those are completely unnecessary.

Any high-gain electronic amplifier whose input is connected to a resistor, or for a higher signal level, to a diode, will produce copious amounts of random noise at its output. If the output is converted with a comparator to digital, you have a good source of random bits.

Older people can remember the random audio or video noise of ancient radio receivers or TV sets, when they were tuned outside a correct channel.

In the past, there were easily available analog TV tuners for PCs, which could be used as noise sources. Nowadays, it is still possible to use the microphone input of a PC and connect to it an external analog noise source with a negligible cost, made on a little PCB with a small amplifier IC, e.g. an operational amplifier or an audio amplifier. The microphone input of PCs provides a weak 5 V supply, which would be enough to power a noise source.

The only problem that exists with any analog noise source is that the imperfections of analog-to-digital conversion, e.g. the fluctuations of comparator thresholds, due to temperature or age, can cause a bias in the random bits, so they do not have a truly uniform distribution. This problem also exists with radioactive sources.

Thus the bits must be processed to remove any bias. There are more sophisticated methods, but even the brute force method of using a one-way hash function, e.g. one of the SHA-3 or SHA-2 variants, to hash the bits and produce a uniform random number, is good enough.
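
For illustration, here is a minimal Python sketch of that brute-force whitening step. It assumes a helper read_noise(n) that returns n raw bytes from the analog source (e.g. sampled from the microphone input); that helper and the oversampling factor are my assumptions, not something from this thread.

    import hashlib

    def whitened_random_bytes(read_noise, n_out=32, oversample=8):
        # Collect several times more raw (possibly biased) noise than we emit,
        # so the hashed output is close to uniform even if each raw bit carries
        # well under one bit of entropy.
        raw = read_noise(n_out * oversample)
        # Brute-force bias removal: one-way hash of the raw bits, as described above.
        return hashlib.sha3_256(raw).digest()[:n_out]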

Most modern CPUs, since Intel Ivy Bridge, have instructions for providing true random numbers. However, those are less useful outside their main intended application of providing temporary session keys for TLS or other network protocols.

Some CPUs, especially from AMD, had bugs in their RNG, so they provided bad values. Even where the TRNG is good, the output passes through AES-128. Its state is small so you may encounter the problems mentioned in the parent article, of unreachable parts of the solution space that you are investigating in some Monte Carlo simulation.

It is preferable to be able to choose the bias-removing method yourself, so that you can use a hashing method with a much larger state.

by adrian_b

2/19/2026 at 2:45:26 PM

> Any high-gain electronic amplifier whose input is connected to a resistor, or for a higher signal level, to a diode, will produce copious amounts of random noise at its output. If the output is converted with a comparator to digital, you have a good source of random bits.

These are so fun. One time I was able to get a device to replay an entire WPA2 handshake by triggering a signal generator with its reset line and feeding the output to its rng ADC. Good parlour trick that one.

by bobmcnamara

2/19/2026 at 3:37:44 AM

All models are wrong, some are useful.

Yes, there are theoretical issues with assuming PRNGs are truly random. However, there are also theoretical issues with assuming that Newton's law of universal gravitation is true.

I am confident in saying that more experiments have gone wrong due to not considering relativity than have gone wrong due to the proper usage of a statistically sound (even if not cryptographically so) PRNG.

I also suspect that both classes of errors are dwarfed by improper usage of PRNGs or by unsound PRNGs.

by gizmo686

2/19/2026 at 6:41:29 AM

Those are qualitatively different issues. Newtonian gravity is very precise in a certain regime according to centuries of experiments. The issues with specific PRNGs are in many cases not even carefully analyzed.

by contubernio

2/19/2026 at 9:17:09 AM

> After a proud surgeon delivered a lecture on a procedure he’d performed many times, a medical student asked whether half of eligible patients had been set aside as a baseline for comparison.

> “Of course not!” thundered the surgeon. “That would have doomed half of the patients to death!”

> Stunned silence filled the lecture hall, and the student softly asked,

> “Which half?”

damn that was good

by NooneAtAll3

2/19/2026 at 9:59:50 AM

Was it? I see the point the story is trying to make, but if you think about it, the story by itself just doesn't make any sense at all. The "Which half?" line sounds kinda clever, but what is it actually implying? That the student is unsure whether the surgeon killed all his patients?

by streetfighter64

2/19/2026 at 1:56:45 PM

> but what is it actually implying? That the student is unsure whether the surgeon killed all his patients?

yes, exactly that?

it's a parable about the null hypothesis - without data to compare against we have no idea whether the new procedure is actually any good (as in, better)

in reality the conversation would (should) soon turn towards observations by other surgeons, the general death rate of the disease, or even some basic theory - but simply saying "they would've died, trust me bro" is not it

also note that the surgeon is overreacting. "set aside" does not automatically mean "do nothing" - the control group should probably be given the previously accepted version of treatment, as that's what the new idea is trying to replace

by NooneAtAll3

2/19/2026 at 7:29:26 AM

One advantage of PRNG assignment is that you can reproduce your assignment procedure, whereas this is impossible with a true RNG. If you use a true RNG, how could you ever prove to anyone that that was what you actually did?

by kkwteh

2/19/2026 at 8:40:24 AM

You could say how you produced the random numbers and which random numbers came out. What you can't do, however, is then prove that you actually did that and only that, and not, for example, re-rolled the dice till the results looked right to you.

But you have the same issue with a PRNG, just boiled down to the seed value. Prove that I didn't cherry-pick seed 42 rather than choosing it at random; you can't.

The only way to protect against this is registering the study with a third party beforehand and letting them dice out the numbers as soon as the hashed data is there.

By registering your plan beforehand, someone can check if you altered it to cherry-pick or p-hack your results after the fact. By letting them choose the random numbers, you protect against you rerolling the numbers when inconvenient. By letting them only see the hashed data, you protect against them rerolling/altering based on their interest, and against you swapping out data point indices after the fact, since they can rehash the actual values and check if they match their selection.
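
A minimal Python sketch of the hash-commitment part of this, just to make the idea concrete. The per-record salt is an extra assumption of mine (it prevents guessing low-entropy records from their hashes); everything else follows the description above: the researcher sends commitments up front, the third party randomizes against them, and later anyone can rehash the revealed values to check that nothing was swapped.

    import hashlib, secrets

    def commit(record: bytes):
        # Researcher runs this per data point before the third party randomizes.
        salt = secrets.token_bytes(16)
        return salt, hashlib.sha256(salt + record).hexdigest()

    def verify(record: bytes, salt: bytes, commitment: str) -> bool:
        # Anyone can run this after the data are revealed.
        return hashlib.sha256(salt + record).hexdigest() == commitment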

For many experimental designs this could be overkill, but registering studies is becoming more common in medical studies. I doubt they go as far as I described with the random numbers.

by atoav

2/19/2026 at 1:43:04 PM

> The only way

Not so. Assuming I don't need a whole bunch of runs in the final paper I can use the zero seed for the published work. I don't think anyone will contest that.

by fc417fc802

2/19/2026 at 3:36:31 PM

We're talking about randomizing experiment participants into treatment and control group... What you're saying is equivalent to "I can use the same order of assignments for every experiment I do"...

by topaz0

2/19/2026 at 5:09:26 PM

I thought we were speaking more generally. In the specific case of assigning experimental subjects to groups at the time the experiment is performed, you'd want a "no tricks up my sleeve" number for the key, such as a hash of the date string in ISO standard format (i.e. yyyy-mm-dd). Your RNG is then AES_ECB( key=sha3(date_string), plaintext=serial_number ) using the serial number of the participant.

If you need to use a rejection method to achieve a uniform distribution you can do so via plaintext=( ( serial_number << 32 ) | sample_counter ).

By adhering to such a scheme it becomes extremely difficult for anyone to reasonably accuse you of underhanded tricks via RNG manipulation.
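
For concreteness, a sketch of that scheme in Python using the third-party cryptography package (the package choice and the function name are my assumptions; the scheme itself is as described above: key = SHA3 of the ISO date string, value = AES-ECB of a block built from the participant's serial number and a rejection counter).

    import hashlib
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    def participant_value(date_string, serial_number, sample_counter=0):
        # "Nothing up my sleeve" key: SHA3-256 of e.g. "2026-02-19" (32 bytes -> AES-256).
        key = hashlib.sha3_256(date_string.encode()).digest()
        # One 16-byte block per participant: serial number plus a counter for rejection.
        plaintext = ((serial_number << 32) | sample_counter).to_bytes(16, "big")
        enc = Cipher(algorithms.AES(key), modes.ECB()).encryptor()
        return int.from_bytes(enc.update(plaintext) + enc.finalize(), "big")

    # e.g. assign to treatment when participant_value(date, serial) % 2 == 0,
    # bumping sample_counter and re-drawing if a rejection step is needed.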

by fc417fc802

2/18/2026 at 9:29:41 PM

"If N = 300, even a 256-bit seed arbitrarily precludes all but an unknown, haphazardly selected, non-random, and infinitesimally small fraction of permissible assignments. This introduces enormous bias into the assignment process and makes total nonsense of the p-value computed by a randomization test."

The first sentence is obviously true, but I'm going to need to see some evidence for "enormous bias" and "total nonsense". Let's leave aside lousy/little/badly-seeded PRNGs. Are there any non-cryptographic examples in which a well-designed PRNG with 256 bits of well-seeded random state produces results different enough from a TRNG to be visible to a user?

by BigTTYGothGF

2/18/2026 at 10:59:40 PM

The argument against PRNGs this paper makes isn't that the PRNG produces results that can be distinguished from TRNG, but that the 256-bit seed deterministically chooses a single shuffling. If you need 300 bits to truly shuffle the assignment but you only have 256 bits, then that's a lot of potential assignments that can never actually happen. With this argument it doesn't matter what the PRNG is, the fact that it's deterministic is all that matters. And this invalidates the p-value because the p-value assumes that all possible assignments are equiprobable, when in fact a lot of possible assignments have a probability of zero.

I imagine you could change the p-value test to randomly sample assignments generated via the exact same process that was used to generate the assignment used in the experiment; as you run more and more iterations of this, the calculated p-value should converge to the correct value. But then the question becomes: is the p-value calculated this way the same as the p-value you'd get if you actually went ahead and used equiprobable assignment to begin with?
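
Something like the following sketch, perhaps, where assign(rng, n) stands in for whatever PRNG-driven assignment procedure the experiment actually used (the statistic, the helper names, and the add-one correction are my assumptions, not anything specified in this thread).

    import random

    def permutation_p_value(values, treated_idx, assign, n_iter=10_000, seed=0):
        rng = random.Random(seed)

        def stat(idx):
            # Absolute difference in group means for a given treated index set.
            treated = [values[i] for i in idx]
            control = [values[i] for i in range(len(values)) if i not in idx]
            return abs(sum(treated) / len(treated) - sum(control) / len(control))

        observed = stat(set(treated_idx))
        # Re-run the same assignment procedure many times and count how often it
        # produces a statistic at least as extreme as the one observed.
        hits = sum(stat(set(assign(rng, len(values)))) >= observed
                   for _ in range(n_iter))
        return (hits + 1) / (n_iter + 1)   # add-one so the estimate is never exactly 0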

Ultimately, this all comes down to the fact that it's not hard to use true randomness for the whole thing, and true randomness produces statistically valid results. If you use true randomness for assignment then you can't screw up the p-value test, so there's no reason at all to even consider how to safely use a PRNG here; all that does is open the door to messing up.

by lilyball

2/18/2026 at 11:04:30 PM

If you have 300 bits of shuffling entropy, you have a lot of potential assignments that can never happen because you won't test them before the universe runs out. No matter how you pick them.

Of course a PRNG generates the same sequence every time with the same seed, but that's true of every RNG, even a TRNG where the "seed" is your current space and time coordinates. To get more results from the distribution you have to use more seeds. You can't just run an RNG once, get some value, and then declare the RNG is biased towards the value you got. That's not a useful definition of bias.

by gzread

2/19/2026 at 12:13:15 AM

The number of possible assignments has to be effectively close to an integer multiple of the number of shuffles.

It doesn't matter how many universes it would take to generate all of them, there are some assignments that are less likely.

by jmalicki

2/19/2026 at 8:19:36 AM

Everyone agrees that most of the possible shuffles become impossible when a CSPRNG with 256 bits of state is used. The question is just whether that matters practically. The original author seems to imply it does, but I believe they're mistaken.

Perhaps it would help to think of the randomization in two stages. In the first, we select 2^256 members from the set of all possible permutations. (This happens when we select our CSPRNG algorithm.) In the second, we select a single member from the new set of 2^256. (This happens when we select our seed and run the CSPRNG.) I believe that measurable structure in either selection would imply a practical attack on the cryptographic algorithm used in the CSPRNG, which isn't known to exist for any common such algorithm.

by tripletao

2/19/2026 at 10:13:28 AM

Yeah, you're discarding almost all permutations, but in an unbiased manner. It seems to imply not only an attack, but that your experimental results rely strongly and precisely on some extremely esoteric (otherwise it would've been found already) property of the randomization algorithm. If you can only detect the effect of television on turkeys when using a PRNG whose output is appropriately likely to have a high-dimensional vector space when formatted as a binary square matrix, then I think you should probably go back to the drawing board.

by recursivecaveat

2/19/2026 at 12:42:08 PM

The cases that are not close to a multiple are handled by rejecting a part of the generated random numbers.

Let's say that you have a uniform random number generator, which generates with equal probability any one of N numbers. Then you want to choose with equal probability one of M choices.

If M divides N, then you can choose 1 of M either by multiplication, taking the integer part (floor(x*M/N), where x is the generated number), or by division, taking the remainder (x mod M).

When M does not divide N, for unbiased choices you must reject a part of the generated numbers, either rejecting them before the arithmetic operation (equivalent to diminishing N to a multiple of M), or rejecting them after the arithmetic operation (diminishing the maximum value of the integer part of the product, or of the division remainder, to match M).

This is enough for handling the case when M < N.

When M is greater than N, you can use a power of N that is greater than M (i.e. you use a tuple of numbers for making the choice), and you do the same as before.

However in this case you must trust your RNG that its output sequence is not auto-correlated.

If possible, using a bigger N from the start is preferable, but even when that is impossible, in most cases the unreachable parts of the space of random-number tuples will not make any statistical difference.

To be more certain of this, you may want to repeat the experiment with several of the generators with the largest N available, taking care that they really have different structures, so that it can be expected that whichever is the inaccessible tuple space, it is not the same.
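
A minimal Python sketch of the rejection step described above, for the M < N case; rand_below_n is assumed to return a uniform integer in [0, N).

    def choose_unbiased(rand_below_n, N, M):
        limit = N - (N % M)          # largest multiple of M not exceeding N
        while True:
            x = rand_below_n()       # uniform integer in [0, N)
            if x < limit:
                return x % M         # remainder is now uniform over 0..M-1
            # otherwise reject x and draw again

    # Example with one random byte (N = 256) and M = 10 choices:
    #   import secrets
    #   choose_unbiased(lambda: secrets.randbelow(256), 256, 10)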

by adrian_b

2/19/2026 at 5:47:26 PM

This is correct, but for the author's example of randomizing turkeys I wouldn't bother. A modern CSPRNG is fast enough that it's usually easier just to generate lots of excess randomness (so that the remainder is nonzero but tiny compared to the quotient and thus negligible) than to reject for exactly zero remainder.

For example, the turkeys could be randomized by generating 256 bits of randomness per turkey, then sorting by that and taking the first half of the list. By a counting argument this must be biased (since the number of assignments isn't usually a power of two), but the bias is negligible.
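
A sketch of that procedure in Python, assuming turkeys is just a list of identifiers (the function name and the use of the secrets module are my choices, not the author's):

    import secrets

    def assign_by_random_key(turkeys):
        # 256 fresh random bits per turkey; sort on that key, take the first half.
        keyed = sorted(turkeys, key=lambda t: secrets.token_bytes(32))
        half = len(keyed) // 2
        return keyed[:half], keyed[half:]    # (treatment, control)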

The rejection methods may be faster, and thus beneficial in something like a Monte Carlo simulation that executes many times. Rejection methods are also often the simplest way to get distributions other than uniform. The additional complexity doesn't seem worthwhile to me otherwise though, more effort and risk of a coding mistake for no meaningful gain.

by tripletao

2/19/2026 at 1:17:28 AM

And why does it matter in the context of randomly assigning participants in an experiment into groups? It is not plausible that any theoretical "gaps" in the pseudorandomness are related to the effect you are trying to measure, and unlikely that there is a "pattern" created in how the participants get assigned. You just do one assignment. You do not need to pick a true random configuration, just one random enough.

I assume that as far as p-values are concerned, the issue raised could very well be measured with simulations and permutations. I would be surprised, though, if the distribution of p-values from pseudorandom assignments with gaps did not converge very fast to the "real" distribution you would get from all permutations, due to some version of a law of large numbers. A lot of resampling/permutation techniques work by sampling a negligible fraction of all possible permutations, and the distribution of the statistics extracted converges pretty fast. As long as the way the gaps are formed is independent of the effects measured, it sounds implausible that the p-values one gets are problematic because of them.

by freehorse

2/19/2026 at 6:19:26 AM

P-values assume something weaker than "all assignments are equiprobable." If the subset of possible assignments is nice in the right ways (which any good PRNG will provide) then the resulting value will be approximately the same.

by hansvm

2/19/2026 at 2:55:04 PM

Gallant always uses TRNGs. Goofus always uses a high-quality PRNG (CSPRNG if you like) that's seeded with a TRNG. Everything else they do is identical. What are circumstances under which Goofus's conclusions would be meaningfully different than Gallant's?

Suppose I'm doing something where I need N(0,1) random variates. I sample from U(0,1), being sure to use a TRNG, do my transformations, and everything's good, right? But my sample isn't U(0,1): I'm only able to get float64s (or float32s), and my transform isn't N(0,1), as there's going to be some value x above which P(z > x) = 0. The theory behind what I'm trying to do assumes N(0,1), and so all my p-values are invalid.

Nobody cares about that because we know that our methods are robust to this kind of discretization. Similarly I think nobody (most people) should care (too much) about having "only" 256 bits of entropy in their PRNG because our methods appear to be robust to that.

by BigTTYGothGF

2/19/2026 at 5:54:28 AM

> then you can't screw up the p-value test

Bakker and Wicherts (2011) would like to disagree! Apparently 15 % screw up the calculation of the p-value.

by kqr

2/19/2026 at 4:30:03 PM

I think your intuition comes from the assumption that the experimental subjects are already coming to you in a random order. If that's the case, then you might as well assign the first half to control and the second half to treatment. To see the problem with poor randomization, you have to think about situations where there is (often unknown) bias or correlations in the order of the list that you're drawing from to randomize. Say you have an ordered list of 10 numbers, assigned 5 and 5 to control and (null) treatment groups. There are 252 assignments, which in theory should be equally likely. Assuming they all give different values of your statistic, you'll have 12 assignments with p <= .0476. If, say, you do the assignment from ~~a 256~~ an 8 bit random number such that 4 of the possible assignments are twice as likely as the others under your randomization procedure, the probability of getting one of those 12 assignments is something between .0469 and .0625, depending on whether the more-likely assignments happen to be among the 12 most extreme statistics, which is a difference of about 1/3 and could easily be the difference between "p>.05" and "p<.05". Again, if you start with your numbers in a random order, then this doesn't matter -- the biased assignment procedure will still give you a random assignment, because each initial number will be equally likely to be among the over-sampled or under-sampled ones.
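
For anyone who wants to check those numbers, a small worked calculation (this is my restatement of the example, not code from the comment):

    from math import comb

    n_assignments = comb(10, 5)           # 252 assignments, equally likely in theory
    nominal_p = 12 / n_assignments        # 12 most extreme -> p ~= 0.0476

    # 8-bit randomizer: 256 outcomes over 252 assignments, so 4 assignments
    # get probability 2/256 and the other 248 get 1/256.
    best_case  = 12 / 256                 # none of the doubled assignments among the 12: ~0.0469
    worst_case = (8 * 1 + 4 * 2) / 256    # all four doubled assignments among the 12: 0.0625
    print(nominal_p, best_case, worst_case)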

Also worth noting that the situations where this matters are usually where your effect size is fairly small compared to the unexplained variation, so a few percent error in your p-value can make a difference.

by topaz0

2/19/2026 at 5:36:40 PM

> If, say, you do the assignment from a 256 bit random number such that 4 of the possible assignments are twice as likely as the others under your randomization procedure

Your numbers don't make sense. Your number of assignments is way fewer than 2^256, so the problem the author is (mistakenly) concerned about doesn't arise--no sane method would result in any measurable deviation from equiprobable, certainly not "twice as likely".

With a larger number of turkeys and thus assignments, the author is correct that some assignments must be impossible by a counting argument. They are incorrect that it matters--as long as the process of winnowing our set to 2^256 candidates isn't measurably biased (i.e., correlated with turkey weight ex television effects), it changes nothing. There is no difference between discarding a possible assignment because the CSPRNG algorithm choice excludes it (as we do for all but 2^256) and discarding it because the seed excludes it (as we do for all but one), as long as both processes are unbiased.

by tripletao

2/19/2026 at 5:51:44 PM

typo -- meant to say 8 bit random number i.e. having 256 possibilities, convenient just because the number of assignments was close to a power of 2. If instead you use a 248-sided die and have equal probabilities for all but 4 of the assignments, the result is similar but in the other direction. Of course there are many other more subtle ways that your distribution over assignments could go wrong, I was just picking one that was easy to analyze.

by topaz0

2/19/2026 at 6:10:41 PM

Ah, then I see where you got 4 assignments and 2x probability. Then I think that is the problem the author was worried about and that it would be a real concern with those numbers, but that the much smaller number of possibilities in your example causes incorrect intuition for the 2^256-possibility case.

by tripletao

2/19/2026 at 6:14:54 PM

I think the intuition that everything will be fine in the 256 bit vs 300 bit case depends on the intuition that the assignments that you're missing will be (~close to) randomly distributed, but it's far from clear to me that you can depend on that to be true in general without carefully analyzing your procedure and how it interacts with the PRNG.

by topaz0

2/19/2026 at 7:11:49 PM

If you can find a case where this matters, then you've found a practical way to distinguish a CSPRNG seeded with true randomness from a stream of all true randomness. The cryptographers would consider that a weakness in the CSPRNG algorithm, which for the usual choices would be headline news. I don't think it's possible to prove that no such structure exists, but the world's top (unclassified) cryptographers have tried and failed to find it.

by tripletao

2/19/2026 at 6:26:36 PM

And worth noting that the "even when properly seeded with 256 bits of entropy" example in the article was intended as an extreme case, i.e. that many researchers in fact use seeds that are much less random than that.

by topaz0

2/19/2026 at 1:41:27 AM

By the definition of a cryptographically secure PRNG, no. They, with overwhelming probability, produce results indistinguishable from truly random numbers no matter what procedure you use to tell them apart.

by DevelopingElk

2/18/2026 at 11:56:06 PM

So here's how I would think about it intuitively:

We can create a balanced partitioning of the 300 turkeys with a 300 bit random number having an equal number of 1's and 0's.

Now suppose I randomly pick a 300 bit number, still with equal 0's and 1's, but this time the first 20 bits are always 0's and the last 20 bits are always 1's. In this scenario, only the middle 260 bits (turkeys) are randomly assigned, and the remaining 40 are deterministic.

We can quibble over what constitutes an "enormous" bias, but the scenario above feels like an inadequate experiment design to me.

As it happens, log2(260 choose 130) ~= 256.

> Are there any non-cryptographic examples in which a well-designed PRNG with 256 bits of well-seeded random state produces results different enough from a TRNG to be visible to a user?

One example that comes to mind is shuffling a deck of playing cards. You need approximately 225 bits of entropy to ensure that every possible 52 card ordering can be represented. Suppose you wanted to simulate a game of blackjack with more than one deck, or some other card game with 58 or more cards. 256 bits is not enough there.
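
The entropy figures are easy to check (a worked calculation, not part of the original comment):

    from math import comb, factorial, log2

    print(log2(comb(300, 150)))    # ~295.6 bits for a balanced split of 300 turkeys
    print(log2(comb(260, 130)))    # ~255.7 bits, i.e. log2(260 choose 130) ~= 256
    print(log2(factorial(52)))     # ~225.6 bits to cover every ordering of a 52-card deck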

by labcomputer

2/19/2026 at 1:32:27 AM

It's an interesting observation and that's a nice example you provided but does it actually matter? Just because certain sequences can't occur doesn't necessarily mean the bias has any practical impact. It's bias in the theoretical sense but not, I would argue, in the practical sense that is actually relevant. At least it seems to me at first glance, but I would be interested to learn more if anyone thinks otherwise.

For example. Suppose I have 2^128 unique playing cards. I randomly select 2^64 of them and place them in a deck. Someone proceeds to draw 2^8 cards from that deck, replacing and reshuffling between each draw. Does it really matter that those draws weren't technically independent with respect to the larger set? In a sense they are independent so long as you view what happened as a single instance of a procedure that has multiple phases as opposed to multiple independent instances. And in practice with a state space so much larger than the sample set the theoretical aspect simply doesn't matter one way or the other.

We can take this even farther. Don't replace and reshuffle after each card is drawn. Since we are only drawing 2^8 of 2^64 total cards this lack of independence won't actually matter in practice. You would need to replicate the experiment a truly absurd number of times in order to notice the issue.

by fc417fc802

2/19/2026 at 9:20:11 AM

At a certain point a bias in the prng just has to become significant? This will be a function of the experiment. So I don’t think it’s possible to talk about a general lack of “practical impact” without specifying a particular experiment. Thinking abstractly - where an “experiment” is a deterministic function that takes the output of a prng and returns a result - an experiment that can be represented by a constant function will be immune to bias, while one which returns the nth bit of the random number will be susceptible to bias.

by derriz

2/19/2026 at 1:25:12 PM

> At a certain point a bias in the prng just has to become significant?

Sure, at a point. I'm not disputing that. I'm asking for a concrete bound. When the state space is >= 2^64 (you're extremely unlikely to inadvertently stumble into a modern PRNG with a seed smaller than that) how large does the sample set need to be and how many experiment replications are required to reach that point?

Essentially what I'm asking is, how many independent sets of N numbers must I draw from a biased deck, where the bias takes the form of a uniformly random subset of the whole, before the bias is detectable to some threshold? I think that when N is "human" sized and the deck is 2^64 or larger that the number of required replications will be unrealistically large.

by fc417fc802

2/19/2026 at 8:46:40 PM

> 256 bits is not enough there

Yeah, but the question is: who cares?

Suppose you and I are both simulating card shuffling. We have the exact same setup, and use a 256-bit well-behaved PRNG for randomness. We both re-seed every game from a TRNG. The difference is that you use all 256 bits for your seed, while I use just 128 and zero-pad the rest. The set of all shuffles that can be generated by your method is obviously much larger than the set that can be generated by mine.

But again: who cares? What observable effect could there possibly be for anybody to take action if they know they're in a 128-bit world vs a 256-bit one?

The analogy obviously doesn't generalize downwards; I'd be singing a different tune if it were, say, 32 bits instead of 128.

by BigTTYGothGF

2/19/2026 at 12:15:13 AM

MCMC can be difficult for this reason. There are concepts like "k-dimensional equidistribution" etc. where in some ways the requirements on a PRNG are far, far higher than for a cryptographically sound PRNG, but also in many ways less so, because you don't care about adversarial issues and would prefer speed.

If you can't generate all possible assignments, you care about second and third order properties etc. of the sequence.

by jmalicki

2/19/2026 at 3:05:28 AM

Does there exist a single MCMC example that performs worse when fed by a CSPRNG (with any fixed seed, including all zeroes; no state reuse within the simulation) as opposed to any other RNG source?

by coppsilgold

2/19/2026 at 12:28:59 AM

> There are concepts like "k-dimensional equidistribution" etc. etc... where in some ways the requirements of a PRNG are far, far, higher than a cryptographically sound PRNG

Huh? If you can chew through however many gigabytes of the supposed CSPRNG’s output, do some statistics, and with a non-negligible probability tell if the bytes in fact came from the CSPRNG in question or an actual iid random source, then you’ve got a distinguisher and the CSPRNG is broken.

by mananaysiempre

2/19/2026 at 1:51:33 AM

It all comes down to actual specific statistical tests, and how hard they are to break in specific applications.

No CSPRNG is absolutely perfect, no CSPRNG has ever absolutely passed every statistical test thrown at it.

MCMC stresses very different statistical properties than the typical CSPRNG tests do.

Every PRNG is absolutely broken if you want to be absolute about it. MCMC and crypto applications push on different aspects where statistical issues will cause application level failures.

See e.g. this paper https://www.cs.hmc.edu/tr/hmc-cs-2014-0905.pdf

(it's not the end all be all, but it's a good survey of why this stuff matters and why it's different)

by jmalicki

2/19/2026 at 2:43:07 AM

> no CSPRNG has ever absolutely passed every statistical test thrown at it

As far as I know (admittedly not a high standard), there is no published statistical test that you could run on, for example, a single AES-256-CTR bitstream set up with a random key and IV, running on a single computer, that would be able to tell you with a meaningful likelihood ratio that you were looking at a pseudorandom rather than truly random input before the computer in question broke down. (I’m assuming related-key attacks are out of scope if we’re talking about an RNG for simulation purposes.)

by mananaysiempre

2/19/2026 at 3:12:41 AM

Cryptographic operations when done correctly result in full chaos within the discrete domain (the so-called avalanche effect). Any bias of any kind gives rise to a distinguisher and the primitive is regarded as broken.

One way to imagine what symmetric cryptography does is a cellular automaton that is completely shuffled every iteration. In the case of Keccak/SHA3, that is almost exactly what happens too.

by coppsilgold

2/19/2026 at 3:24:32 AM

There has never been a perfect CSPRNG.

by jmalicki

2/19/2026 at 9:38:25 AM

There have been a very large number of CSPRNGs for which there does not exist any known practical method to distinguish them from TRNGs.

For all of them there are theoretical methods that can distinguish a sequence generated by them from a random sequence, but all such methods require an impossible amount of work.

For instance, a known distinguisher for the Keccak function that is used inside SHA-3 requires an amount of work over 2^1500 (which was notable because it was an improvement over a naive method that would have required an amount of work of 2^1600).

This is such a ridiculously large number in comparison with the size and age of the known Universe that it is really certain that nobody will ever run such a test and find a positive result.

There are a lot of other such CSPRNGs for which the best known distinguishers require work of over 2^100, or 2^200, or even 2^500, and for those it is also pretty certain that no practical tests will find statistical defects.

There are a lot of CSPRNGs that could not be distinguished from TRNGs even by using hypothetical quantum computers.

Even many of the pretty bad cryptographic PRNGs, which today are considered broken according to their original definitions, can be made impossible to distinguish from TRNGs by just increasing the number of iterations in their mixing functions. This is not done because later more efficient mixing functions have been designed, which achieve better mixing with less work.

by adrian_b

2/19/2026 at 3:39:04 AM

> There has never been a perfect CSPRNG.

What is a perfect CSPRNG?

by coppsilgold

2/19/2026 at 3:23:50 AM

"before the computer in question broke down."

A good MCMC simulation might test that! E.g. training a large diffusion model. It takes way more compute time than the average time for a single computer to fail.

Also, the standards of those tests vs. does it bias the statistical model fitted with MCMC are different.

by jmalicki

2/19/2026 at 3:47:59 AM

I am aware of tests vs. ChaCha20 here: https://www.pcg-random.org/index.html. I am not aware of tests vs. AES-256-CTR.

However at some point, 100x faster performance w/o an exploitable attack vector is also relevant! (though sometimes people find ways).

CSPRNGs are mostly worried about very specific attack vectors, and sure, they're likely to be completely unpredictable. But other applications care more about other failure modes, like lack of k-dimensional equidistribution, and that hurts them far more.

The idea that CSPRNGs are the end-all and be-all of RNGs holds CS back.

by jmalicki

2/19/2026 at 9:14:49 AM

I am familiar with that site and the PCG PRNGs are based on a sound principle, so they are good for many applications.

However, I have never seen a place where the author claims to have found a statistical defect in ChaCha. She only says, correctly, that ChaCha is significantly slower than PRNGs like those of the PCG kind, and that it shares the property of any PRNG with a fixed state size: limited high-dimensional equidistribution. This is also true for any concrete instantiation of the PRNGs recommended by the author. The only difference is that with PRNGs having a simple definition you can build the same structure with a bigger state, as big as you want; but once you have chosen a size, you again have a limit. Moreover, the PCG PRNGs recommended there, when given larger states than cryptographic PRNGs, become slower than those cryptographic PRNGs, due to slow large-integer multiplications.

In the past, I have seen some claims of statistical tests distinguishing cryptographic PRNGs that turned out to be false, due to incorrect methodology. E.g. I have seen a ridiculous paper claiming that an AI method is able to recognize that an AES PRNG is non-random. However, reading the paper showed that they did not find anything that could distinguish a number sequence produced by AES from a true random sequence. Instead, they could distinguish the AES sequence from numbers read from /dev/random on an unspecified computer, using an unspecified operating system. Therefore, if there were statistical biases, those were likely in their /dev/random implementation, whichever that was (many such implementations are bad, and even a good implementation may appear to have statistical abnormalities, depending on the activity done on the computer), not in the AES sequence.

by adrian_b

2/19/2026 at 5:21:22 AM

Are they claiming that ChaCha20 deviates measurably from equally distributed in k dimensions in tests, or just that it hasn't been proven to be equally distributed? I can't find any reference for the former, and I'd find that surprising. The latter is not surprising or meaningful, since the same structure that makes cryptanalysis difficult also makes that hard to prove or disprove.

For emphasis, an empirically measurable deviation from k-equidistribution would be a cryptographic weakness (since it means that knowing some members of the k-tuple helps you guess the others). So that would be a strong claim requiring specific support.

by tripletao

2/19/2026 at 6:00:09 AM

By O’Neill’s definition (§2.5.3 in the report) it’s definitely not equidistributed in higher dimensions (does not eventually go through every possible k-tuple for large but still reasonable k) simply because its state is too small for that. Yet this seems completely irrelevant, because you’d need utterly impossible amounts of compute to actually reject the hypothesis that a black-box bitstream generator (that is actually ChaCha20) has this property. (Or I assume you would, as such a test would be an immediate high-profile cryptography paper.)

Contrary to GP’s statement, I can’t find any claims of an actual test anywhere in the PCG materials, just “k-dimensional equidistribution: no”, which I’m guessing means what I’ve just said. This is, at worst, correct but a bit terse and very slightly misleading on O’Neill’s part; how GP could derive any practical consequences from it, however, I haven’t been able to understand.

by mananaysiempre

2/19/2026 at 8:00:00 AM

As you note, a 256-bit CSPRNG is trivially not equidistributed for a tuple of k n-bit integers when k*n > 256. For a block cipher I think it trivially is equidistributed in some cases, like AES-CTR when k*n is an integer submultiple of 256 (since the counter enumerates all the states and AES is a bijection). Maybe more cases could be proven if someone cared, but I don't think anyone does.

Computational feasibility is what matters. That's roughly what I meant by "measurable", though it's better to say it explicitly as you did. I'm also unaware of any computationally feasible way to distinguish a CSPRNG seeded once with true randomness from a stream of all true randomness, and I think that if one existed then the PRNG would no longer be considered CS.

by tripletao

2/19/2026 at 12:01:46 AM

It reminded me of an experiment where each subject was presented with a pseudorandomised sequence of trials, except that, unknown to the researchers, every time the experiment ran the same (default) seed was used, which resulted in all subjects being presented with the same "random" sequence of trials.

by freehorse

2/19/2026 at 10:20:40 AM

I’m pretty sure that one of the fundamental premises of this argument is not correct, but I haven’t done mathematical statistics so I would defer to anyone who has and says I’m wrong. The author expresses this in a variety of forms throughout the article, but for instance

> Significance tests are meaningful and valid if AND ONLY IF assignment was properly randomized.

OK I agree with the “if” but not the “only if” part of this implication. My intuition is that significance tests are still meaningful if a non-random method of assignment is chosen, but the bias introduced by the method is not in any way correlated with the observations under study.

This leads to the whole thing the author gets into about PRNGs not being valid for assignment. But a much simpler and more deterministic method of sampling is stratified sampling, which is used all over the place for various types of statistical experiments. The random entropy of a stratified sample is nowhere near enough to be fully random, yet I haven’t seen anyone claim that all experiments done using stratified sampling have invalid or meaningless p-values.

Basically stratified sampling works as follows. Say you want to study the effects of age and gender on income. One way is to just study everyone, record their age, gender, and income and do your study. But that gets pretty unwieldy and you may not be able to get accurate data over the whole population. What you can do instead is get a list of people, sort them by age and gender and pick a size that’s convenient. So say you have capacity to analyse a sample 1/20th the size of the total. Cool. Then you just take every 20th person on your list and that’s your sample. By the magic of stratified sampling, because your list is sorted by age and gender, the sample will more or less have the same proportions as the total population with respect to age and gender.
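
A minimal sketch of that every-20th-person step in Python, assuming people is a list of dicts with "age" and "gender" keys (the field names and step size are illustrative, not from the comment):

    def systematic_sample(people, step=20):
        # Sort by the stratifying variables, then take every 20th person; the
        # sample roughly preserves the population's age/gender proportions.
        ordered = sorted(people, key=lambda p: (p["gender"], p["age"]))
        return ordered[::step]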

Not random, but used all over the place for statistical studies.

by seanhunter

2/19/2026 at 5:33:01 AM

The observation that it is impossible to enumerate 300 bits of assignments with a 256-bit random generator is a very clever one. Just as almost all assignments are balanced (300 vs. 295 bits), I wonder if there's a similar equipartition-property argument that 256 bits will generate the vast majority of the outcomes, but my Shannon-fu is too weak to tell.

Maybe that's the problem? With 256 bits we will only get the "typical" assignments and not the edge cases which are the ones that are important for randomisation tests?

by kqr

2/19/2026 at 5:37:09 AM

I read this and came away a bit sheepish, not really grasping the significance of the extreme focus on PRNGs and entropy for basic things. Glad to see the rest of the comments agreeing. "What every experimenter must know"...

by vehementi

2/18/2026 at 8:09:47 PM

Starts interesting, then veers into the usual "true random number" bullshit. Use radioactive decay as the source of your random numbers!

by Tomte

2/19/2026 at 7:56:20 AM

> Use radioactive decay

It's a lot easier to use diodes (light emitting and otherwise).

by bob1029

2/18/2026 at 8:41:18 PM

How do we know it's truly random?

by amelius

2/19/2026 at 6:16:29 AM

We don't. We know with great certainty it's either random or needs non-local variables, which would let information travel faster than c, among other things. (This is a consistent result of the Bell test.)

Most prefer to believe in randomness.

by kqr

2/18/2026 at 10:20:50 PM

The only known explanation of what's going on in quantum mechanics is a multiversal one^[1]. Using radioactive decay of an atom as an example: there are an uncountably infinite number of universes that are initially "fungible" (identical in every way), and over time the universes gradually differentiate themselves with the atom going from a non-decayed to decayed state, at different times in each universe. But you will be in all of those universes. So if you thought the atom would decay in, let's say 5 seconds, there would be some universes where you were right and some where you were wrong. That makes it impossible to ever make reliable specific predictions about when the atom will decay. So, in practice that just looks like perfect randomness.

^[1]: There are other interpretations, of course. And those other interpretations are equally explanatory. But they do not claim to be explanations of what is actually happening to unobserved quantum particles. There is also Bohmian mechanics, but I don't know how many people take it seriously.

by ChadNauseam

2/18/2026 at 9:04:38 PM

> usual "true random number" bullshit

What's bullshit about it? This is how TRNGs in security enclaves work. They collect entropy from the environment, and use that to continuously reseed a PRNG, which generates bits.

If you're talking "true" in the philosophical sense, that doesn't exist -- the whole concept of randomness relies on an oracle.

by zeroxfe

2/18/2026 at 9:48:59 PM

What PRNGs lack compared to TRNGs is security (i.e. preventing someone from being able to use past values to predict future values). It's not that they somehow produce statistically invalid results (e.g. they generate 3s more often than 2s or something). Unless they're very poorly constructed.

by wavemode

2/18/2026 at 10:11:26 PM

Maybe people have bad memories from linear congruential generators; those could go really bad (https://en.wikipedia.org/wiki/Marsaglia%27s_theorem)

by refsys

2/19/2026 at 1:01:32 PM

While LCGs are bad by themselves, they (together with Galois field counters, which have a large number of possible implementations, e.g. LFSRs, GFSRs, XorShift etc.) have some very desirable properties for a PRNG: known period, it is possible to make jumps through the sequence and it is possible to extract sub-sequences from it that are certain to not overlap, e.g. for a multithreaded simulation.

Because of this, the best non-cryptographic PRNGs are made from either a LCG or a GFC that ensures the properties mentioned above, together with a non-linear mixing function that scrambles the output, for much better statistical properties than a linear generator would have alone.

The good cryptographic RNGs have the same kind of structure, but where a one-way hash function or a block cipher function is used to scramble the output of a counter. The counter ensures in a simpler way the same properties as a LCG or GFC. A simple counter can be used here because the output mixing function is much more complex.
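
To make the structure concrete, here is a toy Python sketch of that combination: a 64-bit LCG for the state (known full period, easy jumps) plus a non-linear output mix. The constants are common published choices (Knuth's MMIX LCG constants and a MurmurHash3-style finalizer) used purely for illustration; this is not any particular production generator.

    MASK64 = (1 << 64) - 1

    class TinyLcgMix:
        def __init__(self, seed):
            self.state = seed & MASK64

        def next64(self):
            # Linear state update: full 2^64 period for an odd increment and a
            # multiplier congruent to 1 mod 4 (Hull-Dobell conditions).
            self.state = (self.state * 6364136223846793005 + 1442695040888963407) & MASK64
            # Non-linear output scrambling, so the returned numbers have much
            # better statistical properties than the raw LCG state.
            x = self.state
            x ^= x >> 33
            x = (x * 0xff51afd7ed558ccd) & MASK64
            x ^= x >> 33
            return x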

by adrian_b

2/18/2026 at 9:22:58 PM

I don't think hardware random number generators are bullshit, but it's easy to overstate their importance. Outside of cryptography, there aren't a whole lot of cases that truly require that much care in how random numbers are generated. For the kind of examples the article opens with (web page A/B testing, clinical trials, etc.) you'll never have sample sizes large enough to justify worrying about the difference between a half-decent PRNG and a "true" random number generator.

by wtallis

2/19/2026 at 12:15:32 PM

Yes, agreed. In many cases, the determinism is a feature, particularly being able to store the seed for reproducibility.

by zeroxfe

2/19/2026 at 8:37:46 AM

The thesis of this piece is effectively "causality is the inevitable destination of science" paired with the corollary that "correlation is not causation". Statistics is a field with very limited insights. Once you apply causality to the mathematics of statistics (i.e. Pearl's do-calculus), you can actually make claims about the informational content expressed through recording events and facts of reality.

by uoaei