[Cryptech Tech] FIPS 140-2 test program

Stephan Mueller smueller at chronox.de
Wed Jul 16 10:31:13 UTC 2014


On Wednesday, 16 July 2014, at 09:27:03, Benedikt Stockebrand wrote:

Hi Benedikt,

>Hi Stephan and list,
>
>>>While it talks about entropy for example, it doesn't clearly define
>>>the statistical model that its entropy measurement relates to.  This
>>>can be a documentation bug, but I wouldn't want to bet on it.
>>>
>> There is no model behind it. All they do is to apply:
>To quote from SP 800-90B (Draft) I've just started to read (page 11,
>first paragraph):
>
>    Entropy is defined relative to one's knowledge of (the probability
>    distribution on) $X$ prior to an observation, and reflects the
>    uncertainty associated with predicting its value---the larger the
>    entropy, the greater the uncertainty in predicting the value of an
>    observation.
>
>If there were no model, then all outputs were valid.  What makes one
>output suspect and the other not?

May I ask that we take one step back: entropy, by definition, cannot be 
tested by any application. Moreover, entropy is relative to the observer. 
For example, when I look at my wife's desk, it is highly entropic to me, 
but not to her. The same applies to entropy in technical terms: when I 
have no foreknowledge and consider something entropic, somebody else may 
have that knowledge and consider the same thing to have little or no 
entropy.

That said, all that test tools can do is apply statistical tests to 
analyze patterns (i.e. deviations from white noise). Any pattern 
identified (1) diminishes the maximum theoretical entropy you can assume 
when you look at one output bit of your RNG, and (2) may hint at flaws 
in your RNG that decrease the potential entropy in your system even 
further (again, it goes back to the relativity of entropy).
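
To give a concrete flavor of such pattern tests, here is a minimal 
sketch of the FIPS 140-2 monobit test in Python (the sample inputs are 
made up for illustration):

    import os

    def monobit_test(sample: bytes) -> bool:
        """FIPS 140-2 monobit test on a 2,500-byte (20,000-bit) sample."""
        assert len(sample) == 2500, "the test is defined on exactly 20,000 bits"
        ones = sum(bin(byte).count("1") for byte in sample)
        # FIPS 140-2 pass interval: 9725 < ones < 10275
        return 9725 < ones < 10275

    print(monobit_test(os.urandom(2500)))  # almost always True
    print(monobit_test(b"\x00" * 2500))    # False: an obvious pattern
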
>
>> - Shannon Entropy formula
>
>I'm not sure I really understand what they mean with that term, so I'm
>not going to comment on this one right now.

See http://en.wikipedia.org/wiki/Shannon_entropy -- the formula in the 
definition section, H(X) = - sum_i p_i log2(p_i), applied to the 
observed symbol frequencies.
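
For illustration, a minimal sketch of that estimate over 8-bit symbols; 
note it is an estimate from observed frequencies, not a proof of entropy:

    import math
    import os
    from collections import Counter

    def shannon_entropy_per_byte(data: bytes) -> float:
        """H = -sum(p_i * log2(p_i)) over the observed byte frequencies."""
        n = len(data)
        return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())

    print(shannon_entropy_per_byte(os.urandom(1 << 20)))  # close to 8.0
    print(shannon_entropy_per_byte(b"abababab" * 1000))   # 1.0: heavy pattern
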
>
>> - Chi-Square Test
>> - mean calculation
>> - Simulation
>
>These tests all take an entirely arbitrary aspect and check for that
>aspect only.  Fair enough, but why are these tests of any value?  Why
>are they "better" than a test "Is a quote from the works of Shakespeare
>encoded in EBCDIC"?  They are arbitrary, and as such only useful if
>the underlying model matches the source of the data.

That is the key here: there are NO tests that *show* you have entropy. 
All you can do is apply statistical tests and, the more of them pass, 
the more confidently conclude that your noise source *may* have entropy.
>
>With the chi square test for example: What is so special about an 8 bit
>word size, statistically speaking?  Why not use a 65517 bit word size
>instead?

There is nothing special about it. The word size, however, should 
correlate with the block size of your RNG. If you generate 8 bits at a 
time, you apply 8 bits. If you generate 17 bits at a time, you apply 17 
bits. All you want to know is whether your bits show any statistically 
significant patterns, not more, not less. And patterns typically emerge 
when you assess full blocks of your RNG.
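
A minimal sketch of such a chi-square test, assuming an 8-bit block 
size (for a 17-bit RNG you would regroup the stream into 17-bit symbols 
before counting):

    import os
    from collections import Counter

    def chi_square_bytes(data: bytes) -> float:
        """Chi-square statistic of 8-bit symbol frequencies vs. a uniform model."""
        counts = Counter(data)
        expected = len(data) / 256
        return sum((counts.get(s, 0) - expected) ** 2 / expected
                   for s in range(256))

    # With 255 degrees of freedom the statistic should hover around 255
    # for white noise; a biased source drives it far higher.
    print(chi_square_bytes(os.urandom(1 << 20)))
    print(chi_square_bytes(bytes(i % 16 for i in range(1 << 20))))
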
>
>The mean is somewhat similar; it interprets the bit stream as a
>sequence of binary encoded numbers and then checks these.  Why exactly

Same here: apply the block size.

>that?  And should it be big or little endian, which may make a huge
>difference on the result of the test?

The endianness matters little, because you look for patterns. If you have 
patterns, you have them regardless of how you look at your bits (big or 
little endian).
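
A small sketch to illustrate this; the "flaw" (top bit of every byte 
forced to zero) is artificial, just to give the mean test something to 
find, and it shows up whichever endianness you pick:

    import os
    import struct

    def word_mean(data: bytes, endian: str) -> float:
        """Mean of the stream read as 16-bit words ('<' little, '>' big endian)."""
        n = len(data) // 2
        words = struct.unpack(f"{endian}{n}H", data[: 2 * n])
        return sum(words) / n

    biased = bytes(b & 0x7F for b in os.urandom(1 << 16))
    # Both readings give roughly 16384 instead of the expected 32767.5.
    print(word_mean(biased, "<"), word_mean(biased, ">"))
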
>
>The simulation thing in particular is on the border of silly; all it
>does is that it applies a yes/no test with known output distribution to
>the input and checks that the ratio of yes and no results is getting
>close enough to the theoretically computed value.  There's really
>little reason to waste CPU cycles on expensive trigonometric functions
>(if done badly) or at least multiplications (if done somewhat more
>sensibly). The only reason for this test is that it somehow resembles
>the dart board method used in old day statistics text books because
>the authors couldn't really come up with anything better.

I am fine with your statement. Yet the goal is to identify patterns, and 
such a simulation at least helped me once to identify patterns where all 
other calculations showed that there were none.
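
For reference, a simplified sketch of such a dart-board check (tools 
like ent use more bytes per point; the 16-bit coordinates here are an 
arbitrary choice):

    import os
    import struct

    def monte_carlo_pi(data: bytes) -> float:
        """Fraction of points in the quarter circle, scaled to estimate pi."""
        inside = total = 0
        for i in range(0, len(data) - 3, 4):  # 4 bytes = two 16-bit coordinates
            x, y = struct.unpack(">2H", data[i : i + 4])
            total += 1
            if (x / 65535.0) ** 2 + (y / 65535.0) ** 2 <= 1.0:
                inside += 1
        return 4.0 * inside / total

    print(monte_carlo_pi(os.urandom(1 << 20)))       # close to 3.14159
    print(monte_carlo_pi(bytes(range(256)) * 4096))  # noticeably off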

Thus, throw as many statistical tests at your RNG as you can to check 
for patterns. If one indicates a pattern, you have to scratch your head 
over how much impact it has on the theoretical analysis of how much 
entropy is in your system.
>
>> Of course you only test the raw noise source and not anything
>> whitened.
>From a practical point of view this usually makes sense.  But:
>
>Why is that "of course"?  If I had reason to assume that the test data
>I got was really from the works of Shakespeare but had been sent
>through 3DES with a key of 1234567890, why shouldn't I test it for
>that?  Again, entropy is always relative to what you know, or assume,
>about the data stream you want to test.

Again, all testing just looks for patterns. If you take the output of a 
cipher with a known key, you *will* receive a *patternless* data stream, 
and all statistical tests will give a thumbs up.

But then I go back to my initial statement: entropy is relative. If you 
do not know that Shakespeare + 3DES + a known key was used, the bit 
stream *looks* like white noise to you and you would assume it contains 
entropy (at least it is entropic to *you*). But somebody else with the 
foreknowledge of how the bit stream is generated does not consider it to 
contain entropy.
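
A toy sketch of this relativity (3DES is not in the Python standard 
library, so SHA-256 in a counter construction stands in as the 
whitening function, and the quotation is a stand-in too):

    import hashlib
    import math
    from collections import Counter

    def entropy_per_byte(data: bytes) -> float:
        n = len(data)
        return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())

    def whiten(data: bytes, key: bytes) -> bytes:
        """XOR the data against a SHA-256-based keystream."""
        out = bytearray()
        for i in range(0, len(data), 32):
            pad = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
            out += bytes(a ^ b for a, b in zip(data[i : i + 32], pad))
        return bytes(out)

    plaintext = b"Shall I compare thee to a summer's day? " * 500
    ciphertext = whiten(plaintext, b"1234567890")
    # The whitened stream looks like white noise to anyone without the key.
    print(entropy_per_byte(plaintext))   # well below 8: repeated English text
    print(entropy_per_byte(ciphertext))  # close to 8.0
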
>
>I know this sounds somewhat weird, but the key point is that all

No, it is not weird at all! Your point illustrates very well the 
limitation of all "entropy test applications": it is not possible to 
test entropy.

>testing is making assumptions on the output, and the only thing we can
>do about is to try and make them explicit so we can match them with
>the nature of the actual source.
>
>> Especially when you have a cryptographic whitening function, all you
>> would test is that the crypto function is good. And any current
>> crypto
>> functions per definition must show good statistics, as otherwise the
>> crypto function is weak.
>
>That's exactly the point.  However: As soon as I know about (a) the
>design of the CSPRNG and (b) any weakness in the crypto function, then
>I can do some "targeted" testing on the underlying HWRNG.
>
>Or put in another way: If I take output x from a HWRNG, run that
>through some whitening function f, and return f(x) as whitened noise,
>then I could still run my usual tests on x as long as I knew the
>inverse function f^-1 of f, so mathematically speaking such a test is
>well possible; it may be cryptographically infeasible if f is chosen
>properly, but that's a tremendous difference: Cryptographic
>feasibility is not a constant, but something that changes over time,
>and people with more intimate knowledge of f may actually come to
>significantly different results than those without.
>
>
>Cheers,
>
>    Benedikt
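
On that last point about f and its inverse: if f were simply an XOR 
against a secret keystream, f would be its own inverse, so anyone 
holding the keystream could undo the whitening and test the raw x 
directly. A toy sketch (the keystream construction and the biased 
source are made up for illustration):

    import hashlib
    import os

    def keystream(key: bytes, n: int) -> bytes:
        """SHA-256 in counter mode, as a stand-in keystream generator."""
        out = bytearray()
        ctr = 0
        while len(out) < n:
            out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
            ctr += 1
        return bytes(out[:n])

    def f(x: bytes, key: bytes) -> bytes:
        """Whitening by XOR; note that f is its own inverse."""
        return bytes(a ^ b for a, b in zip(x, keystream(key, len(x))))

    def ones(data: bytes) -> int:
        return sum(bin(b).count("1") for b in data)

    x = bytes(b & 0x7F for b in os.urandom(20000))  # biased raw "HWRNG" output
    wx = f(x, b"secret")
    print(ones(wx))                # ~80000 of 160000 bits: looks balanced
    print(ones(f(wx, b"secret")))  # ~70000: inverting f exposes the bias in x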


Ciao
Stephan

