[Cryptech Tech] Entropy source delivered

Benedikt Stockebrand bs at stepladder-it.com
Sun Aug 24 10:14:32 UTC 2014


Hi Fredrik and list,

Fredrik Thulin <fredrik at thulin.net> writes:

> Hi Benedikt, good to see you active again =)

yes, but only for the weekend; no guarantees on what'll happen from
Monday on...

>> That's 20 kByte/s (with the magic Zeners), or 160 kbit/s; the BC337-16's
>> I've used for the batch of boards get me less, around 90--120 kbit/s.
>
> Is that how many edges your MCU sees per second, or measured in some other 
> way?

No, that's the output rate at the USB interface, which according to the
tests is already pure entropy.

> How exactly does your current firmware derive bits of entropy from the edges? I 
> looked at the firmware you sent out a couple of weeks ago, and if I understood 
> correctly what function was actually being used it appeared to be busy-waiting 
> for the state of an input pin to change from low to high:
> [...]

Correct, that's how it works.

I've experimented with interrupts, but they are somewhat slow.
According to the documentation I've found, it takes four clock cycles
until the interrupt handler starts and, I think, another four to return.
And that's without doing anything in the handler itself.

> Have you plotted the raw data you extract somehow, like Bernd and I have done 
> lately for my counter values?

I didn't exactly plot them, but I've run a number of very similar
tests, and the byte distribution test from dieharder should cover the
same kind of check.

For the ARRGH board, you can disable the von Neumann extractor if you
use a profile setting of CFLAGS+=-DUSE_SIMPLE rather than
CFLAGS+=-DUSE_VN_EXTRACTOR to get the raw data.  Additionally, if you
look at Profiles/default.profile for a section titled "Noise
measurement" you can also switch from edge-to-edge measurement to simple
sampling, with a configurable number of samples to be XORed.
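
As an illustration of the "samples to be XORed" idea, here is a minimal
Python sketch. It is not taken from the ARRGH firmware (which is built
with CFLAGS, so presumably C); the names xor_fold and n are made up for
this example, and dropping an incomplete trailing group is an assumption.

```python
from functools import reduce
from operator import xor


def xor_fold(samples, n):
    """XOR every run of n consecutive 0/1 samples into one output bit.

    An incomplete run at the end of the input is dropped.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    usable = len(samples) - len(samples) % n
    return [reduce(xor, samples[i:i + n]) for i in range(0, usable, n)]
```

With n = 1 the samples pass through unchanged; larger n trades output
rate for less bias per output bit, provided the samples are independent.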

When you muck around with these, however, keep in mind that the serial
interface eventually becomes the bottleneck; it maxes out at 50 kByte/s.
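
A rough back-of-the-envelope check of why raw mode can hit that limit,
as a Python sketch. It assumes the 20 kByte/s figure quoted earlier in
the thread was measured with the von Neumann extractor enabled, and that
the extractor's input bits are independent; the function names are
illustrative only.

```python
SERIAL_LIMIT_KBYTE_S = 50   # serial interface ceiling, from the text
EXTRACTED_KBYTE_S = 20      # extractor output quoted in the thread


def vn_yield(p_one):
    """Expected output bits per input bit of a von Neumann extractor
    fed independent bits with P(1) = p_one (at most 0.25)."""
    return p_one * (1.0 - p_one)


def min_raw_rate(extracted_rate, p_one=0.5):
    """Raw input rate needed to sustain the given extracted rate."""
    return extracted_rate / vn_yield(p_one)
```

Under these assumptions min_raw_rate(EXTRACTED_KBYTE_S) gives 80.0
kByte/s, already above the 50 kByte/s ceiling; any bias in the raw bits
lowers the yield and pushes the required raw rate higher still.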

> When I tried busy-waiting as opposed to interrupt driven operations I
> saw some clear bias towards certain numbers which I think were caused
> by the fixed execution time of the busy-wait loop -

That shouldn't really be the issue; my assumption is that you were
pulling data faster than entropy entered the noise stream. Remember
that with avalanche noise, entropy enters the stream only when a
breakdown or recovery occurs; between those points the signal is
deterministic.

> or maybe I was just observing these irregularities of my counter in 16 MHz 
> mode back then but did not understand the cause of the bias =).

My guess is that you observed the effects of component tolerances,
clock synchronization, etc. rather than the actual noise from the
avalanche effect.

> Anyway, a key takeaway from these last few days of studying the values that 
> the LSB comes from to me is that making it possible to extract such from the 
> RNG is actually super-important in order for someone (without an engineering 
> lab at their disposal) to be able to verify the quality of the RNG. Joachim 
> has always said so, but this just underlines the importance of it to me.

I fully agree.  That's why I've opted for the edge-to-edge algorithm,
which effectively eliminates all correlation between adjacent bits of
input to the von Neumann extractor, and for the von Neumann extractor
to take care of any remaining bias.
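
For readers unfamiliar with it, the classic von Neumann extractor can be
sketched in a few lines of Python. This is the textbook scheme, not the
ARRGH firmware's code; which of the two unequal pairs maps to 0 is a
convention chosen here for illustration.

```python
def von_neumann_extract(bits):
    """Debias a sequence of 0/1 values with the von Neumann scheme.

    Bits are consumed in non-overlapping pairs: the pair 0,1 emits 0,
    the pair 1,0 emits 1, and equal pairs emit nothing. A trailing
    unpaired bit is dropped.
    """
    out = []
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            out.append(first)
    return out
```

The scheme removes bias only if the input bits are independent, which is
why the decorrelating step in front of it matters. A stuck input such as
[1] * 10 yields an empty list, that is, no output at all.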

So the advantages of this approach are:

- Theory explains why this design is sound, and the test results confirm
  that.

- With that it is possible to test the output at this stage with fairly
  limited resources and knowledge: just feed the output through
  dieharder et al., with no need to come up with some clever
  interpretation of the results of individual tests.

- Component tolerances and such will affect the output speed only, but
  not the quality of the output.

- Taken to the extreme, all obvious failure modes will simply make
  the device stop generating output, rather than produce predictable
  output.

- Bandwidth requirements on the external interface (serial/USB in this
  case) are minimized.

On the other hand, there are a few downsides:

- If the consumer of the data doesn't need pure entropy, but can also
  use a noise signal which still contains redundancy, then this design
  is not the most efficient.

- It is more difficult to parallelize several of these devices than with
  a fixed sampling rate.

- The edge-to-edge measurement, at least, largely depends on the nature
  of its input source: with the avalanche effect, whenever the analog
  signal changes direction, entropy has entered the noise signal,
  while at any other time it is (at least theoretically) deterministic.
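
To make the edge-to-edge idea concrete, here is a small Python sketch
that works on a list of pin samples instead of a live input pin. How the
firmware turns an interval into a bit is not stated in this mail; taking
the least significant bit of the interval length is an assumption, and
edge_to_edge_bits is a made-up name.

```python
def edge_to_edge_bits(samples):
    """Emit one bit per interval between consecutive rising edges.

    samples is a sequence of 0/1 pin readings taken at a fixed rate.
    Each output bit is the least significant bit of the number of
    samples between two successive low-to-high transitions.
    """
    bits = []
    last_edge = None
    for i in range(1, len(samples)):
        if samples[i - 1] == 0 and samples[i] == 1:  # rising edge
            if last_edge is not None:
                bits.append((i - last_edge) & 1)
            last_edge = i
    return bits
```

Because a bit is produced only when a new edge arrives, a signal that
stops switching produces no bits, instead of repeating old ones as a
fixed-rate sampler would.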


Cheers,

    Benedikt

-- 
Benedikt Stockebrand,                   Stepladder IT Training+Consulting
Dipl.-Inform.                           http://www.stepladder-it.com/

          Business Grade IPv6 --- Consulting, Training, Projects

BIVBlog---Benedikt's IT Video Blog: http://www.stepladder-it.com/bivblog/

