[Cryptech Tech] [Cryptech-Commits] [core/platform/novena] 21/21: Sick hacks to compensate for sparse MUX within TRNG core.

Joachim Strömbergson joachim at secworks.se
Thu Oct 1 11:33:48 UTC 2015


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Aloha!

Rob Austein wrote:
> At Wed, 30 Sep 2015 15:31:35 +0200, Joachim Strömbergson wrote:
>> I'm fairly (as in very) certain that we want to be able to trigger 
>> interrupt events in the CPU when things go bad in the FPGA.
>> Typically that the TRNG monitor detected that it is b0rked. There
>> might also be other rather important error events that we want to
>> interrupt the CPU for.
> 
> OK.  That's not what I thought the existing error wires were doing.
> I agree that asynchronous events like that might be a reasonable use
> of interrupts.

No, that is correct: the existing wires signal improper API usage, so
they are very closely related to API events.


> So basically the IRQ would say "go read the something-awful-happened 
> error register/queue/... ASAP".  From the software side, the
> interrupt handler itself would probably still just set a flag / send
> a condition / pick your programming abstraction, but the effect of
> that would be tell main program level to schedule an immediate poll,
> for some safe definition of "immediate".

Something like that. As in, stop whatever you are doing for now and
handle this error condition.
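
To make the "set a flag, then let main program level handle it" pattern
concrete, here is a minimal sketch in Python. It is only a behavioural
model, not driver code: irq_handler, poll_error_source and
main_loop_step are hypothetical names, and the "trng" return value is a
stand-in for whatever the real error register would report.

```python
import threading

# Hypothetical model of the pattern discussed above; not real driver code.
fpga_error_pending = threading.Event()
handled = []


def irq_handler():
    """Interrupt level: do the minimum, just record that something happened."""
    fpga_error_pending.set()


def poll_error_source():
    """Stand-in for reading the FPGA error register at main program level."""
    return "trng"  # assumed value, for illustration only


def main_loop_step():
    """One pass of the main loop: a pending error preempts normal work."""
    if fpga_error_pending.is_set():
        fpga_error_pending.clear()
        handled.append(poll_error_source())
        return "handled-error"
    return "normal-work"


first = main_loop_step()   # nothing pending yet
irq_handler()              # the FPGA raises the IRQ
second = main_loop_step()  # the error is handled before anything else
third = main_loop_step()   # back to normal work
```

The handler never talks to the FPGA itself; how soon the main loop
gets around to the poll is what defines "immediate" here.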


>> The second issue is if we should also flag for improper use of the
>> API, which is what the error signal in the modules implement
>> today.
> 
> Right, that's what I thought the error wires were, and that's the 
> function that seems like it could just be a bit in the status word.

Sure. The only issue I have with that is deciding the temporal validity
of that status bit. My suggestion is that any violation sets the bit and
any read of the status register (where the bit is located) clears it.
This means that if you commit more than one violation before reading
the status register, you won't know which one caused it. Is this an OK
solution?
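
A minimal sketch of the set-on-violation, clear-on-read behaviour, as a
Python model rather than RTL. The bit position (2) and the class and
method names are assumptions for illustration; the violations counter
exists only in the model to show what the hardware would not remember.

```python
STATUS_ERROR_BIT = 2  # assumed bit position, for illustration only


class StatusRegister:
    """Behavioural model of a status register with a sticky error bit."""

    def __init__(self):
        self.value = 0
        self.violations = 0  # model-only bookkeeping, not in hardware

    def api_violation(self):
        """Any access violation sets the error bit."""
        self.violations += 1
        self.value |= 1 << STATUS_ERROR_BIT

    def read(self):
        """Return the current value, then clear the error bit."""
        snapshot = self.value
        self.value &= ~(1 << STATUS_ERROR_BIT)
        return snapshot


def error_set(status):
    return bool((status >> STATUS_ERROR_BIT) & 1)


reg = StatusRegister()
reg.api_violation()
reg.api_violation()       # second violation before any read
first_read = reg.read()   # error bit set, but only one bit for two events
second_read = reg.read()  # error bit already cleared by the first read
```

Two violations collapse into a single set bit, which is exactly the
loss of information described above.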


> Doctrine as I was taught is to do as little as one possibly can at 
> interrupt level: set a flag, signal a condition, whatever, then get 
> the hell out and let main program level handle it.  So using an 
> interrupt to schedule an immediate poll is fine, and using an 
> interrupt to say "everything you were doing on the FPGA is now 
> hopelessly borked, bye, have a nice life" would also be appropriate
> if such a condition could occur, but having a real conversation with
> the FPGA at interrupt level would be frowned upon.

We might have been taught different doctrines. In RTOSes at least, being
able to handle high-prio stuff, including reading an external register,
is imho quite common.

But let's not digress. The important thing is whether we are in
agreement design-wise. And if I'm understanding it correctly, it is this:

(1) We remove all error signals from the core APIs. Instead API access
violations are logged internally by setting an error bit in the status
register. Which bit to use needs to be defined across the cores. Since
we have:

STATUS_READY_BIT = 0;
STATUS_VALID_BIT = 1;

We could for example select 2 for error. And we also reserve some part
of the status register bit space, [24..31] for example, for individual
cores to specify different error conditions not related to access
violations. (see example below).
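
As a sketch of how SW might decode such a status word, assuming the
example choices above (bit 2 for the API error, bits [24..31] reserved
for core-specific errors). Neither choice is decided yet, and
decode_status is a hypothetical helper.

```python
STATUS_READY_BIT = 0
STATUS_VALID_BIT = 1
STATUS_ERROR_BIT = 2  # example choice from the text, not yet agreed
CORE_ERROR_LSB = 24   # example reserved range [24..31], not yet agreed
CORE_ERROR_MSB = 31


def decode_status(status):
    """Split a 32-bit status word into its common and core-specific parts."""
    width = CORE_ERROR_MSB - CORE_ERROR_LSB + 1
    return {
        "ready": bool((status >> STATUS_READY_BIT) & 1),
        "valid": bool((status >> STATUS_VALID_BIT) & 1),
        "api_error": bool((status >> STATUS_ERROR_BIT) & 1),
        "core_errors": (status >> CORE_ERROR_LSB) & ((1 << width) - 1),
    }


# Ready and API error set, plus core-specific error bit 25.
example = decode_status(0x02000005)
```

The common bits mean the same thing in every core, while the meaning
of core_errors would be defined per core.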


(2) Cores that have possible internal error conditions not directly
related to API access problems should have a separate, one bit wide
error signal out. This signal will then be connected to a utility core
that muxes error signals onto an external pin, provides status info
about which error event came from which core etc.
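
A rough behavioural model of such a utility core, in Python. The class
name, the one-bit-per-core source register and the explicit acknowledge
step are all assumptions; how pending errors get cleared is not
specified above.

```python
class ErrorCollector:
    """Model of a utility core that muxes per-core error signals."""

    def __init__(self, core_names):
        self.core_names = list(core_names)
        self.pending = 0  # one latched bit per connected core

    def raise_error(self, name):
        """A core asserts its one-bit error signal."""
        self.pending |= 1 << self.core_names.index(name)

    @property
    def irq_pin(self):
        """The external pin is the OR of all latched error signals."""
        return self.pending != 0

    def read_sources(self):
        """Status info: which cores currently have an error pending."""
        return [name for i, name in enumerate(self.core_names)
                if (self.pending >> i) & 1]

    def acknowledge(self, name):
        """Assumed clearing mechanism: SW acknowledges one core at a time."""
        self.pending &= ~(1 << self.core_names.index(name))


collector = ErrorCollector(["trng", "aes", "sha256"])
idle_pin = collector.irq_pin
collector.raise_error("trng")
active_pin = collector.irq_pin
sources = collector.read_sources()
collector.acknowledge("trng")
cleared_pin = collector.irq_pin
```

The core list passed in here is illustrative only.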


As an example let's say that the TRNG has separate health monitors in
each entropy provider as well as the output of the csprng. So with three
entropy providers the TRNG would have four different error bits in the
status register.

Suddenly the rosc based entropy source monitor detects that the entropy
source is dead. The TRNG sets its defined "BAD_ENTROPY_SOURCE2" bit in
its status register and pulls the error signal.

SW would get the IRQ and very soonish would read the appropriate
register in the utility core to figure out that the IRQ was caused by
a problem in the TRNG. SW then reads the status register in the TRNG and
finds out that entropy source 2 is b0rked. SW can then decide to disable
this source or stop allowing any further use of the TRNG etc.

The API error bit in the core would in this case never be set since no
access violation has occurred.
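
The TRNG example can be sketched end to end as a small Python model.
The bit numbers for the four monitor bits, the Trng class and
handle_trng_irq are all hypothetical. Only the name BAD_ENTROPY_SOURCE2
comes from the example, and the policy (disable the source, or stop
using the TRNG) is just one possible SW decision.

```python
STATUS_ERROR_BIT = 2      # API error bit, example position
BAD_ENTROPY_SOURCE0 = 24  # assumed positions inside [24..31]
BAD_ENTROPY_SOURCE1 = 25
BAD_ENTROPY_SOURCE2 = 26
BAD_CSPRNG = 27


class Trng:
    """Model of a TRNG with three entropy sources and a CSPRNG monitor."""

    def __init__(self):
        self.status = 0
        self.error_out = False  # one-bit error signal to the utility core
        self.enabled_sources = {0, 1, 2}

    def monitor_failed(self, bit):
        """A health monitor sets its status bit and pulls the error signal."""
        self.status |= 1 << bit
        self.error_out = True

    def disable_source(self, index):
        self.enabled_sources.discard(index)


def handle_trng_irq(trng):
    """Main-level handler: read TRNG status and decide what to do."""
    status = trng.status
    actions = []
    source_bits = (BAD_ENTROPY_SOURCE0, BAD_ENTROPY_SOURCE1,
                   BAD_ENTROPY_SOURCE2)
    for index, bit in enumerate(source_bits):
        if (status >> bit) & 1:
            trng.disable_source(index)
            actions.append(f"disabled source {index}")
    if (status >> BAD_CSPRNG) & 1:
        actions.append("stop using TRNG")
    return actions


trng = Trng()
trng.monitor_failed(BAD_ENTROPY_SOURCE2)  # the rosc monitor fires
actions = handle_trng_irq(trng)
api_error = bool((trng.status >> STATUS_ERROR_BIT) & 1)
```

The API error bit stays clear throughout, since nothing here is an
access violation.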


> Fair enough.  Slightly wider conversation than I had originally 
> thought we were having, but no harm in that.

HW/SW interface design by talking - sounds scary.

- -- 
Med vänlig hälsning, Yours

Joachim Strömbergson - Alltid i harmonisk svängning.
========================================================================
 Joachim Strömbergson          Secworks AB          joachim at secworks.se
========================================================================
-----BEGIN PGP SIGNATURE-----
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQIcBAEBCAAGBQJWDRocAAoJEF3cfFQkIuyNkk4QALtaw4sulvEibWOnMMbb+udX
9Rn8GikMw9ugkXc4Xfm9KQZBTYRkqH5BkmOnyySeYo/YB9AaDigI/0VREzIU9pP5
5FZ8tHKHqf2aoklcyIEcw8T2p4iApasjRKg8647O7Ni+S1L8j3ZvUJsuSk3RkKEg
z0Bkm+HP/IHIF1o/gWysrO4IRlIF5GTw0oO81NAH3QSvaMsp/VqZKX2wBqGadJlx
4OuLu/uQUnKMp9Tz7kkV9kYDTFhNZoNaf+n0PDnXj45cXduXMPWRYPkQGR3NFAFE
oXMc5QzY4hvNWMveybBCnYardQ2aQKW5ZnM8c/4PEloYCPd6DTqUSZi6AMJiaTLv
icOQPjx8EO+R2q842879WNK6sfqwnCT290x1VoW5uvkz/EnUNSsuOAUCvigN7yp3
n5pj+jumMPVw+2rVyphTmamjCSH5vPJHGtZl+jrl6m4OLBg28RWYBxzfJc6XX4qX
3SgJPmOTlK+QojixeJR8JzGsLepkiItOnwiowabZ6wMaegtc2BBx+oPLnDCbwCKL
U5Gr10IhidOPE7Q6io3TL0LiQhjpQeGmEKffhVAiQla6RgLtQ7KkecRxh1rqy4HJ
SrDQiWk8YbDpYsqqD8Z/NV93YRnYWIcv6uKRo6zXYe9cJWy524/XNgIYiR1rJPjW
6mXZkpyK3KbuNAAQ31Ih
=6rCh
-----END PGP SIGNATURE-----

