[Cryptech Core] Proposal for new FPGA architecture

Joachim Strömbergson joachim at assured.se
Tue Nov 13 09:01:07 UTC 2018


Aloha!

Thanks for good comments.

On 2018-11-07 23:37, Paul Selkirk wrote:
> I have to admit, I've read through this several times, and I'm still not
> sure I understand it completely. However, rather than get deep in the
> weeds on details, I want to address what I think are the main points.

It basically is a two-port, FPGA on-chip memory with added logic to
perform from-to write operations of n words. SW on the CPU could
access this memory while a transfer between the memory and addresses in
the other cores is happening. SW would have access to the address space
of all other cores as long as no transfer is in progress, that is,
started but not yet completed.

SW would be responsible for setting the read and write pointers and the
number of words to be transferred. The pointers would auto-increment. So
one can have a big blob of data and hash it by stepping n words at a
time (and resetting the to-pointer).

SW would also have to keep tabs on whether a transfer operation has been
started.
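
To make the above concrete, here is a minimal software model of the
idea. It is only a sketch: the class and method names are made up for
illustration, it models only the memory-to-core direction, and the real
thing would be logic in the FPGA, not Python.

```python
class TransferMemory:
    """Toy model of the on-chip memory with a from-to copy engine.

    All names here are hypothetical; nothing below is an existing
    Cryptech interface.
    """

    def __init__(self, mem_words, core_words):
        self.mem = [0] * mem_words     # on-chip buffer, always visible to SW
        self.core = [0] * core_words   # stand-in for the cores' address space
        self.rd_ptr = 0                # "from" pointer into self.mem
        self.wr_ptr = 0                # "to" pointer into self.core
        self.remaining = 0             # words left in the current transfer

    @property
    def busy(self):
        # SW must keep tabs on this before touching the cores.
        return self.remaining > 0

    def start(self, n_words, wr_ptr=None):
        """Start a transfer of n_words; optionally reset the to-pointer."""
        if self.busy:
            raise RuntimeError("transfer already in progress")
        if wr_ptr is not None:
            self.wr_ptr = wr_ptr
        self.remaining = n_words

    def tick(self):
        """Move one word; both pointers auto-increment."""
        if not self.busy:
            return
        self.core[self.wr_ptr] = self.mem[self.rd_ptr]
        self.rd_ptr += 1
        self.wr_ptr += 1
        self.remaining -= 1

    def run(self):
        """Run the current transfer to completion."""
        while self.busy:
            self.tick()

    def core_write(self, addr, value):
        """SW access to a core register; refused while a transfer runs."""
        if self.busy:
            raise RuntimeError("core address space locked during transfer")
        self.core[addr] = value


# Feed an 8-word blob to a 4-word core block, one block at a time.
tm = TransferMemory(mem_words=8, core_words=4)
tm.mem = list(range(10, 18))
for _ in range(2):
    tm.start(4, wr_ptr=0)   # reset the to-pointer, keep the from-pointer
    tm.run()
```

After the loop the from-pointer has stepped through the whole blob,
while the to-pointer was reset to the start of the core block for each
chunk.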


> If I understand correctly, the proposal is to unwrap private keys to a
> buffer in the core_selector (on the FPGA), then send that key (or
> components thereof) directly to other cores to do, say, a signing operation.
> 
> The first problem is that the unwrapped key is actually ASN.1-encoded
> data. core_selector would need to recursively decode the key in
> hardware, to extract the usable key components.

Let's see if I can understand this. Is the ASN.1-blob wrapped with the
master key, and are things inside it in turn wrapped with a key that is
inside the ASN.1-blob? If that is the case it would still work.

1. Move the ASN.1-blob to the on-chip memory (or directly into the
keywrap core mem as today).

2. Unwrap the ASN.1-blob using keywrap (with mkm).

3. Transfer the unwrapped ASN.1-blob to the on-chip memory.

4. Transfer the now unwrapped wrapping key to the keywrap core.

5. Transfer the ASN.1-internal wrapped objects to keywrap.

6. Unwrap the internal objects and transfer them back to the on-chip memory.

Basically what I assume is done today, but instead of moving back and
forth between the STM32-memory and the FPGA, the data stays inside the FPGA.
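
The six steps can be written down as a list of data movements to check
where each piece of data ends up. This is a sketch under assumptions:
the location names are made up, step 1 uses the on-chip memory option,
and I assume step 2 needs an explicit move from the on-chip memory into
the keywrap core.

```python
from collections import namedtuple

# Hypothetical location labels, for illustration only.
STM32, ONCHIP, KEYWRAP = "stm32", "onchip", "keywrap"

Move = namedtuple("Move", "what src dst plaintext")

# Data movements for steps 1-6. The unwrap operations themselves happen
# inside the keywrap core and move nothing.
PROPOSED_FLOW = [
    Move("wrapped ASN.1 blob",      STM32,   ONCHIP,  False),  # step 1
    Move("wrapped ASN.1 blob",      ONCHIP,  KEYWRAP, False),  # feeds step 2
    Move("unwrapped ASN.1 blob",    KEYWRAP, ONCHIP,  True),   # step 3
    Move("inner wrapping key",      ONCHIP,  KEYWRAP, True),   # step 4
    Move("inner wrapped objects",   ONCHIP,  KEYWRAP, False),  # step 5
    Move("unwrapped inner objects", KEYWRAP, ONCHIP,  True),   # step 6
]


def fmc_crossings(flow):
    """Count moves that cross the STM32/FPGA boundary (the FMC bus)."""
    return sum(1 for m in flow if (m.src == STM32) != (m.dst == STM32))


def plaintext_on_cpu(flow):
    """List unwrapped items that end up in STM32 memory."""
    return [m.what for m in flow if m.plaintext and m.dst == STM32]


print(fmc_crossings(PROPOSED_FLOW), plaintext_on_cpu(PROPOSED_FLOW))
```

In this model only the initial load crosses the FMC, and no unwrapped
data lands in the STM32 memory.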

As long as the CPU can know beforehand which ASN.1 fields (but not their
actual values) to use for different processing operations in the FPGA
this should work. But if the CPU needs to process or look at the
unwrapped ASN.1-fields in order to know what to do next, then this of
course won't work. The only solution then is to have CPU functionality
in the FPGA itself.


> Secondly, RSA keys contain some "precalculated" speed-up fields that are
> specific to the systolic modexpa7 core. These are actually only
> generated on the first use of the key, so the key needs to be updated in
> the keystore, with these additional fields. So core_selector would also
> need to be able to do the ASN.1 encoding, in key-specific formats, and
> hand it off to keywrap.

And ASN.1 encoding is done before wrapping? Not on wrapped data? If the
fields are wrapped you could of course transfer them to the CPU to be
encoded and put into the ASN.1 struct.


> Finally, there's the issue of multi-tasking. The software supports
> multiple concurrent clients. If one task loads a key, then yields to
> another task, which loads another key into the same core_selector
> buffer, then we have a problem. (And task yields happen frequently, e.g.
> on every call to hal_io_wait.) This might be alleviated by partitioning
> the buffer into multiple key caches (with coordination between tasks and
> core_selector about who has what piece).

Yes. This is how I see it working. The on-chip memory would
be big enough to handle multiple keys. SW would be responsible for
keeping track of what the memory actually contains as well as
coordinating between tasks to ensure that one task doesn't try to start
a core while a memory transfer is in progress. I assume there are sync
primitives (semaphores etc) in place today to ensure that tasks don't
access the shared core resources in an uncontrolled way.
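
A rough sketch of the SW-side bookkeeping this implies, with the
on-chip memory split into fixed-size key slots. All names are made up,
and Python's threading.Lock only stands in for whatever primitives the
firmware tasking actually provides.

```python
import threading


class KeySlots:
    """Hypothetical partitioning of the on-chip memory into key slots."""

    def __init__(self, n_slots, slot_words):
        self.slot_words = slot_words
        self.owner = [None] * n_slots        # which task holds which slot
        self._table_lock = threading.Lock()  # protects the owner table
        # Held for the duration of a memory transfer or a core start, so
        # a task cannot start a core while a transfer is in progress.
        self.bus_lock = threading.Lock()

    def acquire(self, task_id):
        """Give the first free slot to task_id, or return None if full."""
        with self._table_lock:
            for slot, owner in enumerate(self.owner):
                if owner is None:
                    self.owner[slot] = task_id
                    return slot
        return None

    def release(self, task_id, slot):
        """Free a slot; only the owning task may do so."""
        with self._table_lock:
            if self.owner[slot] != task_id:
                raise PermissionError("slot not owned by this task")
            self.owner[slot] = None

    def base_address(self, slot):
        """Word offset of a slot inside the on-chip memory."""
        return slot * self.slot_words


slots = KeySlots(n_slots=2, slot_words=64)
mine = slots.acquire("task-a")
with slots.bus_lock:
    # Program the pointers from slots.base_address(mine) and run the
    # transfer here; other tasks block on bus_lock until it completes.
    pass
slots.release("task-a", mine)
```

A task that yields keeps its slot, so another task loading a key gets a
different slot instead of overwriting the first one.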


> If we had fully-featured signing cores, then we could unwrap and decode
> keys directly to them, but at the moment we have pieces - modexp and
> point multipliers - and the software is intimately involved in all
> aspects of key generation, signing, and verification.
> 
> Or I might have missed the point entirely.

No, you might not have missed the point. It might well be me not
grasping how SW works with the cores and different fields. And I'm
trying to find ways of (1) reducing the number of FMC transfers and (2)
keeping unwrapped data, which today is stored in the CPU RAM, in RAM
inside the FPGA instead.

-- 
Med vänlig hälsning, Yours

Joachim Strömbergson
========================================================================
                               Assured AB
========================================================================
