[Cryptech Tech] Happier RSA timing numbers

Joachim Strömbergson joachim.strombergson at assured.se
Fri May 18 06:24:12 UTC 2018


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Aloha!

Rob Austein wrote:
> Going back to the original problem: profiling says that the main 
> bottleneck is the slow retrieval from the keystore: this appears to 
> take about twice as long as modexp, and the bulk of that time is
> spent in hal_aes_keyunwrap(), specifically in the function which runs
> a block through the AES core.  So this looks like an AES throughput 
> problem, at least given the way we're currently using the AES core.
> 
> It's possible that we could speed up the keystore a bit by using a 
> different wrapping algorithm, as Peter Gutmann has been suggesting
> all along, although perhaps for reasons other than speed.  Eg, we
> could use AES in one of the classic block cipher modes instead of
> AES keywrap: presumably we'd need a MAC too, but at least in theory
> that could run in parallel with the encryption, although the control
> code to manage that might be a bit of nasty.

Just to comment on the AES core speed.

Right now the AES core has four S-boxes, which makes the SubBytes part
of each AES round take 4 cycles. In total, AES-128 takes 40-ish cycles
and AES-256 takes 56. This could be improved quite dramatically by
increasing the number of S-boxes to 8 or even 16. If we see that the
actual processing in the core is the bottleneck, we can try making
these changes. It wouldn't be much work.
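
As a rough sketch of the arithmetic above: assuming SubBytes dominates
each round and the 16 state bytes go through the S-boxes in equal
batches, the per-block cycle count scales with 16 / (number of S-boxes).
The function name and the model itself are illustrative assumptions, not
taken from the core, and fixed overhead such as key expansion is ignored.

```python
# Simplified cycle model (an assumption, not measured from the core):
# each round costs only the cycles needed to push 16 bytes through SubBytes.
AES_ROUNDS = {128: 10, 256: 14}   # rounds per key size, per FIPS 197
STATE_BYTES = 16                  # AES state is 128 bits


def cycles_per_block(key_bits, n_sboxes):
    """Estimate cycles for one AES block, ignoring fixed overhead."""
    subbytes_cycles = -(-STATE_BYTES // n_sboxes)   # ceiling division
    return AES_ROUNDS[key_bits] * subbytes_cycles


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, cycles_per_block(128, n), cycles_per_block(256, n))
```

With four S-boxes this model reproduces the 40 and 56 cycle figures;
with 8 or 16 S-boxes it suggests roughly a 2x or 4x reduction, provided
nothing else in the round becomes the limit.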

One thing to consider is the transfer of data over the FMC bus. We
should possibly consider having local data storage in the FPGA (using
blockRAM) if we are transferring the same data multiple times. This
would allow 128-bit word access by the AES core, removing a lot of latency.
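
To get a feel for how much bus traffic is involved, here is a
back-of-the-envelope sketch. It relies on the fact that AES key wrap
(RFC 3394) performs 6 * n AES operations for n 64-bit blocks of key
material. The 32-bit bus word width, the 1024-byte key size, and the
"one-shot" model (ciphertext copied into local storage once, plaintext
read back once) are assumptions for illustration only. Control and
status register accesses are not counted.

```python
def keywrap_bus_words(key_bytes, bus_bits=32):
    """Count AES operations and bus word transfers for one unwrap.

    key_bytes: size of the unwrapped key material (multiple of 8).
    bus_bits:  assumed bus word width (hypothetical default of 32).
    """
    n = key_bytes // 8                 # number of 64-bit blocks
    aes_ops = 6 * n                    # RFC 3394: 6 passes over n blocks
    words_per_block = 128 // bus_bits  # bus words per 128-bit AES block

    # Current style: every AES operation writes a block and reads one back.
    per_op_words = aes_ops * 2 * words_per_block

    # Assumed local-storage style: n + 1 ciphertext blocks in, n blocks out.
    one_shot_words = (2 * n + 1) * 64 // bus_bits

    return aes_ops, per_op_words, one_shot_words


if __name__ == "__main__":
    print(keywrap_bus_words(1024))
```

For the hypothetical 1024-byte key this gives 768 AES operations and
6144 bus words when every block crosses the bus, against 514 words if
the data only crosses once.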

Adding block cipher modes is something I have wanted to do for a while
(and yes, we have talked about it previously). I have implemented CTR
and CBC in other projects and know how they work. And I have a CMAC
implementation that uses the same AES core as in Cryptech:

https://github.com/secworks/cmac

That core can be added to the Cryptech project directly.
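
For reference, a minimal sketch of the CTR structure, showing why it is
friendlier to throughput than key wrap: every counter block is
independent of the others, so blocks can be fed to the cipher back to
back. The block function below is a stand-in built from SHA-256, NOT
AES (Python's standard library has no AES); in a real design it would be
the AES core. All names and the 12-byte nonce plus 4-byte counter layout
are assumptions for illustration.

```python
import hashlib

BLOCK = 16  # block size in bytes, as for AES


def toy_block_encrypt(key, block):
    """Stand-in for the AES core: NOT AES, just a keyed 16-byte function."""
    return hashlib.sha256(key + block).digest()[:BLOCK]


def ctr_xcrypt(key, nonce, data, block_encrypt=toy_block_encrypt):
    """CTR mode: the same operation encrypts and decrypts."""
    if len(nonce) != 12:
        raise ValueError("this sketch assumes a 12-byte nonce")
    out = bytearray()
    for i in range(0, len(data), BLOCK):
        # Counter block depends only on nonce and block index,
        # never on earlier ciphertext.
        counter_block = nonce + (i // BLOCK).to_bytes(4, "big")
        keystream = block_encrypt(key, counter_block)
        chunk = data[i:i + BLOCK]
        out += bytes(a ^ b for a, b in zip(chunk, keystream))
    return bytes(out)
```

CTR on its own gives no integrity protection, which is where a MAC such
as CMAC would come in.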


If you could do more analysis on where the bottleneck in
hal_aes_keyunwrap() really is (AES processing, transfer of data, or
cipher mode SW processing), we can see what we should change or add.

- -- 
Med vänlig hälsning, Yours

Joachim Strömbergson - Assured AB
========================================================================

-----BEGIN PGP SIGNATURE-----
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQIcBAEBCAAGBQJa/nGMAAoJEF3cfFQkIuyNgZQP/2tAbtSxXuBfcO6Ubwxa+s+8
LFZ9AmoXV/RwSzejkeACx/BauH5kZO1dcMcEmvfXSshXKtiDk/Vaf9eODTq7WCzu
uinHmxNOT0v88WtPPpZ1gshjqmE75UxgXI9h3iz5baJRmbuJXfiPvl0OD3Xo8SQs
muzd/xtW20Q1KDdqI/wSkT+Xp+nLpOBKmpXLOXzY+N6+9hus22rTxrWysI13BqjZ
jDrQawby3GGkkQQVgpAuulBnHV0ARJkgehHG0WuvWh09lT4M7Fm/xj/QZzWdmLh4
68hkh/fINfG9liqMdFFXQaXXY7rll8X6yN2MjN2sR9epuE9nwm2600bzuGEMQgeQ
zXqORfwRgmELrckSpwjkiYJ5wG3fK5SmRd003HRC1JMv9xT23TgOdLdOLz5kNvfF
1gaEu4qiA8MgG/EOGpiKg16sBTlWVJpjyuB4G55FXwCsD2zz3GzKbd17rb/zXR8V
abSunlVP64Ge7t1iqVB0tOVpcJhiBVESgtmIH7x0u0Xqc/bLFhS2uhEh9pTFyYhx
/LWdxAP8IHrHbjkwPFXqC3krL6YfKDdexG3oKVtAKzYvet/DXFNswTYBOIIMMmVE
wQQlAeTKXkJ/JEKnghdO3DkTLN2cUvV70K6KSE3X4zfRg+77aW2ggJpaI5PX12fK
omfkxCKNy3ubiZWwbyre
=/wR6
-----END PGP SIGNATURE-----

