[Cryptech Tech] Happier RSA timing numbers
Rob Austein
sra at hactrn.net
Sun May 20 21:00:02 UTC 2018
Adding Peter Gutmann's trick for amortizing the cost of RSA blinding
factors over a long run of signatures was good for approximately 2x
speedup in the RSA code per se. Having blinding on still makes a
difference, but not a dramatic one, so while we might still want to
turn it off occasionally for testing, it's now cheap enough that I
don't think we need to worry about the cost of it in production.
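
For readers who have not seen the amortization trick, here is a minimal Python sketch of the general idea as it is commonly described: generate one blinding pair (r^e, r^-1) up front, then square both halves after every signature instead of paying for a fresh modular exponentiation and modular inverse each time. This is not the libhal code. The class name, method names, and the toy key below are assumptions made up for illustration, and a real implementation would also want constant-time arithmetic and periodic re-seeding of the factors.

```python
import secrets
from math import gcd


class BlindedSigner:
    """Toy RSA signer that amortizes blinding factors by squaring them."""

    def __init__(self, n, e, d):
        self.n, self.e, self.d = n, e, d
        self.bf = None    # blinding factor:   r^e  mod n
        self.ubf = None   # unblinding factor: r^-1 mod n

    def _fresh_factors(self):
        # The expensive part: one modexp and one modular inverse.
        while True:
            r = secrets.randbelow(self.n - 2) + 2
            if gcd(r, self.n) == 1:
                break
        self.bf = pow(r, self.e, self.n)
        self.ubf = pow(r, -1, self.n)   # needs Python 3.8 or later

    def sign(self, m):
        if self.bf is None:
            self._fresh_factors()
        blinded = (m * self.bf) % self.n
        # (m * r^e)^d = m^d * r, so multiplying by r^-1 removes the blind.
        s = (pow(blinded, self.d, self.n) * self.ubf) % self.n
        # The cheap part: squaring both halves gives the pair for r^2,
        # since (r^e)^2 = (r^2)^e and (r^-1)^2 = (r^2)^-1.
        self.bf = (self.bf * self.bf) % self.n
        self.ubf = (self.ubf * self.ubf) % self.n
        return s


# Textbook toy key (p=61, q=53); far too small for real use.
N, E, D = 3233, 17, 2753
signer = BlindedSigner(N, E, D)
signatures = [signer.sign(m) for m in range(2, 50)]
```

Each call after the first costs two extra modular multiplications and two modular squarings on top of the private-key operation, which is why leaving blinding on becomes cheap.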
Brief excerpt from latest profiling run (full log available upon
request, but not necessary for this message):
index % time    self  children              called  name
-----------------------------------------------
                0.00    291.96         24000/24000      hal_rpc_pkey_sign [4]
[5]     75.1    0.00    291.96               24000  pkey_local_sign [5]
                0.00    254.21         24000/24000      hal_ks_fetch [7]
                0.00     37.75         24000/24000      pkey_local_sign_rsa [18]
                0.01      0.00      48000/54695399      memset [44]
                0.01      0.00       24000/6811963      hal_critical_section_start [70]
                0.00      0.00       24000/6811963      hal_critical_section_end [267]
-----------------------------------------------
                0.02      1.70     315522/46710960      hal_aes_keywrap [72]
                3.58    250.08   46395438/46710960      hal_aes_keyunwrap [8]
[6]     65.7    3.60    251.78            46710960  do_block [6]
                8.22    109.89   46710960/46735838      hal_io_wait [9]
               16.75     63.03 140132880/153807632      hal_io_write [12]
               11.57     42.32  93421920/192643038      hal_io_read [10]
-----------------------------------------------
So out of 291 seconds spent signing stuff in this test run, we spent
38 seconds on the actual signatures (including ASN.1, blinding, modexp
including FMC I/O, and other arithmetic), spent more than twice that
just on FMC I/O talking to the AES cores, and 110 seconds waiting for
the AES core. At least that's what the profiler thinks happened.
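
To make the arithmetic easier to check, here is a small Python sketch that adds up the self and children columns for the rows in the excerpt. The numbers are copied from the profile above; the variable and function names are made up for this example. The percentages are only approximate, because do_block is also reached through hal_aes_keywrap and not every call necessarily sits under pkey_local_sign.

```python
# (self, children) in seconds, copied from the gprof excerpt.
pkey_local_sign = (0.00, 291.96)
callees_of_sign = {
    "hal_ks_fetch": (0.00, 254.21),
    "pkey_local_sign_rsa": (0.00, 37.75),
}
callees_of_do_block = {
    "hal_io_wait": (8.22, 109.89),
    "hal_io_write": (16.75, 63.03),
    "hal_io_read": (11.57, 42.32),
}


def total(row):
    """Time charged to a row: its own time plus its descendants'."""
    return round(row[0] + row[1], 2)


sign_total = total(pkey_local_sign)
for name, row in {**callees_of_sign, **callees_of_do_block}.items():
    t = total(row)
    share = 100 * t / sign_total
    print(f"{name:20s} {t:8.2f} s  {share:5.1f}% of pkey_local_sign")

# The three I/O rows should add up to do_block's children column.
io_total = round(sum(total(r) for r in callees_of_do_block.values()), 2)
print(f"{'I/O under do_block':20s} {io_total:8.2f} s")
```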
FWIW, totals here are from a long series of signatures, currently:
    for client in [1..8]:
        for keysize in [1024, 2048, 4096]:
            for 1000 iterations:
                rsa_decrypt(pkcs1_5_blob)
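
As a sanity check on the totals, here is a short Python sketch of that loop with the RSA operation replaced by a stub. It assumes the loops nest exactly as written, so that each of the eight clients runs 1000 operations per key size, and that the largest key size is 4096 bits. The stub function is hypothetical and does no cryptography; the point is only to count calls.

```python
from itertools import product

clients = range(1, 9)            # clients 1..8
key_sizes = [1024, 2048, 4096]   # assumed key sizes in bits
iterations = 1000


def rsa_decrypt_stub(client, bits, i):
    """Hypothetical stand-in for the real RPC call; does nothing."""
    return None


calls = 0
for client, bits, i in product(clients, key_sizes, range(iterations)):
    rsa_decrypt_stub(client, bits, i)
    calls += 1

print(calls)   # 8 * 3 * 1000 = 24000
```

Under those assumptions the count matches the 24000 calls the profiler reports for pkey_local_sign.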
Testing for silly numbers of clients has been quite useful up until
now, but it is of course possible that it's skewing the numbers so we
may want to drop those cases, or perhaps even go back to just testing
four clients as the optimal target.