[Cryptech Tech] Happier RSA timing numbers

Rob Austein sra at hactrn.net
Mon May 14 05:38:31 UTC 2018


After some API revision and refactoring, I finally have what appear to
be significantly better results for RSA signature time.  Data first:

    rsa_1024 sigs/sec 31.1885886942 secs/sig 0:00:00.032063 mean 0:00:00.031807 (n 1000, c 1 t0 2018-05-14 00:14:43.666793 t1 2018-05-14 00:15:15.729802)
    rsa_1024 sigs/sec 62.0656452192 secs/sig 0:00:00.016111 mean 0:00:00.031923 (n 1000, c 2 t0 2018-05-14 00:15:19.760137 t1 2018-05-14 00:15:35.872110)
    rsa_1024 sigs/sec 92.1548230517 secs/sig 0:00:00.010851 mean 0:00:00.032235 (n 1000, c 3 t0 2018-05-14 00:15:41.772934 t1 2018-05-14 00:15:52.624238)
    rsa_1024 sigs/sec 121.099755318 secs/sig 0:00:00.008257 mean 0:00:00.032710 (n 1000, c 4 t0 2018-05-14 00:16:00.395804 t1 2018-05-14 00:16:08.653459)
    rsa_1024 sigs/sec 17.8178135337 secs/sig 0:00:00.056123 mean 0:00:00.279888 (n 1000, c 5 t0 2018-05-14 00:16:18.316359 t1 2018-05-14 00:17:14.439968)
    rsa_1024 sigs/sec 17.3770304017 secs/sig 0:00:00.057547 mean 0:00:00.344278 (n 1000, c 6 t0 2018-05-14 00:17:25.959041 t1 2018-05-14 00:18:23.506273)
    rsa_1024 sigs/sec 17.3669715595 secs/sig 0:00:00.057580 mean 0:00:00.401701 (n 1000, c 7 t0 2018-05-14 00:18:36.895477 t1 2018-05-14 00:19:34.476040)
    rsa_1024 sigs/sec 17.3625224189 secs/sig 0:00:00.057595 mean 0:00:00.459085 (n 1000, c 8 t0 2018-05-14 00:19:49.754960 t1 2018-05-14 00:20:47.350278)

    rsa_2048 sigs/sec 10.3966848881 secs/sig 0:00:00.096184 mean 0:00:00.095938 (n 1000, c 1 t0 2018-05-14 00:20:49.541663 t1 2018-05-14 00:22:25.726169)
    rsa_2048 sigs/sec 20.7172339543 secs/sig 0:00:00.048268 mean 0:00:00.096246 (n 1000, c 2 t0 2018-05-14 00:22:29.788740 t1 2018-05-14 00:23:18.057732)
    rsa_2048 sigs/sec 30.8958572023 secs/sig 0:00:00.032366 mean 0:00:00.096766 (n 1000, c 3 t0 2018-05-14 00:23:24.007728 t1 2018-05-14 00:23:56.374527)
    rsa_2048 sigs/sec 40.9041753428 secs/sig 0:00:00.024447 mean 0:00:00.097338 (n 1000, c 4 t0 2018-05-14 00:24:04.196442 t1 2018-05-14 00:24:28.643824)
    rsa_2048 sigs/sec 10.6500590498 secs/sig 0:00:00.093896 mean 0:00:00.468416 (n 1000, c 5 t0 2018-05-14 00:24:38.354999 t1 2018-05-14 00:26:12.251192)
    rsa_2048 sigs/sec 10.5224474055 secs/sig 0:00:00.095034 mean 0:00:00.568701 (n 1000, c 6 t0 2018-05-14 00:26:23.879446 t1 2018-05-14 00:27:58.914371)
    rsa_2048 sigs/sec 10.5171613201 secs/sig 0:00:00.095082 mean 0:00:00.663538 (n 1000, c 7 t0 2018-05-14 00:28:12.430485 t1 2018-05-14 00:29:47.513176)
    rsa_2048 sigs/sec 10.5120833139 secs/sig 0:00:00.095128 mean 0:00:00.758363 (n 1000, c 8 t0 2018-05-14 00:30:02.919549 t1 2018-05-14 00:31:38.048171)

    rsa_4096 sigs/sec 1.78302871334 secs/sig 0:00:00.560843 mean 0:00:00.560602 (n 1000, c 1 t0 2018-05-14 00:31:40.268158 t1 2018-05-14 00:41:01.111622)
    rsa_4096 sigs/sec 3.52117021726 secs/sig 0:00:00.283996 mean 0:00:00.567657 (n 1000, c 2 t0 2018-05-14 00:41:05.271035 t1 2018-05-14 00:45:49.267530)
    rsa_4096 sigs/sec 5.26249088787 secs/sig 0:00:00.190024 mean 0:00:00.569010 (n 1000, c 3 t0 2018-05-14 00:45:55.314150 t1 2018-05-14 00:49:05.338232)
    rsa_4096 sigs/sec 6.99741967351 secs/sig 0:00:00.142909 mean 0:00:00.570636 (n 1000, c 4 t0 2018-05-14 00:49:13.288812 t1 2018-05-14 00:51:36.198634)
    rsa_4096 sigs/sec 4.98478533849 secs/sig 0:00:00.200610 mean 0:00:01.000996 (n 1000, c 5 t0 2018-05-14 00:51:46.084811 t1 2018-05-14 00:55:06.695255)
    rsa_4096 sigs/sec 4.96866585409 secs/sig 0:00:00.201261 mean 0:00:01.204553 (n 1000, c 6 t0 2018-05-14 00:55:18.500035 t1 2018-05-14 00:58:39.761305)
    rsa_4096 sigs/sec 4.96137844716 secs/sig 0:00:00.201556 mean 0:00:01.406723 (n 1000, c 7 t0 2018-05-14 00:58:53.487509 t1 2018-05-14 01:02:15.044397)
    rsa_4096 sigs/sec 4.95388745692 secs/sig 0:00:00.201861 mean 0:00:01.609353 (n 1000, c 8 t0 2018-05-14 01:02:30.690232 t1 2018-05-14 01:05:52.551903)

See git.cryptech.is/sw/libhal/tests/parallel-signatures.py for
details on what all the fields mean.

The important number is "sigs/sec", which is total signature
throughput (n / (t1 - t0)).

"c" is the number of clients, "n" is the number of samples.

"secs/sig" is the inverse of sigs/sec, while "mean" is the average
time for an individual signature: the ratio of these two shows how
much benefit we get from running multiple clients (which is why
they're roughly equal for c=1).
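
As a rough illustration of how the derived fields relate, the sketch
below recomputes them from the raw values on one result line.  It is
not taken from parallel-signatures.py; the function name summarize()
and its arguments are made up for this example.

```python
from datetime import datetime

# Timestamp layout used by the t0/t1 fields in the result lines.
FMT = "%Y-%m-%d %H:%M:%S.%f"

def summarize(n, t0, t1, mean_secs):
    """Recompute the derived fields from one result line (hypothetical helper)."""
    elapsed = (datetime.strptime(t1, FMT) - datetime.strptime(t0, FMT)).total_seconds()
    sigs_per_sec = n / elapsed            # total throughput: n / (t1 - t0)
    secs_per_sig = elapsed / n            # inverse of sigs/sec
    speedup = mean_secs / secs_per_sig    # benefit from running clients in parallel
    return sigs_per_sec, secs_per_sig, speedup

# Values copied from the rsa_1024, c 4 line in the data above.
sigs_per_sec, secs_per_sig, speedup = summarize(
    1000, "2018-05-14 00:16:00.395804", "2018-05-14 00:16:08.653459", 0.032710)
print(f"{sigs_per_sec:.2f} sigs/sec, {secs_per_sig:.6f} secs/sig, speedup {speedup:.2f}")
```

For that line the speedup comes out just under 4, i.e. four clients
deliver close to four times the single-client throughput.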

Comments and caveats:

* These numbers are for an FPGA load with eight ModExpA7 cores, so
  theory says that the best result should be with four signer clients
  running in parallel, and this time the results seem to match the
  theory.  Yay.

* This is with RSA blinding disabled.  I have some code on a separate
  branch which (in theory) amortizes the cost of generating blinding
  factors, but it needs a bit of work to fit in with the rest of the
  changes.  Enabling any kind of blinding will be slower; the question
  is just how much slower.

* The clients here are using the RPC API directly, not PKCS #11.  The
  PKCS #11 library needs additional work to make good use of this
  (discussed previously, summary: bad lock granularity, no biscuit).

* As predicted by the last set of profiling results, the improvement
  here comes from reducing unnecessary fetches from the keystore.
  This is nasty from an implementation standpoint because it's a major
  layering violation, but in this case the performance issue seems to
  warrant it.  The new code tries to abstract as much of the nastiness
  out of the API as possible, but a certain amount of ick was
  unavoidable.
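
For readers unfamiliar with what amortizing blinding factors can look
like, here is a toy sketch of one widely used approach: generate a
blinding pair once, then refresh it by squaring after each signature
instead of generating a new pair.  This is an assumption for
illustration only and is not necessarily what the branch mentioned
above does; the Blinder class, its fields, and the tiny key are all
hypothetical.

```python
from math import gcd

class Blinder:
    """Toy RSA blinding with factors refreshed by squaring (illustration only)."""

    def __init__(self, n, e, r):
        # r should come from a CSPRNG in real code; it is fixed here for clarity.
        if gcd(r, n) != 1:
            raise ValueError("r must be invertible mod n")
        self.n = n
        self.bf = pow(r, e, n)     # blinding factor r^e mod n
        self.ubf = pow(r, -1, n)   # unblinding factor r^-1 mod n (Python 3.8+)

    def sign(self, m, d):
        blinded = (m * self.bf) % self.n
        # The private-key exponentiation only ever sees the blinded value.
        s = (pow(blinded, d, self.n) * self.ubf) % self.n
        # Refresh: (r^2)^e == (r^e)^2, so squaring both keeps the pair consistent
        # at the cost of two modular multiplications per signature.
        self.bf = (self.bf * self.bf) % self.n
        self.ubf = (self.ubf * self.ubf) % self.n
        return s

# Textbook-sized key (p=61, q=53), far too small for real use.
n, e, d = 3233, 17, 2753
messages = (65, 123, 2000)
blinder = Blinder(n, e, r=7)
sigs = [blinder.sign(m, d) for m in messages]
print(sigs == [pow(m, d, n) for m in messages])
```

The blinded signatures equal the unblinded ones, so the extra cost per
signature is the multiplications around the exponentiation plus
whatever the initial pair cost to generate.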

Code not yet pushed, needs a bit of cleanup, including the RSA
blinding stuff mentioned above, but, still, progress.

