[Cryptech Core] ModExp test results

Rob Austein sra at hactrn.net
Sat Jun 27 14:36:48 UTC 2015


I've been running some tests on the old (about a month old) and latest
(yesterday's) versions of the ModExp core.  Results are summarized
below; notes follow.

                                1024            2048            4096
--------------------------------------------------------------------------
Old ModExp encrypt              5.837198        35.920986       267.222054
Old ModExp decrypt              5.489668        35.583835       268.625344
Old CRT decrypt                 14.142096       104.246679      802.028670
Old keygen/encrypt/decrypt      68.106694       679.259678      6305.303504

SW CRT decrypt                  0.359630        0.396882        3.450447
SW keygen/encrypt/decrypt       2.406200        15.713556       377.959982

New ModExp encrypt              0.608936        [x] 0.497601    [x]
New ModExp decrypt              3.711576        [x] 23.882598   [x]
New CRT decrypt                 9.749045        [x] 29.750953   [x]
New keygen/encrypt/decrypt      67.868772       [x] 512.707423  [x]

Notes:

- The [x] flags indicate cases for which the result was wrong.  The
  new core handles 1024 bit keys correctly but returns all-zero
  results for 2048 bit and 4096 bit keys.

- Columns are key sizes in bits.  All other numbers are seconds.
  Don't take anything beyond the first few digits seriously: the code
  just reports the difference between two gettimeofday() readings.

- The 4096 bit timing column is blank because the test program
  ordinarily gives up when it starts getting bad results; I coaxed it
  into continuing under gdb to confirm that the 4096 bit keys were
  broken too, but running under gdb messes up the timing numbers so I
  didn't keep them.  If people really care about how long it takes to
  get the wrong answer, I can tweak the test program to ignore errors
  and run the test again.

- Testing was done with the tests/test-rsa program in the
  user/sra/libhal repository.  All code involved should be pushed to
  the public repository, except for the couple of lines that drop the
  exponent padding (that change is on hold until the new core works
  with all key sizes, since dropping the padding doesn't work with
  the old core).
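
The timing method mentioned in the notes above amounts to taking one
wall-clock reading before an operation and one after.  A minimal Python
sketch of that idea follows.  It is not the test program's code:
time.time() stands in for gettimeofday(), and timed() is a hypothetical
helper name used only for illustration.

```python
import time

def timed(fn, *args):
    """Return (result, elapsed_seconds) from two wall-clock readings."""
    start = time.time()            # first reading, like gettimeofday()
    result = fn(*args)
    elapsed = time.time() - start  # delta between the two readings
    return result, elapsed

# Example: time one modular exponentiation using Python's built-in pow().
result, seconds = timed(pow, 0x1234, 0x10001, (1 << 127) - 1)
print("%.6f" % seconds)
```

Wall-clock deltas like this include scheduler noise and everything else
the machine was doing, which is why only the leading digits mean much.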

Details on what the various tests are:

- "ModExp encrypt" means throwing a message, a modulus, and the RSA
  public exponent (e, always 0x10001 in these tests) at the core.
  This is the RSA verification operation.

- "ModExp decrypt" means throwing a signature, a modulus, and the RSA
  private exponent (d) at the core.  This is the RSA signature
  operation.

- "CRT decrypt" means using the Chinese Remainder Theorem to perform
  the RSA decrypt, so instead of throwing a big exponent (d) at the
  core one throws two smaller exponents (d % (p-1) and d % (q-1)) at
  the core and runs the results through some other cheaper math to get
  the same final result.

  p and q are each about half the length of d, and the work involved
  in exponentiation goes up significantly with the length, so in
  theory the CRT result ought to be much better than the plain ModExp
  decrypt result.  It's not, because the core takes about as long for
  each of the short exponentiations as it does for the longer one
  they're replacing.

  There's a third exponentiation buried in the CRT version: generation
  of the RSA blinding factors.  So yeah, at the moment the CRT version
  takes about three times as long as the simple ModExp version.

- "keygen/encrypt/decrypt" includes generating a new key (basically,
  throwing CSPRNG output at a prime number tester until we have a pair
  of good primes), then performing both signature (CRT decrypt) and
  verification (encrypt).  Verification in the "SW" cases is done in
  software; in all other cases it's the "ModExp encrypt" operation.

  The prime number test does include some modular exponentiation.

- "SW" is running exactly the same test code on the same processor
  with one change: calls to the ModExp core are changed to calls to
  the libtfm software implementation of modular exponentiation.
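
To make the difference between "ModExp decrypt" and "CRT decrypt"
concrete, here is a small Python sketch of both, counting the
exponentiations.  It is an illustration under stated assumptions, not
the libhal code: the primes are toy Mersenne primes chosen only so the
example runs instantly, the function names are hypothetical, and the
blinding shown (multiply by r^e, then divide by r) is one common scheme
that may differ from what the test program actually does.

```python
import secrets

# Toy parameters for illustration only: two Mersenne primes, far too
# small and too structured for real use.
p = (1 << 61) - 1
q = (1 << 89) - 1
n = p * q
e = 0x10001                          # public exponent used in the tests
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (Python 3.8+)

def modexp_decrypt(c):
    """Plain private-key operation: one exponentiation with full-size d."""
    return pow(c, d, n)

def crt_decrypt(c):
    """CRT private-key operation with blinding: three exponentiations."""
    # Pick a blinding value r that is invertible modulo n.
    while True:
        r = secrets.randbelow(n - 2) + 2
        if r % p and r % q:
            break
    # Exponentiation 1: blinding factor r^e mod n.
    blinded = (c * pow(r, e, n)) % n
    # Exponentiations 2 and 3: half-size exponents and half-size moduli.
    m1 = pow(blinded, d % (p - 1), p)
    m2 = pow(blinded, d % (q - 1), q)
    # The "other cheaper math": recombine the two halves.
    h = (pow(q, -1, p) * (m1 - m2)) % p
    m_blinded = m2 + h * q
    # Remove the blinding: (c * r^e)^d = c^d * r (mod n).
    return (m_blinded * pow(r, -1, n)) % n

message = 0x1234567
ciphertext = pow(message, e, n)      # the "ModExp encrypt" operation
print(modexp_decrypt(ciphertext) == crt_decrypt(ciphertext) == message)
```

In software the two half-size exponentiations are much cheaper than the
single full-size one, which is where the usual CRT win comes from.  If
each exponentiation costs about the same regardless of operand size,
the three calls in crt_decrypt() simply cost about three times as much
as the one call in modexp_decrypt().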

Bottom line:

- Serious improvement for the short exponent case;

- Modest improvement for the other cases;

- New serious bug (may be trivial to fix for all I know, but it's a
  show-stopper until then);

- Some way to go yet before reaching break-even point vs software (on
  the A9, anyway -- I expect software on M4 to be slower than on A9).
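
For a sense of scale, the ratios behind these points can be computed
from the 1024-bit column, the only one where the new core returns
correct results.  This is just arithmetic on the numbers in the table;
the dictionary keys are illustrative labels, not names from the test
program.

```python
# Timings in seconds, copied from the 1024-bit column of the table.
t = {
    "old_encrypt": 5.837198, "new_encrypt": 0.608936,
    "old_decrypt": 5.489668, "new_decrypt": 3.711576,
    "old_crt": 14.142096, "new_crt": 9.749045,
    "sw_crt": 0.359630,
}

# Old core time divided by new core time: values above 1 are speedups.
speedup_encrypt = t["old_encrypt"] / t["new_encrypt"]
speedup_decrypt = t["old_decrypt"] / t["new_decrypt"]
speedup_crt = t["old_crt"] / t["new_crt"]

# New core time divided by software time: the remaining gap to break-even.
gap_vs_software = t["new_crt"] / t["sw_crt"]

print("short exponent speedup: %.1fx" % speedup_encrypt)
print("long exponent speedup:  %.1fx" % speedup_decrypt)
print("CRT decrypt speedup:    %.1fx" % speedup_crt)
print("new core vs software:   %.1fx slower" % gap_vs_software)
```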

I'm about to leave on a 900 km road trip, so don't expect an immediate
response to anything.


