[Cryptech Tech] Revised keystore API and keystore flash "filesystem"

Rob Austein sra at hactrn.net
Tue Sep 6 20:37:49 UTC 2016

One of my current projects is revising the internal keystore API used
within the Cryptech HSM software.  One aspect of this is figuring out
how to make better use of the keystore flash than we have to date.
I've talked over some of the details of this with Fredrik and Paul,
but wanted to share the discussion with a wider audience in case
anybody has advice or sees serious problems with the approach we're

We've talked about layering some kind of filesystem on top of the
flash, whether that filesystem be something that understands flash
wear leveling (eg, UFFS, YAFFS) or something that doesn't (eg, FATFS).
While all of these are potentially doable, all also have potential
issues (licensing, lack of wear leveling, code complexity).  More to
the point, a filesystem per se may not really be appropriate here in
any case: what we really want is some kind of record storage, so we'd
be layering that on top of a filesystem, adding still more code.
Maybe we don't need all of that, and maybe we can do better.

Peter Gutmann advised us to look at PKCS #15, which is sort of an
ASN.1 structured filesystem.  We could do that, but it doesn't really
address the underlying flash issues either, and itself appears to be
layered on top of a sort of rudimentary proto-filesystem that's part
of the underlying smartcard specification.  Again, probably doable,
but still seems more complex than we need.

So it occurred to me that if we really just need a record store, maybe
we should just write that.  Hence the following.

Data sheet for the flash that (I think) is in the Alpha, for reference:


For the current discussion, it's probably most convenient to think of
the flash as a sequence of 4096 4KB sub-sectors, since that's the
minimum erasable unit.  As I think I understand it, flash wear is
caused by the erase cycle, so the basic goal of wear leveling is to
spread erasure load somewhat evenly among the available sub-sectors.

Fredrik tells me that, at least with the current Alpha hardware
design, mapping the flash directly into the ARM's address space isn't
practical, so reading or writing the flash looks like an I/O operation
to the software.

So, the basic proposal is just to store a fixed set of N records per
sub-sector, where N may well end up being one, and not bother with a
filesystem per se at all.

Exactly how many records should be stored per subsector depends on the
target key size:

* An 8192-bit RSA key in the current representation won't fit in 4KB,
  so although we could probably squeeze that a little smaller, we
  might need a scheme which allows us to chain multiple sub-sectors if
  we're serious about 8192-bit RSA (not sure we are -- 8192 bit RSA is
  painfully slow with the current hardware and software);

* We probably want to leave some room for things like storing a set of
  opaque attributes, so that we can move stuff like the PKCS #11
  attributes currently stored on the host in an SQLite3 database into
  the HSM and get rid of the separate database.

For the moment, take it as read that the number of keys per sub-sector
would be a small integer, most likely one, value to be chosen later.
Whether all key blocks are the same size or we vary the size based on
the key type is another question TBD.

Managing this would be relatively simple: we'd have an in-memory index
mapping key names (16 bytes) to the subsectors currently holding those
keys, and an in-memory bitvector and cursor tracking free subsectors.
New subsectors would be allocated round-robin, which seems like about
the right level of work for simple wear leveling.  All of this
in-memory stuff would of course go away when the HSM reboots, but if
we handle the flash correctly it should be straightforward to build
the in-memory stuff by examining the flash at boot time.

We probably want each flash subsector to start with a few header

* A simple type code (PIN block, key block, perhaps overflow block if
  we need to chain multiple blocks to hold things larger than 4KB);

* A field indicating whether this sub-sector is pristine (all-ones,
  ie, state just after flash erasure), in use, or zeroed (no longer in
  use, but not yet erased);

* Field(s) to allow us to distinguish between data under construction
  and data that was fully written to flash (precise details vary
  depending on exactly how we lay things out, but general form follows
  the pristine -> intermediate -> final model and the state
  transitions have to be marked by zeroing bits, since setting bits
  would require erasing the sub-sector).

We can probably keep the index structures fairly simple: for a
4096-entry index doesn't take up a lot of space, and given that we
expect lookup to be much more common than insertion or deletion,
something as simple as binary search of an array may be perfectly
adequate (I confess to a fondness for search algorithms that can be
expressed in half a dozen lines of C code).  If necessary, we can use
hash buckets / balanced trees / ....  No dynamic memory required:
sizes and counts determined at compile time would tell the boot code
how much SDRAM to grab.

The use of both "pristine" (all-ones) and "zeroed" (all-zeros) states
above is not accidental.  One of the minor issues in this design is
how to figure out where to resume the round-robin rotation when
rebuilding the in-memory stuff on reboot.  Leaving the most recently
discarded subsectors zeroed instead of erasing them immediately should
make this fairly straightforward.  Some corner cases if we ever manage
to completely fill the flash, so maybe we also need a timestamp field
in the subsector headers, handwave, details to be worked out.

Anyway, that's the general idea under consideration.  Comments?

More information about the Tech mailing list