[Cryptech Tech] Happier RSA timing numbers

Paul Selkirk paul at psgd.org
Wed May 23 19:27:53 UTC 2018


While I was researching the FMC thing the other day, I came across a
couple of posts from Pavel that I'd forgotten about.

https://lists.cryptech.is/archives/tech/2015-October/002293.html
On Tue Oct 20 17:47:48 UTC 2015, Pavel Shatov wrote:
>> In the long term, I *really* want to get rid of this double read, so
>> we can do memcpy() to/from the FPGA, because HAL_SRAM_Read_32b()
>> supports that. In the short term, I can live with it, while I get the
>> rest of the software up and running.
>
> I can't tell you how I *really-really-really* wanted to get rid of
> that double read :( I spent several weeks pulling my hair out and
> banging my head against the wall trying to fix it. I just don't have
> any more nerves to spend on it. If anyone wants to try and fix this, I
> can provide full details on the problem. The short story is like, on
> one hand STM32 has a dedicated FMC_NWAIT pin, that can be used in
> variable-latency data transfer mode. On the other hand STM32 also has
> a very nasty hardware bug associated with FMC_WAIT, that causes
> processor to freeze under certain conditions. Because of this
> FMC_NWAIT cannot be used and FPGA can't properly signal to STM32, when
> data transfer is done. Because of that we have to read two times. I'm
> afraid, I can't fix this, I'm sorry.

The thing that confuses me in retrospect is that we do sample the NWAIT
pin (in _fmc_nwait_idle), but we still have to do the double read (in
fmc_read_32).
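
To make the cost of that pattern concrete, here is a minimal sketch in C.
It is not the code from our tree: sim_fpga_t, sim_bus_read,
read_32_double and read_32_single are hypothetical names, and the
"first read returns whatever was latched by the previous transaction"
behaviour is only my assumed model of why one read is not enough.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the FPGA end of the FMC bus.
 * ASSUMPTION: a bus read returns the value latched during the previous
 * transaction, so the first read of a new address can be stale. */
typedef struct {
    uint32_t regs[4];       /* pretend core registers */
    uint32_t latched;       /* value currently presented on the bus */
    unsigned transactions;  /* number of bus reads issued so far */
} sim_fpga_t;

/* One simulated bus transaction. */
static uint32_t sim_bus_read(sim_fpga_t *f, unsigned idx)
{
    uint32_t out = f->latched;  /* bus sees the previously latched value */
    f->latched = f->regs[idx];  /* FPGA catches up after the transaction */
    f->transactions++;
    return out;
}

/* Double-read workaround: discard the first result, keep the second. */
static uint32_t read_32_double(sim_fpga_t *f, unsigned idx)
{
    (void)sim_bus_read(f, idx);
    return sim_bus_read(f, idx);
}

/* What we would like to do instead: one transaction per word. */
static uint32_t read_32_single(sim_fpga_t *f, unsigned idx)
{
    return sim_bus_read(f, idx);
}

/* Small self-check of the model above. */
static void demo(void)
{
    sim_fpga_t f = { { 0x11, 0x22, 0x33, 0x44 }, 0, 0 };

    /* Under the assumed model a lone read of a new address is stale. */
    assert(read_32_single(&f, 1) != 0x22);

    /* The double read gets the right word, at two transactions per word. */
    unsigned before = f.transactions;
    assert(read_32_double(&f, 2) == 0x33);
    assert(f.transactions - before == 2);
}
```

In this model every word costs two bus transactions with the
workaround, and that is before counting whatever time is spent polling
NWAIT.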

https://lists.cryptech.is/archives/tech/2015-August/002115.html
On Thu Aug 13 12:16:12 UTC 2015, Pavel Shatov wrote:
> Well, the problem as I see it, is that STM32 doesn't handle
> variable-latency transfers over FMC very well. This variable latency
> arises, because data transfer over FMC runs at its own frequency,
> while cores in FPGA can work at some another frequency. One possible
> solution is to clock cores in FPGA using FMC_CLK. This will make whole
> system synchronous. This was actually not possible with iMX6Q, because
> it could only generate EIM_BCLK during read/write transactions. STM32
> on the other hand can generate FMC_CLK continuously.

According to alpha_fmc.ucf, FMC_CLK runs at 90MHz (its apparent max),
and GCLK is 50MHz, though there's been a lot of work to crank it up to
100MHz. If we sync GCLK to FMC_CLK (say at 90MHz), would that eliminate
the need for NWAIT and the double read? Because that right there would
at least double the read throughput, and simplify the code as well.

				paul

