[Cryptech Tech] Status of RSA timing tests etc

Pavel Shatov meisterpaul1 at yandex.ru
Tue May 29 12:40:45 UTC 2018


29.05.2018 11:17, Joachim Strömbergson wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
> 
> Aloha!
> 
> Pavel Shatov wrote:
>> Now Paul's idea to run the cores at FMC clock is very interesting,
>> and I believe it's viable. It needs a minor bit of investigation with
>> a scope, which I'm going to do next week.
>>
>> As far as I understand, all the cores we currently have can now run
>> at least at 100 MHz, so there should be no problem running them at 90
>> MHz, and there's no reason to clock them slower than that.
> 
> How about that we do this in two steps:
> 
> (1) Increase the sys_clk for the cores to 100 MHz as discussed and
> decided on before.
> 
> (2) Investigate and try and run the cores synchronously with the 90 MHz.
> 
> The point I see by doing this is that if we do (1) and don't see major
> performance improvements, we should know a bit more about where the
> bottleneck is hiding. Doing (1) should also confirm that we can run the
> cores at 90 MHz with no problems. So it would be a bit of a pipe-clean
> for (2). (1) _should_ be an easy thing to do. Let's not wait anymore.
> 
> 
> Pavel, I've asked a few times now, can you please confirm or correct the
> suggested changes to the sys_clk clock generator parameters? And are the
> changes to those parameters the only place we need to update? No changes
> in the ucf-file?

Sorry for taking so long to answer. You're right, no editing of the 
.ucf is necessary. It defines clocks external to the FPGA, and since our 
external 50 MHz clock source stays the same, no changes are needed. 
Clocks generated inside of the device are constrained automatically.

Currently the external 50 MHz clock goes into a clock synthesizer (MMCM =
Mixed-Mode Clock Manager in Xilinx terms). It works by first
multiplying the input frequency by CLK_OUT_MUL to obtain an intermediate
frequency called fVCO, and then dividing it by CLK_OUT_DIV to generate
the output system clock.

There's a limit on fVCO: for our particular device (-1 speed grade)
the datasheet gives a range of 600..1200 MHz. On the one hand, the rule
of thumb is to have fVCO as high as possible, because this way more
clock periods are "averaged" during the final division and the output
frequency is more stable.

We currently have CLK_OUT_MUL = CLK_OUT_DIV = 20, which means fVCO = 50
MHz * 20 = 1000 MHz, and the output clock is 1000 MHz / 20 = 50 MHz. We
could also set both to 22 for fVCO = 1100 MHz, or to 24 for fVCO = 1200
MHz.

Note that, on the other hand, it is often not recommended to operate at
the upper limit of fVCO, because if the input clock is slightly higher
for some reason (e.g. 50.1 MHz), then after multiplication fVCO may fall
outside of its operating range. That's why I decided to operate at 1000
MHz to have some margin. There may be other factors that influence the
choice of fVCO, such as EMI.
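
To put a number on that margin, here is a minimal Python sketch. It
assumes the 1200 MHz upper fVCO limit and the nominal 50 MHz input
mentioned above; the function and constant names are made up for
illustration and do not come from any Xilinx tool.

```python
# Upper fVCO limit for the -1 speed grade as quoted in this mail (assumption:
# verify against the datasheet before relying on it).
FVCO_MAX_MHZ = 1200.0
# Nominal external clock frequency.
F_IN_NOMINAL_MHZ = 50.0


def input_margin_percent(clk_out_mul):
    """How far (in percent) the input clock may rise above nominal
    before fVCO = f_in * clk_out_mul exceeds FVCO_MAX_MHZ."""
    f_in_max_mhz = FVCO_MAX_MHZ / clk_out_mul
    return (f_in_max_mhz - F_IN_NOMINAL_MHZ) / F_IN_NOMINAL_MHZ * 100.0


# Compare the three multiplier values discussed above.
margins = {mul: input_margin_percent(mul) for mul in (20, 22, 24)}
for mul, margin in margins.items():
    print(f"CLK_OUT_MUL = {mul}: input may be up to {margin:.1f}% high")
```

With a multiplier of 24 the margin comes out as zero, so any upward
drift of the input clock would push fVCO past the limit.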

To switch the system to 100 MHz I suggest keeping CLK_OUT_MUL = 20 and
changing CLK_OUT_DIV to 10. This way fVCO = 50 MHz * 20 = 1000 MHz, and
the system clock is 1000 MHz / 10 = 100 MHz.
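
The arithmetic for the current and the proposed settings can be
double-checked with a small Python sketch. This is only a model of the
multiply-then-divide scheme described in this mail, not of the real MMCM
primitive; the helper name is hypothetical and the 600..1200 MHz range
is the one quoted above.

```python
# fVCO operating range for the -1 speed grade as quoted in this mail
# (assumption: verify against the datasheet).
FVCO_MIN_MHZ = 600.0
FVCO_MAX_MHZ = 1200.0


def mmcm_clocks(f_in_mhz, clk_out_mul, clk_out_div):
    """Return (fVCO, output clock) in MHz, or raise ValueError if fVCO
    falls outside the allowed operating range."""
    f_vco_mhz = f_in_mhz * clk_out_mul
    if not FVCO_MIN_MHZ <= f_vco_mhz <= FVCO_MAX_MHZ:
        raise ValueError(
            f"fVCO = {f_vco_mhz} MHz is outside "
            f"{FVCO_MIN_MHZ}..{FVCO_MAX_MHZ} MHz"
        )
    return f_vco_mhz, f_vco_mhz / clk_out_div


# Current setting: CLK_OUT_MUL = CLK_OUT_DIV = 20.
current = mmcm_clocks(50.0, 20, 20)
# Proposed setting: CLK_OUT_MUL = 20, CLK_OUT_DIV = 10.
proposed = mmcm_clocks(50.0, 20, 10)

print("current :", current)
print("proposed:", proposed)
```

Both settings keep fVCO at 1000 MHz; only the final divider changes.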

You most probably know that even if the timing check after
place & route fails, ISE will still generate a (potentially faulty)
bitstream. We should be careful with that. Yes, ISE will print a
warning that some timing constraints are not met, but it won't stop
after that. GCC has the -Werror= option to turn specific warnings
into errors; I don't know of such a feature in the Xilinx tools.
Maybe you do?

Anyway, I think your idea to still try 100 MHz is a good one, even if
we're going to switch to FMC_CLK eventually.


-- 
With best regards,
Pavel Shatov

