[Cryptech Tech] Status of RSA timing tests etc
Pavel Shatov
meisterpaul1 at yandex.ru
Tue May 29 12:40:45 UTC 2018
On 29.05.2018 11:17, Joachim Strömbergson wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> Aloha!
>
> Pavel Shatov wrote:
>> Now Paul's idea to run the cores at FMC clock is very interesting,
>> and I believe it's viable. It needs a minor bit of investigation with
>> a scope, which I'm going to do next week.
>>
>> As far as I understand, all the cores we currently have can now run
>> at least at 100 MHz, so there should be no problem running them at 90
>> MHz, and there's no reason to clock them slower than that.
>
> How about that we do this in two steps:
>
> (1) Increase the sys_clk for the cores to 100 MHz as discussed and
> decided on before.
>
> (2) Investigate and try and run the cores synchronously with the 90 MHz.
>
> The point I see by doing this is that if we do (1) and don't see major
> performance improvements, we should know a bit more about where the
> bottleneck is hiding. Doing (1) should also confirm that we can run the
> cores at 90 MHz with no problems. So it would be a bit of a pipe-clean
> for (2). (1) _should_be an easy thing to do. Lets not wait anymore.
>
>
> Pavel, I've asked a few times now, can you please confirm or correct the
> suggested changes to the sys_clk clock generator parameters? And is the
> changes to those parameters the only place we need to update. No changes
> in the ucf-file?
Sorry for taking that long to answer. You're right, no editing of the
.ucf is necessary. It defines clocks external to the FPGA, and since our
external 50 MHz clock source stays the same, no changes are needed.
Clocks generated inside of the device are constrained automatically.
Currently the external 50 MHz goes into a clock synthesizer (MMCM =
Mixed Mode Clock Manager in Xilinx' terms). It works by first
multiplying the input frequency by CLK_OUT_MUL to obtain an intermediate
frequency called fVCO and then dividing it by CLK_OUT_DIV to generate
the output system clock.
There's a certain limit on fVCO: for our particular device (-1 speed
grade), according to the datasheet, it can be 600..1200 MHz. On one hand
the rule of thumb is to have fVCO as high as possible, because this way
during the final division more clock periods are "averaged" and the
output frequency is more stable.
We currently have CLK_OUT_MUL = CLK_OUT_DIV = 20, that means fVCO = 50
MHz * 20 = 1000 MHz, and the output clock is 1000 MHz / 20 = 50 MHz. We
could also use 22 for fVCO = 1100 MHz or 24 for fVCO = 1200 MHz.
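
As an illustration only, the arithmetic above can be written down as a small Python sketch. The helper name and constants are made up for this example and are not part of the Cryptech sources; the model also ignores any other MMCM dividers and only uses the multiply-then-divide scheme described here.

```python
# Hypothetical helper for illustration; not taken from the Cryptech code base.
F_VCO_MIN_MHZ = 600    # fVCO limits quoted above for the -1 speed grade
F_VCO_MAX_MHZ = 1200


def mmcm_clocks(f_in_mhz, mul, div):
    """Return (fVCO, output clock) in MHz for the given multiplier and divider."""
    f_vco = f_in_mhz * mul
    if not F_VCO_MIN_MHZ <= f_vco <= F_VCO_MAX_MHZ:
        raise ValueError(f"fVCO = {f_vco} MHz is outside "
                         f"{F_VCO_MIN_MHZ}..{F_VCO_MAX_MHZ} MHz")
    return f_vco, f_vco / div


# The three CLK_OUT_MUL = CLK_OUT_DIV choices mentioned above.
for n in (20, 22, 24):
    f_vco, f_out = mmcm_clocks(50, n, n)
    print(f"MUL = DIV = {n}: fVCO = {f_vco} MHz, sys_clk = {f_out} MHz")
```
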
Note that, on the other hand, it is often not recommended to operate at
the upper limit of fVCO, because if the input clock is slightly higher
for some reason (e.g. 50.1 MHz), then after multiplication fVCO may fall
outside of its operating range. That's why I decided to operate at 1000
MHz to have some margin. There may be other factors that influence the
choice of fVCO, such as EMI, for example.
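
To put a number on that margin, here is a rough Python sketch (function and constant names are hypothetical, chosen for this example) that computes how far the input clock may drift upwards before fVCO leaves the 1200 MHz limit quoted above.

```python
# Illustrative only; the 1200 MHz limit and 50 MHz input are the values from this mail.
F_VCO_MAX_MHZ = 1200
F_IN_NOMINAL_MHZ = 50


def input_headroom_percent(mul):
    """Percent by which the input clock may rise before fVCO exceeds its maximum."""
    f_vco_nominal = F_IN_NOMINAL_MHZ * mul
    return 100 * (F_VCO_MAX_MHZ - f_vco_nominal) / f_vco_nominal


for mul in (20, 22, 24):
    print(f"CLK_OUT_MUL = {mul}: headroom = {input_headroom_percent(mul):.1f} %")
```

With a multiplier of 24 there is no headroom at all, so a 50.1 MHz input would already push fVCO to about 1202.4 MHz.
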
To switch the system to 100 MHz I suggest keeping CLK_OUT_MUL = 20 and
changing CLK_OUT_DIV to 10, this way fVCO = 50 MHz * 20 = 1000 MHz, and
system clock is 1000 MHz / 10 = 100 MHz.
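
For comparison, a small Python sketch can list every multiplier/divider pair that gives a requested output clock while keeping fVCO within 600..1200 MHz. This is a simplification: it assumes whole-number multipliers and dividers only, and the function name is invented for the example.

```python
# Sketch under stated assumptions: integer MUL/DIV only, limits as quoted in this mail.
F_IN_MHZ = 50
F_VCO_RANGE_MHZ = (600, 1200)


def find_settings(f_out_mhz, f_in_mhz=F_IN_MHZ):
    """Return a list of (mul, div, fVCO) tuples that produce f_out_mhz exactly."""
    f_vco_min, f_vco_max = F_VCO_RANGE_MHZ
    settings = []
    for mul in range(1, f_vco_max // f_in_mhz + 1):
        f_vco = f_in_mhz * mul
        # Skip fVCO values that are too low or not a whole multiple of the target.
        if f_vco < f_vco_min or f_vco % f_out_mhz:
            continue
        settings.append((mul, f_vco // f_out_mhz, f_vco))
    return settings


for mul, div, f_vco in find_settings(100):
    print(f"CLK_OUT_MUL = {mul}, CLK_OUT_DIV = {div}, fVCO = {f_vco} MHz")
```

The suggested pair (20, 10) shows up in that list with fVCO = 1000 MHz, i.e. with the same margin as the current 50 MHz setup.
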
I guess you most probably know that even if the timing check after
place&route fails, ISE will still generate a (potentially faulty)
bitstream. We should be careful with that. Yes, ISE will print a
warning that some timing constraints are not met, but it won't stop
after that. Now in GCC there's a -Werror= thing to turn specific
warnings into errors, I don't know of such a feature in Xilinx tools,
maybe you know?
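
One possible workaround, sketched here in Python, is to have the build script scan the timing report text and refuse to continue when it reports errors. This is only a sketch: the regular expression assumes the report contains a summary line of the form "Timing errors: N", which must be checked against a real report from our flow, and the function names are made up for this example.

```python
import re

# ASSUMPTION: the timing report has a summary line like "Timing errors: <N>".
# Verify this against an actual report and adjust the pattern if needed.
TIMING_ERRORS_RE = re.compile(r"Timing errors:\s*(\d+)")


def timing_errors(report_text, pattern=TIMING_ERRORS_RE):
    """Extract the number of timing errors from the report text."""
    match = pattern.search(report_text)
    if match is None:
        # Treat a missing summary as a failure rather than silently passing.
        raise ValueError("no timing summary line found; adjust the pattern")
    return int(match.group(1))


def check_report(report_text):
    """Return a shell-style exit code: 0 if timing is clean, 1 otherwise."""
    return 0 if timing_errors(report_text) == 0 else 1
```

A build script could read the report file, pass its contents to check_report() and hand the result to sys.exit(), so that a failed timing check stops the build before the bitstream gets used.
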
Anyways, I think it's a good idea of yours to still try 100 MHz even if
we're going to switch to FMC_CLK eventually.
--
With best regards,
Pavel Shatov