[Cryptech Tech] Getting sync FMC to work

Pavel Shatov meisterpaul1 at yandex.ru
Sat Aug 18 00:03:12 UTC 2018


Hi,

I believe I've finally figured out why the new sync FMC bitstream 
doesn't work. First of all, there's an error in the Makefile, namely in 
core/platform/alpha/build/Makefile on line 15. The part should be 
"xc7a200tfbg484-1", not "xc7a200tfbg484-3". The last digit is the speed 
grade, which roughly indicates the maximum switching speed of the 
internal configurable logic blocks. Our boards are populated with the 
slower -1 devices, while the Makefile targets the faster -3.
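
A mismatch like this could be caught before the build starts. Below is a minimal sketch in Python, assuming the part string always ends in "-<speed grade>" as in the two names above. The helper names speed_grade() and targets_match() are made up for illustration and are not part of our build scripts.

```python
def speed_grade(part: str) -> str:
    """Return the text after the last '-' in a part name.

    Assumes names shaped like "xc7a200tfbg484-1", where the suffix
    after the final dash is the speed grade.
    """
    base, sep, grade = part.rpartition("-")
    if not sep or not base or not grade:
        raise ValueError(f"no speed grade suffix in part name: {part!r}")
    return grade


def targets_match(makefile_part: str, board_part: str) -> bool:
    """True if the Makefile target and the populated device are identical."""
    return makefile_part == board_part
```

With the values from this thread, speed_grade("xc7a200tfbg484-3") gives "3" and speed_grade("xc7a200tfbg484-1") gives "1", so targets_match() on the two names is False and a build wrapper could stop right there.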

This explains why I was getting timing errors during implementation 
while Rob and others were not. I'm more used to the GUI flow, especially 
when debugging, so I recreated the design as a GUI project and set the 
device correctly when doing so. Other developers who used the console 
build flow were targeting a slightly faster device, so they didn't get 
timing errors, but when the bitstream was programmed into the slower 
device actually on the board, it didn't work correctly.

[Michael, if you still need the GUI project, drop me a line. I didn't 
send it to you right away because I wasn't sure that I was doing 
everything properly.]

I've corrected the Makefile and tried building the bitstream under Linux 
using the console flow and now it did fail timing, yay :)


Now speaking of the particular timing errors, they are in the ChaCha 
core inside the CSPRNG. This is consistent with the fact that when 
running both unit-tests.py and parallel-signatures.py the STM32 locks up 
in hal_get_random() waiting for the CSPRNG valid status bit to go high. 
I tried hardwiring the CSPRNG valid bit to always read 1, and both tests 
started doing some work and then failed with HAL_ERROR_CSPRNG_BROKEN, 
which again makes sense.

From the timing report I suspect that the problem with ChaCha is that 
there are combinatorial adders in the chacha_qr.v module. The first 
obvious thing that comes to mind is adding registers after them, but 
maybe I'm wrong, so I'd leave that to Joachim, because he clearly has a 
better understanding of how ChaCha works.


Regarding the Makefile, I tried to double-check it for other possible 
anomalies. I've discovered two things:

1) Multi-threading was not enabled in xilinx.mk, while both map and 
place-and-route tools support multi-threading. From my experience it 
does speed up implementation, not that it cuts run-time in half, of 
course, but still. I've pushed xilinx.mk with multi-threading enabled 
(the corresponding switch is -mt, see commit message). I don't know the 
configuration of the machine where we build packages though, so if it 
only has two cores and we don't want to cripple it, then please discard 
the change.

2) SmartGuide was enabled. SmartGuide is a technology that is meant to 
be enabled late in the design phase when you are only doing small 
changes. It claims to speed up the implementation by reusing the 
previous implementation result as a reference and trying to keep as much 
of the placement and routing as possible and only touching nets that 
have changed. In my experience it doesn't work very well. Moreover, 
when the changes are substantial (e.g. a different configuration of the 
core selector), it only makes things worse, because it's easier to do 
everything from scratch than to convert one design into another. I'm 
also afraid SmartGuide can prevent us from having reproducible builds. 
Suppose someone clones our sources and starts building. Their tools will 
work from scratch and build a bitstream. Now, if we build a package and 
don't do 'make clean' beforehand, the tools will work in that "guided" 
mode and may build a different bitstream. Both bitstreams will function 
correctly, but they may be different. Not a very good situation, I 
guess. Xilinx tools are already a blackbox, and SmartGuide is a blackbox 
inside of a blackbox. I believe that the fewer blackboxes the better, so 
I'd suggest disabling SmartGuide altogether.
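
To check whether a clean build and a "guided" build really produce the same output, one could compare file hashes. A minimal sketch in Python follows; the function names are made up for illustration. One caveat I'm not fully sure about: the .bit header may carry a build date and time, in which case a raw file hash would differ even for identical configuration data and the header would need to be skipped first.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bitstream(path_a: str, path_b: str) -> bool:
    """True if both files are byte-for-byte identical (by SHA-256)."""
    return sha256_of(path_a) == sha256_of(path_b)
```

For example, same_bitstream("clean/alpha_fmc.bit", "guided/alpha_fmc.bit") would compare the two build outputs; both paths here are placeholders, not the real names in our tree.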


-- 
With best regards,
Pavel Shatov

