[Cryptech Tech] Short note on the current FPGA activities

Joachim Strömbergson joachim at secworks.se
Sun Jan 25 19:52:41 UTC 2015


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Aloha!

Thanks for your mail, Pavel. Very interesting and good ideas. I've added
comments below.


Шатов Павел wrote:
> Hello, Joachim! I have a suggestion on how to organize the clock
> network inside of an FPGA in our project. I've attached a picture for
> it to be more clear, please take a look. My suggestion is to separate
> the EIM interface from other parts of our design using either a pair
> of FIFOs or a dual-port RAM.

Yes, this is basically what I'm aiming for too. I would probably use
shallow FIFOs rather than dual port memories, chiefly because (I think)
it would be easier to port between FPGA vendor devices as well as
simulate. But I might be wrong and am happy to be proven so.
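
To make the FIFO idea a bit more concrete, here is a minimal software
sketch (in C, not RTL) of the Gray-coded pointer scheme commonly used in
dual-clock FIFOs. It is not taken from Pavel's design or from my code.
The depth of 8 entries and all names (FIFO_ADDR_BITS, bin2gray,
fifo_full and so on) are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical depth: 8 entries, so each pointer is 4 bits wide
 * (3 address bits plus one wrap bit). */
#define FIFO_ADDR_BITS 3u
#define FIFO_PTR_MASK  ((1u << (FIFO_ADDR_BITS + 1u)) - 1u)

/* Binary to Gray: consecutive values differ in exactly one bit, which
 * is what makes the pointer safe to sample in the other clock domain. */
static uint32_t bin2gray(uint32_t b)
{
    b &= FIFO_PTR_MASK;
    return b ^ (b >> 1);
}

/* Gray to binary: XOR-fold the higher bits downwards. */
static uint32_t gray2bin(uint32_t g)
{
    uint32_t b = g & FIFO_PTR_MASK;
    for (uint32_t shift = 1; shift <= FIFO_ADDR_BITS; shift <<= 1)
        b ^= b >> shift;
    return b;
}

/* Empty (read side): read pointer equals the synchronized write pointer. */
static bool fifo_empty(uint32_t rd_gray, uint32_t wr_gray_synced)
{
    return rd_gray == wr_gray_synced;
}

/* Full (write side): pointers are exactly one FIFO depth apart, which
 * in Gray code means the two top bits differ and the rest are equal. */
static bool fifo_full(uint32_t wr_gray, uint32_t rd_gray_synced)
{
    uint32_t top_two = 3u << (FIFO_ADDR_BITS - 1u);
    return wr_gray == (rd_gray_synced ^ top_two);
}

/* Self-check: round trip and the one-bit-change property. */
static void gray_self_check(void)
{
    for (uint32_t i = 0; i <= FIFO_PTR_MASK; i++) {
        uint32_t diff = bin2gray(i) ^ bin2gray(i + 1u);
        assert(gray2bin(bin2gray(i)) == i);
        assert(diff != 0u && (diff & (diff - 1u)) == 0u);
    }
}
```

In hardware each Gray-coded pointer would additionally pass through a
two-flop synchronizer before being compared in the other domain. The
model above leaves that out.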

I've been looking at clocking the cores with bclock/2 (66 MHz) that is
in phase with bclock. This makes it fairly easy to do clock domain
crossing. I wasn't able to reduce the bclock based on the suggestions by
Bunnie or reading the (extensive and somewhat confusing) documentation
for the i.MX6.

But one thing I don't like with the bclock and why I think your solution
is better is that bclock is not running until you start accessing
something via EIM. This means that the cores are basically dead until
that happens.

I think that we at least want to have parts of our cores running as soon
as the device has been configured. Things like key handling, random
number generation etc. But I think it is easier to just have the whole
Cryptech FPGA system running, and then have commands appear and be
handled whenever they arrive on the EIM interface.

The CLK2 is running at 50 MHz and I don't see any reasons to use a
slower clock than that for clocking our cores. All cores we have today
run at much higher frequencies in Spartan6. Thus 50 MHz will give us
fairly good performance as well as fairly relaxed timing constraints to
make it easy to meet timing in the devices. For external interfaces
(external key storage for example) we may need other clocks, but those
are special cases.


> I don't understand bunnie's original design with 8(!) BUFGs and TIG
> constraints. Do you understand his timing closure? Floorplanning at
> 133 MHz seems to be an overkill and why are there no IODELAY2
> primitives? I suggest that we rework his design, because having two
> independent clock domains will allow EIM arbiter to run at BCLK 
> frequency and the rest of the logic to run at any other arbitrary
> frequency (not necessarily BCLK or BCLK/2).

No, I don't understand his clock implementation strategy. So far I've
tried to work within his design and not redo it. (Change what we need,
fix blatant errors and leave the rest untouched.) But if you feel
comfortable redesigning and simplifying the EIM clock implementation, I
think we would not be the only ones to be happy and benefit from it. The
current implementation looks very constrained, but I assumed the reason
was board timing and requirements from the i.MX6.

Randy: Is it ok to have Pavel do this?
I think it sounds like a good approach.

As you may see from my work code I've at least pushed the EIM into a
separate submodule. I suggest we do that no matter if we redesign the
whole EIM clock and I/O structure or not.

And I think we should decide on having two clock domains where CLK2 is
the one driving our cores.


> setup_fpga() that you call rubber chicken voodoo. I find bunnie's
> original code with all these memcpy(0xDEADBEEF, 0xHEXVALUE) very
> difficult to debug. Have you figured out what mode is used in his
> code? Sync or async? I've created C header with all the EIM-related
> registers and corresponding bitmaps. How can I upload it to the
> repository btw? Do I need some special write permission or
> something?

No, I have not been able to properly figure it out. I've once again
tried to trust Bunnie as much as possible since they do have things
working in the FPGA and accessible from the CPU. So I've tried to just
change what I need to try and lower the bclk from 133 MHz to something
more manageable.

But if you have done a cleanup of that code it sounds really good. Can
you run the bunnie FPGA designs with your code? I.e., do you have a
working baseline (which is what I'm trying to establish)?

Regarding sync vs async - the bclock is only used for asynch I think. So
I think that is what is being used.


> To sum up I suggest the following smaller sub-plan of your plan for
> now: 1) Sort out how to properly configure EIM on iMX6Q.

Only if we think we need to do that. Otherwise we just keep it as is. It
does work, and if you have working, cleaned-up code we might not need to
do much more, no?

I don't think it is vital for us to reduce bclock from 133 MHz.



> 2) Discuss clocking structure.

And implement it if we see that we need to do much more than what we
have today. We have a working clk50 based on CLK2. What we need is
either a cleaned-up version of Bunnie's EIM clocking structure or a
reworked one designed by you.


> 3) Write EIM arbiter.

Do you see that the arbiter would also contain the clock domain crossing
FIFOs (or dual port RAM), or would they be a separate module?
I'm undecided. One option is to have the EIM interface just present a
memory-like interface (cs, we, address, 32-bit write data and 32-bit
read data from cores), though this might run at bclock speed. The eim
module would then be an entity with just its own clocks.

If we on the alpha board want to reuse the structure with two clock
domains (which we may want to do), we then just have to replace the
external interface block.

The clock domain crossing functionality could then either be placed at
top level as a separate module or inside the cryptech_top module (just
as a suggestion on name) that encapsulates the cores. I think having
them inside would be cleaner.


> 4) Add small amount of BRAM instead of cores in the right side of the
> picture.

Yes. And possibly having something very simple like a 32-bit adder that
takes two words that can be written and return the sum in a third word.
This allows us to test that we can do real operations.
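
As a rough sketch of what I mean, here is a small C model of such an
adder behind a word-addressed write/read interface. The address map
(ADDER_ADDR_A, ADDER_ADDR_B, ADDER_ADDR_SUM) and all function names are
made up for illustration and do not correspond to any existing core.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical word addresses, not taken from any existing core. */
enum { ADDER_ADDR_A = 0, ADDER_ADDR_B = 1, ADDER_ADDR_SUM = 2 };

/* Software stand-in for the two writable operand registers. */
struct adder_core {
    uint32_t a;
    uint32_t b;
};

/* Write access: only the two operand words are writable. */
static void adder_write(struct adder_core *core, uint32_t addr, uint32_t data)
{
    if (addr == ADDER_ADDR_A)
        core->a = data;
    else if (addr == ADDER_ADDR_B)
        core->b = data;
    /* Writes to any other address are ignored. */
}

/* Read access: the third word returns the sum, wrapping modulo 2^32. */
static uint32_t adder_read(const struct adder_core *core, uint32_t addr)
{
    switch (addr) {
    case ADDER_ADDR_A:   return core->a;
    case ADDER_ADDR_B:   return core->b;
    case ADDER_ADDR_SUM: return core->a + core->b;
    default:             return 0;
    }
}

/* The check a CPU-side test would do: write two words, read the third. */
static void adder_self_test(void)
{
    struct adder_core core = { 0, 0 };

    adder_write(&core, ADDER_ADDR_A, 0x00001234u);
    adder_write(&core, ADDER_ADDR_B, 0x00000001u);
    assert(adder_read(&core, ADDER_ADDR_SUM) == 0x00001235u);

    /* Overflow case: the carry out of bit 31 is dropped. */
    adder_write(&core, ADDER_ADDR_A, 0xFFFFFFFFu);
    assert(adder_read(&core, ADDER_ADDR_SUM) == 0u);
}
```

On the real board the two access functions would be replaced by reads
and writes over EIM. The expected values stay the same.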


> 5) Write test program to fill BRAM with some data pattern, read it
> back and compare. After we complete 5) and have stable interface
> between the CPU and the FPGA we can move to the right and start
> adding cores to our design.

Exactly. Looks very much like what I had in mind.
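
A minimal sketch of such a fill, read back and compare test could look
like the code below. It only assumes that the BRAM is visible to the
CPU as an array of 32-bit words. How that mapping is set up (setup_fpga
and the EIM configuration) is deliberately left out, and the function
names and the pattern are my own assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address-dependent pattern, so stuck or swapped address lines show up
 * as mismatches. The multiplier is odd, so every index in a 32-bit
 * range maps to a distinct word. */
static uint32_t pattern_word(size_t index, uint32_t seed)
{
    return seed ^ ((uint32_t)index * 0x9E3779B9u);
}

/* Fill the memory with the pattern, then read everything back and
 * compare. Returns the number of mismatching words (0 means pass). */
static size_t bram_pattern_test(volatile uint32_t *mem, size_t words,
                                uint32_t seed)
{
    size_t errors = 0;

    for (size_t i = 0; i < words; i++)
        mem[i] = pattern_word(i, seed);

    for (size_t i = 0; i < words; i++) {
        if (mem[i] != pattern_word(i, seed))
            errors++;
    }
    return errors;
}

/* Dry run against ordinary RAM standing in for the FPGA BRAM. */
static void bram_self_test(void)
{
    uint32_t fake_bram[256];

    assert(bram_pattern_test(fake_bram, 256, 0xA5A5A5A5u) == 0);
}
```

Running it with a few different seeds gives each location more than one
bit pattern, which helps to catch stuck data bits.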


> Regarding 1) I've rewritten bunnie's setup_fpga() using my header
> file with convenient bitfields, this way it's much easier to debug
> it. Number 2) is in the attached picture, do you have any suggestions
> to improve it? Speaking of 3) I've written write access handler and
> started writing read access handler. Waiting to hear from you,

You could create a new repo for your work and just start pushing
(something under user/pavel, for example). Your rework of setup_fpga
would be great to get. I would like to take a look at the access
handlers, which I take it is what you call the FIFOs, correct?

I have no big suggestions on improving the actual diagram. I think it
conveys what we want to do.

Not sure how we divide the work between us, it seems that we are heading
in the same direction. How much time can you spend on this the coming
week(s) and how long do you think it would take you to complete it? Say
to go from what you have today to having SHA-1 clocked at 50 MHz and
connected to the CPU via EIM?

- -- 
Med vänlig hälsning, Yours

Joachim Strömbergson - Alltid i harmonisk svängning.
========================================================================
 Joachim Strömbergson          Secworks AB          joachim at secworks.se
========================================================================
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQIcBAEBCAAGBQJUxUmIAAoJEF3cfFQkIuyNyBcP/2E4amyAjTZHvWvFKYGxCfeU
eePOo4yHV/U23C0ijkm/tr3IcGB8+FT7Sov31Q9n48WdWHcUytKgoEZuSHjN5YTy
m7GIvOR3GVuxtIq5MycGjn/MpvakGDTPCtS9vTItOT7sJUTzZjhCc5Ww7ZhqWxS5
C9ndvBP3YI/vlzZToguIYNUNVzB2mfCVt5hLrL5XSUlI04ml3Sp9tJPGIxoOHlMK
SmYKAGDBczLojd4qKP7FwEf9HxGG5Fuh+fJJ5z4kIG0uDYnDYcAzRNZOb+dd/SRy
MqCuAu2F8vYHXAbifsHDvH/rKzkYxZQjZcGYj9O/2+22lh8PeII8uxz6BA7npkMt
Zveld3ANrZiDCCAq2yv8YuXaPH384ziKWwLgXyffs7Q8NVAd/lrkVOWYSw7IqaLn
5jAKb/hDJAtQRV8MgiwVWF52LmlOuO660dkaS0pNrxm5ERD6Er3Ywv4OgTPNCjao
7FOI1jDM9LxlNQleeGYfM2TZ0zg7INamRuNozZua3geQQhtsR0oYQr099oHXoUnP
NV+KPMOt7FEpZgswso2HDrjLyE3HC8YD+YqPHT+irwigDbxAQKmIggCFeCI5tTz1
oWr6Zz8IbJ5u0NWCWbtcu/US/7RpzoJiQ7fwGig3Q+5iU4hPG0Svdt24GBxdvy64
dE4F5M4/RB55hdh/5xIB
=3lr+
-----END PGP SIGNATURE-----
