[Cryptech Tech] NaCL in hardware
Pavel Shatov
meisterpaul1 at yandex.ru
Wed Sep 23 09:15:59 UTC 2015
On 19.09.2015 13:41, Peter Schwabe wrote:
> Joachim Strömbergson <joachim at secworks.se> wrote:
>
>> Aloha!
>
> Dear all,
>
>> After looking at the source code I'd say its not that hard to do a
>> manual translation to Verilog. Most tools (Xilinx ISE for example)
>> supports both languages. But projects with mixed languages often
>> requires buying a license for that feature.
>
> I checked with Michi Hutter and he also said that it would probably not
> be too hard, but also not really worth the effort because almost all
> tools support both languages. If somebody ends up porting the
> implementation to Verilog, I'll of course be very happy to link to that
> port.
>
> Cheers,
>
> Peter
>
Hello!

I've taken a look at the source code and read the accompanying paper. I'm
afraid VHDL -> Verilog translation is not the only thing we'll
need to do in order to adapt this library to our needs.

This library is said to be optimized for small, slow, low-power,
low-resource devices, while our FPGA is going to be large and fast. For
example, they decided to use single-port RAM for storage to save
resources, but contemporary FPGAs all have inherently dual-port block
memory, so that saving makes no difference for us. Another problem is their
multiplier. As far as I understand, they implemented a digit-serial
multiplier, which is not very fast according to their report (a few
MHz). Granted, they are talking about 130 nm technology, while Artix-7 is
28 nm, but it won't simply be 130/28 = 4.6 times faster. Again, today's
FPGAs have dedicated hardware multipliers that come at no extra cost. Yes,
they are vendor-specific, and I obviously cannot decide for the whole
team whether we can use them or not. But my personal view is that as
long as we clearly document that this particular module uses a
vendor-specific multiplier, and provide a generic replacement module for
simulation, it's OK.
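
To illustrate why a digit-serial multiplier trades speed for area, here is a
minimal Python model of the general technique. It is only a sketch: the
function name, the 4-bit digit size and the 256-bit operand width are my
assumptions for illustration and are not taken from the library or its paper.
Each loop iteration stands for one pass through a small multiply-accumulate
datapath, so a narrower digit means less hardware but more iterations.

```python
P25519 = 2**255 - 19  # Curve25519 field prime


def digit_serial_mulmod(a, b, p=P25519, digit_bits=4, width=256):
    """Return (a*b mod p, iterations), consuming b one digit per iteration.

    Hypothetical model: digit_bits and width are illustrative parameters,
    not values from any real implementation. b must fit in `width` bits.
    """
    mask = (1 << digit_bits) - 1
    num_digits = -(-width // digit_bits)  # ceiling division
    acc = 0
    iterations = 0
    # Process the digits of b from most significant to least significant.
    for i in reversed(range(num_digits)):
        digit = (b >> (i * digit_bits)) & mask
        # Shift the accumulator by one digit, add the partial product
        # a * digit, then reduce modulo p.
        acc = ((acc << digit_bits) + a * digit) % p
        iterations += 1
    return acc, iterations


if __name__ == "__main__":
    result, n = digit_serial_mulmod(3, 5)
    print(result, n)
```

With 4-bit digits a 256-bit operand takes 64 iterations; with 16-bit digits
it takes 16. A wide dedicated hardware multiplier corresponds to the second
case, which is the point made above.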

Another issue is that this library is a very specific crypto
co-processor with its own very specific instruction set, and it has
something similar to microcode. As far as I understand, the co-processor
itself can only do low-level math, while the actual curve arithmetic
is implemented by small programs residing in a dedicated ROM. I see that
they distribute microcode assembly sources (.nacl files) for Curve25519
and a special assembler tool written in Java. As far as I understand, to
add support for the P-256 and P-384 curves that we need for DNSSEC, we'll
have to rewrite their microcode. At first sight, this is not trivial.
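
The split between a low-level math datapath and programs in ROM can be
pictured with a toy Python model. Everything here is hypothetical: the
ADD/SUB/MUL opcodes, the register names and the tuple format are invented
for illustration and do not reflect the library's real instruction set or
the syntax of its .nacl files.

```python
P25519 = 2**255 - 19                              # Curve25519 field prime
P256 = 2**256 - 2**224 + 2**192 + 2**96 - 1       # NIST P-256 field prime

# Hypothetical datapath: only modular add, subtract and multiply.
OPS = {
    "ADD": lambda x, y, p: (x + y) % p,
    "SUB": lambda x, y, p: (x - y) % p,
    "MUL": lambda x, y, p: (x * y) % p,
}


def run_microprogram(rom, regs, p):
    """Execute a list of (op, dst, src1, src2) steps over a register file."""
    regs = dict(regs)  # work on a copy, leave the caller's registers alone
    for op, dst, src1, src2 in rom:
        regs[dst] = OPS[op](regs[src1], regs[src2], p)
    return regs


# Toy "ROM" contents: r = (x + y) * (x - y) = x^2 - y^2 mod p.
DIFF_OF_SQUARES = [
    ("ADD", "t0", "x", "y"),
    ("SUB", "t1", "x", "y"),
    ("MUL", "r", "t0", "t1"),
]

if __name__ == "__main__":
    out = run_microprogram(DIFF_OF_SQUARES, {"x": 7, "y": 3}, P25519)
    print(out["r"])
```

In this model, switching the prime is just a parameter, but the curve
formulas live in the ROM program. Supporting a different curve family means
writing new programs, which is the rewrite effort described above.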

I don't want to say that this library is bad. It's great, and it's a huge
amount of work, but it's oriented more towards ASICs than FPGAs, and
a huge effort was made to make the ASIC implementation cheap. As I see
it, we have a slightly different target, at least in the near future.
Please correct me if I'm wrong.

--
With best regards,
Pavel Shatov