[Cryptech-Commits] [core/platform/alpha] 05/07: This commit turns off the "equivalent_register_removal" setting for XST.

git at cryptech.is git at cryptech.is
Thu Jan 23 09:01:36 UTC 2020


This is an automated email from the git hooks/post-receive script.

meisterpaul1 at yandex.ru pushed a commit to branch fmc_clk_core
in repository core/platform/alpha.

commit fbf287f57879678fe8cf4a74e07e72ca5c7b153b
Author: Pavel V. Shatov (Meister) <meisterpaul1 at yandex.ru>
AuthorDate: Tue Jan 21 15:46:06 2020 +0300

    This commit turns off the "equivalent_register_removal" setting for XST.
    
    Okay, here's the story. The Xilinx synthesis tool ("XST") is smart in the
    sense that it detects all registers with equivalent behaviour, removes all of
    them but one, and connects all loads to that one flip-flop. This works
    fine most of the time and usually even saves some resources, but for our
    particular design it was starting to cause just too many problems.
    
    The reason is that the ModExp* cores exploit the parallel nature of an FPGA:
    for example, ModExpNG instantiates four copies of the modular multiplier
    internally. Those multipliers all operate the same way (but on different data,
    of course), so all their internal signals such as, say, clock enables and word
    counters are the same. XST happily throws away all the internals from three
    multipliers, leaves only one instance of control signals and then the map and
    place&route tools start struggling for hours fusing this all together. Turning
    off equivalent register removal entirely leads to excessive resource
    consumption, so the optimal solution would be to selectively turn it off only
    for those tricky places where several copies of control signals are actually
    required to meet timing. The problem is that according to Xilinx's docs (UG687
    v14.5, p. 363) the "equivalent_register_removal = no" inline constraint can be
    applied to entire modules, not only to individual registers, but I was unable
    to get this to work: XST seems to just ignore it. This may have been fixed in
    Vivado though, I haven't tried yet.
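    
    For illustration, a minimal sketch of the per-register form of the inline
    constraint, using XST's Verilog attribute syntax. The module and signal names
    are hypothetical and are not taken from the ModExp* sources.
    
    ```verilog
    // Hypothetical example module, not part of the actual cores.
    module mult_ctrl_sketch
    (
        input  wire clk,
        input  wire ena_in,
        output wire ena_out
    );
        // Ask XST to keep this flip-flop even if an equivalent one exists
        // in another instance of the same module.
        (* equivalent_register_removal = "no" *)
        reg ena_r = 1'b0;
    
        always @(posedge clk)
            ena_r <= ena_in;
    
        assign ena_out = ena_r;
    endmodule
    ```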
    
    Another potential solution is to prepend every register declaration inside the
    modular multiplier with this constraint, but that would look just ugly. One
    trick I've seen somewhere is to `define a new 'keep_equivalent_reg' "keyword"
    to be '"equivalent_register_removal = no" reg' and tweak register declarations
    accordingly. That seems to look somewhat less ugly, I don't know.
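    
    A rough sketch of what that `define trick might look like, assuming the macro
    expands to the attribute followed by the 'reg' keyword. The counter module is
    hypothetical and only shows how a declaration would change.
    
    ```verilog
    // Macro that stands in for 'reg' wherever duplicates must be preserved.
    `define keep_equivalent_reg (* equivalent_register_removal = "no" *) reg
    
    // Hypothetical example module, not part of the actual cores.
    module word_counter_sketch
    (
        input  wire       clk,
        input  wire       rst,
        output wire [3:0] cnt
    );
        // Declared with the macro instead of a plain 'reg'.
        `keep_equivalent_reg [3:0] cnt_r = 4'd0;
    
        always @(posedge clk)
            if (rst) cnt_r <= 4'd0;
            else     cnt_r <= cnt_r + 1'b1;
    
        assign cnt = cnt_r;
    endmodule
    ```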
    
    Yet another way around might be to use the "max_fanout" constraint instead.
    Say there are eight DSP slices per multiplier (thirty-two DSP slices total,
    since there are four multiplier instances). In theory we can constrain their
    clock enable fanout to not exceed 8. The problem is that XST will first throw
    away three of the clock enables, and then gradually add them back to limit
    each clock enable fanout to 8. This way there's no guarantee that the first
    clock enable will be routed to all eight DSP slices in the first multiplier:
    it can be routed to DSP slices in the three remaining multipliers as well,
    since XST will just try to limit the fanout. It's difficult to predict how the
    place&route tools will handle this.
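    
    As a sketch only, the fanout limit could be attached to the clock enable
    register roughly like this. The module and signal names are hypothetical, and
    this variant was not tried in this commit.
    
    ```verilog
    // Hypothetical example module, not part of the actual cores.
    module dsp_ce_sketch
    (
        input  wire clk,
        input  wire ce_in,
        output wire ce_out
    );
        // Ask XST to replicate this register so that no copy drives more
        // than 8 loads. Which loads each copy drives is up to the tool.
        (* max_fanout = "8" *)
        reg ce_r = 1'b0;
    
        always @(posedge clk)
            ce_r <= ce_in;
    
        assign ce_out = ce_r;
    endmodule
    ```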
    
    Anyway, the current slice consumption with 2x ModExpA7 and 1x ModExpNG is ~40%,
    and the timing situation is very good (the very first phase of place and
    route already has zero setup time violations, yay!). With global equivalent
    register removal turned on, utilization drops to ~35%, but timing is impossible
    to meet even on the highest map and place&route effort setting. I believe the
    best way forward is to just keep global removal disabled for now. We may
    revisit this in the future, say, if we decide to generate a custom dedicated
    RSA-only signer bitstream with as many core instances as possible. Then every
    register will count, but I suspect we won't get away with just re-enabling
    global equivalent register removal alone: at the very least, some
    floorplanning will likely be required too.
---
 build/xilinx.opt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/build/xilinx.opt b/build/xilinx.opt
index 1ac8957..91ce9fa 100644
--- a/build/xilinx.opt
+++ b/build/xilinx.opt
@@ -43,5 +43,5 @@
 -use_sync_set Auto
 -use_sync_reset Auto
 -iob auto
--equivalent_register_removal YES
+-equivalent_register_removal NO
 -slice_utilization_ratio_maxmargin 5


