[Cryptech-Commits] [core/math/modexpa7] branch systolic_crt updated: Added some info to the README file.
git at cryptech.is
Fri Aug 11 21:54:32 UTC 2017
This is an automated email from the git hooks/post-receive script.
meisterpaul1 at yandex.ru pushed a commit to branch systolic_crt
in repository core/math/modexpa7.
The following commit(s) were added to refs/heads/systolic_crt by this push:
new 256d992 Added some info to the README file.
256d992 is described below
commit 256d99236e6359630621d0fc807e54bd91eecc81
Author: Pavel V. Shatov (Meister) <meisterpaul1 at yandex.ru>
AuthorDate: Sat Aug 12 00:53:33 2017 +0300
Added some info to the README file.

 README.md                       | 35 ++++++++++++++++++++++++++++++
 src/tb/tb_systolic_multiplier.v |  2 ++
 2 files changed, 32 insertions(+), 5 deletions(-)
diff --git a/README.md b/README.md
index 8abc6bc..35532d7 100644
--- a/README.md
+++ b/README.md
@@ -10,12 +10,17 @@ The core has two synthesis-time parameters:
 * **OPERAND_ADDR_WIDTH** - Sets the _largest supported_ operand width. This affects the amount of block memory that is reserved for operand storage. Largest operand width in bits, that the core can handle is 32 * (2 ** OPERAND_ADDR_WIDTH). If the largest possible modulus is 1024 bits long, set OPERAND_ADDR_WIDTH = 5. For 2048-bit moduli support set OPERAND_ADDR_WIDTH = 6, for 4096-bit capable core set OPERAND_ADDR_WIDTH = 7 and so on.
- * **SYSTOLIC_ARRAY_POWER** - Determines the number of processing elements in the internal systolic array, total number of elements is 2 ** SYSTOLIC_ARRAY_POWER. This affects the number of DSP slices dedicated to parallelized multiplication. Allowed values are 1..OPERAND_ADDR_WIDTH-1, higher values produce higher performance core at the cost of higher device utilization.
+ * **SYSTOLIC_ARRAY_POWER** - Determines the number of processing elements in the internal systolic array, total number of elements is 2 ** SYSTOLIC_ARRAY_POWER. This affects the number of DSP slices dedicated to parallelized multiplication. Allowed values are 1..OPERAND_ADDR_WIDTH-1, higher values produce higher performance core at the cost of higher device utilization. The number of used DSP slices is NUM_DSP = 10 + 2 * (2 + 7 * (2 ** SYSTOLIC_ARRAY_POWER)). Here's a quick reference table:

-TODO: Give device utilization numbers for different values of SYSTOLIC_ARRAY_POWER.


+| SYSTOLIC_ARRAY_POWER | NUM_DSP |
+|----------------------|---------|
+| 1                    | 42      |
+| 2                    | 70      |
+| 3                    | 126     |
+| 4                    | 238     |
+| 5                    | 462     |
+
+Given that the Alpha board FPGA has 740 DSP slices, SYSTOLIC_ARRAY_POWER=5 is the largest possible setting. Note that if two cores are needed (e.g. to do the two easier CRT exponentiations simultaneously), this parameter should be reduced to 4 to fit two cores into the device.
## API Specification
@@ -56,6 +61,26 @@ Read-only register bits:
[0] "ready" control bit
The "valid" status bit is cleared as soon as the core starts exponentiation, and gets set after the operation is complete. The "ready" status bit is cleared when the core starts precomputation and is set after the speedup coefficient is precalculated.
+ * **MODE**
+Mode register bits:
+[31:2] Don't care, always read as 0
+[1] "CRT enable" control bit
+[0] Don't care, always read as 0
+The "CRT enable" control bit allows the core to take advantage of the Chinese remainder theorem to speed up RSA operations. When the CRT mode is disabled (MODE[1] = 0), the message (base) is assumed to be as large as the modulus. When the CRT mode is enabled (MODE[1] = 1), the message is assumed to be twice as large as the modulus and the core will reduce it before starting the exponentiation. Note that if the core was compiled for e.g. 4096-bit operands (OPERAND_ADDR_WIDTH=7), it can onl [...]
+
+* **MODULUS_BITS**
+Length of modulus in bits, must be a multiple of 32. Smallest allowed value is 64, largest allowed value is 32 * (2 ** OPERAND_ADDR_WIDTH). If the modulus is e.g. 1000 bits wide, it must be prepended with 24 zero bits to make it contain an integer number of 32-bit words.
+
+* **EXPONENT_BITS**
+Length of exponent in bits. Smallest allowed value is 2, largest allowed value is 32 * (2 ** OPERAND_ADDR_WIDTH).
+
+* **BUFFER_BITS**
+Length of operand buffer in bits. This read-only parameter returns the length of the internal operand buffer and allows the largest supported operand width to be determined at runtime.
+
+* **ARRAY_BITS**
+Length of systolic array in bits. This read-only parameter returns the length of the internal systolic multiplier array and allows the SYSTOLIC_ARRAY_POWER compile-time setting to be determined at runtime.
+
+
The second part of the address space contains four operand banks.
Length of each bank (BANK_LENGTH) depends on the largest supported operand width: 0x80 bytes for 1024-bit core (OPERAND_ADDR_WIDTH = 5), 0x100 bytes for 2048-bit core (OPERAND_ADDR_WIDTH = 6), 0x200 bytes for 4096-bit core (OPERAND_ADDR_WIDTH = 7) and so on.
diff --git a/src/tb/tb_systolic_multiplier.v b/src/tb/tb_systolic_multiplier.v
index 96e76d5..f476aa4 100644
--- a/src/tb/tb_systolic_multiplier.v
+++ b/src/tb/tb_systolic_multiplier.v
@@ -162,6 +162,8 @@ module tb_systolic_multiplier;
.ena (ena),
.rdy (rdy),
+ .reduce_only (1'b0),
+
.a_bram_addr (core_a_addr),
.b_bram_addr (core_b_addr),
.n_bram_addr (core_n_addr),

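The sizing rules quoted in the README hunks above (NUM_DSP, largest operand width, BANK_LENGTH, and the 32-bit padding of MODULUS_BITS) can be checked with a short Python sketch. This is not part of the commit: every function name below is hypothetical, the formulas are copied from the README text, the 740-slice budget is the figure quoted for the Alpha board, and the allowed range 1..OPERAND_ADDR_WIDTH-1 is assumed from the parameter description.

```python
# Illustrative sketch only; helper names are hypothetical and not part of the core.
# All formulas are taken from the README text shown in the diff above.

ALPHA_DSP_SLICES = 740  # DSP slice count the README quotes for the Alpha board FPGA


def num_dsp(systolic_array_power: int) -> int:
    """NUM_DSP = 10 + 2 * (2 + 7 * (2 ** SYSTOLIC_ARRAY_POWER))."""
    return 10 + 2 * (2 + 7 * (2 ** systolic_array_power))


def max_operand_bits(operand_addr_width: int) -> int:
    """Largest operand width in bits: 32 * (2 ** OPERAND_ADDR_WIDTH)."""
    return 32 * (2 ** operand_addr_width)


def bank_length_bytes(operand_addr_width: int) -> int:
    """BANK_LENGTH in bytes, i.e. the largest operand width divided by 8."""
    return max_operand_bits(operand_addr_width) // 8


def padded_modulus_bits(modulus_bits: int) -> int:
    """Round a modulus length up to the next multiple of 32 bits."""
    return -(-modulus_bits // 32) * 32


def largest_power_that_fits(cores, operand_addr_width, dsp_budget=ALPHA_DSP_SLICES):
    """Largest SYSTOLIC_ARRAY_POWER such that `cores` instances fit the DSP budget.

    Assumes the allowed range is 1..OPERAND_ADDR_WIDTH-1. Returns None if
    even the smallest setting does not fit.
    """
    best = None
    for power in range(1, operand_addr_width):
        if cores * num_dsp(power) <= dsp_budget:
            best = power
    return best


if __name__ == "__main__":
    for power in range(1, 6):
        print(power, num_dsp(power))
    print("one core: ", largest_power_that_fits(1, operand_addr_width=7))
    print("two cores:", largest_power_that_fits(2, operand_addr_width=7))
    print("pad bits for a 1000-bit modulus:", padded_modulus_bits(1000) - 1000)
```

Under these assumptions the sketch reproduces the reference table and the README's conclusions: one core fits with SYSTOLIC_ARRAY_POWER=5, and two cores fit only after the setting is reduced to 4.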
To stop receiving notification emails like this one, please contact
the administrator of this repository.