Investigating Yosys's n-bit adder designs for ECP5 #59
bkushigian
started this conversation in
General
Reverse engineering Yosys's n-bit adders for ECP5
I want to understand how Yosys is creating their adders. This has potential benefits:
- See how different features of the ECP5 architecture are used (e.g., carry chains?)
- Get a feeling for possible design choices that aren't obvious at first glance
- Become familiar with the output format so I can target it when I compile our modules
Each adder I inspect is written in behavioral Verilog and translated to
structural Verilog with the script:
I'll list the behavioral Verilog of each module as well as the structural
Verilog resulting from running Yosys on the behavioral Verilog.
A Quick Word on LUT4 Inputs
The inputs A..D to a LUT4 form a 4-bit number DCBA (notice they're reversed!). Here D is the MSB and A is the LSB. We can label the bits by their hex address. For an INIT value of 0x0FF0 we would have the following:
I've broken each 16-bit number up into nibbles. This is very useful since
- D and C move across nibbles (inter-nibble inputs)
- B and A move within nibbles (intra-nibble inputs)
The following table shows how D and C select the nibble:
I find it helpful to have this high-level visualization because DC and AB often work together in 'intuitive' ways, and this translates to inter- vs intra-nibble movement.
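To make the addressing concrete, here is a minimal Python sketch of a LUT4 lookup, assuming the output is simply bit number DCBA of INIT. The helper names `lut4` and `nibble` are mine for illustration and are not part of Yosys or the ECP5 primitives.

```python
def lut4(init, a, b, c, d):
    """Return bit number DCBA of the 16-bit INIT value (D is the MSB, A the LSB)."""
    index = (d << 3) | (c << 2) | (b << 1) | a
    return (init >> index) & 1

def nibble(init, c, d):
    """Return the nibble of INIT selected by the inter-nibble inputs D and C."""
    return (init >> (4 * ((d << 1) | c))) & 0xF

INIT = 0x0FF0
for d in (0, 1):
    for c in (0, 1):
        # D and C pick the nibble; B and A would then pick a bit inside it.
        print(f"D={d} C={c} -> nibble {(d << 1) | c} = 0x{nibble(INIT, c, d):X}")
```

For 0x0FF0 this reports nibbles 0 and 3 as 0x0 and nibbles 1 and 2 as 0xF, counting nibble 0 as the least significant.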
For instance, we'll see a lot of symmetric LUT4 INIT values, where the first and last nibbles are the same and the middle two nibbles are the same. This corresponds to the XOR of inputs C and D being important somehow.
We'll also see one or both of inputs A and B be held constant (e.g., to 0). This
means that we don't really care which value within a nibble we use.
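Both observations can be checked with a small sketch. It assumes the DCBA bit addressing described above; the helper names and the example values 0xA55A and 0x1234 are arbitrary choices of mine, not taken from any Yosys output.

```python
def lut4(init, a, b, c, d):
    """Return bit number DCBA of the 16-bit INIT value."""
    index = (d << 3) | (c << 2) | (b << 1) | a
    return (init >> index) & 1

def is_symmetric(init):
    """True if nibbles 0 and 3 match and nibbles 1 and 2 match."""
    n = [(init >> (4 * i)) & 0xF for i in range(4)]
    return n[0] == n[3] and n[1] == n[2]

def depends_only_on_c_xor_d(init):
    """True if the output sees C and D only through C XOR D."""
    for a in (0, 1):
        for b in (0, 1):
            if lut4(init, a, b, 0, 0) != lut4(init, a, b, 1, 1):
                return False
            if lut4(init, a, b, 1, 0) != lut4(init, a, b, 0, 1):
                return False
    return True

for init in (0x0FF0, 0xA55A, 0x1234):
    print(f"0x{init:04X}", is_symmetric(init), depends_only_on_c_xor_d(init))

# With A = B = 0 only the LSB of each nibble is read, so 0x0FF0 and 0x0110
# behave identically in that configuration.
assert all(lut4(0x0FF0, 0, 0, c, d) == lut4(0x0110, 0, 0, c, d)
           for c in (0, 1) for d in (0, 1))
```

The two predicates agree on every INIT value, which is one way of stating the "XOR of C and D" intuition precisely.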
2 Bit Adder
A 2-bit adder. This can in theory be handled by a single slice.
Behavioral Verilog:
Structural Verilog:
This has the following structure:
Let's look at each of the LUTs:
LUT4_0
LUT4_0 computes the bitwise XOR of inputs a[0] and b[0] and wires it to out[0]: notice that its INIT value 0x0ff0 is an XOR truth table 0b0110 with each bit repeated 4 times:
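As a rough illustration, the sketch below expands the 4-bit XOR truth table into a 16-bit INIT by repeating each bit four times, then prints the nibbles with a ^ under the LSB of each one. `repeat_bits` is a hypothetical helper of mine, not something Yosys provides.

```python
def repeat_bits(table4, times=4):
    """Expand a 4-bit truth table so that each bit is repeated `times` times."""
    init = 0
    for i in range(4):
        if (table4 >> i) & 1:
            init |= ((1 << times) - 1) << (times * i)
    return init

XOR_TABLE = 0b0110
init = repeat_bits(XOR_TABLE)
print(f"0x{init:04X}")  # prints 0x0FF0

bits = f"{init:016b}"
print(" ".join(bits[i:i + 4] for i in range(0, 16, 4)))  # 0000 1111 1111 0000
print(" ".join(["   ^"] * 4))                            # marks each nibble's LSB
```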
- Intra-nibble inputs A and B are always zero, so we always read the least significant bit of a nibble (marked with ^ in the above figure), and the LSBs of the nibbles are 0110 respectively (again, just an XOR truth table).
- Inter-nibble inputs C and D choose which LSB to read: 0 when C = D and 1 otherwise (i.e., C XOR D).
LUT4_1
LUT4_1 is a bit more complicated than LUT4_0 since it handles carry logic, but if we view it as a slight modification of LUT4_0 it is actually very simple.
- Inter-nibble inputs C and D are wired to a[1] and b[1], and these will act much like C and D did for a[0] and b[0] in the previous LUT.
- Intra-nibble inputs A and B are wired to a[0] and b[0], and we need these to reason about the carry value from the addition of a[0] and b[0].
We only need to carry when both a[0] and b[0] are 1, and this corresponds to the most significant bit of each nibble. So when either a[0] or b[0] is 0 we perform the same calculation as in the previous LUT; when they are both 1 we perform that calculation and then flip the result.
This can be accomplished by simply negating the MSB of each nibble in LUT4_0's INIT value:
In fact Yosys could (should?) have used this INIT value for LUT4_0 in the first place!
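The bit flip can be checked mechanically. The sketch below assumes the pin assignment described above (for LUT4_1: A = a[0], B = b[0], C = a[1], D = b[1]; for LUT4_0: A = B = 0, with a[0] and b[0] on C and D). It flips the MSB of each nibble of 0x0FF0 and compares the two LUT outputs against ordinary 2-bit addition. The flipped value is computed here, not copied from the Yosys output.

```python
def lut4(init, a, b, c, d):
    """Return bit number DCBA of the 16-bit INIT value."""
    index = (d << 3) | (c << 2) | (b << 1) | a
    return (init >> index) & 1

LUT4_0_INIT = 0x0FF0
LUT4_1_INIT = LUT4_0_INIT ^ 0x8888  # flip the MSB of every nibble
print(f"0x{LUT4_1_INIT:04X}")       # prints 0x8778

for a in range(4):
    for b in range(4):
        a0, a1 = a & 1, (a >> 1) & 1
        b0, b1 = b & 1, (b >> 1) & 1
        out0 = lut4(LUT4_0_INIT, 0, 0, a0, b0)
        out1 = lut4(LUT4_1_INIT, a0, b0, a1, b1)
        # The pair (out1, out0) should be the low two bits of a + b.
        assert ((out1 << 1) | out0) == (a + b) & 0b11
        # With A = B = 0 the flipped value also works for LUT4_0.
        assert lut4(LUT4_1_INIT, 0, 0, a0, b0) == out0
```

The last assertion is the point of the remark above: with A and B tied to 0, the MSB of each nibble is never read, so either INIT value gives the same out[0].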
4 Bit Adder
This one is more complicated, and I'll speculate on why in a moment. For now,
let's take what would seem to be a very easy solution: we can generalize what we
did above, right?
The problem with this (I think...) is that a[1] and b[1] are used across slice boundaries. Is this disallowed?
Anyway, let's look at the Behavioral and Structural Verilog
Behavioral Verilog
Structural Verilog
This code has the following layout:
This is more complicated than the 2-bit adder, but we can break it down:
- Computing out[0] and out[1] is identical to the 2-bit adder.
- To compute out[2] we need carry information from out[1]. This is handled by LUT_CROSS_SLICE_CARRY. This table outputs 0 whenever the add at bit 1 results in a carry and 1 otherwise. In other words, this table stores the predicate "out[1] does NOT carry". The output from the carry LUT is used in LUT_2, along with inputs a[2] and b[2]. The input to LUT_2.A is hardcoded to 0.
- To compute out[3] we need carry info from out[2]. This could be handled in the same way as the carry info from out[1] was, but Yosys opts to use another design. Yosys generates two LUTs, LUT_MUX_B and LUT_MUX_A.
  - LUT_MUX_B computes the value of out[3] when b[3] = 1
  - LUT_MUX_A computes the value of out[3] when b[3] = 0
  Both LUT4s take the same inputs: LUT_CROSS_SLICE_CARRY, a[2], b[2], and a[3]. LUT_CROSS_SLICE_CARRY, a[2] and b[2] compute carry info from the add at bit 2, and a[3] is one of the bits to be added (the other being b[3]). Note that the INIT values of the two tables are bitwise negations of each other. Finally, these values are MUXed using b[3] as the selector bit, and the result is assigned to out[3].
Also interesting: note that Yosys is separating uses of input bits 0 and 1 from input bits 2 and 3 (e.g., a[1] and a[2] are not used in the same LUT). I don't know if this is necessary or not.