I've recently been thinking about the use of variables in VHDL. As part of my musings, I've put together (and tested in ISIM, which won’t surprise those who know me) an example of using variables in a VHDL process. I start with two 8-bit bytes called "a" and "b" and two 4-bit nybbles (or nibbles if you prefer) called "c" and "d."
Depending on the value of two input flags ("f1" and "f2"), the outputs "c" and "d" are either the high or low nibbles of the results of the operations ("a"-"b") or ("b"-"a"). All of this is purely combinatorial logic -- there are no registers involved at all:
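The original listing is not reproduced here, so what follows is only a minimal sketch consistent with the description above. The entity name "diff_demo", the port types, the sensitivity list, and which flag selects which nibble are assumptions; only "a", "b", "c", "d", "f1", "f2", and the variable "diff" come from the text.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity diff_demo is  -- hypothetical name
  port (
    a, b   : in  unsigned(7 downto 0);
    f1, f2 : in  std_logic;
    c, d   : out unsigned(3 downto 0)
  );
end entity diff_demo;

architecture rtl of diff_demo is
begin
  -- Purely combinatorial: every input that is read is in the sensitivity list
  process (a, b, f1, f2)
    variable diff : unsigned(7 downto 0);
  begin
    -- Top half: diff holds a - b and drives c
    diff := a - b;
    if f1 = '1' then
      c <= diff(7 downto 4);  -- high nibble (assumed polarity of f1)
    else
      c <= diff(3 downto 0);  -- low nibble
    end if;

    -- Bottom half: diff is overwritten, so a - b is no longer reachable
    diff := b - a;
    if f2 = '1' then
      d <= diff(7 downto 4);
    else
      d <= diff(3 downto 0);
    end if;
  end process;
end architecture rtl;
```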
What is going on? How is this implemented? How can "diff" have two values in the same process?
In fact, this is pretty straightforward. The value originally assigned to "diff" becomes inaccessible to the bottom half of the process when it is assigned a value for the second time. This is recognized and implemented as if they are two distinct variables. A smart designer would most probably actually use two variables (or better yet, two processes), which will completely remove any chance of confusion.
Having said this, where things do start to get confusing is when a value's lifetime is not so clear cut. For example, what if the bottom half of the process did not always assign a value to "diff"?
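As a hedged illustration of that question (again a sketch, not the original listing), the bottom half below only overwrites "diff" when "f2" is set. The entity name "diff_overlap" and the choice of "f2" as the condition are assumptions made for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity diff_overlap is  -- hypothetical name
  port (
    a, b : in  unsigned(7 downto 0);
    f2   : in  std_logic;
    c, d : out unsigned(3 downto 0)
  );
end entity diff_overlap;

architecture rtl of diff_overlap is
begin
  process (a, b, f2)
    variable diff : unsigned(7 downto 0);
  begin
    -- Top half: diff always receives a value here
    diff := a - b;
    c <= diff(3 downto 0);

    -- Bottom half: diff is only sometimes overwritten
    if f2 = '1' then
      diff := b - a;
    end if;

    -- d sees b - a when f2 = '1', otherwise the a - b left over from above
    d <= diff(3 downto 0);
  end process;
end architecture rtl;
```

No storage is needed in this sketch because "diff" is unconditionally assigned at the top, but the value feeding "d" now depends on both halves of the process.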
As the logic between the two halves is no longer distinct, it rapidly becomes a mess that is much more tricky to understand -- especially when other signals may have already been assigned, in which case they will have a "next" value as well as a "current" value. This is not really a good thing at all and is best avoided in my book.
Another case to be avoided is when a variable is only assigned a value sometimes as illustrated below:
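The original listing is missing at this point; the following is only a minimal sketch of the situation described, in which "diff" is assigned on some activations of the process but not others. The entity name "diff_latch" and the use of "f1" as the condition are assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity diff_latch is  -- hypothetical name
  port (
    a, b : in  unsigned(7 downto 0);
    f1   : in  std_logic;
    c    : out unsigned(3 downto 0)
  );
end entity diff_latch;

architecture rtl of diff_latch is
begin
  process (a, b, f1)
    -- A process variable is initialized once and keeps its value
    -- from one activation of the process to the next
    variable diff : unsigned(7 downto 0);
  begin
    if f1 = '1' then
      diff := a - b;
    end if;
    -- When f1 = '0', this reads whatever diff held on the previous activation
    c <= diff(3 downto 0);
  end process;
end architecture rtl;
```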
With the Xilinx ISE (integrated software environment), the code implements with just a warning about finding a latch on "diff$mux000." This surprised me; I assumed that it would just assign "U" or "0" to "diff" each time the process is triggered -- but it looks as though values stored in "diff" can and will be retained. It also looks as if I have some more VHDL textbook reading to do!
But what should come as no surprise to anybody is that these problems are not unique to VHDL -- similar problems occur all the time in the software world, with things like unassigned variables and coders using the same "i" variable to implement multiple loop counters. But coders develop tools and practices to avoid most of these errors.
So, are VHDL variables really so bad that they should never be used? Or is it just that they don't fit with the mindset of "assignments will be made when the process is finished" that is required for hardware design? They seem like a powerful feature to me!
It seems to me that the confusion with using variables for synthesis stems from how most of us are taught about how synthesis works.
Generally, we are taught that processes that can run at any time produce outputs at any time, and so should be combinatorial (with the occasional latch). Similarly, structural approaches to avoiding latches emphasize "an else for every if" as the preferred solution. Makes sense...
We are also taught that processes that are constrained to execute on a clock edge produce registered values, presumably because, just like a register, the only time they can update their value is on the clock edge. This makes sense too...
Both of these approaches to creating and avoiding storage elements (registers and latches) miss the boat entirely.
Storage is inferred from the requirement to remember a previous value. If your code requires remembering a previous value, then the synthesis tool will infer a storage element to remember the previous value, whether you use a signal or a variable.
Failure to teach this basic principle leads to the inaccurate conclusion that it is the signal assignment that makes the assigned signal a register. Not true. It is the fact that any reference to a signal assigned in a clocked process is a reference to the value from a previous clock cycle, and thus a register is inferred. When interpreted this way, it is much easier to see how one variable can represent combinatorial logic, registered logic, or both. It is the references to the variable that determine whether the referenced value will be a registered or combinatorial value.
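To make that concrete, here is a small sketch of a clocked process in which the same variable is read both before and after it is written. The entity "var_ref_demo" and the names "acc", "din", "q_before", and "q_after" are hypothetical, chosen only for this illustration; the initial value on "acc" is there so that simulation starts from a known state.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity var_ref_demo is  -- hypothetical name
  port (
    clk      : in  std_logic;
    din      : in  unsigned(7 downto 0);
    q_before : out unsigned(7 downto 0);
    q_after  : out unsigned(7 downto 0)
  );
end entity var_ref_demo;

architecture rtl of var_ref_demo is
begin
  process (clk)
    variable acc : unsigned(7 downto 0) := (others => '0');
  begin
    if rising_edge(clk) then
      -- Read before write: this is the value acc was left with on the
      -- previous clock edge, so acc itself must be stored
      q_before <= acc;

      acc := acc + din;

      -- Read after write: this is the freshly computed sum, i.e. the
      -- adder output; the signal assignment is what registers it
      q_after <= acc;
    end if;
  end process;
end architecture rtl;
```

In simulation, "q_before" trails "q_after" by one clock, even though both are driven from the same variable.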
If we were taught that in the first place, there would be far less confusion about variables, and all of us would be more comfortable with using them (as would our employers and co-workers who review or maintain our code, or who write, approve, and enforce coding/design standards).
Similarly, we would understand that a more effective means of avoiding latches (besides avoiding combinatorial processes!) is to employ default assignments for every signal at the top of the process, so that remembering a previous value is not possible. Not surprisingly, this is also the best way to avoid an unintended register from a variable in a clocked process.
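A minimal sketch of the default-assignment idea, applied to a combinatorial process, might look like the following. The entity name "diff_default", the port names, and the choice of all zeros as the default are assumptions for illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity diff_default is  -- hypothetical name
  port (
    a, b : in  unsigned(7 downto 0);
    f1   : in  std_logic;
    c    : out unsigned(3 downto 0)
  );
end entity diff_default;

architecture rtl of diff_default is
begin
  process (a, b, f1)
    variable diff : unsigned(7 downto 0);
  begin
    -- Defaults first: nothing below can read a value left over
    -- from an earlier activation of the process
    diff := (others => '0');
    c    <= (others => '0');

    if f1 = '1' then
      diff := a - b;
      c    <= diff(3 downto 0);
    end if;
  end process;
end architecture rtl;
```

Because every path through the process assigns both "diff" and "c", there is no previous value to remember and so nothing from which to infer a latch.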
In most engineering disciplines, we are taught the theory first, then the simplifications that make applying the theory more practical. Why are we not taught logic synthesis the same way? There are times when a "cookbook" approach to teaching leads to roadblocks that impede a more complete understanding of the subject matter.
I believe synthesis should first be taught using variables to illustrate the truth about storage inference. Then the "shortcuts" of signal assignments in clocked processes can be presented. That way, students have the knowledge necessary to apply both to maximum benefit.
Brian Davis 12/13/2012 7:58:50 PM User Rank Clever Clogs
Re: A software point of view
@JezmoSSL: "If you want to count the number of bits set in a vector, why not cut to the chase and use a look-up table ?"
For small tables that's a perfectly reasonable method.
And once you've written the bit count function, wrapping it in another function that loops through the look-up table address range would allow initialization, at declaration, of a bit counting table of any size and width, _without_ typing out case statements or table entries.
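A rough sketch of that idea, under stated assumptions, is shown below. The names "bitcount_lut", "count_ones", "init_table", "COUNT_TABLE", and the generic "WIDTH" are all hypothetical and are not taken from any code posted in this thread.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity bitcount_lut is  -- hypothetical name
  generic (WIDTH : positive := 6);
  port (
    vec   : in  std_logic_vector(WIDTH - 1 downto 0);
    count : out natural range 0 to WIDTH
  );
end entity bitcount_lut;

architecture rtl of bitcount_lut is
  type table_t is array (0 to 2**WIDTH - 1) of natural range 0 to WIDTH;

  -- Bit count function: loops over the vector with a variable
  function count_ones (v : std_logic_vector) return natural is
    variable n : natural := 0;
  begin
    for i in v'range loop
      if v(i) = '1' then
        n := n + 1;
      end if;
    end loop;
    return n;
  end function count_ones;

  -- Wrapper: walks the whole address range and fills in the table
  function init_table return table_t is
    variable t : table_t;
  begin
    for addr in t'range loop
      t(addr) := count_ones(std_logic_vector(to_unsigned(addr, WIDTH)));
    end loop;
    return t;
  end function init_table;

  -- Table initialized at declaration, with no hand-typed entries
  constant COUNT_TABLE : table_t := init_table;
begin
  count <= COUNT_TABLE(to_integer(unsigned(vec)));
end architecture rtl;
```

Changing "WIDTH" resizes both the table and its contents without any further edits.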
----
The list of links I posted up-thread included an optimized one in which the smallest leaf adders were converted to LUT's:
jandecaluwe 12/13/2012 5:25:46 PM User Rank Blogger
Re: A software point of view
@rfrisbee "I was also referring to JesmoSSL's blog post "Toiling with Testbenches" but did not make that clear in my comment"
Good - happy to hear that we are in agreement about the need to correct that particular post. (I would like to hear your own analysis (beyond sensitivity list issues) because it seems you also know what you are talking about :-)).
rfrisbee 12/13/2012 1:48:04 PM User Rank Clever Clogs
Re: A software point of view
I should also apologise - I was also referring to JesmoSSL's blog post "Toiling with Testbenches" but did not make that clear in my comment.
Something else I've noticed quite a bit is clocked processes with signals in addition to the clock in their sensitivity list. While I doubt this does any harm, I'd imagine it might slow down simulation with processes having to be reevaluated more frequently than strictly necessary. That's another feature of using the "wait until rising_edge(clk)" statement: such processes can't legally have sensitivity lists and the tools will throw an error if one is present.
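For anyone who hasn't seen that style, a minimal sketch of a clocked process written with a wait statement instead of a sensitivity list follows. The entity name "wait_reg" and the port names are placeholders chosen for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity wait_reg is  -- hypothetical name
  port (
    clk : in  std_logic;
    d   : in  std_logic;
    q   : out std_logic
  );
end entity wait_reg;

architecture rtl of wait_reg is
begin
  -- No sensitivity list: the process suspends at the wait statement
  -- and resumes only on a rising edge of clk
  process
  begin
    wait until rising_edge(clk);
    q <= d;
  end process;
end architecture rtl;
```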
jandecaluwe 12/13/2012 1:11:17 PM User Rank Blogger
Re: A software point of view
@JezmoSSL "you can write some combinatorial logic to count the set bits in a vector of say 8 bits and run it in parallel for vectors larger than 8 bits adding the results, so if you have a vector of 8 bits it's quite easy to write a combinatorial bit of logic to output the number of bits set."
All kinds of approaches are possible, but I assume you are not going to write a lookup case with 256 entries for an 8-bit vector manually. Surely, you would use a program or script to generate this automatically and without bugs - and that program surely would use variables.
The HDL version would be very similar - and I suspect there wouldn't be a big difference in synthesis results for this relatively small input size.
Therefore, the choice is between an "external" and an "internal" function. Fine with me, but I fail to see why something which is just fine externally would become confusing internally. On the contrary, from the flattened lookup table it would certainly be harder to see what's going on than from the function.
track racing is scary, 40 miles an hour. only way to brake is to put back pressure on the pedals and you have 18 mm wide tyres on wood, and you are 2-3 inches behind the guy in front, and youve got to stay exactly on the right line. and it hurts like hell
there are two approaches to the problem: you can write a function to go through the size of the vector and count the '1's, or you can write some combinatorial logic to count the set bits in a vector of say 8 bits and run it in parallel for vectors larger than 8 bits adding the results, so if you have a vector of 8 bits it's quite easy to write a combinatorial bit of logic to output the number of bits set. you know, something like case input is when "0000" => count <= 0; when "0001" => count <= 1; when "0011" => count <= 2;
and so on
if you have an ickle vector, say 6 bits, then it all fits into a LUT