It's been a busy FPGA week. First, of course, I've been continuing the SPI work that I started in my previous blog. Next, I received a new ZedBoard FPGA Development Board in the mail (I will be writing about this little beauty in future columns). And, last but not least, I've been using my primitive FPGA-based mini-logic analyzer to debug an I2C issue in one of my robot boards, and an RS-232 problem on another. For some reason, I2C slaves always seem to be a bit more difficult to set up than their master counterparts, but -- with the help of my Spartan 6 LX9 logic analyzer -- I was finally able to get it going.
I still have a bit of a noise problem in the logic analyzer setup that I'll need to run down, but that didn't get in the way of the problem solving. I also need to build a little level shifter so that I can work with both 3.3 volt and 5 volt signals. But those are all projects for another day.
As you may recall, in my previous blog, I essentially just got the framework for my SPI interface put together. In this week's installment, I want to get the attention of my slave device (my NXP/ARM mbed) by pulling the chip select low, and maybe, just maybe, start the process of clocking data over.
My Spartan FPGA board and NXP/ARM mbed connected together.
Originally, I had been thinking that it would be easier to create a slave in the FPGA, but I subsequently decided that creating the master won't be much more difficult and will be a lot more useful in the long run.
In this version, I only have two inputs: a push-button switch and the SPI MISO line. The MISO signal doesn't need de-bounce, so it just gets a two-stage synchronizer module. By comparison, the switch input gets synchronized and de-bounced. I have a couple of suggested de-bounce improvements from comments in one of my earlier blogs that I want to try out at a later date. For the moment, however, the scheme I'm using is mostly like the one described in my De-Bouncing Around blog.
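To make the difference between the two input paths concrete, here is a minimal behavioral sketch in Python rather than Verilog. It is not the module from this blog: the class names, the counter-based de-bounce scheme, and the threshold of 4 clocks are all assumptions chosen for illustration. The synchronizer simply delays its input through two registers, while the de-bouncer only accepts a new level after it has held steady for a set number of clocks.

```python
class TwoStageSynchronizer:
    """Behavioral model of two flip-flops in series on the same clock."""

    def __init__(self):
        self.stage1 = 0
        self.stage2 = 0

    def clock(self, async_in):
        # Both flops update on the same edge, so stage2 takes the OLD stage1.
        self.stage2, self.stage1 = self.stage1, async_in
        return self.stage2


class Debouncer:
    """Accept a new level only after it has been stable for `threshold` clocks.

    Hypothetical counter-based scheme, not necessarily the one in the blog.
    """

    def __init__(self, threshold):
        self.threshold = threshold
        self.count = 0
        self.state = 0

    def clock(self, sync_in):
        if sync_in == self.state:
            # Input agrees with the accepted state: restart the stability count.
            self.count = 0
        else:
            self.count += 1
            if self.count >= self.threshold:
                self.state = sync_in
                self.count = 0
        return self.state


def run_button(samples, threshold=4):
    """Push raw button samples through the synchronizer, then the de-bouncer."""
    sync = TwoStageSynchronizer()
    deb = Debouncer(threshold)
    return [deb.clock(sync.clock(s)) for s in samples]


if __name__ == "__main__":
    bouncy_press = [0, 1, 0, 1, 1, 1, 1, 1, 1, 1]
    print(run_button(bouncy_press))
```

In this model the single bounce at the start of the press never reaches the output; the de-bounced signal only goes high once the synchronized input has been high for four consecutive clocks.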
The code for the synchronization and de-bounce module is shown below:
As you can see, I've coded this as an independent module to be instantiated in my main code. In lines 33 to 37, we have the two-stage synchronizer code; other than that, this is the same as we've seen before. You can see all of the setup code in the following image:
Lines 21 to 42 have the module I/O and the wires and registers I'll need. Lines 44 to 50 create and buffer the two clocks using a DCM: 66 MHz for the main logic and 400 kHz for the SPI clock. Lines 55 to 61 divide the 400 kHz clock down to the 100 kHz that the SPI actually needs. Finally, lines 63 to 73 take care of the push button input and the SPI MISO input.
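The divide-by-four step can be modeled in a few lines of Python. This is only a sketch of the general counter-and-toggle technique, assuming the output toggles every two rising edges of the 400 kHz clock; the function name and parameters are hypothetical, and the actual Verilog may be structured differently.

```python
def divide_clock(input_edges, divide_by=4):
    """Return the divided-clock level after each rising edge of the input clock.

    Assumes an even divide ratio so the output has a 50% duty cycle.
    """
    if divide_by <= 0 or divide_by % 2 != 0:
        raise ValueError("divide_by must be a positive even number")
    half_period = divide_by // 2
    count = 0
    level = 0
    levels = []
    for _ in range(input_edges):
        count += 1
        if count == half_period:
            # Toggle the output every half output period.
            level ^= 1
            count = 0
        levels.append(level)
    return levels


if __name__ == "__main__":
    # Eight edges of the 400 kHz clock give two full cycles of the 100 kHz clock.
    print(divide_clock(8))
```

Each output cycle spans four input edges, which is the 400 kHz to 100 kHz ratio described above.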
Last I recall, SPI clocking is something you have to set up manually on each end. That is, it's not something that's signalled between master and slave. As the integrator, you need to know what's going on at both ends...
@Duane: The other day you mentioned getting your head around the blocking/non-blocking concepts.
Altera's free on-line course "Verilog HDL Fundamentals" has a very good example.
Here's some of what I came away with:
1) The clocked always block determines that one or more Dff's will be implemented and the "operators" '=' or '<=' determine the din value that is assigned and when it is assigned.
2) The blocking '=' assignment is for evaluating an expression for the din port:
Their example is x = next_x; y = x; where y is the flip-flop and x is din, which takes the value of next_x. The main point is that x = next_x | w & z; y = x; could be used, so that next_x, w, and z are combined to produce the assigned value.
3) The other interesting thing is that x <= next_x; y <= x; results in two flops. Because of scheduling, the non-blocking assignment to y uses the value x held before this clock edge, so y gets its new value one cycle after x does. This matches the real world, where both flops are clocked at the same time and y captures the value that x was assigned in the previous cycle.
Remember when you used TTL gates to combine signals to generate the din signal? Of course you probably drew a schematic and a wiring diagram.
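The difference between points 2 and 3 can be mimicked in Python by modeling one clock edge as a function from the old register state to the new one. This is only an analogy to the Verilog scheduling rules, not a simulator; the function names and the dictionary used to hold x and y are assumptions made for illustration.

```python
def nonblocking_step(state, next_x):
    """Model of: x <= next_x; y <= x;

    Every right-hand side is read from the OLD state before anything updates,
    so y receives the value x had before this clock edge (two flops in series).
    """
    return {"x": next_x, "y": state["x"]}


def blocking_step(state, next_x):
    """Model of: x = next_x; y = x;

    x is updated immediately, so y sees the NEW x in the same clock edge.
    """
    x = next_x
    y = x
    return {"x": x, "y": y}


def run(step, inputs):
    """Apply one clock edge per input value and record (x, y) after each edge."""
    state = {"x": 0, "y": 0}
    trace = []
    for value in inputs:
        state = step(state, value)
        trace.append((state["x"], state["y"]))
    return trace


if __name__ == "__main__":
    stimulus = [1, 0, 1]
    print("non-blocking:", run(nonblocking_step, stimulus))
    print("blocking:    ", run(blocking_step, stimulus))
```

In the non-blocking trace y always lags x by one clock, while in the blocking trace y follows x within the same clock, which is the behavior of a single flop fed by combinational logic.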
Duane Benson 12/5/2012 10:57:44 AM User Rank Blogger
Re: MISO will be stable when it is sampled
Rfindley - Thanks for the explanation. That makes a lot of sense. In my case, I'm now thinking that I won't need to synchronize the MISO. I'm running the system clock at 100 MHz and clocking the SPI at 100 kHz, so propagation delay shouldn't be a problem.
I'm also going to clock the SPI byte into a register within the SPI block and set a "data ready" or "data not ready" flag to indicate if it's safe for another part of the chip to read the data. That flag will be set based on a divided system clock so it will already be synchronized.
Max Maxfield 12/5/2012 10:30:43 AM User Rank Blogger
Re: MISO will be stable when it is sampled
@rfindley: Good Point!!! I think it's easy to get carried away and start synchronizing everything -- including things that don't need it -- you are right to bring us back to Earth a bit :-)
It is a worthwhile exercise to consider what the synchronous timing is likely to be for Duane's SPI design. The master is implemented in the FPGA and at some point an internal flip-flop generates the falling edge of the SPI Clock. So now consider some typical delays...
7ns - Getting that falling edge to drive the physical pin of the FPGA
2ns - Flight time of clock signal through board trace to SPI device
7ns - Clock to output time of SPI device
2ns - Flight time of MISO signal through board trace to the FPGA pin
5ns - Pin to the flip-flop that captures MISO (including set-up time)
That is a total path delay of 23ns. But remember that MISO is to be captured on the rising edge of the SPI clock, so the time from the falling edge to the rising edge must be greater than 23ns. Allowing for a 40/60 duty cycle, the shorter half of the cycle is 40% of the period, which implies a minimum clock period of ~58ns (23ns / 0.4) and hence a maximum SPI clock frequency of ~17MHz. Duane is aiming for a much lower frequency so there should be no doubt that MISO will indeed be stable when it is being captured.
It is interesting to note that the 7ns clock to output time that I used above was for an SPI Flash memory device (I have seen significantly slower SPI devices!) and its data sheet specifies that the maximum clock rate for standard data read operations is 54MHz. It is therefore always vital to consider all the delays forming a synchronous path when implementing a 'system'.
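The arithmetic in this comment is easy to rerun with other numbers. The Python sketch below uses the delay figures listed above; the function name, the dictionary keys, and the `min_half_fraction` parameter (0.4 for the 40/60 duty cycle) are hypothetical names introduced here for illustration.

```python
def max_spi_clock(delays_ns, min_half_fraction=0.4):
    """Return (path delay ns, minimum period ns, maximum clock Hz).

    The falling-to-rising half of the SPI clock must be at least as long as
    the round-trip path delay. With a worst-case duty cycle, that half is
    `min_half_fraction` of the full period.
    """
    path_ns = sum(delays_ns.values())
    min_period_ns = path_ns / min_half_fraction
    max_clock_hz = 1e9 / min_period_ns
    return path_ns, min_period_ns, max_clock_hz


if __name__ == "__main__":
    # Typical delays quoted in the comment above, in nanoseconds.
    delays = {
        "fpga_clock_to_pin": 7,
        "clock_trace_flight": 2,
        "slave_clock_to_output": 7,
        "miso_trace_flight": 2,
        "fpga_pin_to_flop_with_setup": 5,
    }
    path, period, f_max = max_spi_clock(delays)
    print(f"path delay {path} ns, min period {period:.1f} ns, "
          f"max clock {f_max / 1e6:.1f} MHz")
```

For comparison, at a 100 kHz SPI clock the period is 10,000ns, so even the shorter half of a 40/60 cycle is 4,000ns, which leaves a very large margin over the 23ns path.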
Karl: In the case of chip to chip, don't you have to consider propagation delay that could cause the signal to arrive at just the clock transition?
This is where your OFFSET IN constraint is useful.
You should know what the source device's clock-to-out delay is. You use that, along with the clock period, to set the OFFSET IN constraint on the receiving end.
Furthermore, you should pay attention to what the particular FPGA family has for delay mechanisms and such. For example, in Spartan 3A, the input paths offer two delays, one for the registered path (to the IFF or the IDDR2) and the second as a combinatorial path into the fabric. The first is controlled by the attribute IFD_DELAY_VALUE in the IBUF and the second is controlled by the attribute IBUF_DELAY_VALUE.
If you carefully read the S3A user guide, there's a detail in it that says something like, "to ensure that the design meets hold-time requirements, the tools automatically insert delay on the data path." Run the place and route tools, and open the design in the FPGA Editor, and you'll see that IFD_DELAY_VALUE is not zero -- it's often something like 4 or 5.
Why? Because the clock path length in the FPGA is longer than the data path. The clock signal goes from the IBUFG through to the center of the device where the BUFGs live and then the clock signal is distributed back out to the IOB. The data path goes from the pad to the IFF directly, a much shorter path. So to equalize that delay, the data path gets delayed. The tools "know" this delay (and good effin' luck finding it in the data sheet).
So if you assume that layout ensures clock and data paths are equal length, and that clock to out is positive, then that clock-to-out plus the added input delay ensure that the receive FPGA captures the data correctly.
Got it?
Note that the OFFSET IN constraint won't magically set any input delays. It just reports the timing. If you need to add delay (either on the clock path or the data path) then you must arrange to do that explicitly.
Well, it turned out that not only were they using an 'odd' scheme to generate the serial clock, but they also had lots of level conversion logic in the path too, enough for approx 45ns of delay - almost a full cycle of the 20MHz SPI clock!
Not fun, given that the amount of propagation delay would change depending on operating conditions... some form of dynamic calibration might be needed.
Yes, at higher frequencies, propagation delay becomes a factor. But a synchronizer doesn't fix that problem. Here's why:
A synchronizer is only for when you don't know when a signal will transition (i.e., it is asynchronous).
If your circuit is synchronous, you DO know when it will arrive (within a window relative to the clock), and it is the hardware engineer's responsibility to make sure the data and clock arrive at the right time relative to one another. They can do this with trace length matching, for example, to control the propagation delay.
But if the hardware is not designed properly, and you are sampling right on the transition, then you don't know whether you would be clocking in old or new data. A synchronizer would only provide a guaranteed 1 or 0, but it can't fix the uncertainty of old vs new data.
Re: the 2 clock delay. Yes, that can be a factor in a design that mixes signals that are and aren't synchronized. But I've not seen that happen often. The only time I can recall seeing that was on an asynchronous FIFO, where the FULL and EMPTY flags could transition at any time. I didn't have to worry about the 2-clock delay, because I was content to simply not know, until 2 clocks later, that data was available for processing. If I needed tighter response time, then it might have required a different solution.
@rfindley: In the case of chip to chip, don't you have to consider propagation delay that could cause the signal to arrive at just the clock transition? Also, doesn't the 2 clock delay of the synchronizer throw the whole sequence off?
I think the simplest way to evaluate whether you need a synchronizer is this:
If there is a chance the signal could transition at the same time you are sampling, then you need a synchronizer. If not, you don't.
So, in your SPI example: If the slave updates MISO on the *falling* edge, and your master samples on the *rising* edge, then there is no chance MISO will be transitioning while you are trying to sample it. So, no synchronizer is necessary.
[Of course, you still need to read the datasheet carefully, because the MISO transition won't actually happen exactly on the falling-edge. There is always some delay between clock-edge and actual transition. As long as that delay doesn't push the transition too close to your rising-edge sample point, then you are okay.]
I rarely need synchronizers in my own designs. Nearly everything chip-to-chip is synchronous.... or more explicitly stated, I know (by design) that I am never sampling a signal during a time that it might transition, so I don't need synchronizers. Some typical exceptions are buttons/switches and interrupt lines.