Comments
tomii
12/19/2012 2:14:40 PM
User Rank
Blogger
Catching up, and my comments
So, as I've previously stated, I'm working with the Opal Kelly XEM boards, and I've come some way with that thing while trying to follow along here.  I actually got the "FrontPanel" working (in a way) - videos available here: https://www.youtube.com/user/TGBII?feature=mhee

 


Now that that's out of the way, I've gotten back to this tutorial, and I would say there's something I find particularly disturbing about your code, and that is putting the TEMP_BITS variable as a part of the IO to this module.

Maybe this is just me, and I understand how you need to do direct register/bit manipulation in some real-time stuff, but in general, I would never make something accessible to the outside world that doesn't need to be.  Why make a hook that someone can take advantage of (perhaps poorly) if you don't want it there?  For this, no big deal, but for something "real world" - maybe so.  Especially if you want to get a lot of code reuse.

So, here are my modifications to your code:

1) No changes to the module declaration block.  Stays as it was previously.

2) The main "working" code block was changed as follows:

  reg [28:0] led_count;
  reg  [3:0] TEMP_BITS;

  always @(posedge clk)
    begin
      led_count <= led_count + 1;
      if (!PROTO_SW)
        TEMP_BITS[3:0] = led_count[28:25];
    end

  assign COUNTING_LEDS[3:0] = ~led_count[28:25];  // Connects binary counting leds
  assign DPOINT_OUT = !MOMENTARY_SW;              // Push button for decimal point

  always @(led_count[28:25])                      // Look up table: Hex to 7-segment
    case (TEMP_BITS[3:0])
      4'h0: SEG_OUT = 7'b0111111;
      4'h1: SEG_OUT = 7'b0000110;
      4'h2: SEG_OUT = 7'b1011011;
      4'h3: SEG_OUT = 7'b1001111;

 

...

 

In short, I instantiated a register internal to the module, and then set about changing the value of that register internally.  That register is connected via the multiplexer to the SEG_OUT bus.

 

I will caveat this all, though:  I *did* get a warning when "compiling" the design:

WARNING:Xst:905 - "../7-segment count.v" line 87: One or more signals are missing in the sensitivity list of always block. To enable synthesis of FPGA/CPLD hardware, XST will assume that all necessary signals are present in the sensitivity list. Please note that the result of the synthesis may differ from the initial design specification. The missing signals are:
   <TEMP_BITS>

I haven't quite run that down, yet, but maybe someone here can shed some light?

Also, any comments on what I've said/done?
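
On the Xst:905 warning quoted above: the lookup-table always block reads TEMP_BITS in its case expression but lists only led_count[28:25] in its sensitivity list, which appears to be what the message is flagging. A minimal sketch of one possible fix follows, reusing the signal names from the snippet above. The module name, the port directions, the register initial values and the default segment pattern are assumptions added only so the example stands alone; the table entries elided in the original are not filled in.

```verilog
// Sketch only: module name, port list, initial values and the default
// case are assumptions, not part of the original design.
module seg_count_sketch (
    input  wire       clk,
    input  wire       PROTO_SW,       // treated as active low, as in the snippet above
    input  wire       MOMENTARY_SW,
    output wire [3:0] COUNTING_LEDS,
    output wire       DPOINT_OUT,
    output reg  [6:0] SEG_OUT
);
  reg [28:0] led_count = 29'd0;
  reg  [3:0] TEMP_BITS = 4'd0;        // internal register, not exposed as a port

  always @(posedge clk) begin
    led_count <= led_count + 1'b1;
    if (!PROTO_SW)
      TEMP_BITS <= led_count[28:25];  // non-blocking assignment in a clocked block
  end

  assign COUNTING_LEDS = ~led_count[28:25];
  assign DPOINT_OUT    = !MOMENTARY_SW;

  // @(*) covers every signal the block reads, which here is TEMP_BITS
  always @(*)
    case (TEMP_BITS)
      4'h0:    SEG_OUT = 7'b0111111;
      4'h1:    SEG_OUT = 7'b0000110;
      4'h2:    SEG_OUT = 7'b1011011;
      4'h3:    SEG_OUT = 7'b1001111;
      default: SEG_OUT = 7'b0000000;  // placeholder for the entries elided above
    endcase
endmodule
```

Listing TEMP_BITS explicitly in the sensitivity list would also work; @(*) simply avoids having to keep the list in step with the body by hand.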

David Ashton
8/26/2012 4:59:15 AM
User Rank
Guru
Re: That chip is old...
Nice!  Only 4 pins - you can't get much simpler than that!

Karl
8/24/2012 2:24:50 PM
User Rank
Guru
Re: Thanks!
@Duane:  I remember designing with latches, there were multiple clocks per machine cycle so the input of the latch would flush through while the clock was active/"high".  However the clock for the next stage would be inactive so the latches in the second level were held stable.  The key was to prevent unresolved signals from setting latches to the wrong values.

The clock skew had to be accounted for so the clocks did not overlap to preserve stability.  Since skew can be positive or negative the total increase in cycle time is doubled.  No way to know which one is going to be late or early.

Now we have the D flop that only requires a minimum clock width and for the input to not change during the setup and hold intervals.  The D makes a copy of the input then drives the output and the simulators work also by making a copy of the input and then changing the outputs accordingly.  The Verilog always block non-blocking <= assignment also works by scheduling changes to take place after all the next values have been determined.  Only one clock needs to be distributed in general.
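
The scheduling behaviour of the non-blocking assignment described above can be seen in a two-flop chain. This is a minimal sketch; the module and signal names are illustrative assumptions.

```verilog
// Sketch only: module and signal names are illustrative assumptions.
module shift2_sketch (
    input  wire clk,
    input  wire d,
    output reg  q1,
    output reg  q2
);
  always @(posedge clk) begin
    q1 <= d;    // both right-hand sides are sampled first...
    q2 <= q1;   // ...so q2 receives the value q1 held before this edge
  end
endmodule
```

If the same two statements used blocking = in this order, q2 would take the value of d on the same edge, collapsing the two stages into one.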

If you can imagine what would happen if latches feeding other latches all had clocks enabled -- there would be pure chaos.

Now, that is not to say that multi-clock cycles are not useful, especially when embedded memory blocks are used.  For now let's stick with single clocked D triggers.

Duane Benson
8/24/2012 1:19:34 PM
User Rank
Blogger
Thanks!
There's a whole lot of good feedback here that I'm still absorbing. This is exactly what I and other newbies need. I've been working my way into the simulation tools that came with the Spartan board and that's starting to clear up a few things as well.

In the discrete chip world, "latch" means both a component and a problem. Here it seems it really only means a problem. But there is more than one way to solve most problems.

Karl
8/24/2012 9:19:32 AM
User Rank
Guru
Re: Where you put the latches can make a big difference.
@thrakkor:  Thanks for clarification -- my turn now.  Synchronous design is fundamental to FPGAs, agreed.  And that means moving data from register to register, to me that is not piping.  Only if new data is available and the results are passed on with the same clock for a calculation is when I see the true benefit.  Algorithmic calculations usually fit.

The use of pipelined cores for control functions seems inefficient.  That is because a decision is based on some comparison (condition) and that means it takes NS cycles + 1 to compare then another NS cycles to get whatever to the execution stage to start whatever comes next. (NS = number of stages of pipe)

So it seems for similar reasons that critical controls need to be done efficiently and piping for those cases seems inefficient.

As you said external requirements are a big speed factor.  Every time an input is missed or the next output is not ready, penalty time.

I have enough hands-on experience with asynchronous (self clocked) design and the glitches and metastability to appreciate synchronous design.

Just standing on the corner beating the drum trying to draw a little attention to decision logic.  Thanks.

thrakkor
8/23/2012 12:47:14 PM
User Rank
Blogger
Re: Where you put the latches can make a big difference.
@Karl, thanks for replying.

in my experience, clock rate is driven by many factors.  upstream ADC sample rate, FPGA to FPGA interface clock rate, DSP requirements, memory interface requirements, throughput/system requirements.

i'm certainly not advocating jacking up the clock rate just because we can.  however, a lot of designs I've dealt with are well over 100 MHz and even over 200 MHz.  these rates (and 300 MHz+) absolutely require awareness of and active pipelining in order to achieve timing closure.  conversely there are plenty of designs in the 10's of MHz clock rates that don't require extra pipelining.  but still require a base synchronous design (plenty full of registers).

I wouldn't say I blindly pipeline everything, but I do design synchronously (99% of my logic is in a synchronous process) and end up with some base level of pipelining.  it's natural for me to stage things (say a mag squared operation or chunking up a very wide bus when conditionally checking it) and design in this fashion.  it also scales and allows clock rates to increase with little to no changes to the design (for some clock rate range).
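
A staged mag squared operation of the kind mentioned above might look like the following. This is only a sketch under assumptions: the module name, the signal names, the 8-bit default width and the two-stage split (square, then sum) are all illustrative choices.

```verilog
// Sketch only: names, widths and the two-stage split are assumptions.
module mag_sq_sketch #(
    parameter W = 8
) (
    input  wire                clk,
    input  wire signed [W-1:0] i_in,
    input  wire signed [W-1:0] q_in,
    output reg  [2*W:0]        mag_sq
);
  reg [2*W-1:0] i_sq, q_sq;

  always @(posedge clk) begin
    // stage 1: square each component
    i_sq   <= i_in * i_in;
    q_sq   <= q_in * q_in;
    // stage 2: sum the squares registered on the previous edge
    mag_sq <= i_sq + q_sq;
  end
endmodule
```

The result appears two clocks after the inputs are presented, and a new input pair can be accepted on every clock.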

not sure what response time you are talking about.  clock trees generally use dedicated routing resources and are not affected by extra pipelining (in my experience) throughout a design.  sure, more registers toggling on the clock definitely equals more power consumption.

logic levels are very specific to what you are trying to achieve.  i've easily generated logic with 10+ levels before optimization.  bigger LUTs are only as good as the synthesis tool, and effectiveness is balanced by how wide the buses the logic is using are.

I agree design is iterative.  I also agree that tradeoffs are key, most importantly when updating a design to achieve timing closure.  placement and routing are directly affected by the level of pipelining in a design.

I do however, believe that a base synchronous design is an absolute must.  One cannot be stingy with register resources and expect to get acceptable results.

Karl
8/23/2012 12:22:52 PM
User Rank
Guru
Re: Where you put the latches can make a big difference.
@thrakkor:  Part of what I am trying to understand is exactly what the real value of higher clock rate is in general.  Some things just do not pipeline well, as Mike said.

If the designer's approach is to pipeline everything then registers and clock rates are important.  The reason that latency is important is for response time.  Every register must be driven by the clock tree and there are more drivers active in the fabric that consume power.

Everything has a cost associated with it and good engineering involves a lot of trade-offs, so blindly registering everything in sight just may not be the best way.

Not every net drives across the chip but the ones that do may need to be piped, handle those as necessary.

If 1T or 2T (10s of nanoseconds) are not important, then I certainly do not see why faster clock rates matter so much.  If I pay the price to pipe and cannot see a benefit that is a concern.

The number of logic levels does not automatically become an issue because the LUTs can evaluate complex Boolean expressions in one level.

I do not think there are any silver bullets and design is an iterative process.  Timing closure is certainly not the first step but things that lead to problems should be avoided.   Let's first understand at least some of the tradeoffs.

By the way I have seen recent Xilinx papers recommending more serial flows.

thrakkor
8/23/2012 11:23:06 AM
User Rank
Blogger
Re: Where you put the latches can make a big difference.
in the FPGA world, pipelining and judicious use of registers are necessary and standard practice.  there really aren't any alternative choices.  if you don't register and pipeline, routing delays due to logic levels would kill all chances of timing closure.

obvious reason is to achieve higher clock rates.  

another is perhaps to get from one corner (or side) of chip to the other in a large device (regardless of utilization).

fanout is another that comes to mind.

as long as data is accompanied by a DV strobe, most downstream logic from pipelined logic should be able to be designed to cope with it.

granted one has to accept increased power dissipation, device utilization and latency, but those are normally a concern anyways.

I guess I don't see what the problem with adding 1 or 2 T of latency (10's of nanoseconds) to a chunk of logic is.  1T to compare and generate flag (or new value), and 1T to check new flag and assign new output or make decision.... no extra pipelining necessary unless you need to get to/from BRAM or I/O or other side of chip.
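
The two-cycle pattern described above (1T to compare and generate a flag, 1T to check the flag and assign an output), with a DV strobe carried alongside the data, could be sketched roughly as follows. The module name, signal names, width and the "pick the larger value" decision are assumptions chosen for illustration.

```verilog
// Sketch only: names, width and the decision being made are assumptions.
module cmp_pipe_sketch #(
    parameter W = 8
) (
    input  wire         clk,
    input  wire         in_dv,     // data-valid strobe accompanying a and b
    input  wire [W-1:0] a,
    input  wire [W-1:0] b,
    output reg          out_dv,
    output reg  [W-1:0] result
);
  reg         a_gt_b;              // stage 1 flag
  reg [W-1:0] a_d, b_d;            // operands delayed to stay aligned with the flag
  reg         dv_d;

  always @(posedge clk) begin
    // 1T: compare and generate the flag
    a_gt_b <= (a > b);
    a_d    <= a;
    b_d    <= b;
    dv_d   <= in_dv;

    // 1T: check the flag and assign the output
    result <= a_gt_b ? a_d : b_d;
    out_dv <= dv_d;
  end
endmodule
```

out_dv follows in_dv by two clocks, so downstream logic only needs to watch the strobe rather than count cycles.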

Karl
8/23/2012 10:30:01 AM
User Rank
Guru
Re: Where you put the latches can make a big difference.
@hamster:  Thanks, Mike.

"If you have a problem that doesn't decompose nicely into balanced stages then results can be really poor."  Yes, computational algorithms benefit by pipelining when new data is available every cycle and the result is not used by a subsequent stage.  The world of DSP generally benefits because matrix algorithms are so common.

You made me stop and think more about the power aspect.  In the ASIC world more than FPGA, I guess, power consumption increases with higher clock speed and the number of registers being driven.  Since the FPGA fabric is not passive, the size is more of a factor in power.

The wide LUTs are very useful.  A 4 input mux has 4 data in and 2 selects -- 1x6 input LUT per bit.  An ALU with 2 data in and encoded operator field comes in at about 1 LUT per bit with 1 level of latency.
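
As a rough illustration of the mux and ALU described above, here is a minimal sketch. The module name, signal names, width and opcode encoding are assumptions, and the ALU is restricted to bitwise operations so that each output bit depends only on a[i], b[i] and the operator field. Whether a tool actually packs each bit into one 6-input LUT depends on the device and the synthesis settings.

```verilog
// Sketch only: names, width and the opcode encoding are assumptions.
module mux_alu_sketch #(
    parameter W = 8
) (
    input  wire [W-1:0] d0, d1, d2, d3,
    input  wire [1:0]   sel,
    input  wire [W-1:0] a, b,
    input  wire [1:0]   op,
    output reg  [W-1:0] mux_out,
    output reg  [W-1:0] alu_out
);
  // 4:1 mux: each output bit depends on 4 data bits + 2 selects = 6 inputs
  always @(*)
    case (sel)
      2'd0:    mux_out = d0;
      2'd1:    mux_out = d1;
      2'd2:    mux_out = d2;
      default: mux_out = d3;
    endcase

  // Bitwise ALU: each output bit depends on a[i], b[i] and the 2-bit op code
  always @(*)
    case (op)
      2'd0:    alu_out = a & b;
      2'd1:    alu_out = a | b;
      2'd2:    alu_out = a ^ b;
      default: alu_out = ~a;
    endcase
endmodule
```

Arithmetic operators such as add would normally bring in carry logic as well, so the per-bit input count above applies only to this bitwise case.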

If each operator is a separate assignment and a mux selects the result then the count goes up quickly as well as the levels.  Now if it is pipelined and registered to beat hell, I can get the clock rate up but if I want to compare 2 values and make a decision then the latency becomes a factor.

My conclusion is that pipelining and registers are very useful, but not a panacea.

Duane Benson
8/23/2012 12:28:43 AM
User Rank
Blogger
Re: That chip is old...
How about these?


