Max Maxfield

Ask Max: FPGA Processors vs. Hardware Accelerators

Max Maxfield
7/7/2012 2:18:36 PM
User Rank
Blogger
Re: Cpu's and control functions
@Karl: With regard to the designer being able to check that the synthesis tool implemented what was desired, there are a variety of tools used for this, including simulation and formal analysis in the form of equivalency checking ... I will be writing blogs on all of these in the fullness of time :-)

David Ashton
6/6/2012 4:18:56 PM
User Rank
Guru
Re: Hardware Accelerators
Thanks Max.  Perhaps you can add to your ever-growing list of "Ask Max" topics: A look at the anatomy of the DSP blocks and how they work and how you use them?

Karl
6/6/2012 11:56:33 AM
User Rank
Guru
Re: Cpu's and control functions
@Max: I went to the website and could not comprehend one statement, something about response time being constant no matter how many tasks are in the application.

Now I can't find it again to check the wording.

I am gun-shy of all the marketing hype in this business, and I am wondering when we will get to understanding the FPGA, because it seems that the designer needs to verify that synthesis generated what was expected.

Since simulation results before and after synthesis can differ (I think), it seems very important to understand the implementation.

 

Max Maxfield
6/6/2012 11:16:47 AM
User Rank
Blogger
Re: Cpu's and control functions
@Karl: Re Micrium, I should also have added "Free to play with" and "They let you see the Source Code" which is very important for understanding what's going on.

 

They have books on their RTOS that are HIGHLY RECOMMENDED 

Max Maxfield
6/6/2012 11:15:14 AM
User Rank
Blogger
Re: Cpu's and control functions
@Karl: As you say, you do need an operating system (OS) ... of course you can write all of the interrupt handling yourself and run your application "bare-bones" without an OS, but you would be a braver man than I am if you are using a 32-bit machine (grin).


One thing I do like is the small RTOSes, like uC/OS-III from Micrium -- very small memory footprint and very deterministic with regard to response time...

Karl
6/6/2012 11:05:45 AM
User Rank
Guru
Re: Cpu's and control functions
@Max: I am thinking about the hard CPU. That means there is one CPU that somehow has to service all interrupts, and there has to be an OS, with its overhead, to schedule the interrupt handlers. As you said, things have to be tidied up, stacked, and unstacked; pipelines have to be filled and emptied; there may be cache misses; etc.

The response times vary widely, and debug/verification takes lots of time.

So I see a lot more to the control problem than the fact that the CPU can do conditional jumps/branches and evaluate expressions.

Here too, these things can be done by functional blocks in parallel in an FPGA, and the response times can be determined with good precision.

The interrupt can be handled in a much simpler way. Simple is the best way.

If you are thinking of several 8-bit soft cores with dedicated functions, that is a different situation. That goes back to adopting CPUs because the function was too complex for custom logic. They are sequential and therefore slower, but the function is practically unlimited.

If one is not sufficient, add another and split the load. Repeat as necessary so long as resources are available.

Max Maxfield
6/6/2012 10:30:56 AM
User Rank
Blogger
Re: Hardware Accelerators
@David: I agree -- CPU slowest, programmable logic faster (when you implement a bunch of operations in parallel), hardware adders really, really fast.

The sort of hardware blocks that are put in today's FPGAs are blocks of SRAM (with control logic to make them dual port ... I think some also have ECC capability hardwired in but I'm not so sure about that) and DSP functions. The really big FPGAs can have literally thousands of DSP blocks, which -- amongst other things -- contain multipliers and accumulators.

Other functions are more along the lines of clock managers and high-speed serial interconnect (SERDES) and DDR memory interfaces and other interfaces like SPI and I2C and UARTs and suchlike.
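As a rough illustration of the multipliers and accumulators mentioned above, here is a minimal Python sketch of a multiply-accumulate (MAC) step and a dot product built from it. The function names and the 48-bit accumulator width are assumptions made for the example, not details taken from this discussion, and only unsigned values are modeled.

```python
def mac(acc, a, b, width=48):
    """One multiply-accumulate step, wrapped to a fixed accumulator width.

    The default width of 48 bits is an assumed value for illustration.
    """
    return (acc + a * b) & ((1 << width) - 1)


def dot_product(xs, ys):
    """Chain MAC steps over two equal-length sequences of unsigned integers."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = mac(acc, x, y)
    return acc


result = dot_product([1, 2, 3], [4, 5, 6])  # 1*4 + 2*5 + 3*6
```

In software each MAC step runs one after another. The appeal of having many hardware blocks is that independent MAC chains could run at the same time.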

Max Maxfield
6/6/2012 10:26:36 AM
User Rank
Blogger
Re: Cpu's and control functions
@Karl: I'm not an expert in microcontrollers per se (I mean using them to create professional embedded systems). I know that there are several ways for them to be informed that something is ready for them to look at (like using semaphores, for example).
The way I'm most familiar with is when something generates an interrupt. In this case there may be a hardware sub-system that determines the priority of the interrupt (is this one a higher priority than one that is already being serviced, for example?). If the interrupt is of sufficient priority, then the processor completes the instruction it is working on, tidies up (pushing key registers onto the stack), and then jumps to an interrupt service routine to service the interrupt. When it has finished servicing the interrupt, it returns to whatever it was doing before.
Simple 8-bit MCUs might have only a single interrupt. More complex MCUs might have a bunch of them. Maybe Duane or someone else could elaborate a little more here...
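The interrupt sequence described above can be sketched as a toy Python model. This is only an illustration under simplifying assumptions: the class, method, and interrupt names are hypothetical, priorities are plain integers where a larger number means more urgent, and real hardware details such as register saving and pipeline flushing are reduced to pushing one tuple onto a list.

```python
class Cpu:
    """Toy model of prioritized, nested interrupt handling (names are hypothetical)."""

    def __init__(self):
        self.stack = []              # saved (task, priority) contexts
        self.current = ("main", 0)   # what is running now; 0 is the lowest priority
        self.pending = []            # interrupts deferred because priority was too low
        self.log = []

    def raise_interrupt(self, name, priority):
        if priority > self.current[1]:
            self.stack.append(self.current)    # tidy up: save the current context
            self.current = (name, priority)    # jump to the service routine
            self.log.append(f"enter {name}")
        else:
            self.pending.append((name, priority))
            self.log.append(f"defer {name}")

    def return_from_interrupt(self):
        self.log.append(f"exit {self.current[0]}")
        self.current = self.stack.pop()        # resume whatever was running before
        if self.pending:
            # Service the most urgent deferred interrupt if it now outranks us.
            best = max(self.pending, key=lambda p: p[1])
            if best[1] > self.current[1]:
                self.pending.remove(best)
                self.raise_interrupt(*best)


cpu = Cpu()
cpu.raise_interrupt("uart", 1)    # preempts main
cpu.raise_interrupt("timer", 3)   # preempts the uart handler
cpu.raise_interrupt("spi", 2)     # lower than timer, so it waits
cpu.return_from_interrupt()       # timer done; spi now outranks uart and runs
cpu.return_from_interrupt()       # spi done; back to uart
cpu.return_from_interrupt()       # uart done; back to main
```

The sketch does not guard against returning when nothing has been interrupted; a fuller model would need to handle that case.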

EdV
6/6/2012 9:38:43 AM
User Rank
Guru
Re: Accelerators
A (relatively) Simple DSP Bottleneck Addressed By CPLD (not an FPGA but the principles are the same)

I was working on a cell phone base station once upon a time and we moved a bottleneck DSP quadrature detect algorithm to a CPLD.  The decision to do so was fairly straightforward because the DSP was already using an external FIFO as part of the operation.

Basically the operation was:

1. Receive a process request from the DSP

2. Look at four values in the FIFO and, depending on which value was largest, take the two's complement of a particular pair of values

3. Write the new values into the FIFO

4. Signal the DSP that the operation was complete.
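The four steps above might be sketched in Python as follows. This is only a guess at the shape of the operation: the rule for which pair of values gets negated is not described in the comment, so the `EXAMPLE_PAIRS` table below is invented purely as a placeholder. The 16-bit sample width, the unsigned comparison, and all names are likewise assumptions.

```python
MASK16 = 0xFFFF  # assumed 16-bit samples; the real width is not stated


def twos_complement(value, mask=MASK16):
    """Negate a value in fixed-width two's complement arithmetic."""
    return (-value) & mask


# Placeholder mapping (invented for illustration): index of the largest
# value -> indices of the pair to negate. The real rule is unknown.
EXAMPLE_PAIRS = {0: (2, 3), 1: (0, 3), 2: (0, 1), 3: (1, 2)}


def process_request(fifo, pairs=EXAMPLE_PAIRS):
    """Steps 2 to 4: inspect four values, negate one pair, write back, signal done."""
    values = fifo[:4]
    # Compares raw unsigned values; a real design might compare signed ones.
    largest = max(range(4), key=lambda i: values[i])
    for i in pairs[largest]:
        values[i] = twos_complement(values[i])
    fifo[:4] = values   # step 3: write the new values into the FIFO
    return True         # step 4: stand-in for the "complete" signal to the DSP


fifo = [5, 9, 1, 2]
done = process_request(fifo)  # step 1: the call stands in for the DSP's request
```

In a CPLD the comparison and the negation would be combinational logic rather than a loop, which is where the speedup over the DSP would come from.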

I wish I were able to describe step 2 better (1996 is a long time ago), because this was the real bottleneck. The SW guys could not actually describe what this part of the algorithm was doing other than "it works."  They also had the only setup that could show that it worked.  Much hair pulling ensued, as 99.99% correct in this instance would not tell you whether you were any closer to detecting quadrature than not working at all.

 

 

 

David Ashton
6/5/2012 8:34:52 PM
User Rank
Guru
Hardware Accelerators
Hi Max

If I have understood you right here, then for your example (adding two 10x10 matrices of 32-bit numbers), you'd be looking at:
  • CPU - slowest
  • Programmable logic fabric - much faster
  • Hardware adders - really fast

So a couple of questions:
  1. I can see that the middle solution (the programmable logic implementation) would be much faster than the CPU, mainly because of the parallelism but also because of the speed of the logic.  But how much faster would pure hardware be?
  2. Obviously pure hardware needs to be purpose-implemented in the IC and you can't change it.  What sorts of hardware blocks are put in these ICs?  Your example of the Zynq earlier included DSP blocks - what sorts of hardware blocks do you find inside today's ICs?
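To make the parallelism argument concrete, here is a small Python sketch of the 10x10 matrix addition with a count of how many passes are needed for a given number of adders. It is a back-of-the-envelope model only: a "pass" is not a clock cycle, and real speedups also depend on clock rates and on how fast the operands can be fetched. The function names are made up for the example.

```python
N = 10
MASK32 = 0xFFFFFFFF  # wrap sums to 32 bits, as a fixed-width adder would


def add_matrices(a, b):
    """Element-wise addition, one element at a time, as a simple CPU loop would."""
    return [[(a[r][c] + b[r][c]) & MASK32 for c in range(N)] for r in range(N)]


def adder_passes(num_elements, num_adders):
    """Passes needed when num_adders additions can run side by side."""
    return -(-num_elements // num_adders)  # ceiling division


a = [[r * N + c for c in range(N)] for r in range(N)]
b = [[1] * N for _ in range(N)]
total = add_matrices(a, b)

sequential_passes = adder_passes(N * N, 1)     # a single adder doing all the work
parallel_passes = adder_passes(N * N, N * N)   # one adder per matrix element
```

With one adder the 100 additions take 100 passes; with 100 adders working side by side they take a single pass, which is the essence of the fabric's advantage in this example.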

