As I am sure you will recall from my Visions of the Future blog, I am in the process of talking to some of the visionaries of the programmable industry to get their ideas on 1) Where have we been? 2) What are today's challenges? and 3) Where might we end up in the next several years? This will provide the rest of us with a starting point for our own discussions, thoughts, and prognostications.
In my previous column, I posted comments from Navanee Sundaramoorthy, the Xilinx Embedded Platform manager, about where we have come from with regard to FPGAs/MCUs. Now, this blog continues with some of Navanee's thoughts on today's FPGA/MCU challenges. As usual, I have started a new message board for us to continue the discussion.
Breaking the MCU mindset
There are a couple of key challenges Navanee sees in the future that will need innovative responses. Some of these are an outcome of the transition to designers who are more MCU-centric. These users are not as familiar with FPGAs as they are with MCUs. They are very experienced with MCU-oriented design flows and architectural thinking, but don't have deep (or maybe even any) FPGA experience. Thus, they approach the design of a combined FPGA/MCU device with traditional MCU thinking, which means they can easily miss out on some of the advantages FPGA fabric brings.
Let's consider a common scenario that illustrates this challenge. In the not-so-distant past, a traditional FPGA/MCU implementation would have involved the use of two individually packaged devices, an MCU and an FPGA. In many designs, the MCU needs a little more capability than it supports in its default configuration (maybe more SPI ports, some front-end data processing from an analog-to-digital converter, or a different bus interface), so the use of a companion FPGA would seem to "fit the bill." In a real-world design that is representative of this type of scenario, a parallel bus was used to connect the FPGA and the MCU, and the MCU then treated the FPGA as a peripheral. Although this approach easily fit into the MCU-oriented designer's mindset, it significantly limited the bandwidth of the ensuing system.
Even in situations where the MCU is embedded inside the FPGA as a hard core, many MCU-centric designers would be tempted to create an "old school" interface between these two facets of the device. However, today's state-of-the-art FPGA/MCU combo devices open up a wide range of architectural alternatives that break the standard MCU/peripheral mindset...
I think what really needs to be addressed is this: having looked at the Zynq for a project, we decided that you would have to be very brave to say "let's port all our code to the hard core and run 'some random RTOS'" and use it in our next project, simply because the tool chain isn't robust enough. Altera is doing exactly the same thing with its latest products. It feels a 'lot' safer going the standalone processor/FPGA route until the technology is proven. Given that Xilinx tools are not what you might call stable and are routinely a pain in the arse to use, we probably won't be going there for a while yet.
@Warren: Using the FPGA to process input "on the fly" takes advantage of the parallelism, which helps reduce congestion and decrease response time at the system level.
Conceptually it seems that a soft core would be useful with a custom DSP dataflow. The current emphasis is on streaming raw data to memory then sending it back out to GPUs. Your example is much better in general, but image processing uses data that is already in memory and is quite different, so there will be more discussion.
Meanwhile I will be waiting to see if things evolve to the point where my idea for a parameterized programmable control block is worthwhile. Maybe PicoBlaze is already established and good enough. Then there's Jacek's micro.
@rfindley, Yes, that diagram answers a lot of questions and helps me understand the system. Before, it was not clear if only the green and orange areas existed.
The blue/gray area includes the things that I had referred to as a library, but they are hard which is much better. Not only is there a hard ARM core, but a pretty rich basic MCU available.
The hard memory controllers and DMA are plusses.
Now, where can I find documentation for the AXI bus?
Max Maxfield 7/20/2012 9:11:09 AM User Rank Blogger
Re: AXI Busses
@Warren: This all makes sense -- I'm not as aware of the system level stuff as I should be -- I think there's a tool that allows you to define register maps and stuff (in your H/W accelerators) to make things easier also -- I need to research this a bit more...
Warren Miller 7/19/2012 10:20:59 PM User Rank Blogger
Re: AXI Busses
Max-
I believe this is correct. The trick will be to use the peripheral bus to communicate with 'intelligent' peripherals instead of the traditional peripherals in the standard MCU library. For example, you could use the AXI bus to initialize a peripheral with a sequence of commands that could be executed within the FPGA fabric. Something like this: start the AtoD conversion, normalize the result, check for an out-of-range result, if out of range interrupt the MCU, if in range use DMA to store the result to memory, after 1K successful captures are stored in memory run a simple DSP routine to filter out low frequencies, store the filtered results back to memory via DMA, and interrupt the processor. See how this approach is very different from having the MCU do all the work?
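The sequence Warren describes can be sketched as a small software model. This is only an illustration of the control flow, not fabric code and not any vendor API: the class name `IntelligentPeripheral`, the 12-bit full-scale value, the in-range window, and the first-order high-pass filter are all assumptions made for the example. A Python list stands in for the DMA target memory, and a callback stands in for the MCU interrupt line.

```python
from dataclasses import dataclass, field
from typing import Callable, List

BLOCK_SIZE = 1024  # the "1K successful captures" from the comment


@dataclass
class IntelligentPeripheral:
    """Hypothetical software model of a fabric-side capture pipeline."""
    full_scale: int = 4095   # assumption: 12-bit AtoD converter
    low: float = 0.05        # assumption: valid window after normalization
    high: float = 0.95
    alpha: float = 0.9       # assumption: high-pass filter coefficient
    interrupt: Callable[[str], None] = print              # stands in for the MCU IRQ
    memory: List[float] = field(default_factory=list)     # stands in for the DMA target
    filtered: List[float] = field(default_factory=list)   # filtered block written back

    def capture(self, raw: int) -> None:
        """Handle one conversion result without involving the MCU."""
        sample = raw / self.full_scale                 # normalize the result
        if not (self.low <= sample <= self.high):      # out-of-range check
            self.interrupt("out_of_range")
            return
        self.memory.append(sample)                     # "DMA" store to memory
        if len(self.memory) == BLOCK_SIZE:
            self.filtered = self._high_pass(self.memory)
            self.memory = []
            self.interrupt("block_ready")              # MCU is only woken here

    def _high_pass(self, x: List[float]) -> List[float]:
        # First-order high-pass: y[n] = alpha * (y[n-1] + x[n] - x[n-1])
        y = [0.0]
        for n in range(1, len(x)):
            y.append(self.alpha * (y[-1] + x[n] - x[n - 1]))
        return y
```

In this model the MCU sees one interrupt per 1,024 good samples, plus one per out-of-range sample, rather than handling every conversion itself. To try it, create the peripheral with `interrupt=events.append` for some list `events` and call `capture()` repeatedly.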
The orange-colored section is the FPGA fabric. The rest is a standalone ARM with a fairly standard set of hard peripherals that have direct access to external pins via an MCU-controlled I/O Mux.
The only difference from a solo MCU is that the FPGA fabric has access (if you so desire) to a number of MCU subsystems that it would never have access to if the FPGA were separate.
@Karl, you are correct. I glanced at a product brief... it doesn't support extending the instruction set. Accelerators are accomplished via direct access to on-chip memory of the processor, shared access to the external DDR memory, or direct processor-to-logic access via the AXI bus.
Re: reconfiguration.... At the aforementioned meet-up, I asked them a lot of questions about partial configuration. It has the same partial-configuration features as the rest of the 7-series fabric, including all the associated challenges.
@rfindley: I don't know if the ARM architecture has a code point for a custom instruction, but I can imagine that assembly-level code could treat it as pseudo MMIO. Maybe it is an accelerator.
Do you think that using the processor to set up the fabric may be a step toward reconfigurable computing? Or partial reconfiguration of the fabric?