In this column, I'm starting a mini-series on the development of Intellectual Property (IP) cores from FPGA suppliers. The subject of IP cores from third-party vendors will be a follow-on topic since there are significant issues that differ between IP delivered from the FPGA manufacturers and third-party IP suppliers.
I discussed some of the background on IP development, from FPGA manufacturers' perspective, with Tim Vanevenhoven, Director of Marketing, IP Design Methodology at Xilinx. Tim has many years of experience developing IP and the associated tools, methodologies, and strategies used within Xilinx. We started out talking about the key issues faced by FPGA suppliers over the last several years with respect to IP core development.
IP cores improve my productivity, right?
Tim and I started out talking about how, over the last several years, IP cores have become a key (perhaps the key) element in enabling FPGA designers to work effectively with million-plus-gate devices.
For example, IP cores made it easier to use many of the "hardened" functions within large FPGAs (like memory, arithmetic functions, and SERDES). No one (well, no one I knew) really wanted to reinvent the wheel to create a FIFO or a floating-point multiplier or an I2C interface. Designing these things didn't really add any value to the design (they didn't help fundamentally differentiate my design from a competitor -- they were just standard building blocks). It was better if these standard "components" were offered by the FPGA manufacturer so I could more quickly and easily do my design (and use more FPGA gates, since that's what the FPGA suppliers wanted to sell me).
Over time, these building blocks became somewhat standard and were expected to be supported by the supplier as part of the FPGA design tool chain (third-party vendors couldn't make any money off many of these simple blocks). From this starting point, more complex blocks were demanded by customers -- even things like MCUs, DSP functions, and complex standard interfaces like USB or Ethernet, etc. Unfortunately, the FPGA suppliers had not really invested in the infrastructure to make all these IP cores play nicely with each other. Looking back, this shouldn't really be a big surprise since the manufacturers were primarily responding to specific customer requests (or demands, depending on how you look at it) -- they weren't really following a completely planned-out roadmap and grand strategy.
Now, the result of this approach to IP core development was to place some "bumps in the road" for FPGA designers who wanted to use IP cores for more and more of their designs. One of the biggest issues was the fact that even IP cores from the same FPGA manufacturer often didn't easily "talk" to each other. In many cases, the designer needed to add "glue logic" between various IP cores because the bus interfaces were different or the handshake logic operated slightly differently (for example, latencies could be different or signals might be of the wrong polarity).
More complex differences might be that one IP core had a streaming interface while another had a memory-mapped interface. These differences required the designer to dig into the specifications and create the glue logic needed to connect things up correctly. Luckily, FPGAs are good at gluing things together, but this cost designers significant development time in design, test, and verification: time that IP cores were supposed to save!
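To make the glue-logic point concrete, here is a minimal behavioral sketch in Python (not HDL) of two of the adapters described above: a polarity fix and a streaming-to-memory-mapped bridge. Every name in it (`active_low_to_high`, `StreamToMemoryMapped`, the base address) is hypothetical and does not correspond to any vendor core; it only illustrates the kind of logic a designer ends up writing.

```python
from typing import Dict


def active_low_to_high(valid_n: int) -> int:
    """Invert an active-low 'valid' strobe for an active-high consumer."""
    return 0 if valid_n else 1


class StreamToMemoryMapped:
    """Behavioral model: write streamed words to consecutive addresses."""

    def __init__(self, base_addr: int, word_bytes: int = 4) -> None:
        self.next_addr = base_addr
        self.word_bytes = word_bytes
        # A dict stands in for the memory-mapped slave (an assumption).
        self.memory: Dict[int, int] = {}

    def push(self, word: int, valid_n: int) -> bool:
        """Accept one stream beat; store it only when valid is asserted (low)."""
        if not active_low_to_high(valid_n):
            return False
        self.memory[self.next_addr] = word
        self.next_addr += self.word_bytes
        return True


bridge = StreamToMemoryMapped(base_addr=0x1000)
for word, valid_n in [(0xAA, 0), (0xBB, 1), (0xCC, 0)]:
    bridge.push(word, valid_n)

print({hex(addr): hex(word) for addr, word in bridge.memory.items()})
```

Even in this toy form, the adapter has state (the address counter) and a handshake rule, which is why such glue needs its own test and verification effort.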
I came across someone who had a board (with Altera's Arria V FPGA) to play around with, but it had no UART interface for debugging. Then I remembered that the NIOS II soft processor can use the JTAG interface to emulate a UART, by way of the JTAG UART core. But there are some problems:
- The JTAG UART core must be instantiated in SOPC/QSYS, meaning it must come together with the NIOS II CPU.
- On the computer side, a normal terminal window won't work; we must use NIOS2-terminal.
He only needs a UART interface to test his design, but the board doesn't have it. He can use the JTAG UART core, but he needs to understand the basics of NIOS II before he can even use the core.
Jacek Hanke 11/5/2012 3:44:59 AM User Rank Blogger
Re: Trusting vendor cores
@ Jezmo: " The thing with open cores is you can try before.you don't actually buy"
Not exactly, because as I wrote in my previous blog, you can always evaluate an IP core before buying. So both vendor and customer will do everything to make sure that the IP is really working... http://www.programmableplanet.com/author.asp?section_id=2109&doc_id=252730
The "Central Interconnect" is what controls how the peripherals are connected to the CPU, and you configure it by just writing to a set of configuration registers. So no FPGA code is needed for the CPU to connect and use the peripherals.
The Processing System (PS) portion of the chip contains the following hard AXI interfaces:
* 2x AXI 32b master
* 2x AXI 32b slave
* 4x AXI 64b/32b memory interface
* AXI 64b ACP (Accelerator Coherency Port)
If you want to connect the CPU to a synthesized IP, you would use one of the hard AXI Masters.
* Master 1 has 1GB address space starting at 0x40000000
* Master 2 has 1GB address space starting at 0x80000000
And, of course, you would need to synthesize a soft AXI Slave to communicate with the hard AXI Master.
The hard AXI Slaves are used when you want the FPGA fabric to control one or more of the hard peripherals. Of course, to control an AXI Slave, you would need to synthesize a soft AXI Master... so I'm guessing that is what you saw in the Vivado documentation.
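The two master address windows mentioned above can be sketched as a small address decoder. This is only an illustration in Python of the arithmetic (base address plus a 1GB window); the function name `master_for_address` and the dictionary layout are hypothetical, and the base addresses are simply the ones quoted in this comment.

```python
from typing import Optional

GIB = 1 << 30  # size of each window: 1GB, as stated above

# Base addresses as quoted in the comment; the labels are illustrative.
MASTER_WINDOWS = {
    "Master 1": 0x4000_0000,
    "Master 2": 0x8000_0000,
}


def master_for_address(addr: int) -> Optional[str]:
    """Return which hard AXI master window contains addr, or None."""
    for name, base in MASTER_WINDOWS.items():
        if base <= addr < base + GIB:
            return name
    return None


for addr in (0x4000_0000, 0x7FFF_FFFF, 0x8000_0000, 0xC000_0000):
    print(hex(addr), "->", master_for_address(addr))
```

So a soft AXI Slave placed at, say, the very start of the first window would be reached by the CPU through Master 1, and anything from 0xC0000000 upward falls outside both windows.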
In a Zynq tutorial that I was browsing, they connected a soft GPIO core to the CPU. Here's the basic flow:
CPU <--> hard AXI Master <--> soft AXI Interconnect <--> soft AXI GPIO
Strictly speaking, I'm not sure the AXI Interconnect is necessary if you only plan to connect a single slave. But the interconnect provides a few useful services if you want them:
* Data width conversion
* Clock rate conversion
* Pipelining
* Multiplexing between multiple slaves (or masters)
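As a rough idea of what data width conversion involves, here is a behavioral sketch in Python that splits one 64-bit word into two 32-bit beats and joins them again. The function names are hypothetical, and sending the low word first is an assumption made for the example; it is not a statement about how the AXI Interconnect core orders its beats.

```python
from typing import List

MASK_32 = 0xFFFF_FFFF


def split_64_to_32(word64: int) -> List[int]:
    """Split a 64-bit word into two 32-bit beats, low word first (assumed order)."""
    if not 0 <= word64 < (1 << 64):
        raise ValueError("expected an unsigned 64-bit value")
    return [word64 & MASK_32, (word64 >> 32) & MASK_32]


def join_32_to_64(beats: List[int]) -> int:
    """Rebuild the 64-bit word from two 32-bit beats in the same order."""
    low, high = beats
    return ((high & MASK_32) << 32) | (low & MASK_32)


beats = split_64_to_32(0x1122334455667788)
print([hex(b) for b in beats])
print(hex(join_32_to_64(beats)))
```

A narrow slave therefore sees twice as many transfers as the wide master issued, which is one reason the interconnect also offers pipelining.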
As for memory and cache, the amounts on the Zynq are pretty typical for ARM processors (256KB on-chip RAM, and I don't remember the cache size). The assumption is that if you need more memory, you can use the hard DDR2/3 controller to add the right amount of memory for your design. So this is also a pretty typical ARM design.
As for dual-core, you can use them independently, though they share the same address space. So as long as you run them on separate areas of memory, I think you could do whatever you like (e.g. an OS on one, and bare-metal code on the other??). In the bootstrap code, you could have something like this:
if (cpu_id==0)
    jump to address xxxx
else
    jump to address yyyy (or just sleep, if you only need one core)
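The same decision can be modeled in a few lines of Python. This is a sketch of the branching logic only, not real bootstrap code: on hardware the core number would be read from a CPU ID register, whereas here it is passed in as an argument. The names `boot_action`, `ENTRY_CORE0` and `ENTRY_CORE1` are hypothetical stand-ins for the "xxxx" and "yyyy" addresses above.

```python
def boot_action(cpu_id: int, use_second_core: bool = True) -> str:
    """Decide what each core does at boot, following the pseudocode above."""
    if cpu_id == 0:
        # Core 0 always runs the first image (e.g. an OS).
        return "jump to ENTRY_CORE0"
    if use_second_core:
        # The other core runs its own image (e.g. bare-metal code).
        return "jump to ENTRY_CORE1"
    # Only one core is needed, so the other one simply sleeps.
    return "sleep"


for cpu in (0, 1):
    print(cpu, boot_action(cpu))
print(1, boot_action(1, use_second_core=False))
```

Because both cores share one address space, the two entry points would have to sit in separate regions of memory, as noted above.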
Once I get my hands on a Zynq board, I plan to go through this process myself, and will write a blog series about it. So stay tuned :-)
@rfindley: Thanks for the good comments. My points are based on the Vivado documentation, with its emphasis on synthesis and the cores that are listed.
There are 3 AXI4 cores and some other standard cores. I think the ARM core uses the AXI master and since it is a synthesis core that means that the user must add it to the Zynq. Synthesis generates HDL so the whole physical design must be done.
An MCU with an ARM core has a set of peripherals physically connected to the cpu in a fixed configuration. The user must live with what is there rather than define and connect a custom configuration, but the point is that it is ready to be programmed, not ready to build then program.
So the FPGA fabric must be configured before use, as I see it.
What about memory and/or cache? It seems unlikely that there is enough on-chip memory to satisfy a dual-core ARM that almost certainly must have an OS that can use dual cores.
Hoping to be wrong, but feel that this is what you will see when looking under the hood.
First, let me say: I have no affiliation with Xilinx. I'm just an interested developer, and it seems to me that a lot of people don't really understand Zynq yet.
So, with that in mind.... I'm not sure I understand the points you are making.
1) Zynq comes with all the usual hard-IP peripherals that you'll find in any standalone MCU. In fact, you could even ignore the fact that it has FPGA fabric in it, and just treat it like any other MCU. Nothing special needed.
2) You can still use the same development process for Zynq as you do with a separate MCU and FPGA. You can use any 3rd-party ARM compiler that supports the common Cortex-A9, and you can continue to use Xilinx ISE design suite for the FPGA portion. There's no need to use the Xilinx tools for the MCU portion.
3) I think you are overlooking a lot of advantages of the Zynq that you can't get with a two-chip (MCU and FPGA) solution.
* Higher-bandwidth, lower-latency interconnect between processor and FPGA fabric, with multiple buses that can work in parallel.
* Dedicated hardware FIFOs on some of the above-mentioned interconnects.
* Direct access from FPGA fabric to processor on-chip memory (you could achieve some pretty unique accelerators this way)
* Processor and FPGA can both use the on-chip peripherals, so you can divide the workload as you see fit.
Granted, many FPGA+MCU applications don't need those features. Even so, there are some additional benefits that have a more general appeal:
* Chipscope on steroids (use a spare internal AXI connection to dump a large number of internal logic signals directly into ARM-accessible RAM. No external pins needed, and fully user-programmable analysis.... HELLO!!).
* Better board space utilization
* Direct software-based FPGA configuration, giving many new possibilities.
* Possibly longer product life cycle (many ARMs follow short mobile electronics lifecycles. Xilinx is likely to keep their solutions around longer.)
Of course, there are downsides to consider:
* Highly-integrated solutions are more difficult to replace if they go defunct. But at least it is a marriage fairly capable of surviving a split (... that doesn't work very well as a metaphor, does it? :-) )
* Can't upgrade the constituent parts piecemeal.
* GPU not included with the MCU (this can be good or bad)
@devel: The hard ARM core is slower than a separate chip and a lot of other things have to be put on the FPGA and verified so it is hard to see if there is real advantage over an MCU. If they don't come up with a good set of cores that can be used easily and quickly then it is a waste.
Starting with HDL/RTL and going through the whole tool process looks like a bad idea.
A lot of us have been burned by the pretty shitty cores provided by the FPGA vendors. It will take a lot to convince us that Zynq or whatever will be better.