
Using the AXI DMA Engine

One of the essential devices for maximizing performance in FPGA designs is the DMA engine. DMA
stands for Direct Memory Access, and a DMA engine allows you to transfer data from one part of
your system to another. The simplest use of a DMA engine is to transfer data from one part of
memory to another; however, a DMA engine can also transfer data from any data producer
(e.g. an ADC) to a memory, or from a memory to any data consumer (e.g. a DAC). In older systems,
the processor would handle all data transfers between memories and devices. As the complexity
and speed of systems increased over time, this method became unsustainable. DMA was
invented to remove the bottleneck and free the processor from having to move
data from one place to another. In high-performance digital and FPGA systems, the data throughput
is typically far too high for the processor to handle, so a DMA engine is essential.
Xilinx provides an AXI DMA Engine IP core in its EDK design tool. In this tutorial, I'll write
about how to add a DMA engine to your design and how to connect it to a data
producer/consumer. We will test the design on the ZC706 evaluation board. We'll use the Xilinx
DMA engine IP core and we'll connect it to the processor memory. The data producer/consumer will
be created using the Peripheral Wizard, which will generate a custom IP core that implements an AXI
streaming input (data consumer) and an AXI streaming output (data producer). Internally, the AXI
streams will be connected in loopback to enable us to test the design. Afterwards, you will be able
to break the loop and insert whatever devices you like, be it an IP core for processing data, an
ADC, a DAC, you name it.

Start with the base project
You will need to use the Base System Builder to create the base EDK project. If you are not familiar
with the BSB, I have gone through this process in another tutorial here: Using the base system
builder. Otherwise you can download the base project from my Github page at the link below:
https://github.com/fpgadeveloper/zc706-bsb
In this tutorial, I have copied the base project files into a folder called zc706-axi-dma.

Add the DMA Engine
Open the base EDK project using Xilinx Platform Studio 14.7. Your screen should look somewhat
like the image below.

In the IP catalog, open the DMA and Timer branch and find the AXI DMA Engine IP core.
Right click on the AXI DMA Engine and select Add IP.


Click Yes to confirm.


EDK will now open the settings for the AXI DMA Engine.


Disable the Scatter Gather Engine and click OK. EDK will then propose to make the connections to
the processor for you. Click OK.


The EDK will then place the DMA into our base design. Click on the Bus Interfaces tab to see the
AXI DMA Engine in our design and how it's connected.



Expand the axi_dma_0 branch to see the bus connections.

What just happened?
Quite a bit of magic was performed behind the curtains over those few steps. Here are a few
things that the EDK has done for you:
- An AXI interconnect was added to the design and labelled axi_interconnect_1. The base
  design had only an AXI lite interface to connect the processor to the GPIO peripherals
  DIP_Switches_4Bits, GPIO_SWs and LEDs_3Bits. For a high performance DMA, you need a
  full AXI interconnect.
- The DMA bus ports have been connected. I'll explain these buses in another post.
- The DMA interrupts have been connected to the processor. You have to click on the Ports
  tab to see this.
- The DMA engine has been given an address on the memory map. You have to click on the
  Addresses tab to see this.
Notice that there are four buses that are not connected to anything:
- M_AXIS_MM2S
- S_AXIS_S2MM
- M_AXIS_MM2S_CNTRL
- S_AXIS_S2MM_STS
The last two are the control and status streams, which we will not use. The first two are the AXI
streaming master and slave interfaces (the data producer and data consumer respectively). We will
have to connect these to the custom peripheral that we will generate in the next few steps.

Create the data producer/consumer peripheral
We'll now use the Peripheral Wizard to create an IP core that will serve as our data
producer/consumer. It will have an AXI streaming master interface (output/producer) and an AXI
streaming slave interface (input/consumer).
From EDK, select Hardware->Create or Import Peripheral.


The Peripheral Wizard will open to the welcome screen. Click Next.

Select Create templates for a new peripheral and click Next.


The next window wants to know where you will place the peripheral files. Tick To an XPS project,
make sure that the folder is your current project and click Next.


Now you have to name the peripheral. I called mine axi_stream_generator but you can use any
name you like. In a real-world design, this peripheral would be wrapping your data producer or data
consumer, so it might be called axi_adc or axi_dac depending on what device you are getting
data from or pushing data to.


Now you have to choose the type of AXI interface for this peripheral. We want to use AXI streaming.


On the next page we provide information specific to the loopback example that the EDK will
generate. The example peripheral will take in a number of 32-bit words on the AXI-stream slave
interface (let's call it a packet), calculate the sum of those values and then output the sum on the
AXI-stream master interface. This page of the wizard allows us to specify the packet size. Leave the
default of 8 x 32-bit words and click Next.
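
As a rough software model of that behaviour, the loopback peripheral can be sketched in C as below. This is only an illustration that runs on a host PC, not the VHDL that the wizard generates. The names PACKET_WORDS and loopback_packet are made up for this sketch, and it assumes that the sum is repeated once per output word, which is what the test at the end of this tutorial describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Packet size chosen in the Peripheral Wizard: 8 x 32-bit words. */
#define PACKET_WORDS 8u

/*
 * Hypothetical host-side model of the example peripheral (not the generated
 * VHDL): consume one packet from the slave stream, then produce one packet
 * on the master stream in which every word holds the sum of the inputs.
 */
void loopback_packet(const uint32_t in[PACKET_WORDS], uint32_t out[PACKET_WORDS])
{
    assert(in != NULL && out != NULL);

    uint32_t sum = 0;
    for (size_t i = 0; i < PACKET_WORDS; i++) {
        sum += in[i];               /* accumulate the incoming words */
    }
    for (size_t i = 0; i < PACKET_WORDS; i++) {
        out[i] = sum;               /* send the sum back, once per word */
    }
}
```

Feeding this model the words 0 to 7 fills the output packet with 28 (0x1C).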


Just click Next on the page for optional file generations. We won't need any of that.


Click Finish on the last page and EDK will generate the template for our new custom peripheral.


If you now go down to the bottom of your IP catalog, you should see your custom peripheral listed in
the Project Local PCores->USER branch.

Patch time
The template that the EDK just generated for us is great; however, it doesn't quite satisfy the
requirements for the AXI streaming interfaces of the DMA Engine. The AXI streaming protocol
includes a signal called TLAST, which should be asserted when the last data word of a packet is
sent. Unfortunately, the template peripheral generated by the Peripheral Wizard does not drive the
TLAST signal, so we have to make a minor modification to the code.
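
To make the role of TLAST concrete, here is a small C model of a stream master sending one packet. The axis_beat_t type and the axis_send_packet function are invented for this illustration and are not taken from the patched VHDL. The only point is that TLAST stays low on every word except the last one of the packet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One transfer ("beat") on a 32-bit AXI stream, reduced to the two signals
 * that matter here. Hypothetical type, for illustration only. */
typedef struct {
    uint32_t tdata;  /* payload word */
    bool     tlast;  /* true only on the final word of a packet */
} axis_beat_t;

/* Fill beats[0..n_words-1] with one packet, asserting TLAST on the last word. */
void axis_send_packet(const uint32_t *words, size_t n_words, axis_beat_t *beats)
{
    assert(words != NULL && beats != NULL && n_words > 0);

    for (size_t i = 0; i < n_words; i++) {
        beats[i].tdata = words[i];
        beats[i].tlast = (i == n_words - 1);  /* marks the packet boundary */
    }
}
```

In this model a consumer can find the end of the packet just by watching tlast, which is the information the DMA Engine needs on its streaming input.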
In your favourite text editor, open the following file, which contains the VHDL code for the
peripheral template we just generated:
\zc706-axi-dma\EDK\pcores\axi_stream_generator_v1_00_a\hdl\vhdl\axi_stream_generator.vhd
Replace ALL the code with the following code you can get from Github:
https://github.com/fpgadeveloper/zc706-axi-dma/blob/master/EDK/pcores/axi_stream_generator_v1_00_a/hdl/vhdl/axi_stream_generator.vhd
Save and close the file.
If you want to eventually modify the custom peripheral to suit your application, this is the file you will
have to modify so I suggest you read the code and try to get a good idea of how it works.

Add the Custom Peripheral to the project
Right click on the IP core we just created (axi_stream_generator or whatever you called it) and
select Add IP.

Click Yes to confirm.


EDK will now open the configuration window for the peripheral. Just leave the defaults and click OK.


Now go into the Bus Interfaces tab and open up the axi_stream_generator branch to display its
buses.


We must now connect the buses as follows:
S_AXIS of the axi_stream_generator_0 must be connected to axi_dma_0_M_AXIS_MM2S


S_AXIS_S2MM of the axi_dma_0 must be connected to axi_stream_generator_0_M_AXIS


After making those connections, your Bus Interfaces window should look like the image below.


Shift over the bus visualization window to see the AXI streaming buses in a light blue colour.


Now you can see that we have an AXI streaming interface going from the DMA to our peripheral,
and another going from our peripheral to the DMA.
You will not find our custom peripheral in the Addresses tab because, being an AXI streaming
peripheral, it is not on the memory map.

Patch time
Normally the Xilinx tools would connect the clock and reset signals for our custom peripheral
when we make the bus connections. In this case they haven't done so, so we have to do it manually.
Using your favourite text editor, open the system.mhs file from the EDK project folder.
Go to the bottom of the file and find the following code:
BEGIN axi_stream_generator
 PARAMETER INSTANCE = axi_stream_generator_0
 PARAMETER HW_VER = 1.00.a
 BUS_INTERFACE S_AXIS = axi_dma_0_M_AXIS_MM2S
 BUS_INTERFACE M_AXIS = axi_stream_generator_0_M_AXIS
END
Add two lines to make it the following:
BEGIN axi_stream_generator
 PARAMETER INSTANCE = axi_stream_generator_0
 PARAMETER HW_VER = 1.00.a
 BUS_INTERFACE S_AXIS = axi_dma_0_M_AXIS_MM2S
 BUS_INTERFACE M_AXIS = axi_stream_generator_0_M_AXIS
 PORT ACLK = processing_system7_0_FCLK_CLK0
 PORT ARESETN = processing_system7_0_FCLK_RESET0_N_0
END
Save the file and close it.

Generate the bitstream
In EDK click Generate Bitstream.


Once the bitstream has been generated, click Export Design to bring the design into SDK.


Click Export and Launch SDK.

Software Development Kit
1. The SDK should automatically open after the design is exported.
2. When the SDK starts up, it will ask you which workspace to open. Create a folder called SDK in the
zc706-axi-dma folder (or the project folder you are using) and select this as your workspace. Click
OK.


SDK opens up with a welcome screen that should look like the following image.


Now we need to create an application that will run on our ZC706 evaluation board and test our DMA
engine. We will use the UART as an output console so that we can put print statements in our code
to make it easier to see what is going on.
Select File->New->Application project.


In the dialog box that appears, type the name of the project as dma_test and click Next.


We're now asked if we would like to use a template for the application. Select the hello world
template and click Finish.


The SDK will now build the dma_test application and the dma_test BSP (board support package).
When it is finished, your Project Explorer should look like the image below.


Modify the application code
Now we will add code to the template to test our DMA. Double click the helloworld.c file to open it in
the SDK, then replace ALL the code with the following code on Github:
https://github.com/fpgadeveloper/zc706-axi-dma/blob/master/SDK/dma_test/src/helloworld.c
When you select Save, the SDK should automatically start building the application.
The code at the link above comes from an example provided by Xilinx in the installation files. You
can find it at this location on your PC:
C:\Xilinx\14.7\ISE_DS\EDK\sw\XilinxProcessorIPLib\drivers\axidma_v7_02_a\examples\xaxidma_example_simple_poll.c
By the way, if you didn't know about it already, that folder contains heaps of examples that you
will find useful. I suggest you check it out.
Once the application is built, you're ready to run it on the ZC706 evaluation board.

Load the FPGA with the bitstream
1. Turn on your hardware platform (ZC706 or whatever you are using).
2. Connect a USB cable from your board's UART port (J21 on the ZC706) to your computer's USB
port.
3. Open your terminal program (e.g. PuTTY or Miniterm) and connect to the COM port that
corresponds to your UART over USB device. Make sure the port settings are 115200 baud, 8 data bits,
no parity, 1 stop bit.


4. From the SDK menu, select Xilinx Tools->Program FPGA.


5. In the Program FPGA dialog box, the defaults should already specify the correct bitstream for
the hardware project. Make sure they correspond to the image below and click Program.


The Zynq will then be programmed with the bitstream and the console window should give you the
message:
FPGA configured successfully with bitstream "E:/Github/fpgadeveloper/zc706-axi-dma/SDK/EDK_hw_platform/system.bit"

Run the Software Application
1. First make sure that the dma_test application is selected in the Project Explorer, then select
Run->Run or click the icon with the green play symbol in the toolbar.


2. In the Run As dialog box, select Launch on Hardware and click OK.


3. SDK will then program the Zynq with the dma_test application and run it. You should see the
following output in your terminal window.


If you go through the application code, you will see that the test is run 10 times. This is what
happens in each test:
1. We write a packet of 8 words (specifically 0,1,2,3,4,5,6,7) to a transmit buffer that is located
   in memory.
2. We set up and trigger a DMA transfer from our peripheral to the receive buffer (streaming to
   memory mapped). At this point there is no data being sent by our peripheral, but we set up
   the RX in preparation because there soon will be.
3. We set up and trigger a DMA transfer from the transmit buffer to our peripheral (memory
   mapped to streaming). This triggers the DMA to send the data from memory to its AXI-streaming
   master interface, which is connected to the AXI-streaming slave interface of our
   custom peripheral. That data then gets summed and the answer gets pumped out of the peripheral's
   AXI-streaming master interface 8 times (the size of one packet).
4. We wait for both transfers to complete.
5. We read the receive buffer, which is also located in memory and which the DMA should have just
   filled with the received data.
6. We print the received data to the console.
The result should be 0+1+2+3+4+5+6+7 = 28, which is 0x1C in hexadecimal!
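
The sequence above can be rehearsed on a host PC with a small mock. The sketch below is not the Xilinx driver API and not the code from helloworld.c: mock_dma_t, mock_dma_arm_rx, mock_dma_send and run_one_test are invented names, and the mock peripheral simply assumes the sum-and-repeat behaviour described in this tutorial. It is meant to show the order of operations (arm the receive side first, then send) and the value we expect to find in the receive buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PACKET_WORDS 8u   /* words per packet, as chosen in the wizard */

/* Hypothetical stand-in for the DMA engine plus the loopback peripheral. */
typedef struct {
    uint32_t *rx_buf;     /* armed receive buffer, NULL while not armed */
    int       rx_done;    /* nonzero once the reply packet has been stored */
} mock_dma_t;

/* Arm the stream-to-memory side before any data is sent. */
void mock_dma_arm_rx(mock_dma_t *dma, uint32_t *rx_buf)
{
    assert(dma != NULL && rx_buf != NULL);
    dma->rx_buf = rx_buf;
    dma->rx_done = 0;
}

/*
 * Memory-to-stream transfer. The mock peripheral sums the packet and the
 * reply (the sum, repeated PACKET_WORDS times) lands in the armed buffer.
 * Returns -1 if the receive side was not armed (a choice made for this mock).
 */
int mock_dma_send(mock_dma_t *dma, const uint32_t *tx_buf)
{
    assert(dma != NULL && tx_buf != NULL);
    if (dma->rx_buf == NULL) {
        return -1;
    }
    uint32_t sum = 0;
    for (size_t i = 0; i < PACKET_WORDS; i++) {
        sum += tx_buf[i];
    }
    for (size_t i = 0; i < PACKET_WORDS; i++) {
        dma->rx_buf[i] = sum;
    }
    dma->rx_done = 1;
    return 0;
}

/* One test iteration; returns 0 when every received word equals 0x1C. */
int run_one_test(void)
{
    uint32_t tx[PACKET_WORDS];
    uint32_t rx[PACKET_WORDS];
    mock_dma_t dma = { NULL, 0 };

    for (uint32_t i = 0; i < PACKET_WORDS; i++) {
        tx[i] = i;                        /* packet 0,1,...,7 */
    }
    memset(rx, 0, sizeof rx);

    mock_dma_arm_rx(&dma, rx);            /* receive side first */
    if (mock_dma_send(&dma, tx) != 0) {   /* then the transmit side */
        return -1;
    }
    while (!dma.rx_done) {
        /* on hardware this would poll the DMA; the mock is already done */
    }
    for (size_t i = 0; i < PACKET_WORDS; i++) {
        if (rx[i] != 0x1Cu) {
            return -1;
        }
    }
    return 0;
}
```

Calling run_one_test() in a loop of 10 iterations would mirror the structure of the real test, with each iteration returning 0 on success.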
