Christian Baumann
Disadvantages of FPGAs:
The same application needs more chip area (transistors) and runs slower on an FPGA than on an ASIC counterpart built in an equally modern technology.
Christian Baumann Field Programmable Gate Arrays (FPGA)
CLB: The configurable logic blocks are where the user-specific functions are computed.
IOB: The input/output blocks connect the FPGA to the other elements of the application.
Interconnect: The interconnect provides the wiring between CLBs and from IOBs to CLBs.
CLB
Interconnect
IOB
Modern FPGAs
Memories
Logic blocks for arithmetic calculations, e.g. multiplication
Digital signal processors (DSP)
Microprocessors
Soft cores built directly on the FPGA fabric
CLBs contain LUTs, each of which can be configured as one 6-input LUT or two 5-input LUTs.
Between 156 and 1064 dual-port block RAMs, each storing 36 Kbits.
Many dedicated, full-custom, low-power DSP slices.
Between 8 and 72 gigabit transceiver circuits.
Integrated interface blocks for PCI Express technology.
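To make the LUT idea concrete, here is a minimal Python sketch of a lookup table as a small configurable truth table. The class name `LUT`, the method names, and the example functions (parity, full adder) are illustrative assumptions and do not come from any vendor tool. The sketch also assumes that the two 5-input LUTs see the same input signals.

```python
class LUT:
    """A k-input lookup table: 2**k configuration bits, one per input combination."""

    def __init__(self, k, func):
        self.k = k
        # "Configuring" the LUT means filling its truth table once.
        self.table = [func(*self._bits(i)) & 1 for i in range(2 ** k)]

    def _bits(self, index):
        # Unpack an integer address into k input bits, least significant first.
        return [(index >> pos) & 1 for pos in range(self.k)]

    def read(self, *inputs):
        # Evaluation is a single table lookup addressed by the input bits.
        address = sum(bit << pos for pos, bit in enumerate(inputs))
        return self.table[address]


# Mode 1: one 6-input function (here: parity of six bits), 64 configuration bits.
parity6 = LUT(6, lambda a, b, c, d, e, f: a ^ b ^ c ^ d ^ e ^ f)

# Mode 2: two 5-input functions (here: sum and carry of a full adder,
# leaving two inputs unused), 32 configuration bits each.
sum5 = LUT(5, lambda a, b, cin, _d, _e: a ^ b ^ cin)
carry5 = LUT(5, lambda a, b, cin, _d, _e: (a & b) | (cin & (a ^ b)))

print(parity6.read(1, 0, 1, 1, 0, 0))
print(sum5.read(1, 1, 1, 0, 0), carry5.read(1, 1, 1, 0, 0))
```

Both modes use the same number of configuration bits (one table of 64 entries, or two tables of 32 entries), which is why one physical LUT can plausibly serve either role.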
XC6VHX5651
Configuring FPGAs
Hardware description language (HDL)
Most common approach to configure an FPGA.
Languages: VHDL, Verilog.
Several blocks of hardware running in parallel are described.
The wires for data movement between the blocks have to be described explicitly.
Synthesis tools translate the code into a bitstream, which is downloaded to the FPGA.
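The difference from sequential software can be illustrated with a small cycle-by-cycle model in Python. This is only a sketch of the concept, not HDL: the two blocks, the function `simulate`, and the signal names are hypothetical. Every block computes its next value from the current signals, and all registers then update together on the clock edge.

```python
def simulate(inputs):
    """Cycle-by-cycle model of two hardware blocks connected by one wire.

    Block A doubles its input; block B adds one to whatever is on the wire.
    Each block has a register at its output, like a clocked HDL process.
    """
    wire_a_to_b = 0  # register at the output of block A (the explicit wire)
    out_b = 0        # register at the output of block B
    trace = []
    for x in inputs:
        # Phase 1: every block computes its next value from the CURRENT signals.
        next_a = 2 * x
        next_b = wire_a_to_b + 1
        # Phase 2: on the clock edge all registers update at the same time.
        wire_a_to_b, out_b = next_a, next_b
        trace.append(out_b)
    return trace


print(simulate([1, 2, 3, 4]))  # prints [1, 3, 5, 7]
```

The first output reflects the reset value on the wire. From then on, block B works on the previous result of block A while block A already processes the next input, so both blocks are busy in every cycle.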
Configuring FPGAs
Library-based solution
Parameterized macros generate code for common blocks such as arithmetic functions or specialized memories.
The output of the macros is HDL code that can be included in the developer's synthesis process.
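As a rough illustration of the library-based idea, the following Python sketch plays the role of a parameterized macro: it takes a bit width and emits Verilog source for an adder. The function name `adder_macro` and the module and port names are hypothetical; real vendor libraries use their own generators and interfaces.

```python
def adder_macro(name, width):
    """Return Verilog source for an unsigned adder with the given bit width."""
    if width < 1:
        raise ValueError("width must be at least 1")
    msb = width - 1
    return "\n".join([
        f"module {name} (",
        f"    input  wire [{msb}:0] a,",
        f"    input  wire [{msb}:0] b,",
        f"    output wire [{width}:0] sum",  # one extra bit holds the carry
        ");",
        "    assign sum = a + b;",
        "endmodule",
    ])


# Generate an 8-bit instance; the text can be added to the synthesis sources.
print(adder_macro("add8", 8))
```

The developer only chooses the parameter; the generated HDL text then goes through the same synthesis flow as handwritten code.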
Vision systems such as vehicle detection and security systems.
After acquisition of the image, a significant amount of computation is required in the pre-processing phase before the features are extracted and classified.
The computation contains fine-grained parallelism.
Implemented on a Xilinx Virtex-2 FPGA.
Image pre-processor
Pre-processing element
Implementation
Celoxica RC1000-PP board with a Xilinx Virtex FPGA.
256x256-pixel sensor connected to the board via external I/O.
The board communicates with the host via a PCI bus.
Clock frequency: 50 MHz.
Performance compared with a software implementation executed on a PC (processor clock frequency 266 MHz):
FPGA: 125 frames/s
PC: 50 frames/s
The improvement is mainly due to the parallelism of the architecture.
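The figures above can be put in relation with a short calculation. The Python sketch below assumes that every frame consists of the full 256x256 pixels and that the frame rate is limited by computation alone; both assumptions, and all variable names, are ours and are not stated on the slide.

```python
PIXELS_PER_FRAME = 256 * 256   # sensor resolution from the slide
FPGA_CLOCK_HZ = 50_000_000     # 50 MHz
PC_CLOCK_HZ = 266_000_000      # 266 MHz
FPGA_FPS = 125
PC_FPS = 50


def cycles_per_pixel(clock_hz, fps):
    # Clock cycles available for each pixel at the given frame rate.
    return clock_hz / (fps * PIXELS_PER_FRAME)


speedup = FPGA_FPS / PC_FPS
fpga_cpp = cycles_per_pixel(FPGA_CLOCK_HZ, FPGA_FPS)
pc_cpp = cycles_per_pixel(PC_CLOCK_HZ, PC_FPS)

print(f"Frame-rate speedup: {speedup:.1f}x")
print(f"FPGA: {fpga_cpp:.1f} cycles per pixel")
print(f"PC:   {pc_cpp:.1f} cycles per pixel")
```

Under these assumptions the FPGA delivers 2.5 times the frame rate at less than a fifth of the clock frequency, spending about 6 cycles per pixel where the PC spends about 81. This is consistent with the statement that the gain comes from parallelism and not from clock speed.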
References
David R. Martinez, Robert A. Bond, M. Michael Vai. High Performance Embedded Computing Handbook. CRC Press, 2008.
Ian Kuon, Russell Tessier, Jonathan Rose. FPGA architecture: Survey and challenges. Foundations and Trends in Electronic Design Automation, 2007.
Stephanie McBader, Peter Lee. An FPGA implementation of a flexible, parallel image processing architecture suitable for embedded vision systems. IEEE, 2003.
Questions?