
Journal of Advances in Communication Engineering and Its Innovations

Volume 4 Issue 2

Hw/Sw Co-Design Implementation on Zedboard

Shruti Vinayak Shet*, Dr. Nitesh Guinde


Student*, Associate Professor**
Department of Electronics and Telecommunication
Goa College of Engineering, Ponda, India
Corresponding author’s email id: shrutishet908@gmail.com, nitesh.guinde@gec.ac.in

Abstract
Field Programmable Gate Arrays (FPGAs) are an enabling technology for
application-oriented systems, providing a means for rapid prototyping and
evaluation as well as algorithm acceleration. Many FPGA vendors have
recently started embedding processors in their devices alongside the
programmable logic cells, as Xilinx does with ARM Cortex-A cores. Such
devices are known as Programmable System on Chip (PSoC). The ARM cores
(embedded in the Processing System, or PS) communicate with the
programmable logic cells (PL) over ARM-standard AXI buses. The hardware
setup used in this project is the Zedboard together with the
AD-FMComms2-EBZ, a high-speed analog module that comes with the IIO
OSCILLOSCOPE and GNU RADIO software preinstalled. The IIO OSCILLOSCOPE
Linux application supports different plots for real-time processing and
analysis of the signals obtained from the antennas of the analog module.
The application captures the desired incoming RF signal, and the entire
computation runs on the PS while the FPGA fabric remains idle. Profiling
showed that the most computationally expensive block is the Fast Fourier
Transform (FFT), which delayed the display of the output. To reduce the
computation time, we move the FFT block to the PL, which communicates with
the PS side of the board over the AXI buses. Owing to the parallel nature
of the FPGA, large mathematical computations can be carried out in less
time.

13 Page 13- 28 © MANTECH PUBLICATIONS 2019. All Rights Reserved



Key Words: Field Programmable Gate Arrays, Programmable System on Chip,
IIO OSCILLOSCOPE, Zedboard, Fast Fourier Transform, AXI buses.

INTRODUCTION
System-on-a-Chip (SoC) technology was designed for applications that
require all components to be implemented at the chip level while keeping
the overall dimensions small. It refers to integrating all components of an
electronic system into a single integrated circuit. Such a chip can contain
digital, analog, mixed-signal, and often RF functions, all in one single
chip. A recent development in the SoC family is the Programmable
System-on-Chip (PSoC), which is designed to replace multiple traditional
MCU-based system components with one low-cost, single-chip programmable
device. PSoC devices, as the name suggests, mainly include configurable
blocks of analog and digital logic as well as programmable interconnects.
This architecture not only allows the user to create customized peripheral
configurations that match the requirements of each individual application,
but also integrates a fast CPU, Flash program memory, SRAM data memory, and
configurable I/O, bound together in a range of convenient pinouts and
packages.

The Zynq®-7000 SoC family incorporates the software programmability of an
ARM®-based processor with the hardware programmability of an FPGA, enabling
key analytics and hardware acceleration while integrating CPU, DSP, ASSP
and mixed-signal functionality on a single device.

ZYNQ-7000 ARCHITECTURE
A. Main Purpose of Zedboard
The main reason for choosing the Zedboard as the central element of the
project was to explore H/W-S/W compatibility between the PS and PL, and the
ease of modifying the programmable blocks with software tools such as
Xilinx Vivado and Hardware Description Languages (HDLs) such as VHDL and
Verilog.

B. ZYNQ-7000 SOC Specifications
The Zedboard is an evaluation and development board based on the Xilinx
Zynq-7000 Extensible Processing Platform. Combining a dual Cortex-A9
Processing System (PS) with 85,000 Series-7 Programmable Logic (PL) cells,
the Zynq-7000 can be targeted at a broad range of applications. The Xilinx
Vivado suite offers the designer a variety of environments for working on
the Zedboard, including flexible design and support for designing and
testing IP blocks that can be used and reused across design approaches. The
Zedboard's powerful mix of on-board peripherals and expansion capabilities
makes it an ideal platform for both novice and experienced designers. The
board contains everything a developer needs to create a Linux, Android,
Windows® or other OS/RTOS based design.

The Zynq SoC chip consists of 2 major sections:
PS: Processing System
• Dual ARM Cortex-A9 processors, 866 MHz to 1 GHz frequency
• Multiple peripherals
• Hard silicon core
• Dedicated DDR memory controller

PL: Programmable Logic
• Logic cells: 28k-44k
• AD converter: two 12-bit
• Gives the user the ability to develop custom logic or IP that works in
  conjunction with the software running on the processor core

The essential feature of the Zedboard is that the PS and PL are linked by a
series of interfaces that follow the AMBA AXI4 interconnect standard, as
seen in Fig-1. These interfaces enable a designer to implement custom logic
blocks in the PL and connect them to the PS. This also extends the range of
peripherals: blocks attached to the AXI bus appear in the processor's
memory map and can be accessed from software.

C. AD9361 - ANALOG DEVICES
The AD-FMComms2-EBZ is an FPGA Mezzanine Card (FMC) board for the AD9361, a
highly integrated RF Agile Transceiver™. The purpose of the AD-FMComms2-EBZ
is to provide an RF platform that demonstrates the performance of the
AD9361.

According to the datasheet specification, the expected performance of this
platform is at frequencies up to 2.4 GHz. The device combines an RF front
end with a flexible mixed-signal baseband section and integrated frequency
synthesizers, simplifying design-in by providing a configurable digital
interface to a processor or FPGA. Fig-2 shows the Zedboard interfaced with
the AD9361.


Fig-1 Zynq-7000 SoC internal architecture

Fig-2 Hardware setup of Zedboard and AD9361 RF Device


D. ADI IIO OSCILLOSCOPE
The ADI IIO OSCILLOSCOPE is a product by Analog Devices, Inc. that uses
libiio to interface with Linux IIO devices developed by Analog Devices. It
is a Linux application that demonstrates how to interface with various
evaluation boards, such as the AD9361, from within Linux. The application
also supports plotting captured RF data in 4 different modes (time domain,
frequency domain, constellation and cross correlation), as observed in
Fig-3. This makes it a useful tool for testing, debugging and fine tuning
an RF system. Parameters such as gain, tuning and filter profiles of a
particular RF signal can be observed and modified on the fly.

E. Fast Fourier Transform IP Core
A Fast Fourier Transform (FFT) is an algorithm that computes the discrete
Fourier transform (DFT) of a sequence, or its inverse (IDFT). Fourier
analysis converts a signal from its original domain (often time or space)
to a representation in the frequency domain and vice versa. The DFT is
obtained by decomposing a sequence of values into components of different
frequencies.
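As a minimal sketch of the relationship between the DFT and the FFT (not the project's Verilog core), the following Python snippet evaluates the DFT directly and then with a recursive radix-2 decimation-in-frequency FFT, and compares the two results. The function names, the sample values and the recursive formulation are illustrative assumptions; the hardware core in this project is a pipelined design.

```python
import cmath


def dft(x):
    """Direct O(N^2) evaluation of X[k] = sum_n x[n] * exp(-2j*pi*k*n/N)."""
    n_pts = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def fft_dif(x):
    """Recursive radix-2 decimation-in-frequency FFT; len(x) must be a power of two."""
    n_pts = len(x)
    if n_pts == 1:
        return list(x)
    half = n_pts // 2
    # Butterfly: the sums feed the even-indexed outputs,
    # the twiddled differences feed the odd-indexed outputs.
    top = [x[n] + x[n + half] for n in range(half)]
    bottom = [
        (x[n] - x[n + half]) * cmath.exp(-2j * cmath.pi * n / n_pts)
        for n in range(half)
    ]
    out = [0j] * n_pts
    out[0::2] = fft_dif(top)
    out[1::2] = fft_dif(bottom)
    return out


# Hypothetical 8-point test sequence.
samples = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, -2.0, 1.5]
reference = dft(samples)
fast = fft_dif(samples)
max_error = max(abs(a - b) for a, b in zip(reference, fast))
print(f"largest difference between DFT and FFT outputs: {max_error:.2e}")
```

The direct form needs on the order of N^2 complex multiplies, while the FFT needs on the order of N log2 N, which is why the FFT is the form worth moving into hardware.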

Fig-3 Waveform generated by an incoming RF signal detected by the antennas.


The Fast Fourier Transform code used in this project is composed of a
series of stages. From the start, the stages are split into even and odd
stages. The first stage is numbered N according to the size it represents,
the second stage is labelled N/2, then N/4 and so on down to N=8, as seen
in Fig-5. Internal to each FFT stage is a butterfly and a complex multiply
stage. These FFT stages are a form of decimation-in-frequency FFT, and the
coefficients alternate between the two kinds of stage: the even stages get
all the even coefficients, and the odd stages get all the odd coefficients.
Each stage spends the first N/4 clocks storing its inputs into the
allocated memory, and the next N/4 clocks pairing a stored input with a
single external input; each pair becomes the input to the butterfly. The
butterfly coefficients are read from a small ROM table.

A synchronization pulse is used to keep track of the butterfly output, and
a counter keeps things aligned. After the first pulse, the next N/4 clock
cycles produce valid butterfly outputs. The left output is sent immediately
to the next FFT stage, whereas the right output is saved in memory.

Once this cycle is complete, the butterfly outputs are invalid for the next
N/4 clock cycles. During these clock cycles, the FFT stage outputs the data
that had been stored in memory, as can be seen in Fig-5. The complex
multiply inside the butterfly is formed from three simple shift-and-add
multiplies, whose outputs are then combined into a single complex output.
The complex coefficient, Z_n, for these multiplies is given by

$Z_n = C_n + jS_n$, where
$C_n = \left\lfloor 2^{C-2} \cos(2\pi n/N) + \tfrac{1}{2} \right\rfloor$,
$S_n = \left\lfloor 2^{C-2} \sin(2\pi n/N) + \tfrac{1}{2} \right\rfloor$,
and $C$ is the number of bits allocated to the coefficient (see Fig-5).

Figure-4 FFT Black Box Representation

HARDWARE SETUP
Xilinx FPGAs have a long history in the implementation of fixed-point DSP
and video algorithms in hardware. The flexibility of programmable logic
allows fixed-point arithmetic to use custom bit widths that are not bound
to the 8-, 16- or 32-bit boundaries prescribed by fixed-point processors.

With such high-end processing systems available in the Xilinx Zynq™-7000
All Programmable SoC used in this project, a user can fully utilize the
processing system and the programmable logic at the same time.

A well-designed embedded system should achieve the best fit between
hardware acceleration blocks and software execution and processing
management; this combination is called a H/W-S/W co-design environment.
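To make the fixed-point discussion concrete, here is a small Python sketch that evaluates the integer coefficients C_n and S_n from the coefficient formula given earlier for an arbitrary coefficient width. It assumes the square brackets in that formula denote rounding down after adding one half; the function name and the choice of N = 8 points and C = 12 bits are hypothetical and only serve as an illustration.

```python
import math


def twiddle_coefficients(n_points, coeff_bits):
    """Return integer (C_n, S_n) pairs for n = 0 .. n_points - 1.

    Assumes C_n = floor(2^(C-2) * cos(2*pi*n/N) + 1/2), and likewise for S_n
    with sin, where C is the coefficient width in bits.
    """
    scale = 2 ** (coeff_bits - 2)
    table = []
    for n in range(n_points):
        angle = 2.0 * math.pi * n / n_points
        c_n = math.floor(scale * math.cos(angle) + 0.5)
        s_n = math.floor(scale * math.sin(angle) + 0.5)
        table.append((c_n, s_n))
    return table


# Hypothetical example: an 8-point table with 12-bit coefficients.
table = twiddle_coefficients(8, 12)
for n, (c_n, s_n) in enumerate(table):
    print(n, c_n, s_n)
```

Because the scale factor is 2^(C-2), every entry has magnitude at most 2^(C-2) and therefore fits in a signed C-bit word with headroom. The width C can be any value the design needs, not only 8, 16 or 32.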

Fig-5 Single FFT butterfly output stage


The timing and speed constraints of the system make it mandatory to use H/W
acceleration, since algorithms implemented in the PL need less execution
time than their software counterparts.

The AD-FMComms2-EBZ analog module is interfaced with the Zedboard to create
a Software Defined Radio (SDR) platform, which enables the hardware to
capture or send the signals of a desired application through the antennas
of the AD9361. These signals are then either transmitted or received based
on the user requirements and displayed in the built-in IIO OSCILLOSCOPE
Linux application. Fig-6 shows the hardware connection and the internal
connection between the Zynq platform and the AD-FMComms2 board. Because the
IIO OSCILLOSCOPE application runs entirely on the PS side, the extensive
calculations it needs make it a slow process.

Fig-6 Basic Hardware Arrangement


A. System Architecture
The whole system design is centered on the programmable logic. The main
interface between the programmable logic and the processing system is a set
of multi-channel communication protocols called the AXI bus interface. The
role of this dedicated communication protocol and its interfaces is to
provide convenient and fast data exchange during processing.

A Zynq system is designed with the help of the Xilinx Vivado™ Integrated
Development Environment (IDE). The IP Integrator environment of the Vivado
IDE is used to create and generate a Zynq processor based design that can
be implemented on the Zedboard. The generated design is then passed to the
Software Development Kit (SDK), which is used to create the software
application for that design. This application runs on the Zynq's ARM
Processing System and controls the hardware implemented in the Programmable
Logic.

Fig-7 Overall Procedure Used in Project


1) Creating IP
There is no restriction on the complexity of an intellectual property (IP)
block that can be added in the fabric and tightly coupled with the Zynq SoC
PS. Since Zynq devices comprise both PS and PL parts and the created IP
runs in the PL, the IP must be able to communicate with software running on
the PS. This requires the IP to be packaged with an interface that is
compatible with the PS. When creating IP in HDL, Vivado provides a list of
AXI interface templates that can be created and customized via the Create
and Package IP Wizard.

2) Custom Embedded Linux OS on Zedboard
The ultimate aim of this project is to load the custom design created by
the designer onto the FPGA fabric. To start this process we need to build a
custom Embedded Linux OS for the Zedboard. This means setting up the
environment according to the application in order to build the HDL project
from the available Analog Devices repository. The steps below give insight
into how communication between the PL and PS sides of the Zedboard is
established.

a) First Stage Boot Loader: Once the BootROM is configured, the next step
is creating the First Stage Boot Loader (FSBL) for Zynq. "The FSBL
configures the FPGA with HW bit stream (if it exists) and loads the
Operating System (OS) Image or Standalone (SA) Image or 2nd Stage Boot
Loader image from the non-volatile memory (NAND/NOR/QSPI) to RAM (DDR) and
starts executing it. It supports multiple partitions, and each partition
can be a code image or a bit stream." This explanation is provided by
Xilinx in the SDK software when opting for zynq_fsbl.

b) Programmable Logic Hardware Bit stream File: "A Bit Stream or bit stream
is a time series or sequence of bit." From this it is clear that the PL HW
bit stream file is a sequence of bits that holds the HW and PL
configuration of the board selected by the user, in binary form, so that
the selected board can interpret and implement it.

c) Second Stage Boot Loader U-Boot: U-Boot is an open-source, primary
boot-loader used in embedded devices to package the instructions to boot
the device's operating system kernel. It boots an operating system by
reading the kernel and any other required data into memory and executing
the kernel with the proper arguments.

d) Generation of "boot.bin" file: When adding the boot image partitions,
the following order must be observed: first, the bootloader, i.e. the First
Stage Boot Loader ("zynq_fsbl.elf"); second, the Programmable Logic
bitstream ("system.bit"); and finally the software application file, i.e.
U-Boot ("u-boot.elf"). All these file names are stored in a boot.bif file.
"Bootgen" is a standalone tool that creates a bootable image appropriate
for the Zedboard. It is invoked with the following command:

bootgen -image boot.bif -o i boot.bin

This command assembles the boot image by merging the elf and bit files into
a single binary output file, boot.bin, which is then placed on the SD Card.

e) Device Tree Binary, "devicetree.dtb": The Linux Kernel uses a
board-specific data structure called the "Device Tree Blob" or "Device Tree
Binary (DTB)" to describe the hardware. It is a database that represents
the hardware components on a given board, and it is the default mechanism
for passing low-level hardware information from the bootloader to the
kernel. The DTB file is built from the Device Tree Source (DTS) file. The
DTS file is the DTB file written in a human-editable format.

f) Linux Kernel File, uImage: It is a zImage file with a U-Boot wrapper
that includes the OS type and loader information.

g) Root File System Image, "uramdisk.image.gz": It is a compressed file
that contains all the operating system files. Fig-9 illustrates the
arrangement of the files on the SD Card, which is then inserted into the SD
Card slot of the Zedboard.

Fig-8 gives a step-by-step view of the creation of the Custom Embedded OS
on the Zedboard. All these files are copied onto the SD Card, which is
inserted into the SD Card slot of the Zedboard, as can be seen in Fig-9.
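The partition order described in step d) can be sketched as a short Python helper that writes out the text of a boot.bif file. The helper name and the image label are hypothetical, and the file names assume the FSBL is called zynq_fsbl.elf; the brace layout and the [bootloader] attribute follow the usual Bootgen BIF syntax, but check the Bootgen documentation for the tool version in use.

```python
def make_bif(fsbl, bitstream, u_boot, image_name="image"):
    """Build boot.bif text with partitions in the order FSBL, bitstream, U-Boot."""
    lines = [
        f"{image_name}:",
        "{",
        f"    [bootloader]{fsbl}",  # the FSBL must be the first partition
        f"    {bitstream}",         # PL bitstream comes second
        f"    {u_boot}",            # U-Boot comes last
        "}",
    ]
    return "\n".join(lines) + "\n"


# File names as given in the text; the image label is arbitrary.
bif_text = make_bif("zynq_fsbl.elf", "system.bit", "u-boot.elf")
print(bif_text)
```

Saving this text as boot.bif next to the three input files gives Bootgen the input it needs for the command shown in step d).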


Fig-8 Step by Step Execution of Creating Custom Embedded OS on Zedboard

Fig-9 SD Card Various files needed to boot the system


RESULTS AND DISCUSSIONS
The hand-written Verilog code of the Fast Fourier Transform, implemented in
the Vivado IP Integrator, may contain many errors that need to be
rectified. Vivado checks the code by passing it through several evaluation
steps: Synthesis, Verification and Implementation. These steps give
detailed insight into the code; the project develops an RTL structure of
the IP and also creates a netlist from the details supplied by the user.

The previous chapter discussed in detail how to create a custom embedded OS
on the Zedboard with the help of an SD card, and the role of each file that
makes the Zedboard behave according to the user requirements. This is the
medium of interaction between the designer's software implementation and
the hardware implementation on the Zedboard. Fig-7 gives a brief idea of
the overall process followed to bring the FFT core onto the Zedboard.

Fig-10 Synthesis Result


Fig-11 IP fft_fifo_ip inclusion in the analog project
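The PS-side readout in this section relies on mmap() to make PL results visible to a user-space program. The paper's program is written in C and is not reproduced here; the Python sketch below only illustrates the mapping idea, using a temporary file as a stand-in for the device. On the board one would typically map a device node at the AXI peripheral's base address instead; that node, the address and the register offset used below are assumptions, not values from this project.

```python
import mmap
import os
import struct
import tempfile

REGION_SIZE = mmap.PAGESIZE  # map a single page
RESULT_OFFSET = 0x10         # hypothetical offset of a 32-bit result word

# Stand-in for the device node: a zero-filled temporary file of one page.
fd, path = tempfile.mkstemp()
os.write(fd, b"\x00" * REGION_SIZE)

# Pretend the hardware produced a result by storing a word at the offset.
os.lseek(fd, RESULT_OFFSET, os.SEEK_SET)
os.write(fd, struct.pack("<I", 0x1234ABCD))

# Map the region into the process address space and read it like memory.
region = mmap.mmap(fd, REGION_SIZE, access=mmap.ACCESS_READ)
(result,) = struct.unpack_from("<I", region, RESULT_OFFSET)
print(hex(result))

# Release the mapping and remove the stand-in file.
region.close()
os.close(fd)
os.remove(path)
```

Once the region is mapped, reading a result is an ordinary memory access at a fixed offset, so the display program does not need a separate read call per value.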

Fig-10 shows the synthesis result obtained from the interaction of the AXI
bus peripheral master and slave units with the Fast Fourier Transform IP.
After the overall process is complete, a program is run on the PS side of
the Zedboard to map the results computed in the PL onto the PS and display
them on the user console, both to check the correctness of the code and to
evaluate the speed of the Fast Fourier Transform relative to the ARM
processor. mmap() is used for mapping the outputs of the PL onto the PS; it
is a Unix system call that maps files or devices into memory. The C code
runs on the Zedboard, automatically reads the driver code, performs the
necessary allocation of data in memory and makes it available on the
display console.

CONCLUSION
The aim of this dissertation, to implement a custom application-specific
design on the Programmable Logic of the Zedboard, was accomplished with
optimized and correct output. The experiment showed that the PL, owing to
its parallel nature, can execute such a computation at a faster rate than
the ARM Processing System. Communication between the PL and PS was
accomplished with the AXI-standard bus, whose own set of peripheral
interfaces lets a user write custom logic and talk to the PS without
further complications. The software side gave a better perspective of the
System-on-Chip design process: the software components had to be packaged
into a proper bootable image, which was developed using the Software
Development Kit. Finally, an executable C program compatible with the ARM
Processing System was run and the result obtained.

ACKNOWLEDGMENT
I would like to thank my guide, Prof. Nitesh Guinde, for the immense
guidance he provided in accomplishing this dissertation. I would also like
to thank my College for providing access to materials and hardware
facilities whenever I required them.