Abstract—In recent times, most satellite applications consist of complex and computationally intensive data processing systems. The challenge is to meet the demands of onboard processing while keeping power consumption at a minimum. In this paper, we explore the scope of an FPGA-based neural network using the Residue Number System (RNS) for space-based applications. We propose an implementation that uses RNS arithmetic to exploit the parallelism present in neural networks for faster computation, thus meeting the onboard processing demands of satellites without the use of high-powered CPUs or GPUs.

Keywords—FPGA, RNS, Neural Network, Space Application

I. INTRODUCTION

A. Overview

A plethora of satellite applications demand high computing resources, which are generally satisfied by using power-hungry embedded CPUs or onboard GPUs. In this work, we explore the possibility of using an FPGA as a substitute in order to curb the high power demand and to meet the real-time data processing and transmission needs of the system. To show the feasibility of the proposed setup, we implement a neural network classifier acting as a payload data processing system on an FPGA. The onboard payload data processing system allows us to reduce the load on the main CPU, perform computation on the data collected by the satellite for onboard analysis, and transmit only the end result to the ground station, thus reducing the cost of data transmission.

The FPGA allows us to use an appropriately low-precision representation, which reduces hardware resources and increases clock frequency [5]. Another significant merit of the FPGA is lower power consumption than that of a traditional CPU or GPU. A previous work [5] showed that an FPGA-based neural network is about 10 times more efficient, in terms of performance per watt, than a GPU-based neural network.

Representing numbers in the Residue Number System decomposes a number into smaller units, allowing us to perform arithmetic operations on the numbers in a parallel manner. RNS also obviates the need for a carry mechanism and allows us to execute both addition and multiplication in the same time as that required by an addition operation [2].

B. Organisation of the Paper

The paper is organized as follows: In Section II, we discuss the related work done in the field of the Residue Number System and the FPGA implementation of neural networks. In Section III, we describe the key concepts of neural networks and the Residue Number System as background necessary for our work. In Section IV, we describe the proposed method along with the hardware flow. In Section V, we present an implementation to show the feasibility of our work. In Section VI, we conclude the paper with a few results obtained, and in Section VII, we provide a brief mention of the future work that can be done on this topic.

II. RELATED WORK

A lot of work has been done on the Residue Number System and its use in implementing neural networks. A comprehensive study covering the theory and implementation of the Residue Number System and its arithmetic is presented in [3]; it provides a very thorough insight into RNS usage and practical hardware implementations. Reference [4] presents a compilation of studies on FPGA and ASIC implementations of artificial neural networks, covering important parameters and constraints to be considered when implementing neural networks on hardware platforms.

A proposed implementation of a trained multilayer perceptron network is presented in [6]. A framework for neural network implementation is provided in [7]. Various types of neural networks require different design considerations for their hardware models. Hardware designs for Convolutional Neural Networks introduce many constraints due to their size and complexity. References [8] and [9] present implementations of CNNs on FPGA platforms.

An implementation of the Residue Number System in neural networks can be seen in [5], which proposes a Nested Residue Number System to realize a deep convolutional neural network.
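The carry-free, channel-parallel character of RNS arithmetic described above can be sketched in software. The following is a minimal illustration (in Python rather than HDL), using a small hypothetical moduli set {7, 11, 13} rather than the set used later in this work; Chinese Remainder Theorem reconstruction is included only to check the results against ordinary integer arithmetic.

```python
from math import prod

MODULI = (7, 11, 13)  # small example set for illustration only
M = prod(MODULI)      # dynamic range of this set

def to_rns(x):
    # Forward conversion: each residue channel is independent of the others.
    return tuple(x % m for m in MODULI)

def rns_add(a, b):
    # Addition proceeds channel by channel; no carry propagates between channels.
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, MODULI))

def rns_mul(a, b):
    # Multiplication is likewise one small modular operation per channel.
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, MODULI))

def from_rns(r):
    # Reverse conversion via the Chinese Remainder Theorem (Python 3.8+ pow).
    total = 0
    for ri, m in zip(r, MODULI):
        Mi = M // m
        total += ri * Mi * pow(Mi, -1, m)
    return total % M

x, y = 123, 456
assert from_rns(rns_add(to_rns(x), to_rns(y))) == (x + y) % M
assert from_rns(rns_mul(to_rns(x), to_rns(y))) == (x * y) % M
```

In hardware, each channel maps to its own small modular adder or multiplier operating in parallel, which is the source of the speedup exploited in this work.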
In each layer, the output vector from the previous layer and the weight-bias matrix of the current layer are used to calculate the neuron outputs; refer to equation (1). The output from each neuron in the current layer is used to create the output vector, which is then subjected to an activation function and passed as the input vector to the next layer. In order to exploit parallelism, we make use of M activation function modules, where M is the total number of neurons present in the current layer. We repeat this cycle for every layer of the neural network. The total number of clock cycles required to convert an input vector to its RNS representation depends upon the representation of the input binary-weighted numbers (32-bit or 64-bit). Finally, in the last layer, we use a comparator module to return the target class of the input vector. The final result is a ⌈log2 n⌉-bit output, where n is the total number of target classes.

V. IMPLEMENTATION

A. About the DataSet

In this paper, we worked with an in-house trained classifier on the data from the UCI Machine Learning Repository Crowdsourced Mapping DataSet [16]. The dataset was derived from geospatial data from two sources:
1) Landsat time-series satellite imagery from the years 2014-2015
2) Crowdsourced georeferenced polygons with land cover labels obtained from OpenStreetMap.
The data contained 28 features, and the neural network was trained to classify an input vector into one of six classes (water, farm, impervious, orchard, grass, forest). This dataset was chosen so as to show the application of this work in orbital satellites that could perform onboard data processing.

The neural network was trained on MATLAB 2017a with an accuracy of 94.37%. The input was standardized with respect to the training data so as to achieve optimum output. The activation function used for training was the Rectifier (ReLU).

B. Moduli Set Determination

As discussed before, we need to determine a precision criterion as well as the moduli set before we can work in the Residue Number System. The set of moduli used for the arithmetic operations in RNS also depends upon various factors of the pre-trained neural network and determines the architecture and complexity of the neural network. One fallout of working in the Residue Number System is the representation of numbers containing fractional parts. In our implementation, the non-integer inputs are scaled by a factor of 10^4 and then rounded to the nearest integer before they are used in the FPGA. Allowing the numbers to be scaled using a precision criterion, such as the one chosen above, does not cause a significant drop in the accuracy of the neural network and allows us to work in the Residue Number System. One can work with higher precision; however, that requires increasing the dynamic range of the system, which, depending on the moduli set taken, may require more hardware.

In this work, we determined the set {31, 37, 41, 43, 47, 53, 59, 61} to be our moduli set, with a dynamic range of 1.812×10^13. Arithmetic precisions for synthesis are 32 bits for conventional binary-weighted inputs and 6 bits for each modulus in the moduli set, thus resulting in a 48-bit RNS representation of each element of the input data vector after the forward conversion. The elements of the weight-bias matrix are also stored in their RNS representation corresponding to the moduli set taken above; thus each weight is also represented by 48 bits.

C. Hardware Realization

Figures 1 to 3 show the organisation of the hardware modules used for mapping the neural network onto the FPGA. Each hardware module can be realised as an all-ROM structure, all-combinational logic, or a combination of both. A simple and direct way to implement the Forward Converter Module is to have a sequential structure that consists of a lookup table that stores all the values of |2^j|_m, a modular adder, a counter, and an accumulator [3].
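The precision criterion and moduli set described in Section V-B can be sketched as follows. This is an illustrative Python model, not the synthesized design; the function names `quantize` and `forward_convert` are our own, but the scale factor 10^4 and the moduli set are taken from the text.

```python
from math import prod

MODULI = (31, 37, 41, 43, 47, 53, 59, 61)  # moduli set chosen in this work
M = prod(MODULI)                           # dynamic range, approx. 1.81e13
SCALE = 10**4                              # precision criterion from the text

def quantize(x):
    # Scale the fractional input and round to the nearest integer.
    return round(x * SCALE)

def forward_convert(x):
    # Negative values land in the upper half of the residue range [0, M).
    return tuple(x % m for m in MODULI)

# Each residue fits in 6 bits (largest modulus is 61 < 64),
# so 8 moduli give the 48-bit RNS representation stated in the text.
assert all(m < 2**6 for m in MODULI)

residues = forward_convert(quantize(-3.1416))
```

Any intermediate result whose magnitude stays within the dynamic range M is represented exactly, which is why raising the precision (a larger SCALE) may force a larger moduli set and hence more hardware.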
|X|_m = | Σ_{j=0}^{n−1} x_j · 2^j |_m    (6)
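The sequential Forward Converter structure from [3], evaluating equation (6), can be modeled in software as follows. This Python sketch mirrors the hardware roles described above: the dictionary plays the part of the ROM lookup table storing |2^j|_m, the loop index is the counter, and `acc` combines the modular adder and the accumulator.

```python
MODULI = (31, 37, 41, 43, 47, 53, 59, 61)
NBITS = 32  # width of the conventional binary-weighted input

# Lookup table analogous to the ROM: LUT[m][j] = |2^j|_m
LUT = {m: [pow(2, j, m) for j in range(NBITS)] for m in MODULI}

def forward_convert(x, m):
    # Scan the bits of X with a counter; for every set bit x_j,
    # modularly accumulate the precomputed residue |2^j|_m.
    acc = 0
    for j in range(NBITS):
        if (x >> j) & 1:                  # bit x_j of the input
            acc = (acc + LUT[m][j]) % m   # modular adder + accumulator
    return acc

x = 0xDEADBEEF
residues = tuple(forward_convert(x, m) for m in MODULI)
```

In the hardware version, the eight channels run in parallel (one small converter per modulus), and the clock-cycle count scales with NBITS, as noted in the discussion of the proposed method.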