
Java-Based Programmable Networked Embedded System
Architecture with Multiple Application Support

Sunghyun Lee, Kiwook Yun, Kiyoung Choi, Seongsoo Hong, Soomook Moon, Jeonga Lee*
School of Electrical Engineering, Seoul Natl Univ.
*Department of Computer Science, Chosun Univ.
S.Lee@mithra.snu.ac.kr

Abstract
In this paper, we address the problem of designing an
embedded system architecture for emerging
information appliances such as PDAs and IMT2000
terminals. These systems are characterized by
multiple-application support and network
connectivity. A programmable architecture is an
efficient way of implementing multiple-application
support. We propose an FPGA-based embedded system
architecture using Java as a software platform. We
define a set of native communication APIs for the
communication between Java applications and the FPGA.

1. Introduction
In traditional embedded systems design, the key
objective is to find an optimal architecture to perform a
single, specific application [1] [2]. For the system
architecture, ASICs and processors are considered
typical building blocks. The designer partitions
the application onto ASICs and processors according
to metrics such as processing power,
flexibility, and power consumption. Recently emerging
embedded systems in the area of information appliances
(e.g. PDAs and IMT2000 (International Mobile
Telecommunication) terminals [3]) differ from
traditional embedded systems in the following aspects:
1) They run multiple applications such as web browsers,
audio/video communication applications, etc.
2) They require network connectivity.
Designing a multiple-application embedded system
that requires considerable processing capability for each
application is a challenging task for the traditional
design approach, which in general yields an
architecture that is good for only one application but is
inferior or simply unsuitable for the others. Thus,
in many cases, the designer will fall into costly and
sometimes even non-convergent re-design loops to
satisfy cost and power consumption constraints, or at
best will produce an over-designed system. One way of
solving this problem is to introduce programmability into
the system architecture so that the embedded system can
be adapted to new applications.
To introduce programmability into the architecture,
we embed FPGAs into the system. We think an FPGA is a
viable option for the following reasons:
1) Recently introduced FPGAs have processing
capability and logic capacity close to those of ASICs.
2) By reprogramming the FPGAs, the embedded system
can adapt itself to a new application.
3) Although currently available FPGAs are not very
power-efficient, there are continuous efforts to
achieve low-power consumption through methods
such as applying low supply-voltage and power-down
mode [4].
With respect to embedded system management,
network connectivity is an opportunity in that it enables
downloading new applications from remote places. To
exploit such connectivity, we adopt Java [5] as a
software platform. Java is another enabling technology
for a programmable embedded system in that it can
download and execute new application code without
shutting down the entire system.
In this paper, we propose a Java-based embedded
system architecture with an FPGA coupled with a
standard processor. Within the architecture, we download
the Java code and FPGA bit-stream of a new application
over the network and program the FPGA. We also support
dynamic reconfiguration of the FPGA. For the
communication between hardware (FPGA) and software
(Java application code), we define a set of native APIs
that can be used to access the underlying hardware from
Java applications. To invoke native APIs from the Java
application, we use the Java Native Interface (JNI) [5]. In
this paper, we assume that the partitioning granularity is
the Java method, i.e. we implement some Java methods as
HW components (HW methods).
The remainder of the paper is organized as follows.
We explain related work in section 2, and give an
overview of the proposed architecture with respect to
both the hardware platform and the software platform in
section 3. We explain the native communication APIs in
detail in section 4. Finally, we show some experiments
conducted with the proposed system architecture in
section 5, and conclude in section 6.

2. Related Work
The most closely related work can be found in [6],
where Fleischmann et al. present a prototyping
environment for a re-configurable Java-based embedded
system. The environment consists of a Pentium PC and
an FPGA prototyping board connected via a PCI bus.
They report that 80-90% of the system's execution time
is consumed by communication over the PCI bus.
There have been a few research efforts on the
design of multiple-application embedded systems.
In [1], Kalavade and Subrahmanyam address the
design problem of multiple-application embedded
systems. They partition a given set of applications on an
architecture template that consists of one or more
processor cores, hardware accelerators, and coprocessors.
However, they do not use FPGAs in their target
architecture template.
In [2], Kienhuis proposes the Y-chart approach to
explore the design space of multiple-application
embedded systems. Starting from his stream-based
function (SBF) model of a set of applications he
produces quantitative performance numbers for
parameterized target architectures. For this purpose, he
constructs a simulator that can perform interpreted or
non-interpreted simulation while extracting the
performance figures.
Java is becoming more popular in the design process of
embedded systems. Some examples are:
1) System specification [7] [8], sometimes with
extensions for expressing real-time constraints.
2) SW implementation and execution platform,
sometimes with enhanced real-time capability [9].
To exploit the programmability of an FPGA, many
researchers propose run-time reconfiguration methods
that dynamically change the functionality of an FPGA
during system execution. In [10], we address the issue of
applying a run-time partial reconfiguration strategy to an
FPGA so that programming and execution can be
performed concurrently. Specifically, we try to
reduce the configuration overhead, which has a
significant impact on overall system performance.

3. Proposed FPGA-based System Architecture


An example of a HW component running in the FPGA,
which corresponds to a Java method, is shown in Figure
1. It has an input buffer and an output buffer to receive
arguments and store results, respectively. It also has
a control signal buffer used to manipulate and check the
state of the HW method, e.g. to give a start signal and
to check the done signal. These buffers are all mapped
onto the processor address space, and the processor can
access them with normal memory read/write instructions.
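
The buffer arrangement described above can be pictured with a small software model. The sketch below is only an illustration in plain Java, not the hardware itself: the class name HwMethodModel, the word offsets, the START/DONE encodings, and the squaring "core" are all assumptions made for the example. It shows the processor-side view, in which every interaction is an ordinary read or write at an address.

```java
/** Software model of one HW method as seen from the processor bus (illustrative only). */
class HwMethodModel {
    // Hypothetical word offsets inside the region decoded by this HW method.
    static final int CTRL = 0;   // control buffer: write START, read status
    static final int IN   = 1;   // input buffer: one int argument
    static final int OUT  = 2;   // output buffer: one int result

    static final int IDLE = 0, START = 1, DONE = 2; // assumed control encodings

    private final int base;                // first address decoded by this HW method
    private final int[] regs = new int[3]; // control, input, and output words

    HwMethodModel(int base) { this.base = base; }

    /** Address decoder: true when the bus address falls in this method's region. */
    boolean decodes(int addr) { return addr >= base && addr < base + regs.length; }

    /** Bus write issued by the processor. */
    void write(int addr, int value) {
        if (!decodes(addr)) throw new IllegalArgumentException("address not decoded: " + addr);
        int off = addr - base;
        regs[off] = value;
        if (off == CTRL && value == START) {
            // Stand-in for the HW method core: square the argument, then flag completion.
            regs[OUT] = regs[IN] * regs[IN];
            regs[CTRL] = DONE;
        }
    }

    /** Bus read issued by the processor. */
    int read(int addr) {
        if (!decodes(addr)) throw new IllegalArgumentException("address not decoded: " + addr);
        return regs[addr - base];
    }
}
```

In this model the processor writes the argument, writes START to the control word, and later reads the control word and the output word, which mirrors the start/done handshake described in the text.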
3.1 Hardware Platform
The hardware platform of the architecture consists of
a standard processor, an FPGA, and a system memory,
which are all connected through a shared processor bus
as shown in Figure 2.

Figure 1. Internals of a HW method.
(Diagram labels: processor bus, data bus, address bus,
address decoder, control buffer, input buffer, output
buffer, HW method core.)


Figure 2. Proposed hardware platform.
(Diagram labels: processor, shared system memory, network
interface, FPGA, FPGA local memory, configuration
controller, bitstream storage, shared bus.)


When the processor initiates a read or write operation
to some HW method, the corresponding HW method
responds to the bus activity with the help of its own
address decoder. Currently, we assume that the processor
initiates HW/SW communication and the FPGA
responds as a slave in the bus activity. A
configuration controller with bit-stream storage is also
included in the proposed architecture to support dynamic
reconfiguration of the FPGA. Since the configuration
controller takes responsibility for programming the
FPGA once the processor initiates it, the
processor can perform other useful jobs in parallel with
an FPGA programming job.
3.2 Software Platform
We use Java as a software platform as shown in Figure 3.
We have an embedded OS to provide services (e.g.
thread service and other hardware management services)
for the Java run-time environment. Java applications in
remote places can be downloaded into the embedded
system through the System Manager. The System
Manager implements three basic protocols:
1) Application code (including FPGA bit-stream)
download protocol
2) System maintenance related protocol for remote
management
3) Authentication protocol to control access to the
embedded system
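
One way the application code download protocol (item 1) could carry class files is as length-prefixed records on a socket stream, which a class loader then turns into classes. The sketch below is an illustration under stated assumptions: the record layout, the class name StreamClassLoader, and its methods are hypothetical and are not taken from the System Manager. DataInputStream, DataOutputStream, and ClassLoader.defineClass are standard Java APIs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Defines classes from records read off a stream such as Socket.getInputStream(). */
class StreamClassLoader extends ClassLoader {

    /** One downloaded item: a class name plus its class-file bytes. */
    static final class Record {
        final String name;
        final byte[] body;
        Record(String name, byte[] body) { this.name = name; this.body = body; }
    }

    /** Sender side: writes the name (UTF), the length, then the raw bytes. */
    static void writeRecord(OutputStream out, String name, byte[] body) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeUTF(name);
        data.writeInt(body.length);
        data.write(body);
        data.flush();
    }

    /** Receiver side: reads exactly one record in the layout written above. */
    static Record readRecord(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        String name = data.readUTF();
        byte[] body = new byte[data.readInt()]; // a real protocol would validate this length
        data.readFully(body);
        return new Record(name, body);
    }

    /** Reads one record and defines the class it carries. */
    Class<?> loadFrom(InputStream in) throws IOException {
        Record r = readRecord(in);
        return defineClass(r.name, r.body, 0, r.body.length);
    }

    /** Round-trips a record through an in-memory stream; returns the decoded record. */
    static Record roundTrip(String name, byte[] body) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            writeRecord(buf, name, body);
            return readRecord(new ByteArrayInputStream(buf.toByteArray()));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

An FPGA bit-stream could travel in a record of the same shape, although it would be stored rather than passed to defineClass.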
The System Manager contains a custom ClassLoader
[4] that is based on a socket connection. An application
in a remote place can be downloaded and executed by the
following procedure:
1) Get the authority to make the system enter
management mode.
2) Download the new application code, and then perform
initialization such as placing the FPGA bit-stream into
bit-stream storage.
3) Execute the new application.
In this paper, for simplicity, we assume that the buffers
of HW methods are mapped onto a pre-defined and fixed
physical address region.

Figure 3. Proposed software platform.
(Diagram labels: Java Application, System Manager, Java
Virtual Machine, Embedded OS, Native Library, Processor,
processor address space, direct physical address, HW
methods (FPGA).)

4. HW/SW Communication
When we use Java as a software platform, the
application cannot access the underlying hardware
directly. That is, in the proposed architecture, the
application running in the JVM on the processor cannot
directly access the physical address region mapped to the
buffers of HW methods in the FPGA. This can be solved in
two ways:
1) Modify the JVM to provide a direct access service to
the physical addresses of the processor.
2) Develop a native communication library so that an
application can access it through the Java Native
Interface (JNI) with some overhead.
While the first one results in a minimal-overhead
solution, modifying the JVM internals is a painful job
when we consider that we should follow the latest
releases of the JVM. Thus, we have developed a separate
native communication library outside of the JVM and
introduced a JNI layer (Figure 4 and Figure 5). In Figure
4, XXX represents a primitive Java data type such as
integer, float, double, etc.
The Java application accesses the native library
through the JNI layer shown in Figure 5. We show
pseudo code for the setParam( ) method as an
example. In the figure, CID is an object that contains the
buffer address information of a HW method. The CID is
defined uniquely for each HW method after
hardware/software partitioning, so that the addresses and
sizes of the buffers are all determined at application
design time. Furthermore, since only one configuration
controller is present in the system architecture, we cannot
program two or more HW methods into an FPGA at the
same time. For this reason, a dummy variable
ConfigStatus is included to reflect the current status of
the configuration controller. To ensure that no multiple
FPGA programming operations occur simultaneously,
the corresponding status variable ConfigStatus is
accessed only within a critical section. That is, the method
configHW( ) that accesses the configuration controller
is static and synchronized.

int hwWriteXXX(int addr, XXX p);
int hwWriteArrayXXX(int addr, XXX[ ] p);
XXX hwReadXXX(int addr);
XXX[ ] hwReadArrayXXX(int addr);
int hwConfig(int cf_mem_addr, int bitstr_size);

Figure 4. Native communication APIs.
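
Because a single configuration controller serves the whole FPGA, calls that end in hwConfig must not overlap. In Java, a static synchronized method locks on the class object, so at most one thread can be inside it at a time. The sketch below illustrates this with a simulated controller. The class ConfigGuard, the IDLE/BUSY values, and the sleep that stands in for bit-stream loading are assumptions, and no real hardware is touched.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Serializes access to a simulated configuration controller (illustrative only). */
class ConfigGuard {
    static final int IDLE = 0, BUSY = 1;   // assumed status values
    static int configStatus = IDLE;        // plays the role of ConfigStatus

    // Instrumentation for the demonstration, not part of the technique itself.
    static final AtomicInteger inside = new AtomicInteger();
    static final AtomicInteger maxInside = new AtomicInteger();

    /** Only one thread at a time can run this, because the lock is the ConfigGuard class. */
    static synchronized int configHW(int bitstreamId) {
        int now = inside.incrementAndGet();
        maxInside.accumulateAndGet(now, Math::max);
        configStatus = BUSY;
        try {
            Thread.sleep(5);               // stand-in for programming the FPGA
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        configStatus = IDLE;
        inside.decrementAndGet();
        return 0;                          // 0 = success in this sketch
    }

    /** Fires several concurrent requests and reports the largest overlap observed. */
    static int maxOverlap(int threads) {
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            pool[i] = new Thread(() -> configHW(id));
            pool[i].start();
        }
        for (Thread t : pool) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return maxInside.get();
    }
}
```

Without the synchronized keyword, several threads could be inside configHW at once and the status variable would no longer describe a single controller.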


class HWInterface {
    static int ConfigStatus;  // current status of the configuration controller

    // Pseudo code: the body sketches what the native side of setParam( ) does.
    public static native int setParam(CID hw_cid, Object P)
    {
        if (type_of_P == XXX)
            err = hwWriteXXX(hw_cid.addr, (XXX) P);
        return err;
    };
    public static native int getResult(CID hw_cid, Object R);
    public static native int setCMD(CID hw_cid, int cmd);
    public static native int getStatus(CID hw_cid);
    public synchronized static native int configHW(CID hw_cid);
}

Figure 5. Java Native Interface layer.
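
The type test in the setParam( ) pseudo code can be pictured in ordinary Java as a dispatch on the runtime class of the argument. The sketch below is a hypothetical illustration: HwAccess stands in for the native functions of Figure 4 (with XXX expanded to int and double only), SimAccess replaces the FPGA with an array so that the code runs without hardware, and Cid carries a single assumed field. A real JNI layer would declare the methods native and implement them in C.

```java
/** Stand-in for a few instances of hwWriteXXX / hwReadXXX from Figure 4. */
interface HwAccess {
    int hwWriteInt(int addr, int p);
    int hwWriteDouble(int addr, double p);
    int hwReadInt(int addr);
}

/** Array-backed replacement for the FPGA address region (no hardware involved). */
class SimAccess implements HwAccess {
    private final long[] words = new long[64];

    public int hwWriteInt(int addr, int p) { words[addr] = p; return 0; }

    public int hwWriteDouble(int addr, double p) {
        words[addr] = Double.doubleToLongBits(p);
        return 0;
    }

    public int hwReadInt(int addr) { return (int) words[addr]; }

    /** Test helper: reads a word back as a double. */
    double peekDouble(int addr) { return Double.longBitsToDouble(words[addr]); }
}

/** Hypothetical component ID: only the input-buffer address is modelled here. */
class Cid {
    final int addr;
    Cid(int addr) { this.addr = addr; }
}

class HwInterfaceSketch {
    static final int ERR_UNSUPPORTED_TYPE = -1; // assumed error code

    /** Chooses the typed write that matches the runtime class of p. */
    static int setParam(HwAccess hw, Cid cid, Object p) {
        if (p instanceof Integer) {
            return hw.hwWriteInt(cid.addr, (Integer) p);
        }
        if (p instanceof Double) {
            return hw.hwWriteDouble(cid.addr, (Double) p);
        }
        return ERR_UNSUPPORTED_TYPE;
    }
}
```

Keeping the hardware access behind an interface is a design choice of this sketch: it lets the same dispatch code run against a simulation during development and against the native library on the board.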


// Left-hand side: original code
methodA( )
{
    // do something..
    int a = objA.m1(3); // SW
    int b = objB.m2(4); // HW
    int c = a + b;
    // do something..
}

// Right-hand side: the same method using the native library
methodA( )
{
     // do something..
     // cid2 = id of HW method m2
 1   HWInterface.configHW(cid2);
 2   Object P = new Integer(4);
 3   HWInterface.setParam(cid2, P);
 4   HWInterface.startHW(cid2); // HW start
 5   int a = objA.m1(3); // SW
 6   Object R = new Integer( );
 7   while (HWInterface.getStatus(cid2) == 0)
 8       ; // wait until HW method finishes
 9   HWInterface.getResult(cid2, R);
10   int b = ((Integer) R).intValue( );
11   int c = a + b;
     // do something..
}

Figure 6. An example usage of the native library.


Figure 6 shows an example of using the API defined
above. To execute the method 'objB.m2( )' (shown on the
left-hand side) in the FPGA, we first program the FPGA
with the corresponding bit-stream (line 1, right-hand
side). Then we copy the parameters into the HW method
input buffer (line 3) and start the HW method (line 4).
For parameter passing, since Java is an object-oriented
language, it is important to include not only the explicitly
declared arguments of a method but also implicitly
accessed member variables. Finally, we check the status
buffer of the HW method to wait until the computation
finishes (lines 7 and 8), and then copy back the result
(line 9). Since the processor and the FPGA run
independently except for the read/write operations (for
parameter passing and result copying), we can perform
other useful SW jobs while the HW method is running
(line 5). Currently, we use a polling approach, and we try
to reduce the polling overhead by delaying the access to
the result until we really need it. We are planning to
develop an interrupt-based approach to eliminate the
polling overhead.
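
The overlap between the HW method and software work, together with polling that is deferred until the result is needed, can be mimicked in plain Java. In the sketch below a background thread plays the role of the FPGA. The class OverlapDemo, the functions m1 and m2, and the use of an AtomicInteger as the status buffer are assumptions for illustration and do not reproduce the native library.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Mimics lines 4 to 11 of Figure 6 with a thread in place of the FPGA. */
class OverlapDemo {
    static final AtomicInteger status = new AtomicInteger(0); // 0 = running, 1 = done
    static volatile int outputBuffer;                          // simulated output buffer

    static int m1(int x) { return x + 1; }   // hypothetical SW method
    static int m2(int x) { return x * 2; }   // hypothetical function realised as a HW method

    /** "Starts" the HW method: the work proceeds on another thread. */
    static void startHW(int arg) {
        status.set(0);
        Thread hw = new Thread(() -> {
            outputBuffer = m2(arg);
            status.set(1);                   // raise the done flag only after the result is stored
        });
        hw.start();
    }

    static int methodA() {
        startHW(4);                          // HW start
        int a = m1(3);                       // SW work overlaps with the HW method
        while (status.get() == 0) {          // poll only when the result is needed
            Thread.yield();
        }
        int b = outputBuffer;                // copy back the result
        return a + b;
    }
}
```

An interrupt-based variant would replace the polling loop with a blocking wait that the completion event releases, which is the direction named as future work above.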

5. Experiments
We have implemented an embedded system board
with an ARM710T processor [11] and a Virtex FPGA
[12]. The board contains a network interface and two
debug interfaces: a PCI host interface and a serial port
(Figure 7). We have ported a commercial embedded OS
and a small Java runtime environment to it.

Figure 7. Networked embedded system board.
(Diagram labels: ARM710T, PLX, network interface, UART,
EPROM, SDRAM, SRAM, FPGA, RFPGA, XC 4085, configuration
controller, bitstream storage, FPGA configuration bus
(SELECT MAP bus), debugging interface (PCI), debugging
interface (serial port).)

We have developed an H.263 [13] encoder (Figure 8)
as an experimental example. We performed behavioral
synthesis for each block to obtain the corresponding
hardware modules. For partitioning, we assigned all the
core encoding tasks to the FPGA except the interfacing
tasks, which provide the encoding tasks with the input
source stream and receive the encoded output stream.
Table 1 and Table 2 summarize the implementation
results. Table 1 shows the delay of some of the core
encoding tasks. Based on these figures, we applied a
pipelining technique to maximize the throughput. Table 2
shows the resultant delay of encoding a single frame. In
the table, FPGA cycles represent the cycles consumed in
the FPGA for encoding tasks, and Processor cycles
indicate those consumed in the processor for interfacing
tasks. The overall frame rate is about 10 frames per
second (25 MHz divided by roughly 2.3 to 2.5 million
cycles per frame), which is quite a good result
considering the clock speed of 25 MHz.

Figure 8. Example H.263 encoder.
(Diagram labels: SRC, MBDist, Predict, DCT, Quant, Zigzag,
BitGenerator, SINK, Dequant, IDCT, Restore, Reference,
MBCollect, Scheduler, Control; signal labels: "Point, PB",
"MB, TRB", "QP, Mode", "Mode, QP", "Mode, MV",
"PQuant, TR, TRB, Mode, QP, MV".)

Task        Cycles
Dct/IDct    1190
Quant       569
Dquant      450
Predict     1640
BitGen      1199

Table 1. Core encoding tasks delay (CLK@25 MHz).

Cycles      Intra frame   Inter frame
FPGA        1168200       1170230
Processor   1140649       1327475
Total       2308849       2497705

Table 2. Frame encoding delay (CLK@25 MHz).

6. Conclusion
In this paper, we present a networked embedded
system architecture, which aims at multiple-application
support. We have investigated the possibility of using
FPGAs and Java to implement the embedded system. We
argue that the performance and flexibility offered by
modern FPGAs are key to the multiple-application
support.
Currently, we are conducting more thorough
experiments to refine the architecture. Specifically, we
are focusing on the problem of reducing the communication
overhead, which is problematic in the proposed
architecture.

7. Acknowledgements
This work was supported by grant No. 98-0101-04-01-3
from the interdisciplinary Research program of the
KOSEF.

References

A. Kalavade and P.A. Subrahmanyam, "Hardware/software
partitioning for multi-function systems," Proc. Int. Conf. on
Computer Aided Design, pp. 516-521, 1997.

International Telecommunication Union (ITU),
"International mobile telecommunications".

Xilinx, Inc., Low Power Documentation,
http://www.xilinx.com/products/xaw/pwr/pwr_doc.htm

Sun Microsystems, Inc., Java 2 SDK Documentation,
http://java.sun.com/products/jdk/download-pdf-ps.html

Josef Fleischmann, Klaus Buchenrieder, and Rainer Kress,
"A Hardware/Software Prototyping Environment for
Dynamically Reconfigurable Embedded Systems," Proc. Int.
Workshop on Hardware-Software Codesign, pp. 105-109,
1998.

R. Helaihel and K. Olukotun, "Java as a Specification Language
for Hardware-Software Systems," Proc. Int. Conf. on
Computer Aided Design, pp. 516-521, 1997.

Claudio Passerone et al., "Modeling reactive systems in
Java," Proc. Int. Workshop on Hardware-Software Codesign,
pp. 15-19, 1998.

T. Kuhn and W. Rosenstiel, "Java Based Modeling and
Simulation of Digital Systems on Register Transfer Level,"
Int. Workshop on System Design Automation, 1998.

Sun Microsystems, Inc., Embedded Java Application
Environment, http://java.sun.com/products/embeddedjava/

B. Jeong, S. Yoo, S. Lee, and K. Choi, "Hardware-Software
Cosynthesis for Run-time Incrementally Reconfigurable
FPGAs," Proc. Asia and South Pacific Design Automation
Conference, 2000.

ARM Ltd., ARM710T Data Sheet,
http://www.arm.com/Documentation/Datasheets/PDF/DDI0086B.pdf

Xilinx, Inc., Virtex 2.5V Field Programmable Gate Arrays
Datasheet v1.9, http://www.xilinx.com/partinfo/ds003.pdf

Telenor, Telenors H.263 Software,
http://www.nta.no/brukere/DVC/h263_software/
