
Microcontrollers in FPGAs

Tomas Södergård, University of Vaasa

Contents

- Finite state machine
- Design of instructions
- Architecture
- Register file
- Hardware aspects of MCUs
- Comparison of microcontrollers
  - Picoblaze, Nios II and Atmega328P
- Conclusions

Finite state machine

Moore machine
- Output depends only on the current state (Pedroni 2004: 159)

Mealy machine
- Output depends on the current state and the external input

Synchronisation (Zwolinski 2000: 82)
- Clock
- Reset

Programmable state machine
- General-purpose FSM (Meyer-Baese 2007: 537, Chu 2008: 324-326)
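The difference between the Moore and Mealy definitions above can be sketched in Python with a rising-edge detector. This is only an illustrative model: the state names and function names are assumptions and do not come from the cited books. Each loop iteration stands for one clock cycle.

```python
def moore_edge_detector(inputs):
    """Moore machine: the output is a function of the current state only."""
    state = "ZERO"  # state after reset
    outputs = []
    for x in inputs:
        # Output logic looks at the state, never at the input x.
        outputs.append(1 if state == "EDGE" else 0)
        # Next-state logic, applied at the clock edge.
        if state == "ZERO":
            state = "EDGE" if x else "ZERO"
        else:  # "EDGE" or "ONE"
            state = "ONE" if x else "ZERO"
    return outputs


def mealy_edge_detector(inputs):
    """Mealy machine: the output depends on the current state and the input."""
    state = "ZERO"  # state after reset
    outputs = []
    for x in inputs:
        # Output logic combines the state with the current input.
        outputs.append(1 if (state == "ZERO" and x) else 0)
        state = "ONE" if x else "ZERO"
    return outputs


print(moore_edge_detector([0, 1, 1, 0, 1]))  # [0, 0, 1, 0, 0]
print(mealy_edge_detector([0, 1, 1, 0, 1]))  # [0, 1, 0, 0, 1]
```

In this sketch the Mealy version reacts in the same cycle as the input change and needs one state fewer, while the Moore version reports the edge one clock cycle later.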

Programmable state machine

[Figure: block diagram of a programmable state machine with the blocks Control, Program Memory, Data Memory and ALU]

Instructions

Operations

ALU operations
- Add
- Mul
- Not

Data move
- Move
- Push
- Pop

Branch
- Compare
- Jump
- Loop

Addressing modes

Addressing modes describe how the operands for an operation are located (Meyer-Baese 2007: 544).

Implied addressing (Meyer-Baese 2007: 544-545)
- Location is implicitly defined
- No operands in the instruction

Immediate addressing (Meyer-Baese 2007: 546)
- One operand in the instruction
- The operand is a constant

Addressing modes

Register addressing (Meyer-Baese 2007: 546-547)
- Data is fetched from fast CPU registers
- Used for ALU operations in most RISC machines

Memory addressing (Meyer-Baese 2007: 547-549)

Direct addressing
- An additional register is needed because of the limited instruction size
- In base addressing the additional register contains a constant that is added to the constant in the instruction
- In page addressing the additional register contains the most significant bits of the address; the full address is obtained by concatenation

Indirect addressing
- The additional register contains the full address
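As a rough illustration of the memory addressing modes above, the following Python sketch computes the effective address for base, page and indirect addressing. The function name, the mode strings and the 8-bit address field are assumptions made for the example, not taken from Meyer-Baese.

```python
def effective_address(mode, field=0, reg=0, field_bits=8):
    """Return the memory address selected by an instruction.

    field      -- address constant stored in the instruction (hypothetical layout)
    reg        -- content of the additional register
    field_bits -- assumed width of the address field in the instruction
    """
    if mode == "base":
        # Register content and instruction constant are added.
        return reg + field
    if mode == "page":
        # Register supplies the most significant bits; concatenate with the field.
        return (reg << field_bits) | field
    if mode == "indirect":
        # Register already holds the full address.
        return reg
    raise ValueError(f"unknown addressing mode: {mode}")


print(hex(effective_address("base", field=0x20, reg=0x100)))  # 0x120
print(hex(effective_address("page", field=0x2A, reg=0x3)))    # 0x32a
print(hex(effective_address("indirect", reg=0x1234)))         # 0x1234
```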

Data flow

An instruction contains at least one (the first) of the following:
- Operation code
- Operands
- Result location

Parameters affecting the instruction size
- Number of operations
- Number of operands
- Memory size
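How the three parameters above interact can be estimated with a small Python calculation. It assumes a deliberately simple encoding (one binary-coded opcode field plus one full address field per operand), which real instruction sets rarely follow exactly; the function names are hypothetical.

```python
def bits_needed(count):
    """Number of bits required to distinguish `count` different values."""
    return (count - 1).bit_length()


def instruction_bits(num_operations, num_operands, memory_words):
    """Estimate the instruction width under the simple encoding assumed above."""
    opcode_bits = bits_needed(num_operations)
    address_bits = bits_needed(memory_words)
    return opcode_bits + num_operands * address_bits


# 16 operations and 256 addressable words, for zero- to three-address machines
for operands in range(4):
    print(operands, instruction_bits(16, operands, 256))
# 0 4
# 1 12
# 2 20
# 3 28
```

Under these assumptions every extra operand adds a full address field, which is why the data flow type has such a strong effect on the instruction width.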

Zero-address CPU: stack machine

- No operands in the instruction
- All operations are performed on the two top elements of the stack

Code example:

    Push #5
    Push #3
    Add
    Pop Reg1

(Meyer-Baese 2007: 552-553)
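The stack machine example above can be traced with a minimal Python interpreter. This is a sketch that supports only the three instructions used on the slide; the function name and the dictionary used for registers are assumptions.

```python
def run_stack_program(program):
    """Execute a zero-address program and return the register contents."""
    stack = []
    registers = {}
    for instruction in program:
        op, *args = instruction.split()
        if op == "Push":
            # Immediate constants are written as #value.
            stack.append(int(args[0].lstrip("#")))
        elif op == "Add":
            # The operation has no operands: it works on the two top elements.
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "Pop":
            registers[args[0]] = stack.pop()
        else:
            raise ValueError(f"unsupported instruction: {instruction}")
    return registers


print(run_stack_program(["Push #5", "Push #3", "Add", "Pop Reg1"]))  # {'Reg1': 8}
```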

One-address CPU: accumulator machine

- One operand in the instruction
- The second operand is the value of the accumulator
- The destination is the accumulator

Code example:

    Load #5
    Add #3
    Store Reg1

(Meyer-Baese 2007: 553-554)
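The accumulator example above behaves as in the following Python sketch. Only immediate operands and the three instructions from the slide are modelled, and the names are chosen for illustration.

```python
def run_accumulator_program(program):
    """Execute a one-address program and return the register contents."""
    accumulator = 0
    registers = {}
    for instruction in program:
        op, operand = instruction.split()
        if op == "Load":
            accumulator = int(operand.lstrip("#"))
        elif op == "Add":
            # The accumulator is both the second operand and the destination.
            accumulator += int(operand.lstrip("#"))
        elif op == "Store":
            registers[operand] = accumulator
        else:
            raise ValueError(f"unsupported instruction: {instruction}")
    return registers


print(run_accumulator_program(["Load #5", "Add #3", "Store Reg1"]))  # {'Reg1': 8}
```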

Two-address CPU

- The instruction contains two operands
- The destination of the result is the location of the first operand

Code examples:

    Move Reg1, #5
    Add Reg1, #3

    Move Reg2, #5
    Move Reg1, #3
    Add Reg1, Reg2

(Meyer-Baese 2007: 555)

Three-address CPU

- The instruction contains three addresses
- Destination and sources can be specified separately

Code examples:

    Move Reg2, #5
    Move Reg3, #3
    Add Reg1, Reg2, Reg3

    Add Reg1, #5, #3

(Meyer-Baese 2007: 555-556)
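The two-address and three-address forms can be compared with one small Python interpreter. It is a sketch that handles only Move and Add as used in the slide examples; the function names are assumptions.

```python
def run_register_program(program):
    """Execute two- or three-address Move/Add instructions on a register set."""
    registers = {}

    def value(operand):
        # '#n' is an immediate constant, anything else is a register name.
        return int(operand[1:]) if operand.startswith("#") else registers[operand]

    for instruction in program:
        op, _, rest = instruction.partition(" ")
        operands = [item.strip() for item in rest.split(",")]
        if op == "Move":
            dest, src = operands
            registers[dest] = value(src)
        elif op == "Add" and len(operands) == 2:
            # Two-address form: the first operand is also the destination.
            dest, src = operands
            registers[dest] = registers[dest] + value(src)
        elif op == "Add" and len(operands) == 3:
            # Three-address form: destination and sources are separate.
            dest, src1, src2 = operands
            registers[dest] = value(src1) + value(src2)
        else:
            raise ValueError(f"unsupported instruction: {instruction}")
    return registers


print(run_register_program(["Move Reg1, #5", "Add Reg1, #3"]))  # {'Reg1': 8}
print(run_register_program(["Add Reg1, #5, #3"]))               # {'Reg1': 8}
```

The same sum takes two instructions in the two-address form and one in the three-address form, at the cost of a wider instruction.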

Architecture

Von Neumann architecture
- Shared data and program memory = one bus

Harvard architecture
- Separate data and program memory = two buses

Super Harvard architecture
- Separate X and Y data memories and separate program memory = three buses

Fast cache registers for intermediate results

(Meyer-Baese 2007: 558)

[Figure: block diagrams of the three architectures. Von Neumann: CPU with one combined data and program memory. Harvard: CPU with separate data and program memories. Super Harvard: CPU with data X, data Y and program memories.]

Register file

- Two-dimensional bit array
- Has a mechanism for storing data to the register file
- Has a mechanism for reading data from the register file
- Consumes many logic elements (LEs) in an FPGA

The register file in the example discussed on the following pages holds 16 registers of 8 bits each (8x16) and consumes 211 LEs (Meyer-Baese 2007: 560)

VHDL register file example

Entity declaration (Meyer-Baese 2007: 560)

    library ieee;
    use ieee.std_logic_1164.all;

    entity reg_file is
      generic (W : integer := 7;    -- bit width minus one (8-bit registers)
               N : integer := 15);  -- number of registers minus one (16 registers)
      port (clk, reg_ena : in  std_logic;
            data         : in  std_logic_vector(W downto 0);
            rd, rs, rt   : in  integer range 0 to 15;
            s, t         : out std_logic_vector(W downto 0));
    end;

VHDL register file example

Architecture: type declarations and input mux (Meyer-Baese 2007: 560)

    architecture fpga of reg_file is
      subtype bitw is std_logic_vector(W downto 0);
      type SLV_NxW is array (0 to N) of bitw;
      signal r : SLV_NxW;
    begin
      -- Input mux: write data to register rd at the clock edge.
      -- Register 0 is never written; reg_ena is not used in this excerpt.
      Mux : process
      begin
        wait until clk = '1';
        if rd > 0 then
          r(rd) <= data;
        end if;
      end process Mux;

VHDL register file example

Architecture: demux for outputs (Meyer-Baese 2007: 560)

      -- Output demux: register 0 always reads as zero.
      Demux : process (r, rs, rt)
      begin
        if rs > 0 then
          s <= r(rs);
        else
          s <= (others => '0');
        end if;
        if rt > 0 then
          t <= r(rt);
        else
          t <= (others => '0');
        end if;
      end process Demux;
    end fpga;
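The behaviour of the VHDL register file can be checked against a small Python model. This is a behavioural sketch only: the class and method names are assumptions, registers start at zero instead of the undefined value a VHDL simulator would show, and timing is reduced to one method call per clock edge.

```python
class RegisterFile:
    """Behavioural model of a register file whose register 0 always reads zero."""

    def __init__(self, n_regs=16, width=8):
        self.mask = (1 << width) - 1  # keeps values within the register width
        self.r = [0] * n_regs

    def clock(self, rd, data):
        """Model one clock edge: store data in register rd unless rd is 0."""
        if rd > 0:
            self.r[rd] = data & self.mask

    def read(self, rs, rt):
        """Model the output demux: return the two selected source values."""
        s = self.r[rs] if rs > 0 else 0
        t = self.r[rt] if rt > 0 else 0
        return s, t


rf = RegisterFile()
rf.clock(rd=3, data=0x1AB)  # value is truncated to 8 bits
rf.clock(rd=0, data=0xFF)   # ignored: register 0 cannot be written
print(rf.read(3, 0))        # (171, 0)
```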

FSM vs PSM

FSM
- Special purpose
- State register
- Generates certain outputs based on simple logic
- Next state can be specified freely

PSM
- General purpose
- Program counter (PC)
- Generates outputs based on encoding and decoding
- Next state is normally an increment of the PC; branch instructions are the exceptions

(Chu 2008: 324)
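The role of the program counter in a PSM can be sketched in Python as follows. The three-instruction set (load, sub, jnz) is hypothetical and chosen only to show that the PC normally increments and that a branch is the exception.

```python
def run_psm(program, max_steps=100):
    """Run a tiny program; return the accumulator and the sequence of PC values."""
    pc = 0
    accumulator = 0
    trace = []
    for _ in range(max_steps):
        if pc >= len(program):
            break  # ran past the last instruction
        op, arg = program[pc]
        trace.append(pc)
        if op == "load":
            accumulator = arg
            pc += 1  # normal case: next state is PC + 1
        elif op == "sub":
            accumulator -= arg
            pc += 1
        elif op == "jnz":
            # Branch: the next state is taken from the instruction instead.
            pc = arg if accumulator != 0 else pc + 1
        else:
            raise ValueError(f"unsupported operation: {op}")
    return accumulator, trace


program = [("load", 2), ("sub", 1), ("jnz", 1)]
print(run_psm(program))  # (0, [0, 1, 2, 1, 2])
```

The program is data in a memory, so the same hardware model runs any instruction sequence, whereas an FSM would need new next-state logic for each task.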

Structural aspects for FPGAs

- Harvard architecture is better for FPGA MCUs
- Reason: memory on an FPGA is more limited in size (and slower)

Data flow (Meyer-Baese 2007: 556-557)

A more complex instruction implies:
- Easier assembly programming
- More complicated C compiler development
- Longer instruction
- Fewer instructions needed
- Lower speed
- Larger constant in immediate addressing

Comparison of instructions
Parameter       Picoblaze     Nios II        Atmega328P
Architecture    Harvard       Harvard        Harvard
Register file   16 x 8 bit    32 x 32 bit    32 x 8 bit
Clk/instr.      2             1              1-2
Instr. count    57            256            131
Data mem.       64 B          ?              2 kB
Instr. width    18 bit        32 bit         ?
LE count        ~200          >700
Data flow       2 address     3 address      2 address

(Chu 2008: 323, 326-327, 329, 332-337; Altera Nios II/e; Altera Nios II/f; Altera 2011: 3, 11-12; Atmega328P: 1, 8; Moshovos 2007)

Recently developed MCU


Article published in September 2011 by Martin Schoeberl. Properties:

- Name: Leros
- 16-bit microcontroller
- Accumulator machine (one-address CPU)
- 200 LEs
- 2-stage pipeline: fetch and decode
- 2 clock cycles per instruction
- Portable: successfully tested on Altera and Xilinx devices
- Assembler available

Conclusions: useful technology?

Area optimisation
- Algorithms like the FFT may consume fewer resources, but become slower as a result (Meyer-Baese 2007: 537)
- Is processing speed the main purpose of FPGA technology?

Reuse of code
- Controller and datapath partitioning (Zwolinski 2000: 160)
- General-purpose vs special-purpose state machine (Chu 2008: 324)

Complexity
- Moves some of the complexity from the VHDL (or Verilog) design to the compiler

Conclusions: useful technology?

Speed
- No parallelism anymore
- Backwards development?

Especially useful when:
- Part of a larger circuit
- Multi-controller systems that perform simpler tasks

Sources

Atmega328P. 8-bit Microcontroller with 4/8/16/32K Bytes In-System Programmable Flash. [online] [cited 17.11.2011]. URL: http://www.atmel.com/dyn/resources/prod_documents/doc8271.pdf

AVR assembly. Beginners introduction to AVR assembler. [online] [cited 17.11.2011]. URL: http://www.avr-asm-tutorial.net/avr_en/beginner/index.html

Altera Nios II (2011). Processor Architecture. [online] [cited 18.11.2011]. URL: http://www.altera.com/literature/hb/nios2/n2cpu_nii51002.pdf

Altera Nios II/e Core. Economy. [online] [cited 18.11.2011]. URL: http://www.altera.com/devices/processor/nios2/cores/economy/ni2-economycore.html

Altera Nios II/f Core. Fast for Performance Critical Applications. [online] [cited 18.11.2011]. URL: http://www.altera.com/devices/processor/nios2/cores/fast/ni2fast-core.html

Chu, Pong P. (2008). FPGA Prototyping by VHDL Examples. Ohio: Wiley.

Meyer-Baese, U. (2007). Digital Signal Processing with Field Programmable Gate Arrays. 3. Edition. Heidelberg: Springer.

Moshovos, Andreas (2007). Using Assembly Language to Write Programs. [online] [cited 18.11.2011]. URL: http://www.eecg.toronto.edu/~moshovos/ECE243-2009/lec5%20%20Intro%20to%20Assembly.htm

Schoeberl, Martin (2011). Leros: A Tiny Microcontroller for FPGAs. Field Programmable Logic and Applications (FPL), 2011 International Conference. 10-14.

Zwolinski, Mark (2000). Digital System Design with VHDL. 2. Edition. Essex: Pearson Education Limited.
