Lecture 7
FPGA Devices
Reading
Required
P. Chu, FPGA Prototyping by VHDL Examples, Chapter 2.2, FPGA
Recommended
S. Brown and Z. Vranesic, Fundamentals of Digital Logic with VHDL Design, Chapter 3.6.5, Field-Programmable Gate Arrays
Recommended Reading
Xilinx, Inc., Spartan-3E FPGA Family
Module 1: Introduction, Features, Architectural Overview, Package Marking
Module 2: Configurable Logic Block (CLB) and Slice Resources, Dedicated Multipliers
ECE 448 FPGA and ASIC Design with VHDL
Required Reading
Xilinx, Inc., Spartan-3 Generation FPGA User Guide: Extended Spartan-3A, Spartan-3E, and Spartan-3 FPGA Families
Chapter 5, Using Configurable Logic Blocks (CLBs)
Chapter 6, Using Look-Up Tables as Distributed RAM
Chapter 7, Using Look-Up Tables as Shift Registers (SRL16) [up to Library Primitives]
FPGA: Field-Programmable Gate Array
What is an FPGA?
[Figure: FPGA layout showing configurable logic blocks, block RAMs, and I/O blocks]
FPGAs
Off-the-shelf
Low development cost
Short time to market
Reconfigurability
ASICs
High performance
Low power
Low cost in high volumes
~ 85%
Xilinx
Programmable
Logic Devices
High-performance families
Virtex (220 nm)
Virtex-E, Virtex-EM (180 nm)
Virtex-II (130 nm)
Virtex-II PRO (130 nm)
Virtex-4 (90 nm)
Virtex-5 (65 nm)
Virtex-6 (40 nm)
Virtex-7 (28 nm)
Low Cost Family
Spartan/XL derived from XC4000
Spartan-II derived from Virtex
Spartan-IIE derived from Virtex-E
Spartan-3 (90 nm)
Spartan-3E (90 nm) logic optimized
Spartan-3A (90 nm) I/O optimized
Spartan-3AN (90 nm) non-volatile
Spartan-3A DSP (90 nm) DSP optimized
Spartan-6 (45 nm)
Artix-7 (28 nm)
CLB Structure
Programmable interconnect
Programmable logic blocks
[Figure: array of CLBs; each CLB contains four slices, and each slice contains two logic cells]
[Figure: simplified slice with two look-up tables (inputs F1 to F4, output O), two blocks of carry and control logic, and two storage elements (D, CK, EC, SR); other signals shown: CIN, CLK, CE, BY, F5IN, S, X, XB, Y]
CLB Structure
Storage element
  Latch or flip-flop
  Set and reset
  True or inverted inputs
  Synchronous or asynchronous control
Look-up tables are the primary elements for logic implementation.
Each LUT can implement any function of 4 inputs.

[Figure: a LUT with inputs x1, x2, x3, x4 and output y, shown with the truth tables of two example functions. Example 1 is the two-input function y = NOT (x1 AND x2).]

x1 x2 x3 x4 | y (example 1) | y (example 2)
 0  0  0  0 |       1       |       0
 0  0  0  1 |       1       |       1
 0  0  1  0 |       1       |       0
 0  0  1  1 |       1       |       0
 0  1  0  0 |       1       |       0
 0  1  0  1 |       1       |       1
 0  1  1  0 |       1       |       0
 0  1  1  1 |       1       |       1
 1  0  0  0 |       1       |       0
 1  0  0  1 |       1       |       1
 1  0  1  0 |       1       |       0
 1  0  1  1 |       1       |       0
 1  1  0  0 |       0       |       1
 1  1  0  1 |       0       |       1
 1  1  1  0 |       0       |       0
 1  1  1  1 |       0       |       0
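The idea that a 4-input LUT is simply a stored 16-entry truth table can be sketched behaviorally in VHDL. This is only an illustration under stated assumptions: the entity name lut4_sketch and the constant INIT are hypothetical names (not vendor primitives), x1 is assumed to be the most significant address bit, and the contents are the 16 output values 0,1,0,0,0,1,0,1,0,1,0,0,1,1,0,0 of one of the example truth tables above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Behavioral sketch of one 4-input LUT viewed as a 16 x 1 truth table.
-- lut4_sketch and INIT are illustrative names, not vendor primitives.
entity lut4_sketch is
  port (
    x1, x2, x3, x4 : in  std_logic;  -- x1 is assumed to be the MSB of the address
    y              : out std_logic);
end entity lut4_sketch;

architecture behavioral of lut4_sketch is
  -- Truth-table contents; index 0 corresponds to x1 x2 x3 x4 = 0000.
  constant INIT : std_logic_vector(0 to 15) := "0100010101001100";
  signal addr : std_logic_vector(3 downto 0);
begin
  -- The four inputs form the address of the stored output value.
  addr <= x1 & x2 & x3 & x4;
  y    <= INIT(to_integer(unsigned(addr)));
end architecture behavioral;
```

Changing only the INIT string changes the implemented function, which is why any function of 4 inputs fits in one LUT.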
[Figure: two look-up tables, each usable as LUT, ROM, or RAM (inputs A1 to A4, WS, DI; output D), driven by F1 to F4 and BX, together with the F5 multiplexer and GXOR]
[Figure: a 32-entry truth table (output Y) split across two LUTs whose results are combined into OUT]
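A function of five inputs can be built from two 4-input LUTs and a 2-to-1 multiplexer: each LUT holds one half of the 32-entry truth table and the fifth input selects between them. The sketch below is a behavioral illustration only. The entity name func5_sketch, the generics INIT_LO and INIT_HI, and their default contents (which together give a 5-input XOR) are assumptions made for the example; bx is used as the name of the select input to match the figure label.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Sketch: 5-input function = two 16-entry tables + multiplexer.
-- All names and table contents here are hypothetical examples.
entity func5_sketch is
  generic (
    -- Half of the truth table used when bx = '0' (4-input XOR).
    INIT_LO : std_logic_vector(0 to 15) := "0110100110010110";
    -- Half of the truth table used when bx = '1' (4-input XNOR).
    INIT_HI : std_logic_vector(0 to 15) := "1001011001101001");
  port (
    a  : in  std_logic_vector(3 downto 0);  -- shared 4-bit LUT address
    bx : in  std_logic;                     -- fifth input, multiplexer select
    y  : out std_logic);
end entity func5_sketch;

architecture rtl of func5_sketch is
  signal lut_lo, lut_hi : std_logic;
begin
  lut_lo <= INIT_LO(to_integer(unsigned(a)));
  lut_hi <= INIT_HI(to_integer(unsigned(a)));
  -- The multiplexer picks which half of the truth table is visible.
  y <= lut_hi when bx = '1' else lut_lo;
end architecture rtl;
```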
Distributed RAM
Synchronous write
Synchronous/asynchronous read
  Accompanying flip-flops used for synchronous read

One LUT = RAM16X1S (D, WE, WCLK, A0 to A3)
Two LUTs =
  RAM32X1S (D, WE, WCLK, A0 to A4), or
  RAM16X2S (D0, D1, WE, WCLK, A0 to A3, O0, O1), or
  RAM16X1D (D, WE, WCLK, A0 to A3, SPO, DPRA0 to DPRA3, DPO)
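The distributed RAM behavior listed above (synchronous write, asynchronous read) can be described in plain VHDL. The following is a minimal sketch, not the Xilinx primitive itself: the entity name ram16x1s_sketch is hypothetical, the port names only mirror those on the slide, and whether a synthesis tool maps this description onto a LUT configured as RAM depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Behavioral sketch of a 16 x 1 single-port RAM with synchronous write
-- and asynchronous read. The entity name is an illustrative assumption.
entity ram16x1s_sketch is
  port (
    wclk : in  std_logic;
    we   : in  std_logic;
    a    : in  std_logic_vector(3 downto 0);
    d    : in  std_logic;
    o    : out std_logic);
end entity ram16x1s_sketch;

architecture rtl of ram16x1s_sketch is
  signal mem : std_logic_vector(15 downto 0) := (others => '0');
begin
  -- Synchronous write: data is stored on the rising edge of wclk.
  write_port : process (wclk)
  begin
    if rising_edge(wclk) then
      if we = '1' then
        mem(to_integer(unsigned(a))) <= d;
      end if;
    end if;
  end process write_port;

  -- Asynchronous read: the output follows the address without a clock.
  o <= mem(to_integer(unsigned(a)));
end architecture rtl;
```

Registering o in a clocked process would give the synchronous-read variant that uses the accompanying flip-flop.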
Shift Register
Dynamically addressable delay up to 16 cycles
For programmable pipeline
Cascade for greater cycle delays
Use CLB flip-flops to add depth
[Figure: LUT configured as a shift register, drawn as a chain of D flip-flops with clock enable; inputs IN, CE, CLK, and DEPTH[3:0]; output OUT]
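A dynamically addressable delay of up to 16 cycles can be sketched in VHDL as a 16-bit shift register with a multiplexed tap. This is a behavioral sketch under assumptions: the entity name srl16_sketch and the port names din and dout are hypothetical, and mapping onto a single LUT is up to the synthesis tool (a register without reset, as used here, is the form such tools usually expect).

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Sketch of a 16-stage shift register with a selectable output tap.
-- Names are illustrative; this is not the vendor primitive.
entity srl16_sketch is
  port (
    clk   : in  std_logic;
    ce    : in  std_logic;
    din   : in  std_logic;
    depth : in  std_logic_vector(3 downto 0);
    dout  : out std_logic);
end entity srl16_sketch;

architecture rtl of srl16_sketch is
  signal sr : std_logic_vector(15 downto 0) := (others => '0');
begin
  -- Shift by one position on every enabled rising clock edge.
  shift : process (clk)
  begin
    if rising_edge(clk) then
      if ce = '1' then
        sr <= sr(14 downto 0) & din;
      end if;
    end if;
  end process shift;

  -- depth = n selects the value delayed by n + 1 enabled clock cycles,
  -- so depth = "1111" gives the maximum delay of 16 cycles.
  dout <= sr(to_integer(unsigned(depth)));
end architecture rtl;
```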
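Shift Register
[Figure: two parallel 64-bit data paths. The first passes through Operation A (4 cycles) and Operation B (8 cycles), 12 cycles in total. The second passes through Operation C (3 cycles). The difference, 12 - 3 = 9 cycles, is the 9-cycle imbalance.]
Register-rich FPGA
Allows for addition of pipeline stages to increase throughput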
Examples:
Determine the amount of Spartan-3 resources needed to implement a given circuit.

Circuit 1: Top level
[Figure: sixteen registers R0 to R15 with signals m, run, and clk, and data signals a, b, c, d]

Circuit 1: F function
[Figure: F function built from a 2-to-4 decoder (inputs w1, w0, En; outputs y3 to y0), a <<<3 block (inputs x3 to x0; outputs y3 to y0), a block with inputs numbered 0 to 7, and a full adder (cin, s, cout); signals a, b, c, d, e, f, g, h, y]

Circuit 2: Top level
[Figure: sixteen registers R0 to R15 with signals run and clk, and data signals a, b, c, d, e]

Circuit 2: F function
[Figure: F function built from a priority encoder (inputs w3 to w0; outputs y1, y0), a >>2 block (inputs x3 to x0; outputs y3 to y0), a block with inputs numbered 0 to 7, and a half adder (cout); signals a, b, c, d, e, g, h, i]
Full-adder
[Figure: full adder FA with inputs x, y, cin and outputs cout, s]
x + y + cin = (cout s)_2

x  y  cin | cout  s
0  0   0  |  0    0
0  0   1  |  0    1
0  1   0  |  0    1
0  1   1  |  1    0
1  0   0  |  0    1
1  0   1  |  1    0
1  1   0  |  1    0
1  1   1  |  1    1
Full-adder
Alternative implementations

x  y | cout | s
0  0 | 0    | cin
0  1 | cin  | NOT cin
1  0 | cin  | NOT cin
1  1 | 1    | cin
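Full-adder
Alternative implementations
Implementation used to generate fast carry logic in Xilinx FPGAs

x  y | cout
0  0 | y
0  1 | cin
1  0 | cin
1  1 | y

p = x XOR y
g = y
s = p XOR cin = x XOR y XOR cin

[Figure: the LUT (inputs A2, A1) computes p from x and y; hardwired (fast) logic consists of a carry multiplexer that passes g to Cout when p = 0 and Cin when p = 1, and an XOR gate that produces S]
[Figure: carry logic routing running from LSB to MSB]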
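The fast-carry form of the full adder can be written directly from the definitions p = x XOR y, g = y, and s = p XOR cin. The sketch below is a behavioral illustration; the entity name fa_fast_carry is hypothetical, and the comments indicate which part would sit in the LUT and which in the hardwired logic, as an assumption based on the figure.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Full adder in the propagate/generate form used for fast carry logic.
-- The entity name is an illustrative assumption.
entity fa_fast_carry is
  port (
    x, y, cin : in  std_logic;
    s, cout   : out std_logic);
end entity fa_fast_carry;

architecture rtl of fa_fast_carry is
  signal p, g : std_logic;
begin
  p <= x xor y;   -- propagate: computed in the LUT
  g <= y;         -- generate: when p = '0', x equals y, so cout = y

  s <= p xor cin; -- sum: dedicated XOR gate

  -- Carry multiplexer: pass cin when propagating, otherwise pass g.
  cout <= cin when p = '1' else g;
end architecture rtl;
```

Only cin-to-cout passes through the multiplexer, so in a multi-bit adder the carry can travel along the dedicated chain without going through LUTs or general routing.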
Embedded Multipliers
Inferred Multiplier
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity mult18x18 is
  generic (
    word_size   : natural := 18;     -- operand width in bits
    signed_mult : boolean := true);  -- true: signed, false: unsigned
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(word_size-1 downto 0);
    b   : in  std_logic_vector(word_size-1 downto 0);
    c   : out std_logic_vector(2*word_size-1 downto 0));
end entity mult18x18;

architecture infer of mult18x18 is
begin
  -- Registered product; the generic selects signed or unsigned arithmetic.
  process (clk)
  begin
    if rising_edge(clk) then
      if signed_mult then
        c <= std_logic_vector(signed(a) * signed(b));
      else
        c <= std_logic_vector(unsigned(a) * unsigned(b));
      end if;
    end if;
  end process;
end architecture infer;
Unsigned: 1111 x 1111 = 11100001 (15 x 15 = 225)
Signed:   1111 x 1111 = 00000001 (-1 x -1 = 1)
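The difference between the unsigned and signed products of the same bit patterns can be checked in simulation with the numeric_std types. This is a small self-checking sketch; the entity name mult4_check and the constant OPERAND are hypothetical names introduced only for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Simulation-only check of the 4-bit example: the same operands "1111"
-- give different products depending on their interpretation.
entity mult4_check is
end entity mult4_check;

architecture sim of mult4_check is
  constant OPERAND : std_logic_vector(3 downto 0) := "1111";
begin
  check : process
  begin
    -- Unsigned: 15 x 15 = 225 = "11100001"
    assert std_logic_vector(unsigned(OPERAND) * unsigned(OPERAND)) = "11100001"
      report "unsigned product mismatch" severity failure;

    -- Signed: (-1) x (-1) = 1 = "00000001"
    assert std_logic_vector(signed(OPERAND) * signed(OPERAND)) = "00000001"
      report "signed product mismatch" severity failure;

    wait;  -- stop the process after the checks
  end process check;
end architecture sim;
```

In numeric_std the product of two 4-bit operands is 8 bits wide, which is why both expected values are written with 8 bits.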
CORE Generator
Block RAM
[Figure: Spartan-3 dual-port block RAM with Port A and Port B]
Organization     Data width (data + parity)   Address range
16k x 1          1                            0 to 16,383
8k x 2           2                            0 to 8,191
4k x 4           4                            0 to 4,095
2k x (8+1)       8+1                          0 to 2,047
1024 x (16+2)    16+2                         0 to 1,023
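[Figure: single-port block RAM with data buses DI[w-p-1:0] and DO[w-p-1:0]]
[Figure: dual-port block RAM with data buses DIA[wA-pA-1:0] and DOA[wA-pA-1:0] on port A, and DIB[wB-pB-1:0] and DOB[wB-pB-1:0] on port B]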
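One of the block RAM organizations, 1024 x (16+2), can be described as a synchronous-read memory in VHDL. The sketch below is an assumption-laden illustration: the entity name bram_1k_x18_sketch and the port names are hypothetical, the read-before-write behavior is one of several possible choices, and whether a tool infers a block RAM from this description depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Sketch of a single-port 1024 x 18 memory with synchronous read.
-- Names are illustrative; inference of block RAM is tool dependent.
entity bram_1k_x18_sketch is
  port (
    clk  : in  std_logic;
    en   : in  std_logic;
    we   : in  std_logic;
    addr : in  std_logic_vector(9 downto 0);    -- 1024 locations
    din  : in  std_logic_vector(17 downto 0);   -- 16 data + 2 parity bits
    dout : out std_logic_vector(17 downto 0));
end entity bram_1k_x18_sketch;

architecture rtl of bram_1k_x18_sketch is
  type ram_t is array (0 to 1023) of std_logic_vector(17 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Registered read; on a write this returns the old contents.
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```

The registered read is the key difference from the LUT-based distributed RAM, whose read can be asynchronous.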
Input/Output Blocks (IOBs)
[Figure: IOB with three paths, each with a storage element (D, Q, EC, SR): three-state control (three-state FF enable), output path (output FF enable), and input path (direct input and registered input, input FF enable); shared clock and set/reset]
IOB Functionality
IOB provides the interface between the package pins and the CLBs
Each IOB can work as a uni- or bi-directional I/O
Outputs can be forced into high impedance
Inputs and outputs can be registered (advised for high-performance I/O)
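The IOB functions listed above (bidirectional pin, high-impedance output, registered input and output) can be sketched in VHDL as follows. This is only an illustration: the entity name iob_sketch and all port names are hypothetical, the active-high output enable is an assumption, and whether the registers are actually placed in the IOB depends on the tool and its constraints.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Sketch of a bidirectional pin with registered output, registered
-- three-state control, and registered input. Names are illustrative.
entity iob_sketch is
  port (
    clk      : in    std_logic;
    pad      : inout std_logic;  -- package pin
    out_data : in    std_logic;  -- value to drive onto the pin
    out_en   : in    std_logic;  -- '1' drives the pin, '0' releases it
    in_data  : out   std_logic); -- registered copy of the pin value
end entity iob_sketch;

architecture rtl of iob_sketch is
  signal out_reg : std_logic := '0';
  signal oe_reg  : std_logic := '0';
begin
  process (clk)
  begin
    if rising_edge(clk) then
      out_reg <= out_data;  -- output path register
      oe_reg  <= out_en;    -- three-state control register
      in_data <= pad;       -- input path register
    end if;
  end process;

  -- The output is forced into high impedance when not enabled.
  pad <= out_reg when oe_reg = '1' else 'Z';
end architecture rtl;
```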
FPGA Nomenclature
XC3S100E-4CP132
3S...E: Spartan 3E family
100: 100 k equivalent logic gates
-4: speed grade (-4 = standard performance)
CP: package type
132: 132 pins