
Spintronic Logic Gates for Spintronic Data Using Magnetic Tunnel Junctions

Shruti Patil, Andrew Lyle, Jonathan Harms, David J. Lilja and Jian-Ping Wang Department of Electrical and Computer Engineering University of Minnesota Twin Cities

{pati0036,lylex031,harms047,lilja,jpwang}@umn.edu

Abstract: The emerging field of spintronics is undergoing exciting developments with the advances recently seen in spintronic devices, such as magnetic tunnel junctions (MTJs). While they make excellent memory devices, recently they have also been used to accomplish logic functions. The properties of MTJs are greatly different from those of electronic devices like CMOS semiconductors. This makes it challenging to design circuits that can efficiently leverage the spintronic capabilities. The current approaches to achieving logic functionality with MTJs include designing an integrated CMOS and MTJ circuit, where CMOS devices are used for implementing the required intermediate read and write circuitry. The problem with this approach is that such intermediate circuitry adds overheads of area, delay and power consumption to the logic circuit. In this paper, we present a circuit to accomplish logic operations using MTJs on data that is stored in other MTJs, without intermediate electronic circuitry. This reduces the performance overheads of the spintronic circuit while also simplifying fabrication. With this circuit, we discuss the notion of performing logic operations with a non-volatile memory device and compare it with the traditional method of computation with separate logic and memory units. We find that the MTJ-based logic unit has the potential to offer a higher energy-delay efficiency than that of a CMOS-based logic operation on data stored in a separate memory module.

I. INTRODUCTION

Traditionally, computing has been achieved through the use of charge, a property of electrons that has single-handedly carried technology to unimaginable limits. So far, little has been done to leverage the property of ‘spin’ of electrons. Currently substantial research efforts are being devoted to develop our understanding and utilization of electron spin for computing. The field of spin electronics, or ‘spintronics’, strives to exploit electron spins along with charges to increase the advantages obtained from electronic circuits. In recent years, researchers have succeeded in developing and using spintronic devices such as the magnetic tunnel junction (MTJ). These devices operate on the principle of tunneling magnetoresistance (TMR), an effect seen at and below micro-scale dimensions ([1], [2]). This makes them highly scalable, and hence high integration densities are possible with MTJs [3]. Being ferromagnetic in nature, they are also capable of retaining their states in the absence of power, which eliminates static power dissipation. These properties of MTJs have been leveraged to obtain dense, low-power and non-volatile memory [4]. From the study of MTJs so far, other properties that have been observed are programmability, noise-resistance and radiation-hardness ([3], [5]). The combination of these desirable properties in a single device makes them promising devices for applications in many areas of digital computing.

As technology nodes shrink in size, the current state-of-the-art CMOS technology is facing challenges such as scalability limits, power dissipation and device variability. By exploiting the spintronic effects in devices, it may be possible to overcome some of these challenges. For example, the properties of non-volatility and low power consumption of spintronic devices can bring power enhancements to CMOS-based designs. Therefore, it is necessary to investigate techniques and means to utilize the spintronic properties in computing circuits and evaluate their potential for improving computer performance.

In previous works, MTJ-based memories have been fabricated commercially as Magnetic Random Access Memories (MRAMs) [4]. In this type of memory, data is intrinsically stored as a spintronic state. Logic operations using MTJs have been demonstrated [5] and other components, such as hybrid flip-flops [6], SRAMs [7] and adders [8], have been designed, simulated and fabricated. These works have taken the critical first steps towards spintronic computation and demonstrated the capability of the MTJ devices for efficient logic operations. They have also established the advantages of including spintronic devices within logic modules. However, these designs include more electronic components than spintronic components. This has been necessitated by the requirement to read or sense the spintronic data as an electronic signal and then to write it in MTJs using current signals. These intermediate circuits add integration complexity, power consumption, area and delay overheads to logic modules, and hence should be minimized to gain the full advantage of the spintronic technology.

To enable an efficient use of MTJs for logic functions, we examine the problem of performing logic operations directly on data that is stored in an MTJ in the spintronic form, and investigate the possibility of avoiding the need for intermediate electronic circuitry to convert it into electronic signals. In this paper, we present a logic circuit to achieve this objective. Without intermediate devices, the delay and power consumption of the circuit are expected to decrease. A significant feature of this circuit is its simplicity of operation as well as fabrication.

Fig. 1. Magnetic Tunnel Junction: (a) Basic structure; (b) CIMS operation of MTJ

I_A       I_B       Net I    Direction of I through MTJ    Result state of MTJ-C (preset H)
-I (0)    -I (0)    -2I      bottom → top                  H (1)
-I (0)    +I (1)    0        -                             H (1)
+I (1)    -I (0)    0        -                             H (1)
+I (1)    +I (1)    +2I      top → bottom                  L (0)

Fig. 2. NAND operation using CIMS-based MTJs

The proposed circuit demonstrates a spintronic logic circuit that can both store and process data. This dual capability provides an opportunity to implement logic operations inside a non-volatile memory unit, thus reducing the communication overhead between logic and memory units. With our circuit, we investigate the notion of accomplishing logic using the MTJ devices within a single logic-and-memory unit, and compare it to the von Neumann approach of computation with separate logic and memory units. We measure the delay and power consumption of a circuit that accomplishes logic operations on data stored in a memory cell, implemented using CMOS-based circuits and the proposed MTJ-based circuit. We show that a single bit operation using the MTJ-based logic setup shows an improved energy-delay efficiency compared to a CMOS-based logic operation configuration that reads data from a 180nm 256x3 memory module and a 130nm 512x3 memory module.

II. THE MAGNETIC TUNNEL JUNCTION (MTJ) DEVICE

The magnetic tunnel junction (Fig. 1(a)) is a sandwich of two layers of ferromagnetic materials separated by a thin insulating barrier made of a metal oxide like AlO or MgO [9]. The two ferromagnetic layers, the free layer and the fixed layer, are magnetized when the spins of their electrons are all aligned in a single direction. With a parallel relative orientation of the magnetization of the two layers, the resistance of the device (R_P) is lower than its resistance with an anti-parallel relative orientation (R_AP). Thus, the device exhibits two distinct resistance values, low and high, which are used to denote the digital states of logic-0 and logic-1. Usually, the fixed layer is magnetized in one direction and held constant by using additional layers. The magnetization of the free layer is controlled externally to set the resistance of the device to a high or low value as desired. One way of controlling the free layer is by applying a magnetic field to it. This technique is known as field-induced magnetization switching (FIMS). With the FIMS technique, programmable logic has been demonstrated using MTJs [10]. This has further led to the development of logic circuits such as a full adder [11], a 1-bit ALU [12] and a 3-bit gray counter [13].

An improved technique of controlling the direction of the free layer magnetization uses a recently discovered mechanism called spin torque transfer (STT) ([14], [15], [16]). With this effect, currents passing through an MTJ have been shown to magnetize the device. As shown in Fig. 1(b), an input current (I) larger than or equal to the threshold current (I_C) passing from the free layer to the fixed layer (top to bottom) magnetizes the free layer in a direction parallel to the fixed layer, and a reverse current magnetizes it in the opposite direction. This method of controlling the magnetization of the free layer is known as current induced magnetization switching (CIMS). With this technique, the inputs to the MTJ are currents +I or -I, which can be denoted as logic-1 and logic-0 respectively. The resistance of the device can be measured by passing a small sense current (I_sense) through the device.

The CIMS device operation has been leveraged for performing logic operations and demonstrated in [5]. By constructing multiple current lines that act as inputs, the state of an MTJ is controlled by the net current that passes through it. Fig. 2 shows an example of an MTJ accomplishing the NAND operation. The MTJ is preset to the logic-1 state. Inputs are currents -I (logic-0) and +I (logic-1), with I = I_C/2. Logical inputs (0,0), (0,1) and (1,0) do not affect the state of the MTJ. Only an input of (1,1) generates a net current of +I_C through the MTJ that changes the magnetization of the device to a logic-0 state, giving the NAND truth table.

Recent commercially developed Magnetic Random Access Memories (MRAMs) utilize the CIMS technique to write data to MTJ devices [17]. A CMOS-based write circuitry converts data voltage into appropriate signals that generate directional currents through an MTJ. Reading of MTJ devices is accomplished using CMOS-based sense amplifiers, in a manner similar to reading any DRAM memory cell. Sense amplifiers allow different bias requirements to be met easily, and hence using sense amplifiers for reading spintronic memory is beneficial when MTJ-based storage devices are used within CMOS circuits that are performing computation.
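To make the current-summing NAND gate of Fig. 2 concrete, the following Python sketch models the MTJ as a simple threshold element. It is only an illustration: the critical current is normalised to 1 (an assumption, since the gate depends only on the ratio I = I_C/2), and the function name cims_nand is ours rather than anything defined in the cited work.

```python
# Illustrative threshold model of the CIMS NAND gate of Fig. 2.
I_C = 1.0        # critical current, normalised to 1 (assumption)
I = I_C / 2      # magnitude of each input current, as stated in the text


def cims_nand(a, b, preset=1):
    """Return the final logic state of the MTJ for input bits a and b."""
    # Logic-1 is a current of +I, logic-0 a current of -I; the MTJ sees the sum.
    net = (I if a else -I) + (I if b else -I)
    if net >= I_C:       # top -> bottom: parallel alignment, low resistance
        return 0
    if net <= -I_C:      # bottom -> top: anti-parallel alignment, high resistance
        return 1
    return preset        # sub-critical current leaves the preset state unchanged


truth_table = {(a, b): cims_nand(a, b) for a in (0, 1) for b in (0, 1)}
print(truth_table)
```

Only the (1,1) input reaches +I_C and flips the preset logic-1 state, which is the behaviour listed in the table of Fig. 2.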
If the same concept is used for computation of logic operations using MTJs themselves, the circuit would include intermediate electronic circuitry to convert signals between the spintronic devices. A possible circuit is shown in Fig. 3, where the sense amplifier senses the resistance difference between the MTJ to be read (MTJ-A) and a reference MTJ (MTJ-Ref) in a low state and outputs a voltage of 0V or +Vdd, which is then fed into a write circuitry consisting of current mirror circuits that generate the critical current for the logic-MTJ. Clearly, when it is desired to perform simple logic operations on data stored in MTJs using MTJs themselves, this extra level of electronic circuitry within the spintronic unit puts extra overheads on the circuit. This motivates the need for a simpler mechanism that performs the function of a sense amplifier, but entirely in the spintronics domain with as little extra circuitry as possible. We address this requirement in this work. We present a logic circuit that senses or ‘reads’ the data in MTJs and computes the result of a logic operation in a third MTJ with no intermediate electronic circuitry. We describe this circuit in the next section.

Fig. 3. Logic operation using an MTJ for data stored in two MTJs with intermediate electronic circuitry for reading and writing data

III. CIRCUIT FOR LOGIC OPERATIONS USING MTJS

Noting that the inherent spintronic state of an MTJ is a resistance, we propose a logic circuit that senses the net resistance of a circuit to generate a current through it. The circuit consists of three MTJs A, B and C connected as shown in Fig. 4(a) or Fig. 4(b). The inputs are stored as a spintronic state in MTJs A and B. The MTJ that performs the logic operation and stores its result within itself is MTJ-C.

In Fig. 4(a) MTJ-C performs the NAND or NOR operation on the data stored in input MTJs A and B. MTJ-C initially has a logic state of low, while input MTJs A and B have resistance-type data, i.e. a spintronic state corresponding to logic-0 or logic-1. Since the input devices are connected in parallel, the circuit current gets divided into the two branches without subjecting any input device to critical current values. This ensures that the states of the ‘input’ MTJs remain unchanged through the operation. On applying the bias voltage V_MTJ, the total current that flows through the circuit experiences the resistance of all three devices. If this current is above the critical value, it changes the state of the result MTJ (MTJ-C). The circuit current then experiences the new total resistance of the circuit and settles into a steady state. If the initial current is not above the critical current value, the result MTJ retains its initial state.

The value of V_MTJ controls the operation performed by the circuit. For the NOR operation, this value is selected so that the current that flows through the circuit is greater than I_C only when MTJs A and B are both low, while for the NAND operation, the current is greater than I_C for the first three rows of the truth table, when either or both the inputs are low. Thus, the same circuit performs both logic operations at different bias voltages.

Fig. 4(b) shows a variation of the circuit where the current passing through the logic-MTJ is in a direction from top to bottom. In essence, this can also be achieved by the circuit in Fig. 4(a) by simply interchanging the voltages on terminals V_MTJ and Gnd. For the OR operation, the value of V_MTJ is adjusted such that the current flowing through the circuit is greater than I_C only when MTJs A and B are both low, while for the AND operation, the bias voltage is such that the current is greater than I_C for the first three rows. This is summarized in Table I.

Fig. 4. Proposed MTJ-based circuit for logic operation: (a) An entirely MTJ-based circuit for NAND/NOR operation; (b) An entirely MTJ-based circuit for AND/OR operation

TABLE I
SELECTION OF BIAS VOLTAGE FOR IMPLEMENTATION OF THE FOUR LOGIC OPERATIONS

Logic operation    Implementation (set V_MTJ such that)          Voltage requirement for delay of 1ns for MTJ fabricated in [18]
NOR                I > I_C when both inputs are low              1.7V
NAND               I > I_C when either or both inputs are low    1.8V
OR                 I > I_C when both inputs are low              2.5V
AND                I > I_C when either or both inputs are low    2.6V
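The switching rule described in this section can be sketched as a static resistive model in Python. This is an illustration under stated assumptions, not the SPICE model used in the paper: MTJ-A and MTJ-B in parallel are placed in series with MTJ-C, MTJ-C flips as soon as the initial current V_MTJ / R_total reaches I_C, and switching dynamics and delay are ignored. The resistance and critical current values are those quoted for the device of [18] in Section IV, the bias voltages come from Table I, and the AND/OR preset to the high state follows Section IV. All function names are ours.

```python
R_LOW, R_HIGH = 3472.0, 5902.0   # ohms, device values quoted in Section IV
I_C = 320e-6                     # amperes, critical current quoted in Section IV


def resistance(bit):
    """Map a logic bit to an MTJ resistance (0 -> low, 1 -> high)."""
    return R_HIGH if bit else R_LOW


def result_state(v_bias, a, b, preset):
    """Final state of MTJ-C in a static threshold model (assumption)."""
    r_a, r_b = resistance(a), resistance(b)
    r_inputs = r_a * r_b / (r_a + r_b)              # MTJ-A and MTJ-B in parallel
    current = v_bias / (r_inputs + resistance(preset))
    # A current at or above I_C flips MTJ-C away from its preset state.
    return 1 - preset if current >= I_C else preset


def truth_table(v_bias, preset):
    """Results for the inputs (0,0), (0,1), (1,0), (1,1), in that order."""
    return [result_state(v_bias, a, b, preset) for a in (0, 1) for b in (0, 1)]


print("NOR at 1.7V:", truth_table(1.7, preset=0))
print("OR  at 2.5V:", truth_table(2.5, preset=1))
print("AND at 2.6V:", truth_table(2.6, preset=1))
```

NAND is left out of the printed cases because, in this simplified model, its mixed-input case sits very close to the switching threshold at the tabulated 1.8V, so the outcome depends on rounding rather than on the logic itself.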

IV. VERIFICATION AND FABRICATION

We simulated the circuit in Fig. 4(a) using a SPICE model of the MTJ [19] for functional verification of the NOR operation. MTJ device parameters reported for a fabricated device [18] were used for the simulation: R_low = 3472 Ω, R_high = 5902 Ω, calculated I_C = 320 µA for a device of size 120 nm × 240 nm. The bias voltage (V_MTJ) calculated for the function of NOR, at a delay of 1ns, is 1.66V. Note that the NOR operation can be obtained even for a lower voltage, albeit with a longer delay, while it can be obtained with a smaller delay at a higher voltage.

Fig. 5 shows the resistance of the three devices for the four combinations of inputs after a bias voltage of 1.7V is applied to the circuit in Fig. 4(a). During NOR operation, the resistance of MTJ-C changes to a high state when MTJs A and B are in the low state. This occurs at about 300ps. Fig. 6(a) shows the circuit current for MTJ-A = MTJ-B = R_low in the NOR circuit. The initial current is above the critical value I_C, which changes the resistance of MTJ-C, and the current then settles into a steady state with a magnitude less than I_C. The waveforms also show that the current that flows through the individual branches of the input MTJs does not change their state.

Fig. 5. (Color online) SPICE simulation for functional verification for the NOR operation with bias voltage = 1.7V. Resistance (ohms) of MTJs A, B and C shown vs Time (s) for the four input combinations (A=0,B=0; A=0,B=1; A=1,B=0; A=1,B=1).

The same circuit (Fig. 4(a)) can be used for NAND operation when a bias voltage (V_MTJ) of 1.8V is applied. This bias voltage is large enough to generate a current greater than I_C for the first three rows of the truth table. The circuit current settles into a steady state after the NAND operation, as shown in Fig. 6(b).

For the AND and OR operations, MTJ-C is initially preset to a high resistance state. A bias voltage of 2.5V is required for accomplishing the OR function, while a bias voltage of 2.6V is required for the AND operation. This can either be accomplished by the circuit in Fig. 4(b) or by the circuit in Fig. 4(a) by interchanging the terminals of V_MTJ and Gnd. Thus, in general, the circuit in Fig. 4(a) is programmable for the four logic operations by applying suitable voltages to the two terminals.

The differences in bias voltages depend on the TMR ratio of the devices. This is the ratio (R_H - R_L)/R_L. The higher the ratio, the greater the difference between the voltages required for different logic operations, and hence the greater the noise or process variations that can be tolerated by the circuit.

To demonstrate the idea of the proposed circuit, we fabricated and verified this 3-MTJ circuit. Fig. 7 shows the optical image of the fabricated circuit on a chip. More details and characterization measurements are presented in [20].
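The bias voltages quoted in this section can be roughly cross-checked with Ohm's law. The Python sketch below is an estimate under stated assumptions, not the authors' SPICE procedure: it treats the circuit as MTJ-A and MTJ-B in parallel, in series with MTJ-C, and takes the minimum bias for an operation as I_C times the largest total resistance that must still switch. It also evaluates the TMR ratio (R_H - R_L)/R_L for the quoted device. The function names are illustrative.

```python
R_LOW, R_HIGH = 3472.0, 5902.0   # ohms, device parameters quoted above
I_C = 320e-6                     # amperes, critical current quoted above


def parallel(r1, r2):
    """Equivalent resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)


def min_bias(r_inputs, r_c):
    """Smallest voltage that drives I_C through the inputs in series with MTJ-C."""
    return I_C * (r_inputs + r_c)


both_low = parallel(R_LOW, R_LOW)    # inputs (0,0)
mixed = parallel(R_LOW, R_HIGH)      # inputs (0,1) or (1,0)

estimates = {
    "NOR": min_bias(both_low, R_LOW),    # MTJ-C preset low, switch only for (0,0)
    "NAND": min_bias(mixed, R_LOW),      # MTJ-C preset low, switch unless (1,1)
    "OR": min_bias(both_low, R_HIGH),    # MTJ-C preset high, switch only for (0,0)
    "AND": min_bias(mixed, R_HIGH),      # MTJ-C preset high, switch unless (1,1)
}
tmr = (R_HIGH - R_LOW) / R_LOW

for op, volts in estimates.items():
    print(f"{op}: {volts:.2f} V")
print(f"TMR ratio: {tmr:.2f}")
```

Under these assumptions the estimates come out at roughly 1.67V, 1.81V, 2.44V and 2.59V, close to the 1.66V calculated for NOR and to the rounded values of 1.7V, 1.8V, 2.5V and 2.6V, with a TMR ratio of about 0.70. The spacing between adjacent estimates is what a larger TMR ratio would widen.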

V. EXPECTED TRENDS IN CRITICAL PARAMETERS

An important consideration for a logic circuit in terms of performance is its delay and power. The delay of the circuit is primarily governed by the magnitude of the switching current (I) flowing through the MTJ, which is generated by the bias voltage (V_MTJ), while the power consumption is based on the bias voltage and the current through the circuit (I). The bias voltage applied for the NOR and NAND operations is calculated as a direct function of the critical current density J_c of the fabricated devices. Thus, both the delay and the power consumption of the circuit are influenced by the bias voltage requirement and the critical current density. As the critical current density decreases, the power requirements for the logic operation decrease.

A great amount of research aimed at reducing the critical current density of magnetic tunnel junctions is currently underway. In only three years of prototyped devices (2005-2007), the critical current density has decreased from 16 MA/cm^2 to 1 MA/cm^2, a factor of almost 16X ([21], [22], [23], [18]). The device reported in [18] with a critical current density of 1 MA/cm^2 is a dual barrier structure specially designed to reduce the critical switching current, while behaving as a standard MTJ.

With the parameters of the devices fabricated and characterized in the three years, we calculated the bias voltage requirements for the NOR, NAND, OR and AND operations for the proposed circuit. These are shown in Fig. 8. With fabrication of smaller MTJs, these voltages decrease proportionally with the area reduction. The bias voltage requirements for the MTJ device fabricated in [18] (120 nm × 240 nm device with a critical current density of 1 MA/cm^2) are comparable to those of the 180nm and 130nm CMOS technologies. Thus, the proposed MTJ circuit operates on practical voltage values.

Fig. 6. Current in the MTJ-based logic circuit obtained from simulation, plotted as Current (A) vs Time (s): (a) Circuit current for A=0, B=0 in NOR circuit (V_MTJ = 1.7V); (b) Circuit current for A=0, B=1 in NAND circuit (V_MTJ = 1.8V)

Fig. 7. Optical image of fabricated circuit

VI. IMPACT ON COMPUTING

The proposed circuit offers some interesting and novel possibilities for computation. Without the need for intermediate electronic circuitry, the circuit provides the potential to perform logic operations within a non-volatile memory unit, using the memory devices themselves. This capability is a critical advantage over the conventional von Neumann approach to computing, where the logic units and memory units are separated because the same circuits cannot achieve both functions simultaneously. Working around this constraint of CMOS-based devices, the concept of logic-in-memory with von Neumann architectures has been researched extensively, in which the processor is brought closer to memory in order to reduce the processor-memory communication bottleneck. Such processor-in-memory architectures have shown significant performance benefits for data-centric applications like image processing and signal processing ([24], [25]). With a device capable of simultaneous storage and logic operations, the memory element itself becomes a processor, and the overhead of communication further decreases.

Fig. 8. Bias voltage requirements for logic operations of NOR, NAND, OR, AND in fabricated devices

In a general-purpose computing system, data resides in non-volatile memory elements like hard drives and flash memory cells. Since these are relatively slow compared to the fast computing circuits, data is brought closer to the processor using an intermediate hierarchy of volatile memory elements consisting of CMOS-based DRAM cells and SRAM cells in order to reduce the communication costs and increase performance. The complexity of such a setup could potentially be alleviated by using a logic-and-memory circuit for simple logic functions.

To investigate the impact of this idea, we compare the energy and delay measurements of a CMOS-based logic operation setup with the MTJ-based logic circuit. We use device parameters related to the recent technology status for both technologies. The delay and power consumption of a single logic gate using the CMOS technology are lower than those of the MTJ-based logic circuit using devices fabricated in the laboratory so far. However, a complete data path for a logic operation with the CMOS technology on stored data includes the memory elements that contain input data and those that contain the result of the operation. To compare against the non-volatile MTJ memory cells performing a single logic operation on stored data, we choose the faster but volatile CMOS-based SRAM cells as memory elements that contain the input data and will store the result of the operation. We next describe the experimental setup for the evaluation.

A. Experimental Methodology

A CMOS-based circuit performing the NAND function on data stored in SRAM devices is shown in Fig. 9. The 1-bit input operands are stored in two SRAM cells (shaded in the figure), which feed into a simple ALU circuit capable of performing the four logic operations of NAND, NOR, AND and OR. The result is stored into a third SRAM cell (shaded in the figure). In order to read and write the data in the SRAM cells, supporting circuitry is required. Thus, the two input SRAM cell arrays contain precharge and read circuitry, while the output SRAM cell array contains the precharge and write circuitry.

A 3-MTJ circuit that performs a logic operation and stores operands and the result of the logic operation is functionally equivalent to a 3-cell SRAM circuit performing a logic operation using a logic gate, except that the spintronic implementation is also non-volatile. However, for CMOS-based SRAM circuits, the read and write delay also depends on the number of cells connected to a single bit line. Performance of the SRAM configuration degrades as the number of cells increases. The number of cells in the cell array depends on the design of the memory unit and the requirements of the overall system in which it resides. In order to incorporate this effect of CMOS circuits, we built an n-bit SRAM cell array during evaluation. We scale the number of cells per bit line (or array size) to show the trade-off points of performance between MTJ and SRAM circuits.

We simulated this SRAM-based circuit in HSPICE using CMOS devices from 180nm and 130nm technologies, and measured the power and delay of a single-bit operation. The bias voltage (Vdd) used for the 180nm and 130nm simulations was 1.8V and 1.6V, respectively. We make the assumption that data is already present in the input SRAM cells. Being volatile in nature, the SRAM cells consume power just for retention of the data before and after the logic operation. However, we ignore this power consumption.

The MTJ-based circuit in Fig. 4(a) is used to determine the current through the circuit for the time of a clock cycle. Since an MTJ circuit must be isolated from the rest of the circuitry in order to have a minimal connected path for current to flow through so that the power consumption is minimized, we simulated a single 3-MTJ circuit. It is assumed that the result MTJ is already preset to logic-0 before the operation commences. We will discuss this assumption further in Section VI-C. The CMOS-based circuit is simulated with minimal sized devices. Each SRAM cell is implemented as a 6T circuit.

A single logic-and-memory MTJ-based circuit clearly provides area advantages over a separate logic and memory module using CMOS-based circuits. This area advantage is further enhanced by the scalability possible with the spintronic technology. We do not quantify the area advantages in this paper, and instead focus on the power and delay performance of the MTJ-based spintronic technology to get an insight into the potential of the technology to improve circuit performance.

B. Results: Power-Delay Product (PDP) and Energy-Delay

Product (EDP)

The power consumption of the MTJ circuit is given by

V.I avg for the length of a high clock cycle. Since the bias

voltage for the spintronic device is computed for a delay of 1ns, we apply a clock high time of 2ns and compute the

!"

!"!

!"

!"!

&"*

&"* #$%&'($)% +",
#$%&'($)% +",
#$%&'($)%
+",
$%(01&*2

$%(01&*2

+"-

+"./

3(%

!"

!"!

&"*

&"* #$%&'($)% +",
#$%&'($)% +",
#$%&'($)%
+",

+"-

+"./

+%

67

+"- +"./ +% 67 +$62%1&*2

+$62%1&*2

,"!!!!# +"!!!!# *"!!!!# $%+/011# )"!!!!# %()/011# ("!!!!# 234#5$+6# '"!!!!#
,"!!!!#
+"!!!!#
*"!!!!#
$%+/011#
)"!!!!#
%()/011#
("!!!!#
234#5$+6#
'"!!!!#
%()/011#
&"!!!!#
($%/011#
%"!!!!#
234#5$+6#
$"!!!!#
!"!!!!#
$+!-.#
$&!-.#
Fig. 11. Comparison of the Energy-Delay Product of 180nm CMOS-based
SRAM circuit and the MTJ devices reported in [18]
#
#
#
C. Note on additional supporting circuitry
#
#
In order to develop into a complete logic-and-memory
#
module, there are three fundamental functions required of
#
#
each MTJ in the unit: read, write and logic. When data is
#
present in the devices, these three functions occur inherently
#
#
in the proposed circuit during a logic operation (the circuit
# #
#
#
#
current is established after ‘reading’ the inputs – this current
#
$
$
$
!"#$%&$'()*'+$
$
$

&"*

&"* #$%&'($)% +",
#$%&'($)% +",
#$%&'($)%
+",

+"-

+"./

3(%

+"- +"./ 3(% $%(01&*2

$%(01&*2

452.( 452.! 3- 8.9:;<=>?;1 ("5 3, 3@A@<=13>B;CAD
452.(
452.!
3-
8.9:;<=>?;1 ("5
3,
3@A@<=13>B;CAD

Fig. 9.

CMOS-based circuit for logic operations on data stored in volatile

SRAM cells &"!!#'(&% (",!#'(&% ("+!#'(&% ("*!#'(&%
SRAM cells
&"!!#'(&%
(",!#'(&%
("+!#'(&%
("*!#'(&%
("&!#'(&%
("!!#'(&%
,"!!#'()%
+"!!#'()%
*"!!#'()%
&"!!#'()%
!"!!#$!!%
!"!#

(,!-.%/012%

()!-.%/012%

(&,3455%

&6+3455%

078%9(,:%

&6+3455%

6(&3455%

078%9(,:%

Fig. 10.

SRAM circuit and the MTJ devices reported in [18]

causes a ‘write’ in the result-MTJ). When data is not alreadyFig. 10. SRAM circuit and the MTJ devices reported in [18] present in the devices, an

present in the devices, an external read and write is neces-a ‘write’ in the result-MTJ). When data is not already sary. To add a read/write capability

sary. To add a read/write capability to the present circuit, present in the devices, an external read and write is neces- simple switches would be required
sary. To add a read/write capability to the present circuit,

simple switches would be required to establish a read/writesary. To add a read/write capability to the present circuit, path. These switches would also allow

path. These switches would also allow the circuit to toggle between the logic mode and the memory mode. The proposed circuit also requires a ‘preset’ mode for the result MTJ before applying inputs. In a CMOS analogy, this is similar to the requirement of precharging the bit lines of a memory module before applying read or write signals. In the proposed circuit, a preset may be achieved through external write circuitry in a separate and necessary time step before a logic step. Another way is by applying a suitable voltage to the present circuit that will set the state of the result MTJ to the desired value irrespective of the states of the input MTJs. However, to prevent the current due to this ‘preset’ voltage from affecting the states of the input MTJs, the circuit must include an additional MTJ (‘preset’ MTJ) in parallel with the input MTJs. Since the proposed circuitry inherently writes to the result-MTJ, the preset state will be written to the result MTJ while the preset-MTJ protects the input data. Alternatively, techniques to amortize the overhead of the preset may be used. Though beyond the scope of this paper, it may be possible to preset an entire unit of MTJ-based logic- and-memory cells by using separate magnetic or spintronic means. For example, with a current plane above the array of result-MTJs, a current may be passed through the plane to magnetize the free layers of the entire row of result-MTJs at once.simple switches would be required to establish a read/write VII. C ONCLUSION In this work, we

power consumption for that period. This ensures that the operation is reliably completed.

The CMOS-based operation consists of read, write, transfer and the logic operation. We measure the power and delay of the entire operation including the reading of operands, the logic function and the writing of the result. We ignore the delay and power consumed during the transfer of data through communication lines between the memory module and the ALU.

Fig. 10 shows the PDP comparisons between the CMOS and MTJ technologies with our simulation parameters. The data points show the trade-off points of performance of the MTJ-based circuit with respect to 180 nm and 130 nm CMOS technologies. The PDP of a logic operation using device parameters of 120 nm × 240 nm MTJs falls between that of a 180 nm CMOS-based logic setup with 128 cells and 256 cells connected to a single bitline in a memory array. With 130 nm technology, the PDP of the MTJ-based logic operation falls between that of a CMOS-based memory array with 256 cells and 512 cells per bitline.

The energy-delay product (EDP) is an indicator of the energy efficiency of the logic operation. The EDP for the CMOS and MTJ circuit configurations is shown in Fig. 11. For the 180 nm technology, though the performance of the MTJ is between the CMOS circuit configurations, the EDP of the MTJ circuit is better than both the CMOS circuits.

[Figure caption, truncated in the source: Comparison of the Power-Delay Product of 180nm CMOS-based ...]

VII. CONCLUSION

In this work, we demonstrated an MTJ-based spintronic circuit that enables a magnetic tunnel junction to perform a logical operation on the spintronic states stored in two other MTJs. It eliminates the need to convert the spintronic states into an intermediate voltage or current signal with intermediate CMOS-based circuitry. The circuit can be used most effectively for an operation of the type C = logic_op(A, B), where A, B and C are spintronic elements that store data. This circuit requires voltages that are close to the bias voltages required for CMOS devices of comparable dimensions.

Traditionally, data is transferred from a memory unit to a logic unit for processing and back to the memory unit for storage. In contrast, by leveraging the dual ability of MTJs to store and process data, simple logic operations may be processed directly in the memory unit. We compared these two schemes of computation and showed that performing a logic operation with an MTJ inside an MTJ-based memory unit with the proposed logic circuit has the potential to provide a better energy efficiency than a CMOS-based setup consisting of a volatile SRAM memory unit with its read, write and precharge circuitry and a CMOS-based processor. In CMOS-based von Neumann architecture, power is consumed during the read, write and transfers and during the logic operation. The need for these four functions is obviated in the proposed MTJ-based logic circuit, thus dissipating less power and keeping the operation simple.

The proposed circuit is capable of performing simple logic operations of NAND, NOR, AND and OR. For more complex operations, multiple time steps may be necessary. With the hardware parallelism of MTJs, multiple logic operations can be performed in parallel, providing an overall speedup. The extent and trade-offs of the performance gain by performing logic operations in a large memory remain to be evaluated.
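As an illustration of why more complex operations may need multiple time steps, the sketch below builds XOR from four NAND steps and applies the same sequence to several rows. It is a logical sketch only, under stated assumptions: the function names are hypothetical, it assumes an intermediate result can serve as an input to a later step, and it omits the preset that each step would require.

```python
def nand(a: int, b: int) -> int:
    """One simple logic step of the kind the circuit provides (C = NAND(A, B))."""
    return 1 - (a & b)


def xor_in_steps(a: int, b: int):
    """XOR composed from four NAND steps; returns (result, steps used).

    Steps are counted sequentially here. t2 and t3 do not depend on each
    other, so hardware with enough result cells could overlap them.
    """
    t1 = nand(a, b)
    t2 = nand(a, t1)
    t3 = nand(b, t1)
    return nand(t2, t3), 4


def xor_rows(rows):
    """Apply the same step sequence to every (a, b) row.

    The rows are independent, so in a parallel memory array the step count
    would not grow with the number of rows; the loop only emulates that.
    """
    results = [xor_in_steps(a, b)[0] for a, b in rows]
    return results, 4
```

For example, `xor_rows([(0, 0), (0, 1), (1, 0), (1, 1)])` evaluates all four input combinations with the same four-step sequence.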

VIII. ACKNOWLEDGMENTS

This work was supported in part by the MRSEC Program of the National Science Foundation under Award Numbers DMR-0212302 and DMR-0819885, NSF ECCS (0702264) and the Schnell Professorship.

REFERENCES

[1] S. S. P. Parkin, X. Jiang, C. Kaiser, A. Panchula, K. Roche, and M. Samant, “Magnetically engineered spintronic sensors and memory,” Proceedings of the IEEE, vol. 91, pp. 661–680, 2003.

[2] E. Y. Tsymbal, O. N. Mryasov, and P. R. LeClair, “Spin-dependent tunneling in magnetic tunnel junctions,” Journal of Physics: Condensed Matter, vol. 15, pp. R109–R142, 2003.

[3] S. A. Wolf, D. D. Awschalom, R. A. Buhrman, J. M. Daughton, S. von Molnar, M. L. Roukes, A. Y. Chtchelkanova, and D. M. Treger, “Spintronics: A spin-based electronics vision for the future,” Science, vol. 294, no. 5546, pp. 1488–1495, 2001.

[4] M. Hosomi, H. Yamagishi, T. Yamamoto, K. Bessho, Y. Higo, K. Yamane, H. Yamada, M. Shoji, H. Hachino, C. Fukumoto, H. Nagao, and H. Kano, “A novel nonvolatile memory with spin torque transfer magnetization switching: Spin-RAM,” IEEE International Electron Devices Meeting, IEDM Technical Digest, pp. 459–462, 2005.

[5] J.-P. Wang and X. Yao, “Programmable spintronic logic devices for reconfigurable computation and beyond – history and outlook,” Journal of Nanoelectronics and Optoelectronics, vol. 3, pp. 12–23, 2008.

[6] W. Zhao, E. Belhaire, V. Javerliac, C. Chappert, and B. Dieny, “A non-volatile flip-flop in magnetic FPGA chip,” International Conference on Design and Test of Integrated Systems in Nanoscale Technology (DTIS), pp. 323–326, 2006.

[7] W. Zhao, E. Belhaire, C. Chappert, and P. Mazoyer, “Spintronic device based non-volatile low standby power SRAM,” IEEE Computer Society Annual Symposium on VLSI, pp. 40–45, 2008.

[8] S. Matsunaga, J. Hayakawa, S. Ikeda, K. Miura, H. Hasegawa, T. Endoh, H. Ohno, and T. Hanyu, “Fabrication of a nonvolatile full adder based on logic-in-memory architecture using magnetic tunnel junctions,” Applied Physics Express, vol. 1, pp. 091301–3, 2008.

[9] J. S. Moodera, L. R. Kinder, T. M. Wong, and R. Meservey, “Large magnetoresistance at room temperature in ferromagnetic thin film tunnel junctions,” Physical Review Letters, vol. 74, no. 16, pp. 3273–3276, 1995.

[10] J. Wang, H. Meng, and J.-P. Wang, “Programmable spintronics logic device based on a magnetic tunnel junction element,” Journal of Applied Physics, vol. 97, no. 10, p. 10D509, 2005.

[11] H. Meng, J. Wang, and J.-P. Wang, “A spintronics full adder for magnetic CPU,” IEEE Electron Device Letters, vol. 26, no. 6, pp. 360–362, 2005.

[12] S. Patil, X. Yao, H. Meng, J.-P. Wang, and D. Lilja, “Design of a spintronic arithmetic and logic unit using magnetic tunnel junctions,” Proceedings of the 5th conference on Computing frontiers, pp. 171–178, 2008.

[13] S. Lee, N. Kim, H. Yang, G. Lee, S. Lee, and H. Shin, “The 3-bit Gray counter based on magnetic-tunnel-junction elements,” IEEE Transactions on Magnetics, vol. 43, no. 6, pp. 2677–2679, 2007.

[14] L. Berger, “Emission of spin waves by a magnetic multilayer traversed by a current,” Physical Review B, vol. 54, no. 13, pp. 9353–9358, 1996.

[15] E. B. Myers, D. C. Ralph, J. A. Katine, R. N. Louie, and R. A. Buhrman, “Current-induced switching of domains in magnetic multilayer devices,” Science, vol. 285, no. 5429, pp. 867–870, 1999.

[16] J. Slonczewski, “Current-driven excitation of magnetic multilayers,” Journal of Magnetism and Magnetic Materials, vol. 159, no. 1-2, pp. L1–L7, 1996.

[17] T. Kawahara, R. Takemura, K. Miura, J. Hayakawa, S. Ikeda, Y. Lee, R. Sasaki, Y. Goto, K. Ito, I. Meguro, F. Matsukura, H. Takahashi, H. Matsuoka, and H. Ohno, “2Mb spin-transfer torque RAM (SPRAM) with bit-by-bit bidirectional current write and parallelizing-direction current read,” IEEE International Solid-State Circuits Conference, pp. 480–617, 2007.

[18] Z. Diao, A. Panchula, Y. Ding, M. Pakala, S. Wang, Z. Li, D. Apalkov, H. Nagai, A. Driskill-Smith, L.-C. Wang, E. Chen, and Y. Huai, “Spin transfer switching in dual MgO magnetic tunnel junctions,” Applied Physics Letters, vol. 90, no. 13, p. 132508, 2007.

[19] J. Harms, F. Ebrahimi, X. Yao, and J.-P. Wang, “SPICE macromodel of spin-torque-transfer operated magnetic tunnel junctions,” IEEE Transactions on Electronic Devices, 2010.

[20] A. Lyle, J. Harms, S. Patil, X. Yao, D. Lilja, and J.-P. Wang, “Direct communication between magnetic tunnel junctions for non-volatile logic fan-out architecture,” To be published in Applied Physics Letters, 2010.

[21] H. Kubota, A. Fukushima, Y. Ootani, S. Yuasa, K. Ando, H. Maehara, K. Tsunekawa, D. Djayaprawira, N. Watanabe, and Y. Suzuki, “Evaluation of spin-transfer switching in CoFeB/MgO/CoFeB magnetic tunnel junctions,” Japanese Journal of Applied Physics, vol. 44, pp. L1237–L1240, 2005.

[22] J. Hayakawa, S. Ikeda, Y. M. Lee, R. Sasaki, T. Meguro, F. Matsukura, H. Takahashi, and H. Ohno, “Current-induced magnetization switching in MgO barrier based magnetic tunnel junctions with CoFeB/Ru/CoFeB synthetic ferrimagnetic free layer,” Japanese Journal of Applied Physics, vol. 45, pp. L1057–L1060, 2006.

[23] Y. Huai, M. Pakala, Z. Diao, D. Apalkov, Y. Ding, and A. Panchula, “Spin-transfer switching in MgO magnetic tunnel junction nanostructures,” Journal of Magnetism and Magnetic Materials, vol. 304, no. 1, pp. 88–92, 2006.

[24] B. R. Gaeke, P. Husbands, X. S. Li, L. Oliker, K. A. Yelick, and R. Biswas, “Memory-intensive benchmarks: IRAM vs. cache-based machines,” Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS), 2002.

[25] R. Murphy and P. M. Kogge, “The characterization of data intensive memory workloads on distributed PIM systems,” Proceedings of Intelligent Memory Systems Workshop, ASPLOS-IX, 2000.