Electronic copy available at: http://ssrn.com/abstract=2216338
An Electricity Primer for Energy Economists:
Basic EE to LMP Calculation
Richard M. Benjamin
Round Table Group
January 23, 2013
Chapter 1
Basic EE and PTDFs
1.1 Introduction
This manuscript provides a technical primer for the energy economist wishing to study the optimal power flow problem and locational marginal pricing (LMP) but who, like this paper's author, finds it difficult to simply plow through Schweppe et al. (1988). This work aims to provide the reader with sufficient technical background to calculate power transfer distribution factors (PTDFs, or network shift factors) for a more complicated model than the simple 3-node model common to energy economics studies.[1] The point of departure for this work is basic electrical engineering fundamentals. We start off Section 1.2 with a simple direct current (DC) circuit analysis, demonstrating Ohm's law and Kirchhoff's laws. As these are captured by very basic equations, we quickly move on to alternating current (AC) analysis. Here, we spend quite a bit of effort manipulating fairly straightforward (i.e., sinusoidal) AC voltage equations to derive equations for average voltage, current, and power in an AC circuit. We delve next into phase angles, showing how electrical components such as inductors and capacitors bring voltage and current out of phase. We then demonstrate both graphically and algebraically the effect of non-zero phase angles on resistance and power relations in an AC circuit. The goal of Section 1.2 is to familiarize the reader with concepts that economists and operations researchers generally take for granted, but are quite foreign to those of us who are self-taught.

Getting this background under our belts, we move on to DC network analysis, taking this intermediate step to demonstrate circuit concepts to the reader in the simple setting of a DC network. We then introduce the reader to the linearized version of the AC network model in Section 1.3. In this section we demonstrate a methodology for linearizing the real-power AC network equations (the latter of which we derive in Appendix A). After linearizing the AC network equations, we quickly demonstrate DC sensitivity analysis in Section 1.4. We conclude Chapter 1 by demonstrating the calculation of the PTDFs for the linearized (DC) version of the AC network in Section 1.5. Sections 1.2, 1.3, and 1.5 of this chapter aim to provide the reader with sufficient technical background to understand basic power equations, as presented in energy economics papers, along with deriving PTDFs. Section 1.4 offers a conceptual preview of locational marginal price calculation, as demonstrated in Chapter 3.

[1] See, e.g., Bushnell and Stoft (1997) for a detailed analysis of the three-node model.
In Chapter 2, we present the AC power flow model and a powerful method for solving the AC power flow problem, the Newton-Raphson method. Chapter 2 serves two purposes. First, it familiarizes the energy economist with the type of problem setup that one sees in electricity papers written by operations research analysts. The economist spending any time at all in the study of electricity will almost certainly run into such setups, and this chapter provides some useful background material. Second, the method we use to solve the DC optimal power flow problem corresponds to a single iteration of the Newton-Raphson algorithm for solving nonlinear equations. Thus, this serves as valuable material for the economist wishing to place our solution method in its proper context.

In Chapter 3 we demonstrate LMP calculations. Fortunately, Chapters 1 and 2 do most of the heavy lifting of AC optimal power flow setup, allowing the reader to focus on the specifics of LMP derivation, while simply reviewing OPF principles.
1.2 Electrical Engineering Fundamentals
We begin our power flow analysis with a DC series circuit.[2] In a basic series circuit, a source of electromotive force (emf) is connected to one or more sources of resistance (e.g., light bulbs), which form a circuit, as shown below:

[Figure 1.1: A Basic DC Circuit. Labels: Battery (+), R_a, R_b.]

[2] Other DC circuits include parallel circuits, where current flow splits at junctions into parallel paths.
In a DC circuit current flows in one direction only: from negative to positive (as opposed to AC, where current alternates direction several times per second). Thus, current flows in a counterclockwise direction in the example above. The mathematics of this circuit are governed by three laws: Ohm's law, Kirchhoff's current law (KCL), and Kirchhoff's voltage law (KVL). These three laws are as follows.

Ohm's law: current, resistance, and voltage in a circuit are related as follows:

$$V = I R, \tag{1.1}$$

where:
$V$ is voltage, or emf, measured in volts,
$I$ is current, measured in amperes, and
$R$ is resistance, measured in ohms.

Applied to our analysis, Ohm's law reveals that the voltage drop across each of resistors a and b is equal to that resistor's resistance value times the current flowing through it.

Kirchhoff's voltage law (KVL): The directed sum of the electrical potential differences (voltage) around any closed circuit is zero.

Kirchhoff's current law (KCL): At any node in an electrical circuit, the sum of currents flowing into that node is equal to the sum of currents flowing out of that node.

Kirchhoff's laws are both conservation laws. KVL states that the sum of the voltage dropped across the two resistors is equal to the voltage produced by the voltage source (battery). KCL is likewise the law of conservation of current. For a circuit with only one path, KCL implies that current is constant throughout the circuit. Given these three laws and sufficient values for the circuit, we may solve for the remaining unknown values.
Assume that we have a 9-volt battery, the value of the first resistor, $R_a$, is 3 ohms (Ω), and an ammeter yields a measurement of 2 amps. We are tasked with finding the remaining values. The easiest, most organized approach is to fill in these numbers in a table, as shown below:

Table 1.1: V, I, and R Calculations

              R_a     R_b     Total
V (volts)     6       3       9
I (amps)      2       2       2
R (ohms)      3       1.5     4.5

(Given values: the 9-volt battery voltage, R_a = 3 ohms, and the 2-amp ammeter reading. The remaining entries are calculated.)
By KCL, current at all points in the circuit is equal to 2 amps.[3] Since we are given $R_a = 3$ Ω, Ohm's law allows us to solve for the voltage drop across this resistor as 6 volts. That is, running 2 amps through a 3-ohm resistor produces a 6-volt drop. Next, KVL tells us that the 9 volts produced by the battery dissipate across the two resistors. Since the first resistor dissipates 6 volts, the second one must dissipate 3 volts, or $V_b = 3$. Further application of Ohm's law reveals that the second resistor is a 3/2 = 1.5 Ω resistor. Ohm's law also applies to the circuit as a whole. So, if a 9-volt battery produces 2 amps of current, total resistance in the circuit is 9/2 = 4.5 Ω. This reveals another law of series circuits: resistance, as well as voltage, is additive.[4]
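The steps used to fill in Table 1.1 can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the variable names are chosen here for readability and do not come from the text.

```python
# Series circuit of Table 1.1. Given: battery voltage, R_a, and the ammeter reading.
V_total = 9.0   # volts (given)
R_a = 3.0       # ohms (given)
I = 2.0         # amps (given); by KCL it is the same everywhere in a series circuit

V_a = I * R_a            # Ohm's law: voltage drop across R_a
V_b = V_total - V_a      # KVL: the two drops must sum to the battery voltage
R_b = V_b / I            # Ohm's law applied to the second resistor
R_total = V_total / I    # Ohm's law applied to the circuit as a whole

print(V_a, V_b, R_b, R_total)
print(R_total == R_a + R_b)   # resistance is additive in a series circuit
```

The last line checks the additivity of series resistance noted above.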
While batteries are a DC voltage source, most electrical devices run on AC. An AC generator, or alternator, produces alternating current as a magnetic field rotates around a set of stationary wire coils. When the magnet starts its rotation, at the reference angle of zero degrees, it is completely out of alignment with the coil. It moves closer into alignment until it reaches an angle of 90°, and moves further out of alignment again. Because the electromotive force, or voltage, created in this operation varies directly with the degree of alignment between the magnet and the coil, we may (for the present) express instantaneous voltage mathematically as:

$$v(t) = V_{\max} \sin(\omega t), \tag{1.2}$$

where:
$V_{\max}$ is maximum instantaneous voltage, or voltage amplitude,
$\omega$ is angular frequency, in radians per second,[5] and
$t$ is time, in seconds.

Multiplying $\omega$ (radians per second) times $t$ (seconds) gives us the angle covered by the magnet in $t$ seconds. Thus, we may write (1.2) as either $v(t) = V_{\max} \sin(\omega t)$, or

$$v(t) = V_{\max} \sin(\theta), \tag{1.3}$$
where $\theta$ is the familiar symbol for measuring angles (e.g., an alternator with an amplitude of 100 volts, produced by a magnet turning one radian per second, has an instantaneous voltage of $v(1) = 100 \sin(1) = 84.15$ volts at $t = 1$ second). Note that voltage is important in the study of optimal power flow because it is a determining factor of the power a generator produces, that is,

$$P = I V, \tag{1.4}$$

where:
$P$ is power, measured in watts.[6]

[3] When we look at multiple lines across which current flows (known as parallel circuits), this relation will no longer hold because current will split along parallel paths. KCL then implies that the sum of the currents along parallel paths is equal to current running through a single path (so that current is preserved).

[4] This rule does not hold for parallel circuits (although it does hold for parallel components in series). See, e.g., Dale (1995).

[5] For those of us who have long since forgotten, there are $2\pi$ radians in a circle, so a radian equals approximately 57.3°. Thus we may also write $v(t) = V_{\max} \sin(2\pi f t)$, where $f$ is frequency, in rotations per second.
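The alternator example accompanying (1.3) can be checked numerically. The sketch below is a minimal illustration; the function name `instantaneous_voltage` is a label chosen here, not notation from the text.

```python
import math

def instantaneous_voltage(v_max, omega, t):
    """v(t) = V_max * sin(omega * t), with omega in radians per second."""
    return v_max * math.sin(omega * t)

# Amplitude of 100 volts, one radian per second, evaluated at t = 1 second.
v_at_one_second = instantaneous_voltage(100.0, 1.0, 1.0)
print(round(v_at_one_second, 2))      # 84.15

# One radian expressed in degrees, as in the footnote on radians.
print(round(math.degrees(1.0), 1))    # 57.3
```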
From Ohm's law, we can substitute for either $V$ or $I$, giving us two alternative expressions for power:

$$P = I^2 R \quad \text{and} \quad P = \frac{V^2}{R}.$$

As resistance is a many-headed beast, for now we will focus on the derivation of power given by (1.4). The most basic resistance source (the resistor) has a fixed resistance. Ohm's law tells us that in a purely resistive circuit, voltage and current will vary only by a constant of proportionality (the fixed resistance). Therefore, since the formula for voltage is given by (1.2), the formula for current is given by:

$$i(t) = I_{\max} \sin(\omega t), \tag{1.5}$$

where:
$I_{\max}$ is maximum current, or current amplitude.

Using (1.4), we combine (1.2) and (1.5) to yield the formula for instantaneous AC power:

$$p(t) = V_{\max} \sin(\omega t)\, I_{\max} \sin(\omega t). \tag{1.6}$$
Using trigonometric identities, we may write $\sin^2(x)$ as $\frac{1}{2}(1 - \cos 2x)$. Thus we rewrite (1.6) as:

$$p(t) = \frac{1}{2} V_{\max} I_{\max} - \frac{1}{2} V_{\max} I_{\max} \cos(2\omega t). \tag{1.7}$$

Of course, when examining a generator's power output, we are not concerned with its instantaneous value, which varies continuously, but rather its average value. Looking at (1.7), we see that instantaneous power has both a fixed ($\frac{1}{2} V_{\max} I_{\max}$) and a variable ($\frac{1}{2} V_{\max} I_{\max} \cos(2\omega t)$) component. Since, like that of $\sin x$, the average value of $\cos x$ is zero over one full cycle, the average value of power reduces to the first component:

$$P_{\text{avg}} = \frac{1}{2} V_{\max} I_{\max}. \tag{1.8}$$
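As a numerical sanity check on (1.8), one can average the instantaneous power (1.6) over one full cycle. The sketch below assumes hypothetical amplitudes of 100 volts and 5 amps and a 60 Hz source; none of these values come from the text.

```python
import math

V_MAX = 100.0               # volts (hypothetical amplitude)
I_MAX = 5.0                 # amps (hypothetical amplitude)
OMEGA = 2 * math.pi * 60    # radians per second (assumed 60 Hz source)
N = 10_000                  # equally spaced samples over one full cycle

def instantaneous_power(t):
    """p(t) from (1.6): product of in-phase sinusoidal voltage and current."""
    return V_MAX * math.sin(OMEGA * t) * I_MAX * math.sin(OMEGA * t)

period = 2 * math.pi / OMEGA
p_avg = sum(instantaneous_power(k * period / N) for k in range(N)) / N
closed_form = 0.5 * V_MAX * I_MAX   # right-hand side of (1.8)

print(p_avg, closed_form)
```

Both printed values should agree up to floating-point error, since the cosine term in (1.7) averages to zero over a full cycle.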
[6] We get the much more familiar expression, $P = I E$, by denoting voltage as $E$ rather than $V$.
Alternately, we may calculate the average value of power directly from the formulas for average values of voltage and current, which we do below. By inspection, we will not get very far using either (1.2) or (1.5) directly, because we run into the same phenomenon of zero average value for a sinusoidal signal. This is unimportant, however, as the sign of the voltage or current in question is altogether arbitrary.[7] Intuitively, one will feel the same shock whether a voltmeter registers −50V or +50V (don't try this at home). We get around the sign problem for voltage and current by squaring the instantaneous values of each and integrating to arrive at their respective averages. Starting with (1.3), $v(t) = V_{\max} \sin(\theta)$, we are reminded that the magnet travels $2\pi$ radians per complete rotation. Therefore, the average value for voltage-squared over one complete cycle is:

$$\frac{V_{\max}^2}{2\pi} \int_0^{2\pi} \sin^2\theta \, d\theta = \frac{V_{\max}^2}{2\pi} \int_0^{2\pi} \frac{1 - \cos(2\theta)}{2} \, d\theta.$$

Since the integral of $\cos(2x)$ is $\frac{1}{2}\sin(2x)$, which vanishes at both limits, the average value of voltage-squared over a complete cycle is $V_{\max}^2/2$.[8] Taking the square root of this value, we find that the average value for voltage (more precisely, its root-mean-square, or RMS, value) is:[9]

$$V_{\text{avg}} = |V| = \frac{V_{\max}}{\sqrt{2}}. \tag{1.9}$$

Going through the same steps, we find that the average value for current is:

$$I_{\text{avg}} = |I| = \frac{I_{\max}}{\sqrt{2}}. \tag{1.10}$$

Multiplying (1.9) and (1.10), we again find that $P_{\text{avg}}$ is given by (1.8).
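The square-then-average-then-root procedure behind (1.9) and (1.10) can be mimicked numerically. The following is a rough sketch with hypothetical amplitudes (100 volts, 5 amps); the helper name `rms` is an assumption made here for illustration.

```python
import math

def rms(samples):
    """Root-mean-square: square each sample, average, then take the square root."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

V_MAX, I_MAX = 100.0, 5.0   # hypothetical amplitudes
N = 10_000
thetas = [2 * math.pi * k / N for k in range(N)]   # one complete rotation

v_samples = [V_MAX * math.sin(th) for th in thetas]
i_samples = [I_MAX * math.sin(th) for th in thetas]

plain_average = sum(v_samples) / N   # essentially zero: the sign problem
v_rms = rms(v_samples)               # compare with V_max / sqrt(2), eq. (1.9)
i_rms = rms(i_samples)               # compare with I_max / sqrt(2), eq. (1.10)

print(plain_average)
print(v_rms, V_MAX / math.sqrt(2))
print(v_rms * i_rms, 0.5 * V_MAX * I_MAX)   # product recovers (1.8)
```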
If electrical systems contained only voltage sources, wires, and resistance sources (e.g., resistors and load), the world would be a happy place for the energy economist, with current and voltage always and everywhere in phase. Unfortunately (for us) though, electrical devices contain capacitors and inductors, casting us out of the Eden of synchronous voltage and current.[10]

Let us begin by demonstrating how inductors bring voltage and current out of phase, thus creating a non-zero phase angle. Inductors (typically an inductor is a conducting wire shaped as a coil) oppose changes in current, as expressed by the formula:

$$V = L \frac{dI}{dt} = \omega L I_{\max} \cos(\omega t), \tag{1.11}$$

where:
$L$ is inductance, measured in henrys.

[7] In a DC circuit, for example, a voltmeter may register a positive or negative value, depending on the reference point.

[8] See, e.g., Mittle and Mittal (2006), Sections 6.4.1-6.4.2.

[10] Inductors and capacitors, of course, do serve useful purposes, such as storing electrical energy and filtering out specific signal frequencies, but that is beside the point.
(1.11) states that the voltage drop across an inductor is equal to the inductor's inductance times the rate of change of current over time. To see how this creates a phase shift between current and voltage, we must refer to the latter two's sinusoidal nature.[11]

[Figure 1.2: Instantaneous AC Current and Voltage Perfectly in Phase. Axes: V, I versus time. Curves: instantaneous voltage, instantaneous current.]

Since $\frac{dI}{dt} = 0$ at a maximum or a minimum, from (1.11) instantaneous voltage, $v(t)$, is equal to zero when $I = I_{\max}$ or $I = -I_{\max}$. Thus, voltage and current are 90° out of phase for a purely inductive circuit, as shown below:

[Figure 1.3: Instantaneous AC Current and Voltage in a Purely Inductive Circuit. Axes: V, I versus time. Curves: instantaneous voltage, instantaneous current.]

Note that in this case we have set $R = 1$, simply for aesthetics. Note further that we may demonstrate the phase shift for an inductor by recognizing that instantaneous voltage is at its maximum or minimum synchronously with $\frac{dI}{dt}$. $\frac{dI}{dt}$ reaches its extrema at the inflection points of $i(t)$, which occur at $i(t) = 0$.

[11] Note that $V > I$ in Figure 1.2. Therefore, from (1.1), it must be the case that $R > 1$.
While we may still write $i(t) = I_{\max} \sin\theta$, it is no longer the case that $v(t) = V_{\max} \sin\theta$. Taking advantage of the fact that $\cos\theta = \sin(\theta + 90^\circ)$, we write:

$$i(t) = I_{\max} \sin\theta,$$
$$v(t) = V_{\max} \sin(\theta + 90^\circ).$$

In this instance we say that voltage leads current by 90°, or equivalently, current lags voltage by 90°. More generally, we write:

$$v(t) = V_{\max} \sin(\theta + \phi),$$

where $\phi \in (0^\circ, 90^\circ]$ is commonly known as the phase angle between voltage and current.
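The 90° lead of voltage over current in a purely inductive circuit can be illustrated by differentiating the current numerically, as (1.11) prescribes, and comparing the result with a sine shifted by a quarter cycle. The sketch below assumes unit inductance, unit current amplitude, and an angular frequency of one radian per second; the central-difference derivative is a numerical convenience, not a method from the text.

```python
import math

L_HENRYS = 1.0   # hypothetical inductance
I_MAX = 1.0      # hypothetical current amplitude
OMEGA = 1.0      # radians per second (assumed)

def current(t):
    """i(t) = I_max * sin(omega * t)."""
    return I_MAX * math.sin(OMEGA * t)

def inductor_voltage(t, h=1e-5):
    """v = L * dI/dt, with the derivative approximated by a central difference."""
    return L_HENRYS * (current(t + h) - current(t - h)) / (2 * h)

def shifted_sine(t):
    """omega * L * I_max * sin(omega * t + 90 degrees): leads the current by a quarter cycle."""
    return OMEGA * L_HENRYS * I_MAX * math.sin(OMEGA * t + math.pi / 2)

# Largest gap between the two curves over roughly one cycle.
max_gap = max(abs(inductor_voltage(0.01 * k) - shifted_sine(0.01 * k)) for k in range(629))
voltage_at_current_peak = inductor_voltage(math.pi / 2)   # current is at its maximum here

print(max_gap)
print(voltage_at_current_peak)
```

Both printed numbers should be close to zero: the first because the two expressions for voltage coincide, the second because voltage vanishes where current peaks.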


Like inductors, capacitors also introduce phase angles between current and voltage. A capacitor contains two conductors separated by an insulator. Current flow through a capacitor is directly related to the derivative of voltage with respect to time, as follows:

$$i(t) = C \frac{dV}{dt}, \tag{1.12}$$

where:
$C$ is capacitance, measured in farads.

Capacitance will depend on the materials composing the insulator. From (1.12), we see that instantaneous current will be zero when voltage is at its maximum or minimum values, and will reach its maximum and minimum values at the inflection points for voltage. Graphically, the relationship between voltage and current in a capacitor is as shown below:

[Figure 1.4: Instantaneous AC Current and Voltage in a Purely Capacitive Circuit. Axes: V, I versus time. Curves: instantaneous voltage, instantaneous current.]

In a capacitor, voltage lags current by 90° (equivalently, current leads voltage by 90°), and thus we may write:

$$v(t) = V_{\max} \sin(\theta - 90^\circ).$$
Note the difference between a resistor versus an inductor or capacitor. A resistor simply opposes the flow of current, like sand opposes the flow of water. An inductor or capacitor, however, does not oppose a constant flow of current. Inductors oppose changes in current flow, while capacitors oppose changes in voltage, both at angles of 90° in magnitude. The name for the opposition created by inductors and capacitors is reactance, denoted by the symbol $X$ and measured in ohms. For an inductor, we denote reactance by a vector 90° out of phase with current, creating a 90° phase angle between voltage and current, as shown in Figure 1.5.

[Figure 1.5: Geometric Representation of Reactance. Labels: Reactance, V, I, 90°.]
In general, electrical circuits will not simply contain inductors or capacitors or resistors, but some combination of the three. Therefore, real-world electricity systems will usually be characterized by phase angles of other than 0° or 90°. To determine the angles for such systems, we add the vectors for reactance and resistance, arriving at a quantity called impedance, denoted by $Z$ and measured in ohms. We demonstrate a circuit with a 4-ohm resistor and a 3-ohm inductor in Figure 1.6, below:

[Figure 1.6: Geometric Representation of Impedance. Labels: R = 4, X = 3, Z = 5, and the phase angle.]

Impedance is the vector sum of resistance and reactance. A circuit with a 4 Ω resistor and a 3 Ω inductor will thus have total impedance equal to 5 Ω.
While we have correctly calculated impedance as 5 Ω, it is more common to denote impedance not only by its magnitude, but by its direction as well. We can do this by referring to either polar or rectangular notation. Polar notation uses the vector compass to express the variable of interest's quantity in terms of magnitude and direction (that is, phase angle). To compute the phase angle for impedance, we simply calculate the angle corresponding to this quantity. We express impedance in this case as:

$$Z = \sqrt{4^2 + 3^2} \text{ at phase angle } \phi, \quad \text{or} \quad Z = 5 \text{ at } [\cos^{-1}(0.8)]^\circ = 5 \text{ at } 36.87^\circ.$$

Thus, we would express this example's impedance in polar notation as:

$$Z = 5 \angle 36.87^\circ \tag{1.13}$$

(alternatively, the combined effect of the resistor and inductor is to produce an impedance of 5 Ω at 36.87°).

Notice that no information is lost in switching to polar notation. We may go the other way to calculate the resistive and reactive components of impedance as follows:

$$\text{Resistance} = 5 \cos(36.87^\circ) = 4,$$
$$\text{Reactance} = 5 \sin(36.87^\circ) = 3.$$
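The round trip between the two descriptions of the 4 Ω / 3 Ω impedance can be reproduced with Python's standard `cmath` module, which represents the pair (resistance, reactance) as a complex number. This is an illustrative sketch; the variable names are chosen here.

```python
import cmath
import math

Z = complex(4.0, 3.0)   # resistance 4 ohms on the real part, reactance 3 ohms on the imaginary part

magnitude, angle_rad = cmath.polar(Z)   # magnitude and phase angle (in radians)
angle_deg = math.degrees(angle_rad)

# Going the other way: recover resistance and reactance from magnitude and angle.
resistance = magnitude * math.cos(angle_rad)
reactance = magnitude * math.sin(angle_rad)

print(magnitude, round(angle_deg, 2))            # 5.0 36.87
print(round(resistance, 10), round(reactance, 10))
```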
This yields a second way to express impedance: in terms of resistance plus reactance. We do this using rectangular notation. To switch from polar to rectangular notation graphically, we switch from using the vector compass to representing the horizontal axis as real units, and the vertical axis as imaginary units. In doing so, we place resistance (to real power) on the horizontal axis and reactance on the vertical, as shown below:

[Figure 1.7: Impedance in Rectangular Form. Labels: R = 4 (real axis), X = 3j (imaginary axis), Z = 5.]

Thus, we write $Z = 4 + 3j$ in rectangular form.[12]
Having introduced impedance, we may compute power in a mixed resistive/reactive circuit. Let us then draw a circuit with a 4-ohm resistor (R) and a 3-ohm inductor (L) (alternatively, reactor):

[Figure 1.8: AC Circuit with Resistance and Reactance. Labels: AC voltage source, Resistor R = 4 Ω∠0°, Inductor L = 3 Ω∠90°.]

The source symbol in the figure represents an AC voltage source. As is standard, we will assign a phase angle of 0° to the voltage source, with the phase angle and magnitude of impedance given by (1.13). Given total voltage and impedance, we apply Ohm's law as applied to AC circuits ($V = IZ$) to calculate current.[13]

$$I = \frac{V}{Z} = \frac{20\,\text{V} \angle 0^\circ}{5\,\Omega \angle 36.87^\circ} = 4\,\text{A} \angle -36.87^\circ$$

From here, calculating the remaining voltages is straightforward (answers in Table 1.2, below).

Table 1.2: V, I, and Z Calculations

      R               L               Total
V     16∠−36.87°      12∠+53.13°      20∠0°
I     4∠−36.87°       4∠−36.87°       4∠−36.87°
Z     4∠0°            3∠90°           5∠36.87°

[12] In electrical engineering, $\sqrt{-1}$ is commonly denoted as $j$, instead of $i$. The change in axes accounts for the (initially) confusing switch in the formulas for voltage and current from $v(t) = V_{\max} \sin\theta$ to $v(t) = V_{\max} \cos\theta$. The real quantity is now the adjacent, not the opposite.
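The entries of Table 1.2 can be reproduced with complex arithmetic. The sketch below uses Python's `cmath` module and treats each phasor as a complex number; the helper `to_polar_degrees` is a name introduced here for illustration.

```python
import cmath
import math

def to_polar_degrees(z):
    """Return (magnitude, phase angle in degrees) of a complex phasor."""
    magnitude, angle = cmath.polar(z)
    return magnitude, math.degrees(angle)

V_source = cmath.rect(20.0, 0.0)             # 20 V at 0 degrees (reference phasor)
Z_R = cmath.rect(4.0, 0.0)                   # resistor: 4 ohms at 0 degrees
Z_L = cmath.rect(3.0, math.radians(90.0))    # inductor: 3 ohms at 90 degrees
Z_total = Z_R + Z_L                          # series impedances add

I = V_source / Z_total    # Ohm's law for AC circuits, V = I Z
V_R = I * Z_R             # voltage drop across the resistor
V_L = I * Z_L             # voltage drop across the inductor

for name, phasor in [("I", I), ("V_R", V_R), ("V_L", V_L), ("Z", Z_total)]:
    mag, deg = to_polar_degrees(phasor)
    print(name, round(mag, 2), round(deg, 2))

kvl_residual = abs(V_R + V_L - V_source)   # KVL: the two drops sum to the source voltage
print(kvl_residual)
```

The final line confirms that the two voltage drops add (as phasors, not as magnitudes) to the 20-volt source.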

Finally, let us move on to AC power calculation. Real power is the rate at which energy is expended. According to Grainger and Stevenson (1994), reactive, or imaginary power, expresses the flow of energy alternately toward the load and away from the load.[14] Intuitively, we can divide a power generator's (or simply, generator's) output into power capable of doing work (real power), and power not capable of doing work (reactive power). Reactive power is the result of current moving out of phase with voltage. The greater the phase angle between voltage and current, the less efficient is power output in terms of capability to do work, and the greater is reactive power. Apparent, or complex, power is the (geometric) sum of real and reactive power. Thus, we derive the power triangle in the same manner as we derived the relationship between resistance, reactance, and impedance:

[Figure 1.9: Complex Power (the Power Triangle). Labels: Real Power P = P + j0 (real axis), Reactive Power Q = 0 + jQ (imaginary axis), Complex Power S = P + jQ.]

[13] Again, since there is only one pathway for current, the amperage is constant across the entire circuit. The angle, −36.87°, is the amount by which current lags voltage.

[14] p. 8. The exact nature of reactive power is not well understood.
Calculating complex power in rectangular notation is straightforward: simply take real power plus reactive power equals complex power, or $S = P + jQ$.

Calculating complex power in polar form is a bit more complicated. Taking the phasors for voltage and current as $V = |V| \angle \alpha$ and $I = |I| \angle \beta$, the formula for complex power is:

$$V I^* = |V I| \angle (\alpha - \beta),$$

where $^*$ denotes the complex conjugate (as we will demonstrate shortly). Note that the angle, $\alpha - \beta$, is once again the phase angle between voltage and current, as may be verified by designating voltage as the reference phasor.

To see why we take the complex conjugate of current (i.e., we switch the sign on the angle for current), let us turn to Appendix A eqs. (1A.5) and (1A.6) for reactive and complex power:

$$Q = |I|^2 X,$$
$$S = |I|^2 Z.$$

Dividing (1A.5) by (1A.6):

$$\frac{Q}{S} = \frac{X}{Z}.$$

That is, the sine of the phase angle for impedance is equal to that of the power triangle, and thus the phase angles themselves are equal. Since the phase angle for impedance is $\phi$, the phase angle for the power triangle is this quantity as well. But we have to switch the sign on the current phasor to maintain this result.
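The role of the complex conjugate can be seen by reusing the 20-volt, $4 + 3j$ ohm circuit from Table 1.2. The sketch below is illustrative only; it compares $V I^*$ with $|I|^2 Z$ and shows what happens if the conjugate is omitted.

```python
import cmath
import math

V = cmath.rect(20.0, 0.0)   # voltage as the reference phasor, 20 V at 0 degrees
Z = complex(4.0, 3.0)       # impedance 4 + 3j ohms
I = V / Z                   # current phasor, 4 A at about -36.87 degrees

S = V * I.conjugate()                 # complex power, S = V I*
S_from_impedance = abs(I) ** 2 * Z    # S = |I|^2 Z, as in (1A.6)
without_conjugate = V * I             # what we would get without switching the sign

angle_S_deg = math.degrees(cmath.phase(S))
angle_Z_deg = math.degrees(cmath.phase(Z))
angle_wrong_deg = math.degrees(cmath.phase(without_conjugate))

print(S, S_from_impedance)            # both approximately 64 + 48j
print(round(angle_S_deg, 2), round(angle_Z_deg, 2), round(angle_wrong_deg, 2))
```

With the conjugate, the power triangle and the impedance triangle share the same angle; without it, the angle comes out with the opposite sign.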
Euler's identity, $e^{j\phi} = \cos\phi + j \sin\phi$, offers further insight into both rectangular and polar representation of electrical quantities such as impedance and power. Resistance is equal to $|Z| \cos\phi$, and reactance is equal to $|Z| \sin\phi$. If we denote the horizontal axis as the real number line, and the vertical axis as the imaginary number line, we may express impedance as:

$$Z = |Z| \cos\phi + j |Z| \sin\phi = |Z| e^{j\phi},$$

and complex power as:

$$S = |I|^2 |Z| e^{j\phi}.$$
1.3 DC Network Calculations
1.3.1 Nodal Voltages and Current Flows
Having demonstrated the basics of voltage, current, resistance, and power, we may demonstrate the calculation of these values in circuit analysis. We start with a DC circuit, then move on to its AC counterpart. First let us specify the network itself. We will examine an elementary circuit, called a ladder network, as shown below:[15]

[Figure 1.10: A Five-node Ladder Network. Labels: current sources I_1 and I_4; resistors R_a, R_b, R_c, R_d, R_e, R_f, R_g; node 0.]

A network is composed of transmission lines, or branches. The junctions formed when two or more transmission lines (more generally, circuit elements) are connected with each other are called nodes (e.g., in the figure above, line 12 connects node 1 with node 2). In the above example, then, there are seven transmission lines [12, 23, 34, 01, 02, 03, and 04] connecting five nodes [0, 1, 2, 3, 4].

[15] Taken from Baldick (2006, p. 163 et seq.).
Denote:
$N$ = Number of nodes in a transmission system,
$k$ = Number of lines in a transmission system,
$R_{ij}$ = Resistance of transmission line connecting nodes $i$ and $j$; or
$R_\ell$ = Resistance of transmission line $\ell$,
$Y_\ell$ = Admittance of transmission line $\ell$,
$I_j$ = Current source, located at node $j$,
$I_{ij}$ = Current flowing over line $ij$,
$V_j$ = Voltage at node $j$,
$V_{ij}$ = Voltage differential between nodes $i$ and $j$ (also known as the voltage drop across line $ij$).
Note that all of the above network's $N = 5$ nodes are interconnected, directly or indirectly, by transmission lines and are subject to Kirchhoff's laws. Since the (directional) voltage drops across the system must sum to zero, satisfaction of KVL takes away one degree of freedom in the system. Thus we need write voltage equations for only $N - 1$ nodes to fully identify an $N$-node network. KVL also allows us to choose one node as the datum (or ground) node, and set voltage at this node equal to zero. The researcher is free to choose any node as datum. To minimize computational cost, one generally chooses as datum the node with the most transmission lines connected to it. It is customary to denote the datum node as node 0.

There are two current sources in this circuit, $I_1$ and $I_4$. The current sources are generators located at nodes 1 and 4. These generators are connected to the transmission grid by lines 01 and 04, or, alternatively, lines a and g. These lines are generally denoted limited interconnection facilities.[16] In the electrical engineering literature, such lines are known as shunt elements.[17] The ladder network above might represent, say, a transmission line stretching across New York State, where node 1 represents Buffalo, node 2 represents Rochester, node 3 represents Syracuse, and node 4 represents Albany. Node 0 then, actually represents four physically distinct locations, where generators at or around these four cities are connected to the transmission grid by limited transmission interconnection facilities. Again, in standard electrical engineering parlance, node 0 is the ground node.

[16] Generally, limited interconnection facilities' sole purpose is to interconnect power plants to the grid. Such facilities are not required to provide open access to customers wishing to transmit power over the former. However, the distinction between a limited interconnection facility and a transmission line which must provide open access (that is, a facility which must submit an Open Access Transmission Tariff to the FERC) can become blurry when the limited interconnection facility is, say, a 40+ mile long line. This was the case with the Sagebrush line (see, e.g., the case filings before the Federal Energy Regulatory Commission in docket numbers ER09-666-000 and its progeny (e.g., ER09-666-001)). Two relevant orders in these cases are found in 127 FERC ¶ 61,243 (2009) and 130 FERC ¶ 61,093 (2010).

[17] See, e.g., Glover et al. (2012), section 4.11, and Grainger and Stevenson (1994), chapter 6.
Note that one may label the resistance associated with a particular transmission line by naming the nodes the line connects, or simply assigning the line its own, alternative, subscript. Resistance in transmission lines is the opposition to current as current bumps into the material composing the transmission lines. Good conductors, or materials that offer relatively little resistance, are ideal candidates for transmission lines. Metals with free electrons, like copper and aluminum, make good conductors as they provide little resistance.
Before we analyze the equations corresponding to Kirchhoff's laws, let us introduce the concept of admittance. Admittance measures the ease with which electrons flow through a circuit's elements, and is the inverse of impedance.[18] Since there are no reactive components in our example, though, impedance and resistance are equivalent. We thus loosen our terminology and let admittance denote the inverse of resistance. Labeling admittance by $Y$, we thus have $Y = \frac{1}{R}$. Ohm's law is then:

$$I = \frac{1}{R} V = Y V.$$
Ohm's law allows us to write current flow along a particular line as a product of the voltage drop between the two nodes that the line connects and the admittance of the line. It thus yields the flow of current across the $k = 7$ transmission lines in our example. To start, we write current flow along line 10, $I_{10}$, as:

$$I_{10} = V_{10} Y_{10} = (V_1 - V_0) Y_{10} = V_1 Y_{10} \tag{1.14}$$

(since $V_0 = 0$). We write current flow across the other six lines analogously:

$$I_{12} = (V_1 - V_2) Y_{12}, \tag{1.15}$$
$$I_{20} = V_2 Y_{20}, \tag{1.16}$$
$$I_{23} = (V_2 - V_3) Y_{23}, \tag{1.17}$$
$$I_{30} = V_3 Y_{30}, \tag{1.18}$$
$$I_{34} = (V_3 - V_4) Y_{34}, \tag{1.19}$$
$$I_{40} = V_4 Y_{40}. \tag{1.20}$$

[18] The inverse of resistance is actually conductance.
KCL expresses the conservation of current at the $(N - 1) = 4$ nodes. That is, KCL states that the sum of the currents entering a node equals the sum of currents exiting that node. We will use the convention that only current sources (generators) produce current entering a node, while all transmission lines carry current away from that node. Let us begin with node 1. Current $I_1$ enters node 1, while transmission lines 10 and 12 carry current away from this node. Thus KCL implies that $I_1 = I_{10} + I_{12}$. Note that $I_{ij}$ denotes flow of current from node $i$ to node $j$. If the actual current flow is in the opposite direction, i.e., from node $j$ to node $i$, we write $I_{ij} < 0$. Otherwise, $I_{ij} \ge 0$. The important point is that we specify the assumed direction of current flow by the ordering of the subscripts.

Because we do not incorporate load in this example,[19] we will have $I_1 \ge 0$. $I_1 = 0$ iff the current source is not currently operational, as when a generator is shut down for maintenance. $I_{10}$ and $I_{12}$ may be either positive or negative, depending on the direction of current flow. Substituting (1.14) and (1.15) for $I_{10}$ and $I_{12}$, respectively, yields:

$$I_1 = V_1 Y_{10} + (V_1 - V_2) Y_{12}.$$

Grouping the voltage terms yields:

$$I_1 = (Y_{10} + Y_{12}) V_1 - Y_{12} V_2. \tag{1.21}$$
There is no current source at node 2, simply three transmission lines which, by convention, carry current away from this node. Our three current equations for lines 21, 20, and 23 are then, respectively:

$$Y_{21}(V_2 - V_1) = I_{21},$$
$$Y_{20}(V_2) = I_{20},$$
$$Y_{23}(V_2 - V_3) = I_{23}.$$

Let us remark at this point that $Y_{ij} = Y_{ji}$, i.e., admittance is not a directional value.[20] Applying KCL to node 2 yields $I_{21} + I_{20} + I_{23} = 0$,[21] or:

$$-Y_{21} V_1 + (Y_{21} + Y_{20} + Y_{23}) V_2 - Y_{23} V_3 = 0. \tag{1.22}$$

The alert reader will notice that node 3 is similar to node 2, in that there is no current source and three branches emanating there. Thus KCL generates:

$$-Y_{32} V_2 + (Y_{32} + Y_{30} + Y_{34}) V_3 - Y_{43} V_4 = 0. \tag{1.23}$$

[19] If there were a load located at node 1, then current flow from node 1 would be negative (i.e., net current would flow to node 1, rather than away from it) whenever node 1 load is greater than node 1 generation.

[20] This is a likely explanation for the single-subscript notation for resistance and admittance common in EE texts.

[21] Note that this equation does not imply that no current flows through node 2. It states that current traveling toward node 2, which takes on a negative sign, equals the amount of current flowing out of node 2 (which has a positive sign), thus upholding the principle of conservation of current.
Finally, our treatment of node 4 is symmetric to node 1, producing:

$$-Y_{43} V_3 + (Y_{43} + Y_{40}) V_4 = I_4. \tag{1.24}$$
Note that (1.21)-(1.24) form a system of four equations in four unknowns, which we express in matrix form below (after switching to single subscript notation for line admittances, for expositional ease):

$$\begin{bmatrix} Y_a + Y_b & -Y_b & 0 & 0 \\ -Y_b & Y_b + Y_c + Y_d & -Y_d & 0 \\ 0 & -Y_d & Y_d + Y_e + Y_f & -Y_f \\ 0 & 0 & -Y_f & Y_f + Y_g \end{bmatrix} \begin{bmatrix} V_1 \\ V_2 \\ V_3 \\ V_4 \end{bmatrix} = \begin{bmatrix} I_1 \\ 0 \\ 0 \\ I_4 \end{bmatrix}. \tag{1.25}$$

The LHS matrix is known as the admittance matrix. The (square) admittance matrix is usually denoted as $A \in \mathbb{R}^{n \times n}$, with generic element $A_{ij}$. $V \in \mathbb{R}^n$ and $I \in \mathbb{R}^n$ are the vectors of unknown voltages and known current injections at each of the network's nodes, respectively.
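The pattern in (1.25) generalizes: each line adds its admittance to the diagonal entries of the nodes it touches and subtracts it from the corresponding off-diagonal entries, with the datum node dropped. The sketch below is one possible way to assemble such a matrix. The function name, the line-list format, and the unit resistances used in the example call are all assumptions made here for illustration.

```python
def build_admittance_matrix(n_nodes, lines):
    """Assemble the (n_nodes - 1) x (n_nodes - 1) admittance matrix.

    lines is a list of (i, j, resistance) tuples; node 0 is the datum node
    (voltage fixed at zero), so its row and column are omitted.
    """
    size = n_nodes - 1
    A = [[0.0] * size for _ in range(size)]
    for i, j, resistance in lines:
        y = 1.0 / resistance                 # admittance is the inverse of resistance
        for a, b in ((i, j), (j, i)):        # admittance is not directional
            if a != 0:
                A[a - 1][a - 1] += y         # diagonal: sum of admittances at node a
                if b != 0:
                    A[a - 1][b - 1] -= y     # off-diagonal: minus the line admittance
    return A

# Ladder network lines a through g, here with hypothetical unit resistances.
ladder_lines = [(1, 0, 1.0), (1, 2, 1.0), (2, 0, 1.0), (2, 3, 1.0),
                (3, 0, 1.0), (3, 4, 1.0), (4, 0, 1.0)]
A = build_admittance_matrix(5, ladder_lines)
for row in A:
    print(row)
```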
Given values for admittances and current injections, we may solve for nodal voltages and current flows over each line. As the simplest example, consider resistances of one unit for each resistor in the circuit and current injections of one unit at both nodes 1 and 4. In this case,

$$Y_i = \frac{1}{R_i} = 1 \quad \forall\, i = a, b, \ldots, g.$$

The system to be solved is given in matrix form below:

$$\begin{bmatrix} 2 & -1 & 0 & 0 \\ -1 & 3 & -1 & 0 \\ 0 & -1 & 3 & -1 \\ 0 & 0 & -1 & 2 \end{bmatrix} \begin{bmatrix} V_1 \\ V_2 \\ V_3 \\ V_4 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix}, \tag{1.26}$$

yielding $V^\star = \begin{bmatrix} \frac{2}{3} & \frac{1}{3} & \frac{1}{3} & \frac{2}{3} \end{bmatrix}$.
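The solution of (1.26) can be reproduced exactly with rational arithmetic. The sketch below uses a small Gauss-Jordan elimination written for this illustration (it is not the solution method of the text, and it assumes the matrix is nonsingular); `solve_linear_system` is a name chosen here.

```python
from fractions import Fraction

def solve_linear_system(A, b):
    """Solve A x = b by Gauss-Jordan elimination using exact fractions.

    Assumes A is square and nonsingular.
    """
    n = len(A)
    # Augmented matrix [A | b] with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot and swap it up.
        pivot_row = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot_row] = M[pivot_row], M[col]
        pivot_value = M[col][col]
        M[col] = [x / pivot_value for x in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[-1] for row in M]

A = [[2, -1, 0, 0],
     [-1, 3, -1, 0],
     [0, -1, 3, -1],
     [0, 0, -1, 2]]
injections = [1, 0, 0, 1]

V = solve_linear_system(A, injections)
print([str(v) for v in V])   # ['2/3', '1/3', '1/3', '2/3']
```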
We may now solve for current flows across all lines in the network, using (1.14)–(1.20):
\[
\begin{aligned}
I_{10} &= Y_a V_1 = \tfrac{2}{3}, \\
I_{12} &= Y_b (V_1 - V_2) = \tfrac{1}{3}, \\
I_{20} &= Y_c V_2 = \tfrac{1}{3}, \\
I_{23} &= Y_d (V_2 - V_3) = 0, \\
I_{30} &= Y_e V_3 = \tfrac{1}{3}, \\
I_{34} &= Y_f (V_3 - V_4) = -\tfrac{1}{3}, \\
I_{40} &= Y_g V_4 = \tfrac{2}{3}
\end{aligned}
\]
(all currents in amps). The negative sign on $I_{34}$ means that current is flowing in the opposite direction from the one we assumed (i.e., current is flowing from node 4 to node 3, not from node 3 to node 4). One may easily check that KCL is indeed satisfied at the four nodes.
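The KCL check mentioned above can be sketched numerically. The snippet assumes NumPy and unit admittances, recomputes the nodal voltages, and then forms each line current; the dictionary keys and the residual vector are illustrative names, not notation from the text.

```python
import numpy as np

# Ladder-circuit admittance matrix (all admittances equal to one) and injections.
A = np.array([
    [ 2.0, -1.0,  0.0,  0.0],
    [-1.0,  3.0, -1.0,  0.0],
    [ 0.0, -1.0,  3.0, -1.0],
    [ 0.0,  0.0, -1.0,  2.0],
])
injections = np.array([1.0, 0.0, 0.0, 1.0])
V1, V2, V3, V4 = np.linalg.solve(A, injections)

Y = 1.0  # common admittance of every element
currents = {
    "I10": Y * V1,
    "I12": Y * (V1 - V2),
    "I20": Y * V2,
    "I23": Y * (V2 - V3),
    "I30": Y * V3,
    "I34": Y * (V3 - V4),
    "I40": Y * V4,
}

# KCL residual at each node: current leaving minus current entering minus injection.
kcl_residual = np.array([
    currents["I10"] + currents["I12"] - injections[0],
    currents["I20"] + currents["I23"] - currents["I12"] - injections[1],
    currents["I30"] + currents["I34"] - currents["I23"] - injections[2],
    currents["I40"] - currents["I34"] - injections[3],
])
print(currents)
print(kcl_residual)
```

A residual of zero at every node confirms conservation of current; the negative value of `I34` reflects flow from node 4 toward node 3.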
1.4 A Quick DC Sensitivity
Sensitivity analysis is the study of how the value of a problem's solution, or some function of that solution, changes as we tweak the value(s) of either the solution (vector) or some other model parameter. As a quick introduction to sensitivity analysis, we examine the sensitivity of the vector of voltages in the ladder circuit problem when we change the current injected at a single node. We do this analysis to demonstrate a basic method for calculating sensitivities (e.g. a locational marginal price is an example of a sensitivity).
To calculate the sensitivity of a solution to a change in a variable $\chi$, we start with a square system of equations $Ax = b$ (in our example, $A \in \mathbb{R}^{4 \times 4}$, $x \in \mathbb{R}^{4 \times 1}$, and $b \in \mathbb{R}^{4 \times 1}$).[22] Since we are examining the sensitivity of the solution of this system, we assume such a solution exists, and denote it as $x^{\star}(0)$, or the base case solution. We denote the amount by which we wish to change a given variable as $\chi \in \mathbb{R}^{1}$. At the base case solution, none of the problem inputs have changed at all. Therefore, the base case corresponds to $\chi = 0$.[23] A non-zero value for $\chi$ will refer to a change case. For example, we may change an admittance from $Y_i$ to $Y_i + \chi$, or change a current source from $I_j$ to $I_j + \chi$. Let us denote the change-case equation as:
\[
A(\chi)\, x^{\star}(\chi) = b(\chi). \tag{1.27}
\]
That is to say, the coefficient matrix and right-hand-side vector are now dependent on (or functions of) $\chi$.
We calculate a sensitivity for $x^{\star}(0)$ as the partial derivative of the original solution vector (i.e., the base case) with respect to a change in a specific variable, the latter of which we denote as $\chi_j$. First, since the base case corresponds to no change in any variable, we will denote the base case equation as:
\[
A(0)\, x^{\star}(0) = b(0).
\]
To calculate the sensitivity of the original solution to a change in a variable, solve for the derivative of the base-case solution with respect to the variable in question. To evaluate this derivative, start by totally differentiating (1.27):
\[
\frac{\partial A}{\partial \chi_j}(0)\, x^{\star}(0)\, d\chi_j + A(0)\, \frac{\partial x^{\star}}{\partial \chi_j}(0)\, d\chi_j = \frac{\partial b}{\partial \chi_j}(0)\, d\chi_j.
\]
Since the $d\chi_j$ terms all cancel, we solve the remaining equation for $\frac{\partial x^{\star}}{\partial \chi_j}(0)$, obtaining:
\[
\frac{\partial x^{\star}}{\partial \chi_j}(0) = [A(0)]^{-1} \left[ \frac{\partial b}{\partial \chi_j}(0) - \frac{\partial A}{\partial \chi_j}(0)\, x^{\star}(0) \right].
\]
Let us return to the example shown in (1.25): the ladder network with current injections $I_1$ and $I_4$ equal to one amp each, and admittances $Y_a$–$Y_g$ equal to one unit each. We wish to calculate the sensitivities of the voltages obtained, $x^{\star} = \begin{bmatrix} \tfrac{2}{3} & \tfrac{1}{3} & \tfrac{1}{3} & \tfrac{2}{3} \end{bmatrix}^{\top}$, to a change in the node 1 current source from $I_1$ to $I_1 + \chi_j$. We show this change in (1.28), below:
\[
\begin{bmatrix}
Y_a + Y_b & -Y_b & 0 & 0 \\
-Y_b & Y_b + Y_c + Y_d & -Y_d & 0 \\
0 & -Y_d & Y_d + Y_e + Y_f & -Y_f \\
0 & 0 & -Y_f & Y_f + Y_g
\end{bmatrix}
\begin{bmatrix} V_1 \\ V_2 \\ V_3 \\ V_4 \end{bmatrix}
=
\begin{bmatrix} I_1 + \chi_j \\ 0 \\ 0 \\ I_4 \end{bmatrix}.
\tag{1.28}
\]
[22] We assume the reader is familiar with matrix algebra, so we do not go into details such as conditions for matrix invertibility.
[23] The notation is, admittedly, a bit squirrelly, since we change only one variable when performing a sensitivity analysis. And yet we alternatively speak of $A(0)$ and $A(\chi)$. Perhaps the best explanation is that as soon as we perturb one variable, designated as $\chi_j$, the original solution vector no longer holds.
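Notice that $\frac{\partial A}{\partial \chi_j} = 0$ and $\frac{\partial b}{\partial \chi_j} = \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix}^{\top}$. Finally, inverting the matrix, as found on the lhs of (1.26), and multiplying, we obtain:
\[
\frac{\partial x^{\star}}{\partial \chi_j}(0) = [A(0)]^{-1} \left[ \frac{\partial b}{\partial \chi_j}(0) \right] =
\begin{bmatrix} 0.6190 \\ 0.2381 \\ 0.0952 \\ 0.0476 \end{bmatrix}.
\]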
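The sensitivity formula can be checked numerically. The sketch below assumes NumPy, uses the unit-admittance ladder matrix with negative off-diagonal entries, and compares the analytical sensitivity with a finite-difference estimate; the names `dA`, `db`, and `step` are illustrative and do not appear in the text.

```python
import numpy as np

# Base-case system A(0) x*(0) = b(0) for the ladder circuit.
A0 = np.array([
    [ 2.0, -1.0,  0.0,  0.0],
    [-1.0,  3.0, -1.0,  0.0],
    [ 0.0, -1.0,  3.0, -1.0],
    [ 0.0,  0.0, -1.0,  2.0],
])
b0 = np.array([1.0, 0.0, 0.0, 1.0])
x_star = np.linalg.solve(A0, b0)

# Derivatives of A and b with respect to the perturbation of I_1.
dA = np.zeros((4, 4))               # the admittance matrix does not change
db = np.array([1.0, 0.0, 0.0, 0.0])  # only the node 1 injection changes

# Sensitivity: [A(0)]^{-1} (db - dA x*), computed with a linear solve.
sensitivity = np.linalg.solve(A0, db - dA @ x_star)

# Finite-difference check: perturb I_1 by a small step and re-solve.
step = 1e-6
finite_difference = (np.linalg.solve(A0, b0 + step * db) - x_star) / step

print(np.round(sensitivity, 4))
```

Because the system is linear in the injection, the finite-difference estimate agrees with the analytical sensitivity up to rounding; the exact values are 13/21, 5/21, 2/21, and 1/21.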
1.5 The DC Approximation to the AC Network: Calculating PTDFs
We started with DC network calculations to familiarize the reader with concepts used in the analysis of an AC network. Electrical engineers, intrepid folks that they are, will actually set up and solve AC power flow problems as systems of non-linear equations.[24] Energy economists generally study linearized versions of the AC power equations, though. The obvious advantage of linearizing these systems is that one need not run iterative algorithms to find the critical points for systems of non-linear equations.
This section focuses on only one element of linearized AC analysis, the calculation of PTDFs. Doing so, we will present the reader with important concepts such as the Jacobian of the power balance equations (as derived in Appendix A), noting the similarity between this matrix and the admittance matrix of the linearized AC system (derived in Appendix B). We believe that the advantage of this approach is that it is incremental. We introduce the reader to these important concepts in the context of the evaluation of a single system of equations, rather than throwing at the reader the several derivations necessary to solve the optimal power flow problem all at once.
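1.5.1 Linearizing AC Power Equations
We derive the equations for power balance and power flow across transmission lines in Appendix A. (1A.8) and (1A.9), shown below:
\[
P_{\ell} = \sum_{k=1}^{n} u_{\ell} u_k \left[ G_{\ell k} \cos(\theta_{\ell} - \theta_k) + B_{\ell k} \sin(\theta_{\ell} - \theta_k) \right],
\]
\[
Q_{\ell} = \sum_{k=1}^{n} u_{\ell} u_k \left[ G_{\ell k} \sin(\theta_{\ell} - \theta_k) - B_{\ell k} \cos(\theta_{\ell} - \theta_k) \right],
\]
are called power flow equality constraints,[25] and must be satisfied at each bus in order for the power system to maintain a constant frequency.[26] (1A.12) and (1A.13) display power flow across transmission lines as:
\[
p_{\ell k} = (u_{\ell})^2 G_{\ell\ell} + u_{\ell} u_k \left[ G_{\ell k} \cos(\theta_{\ell} - \theta_k) + B_{\ell k} \sin(\theta_{\ell} - \theta_k) \right],
\]
\[
q_{\ell k} = -(u_{\ell})^2 B_{\ell\ell} + u_{\ell} u_k \left[ G_{\ell k} \sin(\theta_{\ell} - \theta_k) - B_{\ell k} \cos(\theta_{\ell} - \theta_k) \right].
\]
[24] Well, perhaps they are not that intrepid, since available computer packages solve the problems for them!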
The linearized AC model is generally known as the DC approximation to the AC network. Thus, in this model terms such as reactive power and impedance are assumed away. We make two simplifying assumptions to transform the complex admittance matrix to its DC approximation: we set its real terms, $G_{\ell k}$, equal to zero and ignore the shunt elements (i.e., the terms corresponding to node 0, or the ground). This reduces (1A.8) and (1A.12) to:[27]
\[
P_{\ell} = \sum_{k=2}^{n} u_{\ell} u_k \left[ B_{\ell k} \sin(\theta_{\ell} - \theta_k) \right], \tag{1.29}
\]
\[
p_{\ell k} = u_{\ell} u_k \left[ B_{\ell k} \sin(\theta_{\ell} - \theta_k) \right]. \tag{1.30}
\]
We finish linearizing these equations by assuming that the base-case solution (the point of reference for our derivatives) involves zero net power flow at nodes 2–n, and all voltage magnitudes equal to one per unit, so that $u^{(0)} = \mathbf{1}$. This reduces (1.29) and (1.30) to:
\[
P_{\ell} = \sum_{k=2}^{n} B_{\ell k} \sin(\theta_{\ell} - \theta_k), \tag{1.31}
\]
\[
p_{\ell k} = B_{\ell k} \sin(\theta_{\ell} - \theta_k). \tag{1.32}
\]
At this point we deviate from the standard methodology by linearizing these equations about the base-case solution for (real) power. As per Baldick (2006, p. 343), we define a flat start as the state of the system corresponding to phase angles equal to zero at all nodes and all voltage magnitudes equal to one (which, in fact, is the base-case solution to the problem when we ignore the circuit's shunt elements).[28] As we will see shortly, the model is now linear at the base-case solution, because
\[
\left. \frac{\partial \sin\theta}{\partial \theta} \right|_{\theta = 0} = 1.
\]
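To see why linearizing about a flat start is reasonable, the sketch below compares the flow $B_{\ell k}\sin(\theta_{\ell} - \theta_k)$ from (1.32) with its first-order approximation $B_{\ell k}(\theta_{\ell} - \theta_k)$. The susceptance of 1,000 and the angle differences are illustrative assumptions, and `linearization_error` is a hypothetical helper name.

```python
import math

def linearization_error(delta_theta, susceptance=1000.0):
    """Relative error of B*delta_theta measured against B*sin(delta_theta)."""
    exact = susceptance * math.sin(delta_theta)
    linear = susceptance * delta_theta
    return abs(linear - exact) / abs(exact)

# Angle differences in radians, from small to fairly large.
for delta_theta in (0.01, 0.05, 0.2):
    error = linearization_error(delta_theta)
    print(f"{delta_theta:.2f} rad: relative error {error:.2e}")
```

The error grows roughly with the square of the angle difference, so the linear model is most trustworthy when neighboring phase angles are close together.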
[25] The power flow equality constraints express the relationship that complex power generated at bus $\ell$ is equal to voltage times (the complex conjugate of) current.
[26] We derive these equations in Appendix A by writing the complex power injected into a generic node, $\ell$, as the sum of its real, $P_{\ell}$, and reactive, $Q_{\ell}\sqrt{-1}$, components.
[28] The traditional interpretation of DC power flow emphasizes the small angle approximation to sine and cosine (i.e. $\cos(\theta_i - \theta_k) \approx 1$, $\sin(\theta_i - \theta_k) \approx (\theta_i - \theta_k)$), and the solution of DC power flow being the same as the solution of an analogous DC circuit with current sources specified by the power injections and voltages specified by the angles (see Schweppe et al. (1988), Appendix D). While our presentation differs from the standard, the reader should be able to digest the traditional method once they have solved ours.
1.5.2 Calculating PTDFs
The term PTDF, or shift factor, means the change in the flow of power across a particular transmission line, $\ell k$, induced by an incremental increase in power output at a given node. The alert reader will note that there exist N shift factor matrices, each one corresponding to the change in power flow across all lines in the system as the result of an increase in power generation at one of the system's N nodes. Mathematically, we calculate a real-power shift factor as $\frac{\partial p_{\ell k}}{\partial P_j}$. We denote this matrix as $L_j = \frac{\partial p_{\ell k}}{\partial P_{(m)j}}$. This is the matrix of incremental line flows across all of a system's lines due to an injection of power at node j and withdrawal of that power at node m, for each of the system's other nodes. Power injected and withdrawn at the same node has $\mathrm{PTDF}_{(j)j} = 0$, trivially.
We cannot take the desired partial derivative directly, because there is no explicit term for $P_j$ in (1.32). Therefore, we incorporate the chain rule to solve for the desired term based on a variable common to both (1.31) and (1.32), finding that:
\[
\frac{\partial p_{\ell k}}{\partial P_j} = \frac{\partial p_{\ell k}}{\partial \theta_i} \frac{\partial \theta_i}{\partial P_j}.
\]
We obtain the second term, of course, by inverting the Jacobian corresponding to $\frac{\partial P_j}{\partial \theta_i}$.
Differentiating these equations is now fairly straightforward. First, we shorthand equation (1.31) to express net power flow from node $\ell$ as a function of phase angles and voltage magnitudes, and the vector of net power flows as:
\[
P_{\ell} = p_{\ell}\!\begin{pmatrix} \theta \\ u \end{pmatrix}, \qquad
P = p\!\begin{pmatrix} \theta \\ u \end{pmatrix},
\]
respectively. As alluded to earlier, there are $(n-1)$ degrees of freedom in the power balance equations, because power balance indicates that one cannot specify the power balance equation at the n-th node independently of the power balance in the rest of the system. Therefore, we remove a row and a column from the power-balance Jacobian. It is customary to denote the node corresponding to the omitted row and column as the reference node. As shown below, we calculate shift factors by treating the reference node as the node at which power is withdrawn. As is also customary, we denote the reference node as node 0.[29]
[29] The only drawback of this approach is that we have already denoted the ground node as node 0, and the ground node and the reference node are not the same thing. However, since we have assumed away the shunt elements, we are not examining the ground node any longer. Thus, there should be no further confusion regarding denoting the reference node as node 0.
We will demonstrate the calculation of two different Jacobians for expositional clarity. Denote the power-balance Jacobian (nodal Jacobian) for all of the system's nodes (the full nodal Jacobian) as $J_{p\theta}$, and the Jacobian for all the system's nodes, excepting the reference node (the reduced nodal Jacobian), as $\hat{J}_{p\theta}$. The term $p\theta$ indicates that we are taking the derivatives of real power flows with respect to phase angles. When solving for PTDFs, we will use the reduced nodal Jacobian because the power-balance equations have $(n-1)$ degrees of freedom. Remembering that we are evaluating the linearized power balance equations, we write the reduced nodal Jacobian for our calculation of network shift factors as:[30]
\[
\hat{J}^{(0)}_{p\theta} = \hat{J}_{p\theta}\!\begin{pmatrix} \mathbf{0} \\ \mathbf{1} \end{pmatrix} = \frac{\partial p}{\partial \theta}\!\begin{pmatrix} \mathbf{0} \\ \mathbf{1} \end{pmatrix} \tag{1.33}
\]
(that is, we evaluate this Jacobian at a flat start).
Next we examine the Jacobian for the equations of real-power flow over the transmission lines (branch Jacobian): $K \in \mathbb{R}^{k \times (n-1)}$. This matrix has k rows, one for each of the equations for the system's transmission lines (ignoring shunt elements). Again, because the system behaves differently when power is withdrawn at each of the system's non-reference nodes, there are $(n-1)$ of these matrices, one for every node at which we may withdraw power when power is injected at a set node. We denote the branch Jacobian, evaluated at the base case of a flat start, as:[31]
\[
K^{(0)} = K\!\begin{pmatrix} \mathbf{0} \\ \mathbf{1} \end{pmatrix} = \frac{\partial p_{\ell k}}{\partial \theta}\!\begin{pmatrix} \mathbf{0} \\ \mathbf{1} \end{pmatrix}. \tag{1.34}
\]
For expositional simplicity, let us now solve (1.33) and (1.34) and the resulting shift factors for a sample network before introducing the general notation corresponding to these equations. Consider the 5-node model shown below. The value shown between each node pair is the impedance of the transmission line connecting the two nodes. Let us choose node 1 as the reference node, so that we will replace the notation "1" with "0" whenever we refer to this node.
[Figure 1.10: A Sample Network. Five nodes: node 0 (the reference node) and nodes 2, 3, 4, and 5. Lines 0–2, 0–3, 2–3, and 4–5 each have impedance 0 + j0.001; lines 2–4 and 3–5 each have impedance 0 + j0.004.]
Each line, $\ell k$, has $R_{\ell k} = 0$. That is, the real part of impedance (i.e. resistance) is zero for all lines. This corresponds to the assumption that the real component of admittance (i.e. of the inverse of impedance), the conductance, is zero as well. The complex component of impedance (i.e. reactance) is given by $X_{\ell k}\sqrt{-1}$. Invert each line's impedance to derive its admittance, denoted $Y_{\ell k}$:
\[
Y_{02} = Y_{03} = Y_{23} = Y_{45} = \frac{1}{0 + j0.001} = -j1{,}000,
\]
\[
Y_{24} = Y_{35} = \frac{1}{0 + j0.004} = -j250.
\]
Note that $Y_{\ell k} = -B_{\ell k}\sqrt{-1}$.
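For any node $\ell$, we write $\sin(\theta_{\ell} - \theta_j)$, for all nodes j connected to node $\ell$ by a transmission line. We may thus express (1.31) for each of the five nodes in this example as:
\[
\begin{aligned}
P_0 &= 1{,}000 \sin(\theta_0 - \theta_2) + 1{,}000 \sin(\theta_0 - \theta_3), \\
P_2 &= 1{,}000 \sin(\theta_2 - \theta_0) + 1{,}000 \sin(\theta_2 - \theta_3) + 250 \sin(\theta_2 - \theta_4), \\
P_3 &= 1{,}000 \sin(\theta_3 - \theta_0) + 1{,}000 \sin(\theta_3 - \theta_2) + 250 \sin(\theta_3 - \theta_5), \\
P_4 &= 250 \sin(\theta_4 - \theta_2) + 1{,}000 \sin(\theta_4 - \theta_5), \\
P_5 &= 250 \sin(\theta_5 - \theta_3) + 1{,}000 \sin(\theta_5 - \theta_4).
\end{aligned}
\]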
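Deriving the full nodal Jacobian for this example is straightforward; nonetheless, we show explicitly the elements corresponding to the first row of this matrix:[32]
\[
\begin{aligned}
J^{(0)}_{11} &= \left. \frac{\partial P_0}{\partial \theta_0} \right|_{\theta = 0} = 1{,}000 \cos(\theta_0 - \theta_2) + 1{,}000 \cos(\theta_0 - \theta_3) = 2{,}000, \\
J^{(0)}_{12} &= \left. \frac{\partial P_0}{\partial \theta_2} \right|_{\theta = 0} = 1{,}000 \cos(\theta_0 - \theta_2)(-1) = -1{,}000, \\
J^{(0)}_{13} &= \left. \frac{\partial P_0}{\partial \theta_3} \right|_{\theta = 0} = 1{,}000 \cos(\theta_0 - \theta_3)(-1) = -1{,}000.
\end{aligned}
\]
Since neither $\theta_4$ nor $\theta_5$ is an argument of $P_0$, their corresponding terms are obviously zero. Taking the remaining partial derivatives yields the matrix:
\[
\begin{bmatrix}
2{,}000 & -1{,}000 & -1{,}000 & 0 & 0 \\
-1{,}000 & 2{,}250 & -1{,}000 & -250 & 0 \\
-1{,}000 & -1{,}000 & 2{,}250 & 0 & -250 \\
0 & -250 & 0 & 1{,}250 & -1{,}000 \\
0 & 0 & -250 & -1{,}000 & 1{,}250
\end{bmatrix}.
\]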
A couple of characteristics of this matrix stand out immediately. First, notice that the diagonal terms are equal to the sum of the off-diagonal elements times minus one. Recognizing that the off-diagonal term is $J^{(0)}_{\ell k} = -B_{\ell k}$, this allows us to shorthand the terms of this Jacobian as:
\[
\frac{\partial P_{\ell}}{\partial \theta_k}\!\begin{pmatrix} \mathbf{0} \\ \mathbf{1} \end{pmatrix} =
\begin{cases}
\sum_{j \in J(\ell)} B_{\ell j} & \text{if } k = \ell, \\
-B_{\ell k} & \text{if } k \in J(\ell), \\
0 & \text{otherwise.}
\end{cases}
\]
Second, notice that the full nodal Jacobian is symmetric (this fact makes calculation of the full AC model less computationally expensive).
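The shorthand above amounts to a simple assembly rule: each line adds its susceptance to the two diagonal entries of its end nodes and subtracts it from the two off-diagonal entries. The sketch below assumes NumPy and a hypothetical `lines` dictionary holding the six line susceptances of the sample network.

```python
import numpy as np

# Line susceptances of the sample network, keyed by (from_node, to_node).
lines = {
    (0, 2): 1000.0, (0, 3): 1000.0, (2, 3): 1000.0,
    (4, 5): 1000.0, (2, 4): 250.0, (3, 5): 250.0,
}
nodes = [0, 2, 3, 4, 5]
index = {node: position for position, node in enumerate(nodes)}

# Assemble the full nodal Jacobian evaluated at a flat start.
J_full = np.zeros((len(nodes), len(nodes)))
for (node_a, node_b), susceptance in lines.items():
    i, k = index[node_a], index[node_b]
    J_full[i, i] += susceptance   # diagonal: sum of connected susceptances
    J_full[k, k] += susceptance
    J_full[i, k] -= susceptance   # off-diagonal: minus the line susceptance
    J_full[k, i] -= susceptance

print(J_full)
```

Each row sums to zero, which is why the full matrix is singular and one row and column must be dropped before inverting.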
As mentioned above, we omit the power flow equality constraint and partial derivatives corresponding to the base node in order to derive the reduced nodal Jacobian. Thus, the reduced system of equations is:
\[
\begin{aligned}
P_2 &= 1{,}000 \sin(\theta_2 - \theta_0) + 1{,}000 \sin(\theta_2 - \theta_3) + 250 \sin(\theta_2 - \theta_4), \\
P_3 &= 1{,}000 \sin(\theta_3 - \theta_0) + 1{,}000 \sin(\theta_3 - \theta_2) + 250 \sin(\theta_3 - \theta_5), \\
P_4 &= 250 \sin(\theta_4 - \theta_2) + 1{,}000 \sin(\theta_4 - \theta_5), \\
P_5 &= 250 \sin(\theta_5 - \theta_3) + 1{,}000 \sin(\theta_5 - \theta_4).
\end{aligned}
\]
We take the derivatives of these equations with respect to $\theta_2$, $\theta_3$, $\theta_4$ and $\theta_5$, obtaining:
\[
\hat{J}^{(0)} =
\begin{bmatrix}
2{,}250 & -1{,}000 & -250 & 0 \\
-1{,}000 & 2{,}250 & 0 & -250 \\
-250 & 0 & 1{,}250 & -1{,}000 \\
0 & -250 & -1{,}000 & 1{,}250
\end{bmatrix},
\]
with inverse:
\[
\left[ \hat{J}^{(0)} \right]^{-1} =
\begin{bmatrix}
0.000655 & 0.000345 & 0.000517 & 0.000483 \\
0.000345 & 0.000655 & 0.000483 & 0.000517 \\
0.000517 & 0.000483 & 0.002724 & 0.002276 \\
0.000483 & 0.000517 & 0.002276 & 0.002724
\end{bmatrix}.
\]
[32] We could really use some different subscript/superscript/parenthetical notation for the entries in this Jacobian, but what we have will have to do for now.
Notice that we do not remove the terms in power flow equality constraints $P_2$–$P_5$ corresponding to the base node, because these terms capture the physical characteristics of the network (i.e. the transmission lines between the node in question and the base node).
To calculate the branch Jacobian, K, remember that this matrix is of dimension $k \times (n-1)$. Accordingly, we will take derivatives of flows on the k = 6 transmission lines with respect to the phase angles at the $N - 1$ nodes at which power is injected.
Before taking these derivatives, however, we must determine the (reference) direction of power flow across all k lines. That is, we know that the equation for power flow across a given line is $p_{\ell k} = B_{\ell k} \sin(\theta_{\ell} - \theta_k)$, where this equation indicates that power flows from node $\ell$ to node k. We just have to figure out whether, for any given node pair, power flows from node k to $\ell$, or vice-versa. There is no one right answer to this question, though, because the direction of power flow will change whenever one changes the node at which power flows into the network (e.g., the direction of flow across line 45 will be different if one injects power at node 2, versus injecting it at node 3).
Fortunately, this observation leads us to the correct method for assigning power flow direction. We simply designate both the node at which power is withdrawn and the node at which power is injected. Since we withdraw power at the reference node, we need only choose an additional, arbitrary, node at which to inject power. From here, we assign power flows such that no physical laws (First Law of Thermodynamics, Kirchhoff's laws) are violated. So, let us arbitrarily designate node 2 as the injection node. Having specified the injection node, one may follow two shortcuts to power flow direction:
1. Power flows to the reference node along the lines directly connected to it;
2. Power flows away from the injection node along the lines directly connected to it.[33]
[33] These rules of thumb no longer hold with multiple points of injection and withdrawal, but that observation is not relevant to the present analysis.
Following these two shortcuts, we may assign directional power flows as follows:
[Figure 1.11: Directional Power Flows. Assumed flow directions on the sample network: node 2 to node 0, node 3 to node 0, node 2 to node 3, node 2 to node 4, node 4 to node 5, and node 5 to node 3.]
The first observation tells us that power flows from nodes 2 and 3 to node 0. Load at node 0 draws power along lines 20 and 30. The second observation tells us that power flows away from node 2, to nodes 0, 3, and 4. All that remains is to determine directional flows along the lines connected to node 5. Kirchhoff's Current Law tells us that the sum of currents entering a node equals the sum of currents exiting that node. Since no current (or the power corresponding to it) is being withdrawn at node 4, all of the power entering node 4 will flow to node 5, indicating the direction of flow on this line. The same reasoning applies at node 5, so power flows from node 5 to node 3. Note that KCL also tells us that the current flowing from node 3 to node 0 is equal to the sum of currents flowing from nodes 2 and 5 to node 3.
Using the directional power flows indicated above, we write the six equations for power flow as:
\[
\begin{aligned}
p_{20} &= 1{,}000 (\theta_2 - \theta_0), \\
p_{30} &= 1{,}000 (\theta_3 - \theta_0), \\
p_{23} &= 1{,}000 (\theta_2 - \theta_3), \\
p_{24} &= 250 (\theta_2 - \theta_4), \\
p_{53} &= 250 (\theta_5 - \theta_3), \\
p_{45} &= 1{,}000 (\theta_4 - \theta_5).
\end{aligned}
\]
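Taking the required derivatives across all six equations (rows ordered as lines 20, 30, 23, 24, 53, and 45; columns as $\theta_2$, $\theta_3$, $\theta_4$, $\theta_5$) yields:
\[
K\!\begin{pmatrix} \mathbf{0} \\ \mathbf{1} \end{pmatrix} =
\begin{bmatrix}
1{,}000 & 0 & 0 & 0 \\
0 & 1{,}000 & 0 & 0 \\
1{,}000 & -1{,}000 & 0 & 0 \\
250 & 0 & -250 & 0 \\
0 & -250 & 0 & 250 \\
0 & 0 & 1{,}000 & -1{,}000
\end{bmatrix}.
\]
We shorthand this matrix as:
\[
K_{(ij)\ell} =
\begin{cases}
B_{ij} & \text{if } \ell = i, \\
-B_{ij} & \text{if } \ell = j, \\
0 & \text{otherwise.}
\end{cases}
\]
Multiplying the two matrices, $K^{(0)} \left[ \hat{J}^{(0)} \right]^{-1}$, yields the shift factor matrix (H) below:
\[
H\!\begin{pmatrix} \mathbf{0} \\ \mathbf{1} \end{pmatrix} =
\begin{array}{l|rrrr}
 & P_2 & P_3 & P_4 & P_5 \\ \hline
\text{line } 20 & 0.6552 & 0.3448 & 0.5172 & 0.4828 \\
\text{line } 30 & 0.3448 & 0.6552 & 0.4828 & 0.5172 \\
\text{line } 23 & 0.3103 & -0.3103 & 0.0345 & -0.0345 \\
\text{line } 24 & 0.0345 & -0.0345 & -0.5517 & -0.4483 \\
\text{line } 53 & 0.0345 & -0.0345 & 0.4483 & 0.5517 \\
\text{line } 45 & 0.0345 & -0.0345 & 0.4483 & -0.4483
\end{array}
\]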
To interpret this matrix, observe that column values indicate the source of the power, that is, the node at which power is generated (as well as the phase angle corresponding to this node).[34] The row value corresponds to the line over which this power flows. Therefore, the columns of the matrix yield the PTDFs for every line in the network when power is injected at the node corresponding to that column and withdrawn at the reference node. Reading down the first column, then, 65.52 percent of the power generated at node 2 flows directly to node 0. The remaining 34.48 percent flows in a more circuitous fashion along the network's other lines. 31.03 percent flows from node 2 through node 3 to node 0, while 3.45 percent flows around the horn, from node 2 to node 4 to node 5 to node 3 and on to node 0. As mentioned above, the amount exiting node 3, 34.48 percent of the increment of power generated, is equal to the sum of the flows entering that node: 31.03 percent, plus 3.45 percent.
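The full shift factor calculation for the sample network can be sketched in a few lines. The snippet assumes NumPy, orders the rows of the branch Jacobian as lines 20, 30, 23, 24, 53, and 45 (following the six flow equations), and restores the negative signs implied by those equations; the variable names are illustrative.

```python
import numpy as np

# Reduced nodal Jacobian at a flat start (columns: theta_2, theta_3, theta_4, theta_5).
J_reduced = np.array([
    [ 2250.0, -1000.0,  -250.0,     0.0],
    [-1000.0,  2250.0,     0.0,  -250.0],
    [ -250.0,     0.0,  1250.0, -1000.0],
    [    0.0,  -250.0, -1000.0,  1250.0],
])

# Branch Jacobian at a flat start, one row per directed line flow.
K = np.array([
    [1000.0,     0.0,    0.0,     0.0],   # p_20 = 1000 (theta_2 - theta_0)
    [   0.0,  1000.0,    0.0,     0.0],   # p_30 = 1000 (theta_3 - theta_0)
    [1000.0, -1000.0,    0.0,     0.0],   # p_23 = 1000 (theta_2 - theta_3)
    [ 250.0,     0.0, -250.0,     0.0],   # p_24 =  250 (theta_2 - theta_4)
    [   0.0,  -250.0,    0.0,   250.0],   # p_53 =  250 (theta_5 - theta_3)
    [   0.0,     0.0, 1000.0, -1000.0],   # p_45 = 1000 (theta_4 - theta_5)
])

# Shift factors: chain rule, dp/dP = (dp/dtheta) (dP/dtheta)^{-1}.
H = K @ np.linalg.inv(J_reduced)
print(np.round(H, 4))
```

In the first column, the flows entering node 3 over lines 23 and 53 add up to the flow leaving it over line 30, and the three flows leaving node 2 add up to the full unit injected there.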
34
Even though we assume that no power is generated at nodes 1-3, we may, in principle, calculate shift
factors for any node in a transmission system.
1.A Power Equation Derivation
When current and voltage are in phase, an electrical system produces (real) power output at maximum efficiency. A graphical analysis is illustrative. Consider first the relationship between power, current, and voltage for a purely resistive system.
[Figure 1A.1: Current, Voltage, and Power in a Purely Resistive System. Instantaneous voltage, instantaneous current, and instantaneous power (vertical axis: P, I, E) plotted against time.]
As shown in Figure 1A.1, above, when current and voltage are in phase, power is strictly nonnegative because voltage is never positive when current is negative, and vice-versa. As voltage and current move out of phase, though, this synchronicity vanishes. Consider the inductor example from Figure 1A.2, with instantaneous power added to instantaneous current and voltage.
[Figure 1A.2: Current, Voltage, and Power in a Purely Inductive System. Instantaneous voltage, instantaneous current, and instantaneous power (vertical axis: P, I, E) plotted against time.]
Since voltage and current are 90° out of phase for a purely inductive circuit, half of the time voltage and current will be of the same sign, while they will be of opposite sign the other half of the time. This causes power in this circuit to alternate in cycles of positive and negative values. Positive power means that load in the system is receiving (absorbing) power from the generation source(s), while negative power means that power is actually being returned to the generation source. This means that reactive components (inductors and capacitors) dissipate zero power, as they equally absorb power from, and return power to, the rest of the circuit.[35]
Adding capacitance into the picture, of course, means that the circuit's phase angle will decrease, moving voltage and current closer into phase and allowing for positive power output, as shown graphically in Figure 1A.3 below:
[Figure 1A.3: Current, Voltage, and Power in a Resistive/Reactive Circuit. Instantaneous voltage, instantaneous current, and instantaneous power (vertical axis: P, I, E) plotted against time.]
In a circuit characterized by both resistance and reactance, voltage and current are out of phase by less than |90°|, and power cycles between positive and negative values, but net power output is positive.
When calculating power in circuits with reactive and resistive elements, we apply Ohm's law. Once again denoting reactance by X, resistance by R, impedance by Z, current by I, and voltage by E, we have the following three expressions for Ohm's law:
E = IR (circuit with resistive elements only),
E = IX (circuit with reactive elements only),
E = IZ (circuit with resistive and reactive elements).
Note that the basic relationship, voltage equals current times resistance, is unchanged. The point is simply that the nature of resistance varies with the different resistance sources. Since the basic relationship for power is unchanged as well, power for resistive, reactive, and resistive plus reactive circuits is, respectively:
\[
P = |I|^2 R, \qquad Q = |I|^2 X, \qquad S = |I|^2 Z, \tag{1A.1}
\]
where:
P is real power, measured in watts,
Q is reactive power, measured in volt-amperes reactive (VAR), and
S is apparent power, measured in volt-amperes (VA).
[35] For further reading, see Rau (2003), Appendix A, and Mittle and Mittal, Ch. 6–8.
Just as we expressed the relationship between resistance, reactance, and impedance with a right triangle, we express the relationships between different power elements with the power triangle. Consider a circuit containing a 60 volt power source, a capacitor with a reactance of 30 Ω, and a 20 Ω resistor. In order to calculate P, Q, and S for this circuit, we must calculate the current flowing through it. Impedance is simply:
\[
Z = \sqrt{R^2 + X^2} = 36.06\ \Omega.
\]
Therefore, current is:
\[
I = \frac{E}{Z} = 1.664 \text{ amps}.
\]
Real power is:
\[
P = I^2 R = 55.4 \text{ watts},
\]
reactive power is:
\[
Q = I^2 X = 83.1 \text{ VAR},
\]
and apparent power is:
\[
S = I^2 Z = 99.8 \text{ VA}.
\]
The power triangle is shown graphically as:
[Figure 1A.4: The Power Triangle. A right triangle with legs P and Q and hypotenuse S.]
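The power triangle arithmetic can be reproduced with a short sketch. It assumes a reactance magnitude of 30 Ω (the value that gives an impedance of 36.06 Ω alongside the 20 Ω resistor) and carries the current at full precision, so its results can differ in the first decimal place from figures computed with a hand-rounded current. Variable names are illustrative.

```python
import math

E = 60.0   # source voltage, volts
R = 20.0   # resistance, ohms
X = 30.0   # magnitude of reactance, ohms (assumed from Z = 36.06)

Z = math.hypot(R, X)   # impedance magnitude, sqrt(R^2 + X^2)
I = E / Z              # current magnitude, amps

P = I**2 * R           # real power, watts
Q = I**2 * X           # reactive power, VAR
S = I**2 * Z           # apparent power, VA

power_factor = P / S   # equals R / Z

print(f"Z = {Z:.2f} ohms, I = {I:.3f} A")
print(f"P = {P:.1f} W, Q = {Q:.1f} VAR, S = {S:.1f} VA")
print(f"power factor = {power_factor:.4f}")
```

The identity S² = P² + Q² is what makes the triangle a right triangle, and P/S = R/Z ties the power factor back to the impedance triangle.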
It is straightforward to verify that P/S = R/Z, and so one may derive the phase angle through either power or impedance manipulations. The former expression is commonly known as the power factor. The ability to reduce a system's phase angle is known as reactive power supply. Only generators with this ability may compete in the reactive power market.
Just as we calculated instantaneous power for purely resistive circuits in (1A.1), we may also calculate instantaneous power for a purely inductive circuit, a purely capacitive circuit, and a combined resistive-reactive circuit. For a purely inductive load, current lags voltage by 90°. Therefore, we have:
\[
i(t) = I_{\max} \cos(\omega t - 90^{\circ}), \qquad v(t) = V_{\max} \cos \omega t.
\]
Instantaneous power is then:
\[
\begin{aligned}
p(t) &= V_{\max} I_{\max} \cos \omega t \cos(\omega t - 90^{\circ}) \\
&= V_{\max} I_{\max} \cos \omega t \sin \omega t \\
&= |V||I| \sin(2\omega t).
\end{aligned}
\]
Integrating over $[0, 2\pi]$, we find that instantaneous power for a purely inductive circuit does indeed have an average value of zero, as argued above. One may also show that average power for a purely capacitive circuit is zero, as found by integrating $-|V||I| \sin(2\omega t)$. Finally, for mixed resistive/reactive circuits, we may write:
\[
\begin{aligned}
p(t) &= V_{\max} I_{\max} \cos \omega t \cos(\omega t - \theta) \\
&= \frac{V_{\max} I_{\max}}{2} \left[ \cos^2 \omega t \cos\theta - \sin^2 \omega t \cos\theta + \cos\theta + 2 \sin \omega t \cos \omega t \sin\theta \right].
\end{aligned}
\]
The first three terms inside the brackets reduce to $(\cos\theta)(1 + \cos 2\omega t)$, while the fourth reduces to $(\sin\theta)(\sin 2\omega t)$. Thus, we may express instantaneous power in a single-phase AC circuit as:
\[
s(t) = \frac{V_{\max} I_{\max}}{2} (\cos\theta)(1 + \cos 2\omega t) + \frac{V_{\max} I_{\max}}{2} (\sin\theta)(\sin 2\omega t).
\]
Alternatively, since $|V| = V_{\max}/\sqrt{2}$ and $|I| = I_{\max}/\sqrt{2}$, we have
\[
s(t) = |V||I| (\cos\theta)(1 + \cos 2\omega t) + |V||I| (\sin\theta)(\sin 2\omega t). \tag{1A.2}
\]
The first term is once again real power, while the second term is reactive power. Note that when the phase angle is equal to zero, reactive power is equal to zero, as expected. These equations give us an alternative method for calculating average real, reactive, and complex power in AC circuits. We calculate average real power as:
\[
\frac{1}{2\pi} \int_0^{2\pi} |V||I| (\cos\theta)(1 + \cos 2\omega t)\, d(\omega t) = |V||I| \cos\theta. \tag{1A.3}
\]
By inspection, (1A.3) is at a maximum when $\theta = 0$. Notice that since $k \int_0^{2\pi} \sin 2\omega t\, d(\omega t) = 0$, the average value of reactive power from (1A.2) is always zero.
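The two averages can be confirmed numerically. The sketch below assumes NumPy, unit RMS voltage and current, and an illustrative phase angle of 30°; it averages each term of (1A.2) over one full cycle on a uniform grid.

```python
import numpy as np

V_rms = 1.0            # |V|, illustrative
I_rms = 1.0            # |I|, illustrative
theta = np.pi / 6      # phase angle, illustrative

# Uniform grid over one cycle of omega*t; the endpoint is excluded so that
# the sample mean matches the integral average of a periodic function.
wt = np.linspace(0.0, 2.0 * np.pi, 100_000, endpoint=False)

real_term = V_rms * I_rms * np.cos(theta) * (1.0 + np.cos(2.0 * wt))
reactive_term = V_rms * I_rms * np.sin(theta) * np.sin(2.0 * wt)

avg_real = real_term.mean()
avg_reactive = reactive_term.mean()

print(avg_real, V_rms * I_rms * np.cos(theta))
print(avg_reactive)
```

The average of the first term matches $|V||I|\cos\theta$, while the second term averages to zero for any phase angle.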
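We will also find it useful to express power in terms of the system's admittance matrix. To do this, we note that complex power injected into the network at bus $\ell$ is given by:
\[
S_{\ell} = V_{\ell} I_{\ell}^{*}. \tag{1A.4}
\]
Post-multiplying the admittance matrix, given by Eq. (36), by the voltage vector V once again yields $AV = I$, with individual elements:
\[
I_{\ell} = A_{\ell\ell} V_{\ell} + \sum_{k \in J(\ell)} A_{\ell k} V_k. \tag{1A.5}
\]
Substituting (1A.5) into (1A.4) yields
\[
S_{\ell} = V_{\ell} \left[ A_{\ell\ell} V_{\ell} + \sum_{k \in J(\ell)} A_{\ell k} V_k \right]^{*}
= |V_{\ell}|^2 A_{\ell\ell}^{*} + \sum_{k \in J(\ell)} A_{\ell k}^{*} V_{\ell} V_k^{*}.
\]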
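We may break down complex power into real and reactive power, starting by dividing the admittance terms into their real and imaginary components. Let us start with the expression for admittance as the inverse of impedance. That is, given $Z_{\ell k} = R_{\ell k} + jX_{\ell k}$,
\[
Y_{\ell k} = \frac{1}{Z_{\ell k}} = \frac{1}{R_{\ell k} + jX_{\ell k}} = \frac{1}{R_{\ell k} + jX_{\ell k}} \cdot \frac{R_{\ell k} - jX_{\ell k}}{R_{\ell k} - jX_{\ell k}} = \frac{R_{\ell k} - jX_{\ell k}}{(R_{\ell k})^2 + (X_{\ell k})^2}. \tag{1A.6}
\]
Using Eq. (1A.6), we define the real and imaginary parts of admittance, $G_{\ell k}$ and $B_{\ell k}$, respectively, as follows:
\[
G_{\ell k} = \frac{R_{\ell k}}{(R_{\ell k})^2 + (X_{\ell k})^2}, \qquad
B_{\ell k} = \frac{X_{\ell k}}{(R_{\ell k})^2 + (X_{\ell k})^2}.
\]
Therefore,
\[
Y_{\ell k} = G_{\ell k} - jB_{\ell k}. \tag{1A.7}
\]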
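When writing the equations for complex power, the notation $V_{\max}$ gets cumbersome. So at this point we replace $V_{\max}$ with u.[36] We employ Euler's identity to write complex voltage at node $\ell$ as:
\[
V_{\ell} = u_{\ell} e^{j\theta_{\ell}},
\]
and complex power as:
\[
\begin{aligned}
P_{\ell} + jQ_{\ell} &= \sum_{k \in J(\ell)} u_{\ell} u_k (G_{\ell k} - jB_{\ell k}) e^{j(\theta_{\ell} - \theta_k)} \\
&= \sum_{k \in J(\ell)} \Big\{ u_{\ell} u_k \left[ G_{\ell k} \cos(\theta_{\ell} - \theta_k) + B_{\ell k} \sin(\theta_{\ell} - \theta_k) \right]
+ j u_{\ell} u_k \left[ G_{\ell k} \sin(\theta_{\ell} - \theta_k) - B_{\ell k} \cos(\theta_{\ell} - \theta_k) \right] \Big\},
\end{aligned}
\]
where:
\[
P_{\ell} = \sum_{k \in J(\ell)} u_{\ell} u_k \left[ G_{\ell k} \cos(\theta_{\ell} - \theta_k) + B_{\ell k} \sin(\theta_{\ell} - \theta_k) \right], \tag{1A.8}
\]
\[
Q_{\ell} = \sum_{k \in J(\ell)} u_{\ell} u_k \left[ G_{\ell k} \sin(\theta_{\ell} - \theta_k) - B_{\ell k} \cos(\theta_{\ell} - \theta_k) \right]. \tag{1A.9}
\]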
(1A.8) and (1A.9) are known as power flow equality constraints, and must be satisfied at each bus of the power system. Note that the lhs of each equation represents net power flow out of node $\ell$. If real/reactive power generation at this node is greater than real/reactive power consumption, then $P_{\ell}, Q_{\ell} > 0$, respectively. The rhs of the equation represents the fact that this net power flow travels across the transmission network, from the given node, $\ell$, to the nodes with which it is connected, $k \in J(\ell)$. Summing up all of these positive (indicating net power flow from node $\ell$ to node k) and negative (vice-versa) power flows, we arrive at the net real/reactive power flow out of node $\ell$. Finally, KCL implies that net power flow out of node $\ell$ is equal to the sum of the power flows along each line connected to that node, so we write the power balance constraints as:
\[
p_{\ell}(x) = \sum_{k \in J(\ell)} u_{\ell} u_k \left[ G_{\ell k} \cos(\theta_{\ell} - \theta_k) + B_{\ell k} \sin(\theta_{\ell} - \theta_k) \right] - P_{\ell} = 0, \tag{1A.10}
\]
\[
q_{\ell}(x) = \sum_{k \in J(\ell)} u_{\ell} u_k \left[ G_{\ell k} \sin(\theta_{\ell} - \theta_k) - B_{\ell k} \cos(\theta_{\ell} - \theta_k) \right] - Q_{\ell} = 0. \tag{1A.11}
\]
[36] We use alternate notations because electrical engineering texts do, and we wish to expose the reader to these stylistic differences. Our exposition is closest to that of Baldick (2006, Ch. 6.2).
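We also derive the formulas for power flow across a given line $\ell k$, connecting buses $\ell$ and k, in the direction of $\ell$ to k. For a simple circuit with no shunt elements, the flow of power across line $\ell k$ would be equal to:[37]
\[
u_{\ell} u_k \left[ G_{\ell k} \cos(\theta_{\ell} - \theta_k) + B_{\ell k} \sin(\theta_{\ell} - \theta_k) \right].
\]
This equation simply refers to the drop in voltage (and thus power flow) moving across nodes in a series circuit. Once we take explicit account of the elements connecting generation to the transmission system (the shunt elements), we have to take into consideration the power sourcing at these elements. We derive this element of the power flow equation by setting $k = \ell$ in equations (1A.10) and (1A.11). Doing so yields the quantities:
\[
u_{\ell} u_{\ell} \left[ G_{\ell\ell} \cos(\theta_{\ell} - \theta_{\ell}) + B_{\ell\ell} \sin(\theta_{\ell} - \theta_{\ell}) \right] = (u_{\ell})^2 G_{\ell\ell},
\]
and, in the same manner, $-(u_{\ell})^2 B_{\ell\ell}$. Therefore, denoting real power flow across line $\ell k$ in the direction of k as $p_{\ell k}$ (and doing so analogously for reactive power flow), we have:
\[
p_{\ell k} = (u_{\ell})^2 G_{\ell\ell} + u_{\ell} u_k \left[ G_{\ell k} \cos(\theta_{\ell} - \theta_k) + B_{\ell k} \sin(\theta_{\ell} - \theta_k) \right], \tag{1A.12}
\]
\[
q_{\ell k} = -(u_{\ell})^2 B_{\ell\ell} + u_{\ell} u_k \left[ G_{\ell k} \sin(\theta_{\ell} - \theta_k) - B_{\ell k} \cos(\theta_{\ell} - \theta_k) \right]. \tag{1A.13}
\]
Likewise, we may rewrite (1A.10) and (1A.11) as:
\[
P_{\ell} = (u_{\ell})^2 G_{\ell\ell} + \sum_{k \in J(\ell)} u_{\ell} u_k \left[ G_{\ell k} \cos(\theta_{\ell} - \theta_k) + B_{\ell k} \sin(\theta_{\ell} - \theta_k) \right], \tag{1A.14}
\]
\[
Q_{\ell} = -(u_{\ell})^2 B_{\ell\ell} + \sum_{k \in J(\ell)} u_{\ell} u_k \left[ G_{\ell k} \sin(\theta_{\ell} - \theta_k) - B_{\ell k} \cos(\theta_{\ell} - \theta_k) \right]. \tag{1A.15}
\]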
1.B Admittance Matrix for the DC Approximation to the AC Circuit
For convenience, we return to the admittance matrix for the ladder circuit from (1.25). For clarity, however, we now denote the admittances with double-subscript notation $Y_{\ell k}$, to denote the admittance of the line connecting nodes $\ell$ and k (e.g., $Y_a$ now becomes $Y_{01}$, because line a connects nodes 0 and 1). We may thus express the admittance matrix from (1.25) as:
\[
\begin{bmatrix}
Y_{01} + Y_{12} & -Y_{12} & 0 & 0 \\
-Y_{12} & Y_{12} + Y_{02} + Y_{23} & -Y_{23} & 0 \\
0 & -Y_{23} & Y_{23} + Y_{30} + Y_{34} & -Y_{34} \\
0 & 0 & -Y_{34} & Y_{34} + Y_{40}
\end{bmatrix}.
\]
From (1A.7) we have $Y_{\ell k} = G_{\ell k} - jB_{\ell k}$, which we could plug in for each value in the matrix above. However, at this point we simplify the matrix by setting the real component of admittance equal to zero. Electrical engineers justify this assumption by noting that $|G_{\ell k}| \ll |B_{\ell k}|$. Further, we neglect the shunt elements in the line models. The electrical engineering justification for this is that shunt components sometimes have values such that their effect on the circuit is negligible. This reduces the admittance matrix to:
\[
\begin{bmatrix}
-jB_{12} & jB_{12} & 0 & 0 \\
jB_{12} & -j(B_{12} + B_{23}) & jB_{23} & 0 \\
0 & jB_{23} & -j(B_{23} + B_{34}) & jB_{34} \\
0 & 0 & jB_{34} & -jB_{34}
\end{bmatrix}.
\]
Chapter 2
The Newton-Raphson Method and AC OPF
Having tackled the basic concepts of AC power flow and linearization of AC power flow equations and related shift factor calculations, we are ready to move on to the optimal power flow problem itself. At this point, even though most energy economists will not delve into solutions of systems of nonlinear equations (rather, the energy economist will normally just skip directly to the DC approximation of the AC optimal power flow problem), in this chapter we tackle the nonlinear AC system to present the reader with additional background material for the presentation of the DC approximation.
2.1 The Newton-Raphson Method
AC power flow is captured by a system of nonlinear equations. Thus, before moving to the solution of the AC optimal power flow problem (OPF), we must first introduce a method for solving systems of nonlinear equations. Except in relatively simple cases, analytical solutions to systems of nonlinear equations will not exist. Therefore, our best hope is to approximate the solution to this system of equations with a high degree of accuracy (to be specified by the researcher). One uses an iterative algorithm to find this answer. In an iterative algorithm, the researcher makes an initial estimate as to a problem's solution and tries to improve upon this conjecture with successive iterates of the algorithm. The Newton-Raphson method is quite popular in electrical engineering texts, so we will demonstrate it.
The general form for the iterates created by an iterative algorithm is:
$$x^{v+1} = x^v + \alpha^v \Delta x^v, \qquad v = 0, 1, 2, \ldots,$$
where:
$x^0$ is the initial estimate of the solution,
$v$ is the iteration counter,
$x^v$ is the value of the iterate at the $v$-th iteration,
$\alpha^v$ is the step size, usually $0 < \alpha^v \le 1$,
$\Delta x^v$ is the step direction, and
$\alpha^v \Delta x^v$ is the update to $x^v$ to obtain $x^{v+1}$.
Let us consider a function $g(\cdot): \mathbb{R}^n \to \mathbb{R}^n$ and suppose we wish to solve the system of simultaneous nonlinear equations $g(x) = 0$. We start with an initial estimate to the solution, $x^0 \in \mathbb{R}^n$. We may think of this guess as our predicted answer, based on our understanding of the problem. Except in the rare circumstance that our first conjecture is exactly correct, we will have $g(x^0) \neq 0$, and will wish to find an updated vector $x^1 = x^0 + \Delta x^0$ such that:
$$g(x^1) = g(x^0 + \Delta x^0) = 0.$$
A first-order Taylor approximation of $g(\cdot)$ about $x^0$ yields:
$$g(x^1) = g(x^0 + \Delta x^0) \approx g(x^0) + \left.\frac{\partial g}{\partial x}\right|_{x^0} \Delta x^0. \tag{2.1}$$
For a square system of equations we may write $g(x^0 + \Delta x^0) \approx g(x^0) + J(x^0)\Delta x^0$, where $J(x^0)$ is the Jacobian of $g(\cdot)$ evaluated at $x^0$. Therefore:
$$g(x^1) = g(x^0 + \Delta x^0) \approx g(x^0) + J(x^0)\Delta x^0. \tag{2.2}$$
Since we seek an $x^1$ such that $g(x^1) = 0$, and the rhs of (2.2) approximates $g(x^1)$, we will choose a value for $\Delta x^0$ such that:
$$g(x^0) + J(x^0)\Delta x^0 = 0. \tag{2.3}$$
Solving (2.3) for $\Delta x^0$ yields $\Delta x^0 = -[J(x^0)]^{-1} g(x^0)$. The stage zero update is thus:
$$\Delta x^0 = -[J(x^0)]^{-1} g(x^0), \tag{2.4}$$
$$x^1 = x^0 + \Delta x^0. \tag{2.5}$$
Equations (2.4) and (2.5) constitute the stage zero Newton-Raphson update, and $\Delta x^0$ is the stage zero Newton-Raphson step direction. More generally, at any stage $v$ we have:
$$\Delta x^v = -[J(x^v)]^{-1} g(x^v), \tag{2.6}$$
$$x^{v+1} = x^v + \Delta x^v. \tag{2.7}$$
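Example 2.1: Newton-Raphson method
Consider the system of simultaneous non-linear equations $g(x) = 0$, where $g(\cdot): \mathbb{R}^2 \to \mathbb{R}^2$ is defined by:
$$\forall x \in \mathbb{R}^2, \quad g(x) = \begin{bmatrix} \tfrac{1}{2}(x_1)^2 + x_2 - 4 \\ x_1 + \tfrac{1}{2}(x_2)^2 - 7 \end{bmatrix}.$$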
We expect that the pair of quadratic equations will have multiple roots. Let us focus on the positive root pair. Since $x_2$ is of larger power in the expression with larger absolute value (i.e. 7), our initial estimate will have $x_2$ greater than $x_1$. One method for an initial choice is to solve one of the equations exactly, while eyeballing the other. Let us start, then, with an initial guess of $(x^0)^\top = [1 \;\; 3.5]$. This conjecture yields $g(x^0)^\top = [0 \;\; 0.125]$. To calculate the Newton-Raphson update, $\Delta x^0$, we must find, and then invert, the Jacobian $J(x^0)$. By inspection,
$$J(x) = \begin{bmatrix} x_1 & 1 \\ 1 & x_2 \end{bmatrix}.$$
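Inverting this matrix, we find that:
$$[J(x)]^{-1} = \begin{bmatrix} \dfrac{x_2}{x_1 x_2 - 1} & \dfrac{-1}{x_1 x_2 - 1} \\ \dfrac{-1}{x_1 x_2 - 1} & \dfrac{x_1}{x_1 x_2 - 1} \end{bmatrix}. \tag{2.8}$$
Therefore,
$$-[J(x^0)]^{-1} g(x^0) = -\begin{bmatrix} 1.4 & -0.4 \\ -0.4 & 0.4 \end{bmatrix} \begin{bmatrix} 0 \\ 0.125 \end{bmatrix}, \tag{2.9}$$
or,
$$\Delta x^0 = \begin{bmatrix} 0.05 \\ -0.05 \end{bmatrix}.$$
Thus we have that:
$$x^1 = \begin{bmatrix} 1.05 \\ 3.45 \end{bmatrix}.$$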
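The stage zero update of this example can be reproduced with a short Python sketch. The function names (`g`, `jacobian`, `newton_step`, `newton_solve`) and the tolerance are choices made here for illustration; the step uses the closed-form 2x2 inverse of (2.8).

```python
def g(x):
    """Error function of Example 2.1."""
    x1, x2 = x
    return [0.5 * x1**2 + x2 - 4.0, x1 + 0.5 * x2**2 - 7.0]


def jacobian(x):
    x1, x2 = x
    return [[x1, 1.0], [1.0, x2]]


def newton_step(x):
    """Return the step direction -J(x)^{-1} g(x) for the 2x2 system."""
    (a, b), (c, d) = jacobian(x)
    g1, g2 = g(x)
    det = a * d - b * c
    # Closed-form 2x2 inverse, as in (2.8).
    return [-(d * g1 - b * g2) / det, -(-c * g1 + a * g2) / det]


def newton_solve(x, tol=1e-12, max_iter=20):
    """Iterate full Newton-Raphson steps until every error is below tol."""
    for _ in range(max_iter):
        if max(abs(v) for v in g(x)) < tol:
            break
        dx = newton_step(x)
        x = [x[0] + dx[0], x[1] + dx[1]]
    return x


x0 = [1.0, 3.5]
dx0 = newton_step(x0)                        # stage zero step direction
x1 = [x0[0] + dx0[0], x0[1] + dx0[1]]        # stage zero update
root = newton_solve(x0)                      # iterate to (numerical) convergence
```

Here `dx0` and `x1` correspond to $\Delta x^0$ and $x^1$ above, while `root` carries the iteration on until the errors are negligible.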
Next we must decide if we wish to stop here, or perform another iteration. There are three basic rules for determining when to stop iterating: (1) the step size, $\Delta x^v$, is sufficiently small; (2) the error term, $g(x^v)$, is sufficiently small (where "small" is determined by a metric of the researcher's choosing); or (3) after a fixed number of iterations. Since we are simply demonstrating the method, we will stop after two iterations. Given the update, $(x^1)^\top = [1.05 \;\; 3.45]$, we compute $g(x^1)^\top = [0.00125 \;\; 0.00125]$. To perform the second iteration, we insert the values $x^1$ into (2.8) and iterate on (2.9), finding:
$$-[J(x^1)]^{-1} g(x^1) = -\begin{bmatrix} 1.31554 & -0.38132 \\ -0.38132 & 0.40038 \end{bmatrix} \begin{bmatrix} 0.00125 \\ 0.00125 \end{bmatrix},$$
or,
$$\Delta x^1 = \begin{bmatrix} -0.00117 \\ -0.00002 \end{bmatrix}.$$
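Therefore, $(x^2)^\top = [1.04883 \;\; 3.44998]$, and both components of $g(x^2)$ are smaller than $10^{-6}$ in absolute value when more digits are carried. In this example the errors fall from one iteration to the next. A full Newton-Raphson step need not behave so well: it can "overshoot", so that $|g_i(x^{v+1})|$ is greater than $|g_i(x^v)|$ for some component $i$. This is where the issue of step size comes in. We wish to decrease absolute values of the errors for all elements of $x$ from one repetition to the next. Therefore, when a full step overshoots, one would generally perform a new iteration with a reduced step size; at stage 2, for instance:
$$x^2 = x^1 + \alpha^1 \Delta x^1, \qquad \alpha^1 < 1.$$
It is common to iterate using $\alpha^v_i = (\tfrac{1}{2})^i$, $i = 0, 1, 2, \ldots$, where $\alpha^v_i$ is the $i$-th iterate of $\alpha$ at stage $v$ of the Newton-Raphson update, and to stop iterating on $\alpha$ when the error term has fallen by the desired amount, as specified by the researcher.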
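The step-halving rule can be sketched as follows. The test problem is an assumption made here for illustration and does not come from the text: the scalar equation $\arctan(x) = 0$ started at $x^0 = 1.5$, a case in which the full Newton-Raphson step increases the error. The function name `damped_newton` and the tolerances are likewise hypothetical.

```python
import math


def damped_newton(g, dg, x, tol=1e-10, max_iter=50):
    """Scalar Newton-Raphson with step sizes alpha = (1/2)^i, i = 0, 1, 2, ..."""
    for _ in range(max_iter):
        err = abs(g(x))
        if err < tol:
            break
        dx = -g(x) / dg(x)               # full Newton-Raphson step direction
        alpha = 1.0
        # Halve the step size until the error actually falls.
        while abs(g(x + alpha * dx)) >= err and alpha > 1e-12:
            alpha *= 0.5
        x += alpha * dx
    return x


def atan_slope(x):
    """Derivative of arctan(x)."""
    return 1.0 / (1.0 + x * x)


x_start = 1.5
full_step = x_start - math.atan(x_start) / atan_slope(x_start)   # alpha = 1
root = damped_newton(math.atan, atan_slope, x_start)
```

From this starting point the undamped iterate `full_step` lands farther from the root than `x_start`, whereas the damped iteration converges to the root at zero.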
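Having demonstrated (if briefly) the Newton-Raphson method, the next step in solving an AC optimal power flow problem is to set up the problem using this method. To do this, we return to equations (1A.14) and (1A.15) for real and reactive power flow, respectively:
$$P_\ell = u_\ell^2 G_{\ell\ell} + \sum_{k \in J(\ell)} u_\ell u_k \left[ G_{\ell k}\cos(\theta_\ell - \theta_k) + B_{\ell k}\sin(\theta_\ell - \theta_k) \right],$$
$$Q_\ell = -u_\ell^2 B_{\ell\ell} + \sum_{k \in J(\ell)} u_\ell u_k \left[ G_{\ell k}\sin(\theta_\ell - \theta_k) - B_{\ell k}\cos(\theta_\ell - \theta_k) \right].$$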
Example 2.2: 2-node power-flow problem
Let us examine the power flow equations in a simplistic two-node network, as illustrated below:
[Figure 1: The 2-Node Network. Node 1 (generator): $P_{g1} = 0.6$, $Q_{g1} = 0.3$, $u_1 = 1.0$, $\theta_1 = 0.0$. Node 2 (load): $P_{d2} = 0.6$, $Q_{d2} = 0.3$, $u_2 = ?$, $\theta_2 = ?$. Line 1-2: $X_{12} = j0.25$.]
As discussed in Chapter 1, we will solve this power-flow problem by designating a node (node 1) as the reference node (not bothering to denote it as "0"). By convention, we set the phase angle at the reference node, $\theta_1$, equal to 0.0 (radians). We will also set the reference voltage, $u_1$, equal to one per unit (1 p.u.). In this example, the generator at node 1 produces 0.6 units of real power and 0.3 units of reactive power, consumed by load at node 2 (thus, we assume line 12 is lossless). As above, we also assume that line 12 exerts no resistance, but only reactance, with impedance equal to $j0.25$, where again, $j = \sqrt{-1}$. We also ignore the impedance of any ground (shunt) elements. Our problem is now to solve for the two unknowns, voltage angle and magnitude at node 2. Applying the power-flow equations for real and reactive power at node 2 to this example, we have:
$$-0.6 = (u_2)^2 \cdot 0 + 1 \cdot u_2 [4 \sin(\theta_2)],$$
$$-0.3 = 4(u_2)^2 - u_2 [4 \cos(\theta_2)].$$
That is, we have a system of two nonlinear equations in two unknowns, which we will (approximately) solve using the Newton-Raphson method. Carrying all terms over to the lhs, we have:
$$u_2 [4 \sin(\theta_2)] + 0.6 = 0,$$
$$4(u_2)^2 - u_2 [4 \cos(\theta_2)] + 0.3 = 0.$$
To use the Newton-Raphson method, we make initial estimates for the two unknowns as
$$x^0 = \begin{bmatrix} \theta_2^0 \\ u_2^0 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.$$
This estimate yields initial errors,
$$g(x^0) = \begin{bmatrix} 0.6 \\ 0.3 \end{bmatrix}.$$
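Next, we partially differentiate the equations wrt the unknowns, writing $p_2$ and $q_2$ for the left-hand sides of the real and reactive power equations, yielding:
$$J(x) = \begin{bmatrix} \dfrac{\partial p_2}{\partial \theta_2} & \dfrac{\partial p_2}{\partial u_2} \\ \dfrac{\partial q_2}{\partial \theta_2} & \dfrac{\partial q_2}{\partial u_2} \end{bmatrix} = \begin{bmatrix} 4u_2 \cos(\theta_2) & 4 \sin(\theta_2) \\ 4u_2 \sin(\theta_2) & 8u_2 - 4 \cos(\theta_2) \end{bmatrix}.$$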
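The stage zero update, from (2.6) and (2.7), is the solution to:
$$\begin{bmatrix} 4 \cos(0) & 4 \sin(0) \\ 4 \sin(0) & 8 - 4 \cos(0) \end{bmatrix} \begin{bmatrix} \Delta\theta_2^0 \\ \Delta u_2^0 \end{bmatrix} = -\begin{bmatrix} 0.6 \\ 0.3 \end{bmatrix}.$$
Inverting and multiplying, we obtain:
$$\Delta x^0 = \begin{bmatrix} -0.15 \\ -0.075 \end{bmatrix},$$
$$\theta_2^1 = \theta_2^0 + \Delta\theta_2^0 = 0.0 - 0.15 = -0.15 \text{ rad},$$
$$u_2^1 = u_2^0 + \Delta u_2^0 = 1.0 - 0.075 = 0.925 \text{ p.u.}$$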
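The new errors are
$$g(x^1) = \begin{bmatrix} g_1(x^1) \\ g_2(x^1) \end{bmatrix} = \begin{bmatrix} 4(0.925) \sin(-0.15) + 0.6 \\ 4(0.925)^2 - 4(0.925) \cos(-0.15) + 0.3 \end{bmatrix} = \begin{bmatrix} 0.047079 \\ 0.064047 \end{bmatrix}.$$
Stopping Criterion: We will stop when each of the errors is less than $3 \times 10^{-3}$. The first-iteration errors do not satisfy this criterion, so we perform another iteration. The stage one update is then:
$$\Delta x^1 = -\begin{bmatrix} 3.658453 & -0.597753 \\ -0.552921 & 3.444916 \end{bmatrix}^{-1} \begin{bmatrix} 0.047079 \\ 0.064047 \end{bmatrix} = \begin{bmatrix} -0.016335 \\ -0.021214 \end{bmatrix}.$$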
The updated values are:
$$\theta_2^2 = -0.15 - 0.016335 = -0.166335 \text{ rad, and}$$
$$u_2^2 = 0.925 - 0.021214 = 0.903786 \text{ p.u.},$$
and the new errors are:
$$[g(x^2)]^\top = [0.001444, \; 0.002068].$$
Since these errors satisfy the stopping criterion, the procedure ends here.
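The two iterations of this example can be checked with a few lines of Python. This is a sketch under the assumptions of the example (reference node at $u_1 = 1$, $\theta_1 = 0$, line susceptance 4 p.u., load of 0.6 and 0.3 at node 2); the function names and the returned `history` list are conveniences introduced here.

```python
import math


def residuals(theta2, u2):
    """Errors of the real and reactive power equations at node 2."""
    p = 4.0 * u2 * math.sin(theta2) + 0.6
    q = 4.0 * u2**2 - 4.0 * u2 * math.cos(theta2) + 0.3
    return p, q


def jacobian(theta2, u2):
    return ((4.0 * u2 * math.cos(theta2), 4.0 * math.sin(theta2)),
            (4.0 * u2 * math.sin(theta2), 8.0 * u2 - 4.0 * math.cos(theta2)))


def solve_two_node(theta2=0.0, u2=1.0, tol=3e-3, max_iter=20):
    """Newton-Raphson iterations; stops when both errors are below tol."""
    history = [(theta2, u2)]
    for _ in range(max_iter):
        p, q = residuals(theta2, u2)
        if abs(p) < tol and abs(q) < tol:
            break
        (a, b), (c, d) = jacobian(theta2, u2)
        det = a * d - b * c
        theta2 += -(d * p - b * q) / det      # first row of -J^{-1} g
        u2 += -(-c * p + a * q) / det         # second row of -J^{-1} g
        history.append((theta2, u2))
    return history


history = solve_two_node()
```

Each entry of `history` is an iterate $(\theta_2^v, u_2^v)$; with the stopping criterion of $3 \times 10^{-3}$ the loop ends after the second update.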
Electrical engineers use circuit-flow studies to predict the behavior of a power system under such contingencies as lightning strikes that cause short-circuits to occur between transmission lines. Power-flow studies allow system operators to control the transmission system so as to keep line flows within operating limits even if the most important element in the system fails (known as the N-1 Criterion). We have introduced the power flow problem simply to aid our understanding of optimal power flow, though, and will not consider such issues as contingency analysis.
Looking forward to the AC optimal power flow problem, our next step is to use the Newton-Raphson method to solve nonlinear optimization problems. First, we introduce a generic nonlinear optimization problem. After solving the basic problem, we then add constraints onto it and finally address the optimal power flow problem itself. Since the optimal power flow problem involves cost minimization, we will illustrate the Newton-Raphson method for a minimization problem. We will work with a strictly convex function, so as to find a unique global minimum and minimizer.
Example 2.3: Minimization of a convex nonlinear function
Using the Newton-Raphson method, find the minimizers of the function:
$$\forall x \in \mathbb{R}^2, \quad f(x) = 0.01(x_1 - 1)^4 + 0.01(x_2 - 3)^4 + (x_1 - 2)^2 + (x_2 - 1)^2.$$
The FOSC for a minimum are:
$$\frac{\partial f}{\partial x_1} = 0.04(x_1 - 1)^3 + 2(x_1 - 2) = 0, \tag{2.10}$$
$$\frac{\partial f}{\partial x_2} = 0.04(x_2 - 3)^3 + 2(x_2 - 1) = 0. \tag{2.11}$$
As the alert reader has already guessed, since this is a minimization problem, the FOCs constitute the set of equations which we wish to set equal to zero. Notice that the value of the FOCs at the iterate, $x^v$, thus represents the error function. This is a system of nonlinear equations (a square system, with two equations in two unknowns), which we may solve using the Newton-Raphson method. More formally, we have that:
$$g(x) = \nabla f(x), \tag{2.12}$$
where $\nabla(\cdot)$ is the gradient function.
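Applying (2.1):
$$g(x^1) = g(x^0 + \Delta x^0) \approx g(x^0) + \left.\frac{\partial g}{\partial x}\right|_{x^0} \Delta x^0$$
to (2.12) yields:
$$\nabla f(x^1) = \nabla f(x^0 + \Delta x^0) \approx \nabla f(x^0) + \nabla^2 f(x^0)\Delta x^0. \tag{2.13}$$
Setting the rhs of (2.13) equal to zero, solving for $\Delta x$, and generalizing to generic step $v$, the Newton-Raphson update is:
$$\Delta x^v = -[\nabla^2 f(x^v)]^{-1} [\nabla f(x^v)]. \tag{2.14}$$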
Therefore, in setting up the Newton-Raphson update, we must take the Jacobian of the system of first derivatives (the Hessian of the original system). Doing so for this example, we obtain:
$$\nabla^2 f(x) = \begin{bmatrix} 0.12(x_1 - 1)^2 + 2 & 0 \\ 0 & 0.12(x_2 - 3)^2 + 2 \end{bmatrix}.$$
A sensible initial estimate is $[x^0]^\top = [2, 1]$, as the contribution of the second terms is much larger than that of the first in the error functions, (2.10) and (2.11). Using this estimate, our initial update is:
$$\Delta x^0 = -\begin{bmatrix} 2.12 & 0 \\ 0 & 2.48 \end{bmatrix}^{-1} \begin{bmatrix} 0.04 \\ -0.32 \end{bmatrix} = \begin{bmatrix} -0.0189 \\ 0.1290 \end{bmatrix},$$
and the first iterate is:
$$(x^1)^\top = (1.981132, \; 1.129032).$$
The error is then $(0.000042, \; -0.003910)$. Performing another iterate yields:
$$\Delta x^1 = -\begin{bmatrix} 2.1155 & 0 \\ 0 & 2.4201 \end{bmatrix}^{-1} \begin{bmatrix} 0.000042 \\ -0.003910 \end{bmatrix} = \begin{bmatrix} -0.00002 \\ 0.001616 \end{bmatrix}.$$
The second iterate is thus:
$$[x^2]^\top = [1.981112, \; 1.130648],$$
with associated errors smaller than $10^{-6}$ in absolute value in both components. We deem this close enough for our purposes.
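A compact way to check this example is to code the update (2.14) directly. The sketch below exploits the fact that the Hessian of this particular objective is diagonal, so the inverse reduces to element-wise division; the function names and tolerance are assumptions made here.

```python
def grad(x):
    """Gradient of f, i.e. the FOCs (2.10) and (2.11)."""
    x1, x2 = x
    return [0.04 * (x1 - 1.0)**3 + 2.0 * (x1 - 2.0),
            0.04 * (x2 - 3.0)**3 + 2.0 * (x2 - 1.0)]


def hess_diag(x):
    """Diagonal of the Hessian of f (the off-diagonal entries are zero)."""
    x1, x2 = x
    return [0.12 * (x1 - 1.0)**2 + 2.0, 0.12 * (x2 - 3.0)**2 + 2.0]


def newton_minimize(x, tol=1e-10, max_iter=20):
    """Apply x <- x - [Hessian]^{-1} gradient until the gradient is below tol."""
    for _ in range(max_iter):
        gr = grad(x)
        if max(abs(v) for v in gr) < tol:
            break
        h = hess_diag(x)
        x = [xi - gi / hi for xi, gi, hi in zip(x, gr, h)]
    return x


x_star = newton_minimize([2.0, 1.0])
```

Starting from $[2, 1]$, the iteration settles near $(1.9811, 1.1306)$, where both first-order conditions hold to within the tolerance.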


Since the OPF problem is one of constrained minimization of a nonlinear function, we must determine how to incorporate constraints into the Newton-Raphson method. Though demonstrating this problem based on examination of the null space of the coefficient matrix, A, is quite intuitive, we will stick with the Lagrange multiplier approach since economists are generally more familiar with it.
Example 2.4: Minimization of a convex objective with linear constraints
Let us apply the Newton-Raphson method to minimize the objective function:
$$f(x) = 0.01(x_1 - 1)^4 + 0.01(x_2 - 3)^4 + (x_1 - 1)^2 + (x_2 - 3)^2 - 1.8(x_1 - 1)(x_2 - 3)$$
$$\text{s.t. } x_1 - x_2 = 8.$$
We may set up the Lagrangian as:
$$L = 0.01(x_1 - 1)^4 + 0.01(x_2 - 3)^4 + (x_1 - 1)^2 + (x_2 - 3)^2 - 1.8(x_1 - 1)(x_2 - 3) + \lambda(x_1 - x_2 - 8).$$
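The first-order necessary (and sufficient) conditions for a minimum are:
$$\frac{\partial L}{\partial x_1} = 0.04(x_1 - 1)^3 + 2(x_1 - 1) - 1.8(x_2 - 3) + \lambda = 0, \tag{2.15}$$
$$\frac{\partial L}{\partial x_2} = 0.04(x_2 - 3)^3 + 2(x_2 - 3) - 1.8(x_1 - 1) - \lambda = 0, \tag{2.16}$$
$$\frac{\partial L}{\partial \lambda} = (x_1 - x_2 - 8) = 0. \tag{2.17}$$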
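From (2.7) and (2.14), with the Lagrangian $L$ in place of $f$ and the multiplier $\lambda$ appended to the vector of unknowns $x$, the Newton-Raphson update is:
$$\Delta x^v = -[\nabla^2 L(x^v)]^{-1} [\nabla L(x^v)],$$
$$x^{v+1} = x^v + \Delta x^v.$$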
Once again the values that the Jacobian and error term take on are determined by the initial estimates for $x$ and $\lambda$, and the sequence of iterates generated therefrom. For purposes of illustration, let us choose $[x^0]^\top = [5.5 \;\; -2.5]$. Without any intuition regarding the value the Lagrange multiplier will take (other than that it will be non-positive), we set $\lambda^0 = 0$. We may now calculate the components of the Newton-Raphson update as:
$$\nabla L(x^0) = \begin{bmatrix} 22.545 \\ -25.755 \\ 0.000 \end{bmatrix}, \qquad \begin{bmatrix} \Delta x_1^0 \\ \Delta x_2^0 \\ \Delta\lambda^0 \end{bmatrix} = \begin{bmatrix} 0.497 \\ 0.497 \\ -23.852 \end{bmatrix}.$$
It might seem obvious, but the addition of the Lagrangian multiplier adds a third component to the system of FONC, yielding a $3 \times 3$ Jacobian, as opposed to a $2 \times 2$ matrix. The errors, $\nabla L(x^1)^\top = [0.138 \;\; -0.158 \;\; 0]$, indicate that we are in the ballpark of the solution.[1]
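The stage zero step of Example 2.4 can be verified numerically. The sketch below assembles the $3 \times 3$ Jacobian of the FONC (2.15) to (2.17) and solves the linear system with a small Gaussian-elimination helper; `solve` and `kkt_step` are names introduced here, not part of the text.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def kkt_step(x1, x2, lam):
    """Return the FONC errors and the Newton-Raphson step (dx1, dx2, dlam)."""
    errors = [0.04 * (x1 - 1)**3 + 2 * (x1 - 1) - 1.8 * (x2 - 3) + lam,
              0.04 * (x2 - 3)**3 + 2 * (x2 - 3) - 1.8 * (x1 - 1) - lam,
              x1 - x2 - 8.0]
    jac = [[0.12 * (x1 - 1)**2 + 2, -1.8, 1.0],
           [-1.8, 0.12 * (x2 - 3)**2 + 2, -1.0],
           [1.0, -1.0, 0.0]]
    return errors, solve(jac, [-e for e in errors])


errors0, step0 = kkt_step(5.5, -2.5, 0.0)
```

The values in `errors0` and `step0` can be compared with $\nabla L(x^0)$ and the update vector reported above; repeating `kkt_step` at the updated point continues the iteration.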
Since the OPF problem has nonlinear constraints, as opposed to simply linear ones, we move from the Newton-Raphson method for solving nonlinear problems with linear restrictions to solving nonlinear problems with nonlinear restrictions. Doing so is analytically straightforward, though the presence of nonlinear constraints is problematic because the constraint set is no longer convex. Once a set of nonlinear restrictions is introduced, instead of simply searching for the minimizer of the objective function by moving around the hyperplane in $n$-dimensional space that corresponds to the affine restrictions, we must adjust our search by tracing out the constraint function as best we can while scoping out a descent direction for the objective function. Formally, we address a problem of the form:
$$\min_{x \in \mathbb{R}^n} \{ f(x) \mid g(x) = 0 \}, \tag{2.18}$$
where $f(\cdot): \mathbb{R}^n \to \mathbb{R}$ and $g(\cdot): \mathbb{R}^n \to \mathbb{R}^m$. The problem is non-affine, both in the $n$ variables and the $m$ constraints.
For a general physical problem, several functions may approximate the feasible set (electric energy being an exception, since it itself operates according to well-defined mathematical functions). In the case that we do have such choice, however, we will choose a function $g(\cdot)$ that satisfies the following definition:
Definition 1: Let $g(\cdot): \mathbb{R}^n \to \mathbb{R}^m$. We say that $x^\star$ is a regular point of the equality constraints $g(x) = 0$ if:
1. $g(x^\star) = 0$,
2. $g(\cdot)$ is partially differentiable with continuous partial derivatives at $x^\star$, and
3. the $m$ rows of the Jacobian $J(x)$ of $g(\cdot)$ evaluated at $x^\star$ are linearly independent.
For $g(\cdot)$ to have any regular points, it must be the case that $m \le n$, since otherwise the $m$ rows of $J(x^\star)$ cannot be linearly independent. Also, if $g(\cdot)$ has a regular point, then we can find an invertible $m \times m$ submatrix of $J(x^\star)$. This last observation is important, of course, because the Newton-Raphson method uses the inverse of the relevant Jacobian.
Next, we define the tangent plane to the feasible set as follows:
Definition 2: Let $g(\cdot): \mathbb{R}^n \to \mathbb{R}^m$ be partially differentiable and let $x^\star \in \mathbb{R}^n$. Further, let $J(\cdot): \mathbb{R}^n \to \mathbb{R}^{m \times n}$ be the Jacobian of $g(\cdot)$. Suppose that $x^\star$ is a regular point of the constraints $g(x) = 0$. Then the tangent plane to the set $S = \{x \in \mathbb{R}^n \mid g(x) = 0\}$ at the point $x^\star$ is the set:
$$T = \{x \in \mathbb{R}^n \mid J(x^\star)(x - x^\star) = 0\}.$$
[1] We leave reiteration of the Newton-Raphson method as an exercise to the reader.
The tangent plane at $x^\star$ is the set of points such that the first-order Taylor approximation of $g(x)$ about $x^\star$ is 0. The alert reader will recognize that the set $T$ corresponds to an affine constraint set, defined by the Jacobian $J(x^\star)$, passing through the point $x^\star$. A little math (Baldick (2006, Ch. 13)) demonstrates that the set $T$ is the null space of the affine system of constraints $Ax = b$.
Therefore, if we had an affine system of constraints, we could simply scan the null space of $A$ in search of the minimizer of $f(x)$. Since this is not the case, the best we can hope for is to move along the tangent plane $T$ of $S$, evaluated at the regular point $x^\star$. As we move away from $x^\star$ along $T$, we will necessarily move out of the feasible set $S$. Because the tangent plane is generally a better approximation of the constraint set for small values of $(x - x^\star) = \Delta x^\star$, we will keep our step sizes small to avoid straying far from $S$. Using this update, we arrive at a new (hopefully) regular point, and perform another update.
Having made the first update, however, we will have necessarily moved out of the feasible set. This creates a certain tension in the analysis. At the next update we will at the same time wish to: (1) move back toward the constraint set, $S$; and (2) reduce the value of the objective function. Focusing on the first point, we will have chosen our initial value, $x^0$, such that $g(x^0) = 0$. As we have just argued, though, taking a linear approximation to the constraint set in making our initial iterate implies that $g(x^1) \neq 0$. Therefore, we will try to make the next update, $x^2$, bring us back to the constraint set (i.e., we will try to choose $x^2$ such that $g(x^2) = 0$). Note that the Newton-Raphson update at $x^1$ is simply $\Delta x^1 = -[J(x^1)]^{-1} g(x^1)$. This update is consistent with the definition of the tangent plane, above, except that now we are no longer on the constraint set (i.e. $g(x^1) \neq 0$). Moving on to the second point, we once again will set up the FONC to solve for the minimizer of the objective function. However, since the constraint set is no longer convex, the first-order conditions are no longer sufficient. Formally, the SOSC for a local minimizer are given by the following theorem:
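Theorem 1 Suppose that $f(\cdot): \mathbb{R}^n \to \mathbb{R}$ and $g(\cdot): \mathbb{R}^n \to \mathbb{R}^m$ are twice partially differentiable with continuous second partial derivatives. Let $J(\cdot): \mathbb{R}^n \to \mathbb{R}^{m \times n}$ be the Jacobian of $g(\cdot)$. Consider problem (2.18) and points $x^\star \in \mathbb{R}^n$ and $\lambda^\star \in \mathbb{R}^m$. Suppose that:
$$\nabla f(x^\star) + J(x^\star)^\top \lambda^\star = 0, \tag{2.19}$$
$$g(x^\star) = 0, \text{ and} \tag{2.20}$$
$$\nabla^2 f(x^\star) + \sum_{\ell=1}^{m} \lambda^\star_\ell \nabla^2 g_\ell(x^\star) \text{ is positive definite on the null space:}$$
$$N = \{\Delta x \in \mathbb{R}^n \mid J(x^\star)\Delta x = 0\}.$$
Then $x^\star$ is a strict local minimizer of Problem (2.18).
Proof See D.G. Luenberger (1984, Section 10.5).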
This condition, requiring that the Hessian of the Lagrangian be positive definite, is to be expected, but it is convenient to have it on hand. It also yields a compact representation of the first-order conditions to problem (2.18). That is, the FONC for minimization of the Lagrangian:
$$L = f(x) + \lambda^\top (g(x) - 0),$$
are given by (2.19) and (2.20).[2] Applying the Newton-Raphson method to this system of equations yields:
$$\begin{bmatrix} \nabla^2_{xx} L(x^v, \lambda^v) & J(x^v)^\top \\ J(x^v) & 0 \end{bmatrix} \begin{bmatrix} \Delta x^v \\ \Delta\lambda^v \end{bmatrix} = -\begin{bmatrix} \nabla f(x^v) + J(x^v)^\top \lambda^v \\ g(x^v) \end{bmatrix}.$$
At this point, with the help of fn. 2 the reader should be able to go back to (2.15) to (2.17) and deduce that the difference between the Newton-Raphson update for the nonaffine vs. affine constraint set is the replacement of terms of the matrix $A^\top$ by terms of the matrix $J(x^v)^\top$, and of 0 by $g(x^v)$ as the second row on the rhs (the latter point arising because we no longer can guarantee that the iterate will stay within $S$). Alternatively, we may say that the only difference between the two solution methods is that when we have a nonlinear constraint set, we must use the Newton-Raphson approximation of the corresponding linear constraint set. Since we are already familiar with this method, adding a nonlinear constraint set imposes no new analytical difficulties.
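Having solved for $(\Delta x^v, \Delta\lambda^v)$ using the system of Newton-Raphson equations, we write the generic Newton-Raphson update as:
$$\begin{bmatrix} x^{v+1} \\ \lambda^{v+1} \end{bmatrix} = \begin{bmatrix} x^v \\ \lambda^v \end{bmatrix} + \alpha^v \begin{bmatrix} \Delta x^v \\ \Delta\lambda^v \end{bmatrix}.$$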
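The update can be sketched on a small problem with one nonlinear equality constraint. The test problem is hypothetical and chosen only for illustration: minimize $x_1 + x_2$ subject to $x_1^2 + x_2^2 - 2 = 0$, whose minimizer is $(-1, -1)$ with multiplier $0.5$. The helper `solve`, the function `kkt_newton`, and the starting point are likewise assumptions.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def kkt_newton(x1, x2, lam, alpha=1.0, tol=1e-12, max_iter=50):
    """Newton-Raphson on the FONC of min x1 + x2 s.t. x1^2 + x2^2 - 2 = 0."""
    for _ in range(max_iter):
        grad_L = [1.0 + 2.0 * lam * x1, 1.0 + 2.0 * lam * x2]  # grad f + J' lam
        g = x1**2 + x2**2 - 2.0                                # constraint error
        if max(abs(v) for v in grad_L + [g]) < tol:
            break
        J = [2.0 * x1, 2.0 * x2]                               # Jacobian of g
        K = [[2.0 * lam, 0.0, J[0]],       # Hessian of L is 2*lam*I here
             [0.0, 2.0 * lam, J[1]],
             [J[0], J[1], 0.0]]
        dx1, dx2, dlam = solve(K, [-grad_L[0], -grad_L[1], -g])
        x1, x2, lam = x1 + alpha * dx1, x2 + alpha * dx2, lam + alpha * dlam
    return x1, x2, lam


x1, x2, lam = kkt_newton(-1.2, -0.8, 0.4)
```

Because the starting point does not satisfy the constraint, the second block row of the right-hand side is nonzero on the first pass, which is exactly the $g(x^v)$ term that replaces 0 in the affine case.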
We explicitly include the step size, $\alpha^v$, because it reminds us of the tradeoff we face between satisfaction of the constraints and improvement in the objective. We may feel it necessary to set $\alpha^v < 1$ should the full Newton-Raphson update lead us to stray from the constraint $g(x) = 0$.
The Newton-Raphson method handles affine and non-affine equality constraints easily because the method is meant to solve problems of the form $g(x) = 0$, whether $g(x)$ is affine or not. However, incorporating inequality constraints is more problematic for precisely the same reason.
[2] At this point, one may verify that the corresponding FOSC for the linear counterpart to this problem are: $\nabla f(x^\star) + A^\top \lambda^\star = 0$, and $Ax^\star - b = 0$, where $A$ is defined in the latter equation as the coefficient matrix of the linear constraint set.
We examine two types of inequality constraints: (1) non-negativity constraints; and (2) feasible operating ranges (i.e. feasible production levels). We address the first problem, non-negativity constrained minimization, subject to affine equality constraints. If the objective function is convex in this case, then we have a convex minimization problem. We introduce the FONC for this problem and the familiar Kuhn-Tucker conditions in the following theorem:
Theorem 2 Let $f(\cdot): \mathbb{R}^n \to \mathbb{R}$ be partially differentiable with continuous partial derivatives, $A \in \mathbb{R}^{m \times n}$, and $b \in \mathbb{R}^m$. Consider the problem,
$$\min_{x \in \mathbb{R}^n} \{ f(x) \mid Ax = b, \; x \ge 0 \}, \tag{2.21}$$
and a point $x^\star \in \mathbb{R}^n$. If $x^\star$ is a local minimizer of (2.21), then $\exists \lambda^\star \in \mathbb{R}^m, \mu^\star \in \mathbb{R}^n$ such that:
$$\nabla f(x^\star) + A^\top \lambda^\star - \mu^\star = 0; \quad M^\star x^\star = 0; \quad Ax^\star = b; \quad x^\star \ge 0; \text{ and } \mu^\star \ge 0; \tag{2.21a}$$
where $M^\star$ is a diagonal matrix with diagonal entries equal to $\mu^\star_\ell$, $\ell = 1, 2, \ldots, n$. The vectors $\lambda^\star$ and $\mu^\star$ satisfying the conditions (2.21a) are called the vectors of Lagrange multipliers for the constraints $Ax = b$ and $x \ge 0$, respectively. The conditions $M^\star x^\star = 0$ are the familiar complementary slackness conditions.
Proof See Nash and Sofer (1996, Section 14.4).
To see why the addition of non-negativity constraints is problematic for the Newton-Raphson method, consider that the complementary slackness conditions for minimization, $M^\star x^\star = 0$, require that for each choice variable, $x_\ell$, either $\mu_\ell = 0$ or $x_\ell = 0$ (or both), $\ell = 1, 2, \ldots, n$. To apply the Newton-Raphson method to this condition, we linearize the equation $Mx = 0$ about the values of $\mu$ and $x$ at iteration $v$ and use the linearized equations to construct an update. Doing so, we obtain:
$$\mu_\ell^{v+1} x_\ell^{v+1} = (\mu_\ell^v + \Delta\mu_\ell^v)(x_\ell^v + \Delta x_\ell^v) \approx \mu_\ell^v x_\ell^v + \Delta\mu_\ell^v x_\ell^v + \mu_\ell^v \Delta x_\ell^v.$$
Our goal is to obtain the complementary slackness conditions corresponding to the minimizer, $x^\star$. Unfortunately, though, it is entirely possible that we arrive at an iterate for which $x_\ell^v = 0$ before we have found the minimizer. If this happens, then by the Newton-Raphson method, we have:
$$\mu_\ell^{v+1} x_\ell^{v+1} \approx \mu_\ell^v x_\ell^v + \Delta\mu_\ell^v x_\ell^v + \mu_\ell^v \Delta x_\ell^v = \mu_\ell^v \Delta x_\ell^v,$$
(since $x_\ell^v = 0$). But the Newton-Raphson method implies that:
$$J(\cdot)\Delta(\cdot) = -g(\cdot),$$
and thus:
$$\mu_\ell^v \Delta x_\ell^v = 0 - (\mu_\ell^v x_\ell^v) = 0 \;\Rightarrow\; \Delta x_\ell^v = 0.$$
Therefore, once we get to the point where $x_\ell^v = 0$ (or $\mu_\ell^v = 0$, for that matter), we will never move from that point. Successful application of the Newton-Raphson method in this case requires that we avoid these two outcomes.
Fortunately there is a simple (and rather clever) method for doing just that, called the interior point algorithm.[3] Conceptually, the interior point algorithm erects a barrier that (usually) prevents violation of all of the inequality constraints, so that $x^v, \mu^v \in \mathbb{R}^n_{++}, \forall v$. To avoid the case where an entry of $x$ or $\mu$ is on the boundary of the positive orthant, the barrier function should increase sufficiently rapidly as the iterate approaches the boundary of the feasible region.
To illustrate the interior point algorithm and its associated barrier function, let us consider the following problem:
$$\min_{x \in \mathbb{R}^n} \{ f(x) \text{ s.t. } x \ge 0 \}. \tag{2.22}$$
We add a barrier function to the objective function to form the barrier objective:
$$\phi(x) = f(x) + f_b(x).$$
A suitable barrier function will be differentiable on the interior of the constraint set, but will be unbounded as we approach the set's boundary. Two such barrier functions are the reciprocal function and the negative of the logarithm function, known as the logarithmic barrier function, the latter of which we will study further. Let us define the logarithmic barrier function $f_b(\cdot): \mathbb{R}^n_{++} \to \mathbb{R}$ for the constraints $x \ge 0$ by:
$$\forall x \in \mathbb{R}^n_{++}, \quad f_b(x) = -\sum_{\ell=1}^{n} \ln x_\ell.$$
At this point, the typical response for an economist is probably a lot of head scratching, because adding a function to the objective that becomes infinite just as the objective is approaching its minimum value makes no sense at all. Fortunately (and this is the point where the approach becomes exceedingly clever), this is not where the methodology terminates. At this juncture, we multiply the barrier function by the barrier parameter, $t \ge 0$, obtaining the revised barrier function:
$$t f_b(x) = -t \sum_{\ell=1}^{n} \ln x_\ell,$$
and the revised barrier objective:
$$\phi(x) = f(x) + t f_b(x).$$
Thus, instead of solving (2.22), we will solve the barrier problem:
$$\min_{x \in \mathbb{R}^n} \{ \phi(x) \mid Ax = b, \; x \ge 0 \}. \tag{2.23}$$
That is, we minimize $\phi(x)$ over values of $x \in \mathbb{R}^n$ that satisfy $Ax = b$ and which are also in the interior of $x \ge 0$.
[3] Alternatively, one might use the active set method (see, e.g. Baldick (2006, section 16.3)).
Notice that for a fixed value of $t$, the swift increase in the value of the barrier as we approach the boundary means that the minimizer of problem (2.23), $x^\star(t)$, lies in $\mathbb{R}^n_{++}$. Thus, problem (2.23) will not have a solution unless $\exists x \in \mathbb{R}^n_{++}$ such that $Ax = b$.[4] Notice also that because we wish to avoid iterates that fall outside of $\mathbb{R}^n_{++}$, we must carefully monitor the step size to prevent this from happening.[5]
Since the barrier function distorts the objective function, actually preventing us from finding the optimal value of the decision vector $x$, the key to the interior point algorithm is that its initial iterate starts at some positive value for $t$, but then constructs a path of iterates for successive values of the barrier objective as $t \to 0$.
As mentioned above, for a fixed value of $t$, we minimize the barrier objective, obtaining the minimizer $x^\star(t)$. We then run another iterate of the Newton-Raphson mechanism for a smaller value of the barrier parameter. That is, $t_{y+1} < t_y$, where $y$ is the iteration counter for the iterates of $t$ (as opposed to $v$, the iteration counter for the Newton-Raphson iterates). At the first run of the interior point algorithm, we are free to set the initial value, $x^0(t_0)$. However, for every successive iteration, $y > 0$, we set the initial value of the vector of choice variables equal to the optimal value(s) for the previous iteration. That is,
$$x^0(t_{y+1}) = x^\star(t_y).$$
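The nesting of the two loops (an outer loop over the barrier parameter and an inner Newton-Raphson loop) can be sketched on a deliberately tiny problem. The problem is hypothetical and not taken from the text: minimize $x$ subject to $x \ge 0$ in one variable, so that the barrier objective is $x - t \ln x$ with minimizer $x^\star(t) = t$. The reduction factor of 0.25 and the other parameter values are arbitrary choices made here.

```python
def barrier_path(t=1.0, x=0.5, shrink=0.25, outer_iters=10, tol=1e-10):
    """Follow the central path of min x s.t. x >= 0 using phi(x) = x - t*ln(x)."""
    path = []
    for _ in range(outer_iters):
        for _ in range(100):                  # Newton-Raphson iterations, fixed t
            grad = 1.0 - t / x                # phi'(x)
            if abs(grad) < tol:
                break
            dx = -grad / (t / x**2)           # Newton step, phi''(x) = t / x^2
            alpha = 1.0
            while x + alpha * dx <= 0.0:      # halve alpha to stay in the interior
                alpha *= 0.5
            x += alpha * dx
        path.append((t, x))
        t *= shrink                           # reduce the barrier parameter;
    return path                               # x carries over as the warm start


path = barrier_path()
```

Each pair in `path` is $(t_y, x^\star(t_y))$. With this reduction factor the warm start sits far enough from the new minimizer that a full Newton-Raphson step would leave the positive orthant, so the step-size safeguard is exercised at every outer iteration after the first.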
As the alert reader will already have surmised, the limit of this sequence of iterates is the minimizer, $x^\star$, of the objective function, $f(x)$. The trajectory of the sequence of minimizers of problem (2.23) is known as the central path (Nash and Sofer (1996, Section 17.4)). If the sequences of minimizers and Lagrange multipliers associated with problem (2.23) converge, then the limits of these sequences satisfy the FONC for problem (2.22).[6]
[4] This is known as the Slater condition (see e.g. Nash and Sofer (1996, p. 485)).
[5] Even though we penalize values of the choice variables that fall near the boundary of the feasible set, this alone does not guarantee that we will not step outside of this set. If the Newton-Raphson step direction, $\Delta x$, generated by the Newton-Raphson method is too large, we might still find ourselves on (or outside) the boundary of the choice set. We stop this from happening by appropriate choice of $\alpha^v$. Thus, the interior point algorithm is not quite as foolproof as one might initially imagine.
Provided that we have successfully implemented the barrier function, so that the central path falls strictly within $\mathbb{R}^n_{++}$, we may (partially) ignore the inequality constraints and the domain of the barrier function. That is, computationally, we may examine the problem:
$$\min_{x \in \mathbb{R}^n} \{ \phi(x) \mid Ax = b \}. \tag{2.24}$$
Thus, we have effectively reduced the problem of non-negatively constrained minimization, subject to equality constraints, to equality-constrained minimization alone. So long as $f(\cdot)$ and $f_b(\cdot)$ are convex, we may write the FOSC for minimization of (2.24) as:
$$\nabla\phi(x) - A^\top \lambda = 0, \tag{2.25}$$
$$b - Ax = 0. \tag{2.26}$$
These are a set of nonlinear simultaneous equations which we will once again solve using the Newton-Raphson method.
To use the Newton-Raphson method to solve (2.24), write the barrier objective function in terms of its separate components:
$$\phi(x) = f(x) + t f_b(x).$$
Therefore, the first term in (2.25) is:
$$\nabla\phi(x) = \nabla f(x) + t \nabla f_b(x). \tag{2.27}$$
Writing (2.27) in terms of the logarithmic barrier function yields:
$$\nabla\phi(x) = \nabla f(x) + t \nabla\Big(-\sum_{\ell=1}^{n} \ln x_\ell\Big) = \nabla f(x) + t \begin{bmatrix} -\frac{1}{x_1} \\ \vdots \\ -\frac{1}{x_n} \end{bmatrix} = \nabla f(x) - t [X]^{-1} \mathbf{1},$$
where $X = \text{diag}\{x_\ell\} \in \mathbb{R}^{n \times n}$ is a diagonal matrix with entries equal to $x_\ell$, $\ell = 1, \ldots, n$, and $\mathbf{1}$ is the $n$-vector of ones.
[6] As Baldick (2006, problem 16.20) demonstrates, it is difficult to solve (2.23) directly for a very small value of $t$, because a badly chosen initial estimate leads to a poor update of the algorithm. Baldick (2006, p. 638) also notes that, in problems with more than one variable, if the initial estimate is far from the minimizer of (2.23) for the current value of the barrier parameter $t$, then the coefficient matrix to determine the step direction may be ill-conditioned. Thus, we are well-advised to start with a fairly large value of $t$ and work toward the minimizer of (2.22) using the central path generated from (2.23), instead of "cheating" by starting with a small value for $t$.
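The Newton-Raphson update for this problem is expressed by the familiar equations:
$$\begin{bmatrix} \nabla^2\phi(x^v) & A^\top \\ A & 0 \end{bmatrix} \begin{bmatrix} \Delta x^v \\ \Delta\lambda^v \end{bmatrix} = -\begin{bmatrix} \nabla\phi(x^v) + A^\top \lambda^v \\ Ax^v - b \end{bmatrix}.$$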
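Decomposing the barrier function, this system becomes:
$$\begin{bmatrix} \nabla^2 f(x^v) + t [X^v]^{-2} & A^\top \\ A & 0 \end{bmatrix} \begin{bmatrix} \Delta x^v \\ \Delta\lambda^v \end{bmatrix} = -\begin{bmatrix} \nabla f(x^v) - t [X^v]^{-1} \mathbf{1} + A^\top \lambda^v \\ Ax^v - b \end{bmatrix}. \tag{2.28}$$
Updates of this family are known as the primal interior point algorithm.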
As we move along here, implementation of the Newton-Raphson method becomes increasingly complicated. The primal interior point algorithm does not simply involve finding a single vector of minimizers, but rather successive rounds of them. This emphasizes the indispensability of computer programming in successfully addressing any but the simplest of problems using this algorithm. Nevertheless, it is instructive to work out an example using this method.
Example 2.5: Primal interior point algorithm
Consider the following problem:
$$\min_{x \in \mathbb{R}^2} \left\{ (x_1)^2 - (x_2)^2 \;\middle|\; x_1 + x_2 = 1, \; x_1 \ge 0, \; x_2 \ge 0 \right\}.$$
To implement the primal interior point algorithm with a negative logarithmic penalty function, we set up the augmented Lagrangian as follows:
$$L = (x_1)^2 - (x_2)^2 - t \ln x_1 - t \ln x_2 + \lambda(1 - x_1 - x_2).$$
To demonstrate the algorithm effectively, we want to avoid too big of a step size. We will set $t = 1$. Having done so (and cheating a little bit), setting $(x^0)^\top = [0.2875 \;\; 0.7125 \;\; -2.825]$, where the third entry is the initial value of $\lambda$, will avoid a large step. To solve for the step zero Newton-Raphson update, we must solve for the FOC for a minimum and the Hessian of the Lagrangian (the Jacobian of the FOCs), which we show below:
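$$\frac{\partial L}{\partial x_1} = 2x_1 - \frac{t}{x_1} - \lambda = 0,$$
$$\frac{\partial L}{\partial x_2} = -2x_2 - \frac{t}{x_2} - \lambda = 0,$$
$$\frac{\partial L}{\partial \lambda} = 1 - x_1 - x_2 = 0, \text{ and}$$
$$H = \begin{bmatrix} 2 + \frac{t}{(x_1)^2} & 0 & -1 \\ 0 & -2 + \frac{t}{(x_2)^2} & -1 \\ -1 & -1 & 0 \end{bmatrix}. \tag{2.29}$$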
Notice that (2.29) and the coefficient matrix on the lhs of (2.28) are, in fact, equivalent, since:
$$\nabla^2 f(x^v) + t [X^v]^{-2} = \begin{bmatrix} 2 + \frac{t}{(x_1)^2} & 0 \\ 0 & -2 + \frac{t}{(x_2)^2} \end{bmatrix},$$
and:
$$A = \begin{bmatrix} 1 & 1 \end{bmatrix}.$$
Notice also that the rhs of (2.28) is simply the system of FOCs (with the exception that the zeros are explicit in the FOCs, whereas they are implicit in the rhs terms of (2.28): because we want $g(x^v) = 0$, the rhs of (2.28) is $-g(x^v) = -(g(x^v) - 0)$, but the zero vector is implicit).
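The Newton-Raphson step-zero step direction is thus derived from the solution of:
$$\begin{bmatrix} 2 + \frac{t}{(x_1^0)^2} & 0 & -1 \\ 0 & -2 + \frac{t}{(x_2^0)^2} & -1 \\ -1 & -1 & 0 \end{bmatrix} \begin{bmatrix} \Delta x_1^0 \\ \Delta x_2^0 \\ \Delta\lambda^0 \end{bmatrix} = -\begin{bmatrix} 2x_1^0 - \frac{t}{x_1^0} - \lambda^0 \\ -2x_2^0 - \frac{t}{x_2^0} - \lambda^0 \\ 1 - x_1^0 - x_2^0 \end{bmatrix}.[7] \tag{2.30}$$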
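Substituting in the initial values for $x_1$, $x_2$, and $\lambda$, we find:
$$\begin{bmatrix} \Delta x_1^0 \\ \Delta x_2^0 \\ \Delta\lambda^0 \end{bmatrix} = \begin{bmatrix} 0.005314 \\ -0.005314 \\ -0.003349 \end{bmatrix},$$
and therefore,
$$\begin{bmatrix} x_1^1 \\ x_2^1 \\ \lambda^1 \end{bmatrix} = \begin{bmatrix} 0.2928 \\ 0.7072 \\ -2.8280 \end{bmatrix},$$
with associated errors:
$$g\begin{bmatrix} x_1^1 \\ x_2^1 \\ \lambda^1 \end{bmatrix} = \begin{bmatrix} -0.00170 \\ -0.00043 \\ 0.00000 \end{bmatrix}.$$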
For our purposes, these errors are sufficiently small that we may move on to the subsequent step in the algorithm. Next, we iterate on the barrier parameter, $t$. Doing so, we decrease $t$ from 1 to 0.9 (and to 0.8 in the next run, and so on). As mentioned above, we use the optimal values, $x^1$, found for $t = 1$ for the initial estimates for $x_1$, $x_2$, and $\lambda$. The augmented Lagrangian is again:
$$L = (x_1)^2 - (x_2)^2 - t \ln x_1 - t \ln x_2 + \lambda(1 - x_1 - x_2),$$
so that the Newton-Raphson update is of exactly the same form (as one would expect), but this time with the reduced value of $t$.
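So, inserting the values $[x_1^1 \;\; x_2^1 \;\; \lambda^1]^{\dagger} = [0.2928 \;\; 0.7072 \;\; -2.8280]^{\dagger}$ and $t = 0.9$ into (2.30) gives:$^{8}$
\[
\begin{bmatrix} \Delta x_1^1 \\ \Delta x_2^1 \\ \Delta\lambda^1 \end{bmatrix}
= \begin{bmatrix} -0.0162 \\ 0.0162 \\ 0.1377 \end{bmatrix},
\qquad
\begin{bmatrix} x_1^2 \\ x_2^2 \\ \lambda^2 \end{bmatrix}
= \begin{bmatrix} 0.2766 \\ 0.7234 \\ -2.6903 \end{bmatrix},
\quad\text{and}\quad
g\!\left(\begin{bmatrix} x_1^2 \\ x_2^2 \\ \lambda^2 \end{bmatrix}\right)
= \begin{bmatrix} -0.0103 \\ -0.0006 \\ 0.0000 \end{bmatrix}. \tag{2.31}
\]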
Assuming a tolerance of 0.005 for the error term, we would reiterate with $t = 0.9$ before ratcheting $t$ down to 0.8 (assuming, of course, we reach the desired tolerance in the next iteration).
As we would expect, after we decrease the barrier parameter, the values for $x_1$ and $x_2$ move closer to the true minimizers $(x_1^{\ast}, x_2^{\ast}) = (0, 1)$. The central path will converge to this value as $t \to 0$.
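The two Newton-Raphson steps of this example can be checked numerically. The following is a minimal sketch, not the author's code: the helper names `solve`, `gradient`, and `newton_step` are hypothetical, the solver is a hand-written Gaussian elimination so that no external library is required, and the starting point $x_1^0 = 0.2875$ is an assumption chosen so that $x_1^0 + x_2^0 = 1$. The signs follow from the Lagrangian $L = (x_1)^2 - (x_2)^2 - t\ln x_1 - t\ln x_2 + \lambda(1 - x_1 - x_2)$.

```python
def solve(a, b):
    """Solve the dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def gradient(x1, x2, lam, t):
    """Left-hand sides of the three first-order conditions."""
    return [2 * x1 - t / x1 - lam,
            -2 * x2 - t / x2 - lam,
            1 - x1 - x2]


def newton_step(x1, x2, lam, t):
    """One full Newton-Raphson step: solve H * delta = -gradient, then update."""
    hessian = [[2 + t / x1 ** 2, 0.0, -1.0],
               [0.0, -2 + t / x2 ** 2, -1.0],
               [-1.0, -1.0, 0.0]]
    rhs = [-g for g in gradient(x1, x2, lam, t)]
    d = solve(hessian, rhs)
    return x1 + d[0], x2 + d[1], lam + d[2]


point = (0.2875, 0.7125, -2.825)     # assumed starting values (x1, x2, lambda)
point = newton_step(*point, t=1.0)   # step taken with t = 1
print([round(v, 4) for v in point])
point = newton_step(*point, t=0.9)   # barrier parameter lowered to 0.9
print([round(v, 4) for v in point])
```

Under these assumptions the two printed vectors should land close to the iterates reported in the example, roughly $(0.293, 0.707, -2.828)$ and then $(0.277, 0.723, -2.690)$; small differences in the last digit come from carrying unrounded values between steps.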
Having introduced the primal interior point algorithm, we are ready to move on to a variant of this algorithm that includes explicit consideration of non-negativity constraints, called the primal-dual interior point algorithm. If the primal interior point algorithm is clever, the primal-dual interior point algorithm is truly inspired. The algorithm focuses on the handling of the complementary slackness conditions for non-negatively constrained minimization:
\[
x_\ell\,\mu_\ell = 0, \quad \mu_\ell \ge 0, \quad x_\ell \ge 0; \qquad \ell = 1, 2, \ldots, n,
\]
where $n$ is the number of non-negatively constrained variables.$^{9}$
$^{9}$ The following discussion is based on Baldick (2006, Section 16.4.3.3).
As argued above, one cannot incorporate the complementary slackness conditions directly into the Newton-Raphson algorithm, because the ordered pairs $(x_\ell, \mu_\ell)$ that satisfy these conditions lie either on the $x$ axis, the $\mu$ axis, or at the origin; and linearization of the complementary slackness conditions implies that once a variable hits an axis, it stays there.
But when we replace the conventional complementary slackness condition with a revised condition:
\[
x_\ell\,\mu_\ell = t, \qquad \ell = 1, 2, \ldots, n, \tag{2.32}
\]
the following hold:
1. The complementary slackness condition no longer brings us to either the $x$ or $\mu$ axis;
2. Linearization of (2.32), together with an explicit requirement to avoid the $x$ and $\mu$ axes, yields a useful update that can approximate the kink in the complementary slackness conditions;
3. Solving for $\mu_\ell = t/x_\ell$ will come in handy in the FOCs; and
4. As $t \to 0$, the modified complementary slackness conditions approach the complementary slackness conditions for minimization of the objective function. That is, as $t$ is reduced, the hyperbolic-shaped sets corresponding to the modified complementary slackness conditions come closer to the set of points satisfying the complementary slackness conditions and the non-negativity constraints.
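Point 2 can be made concrete with a short linearization. The sketch below assumes the current iterate is written $(x_\ell^v, \mu_\ell^v)$ and the step $(\Delta x_\ell, \Delta\mu_\ell)$, notation introduced here for illustration. Dropping the second-order term $\Delta x_\ell\,\Delta\mu_\ell$:
\[
(x_\ell^v + \Delta x_\ell)(\mu_\ell^v + \Delta\mu_\ell) \approx x_\ell^v\mu_\ell^v + \mu_\ell^v\,\Delta x_\ell + x_\ell^v\,\Delta\mu_\ell = t
\quad\Longrightarrow\quad
\mu_\ell^v\,\Delta x_\ell + x_\ell^v\,\Delta\mu_\ell = t - x_\ell^v\mu_\ell^v .
\]
With $t = 0$ and an iterate already on an axis, say $x_\ell^v = 0$ and $\mu_\ell^v > 0$, this reduces to $\mu_\ell^v\,\Delta x_\ell = 0$, so $\Delta x_\ell = 0$ and the variable never leaves the axis. With $t > 0$ the same point gives $\Delta x_\ell = t/\mu_\ell^v > 0$, so the linearized step points back into the interior.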
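Point 3 suggests using the logarithmic barrier function with barrier parameter $t$: $-t\,\mathbf{1}^{\dagger}\ln(x)$. We may set up the barrier problem for non-negatively constrained minimization with linear constraints as:
\[
\min_{x \in \mathbb{R}^n} \; L = f(x) - t\,\mathbf{1}^{\dagger}\ln(x) + \lambda^{\dagger}(b - Ax). \tag{2.33}
\]
The FONC for a minimum are:
\[
\frac{\partial L}{\partial x} = \nabla f(x) - t\,[X]^{-1}\mathbf{1} - A^{\dagger}\lambda = 0, \tag{2.34}
\]
\[
\frac{\partial L}{\partial \lambda} = b - Ax = 0. \tag{2.35}
\]
From point 3, we may re-write (2.34) as:$^{10}$
\[
\nabla f(x) - A^{\dagger}\lambda - \mu = 0.
\]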
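Next, we use the Newton-Raphson method to find a step direction to solve the complementary slackness condition and the two first-order conditions, (2.34) and (2.35):
\[
t\mathbf{1} - X\mu = 0, \qquad
\nabla f(x) - A^{\dagger}\lambda - \mu = 0, \qquad
Ax = b.
\]
By inspection, the Newton-Raphson step direction is:
\[
\begin{bmatrix}
X^v & M^v & 0 \\
-I & \nabla^2 f(x^v) & -A^{\dagger} \\
0 & -A & 0
\end{bmatrix}
\begin{bmatrix} \Delta\mu^v \\ \Delta x^v \\ \Delta\lambda^v \end{bmatrix}
=
\begin{bmatrix}
t\mathbf{1} - X^v\mu^v \\
-\nabla f(x^v) + A^{\dagger}\lambda^v + \mu^v \\
Ax^v - b
\end{bmatrix},
\]
where $M^v = \mathrm{diag}\{\mu_\ell^v\}$ and $X^v = \mathrm{diag}\{x_\ell^v\}$. The Newton-Raphson update is:$^{11}$
\[
\begin{bmatrix} \mu^{v+1} \\ x^{v+1} \\ \lambda^{v+1} \end{bmatrix}
=
\begin{bmatrix} \mu^{v} \\ x^{v} \\ \lambda^{v} \end{bmatrix}
+
\begin{bmatrix} \Delta\mu^{v} \\ \Delta x^{v} \\ \Delta\lambda^{v} \end{bmatrix}.
\]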
Note that iterating until we approach a minimizer, $x^{\ast}_{t_0}$, corresponding to a fixed value of $t = t_0$ would be very computationally expensive, as we continue this process for smaller and smaller values of $t$. Therefore, instead of doing so, a more conventional method is to reduce $t$ slightly at every iteration of the process. Reducing $t$ excessively yields a poor update direction because the step size needed to maintain non-negativity of the iterates will be very small (see Baldick (2006, Exercise 16.22)).
As before, we will find the optimal vector of choice variables, $x^{\ast}$, as the limit of the central path as $t$ approaches zero, provided that the problem is well-behaved.
Example 2.6: Primal-dual interior point algorithm
Consider again the problem from Example 2.5:
\[
\min_{x \in \mathbb{R}^2} \left\{ (x_1)^2 - (x_2)^2 \;\middle|\; x_1 + x_2 = 1,\; x_1 \ge 0,\; x_2 \ge 0 \right\}.
\]
We write the augmented Lagrangian for problem (2.65) corresponding to the primal-dual interior point algorithm as:
\[
L = (x_1)^2 - (x_2)^2 - t\ln x_1 - t\ln x_2 + \lambda(1 - x_1 - x_2).
\]
$^{11}$ Because the full Newton-Raphson update may take us outside the bounds of the feasible region, we will modify the update according to the step size, $\alpha^v$, as necessary. For further discussion of the step size for the primal-dual problem, see Baldick (2006, pp. 643-645).
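The FONC for a minimum are:
\[
\frac{\partial L}{\partial x_1} = 2x_1 - \frac{t}{x_1} - \lambda = 0, \qquad
\frac{\partial L}{\partial x_2} = -2x_2 - \frac{t}{x_2} - \lambda = 0, \qquad
\frac{\partial L}{\partial \lambda} = 1 - x_1 - x_2 = 0,
\]
with associated complementary slackness conditions:
\[
t - x_1\mu_1 = 0, \qquad t - x_2\mu_2 = 0.
\]
The Newton-Raphson update is given by the solution of the following system of equations:$^{12}$
\[
\begin{bmatrix}
x_1 & 0 & \mu_1 & 0 & 0 \\
0 & x_2 & 0 & \mu_2 & 0 \\
1 & 0 & -2 & 0 & 1 \\
0 & 1 & 0 & 2 & 1 \\
0 & 0 & 1 & 1 & 0
\end{bmatrix}
\begin{bmatrix} \Delta\mu_1 \\ \Delta\mu_2 \\ \Delta x_1 \\ \Delta x_2 \\ \Delta\lambda \end{bmatrix}
=
\begin{bmatrix}
t - x_1\mu_1 \\
t - x_2\mu_2 \\
2x_1 - \mu_1 - \lambda \\
-2x_2 - \mu_2 - \lambda \\
1 - x_1 - x_2
\end{bmatrix}. \tag{2.36}
\]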
Even for a simple problem, the Newton-Raphson method is already getting out of hand (which is why, for any problem worth solving, one writes code and lets the computer do the rest of the work). Nevertheless, we will demonstrate the first iteration of this process. Using the same barrier parameter, $t = 1$, we will use the revised values, $x_1^1$, $x_2^1$, and $\lambda^1$, found in (2.31) for the first run of the primal interior point algorithm. With $x_1$ and $x_2$ determining the values for $\mu_1$ and $\mu_2$, the initial vector is:
\[
\begin{bmatrix} \mu_1^0 \\ \mu_2^0 \\ x_1^0 \\ x_2^0 \\ \lambda^0 \end{bmatrix}
=
\begin{bmatrix} 3.6153 \\ 1.3824 \\ 0.2766 \\ 0.7234 \\ -2.6903 \end{bmatrix}.
\]
$^{12}$ It does seem strange that we lose the $t/x^2$ term in the Hessian, but if you look at Baldick (2006, p. 643) the expected term reappears.
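Inserting these values into eq. (2.36), the stage zero update is:
\[
\begin{bmatrix} \mu_1^1 \\ \mu_2^1 \\ x_1^1 \\ x_2^1 \\ \lambda^1 \end{bmatrix}
=
\begin{bmatrix} 3.6153 \\ 1.3824 \\ 0.2766 \\ 0.7234 \\ -2.6903 \end{bmatrix}
+
\begin{bmatrix} -0.2033 \\ 0.0297 \\ 0.0156 \\ -0.0156 \\ -0.1375 \end{bmatrix}
=
\begin{bmatrix} 3.4120 \\ 1.4121 \\ 0.2922 \\ 0.7078 \\ -2.8278 \end{bmatrix}.
\]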
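The arithmetic behind this update can be reproduced with a short script. The sketch below is not part of the original example: the helper names `solve` and `primal_dual_step` are hypothetical, the linear solver is a plain Gaussian elimination written out so that no external library is needed, and the system it builds is (2.36) with the signs implied by the Lagrangian $L = (x_1)^2 - (x_2)^2 - t\ln x_1 - t\ln x_2 + \lambda(1 - x_1 - x_2)$ and $\mu_\ell = t/x_\ell$.

```python
def solve(a, b):
    """Solve the dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def primal_dual_step(mu1, mu2, x1, x2, lam, t):
    """One Newton-Raphson step on the perturbed optimality conditions.

    Unknowns are ordered (d_mu1, d_mu2, d_x1, d_x2, d_lambda).
    """
    matrix = [[x1, 0.0, mu1, 0.0, 0.0],    # linearized x1 * mu1 = t
              [0.0, x2, 0.0, mu2, 0.0],    # linearized x2 * mu2 = t
              [1.0, 0.0, -2.0, 0.0, 1.0],  # stationarity in x1
              [0.0, 1.0, 0.0, 2.0, 1.0],   # stationarity in x2
              [0.0, 0.0, 1.0, 1.0, 0.0]]   # equality constraint x1 + x2 = 1
    rhs = [t - x1 * mu1,
           t - x2 * mu2,
           2 * x1 - mu1 - lam,
           -2 * x2 - mu2 - lam,
           1 - x1 - x2]
    step = solve(matrix, rhs)
    current = (mu1, mu2, x1, x2, lam)
    return [v + dv for v, dv in zip(current, step)]


start = (3.6153, 1.3824, 0.2766, 0.7234, -2.6903)  # (mu1, mu2, x1, x2, lambda)
print([round(v, 4) for v in primal_dual_step(*start, t=1.0)])
```

Because the starting values are rounded to four decimals, the printed vector should agree with the update above only to about three decimal places.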
As one might expect, the method for solving inequality constrained optimization problems is quite similar to that for solving non-negatively constrained optimization problems, using a barrier parameter to bound the iterates away from the constraint. The difference is that we use the familiar method of slack variables (along with complementary slackness conditions) in solving the problem. Formally, we consider the following problem:
\[
\min_{x \in \mathbb{R}^n} \{ f(x) \mid Ax = b,\; Cx \le d \}, \tag{2.37}
\]
where $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^{m}$, $C \in \mathbb{R}^{r \times n}$, and $d \in \mathbb{R}^{r}$ are constants. The feasible set defined by the linear equality and inequality constraints is convex (see Baldick (2006, Exercise 2.36)). Thus, if $f(\cdot)$ is convex on the feasible set, the problem is convex. We introduce the FONC for this problem in the following theorem:
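Theorem 3 Let $f : \mathbb{R}^n \to \mathbb{R}$ be partially differentiable with continuous partial derivatives, $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^{m}$, $C \in \mathbb{R}^{r \times n}$, and $d \in \mathbb{R}^{r}$. Consider problem (2.37),
\[
\min_{x \in \mathbb{R}^n} \{ f(x) \mid Ax = b,\; Cx \le d \},
\]
and a point $x^{\ast} \in \mathbb{R}^n$. If $x^{\ast}$ is a local minimizer of (2.37) then:
\[
\begin{aligned}
\exists\, \lambda^{\ast} \in \mathbb{R}^{m},\; \mu^{\ast} \in \mathbb{R}^{r} \text{ such that: } & \nabla f(x^{\ast}) + A^{\dagger}\lambda^{\ast} + C^{\dagger}\mu^{\ast} = 0; \\
& M^{\ast}(Cx^{\ast} - d) = 0; \\
& Ax^{\ast} = b; \\
& Cx^{\ast} \le d; \text{ and} \\
& \mu^{\ast} \ge 0,
\end{aligned}
\]
where $M^{\ast} = \mathrm{diag}\{\mu_\ell^{\ast}\} \in \mathbb{R}^{r \times r}$. The vectors $\lambda^{\ast}$ and $\mu^{\ast}$ satisfying the above conditions are called the vectors of Lagrange multipliers for the constraints $Ax = b$ and $Cx \le d$, respectively. The conditions $M^{\ast}(Cx^{\ast} - d) = 0$ are the complementary slackness conditions.
Proof: Nash and Sofer (1996, Section 14.4). Again, the FOCs are sufficient provided that the objective is convex on the feasible set.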
Introducing the slack variable, $w \in \mathbb{R}^{r}$, we transform the inequality constraint $Cx \le d$ into the equivalent equality constraint: $Cx + w = d$. Upon making this substitution, (2.37) becomes:
\[
\min_{x \in \mathbb{R}^n,\; w \in \mathbb{R}^r} \{ f(x) \mid Ax = b,\; Cx + w = d,\; w \ge 0 \},
\]
transforming the inequality constrained problem into a non-negatively constrained problem. This formulation suggests a method for solving the inequality constrained problem analogous to that of solving the non-negatively constrained optimization problem: introduce a barrier function into the objective function and then minimize the barrier objective function. We form the barrier objective function $\phi(\cdot) : \mathbb{R}^n \times \mathbb{R}^r_{++} \to \mathbb{R}$ as:
\[
\phi(x, w) = f(x) + t f_b(w), \qquad x \in \mathbb{R}^n,\; w \in \mathbb{R}^r_{++}.
\]
Note that the barrier function does not apply to $x$ here, unless $x$ is non-negatively constrained as well.
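We must assume that the Slater condition holds in order to solve this problem using an interior point algorithm. We again specify a logarithmic barrier function, so that:
\[
f_b(w) = -\sum_{\ell=1}^{r} \ln(w_\ell), \qquad w \in \mathbb{R}^r_{++},
\]
\[
\nabla f_b(w) = -[W]^{-1}\mathbf{1},
\]
where $W = \mathrm{diag}\{w_\ell\} \in \mathbb{R}^{r \times r}$ is a diagonal matrix with diagonal entries equal to $w_\ell$, $\ell = 1, 2, \ldots, r$. Setting up the Lagrangian for this general problem, with the multipliers signed as in Theorem 3, we have:
\[
L = f(x) - t\,\mathbf{1}^{\dagger}\ln(w) + \lambda^{\dagger}(Ax - b) + \mu^{\dagger}(Cx + w - d). \tag{2.38}
\]
Since this problem is quite similar to problem (2.37) we will not explicitly work out an example, but rather simply set up the general FONC for problem (2.38) and the Newton-Raphson step direction, as follows. FONC:
\[
\nabla f(x) + A^{\dagger}\lambda + C^{\dagger}\mu = 0, \tag{2.39}
\]
\[
Ax = b, \tag{2.40}
\]
\[
Cx + w = d, \tag{2.41}
\]
\[
t\,\nabla f_b(w) + \mu = 0. \tag{2.42}
\]
Writing (2.42) as $W\mu = t\mathbf{1}$, the Newton-Raphson step direction is given as the solution to:
\[
\begin{bmatrix}
M^v & 0 & 0 & W^v \\
0 & \nabla^2 f(x^v) & A^{\dagger} & C^{\dagger} \\
0 & A & 0 & 0 \\
I & C & 0 & 0
\end{bmatrix}
\begin{bmatrix} \Delta w^v \\ \Delta x^v \\ \Delta\lambda^v \\ \Delta\mu^v \end{bmatrix}
=
\begin{bmatrix}
t\mathbf{1} - W^v\mu^v \\
-\nabla f(x^v) - A^{\dagger}\lambda^v - C^{\dagger}\mu^v \\
b - Ax^v \\
d - Cx^v - w^v
\end{bmatrix}. \tag{2.43}
\]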
Again, decreasing the barrier parameter yields the central path, and the limit of the central path as the barrier parameter approaches zero yields the solution to the inequality constrained problem. One special case of inequality constrained minimization is when $C = I_r$, the identity matrix of order $r$. This is the case when there exist non-zero upper and/or lower limits on a choice variable. In the case of a power flow problem, these constraints take the form of: (1) limits on flows of energy across power lines; and (2) minimum and maximum run constraints on generators, both with respect to voltage and power.
Fortunately, we may approach this problem using the tools already in place. Let $\overline{x}$ denote the vector of maximum values that the variables $x$ may take. We transform the inequality constraint $x \le \overline{x}$ into an equality constraint using the slack variable approach, as follows: $x + w = \overline{x}$, $w \ge 0$. From there, the method proceeds in the same manner as the general solution method for inequality constrained minimization (i.e., set up problem (2.38), derive FONC (2.39)-(2.42), and set up the Newton-Raphson step direction (2.43), etc.). We present an example of this type of inequality-constrained minimization below, for the interested reader.
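Example 2.7: Inequality-constrained minimization
Consider the following problem:
\[
\min_{x \in \mathbb{R}^2} \left\{ (x_1 - 1)^2 + (x_2 - 3)^2 \;\middle|\; x_1 - x_2 = 0,\; x_2 \le 1.5 \right\}.
\]
Perform one iteration of the primal-dual interior point algorithm, using as the initial estimate:
\[
\mu^0 = 1, \quad x_1^0 = 1, \quad x_2^0 = 1, \quad w^0 = 0.5, \quad \lambda^0 = 0, \quad t^0 = 0.5.
\]
First, we set up the Lagrangian:
\[
L = (x_1 - 1)^2 + (x_2 - 3)^2 - t\ln w + \lambda(x_1 - x_2) + \mu(x_2 + w - 1.5).
\]
The FONC are:
\[
\begin{aligned}
\frac{\partial L}{\partial w} &= -\frac{t}{w} + \mu = 0, \\
\frac{\partial L}{\partial x_1} &= 2(x_1 - 1) + \lambda = 0, \\
\frac{\partial L}{\partial x_2} &= 2(x_2 - 3) - \lambda + \mu = 0, \\
\frac{\partial L}{\partial \lambda} &= x_1 - x_2 = 0, \text{ and} \\
\frac{\partial L}{\partial \mu} &= x_2 + w - 1.5 = 0.
\end{aligned}
\]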
The stage zero Newton-Raphson update to the barrier problem is thus given by the solution to the following system of equations:
\[
\begin{bmatrix}
\mu^0 & 0 & 0 & 0 & w^0 \\
0 & 2 & 0 & 1 & 0 \\
0 & 0 & 2 & -1 & 1 \\
0 & 1 & -1 & 0 & 0 \\
1 & 0 & 1 & 0 & 0
\end{bmatrix}
\begin{bmatrix} \Delta w^0 \\ \Delta x_1^0 \\ \Delta x_2^0 \\ \Delta\lambda^0 \\ \Delta\mu^0 \end{bmatrix}
=
\begin{bmatrix}
-w^0\mu^0 + t \\
-2(x_1^0 - 1) - \lambda^0 \\
-2(x_2^0 - 3) + \lambda^0 - \mu^0 \\
-x_1^0 + x_2^0 \\
-x_2^0 - w^0 + 1.5
\end{bmatrix}.
\]
We then have:
\[
w^1 = 0, \quad x_1^1 = 1.5, \quad x_2^1 = 1.5, \quad \lambda^1 = -1, \quad \mu^1 = 2.
\]
Note that we have reached the minimizers of the problem after only one repetition of the algorithm, and that the minimizers occur at the boundary for $x_2$. So, even though the interior point mechanism is supposed to keep us away from the border, it does not, but it does give us the right answer anyway.
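The single step of Example 2.7 is easy to verify by machine. The following is a minimal sketch under stated assumptions: the function names `solve` and `example_2_7_step` are hypothetical, the unknowns are ordered $(\Delta w, \Delta x_1, \Delta x_2, \Delta\lambda, \Delta\mu)$, and the system is the linearization of the five first-order conditions of $L = (x_1 - 1)^2 + (x_2 - 3)^2 - t\ln w + \lambda(x_1 - x_2) + \mu(x_2 + w - 1.5)$, with the first condition written as $w\mu = t$.

```python
def solve(a, b):
    """Solve the dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def example_2_7_step(w, x1, x2, lam, mu, t):
    """One Newton-Raphson step; returns the updated (w, x1, x2, lambda, mu)."""
    matrix = [[mu, 0.0, 0.0, 0.0, w],      # linearized w * mu = t
              [0.0, 2.0, 0.0, 1.0, 0.0],   # stationarity in x1
              [0.0, 0.0, 2.0, -1.0, 1.0],  # stationarity in x2
              [0.0, 1.0, -1.0, 0.0, 0.0],  # equality constraint x1 - x2 = 0
              [1.0, 0.0, 1.0, 0.0, 0.0]]   # slack equation x2 + w = 1.5
    rhs = [t - w * mu,
           -2 * (x1 - 1) - lam,
           -2 * (x2 - 3) + lam - mu,
           -x1 + x2,
           -x2 - w + 1.5]
    step = solve(matrix, rhs)
    return [v + dv for v, dv in zip((w, x1, x2, lam, mu), step)]


print(example_2_7_step(w=0.5, x1=1.0, x2=1.0, lam=0.0, mu=1.0, t=0.5))
```

The step lands on $w = 0$, $x_1 = x_2 = 1.5$, $\lambda = -1$, $\mu = 2$ (up to floating-point rounding), which satisfies the optimality conditions of the original problem exactly; in a general implementation the step would be shortened so that $w$ stays strictly positive.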
At this point, we are just about ready to attack AC OPF. The one additional complication that we face in addressing this problem is that of incorporating nonlinear constraints into the primal-dual algorithm. That is to say, we replace the linear constraints $\{Ax = b,\; Cx \le d\}$ with the nonlinear constraint sets $\{g(x) = 0,\; h(x) \le 0\}$ to consider the nonlinear inequality-constrained minimization problem:
\[
\min_{x \in \mathbb{R}^n} \{ f(x) \mid g(x) = 0,\; h(x) \le 0 \}. \tag{2.44}
\]
As we previously remarked, however, introducing nonlinear constraint sets means the minimization problem is no longer convex. We deal with this problem by choosing functional forms that yield regular points of the constraints $g(x) = 0$ and $h(x) \le 0$. At a regular point of the inequality constraints, linearization of the equality constraints and of the binding inequality constraints will once again yield a useful approximation to the feasible set or its boundary, at least in the vicinity of the regular point.
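We state the FONC for minimization of the nonlinear inequality constrained problem as follows:
Theorem 4 Suppose that the functions $f(\cdot) : \mathbb{R}^n \to \mathbb{R}$, $g(\cdot) : \mathbb{R}^n \to \mathbb{R}^m$, and $h(\cdot) : \mathbb{R}^n \to \mathbb{R}^r$ are partially differentiable with continuous partial derivatives. Let $J(\cdot) : \mathbb{R}^n \to \mathbb{R}^{m \times n}$ and $K(\cdot) : \mathbb{R}^n \to \mathbb{R}^{r \times n}$ be the Jacobians of $g$ and $h$, respectively. Consider Problem (2.44):
\[
\min_{x \in \mathbb{R}^n} \{ f(x) \mid g(x) = 0,\; h(x) \le 0 \}.
\]
Suppose that $x^{\ast} \in \mathbb{R}^n$ is a regular point of the constraints $g(x) = 0$ and $h(x) \le 0$. If $x^{\ast}$ is a local minimizer of Problem (2.44), then:
\[
\begin{aligned}
\exists\, \lambda^{\ast} \in \mathbb{R}^{m},\; \mu^{\ast} \in \mathbb{R}^{r} \text{ such that: } & \nabla f(x^{\ast}) + J(x^{\ast})^{\dagger}\lambda^{\ast} + K(x^{\ast})^{\dagger}\mu^{\ast} = 0; \\
& M^{\ast}h(x^{\ast}) = 0; \\
& g(x^{\ast}) = 0; \\
& h(x^{\ast}) \le 0; \text{ and} \\
& \mu^{\ast} \ge 0.
\end{aligned}
\]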
We again transform the inequality-constrained problem into an equality-constrained problem using slack variables, as follows:
\[
\min_{x \in \mathbb{R}^n,\; w \in \mathbb{R}^r} \{ f(x) \mid g(x) = 0,\; h(x) + w = 0,\; w \ge 0 \}. \tag{2.45}
\]
We once again add a barrier function $f_b(w) : \mathbb{R}^r_{++} \to \mathbb{R}$ and a barrier parameter $t \in \mathbb{R}_{+}$ to transform problem (2.45) into the barrier problem:
\[
\min_{x \in \mathbb{R}^n,\; w \in \mathbb{R}^r} \{ \phi(x, w) \mid g(x) = 0,\; h(x) + w = 0,\; w > 0 \}.
\]
In review, we minimize $\phi(x, w)$ over values of $x \in \mathbb{R}^n$ and $w \in \mathbb{R}^r$ that satisfy $g(x) = 0$ and $h(x) + w = 0$, and are in the interior of $w \ge 0$. In order to address this problem, we again must assume that the Slater condition holds. As before, we partially ignore the inequality constraints and the domain of the barrier function and solve the following nonlinear equality-constrained problem:
\[
\min_{x \in \mathbb{R}^n,\; w \in \mathbb{R}^r} \{ \phi(x, w) \mid g(x) = 0,\; h(x) + w = 0 \},
\]
which has first-order necessary conditions:
\[
\begin{aligned}
& \nabla f(x^{\ast}) + J(x^{\ast})^{\dagger}\lambda^{\ast} + K(x^{\ast})^{\dagger}\mu^{\ast} = 0; \\
& g(x^{\ast}) = 0; \\
& h(x^{\ast}) + w = 0; \text{ and} \\
& t\,\nabla f_b(w) + \mu = 0,
\end{aligned}
\]
and complementary slackness condition:
\[
M^{\ast}h(x^{\ast}) = 0,
\]
where $J(\cdot)$ and $K(\cdot)$ are the Jacobians of $g(\cdot)$ and $h(\cdot)$, respectively, and $\lambda$ and $\mu$ are the Lagrange multipliers corresponding to the constraints $g(x) = 0$ and $h(x) + w = 0$, respectively.
2.2 AC Optimal Power Flow
Conceptually, the AC OPF problem is straightforward. We wish to minimize the cost of producing a given amount of electricity$^{13}$ subject to the power flow equality constraints, generator operating constraints (i.e., minimum and maximum power output),$^{14}$ voltage magnitude constraints,$^{15}$ and transmission-line capacity constraints. We express this general problem as:
\[
\min_{x \in \mathbb{R}^n} \left\{ f(x) \;\middle|\; g(x) = 0,\; \underline{x} \le x \le \overline{x},\; \underline{h} \le h(x) \le \overline{h} \right\}, \tag{2.46}
\]
where:
\[
f(x) : \mathbb{R}^n_{+} \to \mathbb{R}_{+}.
\]
The objective, $f(x)$, thus represents the cost of generating (real) power.$^{16}$ Note that $f(\cdot)$ is separable, since the decisions of one generation owner typically do not affect the costs of any other generation owner. The equality constraint system, $g(x) = 0$, corresponds to Kirchhoff's current law, applied to each node in the system. The inequality constraint system, $\underline{x} \le x \le \overline{x}$, corresponds to generation constraints. Generators have lower and upper operating bounds on real power output and the range of voltages (generally measured in per unit quantities) at which they may be operated. Finally, the inequality constraint system, $\underline{h} \le h(x) \le \overline{h}$, corresponds to transmission line constraints. The function $h(x)$ represents the fact that flows of power on different transmission lines are functions of the power transfer distribution factors for generators at each node of the system.
With respect to the objective function, Stoft (2002) notes that simplified diagrams of generation supply curves typically assume a constant marginal cost up to the point of maximum generation. This produces jump discontinuities in the market supply curve as we move from one generator's marginal cost to the next.$^{17}$ Stoft argues that one may smooth these point discontinuities by assuming that supply curves have extremely large, but finite, slopes when we move from one generator to another, corresponding to a generator's emergency operating region. This assumption allows us to assume a twice-continuously differentiable market supply function.
$^{13}$ Where "given amount" is understood to encompass location-specific electricity demand. "Given amount" may also be interpreted as "optimal amount," if we assume price-responsive demand.
$^{14}$ We will not examine minimum output constraints (minimum run constraints), other than to assume that a generator's output cannot be negative.
$^{15}$ Which we leave to the operations research analysts.
$^{16}$ Since we are examining the AC power problem, we will bring reactive power back into the discussion. However, the total cost of power generation generally depends only on the entries of $x$ corresponding to real power generation. See Baldick (2006, p. 595). In some formulations, $f(\cdot)$ will also depend on reactive power generation.
$^{17}$ See Stoft (2002, Chapter 1-6).
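As for the constraint system, $g(x) = 0$, Kirchhoff's current law implies that the net flow of power away from a given node $\ell$, $P_\ell$, is equal to the net sum of the flows on all lines connected to that node, as given by the power flow eqs. (1A.14) and (1A.15), which we repeat below:
\[
P_\ell = (u_\ell)^2 G_{\ell\ell} + \sum_{k \in J(\ell)} u_\ell u_k \left[ G_{\ell k}\cos(\theta_\ell - \theta_k) + B_{\ell k}\sin(\theta_\ell - \theta_k) \right],
\]
\[
Q_\ell = -(u_\ell)^2 B_{\ell\ell} + \sum_{k \in J(\ell)} u_\ell u_k \left[ G_{\ell k}\sin(\theta_\ell - \theta_k) - B_{\ell k}\cos(\theta_\ell - \theta_k) \right].
\]
Since we are working with AC power we do not linearize these equations, but take them into account as non-linear constraints that must be satisfied in our analysis.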
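We now move on to the equations for real and reactive power flow on line $\ell k$, (1A.12) and (1A.13):
\[
p_{\ell k} = -(u_\ell)^2 G_{\ell k} + u_\ell u_k \left[ G_{\ell k}\cos(\theta_\ell - \theta_k) + B_{\ell k}\sin(\theta_\ell - \theta_k) \right],
\]
\[
q_{\ell k} = (u_\ell)^2 B_{\ell k} + u_\ell u_k \left[ G_{\ell k}\sin(\theta_\ell - \theta_k) - B_{\ell k}\cos(\theta_\ell - \theta_k) \right].
\]
In the usual case that there is a real power flow limit, $\overline{p}_{\ell k}$, on all lines in the transmission system, we express these constraints as $-\overline{p}_{\ell k} \le p_{\ell k} \le \overline{p}_{\ell k}$, $\forall k \in J(\ell)$.$^{18}$ As mentioned above, this is the constraint system $\underline{h} \le h(x) \le \overline{h}$.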
Note that the objective function will typically be convex, but because (1A.14) and (1A.15) are nonaffine in $x$, we will generally not have a convex choice set. As Baldick (2006, pp. 599-601) demonstrates, however, it can be shown that the feasible set defined by the constraints $p_\ell(x) \le 0$ and $q_\ell(x) \le 0$ is convex under the assumption that all voltage magnitudes are constant, provided that $|\theta_\ell - \theta_k| \le 0.1$ radian.$^{19}$
As one might readily guess, this problem gets complicated really quickly as one examines larger test networks. Therefore, in practically every AC OPF paper you might read, the authors simply present the classical AC OPF formulation, write the Lagrangian of the problem, present first-order conditions, and solve the problem using various computer algorithms. To keep the problem manageable, we will use a three-node model. Additionally, we will add as few constraints as possible while preserving the flavor of the power flow problem.
$^{18}$ Some transmission lines have different flow limits for opposite directional flows, a fact that we will ignore in our analysis.
$^{19}$ As Baldick (2006) notes, stating that $p_\ell(x) \le 0$ and $q_\ell(x) \le 0$ is equivalent to assuming that power can be thrown away, and can be justified by the observation that as long as a generator is not at its minimum operating range, equality can be reestablished by lowering the output of the generator in question. Since we are presenting the AC problem only to make the linearized AC problem easier to understand, we will not go into the proof of this proposition.
Example 2.8: 3-node optimal power flow problem.
Let us examine the three-node network shown below:
[Figure 2: The 3-Node Network. Node 1: $P_{g1} = ?$, $u_1 = 1.0$, $\theta_1 = 0.0$. Node 2: $P_{g2} = ?$, $u_2 = ?$, $\theta_2 = ?$. Node 3: $P_{d3} = 0.6$, $u_3 = 1.0$, $\theta_3 = 0.0$.]
Even with just a three-node network, if we include reactive power in the calculations, the problem gets unwieldy, so we focus on real power alone. The transmission line characteristics are given in Table 2.1 below:
Table 2.1: Transmission Line Characteristics
Line   Impedance     Admittance   Real Power Flow Limit
1-2    0 + j0.001    -j1,000      $-0.3 \le p_{12} \le 0.3$
1-3    0 + j0.002    -j500        $-0.3 \le p_{13} \le 0.3$
2-3    0 + j0.001    -j1,000      $-0.4 \le p_{23} \le 0.4$
Again, just to keep the problem manageable, we will assume away voltage constraints at nodes 2 and 3, as well as generator operating constraints.
We assume that there are two generators, located at nodes 1 and 2, and let their cost of real-power production be given by:
\[
f_1(P_1) = P_1 + 0.2(P_1)^2, \qquad
f_2(P_2) = 1.1P_2 + 0.1(P_2)^2,
\]
respectively, where $P_i$, $i = 1, 2$, is real power generation.
We set up the augmented Lagrangian for the OPF problem, using a logarithmic barrier function with barrier parameter $t = 1.0$, and perform one run of the Newton-Raphson algorithm using initial estimates:
\[
P_1^0 = 0.3, \quad P_2^0 = 0.3, \quad u_2^0 = 1.0, \quad \theta_2^0 = 0.0, \quad u_3^0 = 0.0.
\]
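Returning to (2.46), first focus on the constraints $g(x) = 0$ and $\underline{h} \le h(x) \le \overline{h}$. The first constraint corresponds to KCL. This law must hold at all three buses of the network, but since we have only two degrees of freedom, we will express KCL at nodes 2 and 3 only, with node 1 current flow determined residually. Ignoring shunt elements, KCL applied to nodes 2 and 3 yields:
\[
\begin{aligned}
P_2 ={}& u_2 u_1 \left[ G_{21}\cos(\theta_2 - \theta_1) + B_{21}\sin(\theta_2 - \theta_1) \right] \\
&+ u_2 u_3 \left[ G_{23}\cos(\theta_2 - \theta_3) + B_{23}\sin(\theta_2 - \theta_3) \right] + (u_2)^2 G_{22}, \\
P_3 ={}& u_3 u_1 \left[ G_{31}\cos(\theta_3 - \theta_1) + B_{31}\sin(\theta_3 - \theta_1) \right] \\
&+ u_3 u_2 \left[ G_{32}\cos(\theta_3 - \theta_2) + B_{32}\sin(\theta_3 - \theta_2) \right] + (u_3)^2 G_{33},
\end{aligned}
\]
respectively. Substituting the assumed values for $u_1$, $G_{ij}$, $B_{ij}$, and $\theta_1$ yields:
\[
\begin{aligned}
P_2 &= u_2\left[1{,}000\sin\theta_2\right] + u_2 u_3\left[1{,}000\sin(\theta_2 - \theta_3)\right], \\
P_3 &= u_3\left[500\sin\theta_3\right] + u_2 u_3\left[1{,}000\sin(\theta_3 - \theta_2)\right].
\end{aligned}
\]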
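To see these injection equations at work, the sketch below evaluates them for the three-node network. It is an illustration rather than part of the original example: the names `B`, `susceptance`, and `injection` are hypothetical, the off-diagonal susceptances are taken as the negatives of the line admittances in Table 2.1 (so they are positive), conductances are zero, and the angles are hypothetical values chosen so that, under the small-angle approximation, $P_1 = P_2 = 0.3$ and node 3 withdraws 0.6.

```python
import math

# Off-diagonal bus susceptances B_lk for the three lossless lines (assumed positive,
# i.e. the negatives of the series admittances -j1000, -j500, -j1000).
B = {(1, 2): 1000.0, (1, 3): 500.0, (2, 3): 1000.0}


def susceptance(l, k):
    """Symmetric lookup of B_lk; zero if the two nodes are not connected."""
    return B.get((l, k), B.get((k, l), 0.0))


def injection(l, u, theta):
    """Net real power flowing away from node l when all conductances are zero."""
    return sum(u[l] * u[k] * susceptance(l, k) * math.sin(theta[l] - theta[k])
               for k in u if k != l)


u = {1: 1.0, 2: 1.0, 3: 1.0}                   # voltage magnitudes (per unit)
theta = {1: 0.0, 2: -0.000075, 3: -0.00045}    # hypothetical angles in radians

P = {l: injection(l, u, theta) for l in u}
for node, value in P.items():
    print(node, round(value, 6))
print("sum of injections:", round(sum(P.values()), 9))
```

Because the lines are lossless, the three injections sum to zero, which is why KCL needs to be written at only two of the three nodes.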
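OK. So far, so good. Moving on to the transmission line constraints, notice that there are three lines with upper and lower bounds on real power flow, for a total of six constraints. Taking the general formula for real power flow across line $\ell k$:
\[
p_{\ell k} = -(u_\ell)^2 G_{\ell k} + u_\ell u_k \left[ G_{\ell k}\cos(\theta_\ell - \theta_k) + B_{\ell k}\sin(\theta_\ell - \theta_k) \right]
\]
and inserting the parameters corresponding to the three lines yields:
\[
p_{12} = -u_2\left[1{,}000\sin(\theta_2)\right], \tag{2.47}
\]
\[
p_{13} = -u_3\left[500\sin(\theta_3)\right], \tag{2.48}
\]
\[
p_{23} = u_2 u_3\left[1{,}000\sin(\theta_2 - \theta_3)\right]. \tag{2.49}
\]
Since power is being withdrawn at node 3, power will flow along lines 13 and 23 towards node 3, as reflected in (2.48) and (2.49), respectively. Because the impedance of line 13 is greater than that of line 23, our initial estimate is that the net power flow on line 12 will be from node 1 towards node 2, as reflected in eq. (2.47). Because the limits are symmetric about zero, the sign of each flow expression does not matter, and we rewrite the line constraints shown in Table 2.1 as:
\[
\begin{aligned}
-0.3 \le u_2\left[1{,}000\sin(\theta_2)\right], &\qquad u_2\left[1{,}000\sin(\theta_2)\right] \le 0.3, \\
-0.3 \le u_3\left[500\sin(\theta_3)\right], &\qquad u_3\left[500\sin(\theta_3)\right] \le 0.3, \\
-0.4 \le u_2 u_3\left[1{,}000\sin(\theta_2 - \theta_3)\right], &\qquad u_2 u_3\left[1{,}000\sin(\theta_2 - \theta_3)\right] \le 0.4.
\end{aligned}
\]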
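We may now express the AC optimal power flow problem as:
\[
\min_{P_1, P_2} \; P_1 + 0.2(P_1)^2 + 1.1P_2 + 0.1(P_2)^2,
\]
s.t.
\[
\begin{aligned}
u_2\left[1{,}000\sin(\theta_2)\right] + u_2 u_3\left[1{,}000\sin(\theta_2 - \theta_3)\right] - P_2 &= 0, \\
u_3\left[500\sin(\theta_3)\right] + u_2 u_3\left[1{,}000\sin(\theta_3 - \theta_2)\right] + 0.6 &= 0, \\
0.6 - P_1 - P_2 &= 0, \\
u_2\left[1{,}000\sin(\theta_2)\right] &\ge -0.3, \\
u_2\left[1{,}000\sin(\theta_2)\right] &\le 0.3, \\
u_3\left[500\sin(\theta_3)\right] &\ge -0.3, \\
u_3\left[500\sin(\theta_3)\right] &\le 0.3, \\
u_2 u_3\left[1{,}000\sin(\theta_2 - \theta_3)\right] &\ge -0.4, \text{ and} \\
u_2 u_3\left[1{,}000\sin(\theta_2 - \theta_3)\right] &\le 0.4.
\end{aligned}
\]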
Note that each of the six inequality constraints will have an associated slack variable, turning the inequality into an equality constraint. Also, since we want each slack variable to be bounded away from zero, we will attach a barrier function to it. This motivates the augmented Lagrangian:
\[
\begin{aligned}
L ={}& P_1 + 0.2(P_1)^2 + 1.1P_2 + 0.1(P_2)^2 - t\ln P_1 - t\ln P_2 - t\sum_{i=1}^{6}\ln w_i \\
&+ \lambda_1\left(0.6 - P_1 - P_2\right)
 + \lambda_2\left\{ u_2\left[1{,}000\sin(\theta_2)\right] + u_2 u_3\left[1{,}000\sin(\theta_2 - \theta_3)\right] - P_2 \right\} \\
&+ \lambda_3\left\{ u_3\left[500\sin(\theta_3)\right] + u_2 u_3\left[1{,}000\sin(\theta_3 - \theta_2)\right] + 0.6 \right\} \\
&+ \lambda_4\left\{ u_2\left[1{,}000\sin(\theta_2)\right] - w_1 + 0.3 \right\}
 + \lambda_5\left\{ u_2\left[1{,}000\sin(\theta_2)\right] + w_2 - 0.3 \right\} \\
&+ \lambda_6\left\{ u_3\left[500\sin(\theta_3)\right] - w_3 + 0.3 \right\}
 + \lambda_7\left\{ u_3\left[500\sin(\theta_3)\right] + w_4 - 0.3 \right\} \\
&+ \lambda_8\left\{ u_2 u_3\left[1{,}000\sin(\theta_2 - \theta_3)\right] - w_5 + 0.4 \right\}
 + \lambda_9\left\{ u_2 u_3\left[1{,}000\sin(\theta_2 - \theta_3)\right] + w_6 - 0.4 \right\}.
\end{aligned}
\]
The rest of the problem is left as an exercise to the intrepid reader.
In any event, we now see that the AC OPF formulation is quite involved, and its solution is best left to the computer. We will use the principles contained herein to aid us in our understanding of the DC approximation to the AC OPF problem, as addressed in chapter 3.
2.A Shunt elements
While it is usual to ignore shunt elements in DC analysis, the same is not true of AC, so we examine this topic here. As mentioned previously, shunt elements ground overhead lines. We will now explore two properties of shunt elements: conductance and capacitance.
As per Grainger and Stevenson (1994), conductance exists between conductors or between conductors and the ground. Conductance accounts for current leakage at insulators of overhead lines and the insulation of cables. Since leakage at insulators of overhead lines is trivial, conductance between conductors of an overhead line is usually neglected. Another reason electrical engineers generally ignore conductance is that leakage at insulators, the principal source of conductance, changes considerably with atmospheric conditions and with the conducting properties of dirt that collects on the insulators. Another effect, corona, which causes line leakages, also changes appreciably with the weather. Since conductance also has a minimal effect on shunt lines, electrical engineers will generally ignore its effect on shunt admittance.
Capacitance of a transmission line results from the potential difference between conductors. Capacitance causes conductors to be charged in the same manner as the plates of a capacitor when there is a potential difference between them. Capacitance between parallel conductors is a constant depending on the size and spacing of the conductors. For lines less than roughly 50 miles (80 km) long, the effect of capacitance is small and is often neglected. Without going into the technical details,$^{20}$ the capacitance of a transmission line is an increasing function of the (positive) charge $q$ (Coulombs/meter) and inversely related to the voltage drop along the line, $v$. In the most basic example, we consider a transmission line composed of two parallel wires. In this case, capacitance is defined as the charge on the conductors (wires) per unit of potential difference between them. That is:
\[
C = \frac{q}{v} \ \text{F/m} \tag{2A.1}
\]
(Farads per meter). We calculate the voltage drop $v_{ab}$ between the two wires of a two-wire line by computing the voltage drop due to the charge $q_a$ on conductor $a$ and the voltage drop due to the charge $q_b$ on conductor $b$. The voltage drop from one conductor to another due to the charges on both conductors is the sum of the voltage drops caused by each charge alone. This last point is central to our (simplified) analysis. For a two-wire line, we have that $q_a = -q_b$.$^{21}$ The other determinant of voltage induced by a given charge is the radius of the wire. Assuming that the two wires have the same radius, we have $|v_{ab}| = |2v_a| = |2v_b|$. Since the ground has voltage $= 0\,\text{V}$, by definition, the voltage difference between each conductor (of a two-wire shunt element) and ground is $v_s = v_a = v_b = \tfrac{1}{2}v_{ab}$. Referring to Eq. (2A.1), capacitance to ground, or capacitance to neutral, is thus equal to twice the capacitance between the two conductors, or $C_n = 2C_{ab}$. We represent these concepts diagrammatically as follows:
[Figure: two panels. (a) Representation of line-to-line capacitance: conductors $a$ and $b$ joined by $C_{ab}$. (b) Representation of line-to-neutral capacitance: conductors $a$ and $b$ each joined to neutral $n$, with $C_{an} = 2C_{ab}$ and $C_{bn} = 2C_{ab}$.]
$^{20}$ See, e.g., Grainger and Stevenson (1994, Ch. 5).
$^{21}$ Ask an EE friend why this holds.
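The factor of two follows in one line from (2A.1). As a brief sketch, assume each conductor carries charge of magnitude $q$ and sits at half the line-to-line voltage relative to neutral, as argued in the text:
\[
C_{an} = \frac{q}{v_{an}} = \frac{q}{\tfrac{1}{2}v_{ab}} = 2\,\frac{q}{v_{ab}} = 2C_{ab},
\]
and the same argument gives $C_{bn} = 2C_{ab}$. Equivalently, the two line-to-neutral capacitances in series reproduce the line-to-line value: $\left(1/C_{an} + 1/C_{bn}\right)^{-1} = C_{ab}$.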
Chapter 3
DC Optimal Power Flow and LMP Derivation
Having covered the principles of AC and DC power flow, as well as the construction of an optimal power flow problem, we now address the DC OPF problem and associated LMP calculations.
The DC OPF problem is the minimization of the cost of electricity output, subject to transmission constraints. There are two ways to determine the amount of electricity production. First, we may assume that consumers face the real-time price of electricity, and we thus construct downward-sloping, location-specific electricity demand curves. The OPF problem is then to maximize aggregate social surplus, subject to the various technological constraints present in electricity systems. We will demonstrate this method analytically. Most consumers have never even seen a real-time electricity price, though. For this reason (and because it is less complicated to do so), we assume that electricity demand is perfectly inelastic when solving for LMPs in our sample network.
3.1 Analytical Presentation of DC Optimal Power Flow
We start by analyzing a generic K node transmission network, connecting N K distinct
nodes. We model the relevant concepts as follows:
Generation. Denote the number of generating units as $J$. Unit $j$ has maximum output $\overline{p}_j$. We ignore unit outages, so that each unit is always available up to full output. Let $p_j(t)$ denote the output of unit $j$ at time $t$. The associated constraint is then:
\[
0 \le p_j(t) \le \overline{p}_j, \qquad j = 1, 2, \ldots, J.
\]
Demand. We model consumers as price-taking, expected profit-maximizing firms. Let $f_i$ be the short-run value-added function of consumer $i$'s use of electricity. Thus, $f_i$ is consumer $i$'s profit, minus the cost of all nonelectricity inputs. Denoting consumer $i$'s electricity use as $d_i$, the consumer's profit is given by:
\[
\pi_i(t) = f_i(d_i(t)) - \rho_i(t)\,d_i(t),
\]
where we denote the price of electric energy as $\rho$. The associated first-order condition is:
\[
\frac{\partial f_i(d_i(t))}{\partial d_i(t)} = \rho_i(t).
\]
Transmission. Flows along line $k$ at time $t$ are given by $p_k(t)$. Ignoring losses, an electric power system has the energy balance constraint:
\[
\sum_{j=1}^{J} p_j(t) = \sum_{i=1}^{I} d_i(t).
\]
Assuming no transmission line failures, real power flow over each line must satisfy:
\[
p_{k,\min} \le p_k(t) \le p_{k,\max}.
\]
Additionally, power flows depend on generation and demand at each node, or:
\[
p(t) = p\big(p(t), d(t), H\big),
\]
where $H$ is again the shift factor matrix.
Optimization. We use the standard welfare criterion of maximizing consumers' plus producers' surplus, subject to the system constraints. For pricing and dispatch decisions, we maximize short-run welfare with a fixed capital stock. The Lagrangian to be maximized over all generation levels $p_j$ and over prices $\rho_i(t)$ is:
\[
\begin{aligned}
L ={}& \sum_i f_i\big(d_i(\rho_i(t))\big) && \text{(consumer value added)} \\
&- \sum_j \gamma_j\,p_j(t) && \text{(fuel costs)} \\
&+ \lambda(t)\Big[\sum_j p_j(t) - \sum_i d_i(t)\Big] && \text{(energy balance)} \\
&- \sum_j u_j(t)\big[p_j(t) - \overline{p}_j\big] && \text{(unit capacity constraint)} \\
&- \sum_k \big(p_k(t) - p_{k,\max}\big)\,\mu_{k,\max}(t) && \text{(upper transmission constraint)} \\
&+ \sum_k \big(p_k(t) - p_{k,\min}\big)\,\mu_{k,\min}(t) && \text{(lower transmission constraint)},
\end{aligned} \tag{3.1}
\]
where $\gamma_j$ is the cost of fuel required by plant type $j$ to produce one unit of electricity and $\lambda(t)$, $u_j(t)$, and $\mu_{k,(\max,\min)}(t)$ are the Lagrange multipliers on the energy balance, unit capacity, and upper and lower transmission constraints, respectively. Note that we may define the shadow value of an additional unit of transmission capacity on a given line as:
\[
\mu_k(t) = \mu_{k,\max}(t) - \mu_{k,\min}(t).
\]
Since both constraints cannot bind simultaneously, for any given network configuration we have:
\[
\mu_k(t) \in \{\mu_{k,\max}(t),\; -\mu_{k,\min}(t),\; 0\}.
\]
Differentiating (3.1) with respect to demand yields the equation for the LMP for consumer $i$ as:
\[
\rho_i(t) = \lambda(t) + \sum_k \frac{\partial p_k(t)}{\partial d_i(t)}\,\mu_k(t). \tag{3.2}
\]
Of course, a few words are in order here. Note that we have not considered losses in our
example, otherwise we would see the term
t

t
d
i
(t)
in this expression, where (t) is the
transmission-line loss function. Also notice that we generally speak of LMP at a given
node, rather than the LMP facing a given consumer. Replacing the subscript i with j does
the trick (that is, we assume that each consumers load is located at a single node only).
Now let us examine this equation in further detail. We introduced the concept of a
reference or base node earlier. This node is also known as a swing bus.¹ While having
used the base node as the reference for the shift factor matrix, we treat the swing bus very
much like a numeraire good in LMP calculation. That is to say, we take the marginal cost
of a unit of power delivered to the swing bus as a fixed point of reference, and we compute
the prices of electricity delivered to any and all other buses in the network by referencing
the price at the swing bus. Thus, in (3.2), the value $\lambda(t)$ is the reference price: the value of
supplying an additional unit of demand at the swing bus. Although it is not necessary, it
is easiest to assume that the marginal generator is located at the swing bus. In this case,
the marginal cost of generation for the system (known as system lambda) is the same as
the marginal value/marginal cost of generation at the swing bus.
The remaining term in the equation is the cost of congestion² associated with supplying
an additional unit of demand located at a generic network node. This term states that the
cost of congestion at any node at time $t$ is equal to the contribution of an additional
unit of demand of consumer $i$ to power flow over a specific line in the network,
$\partial p_k(t)/\partial d_i(t)$, multiplied by the shadow value of incremental capacity of that same
line, or $\eta_k(t)$, summed over all of the lines of the network.³
¹ Bus and node are synonymous in the EE literature.
² In the EE literature, a transmission system is said to be congested when transmission line limitations
necessitate curtailing the output of certain generator(s) because one or more of the system's lines is at its
upper or lower limit.
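To make the structure of (3.2) concrete, the following minimal Python sketch splits a nodal price into its reference-price and congestion components. The function name `lmp_components` and the numbers in the usage lines are hypothetical, chosen only to illustrate the arithmetic; losses are ignored, as in the text.

```python
def lmp_components(system_lambda, flow_sensitivities, shadow_values):
    """Return (energy, congestion, total) for one node, following (3.2).

    flow_sensitivities[k]: change in flow on line k per extra unit of demand
    at the node; shadow_values[k]: net shadow value of capacity on line k.
    """
    if len(flow_sensitivities) != len(shadow_values):
        raise ValueError("need one sensitivity and one shadow value per line")
    congestion = sum(s * eta for s, eta in zip(flow_sensitivities, shadow_values))
    return system_lambda, congestion, system_lambda + congestion


# Hypothetical three-line network in which only the first line is congested.
energy, congestion, total = lmp_components(30.0, [0.5, -0.25, 0.0], [20.0, 0.0, 0.0])
print(energy, congestion, total)  # 30.0 10.0 40.0
```

When every shadow value is zero, the congestion component vanishes and every node is priced at the reference price.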
3.2 DC Optimal Power Flow Demonstration
Having developed the general OPF formulation, we now demonstrate DC OPF with an
example. And yes, while we stated in the introduction that the goal of this primer is to
enable the reader to understand and formulate more complex examples than the simple
3-node model, we will demonstrate the DC OPF model and associated LMP calculations
using a 3-node model, because adding more nodes and transmission lines makes the example
simply too unruly to demonstrate longhand.⁴ This is not a problem, though, as
the tools demonstrated here may be applied to a network model of any size.
Let us begin with the 3-node transmission network, shown below:
[Figure 3.1: The 3-Node Network. Lines 12, 13, and 23 connect nodes 1, 2, and 3; each line is
labeled with impedance $0 + j1$ and flow limits $-500 \le p_k \le 500$.]
We assume there is a single generator located at node 1 with a constant marginal cost
of $MC_1 = \$40$/megawatt-hour (MWh) up to its maximum output of $\bar{p}_1 = 850$ MW.
Likewise, there is a single generator located at node 2 with a constant marginal cost of
$MC_2 = \$110$/MWh up to its maximum output of $\bar{p}_2 = 150$ MW. Load (i.e. energy
demand, $d_3$) at node 3 is assumed to be perfectly inelastic at 800 MW. Transmission line
characteristics are shown in Figure 3.1.
³ Yes, the notation still leaves something to be desired.
⁴ In fact, even with only three nodes, full demonstration of the problem is sufficiently messy that we will
gloss over some of the details.
We start by calculating the shift factor matrix, which we will use to calculate the equations
for the maximal amount of output the two generators may produce without violating any
of the constraints on the system's three transmission lines. Applying (1.31) and (1.32) to
this example yields:
$$
\begin{aligned}
P_1 &= \sin(\theta_1 - \theta_2) + \sin(\theta_1 - \theta_3), \\
P_2 &= \sin(\theta_2 - \theta_1) + \sin(\theta_2 - \theta_3), \\
P_3 &= \sin(\theta_3 - \theta_1) + \sin(\theta_3 - \theta_2).
\end{aligned}
$$
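Just for practice, we will calculate the full nodal Jacobian, $J^{(0)}$, before calculating the
reduced nodal Jacobian, $\tilde{J}^{(0)}$. We calculate the first row of the former as follows:
$$
\begin{aligned}
J^{(0)}_{11} &= \left.\frac{\partial P_1}{\partial \theta_1}\right|_{\theta=0} = 2, \\
J^{(0)}_{12} &= \left.\frac{\partial P_1}{\partial \theta_2}\right|_{\theta=0} = -1, \\
J^{(0)}_{13} &= \left.\frac{\partial P_1}{\partial \theta_3}\right|_{\theta=0} = -1.
\end{aligned}
$$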
Calculating the remaining terms, we have:
$$
J^{(0)} =
\begin{bmatrix}
2 & -1 & -1 \\
-1 & 2 & -1 \\
-1 & -1 & 2
\end{bmatrix}.
$$
Since power is being withdrawn at node 3 in this example, we will denote this node as
the reference node. Thus, deleting the row and column of the full nodal Jacobian
corresponding to node 3, the reduced Jacobian is:
$$
\tilde{J}^{(0)} =
\begin{bmatrix}
2 & -1 \\
-1 & 2
\end{bmatrix}.
$$
Thus:
$$
\left[\tilde{J}^{(0)}\right]^{-1} =
\begin{bmatrix}
\tfrac{2}{3} & \tfrac{1}{3} \\
\tfrac{1}{3} & \tfrac{2}{3}
\end{bmatrix}.
$$
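The Jacobian steps above can be reproduced with a short Python sketch. It assumes, as in the example, that every line has a unit coefficient on its sine term, and it uses exact fractions so that the inverse prints as thirds. The names `full_jacobian`, `drop_reference`, and `invert_2x2` are ours, not the text's.

```python
from fractions import Fraction

NODES = [1, 2, 3]
LINES = [(1, 2), (2, 3), (1, 3)]  # lines 12, 23, and 13 of Figure 3.1
REFERENCE_NODE = 3


def full_jacobian(nodes, lines):
    """Jacobian of the P_i equations evaluated at zero angles (unit line coefficients)."""
    index = {node: i for i, node in enumerate(nodes)}
    jac = [[Fraction(0) for _ in nodes] for _ in nodes]
    for a, b in lines:
        i, j = index[a], index[b]
        jac[i][i] += 1  # d sin(theta_a - theta_b) / d theta_a = cos(0) = 1
        jac[j][j] += 1  # d sin(theta_b - theta_a) / d theta_b = cos(0) = 1
        jac[i][j] -= 1  # d sin(theta_a - theta_b) / d theta_b = -cos(0) = -1
        jac[j][i] -= 1  # d sin(theta_b - theta_a) / d theta_a = -cos(0) = -1
    return jac


def drop_reference(jac, nodes, reference):
    """Delete the reference node's row and column."""
    keep = [i for i, node in enumerate(nodes) if node != reference]
    return [[jac[r][c] for c in keep] for r in keep]


def invert_2x2(m):
    """Closed-form inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


J_FULL = full_jacobian(NODES, LINES)
J_REDUCED = drop_reference(J_FULL, NODES, REFERENCE_NODE)
J_INVERSE = invert_2x2(J_REDUCED)
print([[str(x) for x in row] for row in J_INVERSE])  # [['2/3', '1/3'], ['1/3', '2/3']]
```

The closed-form inverse only covers the two-by-two case of this example; a larger network would need a general linear solver.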
Likewise, we calculate the network incidence matrix. Since there are three lines in the
network, the network incidence matrix is derived from the three equations:
$$
\begin{aligned}
p_{12} &= 1 \cdot \sin(\theta_1 - \theta_2), \\
p_{23} &= 1 \cdot \sin(\theta_2 - \theta_3), \\
p_{13} &= 1 \cdot \sin(\theta_1 - \theta_3),
\end{aligned}
$$
where we have implicitly calculated $B_k = 1$ for each line from its admittance $Y_k$. Note from our
discussion in chapter 1 that, since load is located at node 3, we know that power will flow from nodes
1 and 2 to node 3, accounting for the node ordering in (??) and (1.19). Since there are generators
located at both nodes 1 and 2, the direction of power flow across line 12 is ambiguous. But, given
the parameter values chosen, the direction of power flow in (1.11) will obviously be from node
1 to node 2 (by the principle of superposition, which we do not explicitly address here).
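We derive the network incidence matrix by holding the phase angle at the reference node
(the reference angle) fixed, and forming the matrix of partial derivatives with respect to
the system's other (two) angles. Thus, we have:
$$
K =
\begin{bmatrix}
\dfrac{\partial p_{12}}{\partial \theta_1} & \dfrac{\partial p_{12}}{\partial \theta_2} \\
\dfrac{\partial p_{23}}{\partial \theta_1} & \dfrac{\partial p_{23}}{\partial \theta_2} \\
\dfrac{\partial p_{13}}{\partial \theta_1} & \dfrac{\partial p_{13}}{\partial \theta_2}
\end{bmatrix}
=
\begin{bmatrix}
1 & -1 \\
0 & 1 \\
1 & 0
\end{bmatrix}.
$$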
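And thus, the shift factor matrix, $K\left[\tilde{J}^{(0)}\right]^{-1}$, is:
$$
H =
\begin{bmatrix}
\tfrac{1}{3} & -\tfrac{1}{3} \\
\tfrac{1}{3} & \tfrac{2}{3} \\
\tfrac{2}{3} & \tfrac{1}{3}
\end{bmatrix}.
$$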
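As a check on the matrix product, the following Python sketch multiplies the network incidence matrix by the inverse of the reduced Jacobian, both taken from the text, using exact fractions. The helper `matmul` is our own plain nested-list product, not a library routine.

```python
from fractions import Fraction

# Rows: lines 12, 23, 13. Columns: angles at nodes 1 and 2 (node 3 is the reference).
K = [[1, -1],
     [0, 1],
     [1, 0]]
# Inverse of the reduced Jacobian, as stated in the text.
J_INVERSE = [[Fraction(2, 3), Fraction(1, 3)],
             [Fraction(1, 3), Fraction(2, 3)]]


def matmul(a, b):
    """Plain nested-list matrix product."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


H = matmul(K, J_INVERSE)
for line, row in zip(("12", "23", "13"), H):
    print(line, [str(x) for x in row])
# 12 ['1/3', '-1/3']
# 23 ['1/3', '2/3']
# 13 ['2/3', '1/3']
```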
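Post-multiplying the shift factor matrix by $[P_1 \;\; P_2]^{\top}$ yields power flow across each of the
system's three lines as:
$$
\begin{aligned}
p_{12} &= \tfrac{1}{3}P_1 - \tfrac{1}{3}P_2, \\
p_{23} &= \tfrac{1}{3}P_1 + \tfrac{2}{3}P_2, \\
p_{13} &= \tfrac{2}{3}P_1 + \tfrac{1}{3}P_2.
\end{aligned}
$$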
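One way to sanity-check these flow equations is to confirm that they conserve power at every node. The Python sketch below does so for a pair of trial injections; the values 600 and 200 are hypothetical and are not taken from the text, and the function names are ours.

```python
from fractions import Fraction

H = {  # shift factors from the text: line -> (weight on P1, weight on P2)
    "12": (Fraction(1, 3), Fraction(-1, 3)),
    "23": (Fraction(1, 3), Fraction(2, 3)),
    "13": (Fraction(2, 3), Fraction(1, 3)),
}


def line_flows(p1, p2):
    """Flow on each line for net injections p1 and p2 at nodes 1 and 2."""
    return {line: a * p1 + b * p2 for line, (a, b) in H.items()}


def node_balances(p1, p2):
    """Injection minus net outflow at nodes 1 and 2, and net inflow at node 3."""
    f = line_flows(p1, p2)
    return (p1 - f["12"] - f["13"],  # node 1: should be zero
            p2 + f["12"] - f["23"],  # node 2: should be zero
            f["13"] + f["23"])       # node 3: should equal p1 + p2


flows = line_flows(600, 200)  # hypothetical trial injections
print({line: str(v) for line, v in flows.items()})
# {'12': '400/3', '23': '1000/3', '13': '1400/3'}
print(node_balances(600, 200) == (0, 0, 800))  # True
```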
Since each of the three transmission lines has upper and lower limits of $\pm 500$, we
may write the constraints for lines 12, 23, and 13 in terms of generation at nodes 1 and 2
as:
$$
\begin{aligned}
-500 &\le \tfrac{1}{3}P_1 - \tfrac{1}{3}P_2 \le 500, \\
-500 &\le \tfrac{1}{3}P_1 + \tfrac{2}{3}P_2 \le 500, \\
-500 &\le \tfrac{2}{3}P_1 + \tfrac{1}{3}P_2 \le 500,
\end{aligned}
$$
respectively. Since there is no demand (load) at either node 1 or node 2, we have that
$P_i = p_i$, $i = 1, 2$. We thus present the DC OPF problem as:
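$$
\begin{aligned}
\min_{p_1,\,p_2}\quad & TC = 40p_1 + 110p_2, \\
\text{s.t.}\quad & -500 \le \tfrac{1}{3}p_1 - \tfrac{1}{3}p_2 \le 500, \\
& -500 \le \tfrac{1}{3}p_1 + \tfrac{2}{3}p_2 \le 500, \\
& -500 \le \tfrac{2}{3}p_1 + \tfrac{1}{3}p_2 \le 500, \\
& p_1 + p_2 = 800, \\
& 0 \le p_1 \le 850, \text{ and} \\
& 0 \le p_2 \le 150.
\end{aligned}
$$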
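Before solving the problem, it can help to see why the network matters at all. The Python sketch below checks a candidate dispatch against every constraint of the DC OPF problem above. It is a feasibility check only, not a solver, and the function names are ours.

```python
from fractions import Fraction

SHIFT = {"12": (Fraction(1, 3), Fraction(-1, 3)),
         "23": (Fraction(1, 3), Fraction(2, 3)),
         "13": (Fraction(2, 3), Fraction(1, 3))}
LINE_LIMIT, LOAD = 500, 800
CAPACITY = (850, 150)
MARGINAL_COST = (40, 110)


def violated_constraints(p1, p2):
    """Names of the DC OPF constraints that the dispatch (p1, p2) violates."""
    violated = []
    for line, (a, b) in SHIFT.items():
        if abs(a * p1 + b * p2) > LINE_LIMIT:
            violated.append("line " + line)
    if p1 + p2 != LOAD:
        violated.append("energy balance")
    for name, p, cap in (("p1", p1, CAPACITY[0]), ("p2", p2, CAPACITY[1])):
        if not 0 <= p <= cap:
            violated.append(name + " capacity")
    return violated


def total_cost(p1, p2):
    """Fuel cost of the dispatch (p1, p2)."""
    return MARGINAL_COST[0] * p1 + MARGINAL_COST[1] * p2


# Serving all 800 MW from the cheap unit at node 1 ignores the network.
print(violated_constraints(800, 0), total_cost(800, 0))      # ['line 13'] 32000
# The dispatch the text reports as optimal satisfies every constraint.
print(violated_constraints(700, 100), total_cost(700, 100))  # [] 39000
```

The cheapest dispatch on paper would push 1600/3 (about 533) units across line 13, which is why the more expensive generator at node 2 must run.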
The standard solution method for minimizing a linear problem is linear programming,
using the simplex method. Unfortunately, full demonstration of the solution method of
this problem by hand is sufficiently messy (and provides only marginal insight) to be
beyond the scope of this paper.⁵ We first simplify the problem by focusing only on its
relevant components. Specifically, we will ignore all nonbinding transmission
constraints, as they have no impact on the solution. Ignoring the irrelevant constraints
(since we already know the answer), the problem reduces to:
$$
\begin{aligned}
\min_{p_1,\,p_2}\quad & TC = 40p_1 + 110p_2, \\
\text{s.t.}\quad & \tfrac{2}{3}p_1 + \tfrac{1}{3}p_2 \le 500, \\
& p_1 + p_2 = 800, \\
& p_1 \le 850, \\
& p_2 \le 150, \text{ and} \\
& p_1,\, p_2 \ge 0.
\end{aligned}
$$
⁵ For those uninitiated in linear programming, we suggest Bradley et al. (1977).
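Because the energy balance pins down $p_2 = 800 - p_1$, the reduced problem has a single free variable and can be solved without the simplex method. The Python sketch below uses that substitution; it is a shortcut specific to this two-generator example, not the tableau procedure the text refers to, and the function name is ours.

```python
from fractions import Fraction


def solve_reduced_opf(load=800, line13_limit=500, cap1=850, cap2=150,
                      mc1=40, mc2=110):
    """Solve the reduced two-generator problem by substituting p2 = load - p1."""
    # Line 13: (2/3)p1 + (1/3)(load - p1) <= limit  <=>  p1 <= 3*limit - load.
    upper = min(Fraction(3 * line13_limit - load), cap1, load)  # also p1 <= cap1, p2 >= 0
    lower = max(Fraction(load - cap2), 0)                       # p2 <= cap2, p1 >= 0
    if lower > upper:
        raise ValueError("no dispatch satisfies all constraints")
    # Total cost is mc2*load + (mc1 - mc2)*p1, which is linear in p1.
    p1 = upper if mc1 < mc2 else lower
    p2 = load - p1
    return p1, p2, mc1 * p1 + mc2 * p2


p1, p2, cost = solve_reduced_opf()
print(p1, p2, cost)  # 700 100 39000
```

The binding upper bound on $p_1$ comes from line 13, which is the congested line in this example.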
To solve this problem by linear programming, we add slack variables to the inequality
constraints and an artificial variable to the energy balance constraint. The linear
programming problem in canonical form is then:
$$
\begin{aligned}
\min_{p_1,\,p_2}\quad & TC = 40p_1 + 110p_2, \\
\text{s.t.}\quad & \tfrac{2}{3}p_1 + \tfrac{1}{3}p_2 + p_3 = 500, \\
& p_1 + p_4 = 850, \\
& p_2 + p_5 = 150, \\
& p_1 + p_2 + p_6 = 800, \text{ and} \\
& p_1, \ldots, p_6 \ge 0.
\end{aligned}
$$
Those with linear programming experience will note that $p_3$, $p_4$, and $p_5$ are slack variables,
while $p_6$ is an artificial variable.
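The role of the added variables is easiest to see by evaluating them at a given dispatch. The Python sketch below computes $p_3$ through $p_6$ from the canonical-form constraints for the dispatch $p_1 = 700$, $p_2 = 100$ that the text reports as the solution. The function name `canonical_slacks` is ours.

```python
from fractions import Fraction


def canonical_slacks(p1, p2):
    """Values of p3..p6 implied by a dispatch (p1, p2) in the canonical form."""
    p3 = 500 - (Fraction(2, 3) * p1 + Fraction(1, 3) * p2)  # unused capacity on line 13
    p4 = 850 - p1                                           # unused capacity at generator 1
    p5 = 150 - p2                                           # unused capacity at generator 2
    p6 = 800 - p1 - p2                                      # artificial variable: unmet load
    return p3, p4, p5, p6


slacks = canonical_slacks(700, 100)
print([str(s) for s in slacks])  # ['0', '150', '50', '0']
```

A zero slack marks a binding constraint: line 13 is at its limit and the load is fully served, while both generators have spare capacity.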
At this point, let us examine the final tableau (showing the solution to the linear
programming problem).
Looking in the CV column, the final tableau tells us that the values of $p_1$ and $p_2$ are 700 and
100, respectively (because any variable not uniquely determined in the final tableau is set to
zero). The values of $p_4$ and $p_5$ yield the excess capacity at the two plants. The coefficients
on $p_3$ and $p_6$ (read in the ($-z$) row of their respective columns) are the reduced costs for
these variables. They are also the negative of the shadow values/Lagrange multipliers for
the corresponding constraints. Thus, increasing transmission capacity on line 13 (the constraint
with slack variable $p_3$) by one unit would decrease the value of the objective function by \$210.
Likewise, an additional unit of load at node 3 would increase system cost by \$180. The alert
reader will note that these quantities correspond to:
$$\eta_{13}(t) = -\$210, \quad \text{and} \quad \lambda(t) = \$180.$$
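These two shadow values can be cross-checked without a tableau by re-solving the reduced problem after relaxing each constraint by one unit. The Python sketch below does this with the same substitution shortcut ($p_2$ equals load minus $p_1$); it is a finite-difference check under the assumption that a one-unit change leaves the set of binding constraints unchanged, and the function name is ours.

```python
from fractions import Fraction


def min_cost(load=800, line13_limit=500, cap1=850, cap2=150, mc1=40, mc2=110):
    """Least cost of the reduced problem, found by substituting p2 = load - p1."""
    upper = min(Fraction(3 * line13_limit - load), cap1, load)
    lower = max(Fraction(load - cap2), 0)
    if lower > upper:
        raise ValueError("no dispatch satisfies all constraints")
    p1 = upper if mc1 < mc2 else lower
    return mc1 * p1 + mc2 * (load - p1)


base = min_cost()
line13_shadow = min_cost(line13_limit=501) - base  # one more unit of line 13 capacity
load_shadow = min_cost(load=801) - base            # one more unit of load at node 3
print(line13_shadow, load_shadow)  # -210 180
```

An extra unit of line capacity lets three units of \$40 generation displace three units of \$110 generation, which is where the \$210 saving comes from.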
We may also interpret $\lambda(t) = \$180$ as the cost of delivering an additional unit of energy to
the reference node (node 3, in our formulation). Since this is the case, we have the node 3
LMP as $\rho_3 = \$180$/MWh, or $LMP_3 = \$180$/MWh.⁷ We may now calculate $LMP_1$ and $LMP_2$ by taking
$$\rho_i(t) = LMP_i = \lambda(t) + \sum_k \frac{\partial p_k(t)}{\partial p_i(t)}\,\eta_k(t), \quad i = 1, 2,$$
obtaining:
$$
\begin{aligned}
\rho_1 &= \$180 + \left(\tfrac{2}{3}\right)(-\$210) = \$40/\text{MWh, and} \\
\rho_2 &= \$180 + \left(\tfrac{1}{3}\right)(-\$210) = \$110/\text{MWh,}
\end{aligned}
$$
where (2/3) and (1/3) are the respective shift factors for power flowing from nodes 1 and
2 along line 13, $\partial p_{13}(t)/\partial p_1(t)$ and $\partial p_{13}(t)/\partial p_2(t)$, respectively, while
$-\$210$ is the shadow value of additional capacity on line 13, as mentioned above. In larger network
examples, when more than one line is congested, one sums, over all congested lines, the shift factor
associated with node $i$ generation times the shadow value of the line. We did that (trivially) in this
example as well, implicitly multiplying all of node 1 and node 2 generation's shift factors by the
shadow values of all lines; it just so happened that only one of the lines was congested, and
thus had a nonzero shadow value.
⁷ Our demonstration is just a tad different from the standard one found in Schweppe et al. (1988) and
Bohn et al. (1984) because explaining shift factors in terms of incremental generation instead of incremental
demand makes more sense to us.
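The LMP arithmetic above, including the trivial sum over uncongested lines, can be written out as a short Python sketch. The shift factors, the line 13 shadow value of $-\$210$, and the reference price of \$180 are taken from the example; the dictionary layout and the function name `lmp` are ours.

```python
from fractions import Fraction

SHIFT_FACTORS = {  # line -> sensitivity of its flow to generation at nodes 1 and 2
    "12": (Fraction(1, 3), Fraction(-1, 3)),
    "23": (Fraction(1, 3), Fraction(2, 3)),
    "13": (Fraction(2, 3), Fraction(1, 3)),
}
SHADOW_VALUE = {"12": 0, "23": 0, "13": -210}  # only line 13 is congested
SYSTEM_LAMBDA = 180                            # price at the reference node (node 3)


def lmp(node):
    """LMP at node 1 or 2: reference price plus the congestion term."""
    column = node - 1
    congestion = sum(factors[column] * SHADOW_VALUE[line]
                     for line, factors in SHIFT_FACTORS.items())
    return SYSTEM_LAMBDA + congestion


print(lmp(1), lmp(2))  # 40 110
```

Each nodal price equals the marginal cost of the generator at that node, as expected when both generators are operating strictly inside their capacity limits.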
Bibliography
Baldick R (2006) Applied Optimization: Formulation and Algorithms for Engineering
Systems, Cambridge University Press, New York.
Bohn R, Caramanis M, Schweppe F (1984) Optimal Pricing in Electrical Networks
over Space and Time. Rand Journal of Economics 15(3): 360-376.
Bradley S, Hax A, Magnanti T (1977) Applied Mathematical Programming,
Addison-Wesley Reading, MA.
Bushnell J, Stoft S (1997) Improving Private Incentives for Electric Grid Investment.
Resource and Energy Economics 19(1-2): 85-108.
Dale C (1995) Basic Electricity & DC Circuits, Prompt Publications, Indianapolis, IN.
Glover J, Sarma M, Overbye T (2012) Power System Analysis and Design, 5th Edition.
Cengage Learning, Stamford, CT.
Grainger J, Stevenson W (1994) Power System Analysis. McGraw-Hill, Inc., New
York, NY.
Luenberger D (1984) Linear and Nonlinear Programming, 2nd Edition. Addison-Wesley
Publishing Company, Reading, MA.
Mittle V, Mittal A (2006) Basic Electrical Engineering, 2nd Edition, Tata McGraw-Hill
Education Private Limited. New Delhi, India.
Nash S, Sofer A (1996) Linear and Nonlinear Programming. McGraw-Hill, Inc.,
New York, NY.
Rau N (2003) Optimization Principles: Practical Applications to the Operation and
Markets of the Electric Power Industry, IEEE Press, Piscataway, NJ.
Schweppe F, Caramanis M, Tabors R, Bohn R (1988) Spot Pricing of Electricity,
Kluwer Academic Publishers.
Stoft S (2002) Power System Economics. IEEE Press, Piscataway.