
journal homepage: www.elsevier.com/locate/trc

Multi-agent model predictive control of signaling split in urban traffic networks ⋆

Lucas Barcelos de Oliveira, Eduardo Camponogara *

Department of Automation and Systems Engineering, Federal University of Santa Catarina, Cx.P. 476, 88040-900 Florianópolis, SC, Brazil

article info

Article history:

Received 15 October 2008

Received in revised form 3 March 2009

Accepted 29 April 2009

Keywords:

Urban traffic networks

Split control

Distributed agents

Distributed optimization

Model predictive control

abstract

The operation of large dynamic systems such as urban traffic networks remains a challenge in control engineering to a great extent due to their sheer size, intrinsic complexity, and nonlinear behavior. Recently, control engineers have looked for unconventional means for modeling and control of complex dynamic systems, in particular the technology of multi-agent systems, whose appeal stems from their composite nature, flexibility, and scalability. This paper contributes to this evolving technology by proposing a framework for multi-agent control of linear dynamic systems, which decomposes a centralized model predictive control problem into a network of coupled, but small, sub-problems that are solved by the distributed agents. Theoretical results ensure convergence of the distributed iterations to a globally optimal solution. The framework is applied to the signaling split control of traffic networks. Experiments conducted with simulation software indicate that the multi-agent framework attains performance comparable to conventional control. The main advantages of the multi-agent framework are its graceful extension and localized reconfiguration, which require adjustments only in the control strategies of the agents in the vicinity.

© 2009 Elsevier Ltd. All rights reserved.

1. Introduction

The steady advances in communications and computer technology are shaping the way traffic control systems are designed. Today, operating centers can receive data from remote sensors and apply control policies that respond to the prevailing traffic conditions. Among the existing real-time control systems, the Traffic-responsive Urban Control (TUC) framework (Diakaki et al., 2002) has drawn interest for its simplicity, robustness, and good performance corroborated with field applications in Munich, Southampton, and Chania (Bielefeldt et al., 2004; Diakaki and Papageorgiou, 1997; Kosmatopoulos et al., 2006). TUC uses a modified store-and-forward model of traffic flow (Gazis and Potts, 1963) with purely continuous state and control variables, which greatly simplifies the synthesis of a control strategy. In its baseline form, TUC has an off-line and an on-line module (Diakaki et al., 2002). The off-line module solves an unconstrained linear-quadratic-regulator (LQR) problem that minimizes a quadratic cost function on queue lengths and deviations from nominal split signals. The on-line module produces feasible split signals, which satisfy green time bounds and add up to cycle time, by solving a quadratic program that minimizes the distance from the infeasible signals obtained with the LQR policy. Invariably, such a framework does not necessarily reach optimal solutions to the underlying constrained control problem (Camacho and Bordons, 2004). To this end, model predictive control (MPC) approaches have been proposed to explicitly handle constraints and thereby improve the solution quality of the TUC framework (Aboudolas et al., 2007; de Oliveira and Camponogara, 2007).

⋆ This research was supported in part by Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) under Grant #473841/2007-0.

* Corresponding author. Tel.: +55 48 3721 7688; fax: +55 48 3721 9934.

E-mail address: camponog@das.ufsc.br (E. Camponogara).

0968-090X/$ - see front matter © 2009 Elsevier Ltd. All rights reserved.

doi:10.1016/j.trc.2009.04.022


The technology of multi-agent systems has also advanced in the past decades, particularly so in artificial intelligence and software engineering (Jennings, 2000; Maturana et al., 2005). This evolving technology aims to arrange agents of limited perception and expertise in an organization to perform tasks that are beyond the abilities of the agents. The problem-solving ability of a multi-agent system emerges from the interactions of the agents, which employ some form of reasoning to cooperate with others and resolve conflicts when driven by the interests of the organization.

Intelligent agents and multi-agent systems have been successful in solving unstructured problems (for which adequate models are not known), in the replacement of and assistance to humans (Tomás and Garcia, 2005; Rigolli and Brady, 2005; Nguyen-Duc et al., 2008), in solving high-abstraction problems (Pechoucek et al., 2006; Tumer and Agogino, 2007), and in handling discrete decisions (Yamashita et al., 2005; de Oliveira et al., 2005; Balan and Luke, 2006). The nature of these problems contrasts with dynamic control problems, which are typically structured (for which good models based on differential equations are known) and where the aim is to control machines, the decisions are of low level and demand guarantees of stability and convergence, and the control variables are continuous.

While multi-agent systems are very adaptive in unstructured problems, they have been mostly used as a software engineering paradigm in the field of dynamic control systems (Maturana et al., 2005; Srinivasan and Choy, 2006; Tatara et al., 2007). Control engineers and computer scientists are bridging the gap between these disciplines by developing multi-agent systems to cope with the sheer size and complexity of large dynamic control systems (Li et al., 2005; Manikonda et al., 2001; Tatara et al., 2005; Negenborn et al., 2008). The appeal of multi-agent technology stems from its composite nature, flexibility, and scalability.

Aligned with these efforts, this paper proposes a framework for a network of distributed agents to control linear dynamic systems, which are put together by interconnecting linear sub-systems with local input constraints. Our framework decomposes the optimization problem arising from the MPC approach into a network of coupled, but small, sub-problems to be solved by the agent network. Each agent senses and controls the variables of its sub-system, while communicating with agents in the vicinity to obtain neighborhood variables and coordinate their actions. A well-crafted problem decomposition and coordination protocol ensure convergence of the agents' iterates to a global optimum of the MPC problem.

The work reported here builds upon preceding work on distributed control (Camponogara et al., 2002; Camponogara and Talukdar, 2007) by exploiting the linear dynamic structure to develop simpler models and algorithms. The paper focuses on the development of the multi-agent MPC framework and its application to the control of signaling split in urban traffic networks. While able to attain performance comparable to centralized MPC, the multi-agent MPC framework is more robust in that the failure of a control agent compromises only its local sub-system. It also supports a plug-in technology that allows graceful expansion and reconfiguration to be performed locally, rather than having to be coordinated at the control center.

The remaining sections are structured as follows. Section 2 presents basic concepts of urban traffic networks and describes the store-and-forward model used by the TUC strategy. Section 3 formulates split control as an MPC problem for a network of dynamically coupled sub-systems, one for each intersection; the section then develops a decomposition of the MPC problem into a set of sub-problems and outlines a distributed algorithm for the agent network to reach an optimal solution. Section 4 reports results from computational experiments aimed at comparing the TUC LQR strategy with the multi-agent MPC approach. Section 5 draws some final remarks and suggests directions for future work.

2. Urban traffic control

The origin of urban traffic control dates back to the early 20th century with the appearance of traffic lights. The first attempts at real-time traffic control began in the 1980s with the implementation of the SCOOT (Robertson and Bretherton, 1991; Hunt et al., 1981) and SCATS (Lowrie, 1982) strategies. Nevertheless, despite the continuous research in the past decades, most control strategies still rely on heuristics to compute the signaling split, such as the acclaimed TRANSYT (Robertson, 1969).

Urban traffic control is usually divided into several modules, each responsible for a different aspect of traffic control. These modules include ramp metering, dynamic message signaling, signaling split control, and public transport. By split we mean the green light time assigned to each street or road of an intersection. This is one of the four control factors that most influence traffic (Diakaki, 1999; Papageorgiou, 2004), the others being stage specification, cycle duration, and offset between intersections. The signaling split control module of the TUC strategy is of particular interest to this paper.

The traffic-responsive urban control framework uses a store-and-forward model which represents traffic flow with continuous variables, thereby facilitating the synthesis of multi-variable control algorithms such as LQR and MPC. The underlying assumption of this store-and-forward model is shown in Fig. 1. The bold full line in colors green¹ and red represents the cycle of a junction. The square wave in full line represents the usual traffic flow model of a single stream of vehicles, using integer variables to differentiate periods with right of way and saturated flow, associated with the green portion of the cycle line, from periods with no flow, where the cycle line is red. The dashed line, on the other hand, represents the same flow of vehicles as seen by the model proposed by Gazis and Potts (1963). From this illustration, one can view the store-and-forward model as the mean flow crossing the stop line of an intersection during the control interval, meaning that this interval has to be greater than the intersection's cycle time. The TUC traffic model does not try to realistically model the complex and rapidly evolving dynamics of traffic, such as driver reaction times, acceleration, and deceleration; rather, it is concerned with the long-term evolution of the in- and out-flows of the network.

¹ For interpretation of color in Fig. 1, the reader is referred to the web version of this article.

Fig. 1. Store-and-forward flow (dashed line) and flow modeled with binary variables (full line).

2.1. Urban traffic network modeling

The models for traffic network and traffic flow presented below are from Diakaki (1999). An urban traffic network consists of intersections or junctions joined by links, which represent streets, avenues, roads, or any other infrastructure connecting them. A junction comprises a set of approaches ending at a common crossing area. An approach is a subset of the lanes of a link from which vehicles are able to cross the intersection simultaneously, being defined by the topology and stages of the network. A stage, or phase, is the period of time during which the traffic light signals are held constant at the intersection. Approaches may also be further divided into one or more streams. The maximum flow that can cross the stop line of an intersection when a stream has the right of way (r.o.w.) is the saturation flow, which is usually expressed in vehicles per hour. The yellow time introduced between consecutive phases to ensure safety is known as lost time. And the time frame until the repetition of stages is called cycle time or cycle. These concepts are the building blocks for traffic modeling.

Fig. 2 shows an urban traffic network with two roads, each of which has 4 lanes. Taking the horizontal link in the west-east direction as the reference, one notices two distinct approaches: one bundling vehicles willing to make a left turn and the other bundling the vehicles wishing to go straight ahead. The arrows show all streams of this network. The figure also illustrates the three stages of the intersection, which are repeated in each cycle.

An urban traffic network is therefore viewed as a directed graph whose nodes are the junctions j ∈ J and whose arcs correspond to the links z ∈ Z. The sets I_j and O_j have the incoming and outgoing links of junction j, respectively. The routes of vehicles entering the network are assumed to follow statistical patterns that are modeled by turning rates. Specifically, the turning rate s_{z,w} gives the rate of vehicles that reach a junction j from link z ∈ I_j and turn into a link w ∈ O_j. For the purpose of the traffic control analysis herein, turning rates s_{z,w}, cycle times C_j and lost times L_j at the junctions, and saturation flows S_z for the links are all known constants.

Let F_j be the set of phases at junction j, while u_{j,i} denotes the green time of phase i ∈ F_j. It is typical to have all intersections operating with a common cycle time C, which is enforced by the constraint

\sum_{i \in F_j} u_{j,i} + L_j = C

An additional constraint is u_{j,i} \in [u_{j,i}^{\min}, u_{j,i}^{\max}], where u_{j,i}^{\min} (u_{j,i}^{\max}) is the minimum (maximum) allowable green time. Also, let V_z \subseteq F_j be the subset of phases for which link z has the r.o.w. at junction j.

The traffic flow dynamics of the network link z in Fig. 3 is given by

\Delta x_z(t+1) = \Delta T \, [\, q_z(t) + d_z(t) - p_z(t) - c_z(t) \,]   (1)

where \Delta x_z(t+1) = x_z(t+1) - x_z(t), t = 1, 2, \ldots is a discrete time index and ΔT is the control interval; x_z denotes the number of vehicles in link z; q_z (p_z) is the inflow (outflow) of link z during the time window \Delta T \cdot [t, t+1); d_z is the demand, that is, the vehicles not originating from adjacent links that enter the network; and c_z is the exit flow.

Because turning rates are known, the traffic flow into link z is expressed as

q_z(t) = \sum_{w \in I_{j_1}} s_{w,z} \, p_w(t)

where s_{w,z} is the turning rate towards link z ∈ O_{j_1} coming from link w ∈ I_{j_1}. Demand and exit rates are lumped together as a single disturbance, say e_z(t). Assuming that inflows and outflows of link z with r.o.w. are equal to their saturation flow, S_z, Eq. (1) becomes

x_z(t+1) = x_z(t) + \Delta T \left[ \sum_{w \in I_{j_1}} s_{w,z} \frac{S_w}{C} \sum_{i \in V_w} u_{j_1,i}(t) - \frac{S_z}{C} \sum_{i \in V_z} u_{j_2,i}(t) + e_z(t) \right]   (2)

where the control signal u_{j_1,i}(t) is the green time for vehicles going through junction j_1 during phase i, whereas \sum_{i \in V_z} u_{j_2,i}(t) is the green time for vehicles leaving link z. Notice that link z starts at junction j_1 and ends at j_2. Generalizing Eq. (2) for all network links leads to the matrix equation

x(t+1) = A x(t) + B u(t) + e(t)   (3)

where x(t) is the state vector; u(t) is the control vector containing signals u_{j,i}(t), ∀i ∈ F_j, ∀j ∈ J; e(t) is the vector with the disturbances; and A = I is the state matrix, whereas B is the control input matrix.

2.2. Split control

Traffic-responsive control systems adjust split signals according to the demands of the involved streams. In standard form, the TUC strategy uses the LQR technique to find a time-invariant gain matrix, which is simpler than optimizing a performance criterion (Diakaki et al., 2002) but potentially delivers a sub-optimal control law. To apply the LQR technique, the disturbances are disregarded and the dynamic system (3) becomes:

x(t+1) = A x(t) + B u(t)   (4)

Such an assumption is plausible since the goal is to attain a satisfactory gain matrix. The minimization of the proportional occupancy of the links, x_z / x_z^{\max}, where x_z^{\max} is the link capacity, is attempted to reduce the risk of oversaturation and spillback. To this end, the following quadratic function is used:


\frac{1}{2} \sum_{t=0}^{\infty} \left( \|x(t)\|_Q^2 + \|u(t)\|_R^2 \right)   (5)

where Q and R are diagonal matrices, with the first being positive definite and the second being positive semi-definite.² According to the LQR theory, an infinite time horizon is used in (5) to achieve a time-invariant control law. As matrix Q weighs the states (the number of vehicles in the roads), the minimization of the average occupancy is achieved by making its diagonal elements equal to 1/(x_z^{\max})^2 for the corresponding links z ∈ Z. Matrix R reflects the penalty imposed on control effort, usually defined as R = rI, where r is found experimentally. Minimizing criterion (5) leads to the control law

u(t) = -L x(t)   (6)

where L is Riccati's gain matrix, which depends on A, B, Q, and R, but with small susceptibility to variation of these matrices (Diakaki et al., 2002). The feedback control law (6) does not account for the constraints on the control signals, which are imposed in an ad hoc manner by solving the following problem at each sample time t and for each junction j ∈ J:

Q_j(t):  \min_{U_{j,i}(t)} \sum_{i \in F_j} [\, u_{j,i}(t) - U_{j,i}(t) \,]^2   (7a)

s.t.:  \sum_{i \in F_j} U_{j,i}(t) + L_j = C_j   (7b)

U_{j,i}(t) \in [u_{j,i}^{\min}, u_{j,i}^{\max}],  ∀i ∈ F_j   (7c)

where U_{j,i}(t) is the closest solution in Euclidean space to u_{j,i}(t). Q_j(t) is a quadratic program which can be solved in real time with an efficient algorithm (Diakaki, 1999) that converges in at most |F_j| steps. Although this approach gives a feasible split, the resulting solution does not necessarily satisfy the optimality conditions for the dynamic system defined by Eq. (4) subject to the constraints on control signals. Actually, this multi-variable regulator behaves in a purely reactive way to unknown disturbances because no predictions on disturbances are made. On the other hand, the structure of matrix L provides the regulator with a gating effect, that is, the splits of highly loaded links on peripheral junctions are reduced to preclude saturation in upstream links and thereby avoid gridlocks.

Previous works (Aboudolas et al., 2007; de Oliveira and Camponogara, 2007; de Oliveira, 2008) report that significant improvements may be induced by replacing the standard LQR control law with a procedure that accounts for system constraints, such as model predictive control. Generally speaking, the MPC approach is composed of (Camacho and Bordons, 2004; Kühne, 2005):

- a prediction model satisfactorily describing the process dynamics in a finite-time horizon;
- a cost function which gives the control signals when minimized; and
- a sliding horizon of prediction and control, which is translated a step forward at each sample period, requiring the computation of new control actions from which only that of the actual time is implemented.

Model predictive control minimizes the same cost function as LQR control, except that it covers a limited time frame given by the prediction horizon. MPC is regarded as a feed-forward control strategy because a disturbance model can be embedded in its prediction model. Nevertheless, the use of a disturbance model may mask the benefits of the computation of a better control signal under equal circumstances. Put another way, the dynamic model for traffic flow should be the same for the TUC and MPC strategies when comparing their performances. Following these principles, the MPC problem for signaling split control at time t is cast as
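The sliding-horizon principle above can be sketched on a toy problem. The example below uses a scalar, unconstrained system with a finite-horizon cost minimized by a backward Riccati recursion, so it is only a stand-in for the constrained problem formulated next; the system, weights, and terminal cost (taken equal to the stage weight) are made-up assumptions:

```python
# Receding-horizon (MPC) skeleton for a scalar system x(t+1) = a*x(t) + b*u(t):
# at every sample time a K-step cost sum of q*x^2 + r*u^2 is minimized by a
# backward Riccati recursion, and only the first control of the plan is applied.

def mpc_gain(a, b, q, r, K):
    """First-move feedback gain of the K-step unconstrained problem."""
    p = q                                   # terminal weight (assumed equal to q)
    gain = 0.0
    for _ in range(K):                      # backward-in-time recursion
        gain = (b * p * a) / (r + b * p * b)
        p = q + a * p * a - a * p * b * gain
    return gain

def run_mpc(x0, a=1.0, b=1.0, q=1.0, r=0.1, K=5, steps=20):
    x = x0
    for _ in range(steps):                  # rolling horizon: re-plan every step
        u = -mpc_gain(a, b, q, r, K) * x    # only the first planned move is used
        x = a * x + b * u
    return x

x_final = run_mpc(x0=10.0)                  # queue-like state driven toward zero
```

Re-solving at every step is what makes the scheme feedback rather than open-loop: the plan is recomputed from the measured state, so prediction errors do not accumulate.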

P(t):  \min \; \frac{1}{2} \sum_{k=1}^{K} \hat{x}(t+k|t)' Q \hat{x}(t+k|t) + \frac{1}{2} \sum_{k=0}^{K-1} \hat{u}(t+k|t)' R \hat{u}(t+k|t)   (8a)

s.t.:  \hat{x}(t|t) = x(t)   (8b)

For k = 0, \ldots, K-1:

\hat{x}(t+k+1|t) = A \hat{x}(t+k|t) + B \hat{u}(t+k|t)   (8c)

C \hat{u}(t+k|t) \ge c   (8d)

D \hat{u}(t+k|t) = d   (8e)

where K is the length of the prediction horizon; x(t) is the current state of the traffic network at time t; \hat{x}(t+k|t) is the state prediction for time t+k; \hat{u}(t+k|t) is the control prediction for time t+k, but only \hat{u}(t|t) is implemented, with u(t) = \hat{u}(t|t); C and c define the inequality constraints; and D and d define the equality constraints.

² The weighted norm is defined as \|x\|_M = \sqrt{x' M x}.


3. Multi-agent model predictive control of linear dynamic networks

This section introduces the concept of linear dynamic network (LDN), which models the traffic flow dynamics and split control problem shown above. It presents a distributed formulation P(t) for LDNs which generalizes the MPC formulation for split control given in Eqs. (8a)-(8e). Further, this section develops a decomposition of P(t) into a set {P_m(t)} of sub-problems and proposes a distributed algorithm for an agent network to reach a solution to P(t) by iteratively solving {P_m(t)}.

3.1. MPC formulation

A dynamic network consists of the interconnection of M sub-systems that form a graph G = (V, E), where each sub-system is a node in V and each arc (i, j) ∈ E defines a coupling between sub-systems i and j. Vector x_m ∈ R^{n_m} has the local state and u_m ∈ R^{p_m} has the local controls of sub-system m. The state of sub-system m evolves in time depending on its local state, local control signals, and the control signals at the up-stream sub-systems. For discrete-time dynamics, the state equation for sub-system m is:

x_m(t+1) = A_m x_m(t) + \sum_{i \in I(m)} B_{mi} u_i(t)   (9)

where t ∈ N is the discrete sample time and I(m) = {m} ∪ {i : (i, m) ∈ E} is the set of input neighbors of sub-system m, which includes m and the up-stream sub-systems. The network state is x = (x_1, \ldots, x_M), whereas its control vector is u = (u_1, \ldots, u_M). Clearly, the dynamic Eqs. (9) are collectively given by x(t+1) = A x(t) + B u(t) for suitable matrices A and B. Hereafter, the network dynamic system is assumed to be controllable.³
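The interconnection just described can be sketched by stacking the sub-system blocks into the global matrices A and B of x(t+1) = Ax(t) + Bu(t). The dimensions and coupling below are illustrative assumptions, not the paper's traffic network: two sub-systems with a single arc 1 → 2:

```python
import numpy as np

# Sketch: stacking the sub-system dynamics (9) into the global form
# x(t+1) = A x(t) + B u(t). Two sub-systems, arc 1 -> 2, so I(2) = {1, 2}.
A_loc = {1: np.eye(2), 2: np.eye(2)}                  # A_m (store-and-forward: A = I)
B_loc = {(1, 1): np.array([[1.0], [0.0]]),            # B_mi for i in I(m)
         (2, 2): np.array([[0.0], [1.0]]),
         (2, 1): np.array([[0.5], [0.0]])}            # coupling from up-stream 1

A = np.block([[A_loc[1], np.zeros((2, 2))],
              [np.zeros((2, 2)), A_loc[2]]])          # block-diagonal state matrix
B = np.zeros((4, 2))
for (m, i), Bmi in B_loc.items():
    B[2 * (m - 1):2 * m, i - 1:i] = Bmi               # place each block B_mi

x = np.array([1.0, 0.0, 0.0, 0.0])
u = np.array([2.0, 3.0])
x_next = A @ x + B @ u                                # global one-step update
```

The off-diagonal block B_21 is what couples sub-system 2's state to sub-system 1's control, exactly the structure the distributed formulation later exploits.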

Given the network state x(t), the MPC framework obtains the control signals for time t by solving the following quadratic programming problem:

P(t):  \min \; \sum_{m=1}^{M} \phi_m(t) = \sum_{m=1}^{M} \sum_{k=1}^{K} \frac{1}{2} \left[ \hat{x}_m(t+k|t)' Q_m \hat{x}_m(t+k|t) + \hat{u}_m(t+k-1|t)' R_m \hat{u}_m(t+k-1|t) \right]   (10a)

s.t.:  \hat{x}_m(t|t) = x_m(t),  m ∈ M   (10b)

For all m ∈ M, k ∈ K:

\hat{x}_m(t+k+1|t) = A_m \hat{x}_m(t+k|t) + \sum_{i \in I(m)} B_{mi} \hat{u}_i(t+k|t)   (10c)

C_m \hat{u}_m(t+k|t) \ge c_m   (10d)

D_m \hat{u}_m(t+k|t) = d_m   (10e)

where \hat{x}_m(t+k|t) is sub-system m's state prediction for time t+k calculated at time t, whereas \hat{u}_m(t+k|t) is its control signal prediction; Q_m is positive semi-definite and R_m is positive definite; C_m and c_m (D_m and d_m) define the inequality (equality) constraints; and M = {1, \ldots, M} is the set with the indices of the sub-systems and K = {0, \ldots, K-1} defines the prediction horizon.

Only the control signals predicted for time t are implemented, namely u_m(t) = \hat{u}_m(t|t). The other control signals are calculated merely to predict the long-term effects of the present control actions and thereby avoid actions that have poor long-term performance. Because of this predictive feature, the framework is called model predictive control. At the next sample time, t+1, the prediction horizon is rolled forward: the current state x(t+1) is measured, P(t+1) is solved, and new control signals u_m(t+1) are obtained and implemented. The process continues indefinitely, receding into infinity. This is why such a control framework is also known as rolling-, sliding-, or receding-horizon control.

The test bed is the traffic network depicted in Fig. 4 with 13 one-way roads and six junctions. The state x_3 = (x_6, x_7)' of sub-system three has the number of vehicles in roads 6 and 7, while its control vector u_3 = (u_6, u_7)' has the green time for each road. The coupling graph G appears in Fig. 5. The set of input neighbors to sub-system three is I(3) = {1, 3, 4}. Matrix B_{33} expresses the discharge of queues x_3 as a function of green times u_3, while B_{31} (B_{34}) expresses how queues x_3 build up as x_1 (x_4) are emptied. For the purpose of illustration,

B_{33} = -T \begin{bmatrix} S_6/C & 0 \\ 0 & S_7/C \end{bmatrix}, \qquad B_{31} = T \begin{bmatrix} \cdots \\ 0 \end{bmatrix}, \qquad B_{34} = T \begin{bmatrix} 0 \\ \cdots \end{bmatrix}

where T (seconds) is the control interval, s_{i,j} is the conversion rate from road i into j, S_i (vehicles/s) is the saturation flow of road i, and C (seconds) is the cycle time. The inequality constraints impose minimum and maximum green times on the phases. The equalities guarantee that the total green time plus lost time (yellow time) add up to the cycle time.

³ With A being n × n and B being n × m, the pair (A, B) is said to be controllable if the n × nm matrix [B  AB  A²B  ⋯  A^{n-1}B] has full row rank. For a controllable plant x(t+1) = A x(t) + B u(t), there exist control vectors u(0), u(1), \ldots, u(n-1) that force x(n) to the origin regardless of the initial state x(0).


Fig. 4. Traffic network. The shaded area indicates sub-system 3, whose incoming queues are modeled by state variables x_6 and x_7.

The elimination of linear dependencies and the aggregation of variables over the prediction horizon leads to an equivalent form of P(t) that simplifies the design of algorithms. Note that sub-system m's state prediction for time t+k is a function of its state at time t and the control signals prior to time t+k:

\hat{x}_m(t+k|t) = A_m^k x_m(t) + \sum_{l=1}^{k} \sum_{i \in I(m)} A_m^{l-1} B_{mi} \hat{u}_i(t+k-l|t)   (11)

Let vector \hat{u}_m(t) = (\hat{u}_m(t|t), \ldots, \hat{u}_m(t+K-1|t)) collect the control variables and \hat{x}_m(t) = (\hat{x}_m(t+1|t), \ldots, \hat{x}_m(t+K|t)) be the state variables predicted over the time horizon. By defining matrices

\bar{A}_m = \begin{bmatrix} A_m \\ A_m^2 \\ \vdots \\ A_m^K \end{bmatrix} \quad \text{and} \quad \bar{B}_{mi} = \begin{bmatrix} B_{mi} & 0 & \cdots & 0 \\ A_m B_{mi} & B_{mi} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ A_m^{K-1} B_{mi} & A_m^{K-2} B_{mi} & \cdots & B_{mi} \end{bmatrix}
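These stacked matrices can be assembled mechanically from A_m and B_{mi}. The sketch below does so for illustrative scalar data and checks the result against the running-sum predictions of an integrator; the dimensions are assumptions for the example only:

```python
import numpy as np

# Build A_bar = [A; A^2; ...; A^K] and the lower block-triangular B_bar whose
# block (k, l) is A^(k-l) B for k >= l, so that the stacked predictions satisfy
# x_hat = A_bar x(t) + B_bar u_hat, as in Eq. (12).
def prediction_matrices(A, B, K):
    n, p = B.shape
    A_bar = np.vstack([np.linalg.matrix_power(A, k + 1) for k in range(K)])
    B_bar = np.zeros((K * n, K * p))
    for k in range(K):
        for l in range(k + 1):
            B_bar[k*n:(k+1)*n, l*p:(l+1)*p] = np.linalg.matrix_power(A, k - l) @ B
    return A_bar, B_bar

A = np.array([[1.0]])                       # scalar integrator (A = I, as in TUC)
B = np.array([[2.0]])
A_bar, B_bar = prediction_matrices(A, B, K=3)

# For an integrator, predictions are running sums of the applied controls:
x_hat = A_bar @ np.array([1.0]) + B_bar @ np.array([1.0, 1.0, 1.0])
```

Starting from x(t) = 1 with unit controls, the predictions accumulate 2 vehicles per step, giving 3, 5, 7, which matches summing Eq. (11) term by term.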


where 0 denotes a matrix of zeros of suitable dimension, the state predictions are calculated as

\hat{x}_m(t) = \bar{A}_m x_m(t) + \sum_{i \in I(m)} \bar{B}_{mi} \hat{u}_i(t)   (12)

Let I_n denote the identity matrix of dimension n. By defining \bar{Q}_m = I_K \otimes Q_m and \bar{R}_m = I_K \otimes R_m in terms of the Kronecker product,⁴ and using Eq. (12), the objective term \phi_m(t) becomes

\phi_m(t) = \frac{1}{2} x_m(t)' \bar{A}_m' \bar{Q}_m \bar{A}_m x_m(t) + \sum_{i \in I(m)} x_m(t)' \bar{A}_m' \bar{Q}_m \bar{B}_{mi} \hat{u}_i(t) + \frac{1}{2} \sum_{i \in I(m)} \sum_{j \in I(m)} \hat{u}_i(t)' \bar{B}_{mi}' \bar{Q}_m \bar{B}_{mj} \hat{u}_j(t) + \frac{1}{2} \hat{u}_m(t)' \bar{R}_m \hat{u}_m(t)   (13)
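The Kronecker construction \bar{Q}_m = I_K \otimes Q_m simply repeats the stage weight along the block diagonal, so the per-step costs collapse into one quadratic form over the stacked predictions. A small sketch with made-up numbers:

```python
import numpy as np

# Q_bar = I_K (Kronecker) Q_m is block-diagonal with K copies of Q_m, so one
# quadratic form over the stacked vector equals the sum of the K stage costs.
Q_m = np.diag([1.0, 4.0])                 # illustrative stage weight (2 states)
K = 2
Q_bar = np.kron(np.eye(K), Q_m)           # 4x4 block-diagonal weight

x_hat = np.array([1.0, 1.0, 2.0, 0.5])    # stacked predictions over K steps

stage = sum(x_hat[2*k:2*k+2] @ Q_m @ x_hat[2*k:2*k+2] for k in range(K))
whole = x_hat @ Q_bar @ x_hat             # identical value in a single product
```

Writing the horizon cost this way is what lets the MPC problem be condensed into the quadratic program (15) over control variables only.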

By defining

g_{mi}(t) = \bar{B}_{mi}' \bar{Q}_m \bar{A}_m x_m(t) \quad \text{for } i \in I(m)   (14a)

H_{mij} = \bar{B}_{mi}' \bar{Q}_m \bar{B}_{mj} \quad \text{for } i, j \in I(m), \; i \ne m \text{ or } j \ne m   (14b)

H_{mmm} = \bar{B}_{mm}' \bar{Q}_m \bar{B}_{mm} + \bar{R}_m   (14c)

c(t) = \sum_{m \in M} \frac{1}{2} x_m(t)' \bar{A}_m' \bar{Q}_m \bar{A}_m x_m(t)   (14d)

problem P(t) takes the equivalent form

P(t):  \min \; \sum_{m \in M} \sum_{i \in I(m)} g_{mi}(t)' \hat{u}_i(t) + \frac{1}{2} \sum_{m \in M} \sum_{i \in I(m)} \sum_{j \in I(m)} \hat{u}_i(t)' H_{mij} \hat{u}_j(t) + c(t)   (15a)

s.t.:  \bar{C}_m \hat{u}_m(t) \ge \bar{c}_m,  m ∈ M   (15b)

\bar{D}_m \hat{u}_m(t) = \bar{d}_m,  m ∈ M   (15c)

where \bar{C}_m = I_K \otimes C_m, \bar{D}_m = I_K \otimes D_m, and \bar{c}_m = (c_m', \ldots, c_m')' and \bar{d}_m = (d_m', \ldots, d_m')'.

Here, the issue is how a network of distributed agents, instead of a centralized agent, can solve P(t). In what follows, we develop a decomposition of P(t) into a set of coupled sub-problems {P_m(t)} and outline a distributed solution protocol.

3.3. Problem decomposition

For the distribution of decision-making, an agent m decides upon the values of the local control variables of sub-system m. The values u_m(t) are obtained by solving a local optimization problem P_m(t) at each sample time. The design of the sub-problem set {P_m(t)} and the couplings among the agents is the so-called problem decomposition. The decomposition is said to be perfect if each sub-problem P_m(t) encompasses all of the objective terms and constraints of P(t) that depend on \hat{u}_m(t). Models and algorithms for perfect and approximate decomposition are found in (Camponogara and Talukdar, 2004, 2005). For a perfect decomposition, let:

- O(m) = {i : m ∈ I(i), i ≠ m} be the set of output neighbors of sub-system m, that is, any sub-system i whose state \hat{x}_i(t) is affected by \hat{u}_m(t);
- C(m) = {(i, j) ∈ I(m) × I(m) : i = m or j = m} be the sub-system pairs of quadratic terms in \phi_m that depend on \hat{u}_m(t); and
- C(m, k) = {(i, j) ∈ I(k) × I(k) : i = m or j = m} be the pairs of quadratic terms in \phi_k(t), k ∈ O(m), that depend on \hat{u}_m(t).

In the sample traffic network (Figs. 4 and 5), I(1) = {1}, O(1) = {2, 3, 5, 6}, C(1) = {(1, 1)}, and C(1, 3) = {(1, 3), (1, 4), (1, 1), (3, 1), (4, 1)}. Notice that \hat{u}_m(t) can affect the state of systems other than I(m) ∪ O(m). For instance, sub-system 1 is coupled to sub-system 4 via sub-system 3, but 4 ∉ I(1) ∪ O(1). The notion of neighborhood establishes the interdependence among sub-systems. Agent m's view of the network is divided in three sets:

- local variables: the variables in vector u_m(t);
- neighborhood variables: all the variables in vector y_m(t) = (u_i(t) : i ∈ N(m)), where N(m) = O(m) ∪ {i : (i, j) ∈ C(m, k), k ∈ O(m)} − {m} is the neighborhood of agent m. The neighborhood of agent m consists of the sub-systems other than m that are affected by the decision u_m(t) or whose decisions affect x_m(t). Notice that O(m) ⊆ N(m); and
- remote variables: all of the other variables, which form the vector z_m(t) = (u_i(t) : i ∉ N(m) ∪ {m}).
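These sets can be computed directly from the arc list of the coupling graph. The sketch below uses a small hypothetical graph (not Fig. 5) in which sub-systems 1 and 4 share the down-stream sub-system 3, so they become neighbors without being directly connected:

```python
# Sketch of agent m's variable partition (local / neighborhood / remote) derived
# from a coupling graph. The graph below is a small hypothetical example.
E = {(1, 3), (4, 3), (2, 4)}           # arc (i, j): sub-system i feeds j
V = {1, 2, 3, 4}

def I(m):                               # input neighbors, including m itself
    return {m} | {i for (i, j) in E if j == m}

def O(m):                               # output neighbors: systems m feeds
    return {j for (i, j) in E if i == m}

def N(m):                               # neighborhood of agent m
    nbrs = set(O(m))
    for k in O(m):
        nbrs |= I(k)                    # agents sharing a down-stream system
    return nbrs - {m}

def remote(m):
    return V - N(m) - {m}

n1 = N(1)                               # {3, 4}: 4 co-feeds sub-system 3
```

Agent 1 must therefore exchange decisions with agents 3 and 4, while agent 2's variables remain remote to it.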

From agent m's viewpoint, u(t) = (u_m(t)', y_m(t)', z_m(t)')'. Let f(\hat{u}(t)) = \sum_{m \in M} \phi_m(t) denote the objective function of P(t). A perfect problem decomposition requires the local problem P_m(t) to account for all the dependencies with the neighbors of agent m. This is achieved if P_m(t) is obtained from P(t) by (i) discarding from the objective f the terms not involving \hat{u}_m(t) and (ii) dropping the constraints not associated with agent m. Formally, agent m's local problem is

⁴ The operator ⊗ denotes the Kronecker product.


P_m(t, \hat{y}_m(t)):  \min \; f_m = \frac{1}{2} \hat{u}_m(t)' H_m \hat{u}_m(t) + g_m(t)' \hat{u}_m(t)   (16a)

s.t.:  \bar{C}_m \hat{u}_m(t) \ge \bar{c}_m   (16b)

\bar{D}_m \hat{u}_m(t) = \bar{d}_m   (16c)

where H_m is a suitable matrix and g_m(t) is a suitable vector. A step-by-step procedure to obtain H_m and g_m(t) from H_{mij} and g_{mi}(t) is developed in (Camponogara and de Oliveira, 2009). Evidently, a perfect decomposition ensures that

f(\hat{u}(t)) = f_m(\hat{u}_m(t), \hat{y}_m(t)) + \bar{f}_m(\hat{y}_m(t), \hat{z}_m(t)) + c(t)

for each agent m, where \bar{f}_m is a suitable function. To simplify notation, hereafter P_m, P_m(t), and P_m(t, \hat{y}_m(t)) will be shorthands for sub-problem (16a)-(16c).

A perfect problem decomposition leads to some relationships between P(t) and {P_m(t)} that are handy for the design of a distributed algorithm for the agent network. Assumptions and resulting properties are presented below. The reader can refer to (Camponogara and de Oliveira, 2009) for the demonstrations and some illustrations.

Proposition 1. A solution \hat{u}(t) satisfies the first-order KKT (Karush-Kuhn-Tucker) optimality conditions for P(t) if, and only if, (\hat{u}_m(t), \hat{y}_m(t)) satisfies the KKT conditions of P_m(t, \hat{y}_m(t)) for each m ∈ M.

Definition 1. (Feasible spaces) The feasible spaces are:

- U_m = {\hat{u}_m : \bar{C}_m \hat{u}_m \ge \bar{c}_m, \bar{D}_m \hat{u}_m = \bar{d}_m} is the feasible space for P_m(t);
- U = U_1 × ⋯ × U_M is the feasible space for P(t); and
- Y_m = \prod_{i \in N(m)} U_i is the feasible space for agent m's neighborhood variables.

Assumption 2. (Strict feasibility) There exists \hat{u} ∈ U such that \bar{C}_m \hat{u}_m > \bar{c}_m and \bar{D}_m \hat{u}_m = \bar{d}_m for all m ∈ M.

Compactness⁵ is a plausible assumption because control signals are invariably bounded. So is the strict feasibility assumption: if the interior of U is empty, then some inequalities are indeed equalities and should be regarded as such.

Proposition 2. Problem P(t) given by (15a)-(15c) is convex.

Corollary 1. Sub-problem P_m(t, \hat{y}_m(t)) is convex.

Proposition 3. (Optimality conditions) Because f(\hat{u}(t)) is a convex function and U is a convex set, a vector \hat{u}(t)^* is a local minimum for f over U if and only if:

\nabla f(\hat{u}(t)^*)' \, (\hat{u}(t) - \hat{u}(t)^*) \ge 0, \quad ∀ \hat{u}(t) ∈ U   (17)

Corollary 2. (Local optimality conditions) \hat{u}(t)^* is a local minimum for P(t) if, and only if, (\hat{u}_m(t)^*, \hat{y}_m(t)^*) is a local minimum of P_m(t, \hat{y}_m(t)^*) for all m ∈ M.

This corollary means that an overall control vector that cannot be unilaterally improved by a single agent (a fixed point) is locally optimal for all sub-problems {P_m(t)} and therefore also locally optimal for P(t). As the problems are all convex, a local optimum induces a global optimum.

3.4. Multi-agent distributed control

A perfect problem decomposition establishes an equivalence between an optimal solution to P(t) and a stationary solution for the sub-problem network {P_m(t)}. How do the agents reach a fixed point \hat{u}(t)^*? Below, we present a distributed algorithm for the agents to arrive at a stationary point for {P_m(t)}, which works by generating a sequence \hat{u}(t)^k = (\hat{u}_1^k(t), \ldots, \hat{u}_M^k(t)) of iterates. Starting with a feasible control vector \hat{u}(t)^0, at each iteration k the agents exchange their decisions locally, coordinate the iterations to preclude coupled agents from acting simultaneously, and keep working until convergence is attained or time is up. At this point, the control signals are implemented and the horizon is rolled forwards to the next sample time. Two fundamental assumptions for the convergence of the agents' iterates to a stationary solution are stated below.

⁵ A set S is compact if for any given sequence {x_k} of vectors in S there exists a subsequence {x_{k_i}} which converges to a point x* in S. Any compact set is closed and bounded.


Assumption 3. (Synchronous work) When agent m revises its decisions at iteration k:

(i) agent m uses ŷ_m(t) = ŷ_m^(k)(t) to compute the û_m(t) that becomes its next iterate û_m^(k+1)(t);
(ii) all the neighbors of agent m keep their decisions at iteration k, that is, û_i^(k+1)(t) = û_i^(k)(t) for all i ∈ N(m).

Assumption 4. (Continuous work) If û(t)^(k) is not a stationary point for all problems in {P_m(t)}, then at least one agent m for which û_m^(k)(t) is not a stationary point for P_m(t) produces a new iterate û_m^(k+1)(t) by approximately solving P_m(t, ŷ_m^(k)(t)).

Condition (ii) of Assumption 3 and Assumption 4 hold by arranging the agents to iterate repeatedly in a sequence ⟨S_1, …, S_r⟩ where S_i ⊆ M, ∪_{i=1}^r S_i = M, and all distinct pairs m, n ∈ S_i are non-neighbors for all i. ⟨S_1, S_2, S_3⟩ with S_1 = {2, 4, 6}, S_2 = {3, 5}, and S_3 = {1} is a valid sequence for the illustrative scenario. Actually, this sequence is too restrictive because the dynamic Eq. (9) assumes that u_i(t), i ∈ I(m), influences the entire state vector x_m(t+1). This is not the case in the traffic scenario. While the control signals u_1(t) and u_4(t) influence x_3(t+1) as a whole in the model, u_1(t) influences only the part of x_3(t+1) associated with x_6, whereas u_4(t) influences only the part associated with x_7. Thus, S_1 = {2, 4, 6}, S_2 = {3, 5}, and S_3 = {1, 4} is also a plausible iteration sequence for the agents. Time-varying sequences that uphold the conditions and synchronization protocols are other alternatives.
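Such a grouping can be produced mechanically by greedy-coloring the agents' neighborhood graph, so that no two neighbors ever land in the same group. A minimal sketch; the six-junction graph below is a made-up topology for illustration, not the network of the paper:

```python
def iteration_groups(neighbors):
    """Greedy-color agents so that no two neighbors share a group.

    neighbors: dict mapping agent -> set of neighboring agents.
    Returns disjoint groups S_1, ..., S_r covering all agents; agents
    inside a group may iterate simultaneously (Assumption 3-(ii)).
    """
    groups = []                            # each entry is one set S_i
    for m in sorted(neighbors):
        for S in groups:
            if not (S & neighbors[m]):     # no neighbor of m already in S
                S.add(m)
                break
        else:
            groups.append({m})             # open a new group for m
    return groups

# Hypothetical neighborhood graph for a six-junction ring.
nbrs = {1: {2, 6}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 6}, 6: {1, 5}}
groups = iteration_groups(nbrs)            # → [{1, 3, 5}, {2, 4, 6}]
```

The agents then cycle through the groups in round-robin fashion, which satisfies both assumptions as long as every group gets its turn.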

Of relevance is the way an agent m solves P_m(t) approximately so that the iterates û(t)^(k) are drawn to a stationary point of {P_m(t)}. To this end, we developed a distributed algorithm based on the feasible direction method (Bertsekas, 1995), which is only outlined below but fully developed in Camponogara and de Oliveira (2009). The distributed feasible direction method is specially tailored for LDNs, taking advantage of the local dynamic and constraint structure, which is not present in frameworks for more general settings (Camponogara et al., 2002; Camponogara and Talukdar, 2007).

At the current iterate û(t)^(k), agent m computes a locally descent direction d_m^(k)(t) = u_m^(k)(t) − û_m^(k)(t) by solving a linear programming (LP) problem:

  D_m^(k)(t):  min_{u_m^(k)(t)}  ∇_{û_m(t)} f_m(û_m^(k)(t), ŷ_m^(k)(t))′ (u_m^(k)(t) − û_m^(k)(t))    (18a)
               s. to:  C_m u_m^(k)(t) = c_m    (18b)
                       D_m u_m^(k)(t) ≥ d_m    (18c)

A direction d_m^(k)(t) ≠ 0 is locally feasible at (û_m^(k)(t), ŷ_m^(k)(t)) if û_m^(k)(t) + α_m d_m^(k)(t) ∈ U_m for all sufficiently small α_m > 0. A locally feasible direction is locally descent at a nonstationary point (û_m^(k)(t), ŷ_m^(k)(t)) if ∇f_m(û_m^(k)(t), ŷ_m^(k)(t))′ d_m^(k)(t) < 0. Notice that the solution to D_m^(k)(t) produces a locally descent direction if one exists.

The next iterate û_m^(k+1)(t) = û_m^(k)(t) + α_m^(k)(t) d_m^(k)(t) is obtained by finding a step α_m^(k)(t) that satisfies the Armijo rule. Given β_m, σ_m ∈ (0, 1), α_m^(k)(t) = β_m^{a_m}, where a_m is the smallest nonnegative integer for which:

  f_m(û_m^(k)(t) + β_m^{a_m} d_m^(k)(t), ŷ_m^(k)(t)) − f_m(û_m^(k)(t), ŷ_m^(k)(t)) ≤ σ_m β_m^{a_m} ∇f_m(û_m^(k)(t), ŷ_m^(k)(t))′ d_m^(k)(t)

Agent iterations as delineated above and Assumptions 3 and 4 ensure that the iterates û(t)^(k) are drawn to a stationary point of {P_m(t)} and thereby to a solution of P(t). Some technical details are needed for the convergence proof, but effectively the agent network implements a distributed feasible direction method for quadratic programming (Camponogara and de Oliveira, 2009). The procedure used by each agent m at iteration k to solve {P_m(t)} is outlined below.

Agent-iteration(t, m, k)
 1: if agent m cannot revise its decisions in iteration k then
 2:   û_m^(k+1)(t) ← û_m^(k)(t)
 3:   return
 4: end if
 5: Agent m obtains ŷ_m^(k)(t) = (û_i^(k)(t) : i ∈ N(m)) from its neighbors
 6: Agent m solves D_m^(k)(t) to obtain d_m^(k)(t)
 7: if d_m^(k)(t) = 0 then
 8:   û_m^(k+1)(t) ← û_m^(k)(t)    ▷ (û_m^(k)(t), ŷ_m^(k)(t)) is stationary for P_m(t)
 9:   return
10: end if
11: a_m ← 0
12: while f_m(û_m^(k)(t) + β_m^{a_m} d_m^(k)(t), ŷ_m^(k)(t)) − f_m(û_m^(k)(t), ŷ_m^(k)(t)) > σ_m β_m^{a_m} ∇f_m(û_m^(k)(t), ŷ_m^(k)(t))′ d_m^(k)(t) do
13:   a_m ← a_m + 1
14: end while
15: û_m^(k+1)(t) ← û_m^(k)(t) + β_m^{a_m} d_m^(k)(t)


The iteration procedure is relatively simple. The most computationally demanding step is the solution of the linear program, for which fast and robust LP solvers are available off-the-shelf.
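For concreteness, the backtracking loop of steps 11-14 can be sketched in a few lines; the quadratic objective and gradient below are stand-ins for f_m and its gradient, not the paper's actual sub-problem:

```python
def armijo_step(f, grad_f, u, d, beta=0.5, sigma=0.1, max_tries=60):
    """Backtracking line search of steps 11-14 of Agent-iteration.

    Finds the smallest nonnegative integer a such that
      f(u + beta**a * d) - f(u) <= sigma * beta**a * grad_f(u).d
    and returns the step length beta**a.
    """
    g_dot_d = sum(g * di for g, di in zip(grad_f(u), d))
    assert g_dot_d < 0, "d must be a descent direction"
    fu = f(u)
    for a in range(max_tries):
        step = beta ** a
        if f([ui + step * di for ui, di in zip(u, d)]) - fu <= sigma * step * g_dot_d:
            return step
    raise RuntimeError("no Armijo step found")

# Toy stand-in: f(u) = u1^2 + u2^2, moving along d = -grad f(u).
f = lambda u: u[0] ** 2 + u[1] ** 2
grad = lambda u: [2 * u[0], 2 * u[1]]
u = [4.0, 2.0]
d = [-g for g in grad(u)]
step = armijo_step(f, grad, u, d)                    # → 0.5 for this quadratic
u_next = [ui + step * di for ui, di in zip(u, d)]    # → [0.0, 0.0]
```

Because the sufficient-decrease test is relative to the directional derivative, the loop terminates for any descent direction, which is what the convergence argument requires.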

3.4.1. Analytical computation of feasible descent direction

The constraint structure of the linear dynamic network for signaling split control allows D_m^(k)(t) to be solved analytically. To this end, replace the split u_i(t) of a road i at cycle t with u_i(t) = l_i + d_i(t), where l_i is the lower bound for green time and d_i(t) is the green-time extension. If the upper bound for u_i(t) is the cycle time C, then constraints (18b) and (18c) become:

  Σ_{u_i ∈ u_m} d_i(t+k|t) = C − Σ_{u_i ∈ u_m} l_i,  ∀ k ∈ K
  d_i(t+k|t) ≥ 0,  ∀ u_i ∈ u_m,  ∀ k ∈ K

Such a constraint structure is separable, having an independent constraint set for each prediction time k. Further, any basic solution will have precisely one nonzero variable d_i(t+k|t) for each k. The net result is that an optimal basic solution to D_m^(k)(t) is found by defining the basic variables as those corresponding to the most negative entries of the gradient ∇f_m(û_m^(k)(t), ŷ_m^(k)(t)).
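Under this separable structure, the LP decomposes into one trivial problem per prediction step: all the distributable green time C − Σ l_i goes to the approach whose gradient entry is most negative. A sketch under that reading; the per-step gradient layout is an assumption:

```python
def analytic_lp_solution(grad, cycle, lower):
    """Solve D_m^k(t) in closed form for split control.

    grad:  list of K lists, grad[k][i] = gradient entry for the green-time
           extension d_i(t+k|t) of approach i at prediction step k.
    cycle: cycle time C.
    lower: lower bounds l_i, one per approach.
    Returns splits u[k][i] = l_i + d_i of an optimal basic solution: all
    slack green time goes to the most negative gradient entry of step k.
    """
    slack = cycle - sum(lower)                 # distributable green time
    u = []
    for gk in grad:
        i_star = min(range(len(gk)), key=gk.__getitem__)
        row = list(lower)
        row[i_star] += slack                   # the single nonzero basic variable
        u.append(row)
    return u

# Two approaches with l = [10, 10] s, C = 60 s, horizon K = 2.
u = analytic_lp_solution([[-3.0, 1.0], [0.5, -2.0]], 60.0, [10.0, 10.0])
# → [[50.0, 10.0], [10.0, 50.0]]
```

The closed form makes step 6 of Agent-iteration essentially free, which is why the authors note that no off-the-shelf LP solver is strictly necessary for this problem class.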

3.4.2. Conflict resolution

The multi-agent MPC framework can be viewed as a dynamic game (Camponogara et al., 2006). Each agent m has an implicit reaction function R_m(y_m) determining the agent's response u_m to the decisions y_m of its neighboring agents. The reaction function is computed by solving sub-problem P_m(t). Thus, the agents resolve conflicts by iteratively reacting to one another's decisions until they reach a fixed point. Such a fixed point is a Nash point for the game, that is, a combined decision vector u which cannot be improved unilaterally by any agent with respect to its objective. On the one hand, the agents are selfish to the extent that they are driven by their own interests, as quantified by their objective functions. On the other hand, this selfish behavior leads to a global optimum since the objectives of the agents are aligned with the global objective of problem P(t).

3.4.3. Multi-agent MPC as a multi-agent system

All in all, the multi-agent MPC framework falls within the class of multi-agent systems, which are systems composed of multiple interacting intelligent agents having the characteristics of autonomy, local views, and decentralization (Wooldridge, 2002). The agents have limited autonomy because they follow the iteration and communication protocol imposed by Assumptions 3 and 4, but each agent m is free to decide upon the values of parameters β_m and σ_m based on what worked best in the past, to perform multiple iterations rather than simply satisfying the Armijo rule, and even to utilize a totally different algorithm that solves P_m or finds a near-optimal solution implicitly satisfying the Armijo rule. The views of the agents are local because they sense and decide upon the values of only a fraction of the state and control variables, respectively. And the agents are decentralized since no single agent has a complete view of, or operates, the entire network.

3.5. Closed-loop stability

The MPC approach is a kind of feedback control. It repeatedly revises the predicted control actions over a receding horizon as new state measurements are received. However, the optimizations do not explicitly consider the system behavior beyond the prediction horizon, potentially leading the system to an unstable mode. For simplicity, let the origin (x, u) = (0, 0) be an equilibrium point for the dynamic network x(t+1) = Ax(t) + Bu(t). It is important to mention that the stabilization conditions assume that the prediction model is perfect, x̂(t+k|t) = x(t+k), and that a global optimum is found for the optimization problems.

The two main strategies for closed-loop stability of MPC are terminal constraints and infinite horizons (Maciejowski, 2002). The terminal constraint strategy drives the final state to the origin, that is, it introduces the constraint x̂_m(t+K|t) = 0 for all m ∈ M. Then a positive-definite objective function and these terminal constraints ensure closed-loop stability of the network. Notice that this strategy would couple the sub-systems in more complex ways than the local constraint structure given by Eqs. (10d) and (10e). Instead, the penalty term (1/ε)‖x̂_m(t+K|t)‖² for all m ∈ M can be introduced in the objective to retain the local structure of the network. Notice that x̂_m(t+K|t) tends to 0 as ε → 0.

An infinite prediction horizon also ensures closed-loop stability. As the optimizations would not be in finite-dimensional space, the infinite horizon problem has to be expressed in terms of a finite set of control variables. If the network is intrinsically stable (all eigenvalues of A are inside the unit disc), this strategy introduces a terminal cost Σ_{m∈M} x̂_m(t+K|t)′ W_m x̂_m(t+K|t), where W_m = Σ_{i=0}^{∞} (A_m^i)′ Q_m A_m^i is convergent because A_m is stable. For an unstable plant, W_m does not converge and the unstable modes must be forced to zero at the end of the prediction. The network state is decomposed in terms of stable (x_m^s = A_m^s x_m) and unstable (x_m^u = A_m^u x_m) modes via a Jordan decomposition. P(t) is augmented with a terminal constraint x̂_m^u(t+K|t) = 0 for all m ∈ M and a terminal cost Σ_{m∈M} x̂_m^s(t+K|t)′ W_m^s x̂_m^s(t+K|t), where W_m^s is obtained similarly to W_m. The terminal constraints couple the sub-systems through the constraint set, but they can be approximated with a penalty term, as explained above, to preserve the local structure of P(t).
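The terminal weight W_m need not be computed as an infinite sum: it is the fixed point of the discrete Lyapunov recursion W ← Q_m + A_m′ W A_m, which converges whenever A_m is stable. A minimal sketch for one sub-system:

```python
def terminal_weight(A, Q, tol=1e-12, max_iter=100000):
    """Fixed point of W = Q + A'WA, i.e. W = sum_{i>=0} (A^i)' Q A^i.

    A and Q are square matrices given as lists of rows; A must have all
    eigenvalues inside the unit disc for the iteration to converge.
    """
    n = len(A)

    def mul(X, Y):
        return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    At = [[A[j][i] for j in range(n)] for i in range(n)]   # A transposed
    W = [row[:] for row in Q]                              # W_0 = Q
    for _ in range(max_iter):
        AtWA = mul(mul(At, W), A)
        W_new = [[Q[i][j] + AtWA[i][j] for j in range(n)] for i in range(n)]
        diff = max(abs(W_new[i][j] - W[i][j]) for i in range(n) for j in range(n))
        W = W_new
        if diff < tol:
            return W
    raise RuntimeError("iteration did not converge; is A stable?")

# Scalar sanity check: A = [[0.5]], Q = [[1.0]] gives W = 1/(1 - 0.25) = 4/3.
W = terminal_weight([[0.5]], [[1.0]])
```

Each agent can run this recursion off-line on its own A_m and Q_m, so the terminal cost does not compromise the local structure either.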

All in all, the distributed agents can implement the terminal cost strategy if the open-loop plant is stable, or otherwise introduce terminal constraints on the unstable modes while enforcing terminal costs on the stable modes. The agents enforce terminal constraints approximately using penalty terms. Regardless of the strategy or combination thereof, the agents can ensure closed-loop stability without compromising the local structure of {P_m(t)}.

4. Simulation analysis

Fig. 4 shows the urban traffic network that served as a test bed for simulations with the multi-agent model predictive control strategy and the standard TUC two-stage regulator. First, both strategies are evaluated through a numerical analysis based on the nominal network model. Besides a comparison of these strategies, the study included an analysis of the convergence of the solutions produced by multi-agent MPC to the optimal solution obtained with centralized MPC. Second, simulations were conducted with professional traffic simulation software to assess the behavior of both strategies in a more realistic scenario subject to model discrepancies. Third, the test bed network was expanded by adding two junctions, four state variables, and four control signals to illustrate the flexibility and scalability of multi-agent MPC.

4.1. Network specification

The test bed network was designed to represent an urban perimeter traversed by high-flow avenues, providing a convenient scenario for split control evaluation. Nevertheless, the complexity of the network is influenced by other variables, including cycle time and offset between junctions. The network consists only of one-way links to diminish the influence of these control parameters and the network specification on the performance metrics. Further, offset control is not implemented and the cycle time is defined as a multiple of the shortest Webster cycle to balance the internal streams of vehicles. To mitigate the influence of the network specification and the uncontrolled variables (offset and cycle time), three scenarios were appraised.

4.1.1. Scenario I: distinct cycles (C≠)

Cycle times and nominal splits were computed through a method known as Webster's procedure (Webster, 1959), which yields optimal cycle times and signaling splits for isolated junctions. The procedure is summarized by the equations below:

  C_j = (1.5 L_j + 5) / (1 − Σ_{k∈I_j} q_k^N / S_k)   and   u_{j,i} = (q_i^N / S_i)(C_j − L_j) / (Σ_{k∈I_j} q_k^N / S_k),   for all i ∈ F_j

where C_j is the cycle of junction j; L_j denotes the lost time of the same junction; q_i^N is the nominal inflow to link i in vehicles per hour; S_i is the saturation flow of link i; I_j is the set of input links of junction j; u_{j,i} is the nominal green time allocated to phase i of junction j; and F_j is the set of phases of the controlled junction j.
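Webster's two formulas translate directly into a short routine; the junction data below (lost time, inflows, saturation flows) are made-up numbers for illustration:

```python
def webster(L, q, S):
    """Webster's procedure for an isolated junction.

    L: lost time of the junction (s); q[i], S[i]: nominal inflow and
    saturation flow (veh/h) of each phase i.  Returns the cycle time C
    and the green time u[i] allocated to each phase (s).
    """
    Y = sum(qi / Si for qi, Si in zip(q, S))       # total flow ratio
    C = (1.5 * L + 5.0) / (1.0 - Y)                # optimal cycle time
    u = [(qi / Si) * (C - L) / Y for qi, Si in zip(q, S)]
    return C, u

# Illustrative two-phase junction: L = 10 s, q = (1000, 800) veh/h, S = 3600 veh/h.
C, u = webster(10.0, [1000.0, 800.0], [3600.0, 3600.0])   # → C = 40.0 s
```

Note that the green times add up to C − L, so the effective green exactly fills the cycle after the lost time is discounted.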

The cycle times and splits resulting from the application of Webster's procedure appear in Table 1. The cycles and splits are not optimal since the junctions are not isolated and operate synchronously (their offset is zero). In fact, vehicle progression is erratic and difficult since the junctions have distinct cycles. In this scenario, the high traffic inflows concentrated in the main avenues x2 and x8 make progression even more difficult.

4.1.2. Scenario II: equal cycles (C=)

Cycle times were set to 120 s, providing a harmonic progression of vehicles and minimizing the undesirable effects of the lack of synchronization. For this scenario, the traffic inflows were more balanced to avoid oscillations in the internal flows and thereby diminish the effects of synchronization. With the given cycle time, the nominal splits were obtained by Webster's procedure. The nominal traffic control parameters are presented in Table 2.

Table 1
Nominal parameters of the distinct cycles scenario.

Link z   Control u_{j,z}   Saturation S_z (veh/h)   Nominal inflow q_z^N (veh/h)   Nominal split u^N_{j,z} (s)   Cycle C (s)
x1       u_{1,1}           3600                     1000                           58.0                          192.0
x2       u_{1,2}           3600                     1100                           63.8                          192.0
x3       u_{1,3}           3600                      900                           52.2                          192.0
x4       u_{2,1}           3600                        -                           46.7                          132.6
x5       u_{2,2}           3600                        -                           73.9                          132.6
x6       u_{3,1}           1800                        -                           26.3                           81.9
x7       u_{3,2}           3600                        -                           43.6                           81.9
x8       u_{4,1}           3600                     1800                           89.2                          165.6
x9       u_{4,2}           3600                     1300                           64.4                          165.6
x10      u_{5,1}           3600                        -                           50.9                           91.7
x11      u_{5,2}           1800                        -                           28.8                           91.7
x12      u_{6,1}           3600                        -                           75.6                          131.3
x13      u_{6,2}           3600                        -                           43.7                          131.3

132

Table 2
Nominal parameters of the equal cycles scenario.

Link z   Control u_{j,z}   Saturation S_z (veh/h)   Nominal inflow q_z^N (veh/h)   Nominal split u^N_{j,z} (s)   Cycle C (s)
x1       u_{1,1}           3600                      800                           28.8                          120
x2       u_{1,2}           3600                     1300                           46.8                          120
x3       u_{1,3}           3600                      900                           32.4                          120
x4       u_{2,1}           3600                        -                           72.5                          120
x5       u_{2,2}           3600                        -                           39.5                          120
x6       u_{3,1}           1800                        -                           54.9                          120
x7       u_{3,2}           3600                        -                           57.1                          120
x8       u_{4,1}           3600                      900                           63.0                          120
x9       u_{4,2}           3600                      700                           49.0                          120
x10      u_{5,1}           3600                        -                           59.8                          120
x11      u_{5,2}           1800                        -                           52.2                          120
x12      u_{6,1}           3600                        -                           54.7                          120
x13      u_{6,2}           3600                        -                           57.3                          120

Table 3
Nominal parameters of the equal cycles scenario with crash simulation.

Link z   Control u_{j,z}   Saturation S_z (veh/h)   Nominal inflow q_z^N (veh/h)   Nominal split u^N_{j,z} (s)   Cycle C (s)
x1       u_{1,1}           3600                      800                           28.8                          120
x2       u_{1,2}           3600                     1300                           46.8                          120
x3       u_{1,3}           3600                      900/0/1500                    32.4                          120
x4       u_{2,1}           3600                        -                           72.5                          120
x5       u_{2,2}           3600                        -                           39.5                          120
x6       u_{3,1}           1800                        -                           54.9                          120
x7       u_{3,2}           3600                        -                           57.1                          120
x8       u_{4,1}           3600                      900                           63.0                          120
x9       u_{4,2}           3600                      700                           49.0                          120
x10      u_{5,1}           3600                        -                           59.8                          120
x11      u_{5,2}           1800                        -                           52.2                          120
x12      u_{6,1}           3600                        -                           54.7                          120
x13      u_{6,2}           3600                        -                           57.3                          120

4.1.3. Scenario III: equal cycles with car crash (C=/crash)

This scenario has the same characteristics as the previous one, except for the simulation of a car crash in link x3. The incident occurs at the 15th minute of simulation and blocks the link for 15 min, temporarily suspending traffic flow through the link. When the link is unblocked at the 30th minute of simulation, the inflow of link x3 reaches a rate higher than the nominal rate for the remainder of the simulation because of the accumulation of vehicles during the incident. Table 3 presents the values for this scenario.

4.1.4. Remarks

Table 4 shows the turning rates, which are common to all scenarios.⁶ The first column gives the origin link of the turning movement, while the remaining columns define the destination links. The data characterizing a scenario and the turning rates are sufficient to determine the matrix B (see Section 2) and thereby obtain the dynamic system x(t+1) = Ax(t) + Bu(t). Scenarios II and III share the same dynamic system but differ in the input demand f(k), which simulates the suspension of traffic flow on link x3 for 15 min in scenario III. A comparison between scenarios I and II aims to verify if one of the control strategies is more suitable for an erratic progression (distinct cycle times) or a smoother progression (equal cycle times). A comparison between scenarios II and III seeks to assess the robustness of the control strategies when the demands deviate drastically from the nominal demands.

4.2. Numerical results

This section presents results from numerical simulation using Eq. (4) as a model for the traffic system. The simulation can be implemented with scientific computation software, such as MATLAB and SCILAB, or even programming languages such as PYTHON and C. The network's actual state is calculated at each interval based on the given initial conditions, the previous state, and the discrete model. To make the control design model different from the simulation model, a disturbance was introduced in the simulation model.

⁶ Turning rates are not reported for x4, x5, x12, and x13 because they are exit links.

Table 4
Nominal turning rates s_{w,j} for the test bed network. The origin links w are x1, x2, x3, x6, x7, x8, x9, x10, and x11; the destination links j range over x1 to x13. The nonzero rates are 0.20, 0.25, 0.65, 0.50, 0.80, 0.05, 0.30, 0.05, 0.40, 0.60, 0.60, 0.40, 0.05, 0.30, 0.05, 0.80, 0.50, 0.70, 0.15, and 0.15.

  x(t+1) = Ax(t) + Bu(t) + f(t)    (19)

where the term f(t) represents the inflows of input links, that is, vehicles entering the network. In the network of Fig. 4, the set of inflow links is {1, 2, 3, 8, 9}. Fig. 6 presents the flowchart of the numerical simulation.
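The flowchart of Fig. 6 amounts to iterating Eq. (19); a minimal sketch with made-up two-link matrices (the real A and B come from the turning rates and saturation flows):

```python
def simulate(A, B, x0, controls, inflows):
    """Roll the disturbed model x(t+1) = A x(t) + B u(t) + f(t) forward.

    A, B: model matrices as lists of rows; x0: initial queues;
    controls[t], inflows[t]: control vector and inflow vector at step t.
    Returns the state trajectory [x(0), x(1), ...].
    """
    traj = [list(x0)]
    for u, f in zip(controls, inflows):
        x = traj[-1]
        x_next = [sum(A[i][j] * x[j] for j in range(len(x)))
                  + sum(B[i][j] * u[j] for j in range(len(u)))
                  + f[i]
                  for i in range(len(x))]
        traj.append(x_next)
    return traj

# Two links, one control signal, two steps (all numbers illustrative).
A = [[1.0, 0.0], [0.3, 1.0]]   # part of link 1's outflow feeds link 2
B = [[-2.0], [0.0]]            # green time discharges link 1
traj = simulate(A, B, [10.0, 5.0], [[1.0], [1.0]], [[4.0, 0.0], [4.0, 0.0]])
# → [[10.0, 5.0], [12.0, 8.0], [14.0, 11.6]]
```

At each step the controller (TUC LQR or multi-agent MPC) would compute controls[t] from traj[-1] before the state is advanced.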

Note, however, that this simulation is a rough representation of real traffic behavior and is best used for control design. For instance, the model does not allow vehicles to cross two intersections in the same control interval. Another limitation is the assumption that queue lengths are sufficiently large and downstream links are not obstructed, so that the outflow of a link with right of way is approximated by its saturation flow. This assumption does not hold when few vehicles are waiting at the stop line of a road, say x_z, which feeds another queue, say x_w.

The scenario chosen for this experiment is the one with distinct cycles. Since the model ignores the interactions between junctions, the given cycles and splits can be regarded as optimal and there is no need to replicate the experiments for the other scenarios. The inflows of the network, f(t), are defined by a Gaussian function centered at the middle of the simulation time, with an initial value equal to the nominal inflows and a peak that doubles the nominal values.

The network was simulated for approximately 2 h, namely T = 40 simulation steps with a control interval of ΔT = 200 s. The impact of the prediction horizon on multi-agent MPC is evaluated for horizons ranging from 1 to 5 steps. Furthermore, 10 random initial conditions were considered to increase the reliability of the analysis. The initial state of each link was drawn at random in the range from 0 to 500 vehicles for each initial condition.


Fig. 7. Mean accumulated cost over 40 simulation steps for a set of 10 random initial conditions.

Because model (19) is not ideal for traffic representation, the following accumulated cost was chosen as the objective function and comparative metric:

  J_exp = Σ_{t=0}^{T} [x(t)′Q x(t) + Δu(t)′R Δu(t)]    (20)

where T is the number of simulation steps; Δu(t) = u^N − u(t) is the deviation from the nominal control signals; Q = I is an identity matrix weighing the states; and R = 0.003 I is a matrix weighing control deviation. The values assumed for the weighting matrices are typical of other papers on TUC control (Diakaki, 1999; Carlson et al., 2006). The objective J_exp simultaneously minimizes queues in a balanced way, with the quadratic term ‖x(t)‖²_Q, and control deviation from a nominal fixed-time control policy u^N, with the quadratic term ‖Δu(t)‖²_R = ‖u(t) − u^N‖²_R. The traffic engineer experimentally sets the parameter r defining the control-cost matrix R = rI and thereby the trade-off rate between the two objectives. Nominal splits for the experiments appear in Tables 1-3.

A stop criterion of relative tolerance was selected for multi-agent MPC. Let Ĵ_exp(t) = Σ_{k=1}^{K} [x̂(t+k|t)′Q x̂(t+k|t) + Δû(t+k−1|t)′R Δû(t+k−1|t)] be the objective function for MPC over the prediction horizon, that is, the objective of P(t). The distributed agents iterate until ‖Ĵ_exp^(k+1)(t) − Ĵ_exp^(k)(t)‖ / ‖Ĵ_exp^(k)(t)‖ < ρ, where ρ is the tolerance, {Δû(t)^(k)} is the sequence of iterates produced by the agents, and k is the iteration counter. Such a criterion is satisfied when the relative decrease in the objective function becomes insignificant.
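The stop test reduces to comparing successive objective values; the iterate sequence below is illustrative:

```python
def converged(J_prev, J_curr, rho=1e-3):
    """Relative-tolerance stop test: |J^(k+1) - J^(k)| / |J^(k)| < rho."""
    return abs(J_curr - J_prev) / abs(J_prev) < rho

# Objective values produced by successive agent iterations (made-up numbers).
J = [120.0, 90.0, 84.0, 83.95]
stops = [converged(a, b) for a, b in zip(J, J[1:])]
# → [False, False, True]: the agents stop after the fourth iterate
```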

Fig. 7 shows the mean accumulated cost J_exp over 10 simulation runs with different initial conditions. These results corroborate the efficiency of multi-agent MPC. For a prediction horizon K = 5, multi-agent MPC achieves a performance increase of approximately 10% in comparison to the TUC LQR approach. For long horizons, the changes in control signals are more subtle and so are the variations in the objective function, as shown in the figure. For short horizons, the relative distance between the multi-agent MPC solution and the centralized solution becomes more pronounced, especially for high tolerances. Junctions with high influence on the network, such as junction 1, induce a large cost reduction that, compared to the reduction from less influential junctions, can trigger the stop criterion far from the optimal point.

4.3. Simulation results

Aiming to circumvent the limitations of the numerical analysis, the three scenarios were modeled in AIMSUN version 6, a professional traffic simulator (Barceló and Casas, 2002). The performance results from these simulations are more reliable as the traffic dynamics are modeled more accurately.

Eq. (20) remains the objective function for computing the gain matrix L of the TUC strategy and for multi-agent MPC. Matrix Q was the identity, whereas the control deviation matrix R was either R1 = 0.003 I or R2 = I. All scenarios share the same control interval ΔT = 200 s and a duration of approximately 1 h. Further, equal prediction and control horizons of length K ∈ {1, 3} were used for multi-agent MPC. Although they seem small at first, such sliding horizons are in accordance with the dynamics of interest in the process: the proposed control interval is 200 s long, which is larger than the longest cycle time, thereby configuring an adequate control horizon.


Fig. 8. AIMSUN simulation model of the test bed network.

Table 5
Simulation results with the R1 matrix for all scenarios.

                         Travel time (s/km)          Density (veh/km)
Scenario                 Mean       Std. dev.        Mean       Std. dev.
TUC LQR       C≠         241.23     3.15             29.51      0.67
              C=         189.89     0.75             18.57      0.23
              C=/crash   193.06     2.72             19.14      2.74
M-MPC K=1     C≠         240.42     6.43             29.59      0.97
              C=         189.85     0.96             18.57      0.09
              C=/crash   192.09     1.80             19.06      2.44
M-MPC K=3     C≠         465.66     55.38            53.57      4.74
              C=         208.21     2.68             20.30      0.27
              C=/crash   205.77     18.83            20.55      3.86

Because state variables are not readily available in AIMSUN,⁷ inductive loop detectors were inserted at the entrance and at the stop line of the controlled links. The number of vehicles that have entered but not left a link is then obtained by subtracting the measurements of the latter detector from those of the former.

Fig. 8 depicts the AIMSUN simulation model of the test bed network. A set of ten replications with different seeds was simulated for each scenario. Tables 5 and 6 report the results achieved by multi-agent MPC (M-MPC) and the TUC LQR strategy for matrices R1 and R2, respectively. The results encompass the scenarios of distinct cycle times (C≠), identical cycle times (C=), and identical cycle times with a car crash (C=/crash).

With control-cost matrix R1, the difference between the performance of multi-agent MPC with a unitary control horizon (K = 1) and the TUC LQR approach is not statistically significant. On the other hand, the multi-agent MPC performance is inferior with a prediction horizon of three steps, corroborating the hypothesis that the predictions from the traffic flow model given in Eq. (4) might be significantly wrong. This observation is reinforced by the lack of performance degradation in the numerical experiments, in which the predictions match the actual model.

With control-cost matrix R2, the results are slightly favorable to multi-agent MPC, but not statistically significant, when the length of the prediction horizon is K = 1. The TUC LQR approach achieves better performance than multi-agent MPC when K = 3.

⁷ URL: http://www.aimsun.com.


Table 6
Simulation results with the R2 matrix for all scenarios.

                         Travel time (s/km)          Density (veh/km)
Scenario                 Mean       Std. dev.        Mean       Std. dev.
TUC LQR       C≠         240.87     2.73             29.63      0.56
              C=         189.03     0.59             18.46      0.24
              C=/crash   192.38     2.70             18.97      2.64
M-MPC K=1     C≠         237.82     3.03             29.35      0.37
              C=         188.74     0.80             18.47      0.06
              C=/crash   191.60     2.47             19.05      2.53
M-MPC K=3     C≠         311.64     31.32            37.56      3.24
              C=         199.04     2.36             19.40      0.21
              C=/crash   202.22     6.45             20.07      0.29

A comparison between Tables 5 and 6 indicates that the performance of all control strategies was slightly better when R = R2.

4.4. Multi-agent MPC reconfigurability

To demonstrate that multi-agent MPC can be reconfigured with ease, two junctions were added to the test bed network as depicted in Fig. 9. The inclusion of the new junctions takes place in two phases, first introducing sub-system 7 and then sub-system 8. The introduction of junction 7 expands the neighborhood of junction 6 from the set N(6) = {1, 5} to N(6) = {1, 5, 7}. As a consequence, new terms are included in agent 6's objective function to account for the influence of the control signals at junction 6 on the state of junction 7. No change is required in any other junction.

Initially, the neighborhood of junction 7 consists only of junction 6, configuring an easily implementable sub-system. In the form of Eq. (16a), the objective function of agent 7 is given by:

  H_7(t) = H_{777} = B_{77}′ Q_7 B_{77} + R_7
  g_7(t) = (1/2)(H_{767}′ + H_{776}) û_6(t) + B_{77}′ Q_7 A_7 x_7(t) = B_{77}′ Q_7 B_{76} û_6(t) + B_{77}′ Q_7 A_7 x_7(t)

The addition of junction 8 is very similar to that of junction 7. This time, the introduction of junction 8 expands the neighborhood of junction 7 from the set N(7) = {6} to N(7) = {6, 8}. As a consequence, agent 7's objective function must be updated to account for the influence on the state of junction 8.


Table 7
Simulation results of the expanded test bed network.

                         Travel time (s/km)          Density (veh/km)
Scenario                 Mean       Std. dev.        Mean       Std. dev.
M-MPC K=1     C=         200.07     0.76             19.63      0.05

With junction 8 included, the terms of agent 7's objective function become:

  H_7(t) = H_{777} + H_{877} = B_{77}′ Q_7 B_{77} + R_7 + B_{87}′ Q_8 B_{87}
  g_7(t) = (1/2)(H_{767}′ + H_{776}) û_6(t) + (1/2)(H_{887}′ + H_{878}) û_8(t) + g_{77}(t) + g_{87}(t)
         = B_{77}′ Q_7 B_{76} û_6(t) + B_{77}′ Q_7 A_7 x_7(t) + B_{87}′ Q_8 B_{88} û_8(t) + B_{87}′ Q_8 A_8 x_8(t)

The sub-problem of agent 8 is actually fairly simple, since junction 7 is its sole neighboring sub-system:

  g_8(t) = (1/2)(H_{878}′ + H_{887}) û_7(t) + g_{88}(t) = B_{88}′ Q_8 B_{87} û_7(t) + B_{88}′ Q_8 A_8 x_8(t)

At this point, the system is already configured with the newly added junctions. The reconfiguration process is summarized in the following steps:

(1) statistically gather the parameters of the new junction(s);
(2) determine the neighborhood of the added intersection(s); and
(3) revise the objective function of the junctions belonging to that neighborhood and determine the objective function of the new sub-system(s) according to Eq. (16a).

The parameters necessary to put together the simulation scenario are:

  the turning rates, s_{12,14} = 0.6, s_{13,14} = 0.4, s_{14,16} = 0.5, and s_{15,16} = 0.5;
  the saturation flow, 3600 veh/h for links x14, x15, x16, and x17;
  the nominal splits, u^N_{7,1} = u^N_{7,2} = u^N_{8,1} = u^N_{8,2} = 54 s; and
  the inflow for links x15 and x17, 800 veh/h.

For the purpose of illustration, the AIMSUN equal cycles scenario (C=) was modified to encompass junctions 7 and 8. The results from the simulations appear in Table 7 for a prediction horizon of one step and R = 3 × 10⁻³ I.

To provide a clear comparison with the LQR process of reconfiguration, the steps needed to include the two junctions above are listed below:

(1) statistically gather the parameters of junction 7;
(2) include the new data in the global matrices A, B, Q, and R;
(3) compute the new control matrix L;
(4) modify all the parameters of the control matrix;
(5) statistically gather the parameters of junction 8;
(6) include the new data in the global matrices A, B, Q, and R;
(7) compute the new control matrix L;
(8) modify all the parameters of the control matrix; and
(9) set up new procedures for recovering feasibility of control signals.

Although the number of steps involved is similar, the inclusion of a new junction in the LQR control scheme requires modification of the control laws of all junctions. As network complexity increases, this task becomes not only arduous but also error prone, as the parameters must be manually input.

5. Summary and future work

The operation of large dynamic systems remains a challenge in control engineering, to a great extent due to their sheer size, intrinsic complexity, and nonlinear behavior (Tatara et al., 2005, 2007). Recently, control engineers have turned their attention to multi-agent systems for their composite nature, flexibility, and scalability. To this end, this paper contributed to this evolving technology with a framework for multi-agent control of linear dynamic networks, which are obtained from the interconnection of sub-systems that become dynamically coupled but otherwise have local constraints.


Of particular interest to this paper is the signaling split control of traffic flow modeled by store-and-forward equations. Such a model leads to a linear dynamic network of sub-systems matching the traffic junctions. The state variables are the numbers of vehicles in the roads leading to each junction, while the control signals are the green times given to each of their stages. The signaling split control entails solving a constrained, infinite-time, linear-quadratic-regulator problem (Diakaki et al., 2002): the quadratic cost seeks to minimize queue lengths and deviation from nominal signals; the constraints ensure that the green times add up to the cycle time and are within bounds; and the linear dynamics result from the store-and-forward traffic flow model.

The TUC approach uses a feedback control law for signaling split, whereby a static feedback matrix is computed off-line with the LQR technique and a quadratic program is solved on-line to recover split feasibility. On the other hand, model predictive control handles constraints in a systematic way by using a finite-time rolling horizon and solving optimization problems on-line. To cope with large networks and allow distributed reconfiguration, this paper proposed a decomposition of the MPC problem into a set of locally coupled sub-problems that are iteratively solved by a network of distributed agents. The iterates produced by these distributed agents are drawn towards a globally optimal solution if the agents synchronize their work. The purpose of the experiments was threefold. First, the numerical analysis aimed to demonstrate the convergent behavior of the multi-agent system and compare its speed with that of an ideal, centralized agent that solves the overall MPC problem. Second, the simulation analysis showed that multi-agent model predictive control can achieve performance comparable to the TUC approach in representative scenarios implemented with the AIMSUN simulator. And third, the experiments illustrated the flexibility of the multi-agent MPC framework by introducing two additional controlled junctions, which required only the reconfiguration of the control agent at the neighboring junction.

The research reported heretofore is multidisciplinary, with contributions across the fields of multi-agent technology, optimization, and urban traffic control. Further improvements will be pursued along the following directions:

- numerical and simulated studies with very large networks, aimed to confirm the potential of the multi-agent MPC framework;
- the formulation and application of traffic models that more accurately represent traffic flow (Aboudolas et al., 2007); and
- the formal extension of the multi-agent framework to handle constraints on state variables.

References

Aboudolas, K., Papageorgiou, M., Kosmatopoulos, E., 2007. Control and optimization methods for traffic signal control in large-scale congested urban road networks. In: Proceedings of the American Control Conference, New York, USA, pp. 3132–3138.
Balan, G., Luke, S., 2006. History-based traffic control. In: AAMAS '06: Proceedings of the 5th International Joint Conference on Autonomous Agents and Multiagent Systems, ACM, New York, NY, USA, pp. 616–621.
Barceló, J., Casas, J., 2002. Dynamic network simulation with Aimsun. In: Proceedings of the International Symposium on Transport Simulation. <http://www.aimsun.com/site/content/view/35/50/>.
Bertsekas, D.P., 1995. Nonlinear Programming. Athena Scientific, Belmont, MA.
Bielefeldt, C., Diakaki, C., Papageorgiou, M., 2001. TUC and the SMART NETS project. In: Proceedings of the International IEEE Conference on Intelligent Transportation Systems, Oakland, CA, USA, pp. 55–60.
Camacho, E.F., Bordons, C., 2004. Model Predictive Control. Springer-Verlag.
Camponogara, E., de Oliveira, L.B., 2009. Distributed optimization for model predictive control of linear dynamic networks. Accepted by IEEE Transactions on Systems, Man, and Cybernetics – Part A. <http://www.das.ufsc.br/~camponog/papers/dmpc-tuc.pdf>.
Camponogara, E., Talukdar, S.N., 2004. Designing communication networks for distributed control agents. European Journal of Operational Research 153 (3), 544–563.
Camponogara, E., Talukdar, S.N., 2005. Designing communication networks to decompose network control problems. INFORMS Journal on Computing 17 (2), 207–223.
Camponogara, E., Talukdar, S.N., 2007. Distributed model predictive control: synchronous and asynchronous computation. IEEE Transactions on Systems, Man, and Cybernetics – Part A 37 (5), 732–745.
Camponogara, E., Jia, D., Krogh, B.H., Talukdar, S.N., 2002. Distributed model predictive control. IEEE Control Systems Magazine 22 (1), 44–52.
Camponogara, E., Zhou, H., Talukdar, S.N., 2006. Altruistic agents in uncertain, dynamic games. Journal of Computer & Systems Sciences International 45, 536–552.
Carlson, R.C., Kraus Junior, W., Camponogara, E., 2006. Combining the TUC urban traffic control strategy with bandwidth maximisation control in transportation systems. In: Proceedings of the 11th IFAC Symposium on Control in Transportation Systems.
de Oliveira, L.B., 2008. Otimização e controle distribuído de frações de verde em malhas veiculares urbanas. Master's thesis, Graduate Program in Electrical Engineering, Federal University of Santa Catarina, in Portuguese.
de Oliveira, L.B., Camponogara, E., 2007. Predictive control for urban traffic networks: initial evaluation. In: Proceedings of the 3rd IFAC Symposium on System, Structure and Control, Iguassu Falls, Brazil.
de Oliveira, D., Bazzan, A.L.C., Lesser, V., 2005. Using cooperative mediation to coordinate traffic lights: a case study. In: AAMAS '05: Proceedings of the 4th International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 463–470.
Diakaki, C., 1999. Integrated control of traffic flow in corridor networks. Ph.D. thesis, Department of Production Engineering and Management, Technical University of Crete, Greece.
Diakaki, C., Papageorgiou, M., 1997. Partners of the Project TABASCO, Urban Integrated Traffic Control Implementation Strategies. Tech. Rep. Project TABASCO (TR1054), Transport Telematics Office, Brussels, Belgium (September 1997).
Diakaki, C., Papageorgiou, M., Aboudolas, K., 2002. A multivariable regulator approach to traffic-responsive network-wide signal control. Control Engineering Practice 10 (2), 183–195.
Gazis, D.C., Potts, R.B., 1963. The oversaturated intersection. In: Proceedings of the Second International Symposium on Traffic Theory, pp. 221–237.
Hunt, P.B., Robertson, D.I., Bretherton, R.D., Winton, R.I., 1981. SCOOT – a traffic responsive method of coordinating signals. Tech. rep., Transport Research Laboratory, Crowthorne, England.
Jennings, N., 2000. On agent-based software engineering. Artificial Intelligence 117, 277–296.
Kosmatopoulos, E., Papageorgiou, M., Bielefeldt, C., Dinopoulou, V., Morris, R., Mueck, J., Richards, A., Weichenmeier, F., 2006. International comparative field evaluation of a traffic-responsive signal control strategy in three cities. Transportation Research Part A: Policy and Practice 40 (5), 399–413.
Kühne, F., 2005. Controle preditivo de robôs móveis não holonômicos. Master's thesis, Graduate Program in Electrical Engineering, Federal University of Rio Grande do Sul, Brazil, in Portuguese.
Li, S., Zhang, Y., Zhu, Q., 2005. Nash-optimization enhanced distributed model predictive control applied to the Shell benchmark problem. Information Sciences 170 (2–4), 329–349.
Lowrie, P.R., 1982. The Sydney co-ordinated adaptive traffic system – principles, methodology and algorithms. In: Proceedings of the IEE International Conference on Road Traffic Signalling, London, pp. 67–70.
Maciejowski, J.M., 2002. Predictive Control with Constraints. Prentice Hall.
Manikonda, V., Levy, R., Satapathy, G., Lovell, D.J., Chang, P.C., Teittinen, A., 2001. Autonomous agents for traffic simulation and control. Transportation Research Record 1774, 1–10.
Maturana, F.P., Staron, R.J., Hall, K.H., 2005. Methodologies and tools for intelligent agents in distributed control. IEEE Intelligent Systems 20 (1), 42–49.
Negenborn, R.R., Schutter, B.D., Hellendoorn, J., 2008. Multi-agent model predictive control for transportation networks: serial versus parallel schemes. Engineering Applications of Artificial Intelligence 21 (3), 353–366.
Nguyen-Duc, M., Guessoum, Z., Marin, O., Perrot, J.-F., Briot, J.-P., Duong, V., 2008. Towards a reliable air traffic control. In: AAMAS '08: Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 101–104.
Papageorgiou, M., 2004. Overview of road traffic control strategies. In: Information and Communication Technologies: From Theory to Applications, pp. LIX–LLX.
Pěchouček, M., Šišlák, D., Pavlíček, D., Uller, M., 2006. Autonomous agents for air-traffic deconfliction. In: AAMAS '06: Proceedings of the 5th International Joint Conference on Autonomous Agents and Multiagent Systems, ACM, New York, NY, USA, pp. 1498–1505.
Rigolli, M., Brady, M., 2005. Towards a behavioural traffic monitoring system. In: AAMAS '05: Proceedings of the 4th International Joint Conference on Autonomous Agents and Multiagent Systems, ACM, New York, NY, USA, pp. 449–454.
Robertson, D.I., 1969. TRANSYT: a traffic network study tool. Tech. rep., Transport Research Laboratory, Crowthorne, England.
Robertson, D.I., Bretherton, R.D., 1991. Optimizing networks of traffic signals in real time – the SCOOT method. IEEE Transactions on Vehicular Technology 40 (1), 11–15.
Srinivasan, D., Choy, M.C., 2006. Cooperative multi-agent system for coordinated traffic signal control. IEE Proceedings – Intelligent Transport Systems 153 (1), 41–49.
Tatara, E., Birol, I., Teymour, F., Çinar, A., 2005. Agent-based control of autocatalytic replicators in networks of reactors. Computers & Chemical Engineering 29, 807–815.
Tatara, E., Çinar, A., Teymour, F., 2007. Control of complex distributed systems with distributed intelligent agents. Journal of Process Control 17, 415–427.
Tomás, V.R., Garcia, L.A., 2005. A cooperative multiagent system for traffic management and control. In: AAMAS '05: Proceedings of the 4th International Joint Conference on Autonomous Agents and Multiagent Systems, ACM, New York, NY, USA, pp. 52–59.
Tumer, K., Agogino, A., 2007. Distributed agent-based air traffic flow management. In: AAMAS '07: Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems, ACM, New York, NY, USA, pp. 1–8.
Webster, F.V., 1959. Traffic signal settings. Tech. Rep. 39, Road Research Laboratory, London, UK.
Wooldridge, M., 2002. An Introduction to MultiAgent Systems. John Wiley & Sons Ltd.
Yamashita, T., Izumi, K., Kurumatani, K., Nakashima, H., 2005. Smooth traffic flow with a cooperative car navigation system. In: AAMAS '05: Proceedings of the 4th International Joint Conference on Autonomous Agents and Multiagent Systems, ACM, New York, NY, USA, pp. 478–485.
