I. INTRODUCTION
Our work advances the state of the art in the following ways.
First, we develop an agent bidding strategy for obtaining goods
in multiple overlapping English auctions. A practical algorithm
is described and, through empirical evaluation, shown to be
effective in a wide range of situations. The strategy adapts to
its prevailing circumstances through a mixture of off-line and
on-line learning. Second, by exploiting a fuzzy set representation
of the user's preferences, the strategy is able to make trade-offs
between the various attributes of the goods it purchases and cope
with the inherent imprecision/flexibility that often characterises
a user's preferences.
The rest of this paper is structured as follows. Section II
describes the bidding algorithm and how the fuzzy neural network
(FNN) operates. Section III gives an example of the FNN strategy
in operation in a flight auction scenario. Section IV provides a
systematic empirical evaluation of the strategy and benchmarks it
against a number of strategies that have been proposed in the
literature. Finally, Section V concludes.
PROCEDURE FNN_run()
1: data ← {}
2: holding ← false
3: while holding = false or AuctionRunning() do
4:   new_data ← Update()
5:   data ← {data, new_data}
6:   data ← Trim(data)
7:   FNN_train(data)
8:   if Success() then
9:     Return
10:  end if
11:  if Hold() = false then
12:    [a_best, s_best] ← FNN_predict()
13:    L ← RunningAuctions()
14:    for all a ∈ L do
15:      if (Evaluate(a) ≥ s_best) or (s_best − Evaluate(a) ≤ ε) then
16:        Bid(a)
17:        break
18:      end if
19:    end for
20:  end if
21: end while
Fig. 1. The FNN agent's bidding algorithm.
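To make the control flow concrete, the following Python sketch mirrors Fig. 1. All of the names here (market, fnn, their methods, and the constants EPSILON and WINDOW) are hypothetical stand-ins for the primitives named in the pseudocode, not an API from the paper.

```python
# Minimal sketch of the bidding loop in Fig. 1. Every helper
# (market.*, fnn.*) is a hypothetical stand-in for the
# corresponding primitive in the pseudocode.

EPSILON = 0.05   # assumed tolerance on the evaluation gap
WINDOW = 500     # assumed size of the retained data window (Trim)

def fnn_run(market, fnn):
    data = []
    holding = False
    while not holding or market.auction_running():
        data.extend(market.update())      # collect new observations
        data = data[-WINDOW:]             # Trim: bound the training set
        fnn.train(data)                   # on-line learning step
        if market.success():              # the good has been obtained
            return
        holding = market.hold()           # currently winning an auction?
        if not holding:
            a_best, s_best = fnn.predict()   # predicted best auction/score
            for a in market.running_auctions():
                # bid in the first auction whose evaluation reaches, or
                # comes within EPSILON of, the predicted best score
                gap = s_best - market.evaluate(a)
                if market.evaluate(a) >= s_best or gap <= EPSILON:
                    market.bid(a)
                    break
```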
TABLE I
THE FNN RULE BASE (RULES R1-R10).
Layer 1: each node in this layer generates the membership degree
of a linguistic label for each input variable (e.g., reference
price is low or there is a large number of auctions left).
Specifically, the ith node performs the following (fuzzification)
operation:
$$O_i^{(1)} = \mu_{A_i}(x) = e^{-\frac{(x - c_i)^2}{2\sigma_i^2}}. \qquad (1)$$

Layer 2: the ith node computes the firing strength of its rule by combining the membership degrees of its antecedents:

$$O_i^{(2)} = w_i = \min_{j \in S_i^{(1)}} \{\mu_{A_j}\}, \qquad (2)$$

where $S_i^{(1)}$ is the set of nodes in layer 1 which feed into node $i$ in layer 2, and $w_i$ is the output of this node (i.e., the strength of the corresponding rule).
Layer 3: the ith node calculates the ratio of the ith rule's firing strength to the sum of all rules' firing strengths:

$$O_i^{(3)} = w'_i = \frac{w_i}{\sum_{j \in S^{(2)}} w_j}. \qquad (3)$$

Layer 4: the output of this layer combines all the outputs of the rules that have the same consequent output:

$$O_i^{(4)} = r_i \sum_{j \in S_i^{(3)}} w'_j, \qquad (4)$$

where $S_i^{(3)}$ is the set of nodes in layer 3 that feed into node $i$ and $r_i$ is the consequent weight of node $i$.

Layer 5: the single node in this layer aggregates the overall output of the FNN (i.e., $p_{close}$) as the summation of all incoming signals:

$$O^{(5)} = \sum_i O_i^{(4)}. \qquad (5)$$
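As an illustration, the five layers collapse into a short forward pass. This is a sketch under our assumptions (Gaussian MFs as in eq. (1), min composition for eq. (2), and one consequent weight per rule standing in for the layer-4 grouping); the data structures are ours, not the paper's.

```python
import numpy as np

def gaussian_mf(x, c, sigma):
    # Layer 1 (eq. 1): membership degree of x in a Gaussian fuzzy set
    return float(np.exp(-((x - c) ** 2) / (2.0 * sigma ** 2)))

def fnn_forward(x, rules, c, sigma, r):
    """Forward pass through layers 1-5.

    x:        input values, one per variable (e.g. ref, nauction, oauction)
    rules:    list of dicts mapping input index -> membership-function index
    c, sigma: c[i][j], sigma[i][j] = centre/width of MF j of input i
    r:        consequent weight of each rule
    """
    # Layer 1: fuzzify every input against each of its linguistic labels
    mu = [[gaussian_mf(x[i], c[i][j], sigma[i][j])
           for j in range(len(c[i]))] for i in range(len(x))]
    # Layer 2 (eq. 2): firing strength of each rule (min composition)
    w = np.array([min(mu[i][j] for i, j in rule.items()) for rule in rules])
    # Layer 3 (eq. 3): normalised firing strengths
    w_norm = w / w.sum()
    # Layers 4-5 (eqs. 4-5): weight by consequents and sum -> p_close
    return float(np.dot(w_norm, np.asarray(r)))
```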
C. FNN Learning
As discussed in section I, the FNN involves two types of
learning: off-line and on-line. In either case, however, the
same basic method is used. Specifically, the learning rule
of the FNN agent is based on gradient descent optimisation
[8]. Given the training data x_i (i = 1, 2, 3), the desired
output value Y, and the fuzzy logic rules, the parameters of
the membership functions for the FNN's input variables are
adjusted by supervised learning. Here the goal is to minimise
the error function E over all training patterns:
$$E = \sum_j \frac{1}{2}\left(Y_j - O_j^{(5)}\right)^2, \qquad (6)$$

where $Y_j$ and $O_j^{(5)}$ are the desired and actual outputs for the $j$th training pattern. Each adjustable parameter $\alpha$ is moved in the direction of steepest descent:

$$\alpha(t+1) = \alpha(t) + \eta\left(-\frac{\partial E}{\partial \alpha}\right), \qquad (7)$$

where $\eta$ is the learning rate. For the centre $c_i$ of a membership function, the chain rule gives

$$\frac{\partial E}{\partial c_i} = \sum_{m \in S_i^{(2)}} \frac{\partial E}{\partial O_m^{(2)}} \frac{\partial O_m^{(2)}}{\partial O_i^{(1)}} \frac{\partial O_i^{(1)}}{\partial c_i}, \qquad (8)$$

where

$$\frac{\partial E}{\partial O_m^{(2)}} = \sum_{k \in S_m^{(3)}} \frac{\partial E}{\partial O_k^{(3)}} \frac{\partial O_k^{(3)}}{\partial O_m^{(2)}}, \qquad (9)$$

$$\frac{\partial E}{\partial O_k^{(3)}} = \left(O^{(5)} - Y\right) r_k, \qquad (10)$$

$$\frac{\partial O_k^{(3)}}{\partial O_m^{(2)}} = \frac{\partial w'_k}{\partial w_m} = \begin{cases} \dfrac{\sum_j w_j - w_k}{\left(\sum_j w_j\right)^2} & \text{if } m = k,\\[6pt] -\dfrac{w_k}{\left(\sum_j w_j\right)^2} & \text{otherwise,} \end{cases} \qquad (11)$$

$$\frac{\partial O_i^{(1)}}{\partial c_i} = e^{-\frac{(x_i - c_i)^2}{2\sigma_i^2}} \cdot \frac{x_i - c_i}{\sigma_i^2}, \qquad (12)$$

so that the centres are updated as

$$c_i(t+1) = c_i(t) - \eta \frac{\partial E}{\partial c_i}. \qquad (13)$$

Similarly, for the widths $\sigma_i$:

$$\frac{\partial E}{\partial \sigma_i} = \sum_{m \in S_i^{(2)}} \sum_{k \in S_m^{(3)}} \frac{\partial E}{\partial O_k^{(3)}} \frac{\partial O_k^{(3)}}{\partial O_m^{(2)}} \frac{\partial O_m^{(2)}}{\partial O_i^{(1)}} \frac{\partial O_i^{(1)}}{\partial \sigma_i}, \qquad (14)$$

where

$$\frac{\partial O_i^{(1)}}{\partial \sigma_i} = e^{-\frac{(x_i - c_i)^2}{2\sigma_i^2}} \cdot \frac{(x_i - c_i)^2}{\sigma_i^3}, \qquad (15)$$

and the widths are updated as

$$\sigma_i(t+1) = \sigma_i(t) - \eta \frac{\partial E}{\partial \sigma_i}. \qquad (16)$$

Finally, for the consequent weights $r_i$:

$$\frac{\partial E}{\partial r_i} = \frac{\partial E}{\partial O^{(5)}} \frac{\partial O^{(5)}}{\partial O_i^{(4)}} \frac{\partial O_i^{(4)}}{\partial r_i} = \left(O^{(5)} - Y\right) \sum_{j \in S_i^{(3)}} w'_j, \qquad (17)$$

$$r_i(t+1) = r_i(t) - \eta \left(O^{(5)} - Y\right) \sum_{j \in S_i^{(3)}} w'_j. \qquad (18)$$
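A sketch of one training step follows. The consequent update uses the exact gradients of eqs. (17)-(18); for the centres and widths we substitute a finite-difference estimate for the chain-rule expressions (8)-(16), which is numerically equivalent for small steps. It reuses gaussian_mf and fnn_forward from the earlier sketch; ETA and H are assumed constants.

```python
import numpy as np

ETA = 0.01   # assumed learning rate (eta in eq. 7)
H = 1e-6     # finite-difference step

def train_step(x, y, rules, c, sigma, r):
    # squared error for the current parameters (eq. 6, one pattern)
    def error():
        return 0.5 * (y - fnn_forward(x, rules, c, sigma, r)) ** 2

    # exact consequent-weight update (eqs. 17-18):
    # dE/dr_k = (O5 - Y) * normalised strength feeding consequent k
    mu = [[gaussian_mf(x[i], c[i][j], sigma[i][j])
           for j in range(len(c[i]))] for i in range(len(x))]
    w = np.array([min(mu[i][j] for i, j in rule.items()) for rule in rules])
    w_norm = w / w.sum()
    o5 = float(np.dot(w_norm, np.asarray(r)))
    for k in range(len(r)):
        r[k] -= ETA * (o5 - y) * w_norm[k]

    # finite-difference stand-in for the centre/width gradients (8)-(16);
    # each parameter is nudged immediately (coordinate-wise descent)
    for params in (c, sigma):
        for i in range(len(params)):
            for j in range(len(params[i])):
                base = error()
                params[i][j] += H
                grad = (error() - base) / H
                params[i][j] -= H
                params[i][j] -= ETA * grad
```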
The price satisfaction $u_p$ and the date satisfaction $u_q$ are traded off using one of three aggregation operators, where $w_p$ and $w_q$ are the importance weights attached to price and date. The weighted average:

$$u_{p,q} = \frac{w_p u_p + w_q u_q}{w_p + w_q}; \qquad (19)$$

the weighted Einstein product:

$$u_{p,q} = \frac{\frac{w_p}{\max\{w_p, w_q\}} u_p \cdot \frac{w_q}{\max\{w_p, w_q\}} u_q}{1 + \left(1 - \frac{w_p}{\max\{w_p, w_q\}} u_p\right)\left(1 - \frac{w_q}{\max\{w_p, w_q\}} u_q\right)}; \qquad (20)$$

and the uninorm:

$$u_{p,q} = \frac{(1 - \lambda) u_p u_q}{(1 - \lambda) u_p u_q + \lambda (1 - u_p)(1 - u_q)}. \qquad (21)$$

In the flight auction scenario, the satisfaction functions for price $x$ and flight date $y$ are:

$$u_p(x) = \begin{cases} 1 & \text{if } x \le 200,\\ \frac{300 - x}{100} & \text{if } 200 < x < 300,\\ 0 & \text{if } x \ge 300; \end{cases} \qquad (22)$$

$$u_q(y) = \begin{cases} \frac{y - 12}{3} & \text{if } 12 \le y \le 15,\\ \frac{18 - y}{3} & \text{if } 15 \le y \le 18,\\ 0 & \text{if } y \le 12 \text{ or } y \ge 18. \end{cases} \qquad (23)$$
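The trade-off machinery of eqs. (19)-(23) is simple enough to state directly in Python. This is a sketch that assumes the standard weighted-average form for eq. (19) and a default of 0.5 for the uninorm's weighting parameter lam; both are assumptions rather than values taken from the text.

```python
def u_price(x):
    # Eq. (22): satisfaction with a price of x pounds
    if x <= 200:
        return 1.0
    if x < 300:
        return (300.0 - x) / 100.0
    return 0.0

def u_date(y):
    # Eq. (23): satisfaction with flight date y (peaks at day 15)
    if 12 <= y <= 15:
        return (y - 12.0) / 3.0
    if 15 < y <= 18:
        return (18.0 - y) / 3.0
    return 0.0

def weighted_average(up, uq, wp, wq):
    # Eq. (19): weighted mean of the two satisfactions (assumed form)
    return (wp * up + wq * uq) / (wp + wq)

def einstein(up, uq, wp, wq):
    # Eq. (20): Einstein product of importance-scaled satisfactions
    m = max(wp, wq)
    a, b = (wp / m) * up, (wq / m) * uq
    return (a * b) / (1.0 + (1.0 - a) * (1.0 - b))

def uninorm(up, uq, lam=0.5):
    # Eq. (21): uninorm trade-off; lam is an assumed weighting parameter
    num = (1.0 - lam) * up * uq
    den = num + lam * (1.0 - up) * (1.0 - uq)
    return num / den if den else 0.0
```

For example, a 240 pound ticket on day 14 gives u_p = 0.6 and u_q ≈ 0.67, which the uninorm (lam = 0.5) aggregates to 0.75.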
B. Experimental Settings
The experiments aim to cover a broad range of scenarios.
All environment parameters are assigned at the beginning of a
game. Here we suppose that all auctions start at a price of 0
and all have a bid increment of 10 pounds. A day in the game
equals 30 seconds of real time. An auction's starting time and
end time are randomly chosen from uniformly distributed ranges.
An auction's flight date and a customer's preferred travel date
q_i are chosen randomly from (11, 19) and (12, 18), respectively.
The valuations of the goods are randomly chosen from the range
(170, 370).
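Purely as an illustration, these settings can be encoded as follows. The names are ours, and since the text does not specify whether the random draws are integer- or real-valued, uniform real draws are assumed.

```python
import random

# Fixed environment parameters stated in the text
SETTINGS = {
    "start_price": 0,      # all auctions start at a price of 0
    "bid_increment": 10,   # pounds
    "day_seconds": 30,     # one game day = 30 s of real time
}

def draw_game_parameters():
    # Per-game random quantities, drawn from the stated ranges
    return {
        "flight_date": random.uniform(11, 19),     # auction's flight date
        "preferred_date": random.uniform(12, 18),  # customer's q_i
        "valuation": random.uniform(170, 370),     # valuation of the good
    }
```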
C. The Learning Algorithm
As discussed in section II-C, the agent engages in a period
of off-line learning in order to provide initial parameters for
the FNN. Figure 2 shows the curve of the root mean square
error with respect to the number of epochs. After 200 training
epochs, it can be seen that the error between the target output
and the actual output reaches its lowest point, and so the
off-line training is stopped at this point.
Fig. 2. Root mean square error against the number of training epochs (0-200).

Fig. 3. Antecedent membership functions: (a) reference price (ref), with labels Low, Medium, High, and VeryHigh over the Price axis; (b) number of auctions left (nauction), with labels Small and Big over the Number axis; (c) auction closing order (oauction), with labels Early and Late over the Order axis.

TABLE II
EXPERIMENTAL RESULTS FOR VARYING AGENT POPULATIONS. EACH ENTRY GIVES THE AVERAGE PERFORMANCE OF AN AGENT TYPE, WITH ITS PROPORTION OF THE POPULATION IN PARENTHESES.

session no.   Greedy        Fix Auction   Fix Threshold   FNN
1             0.236 (40%)   0.186 (20%)   0.200 (20%)     0.297 (20%)
2             0.223 (20%)   0.174 (20%)   0.206 (40%)     0.313 (20%)
3             0.249 (20%)   0.173 (20%)   0.214 (20%)     0.286 (40%)
4             0.256 (25%)   0.165 (25%)   0.202 (25%)     0.312 (25%)
5             0.283 (20%)   0.186 (40%)   0.279 (20%)     0.361 (20%)
B. Results
Three groups of experiments are conducted to evaluate the
performance of each type of agent (sections IV-B.1 to IV-B.3).
In each group of experiments, there are a number of sessions
which correspond to experiments with different settings (as
per section III-B). For each session, at least 200 games are
played among the agents.
Since the number of agents in each experiment varies, the
performance $P_K$ of a particular type $K$ of agent (i.e., Greedy,
Fix Auction, Fix Threshold, or FNN) is calculated as:

$$P_K = \frac{1}{n_K} \sum_{i=1}^{n_K} u_{p,q}(i), \qquad (24)$$

where $n_K$ is the number of agents of type $K$ in the session and
$u_{p,q}(i)$ is the aggregated satisfaction obtained by the $i$th
such agent.
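Eq. (24) amounts to a per-type mean. A minimal sketch follows; representing the results as (agent_type, satisfaction) pairs is our assumption.

```python
from collections import defaultdict

def performance_by_type(results):
    # Eq. (24): mean aggregated satisfaction u_{p,q} per agent type.
    # `results` is assumed to be a list of (agent_type, u_pq) pairs
    # collected over all games of a session.
    totals = defaultdict(float)
    counts = defaultdict(int)
    for agent_type, u in results:
        totals[agent_type] += u
        counts[agent_type] += 1
    return {t: totals[t] / counts[t] for t in totals}
```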
Fig. 4. Performance of each agent type (Greedy, Fix Auction, Fix Threshold, FNN) under the three aggregation operators (weighted average, Einstein, uninorm).
Fig. 5. Performance of each agent type (Greedy, Fix Auction, Fix Threshold, FNN) for w_p : w_q ratios of 1:3, 1:1, and 3:1.
that it can get a high satisfaction on the flight date. This leads
to a poor overall performance because it misses auctions that
have a high evaluation on price but a lower one on date.
It can also be seen from table II that all the agents, except
Fix Auction, behave best when there are many Fix Auction
agents in the population. This is because the Fix Auction
agents only bid in a small number of auctions. Thus other
auctions have less competition and, consequently, lower prices.
The FNN agent behaves worst when there are many agents like
itself. This is because they tend to have similar predictions
about the expected closing prices of the auctions, and agents
with the same preferred date will compete strongly with one
another in the same set of auctions.
2) Varying Aggregation Operators: This experiment
studies the impact on the different types of agents of the
different ways of trading off the price and travel date (figure
4). To do this, we fix the number of auctions to be 10 and
the number of agents to be 25. We also fix the weights at
w_p : w_q = 1 : 1 for each operator. This time, the numbers
of each agent type in a given game are randomly generated.
Again, the FNN agents behave the best in all cases. In fact,
Fig. 6. Performance of each agent type (Greedy, Fix Auction, Fix Threshold, FNN) for w_p : w_q ratios of 1:3, 1:1, and 3:1.