
Applied Soft Computing 7 (2007) 1197–1208

www.elsevier.com/locate/asoc

Automatic extraction and identification of chart patterns towards financial forecast

James N.K. Liu *, Raymond W.M. Kwong
Department of Computing, The Hong Kong Polytechnic University, Hong Kong

Available online 20 March 2006

Abstract
Technical analysis of stocks mainly focuses on the study of irregularities, which is a non-trivial task. Because one time scale alone cannot be applied to all analytical processes, the identification of typical patterns on a stock requires considerable knowledge and experience of the stock market. It is also important for predicting stock market trends and turns. The last two decades have seen attempts to solve such non-linear financial forecasting problems using AI technologies such as neural networks, fuzzy logic, genetic algorithms and expert systems, but these, although promising, lack explanatory power or are dependent on domain experts. This paper presents an algorithm, PXtract, to automate the recognition process of possible irregularities underlying the time series of stock data. It makes dynamic use of different time windows, and exploits the potential of wavelet multi-resolution analysis and radial basis function neural networks for the matching and identification of these irregularities. The study provides room for case establishment and interpretation, which are both important in investment decision making.
© 2006 Elsevier B.V. All rights reserved.

Keywords: Forecasting; Wavelet analysis; Neural networks; Radial basis function network; Chart pattern extraction; Stock forecasting; CBR

1. Introduction

According to the efficient market theory, it is practically impossible to infer a fixed long-term global forecasting model from historical stock market information. It is said that if the market presents some irregularities, someone will take advantage of them and this will cause the irregularities to disappear. But this does not exclude the existence of hidden short-term local conditional irregularities; it means that we can still profit from the market if we have a system which can identify the hidden underlying short-term irregularities when they occur. The behavior of these irregularities is mostly non-linear amid the many uncertainties inherent in the real world. In general, the response of most investors to those irregularities will follow the golden rule: ''buy low, sell high''. If one foresees that stock prices will have a certain degree of upward movement, one will buy the stocks. In contrast, if one foresees that a certain degree of drop will happen, one will sell the stocks on hand. This gives rise to the problems of what irregularities we should focus on, what forecasting techniques we can deploy, what effective indicators we can assemble, and what data and features we can select to facilitate the modeling and the making of sound investment decisions.

Since the late 1980s, advances in technology have allowed researchers in finance and investment to solve non-linear financial forecasting problems using artificial intelligence technologies including neural networks [1–4], fuzzy logic [5–7], genetic algorithms and expert systems [8]. These methods have all shown promise, but each has its own advantages and disadvantages. Neural networks and genetic algorithms have produced promisingly accurate and robust predictions, yet they lack explanatory power and investors show little confidence in their recommendations. Expert systems and fuzzy logic provide users with explanations but usually require experts to set up the domain knowledge. Last but not least, none of these expert systems can learn.

In this paper we introduce an algorithm, PXtract, to automate the recognition process of possible irregularities underlying the time series of stock data. It makes dynamic use of different time windows, and exploits the potential of wavelet multi-resolution analysis and radial basis function neural networks for the matching and identification of these irregularities.

2. Related work

* Corresponding author. E-mail addresses: csnkliu@comp.polyu.edu.hk (James N.K. Liu), cskwong@comp.polyu.edu.hk (Raymond W.M. Kwong).
doi:10.1016/j.asoc.2006.01.007

Many financial researchers believe that there are some hidden indicators and patterns underlying stocks [9]. Weinstein [10] found that every stock has its own characteristics.
Stocks mainly fall into five categories: finance, utilities, property, commercial/industrial, and technology. Price movements in different categories depend on different factors, and it is difficult to identify which factors will affect a particular stock's price movement. To address the problem, we explored the use of a genetic algorithm to provide a dynamic mechanism for selecting appropriate factors from available fundamental data and technical indicators [11]. Our investigation of the HK stock market included potential parameters in fundamental data such as daily high, daily low, daily opening, daily closing, daily turnover, gold price, oil price, HK/US dollar exchange rate, HK deposit call, HK interbank call, HK prime rate, silver price, and the Hang Seng index comprising 33 stocks from the said five categories. The aggregate market capitalization of these stocks accounts for about 79% of the total market capitalization on The Stock Exchange of Hong Kong Limited (SEHK).

On the other hand, for the technical indicators, we examined the influences of popular indicators such as the relative strength index (RSI), moving average (MA), stochastic and Bollinger bands, prices/index movements, time lags and several data transformations [12,13]. Each of these indicators provides guidance for investors to analyze the trend of stock price movements. In particular, the RSI is quite useful to technical analysts in chart interpretation. The theoretical basis of the relative strength index is the concept of momentum. A momentum oscillator is used to measure the velocity or rate of change of price over time. It is essentially a short-term trading indicator and is also quite effective in extracting price information for a non-trending market. In short, the total number of potential inputs being tested was 57 [11]. We applied GAs to determine which input parameters are optimal for modeling different stocks in Hong Kong. The fitness value of a chromosome in the genetic algorithm was the classification rate of the neural network. It was calculated by counting how many days the network's output matched the derived ''best strategy''. We defined the best strategy at trading time t as:

  best strategy = { buy,  if (price(t+1) - price(t)) / price(t) > z%
                  { sell, if (price(t+1) - price(t)) / price(t) < -z%
                  { hold, otherwise

where z is the decision threshold, and the output of the network is encoded as 1, 0, and -1 corresponding to the suggested investment strategies 'buy', 'hold' and 'sell', respectively. We observed that the daily closing price and its transformations were the most sensitive input parameters for the stock forecast. In contrast, technical indicators such as RSI and MA were not critical in those experiments. As such, we feel confident to concentrate on the investigation of closing price movements for possible trends and irregularities. This is the subject of the chart pattern analysis below.

3. Wave pattern identification

According to Thomas [14], there are up to 47 different chart patterns which can be identified in stock price charts. These chart patterns play a very important role in technical analysis, with different chart patterns revealing different market trends. For example, a head-and-shoulders tops chart pattern reveals that the market will most likely have a 20–30% rise in the near future. Successfully identifying the chart pattern is said to be the crucial step towards the win. Fig. 1 shows 16 samples of typical chart patterns.

However, the analysis and identification of wave patterns is difficult for two reasons. Firstly, there exists no single time scale that works for all analytical purposes. Secondly, any stock chart may exhibit countless different pattern combinations, some containing sub-patterns, and choosing the most representative one presents quite a dilemma. Furthermore, there is no ready report of research development on the automatic identification of chart patterns. We address this problem using the following algorithm.

3.1. The PXtract algorithm

The PXtract algorithm extracts wave patterns from stock price charts in the following phases.

3.1.1. Window size phase
As there is hardly a single time scale that works for all analytical purposes in a wave identification process [2,29], a set of time window sizes W = {w1, w2, ..., wn} with w1 > w2 > ... > wn is defined (wi is the window size for 1 <= i <= n). Different window sizes are used to determine whether a wave pattern occurs in a specific time range. For example, in a short-term investment strategy, a possible set of window sizes can be defined as wi in W = {40, 39, ..., 10}.

3.1.2. Time subset generation phase
Stock price trading data contain a set of time data T = {t1, t2, ..., tn} with t1 > t2 > ... > tn. For a given time window size wi, T is divided into temporary subsets T'. A set P with P a subset of T is also defined; it contains the time ranges in which previously identified wave patterns have occurred. P is empty in the beginning.

It is said that any large change in a trend plays a more important role in the prediction process [13]. A range which has previously been discovered to contain a wave pattern will not be tested again (i.e. if T' is contained in P, tests will not be carried out). Details of the time subset T' generation process are shown in Fig. 2.

For example, if T = {10 Jan, 9 Jan, 8 Jan, 7 Jan, 6 Jan, 5 Jan, 4 Jan, 3 Jan, 2 Jan, 1 Jan}, the current testing window size is 3 (w = 3), and P = {9 Jan, 8 Jan, 7 Jan, 6 Jan}, then after the time subset generation process, T' = {(5 Jan, 4 Jan, 3 Jan), (4 Jan, 3 Jan, 2 Jan), (3 Jan, 2 Jan, 1 Jan)}.

3.1.3. Pattern recognition
For a given set of time T'' with T'' a subset of T', apply wavelet theory to identify the desired sequences. If a predefined wave pattern is discovered, add T'' to P. Details are described below.

The proposed algorithm PXtract is given in Fig. 3.
Fig. 1. Samples of typical chart patterns [14].

The function genSet(wi) is the subset generation process discussed earlier. At the end of the algorithm, all the time information of the identified wave patterns is stored in set P.

Pattern matching can be carried out using simple multi-resolution (MR) matching or radial basis function neural network (RBFNN) matching. Details of the wavelet recognition and simple MR matching can be found in our previous work [15].

Fig. 2. Time subset generation.

Fig. 3. Algorithm PXtract.

4. Wavelet recognition and matching

Wavelet analysis is a relatively recent development of applied mathematics, dating from the 1980s. It has since been applied widely with encouraging results in signal processing, image processing and pattern recognition [16]. As the waves in stock charts are 1D patterns, no transformation from a higher dimension to 1D is needed. In general, wavelet analysis involves the use of a univariate function psi, defined on R, which when subjected to the fundamental operations of shifts and dyadic dilation yields an orthogonal basis of L2(R).

The orthonormal basis of compactly supported wavelets of L2(R) is formed by the dilation and translation of a single function psi(x):

  psi_{j,k}(x) = 2^{j/2} psi(2^j x - k)

where j, k are integers. Having vanishing moments means that the basis functions are chosen to be orthogonal to the low-degree polynomials. A function psi(x) is said to have a vanishing kth moment at point t0 if the following equality holds, with the integral converging absolutely:

  integral of (t - t0)^k psi(t) dt = 0
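The vanishing-moment condition has a simple discrete analogue that can be checked numerically: the high-pass filter g derived from the scaling coefficients annihilates sampled constant and linear sequences. The DB2 (D4) coefficients below are the standard published values; the check itself is our illustration, not part of the paper.

```python
import math

# Daubechies DB2 (D4) low-pass filter coefficients
s3 = math.sqrt(3.0)
h = [(1 + s3) / (4 * math.sqrt(2.0)),
     (3 + s3) / (4 * math.sqrt(2.0)),
     (3 - s3) / (4 * math.sqrt(2.0)),
     (1 - s3) / (4 * math.sqrt(2.0))]

# Quadrature-mirror relation from Section 4: g_k = (-1)^k h_{L-1-k}
L = len(h)
g = [(-1) ** k * h[L - 1 - k] for k in range(L)]

# Discrete analogue of two vanishing moments: g is orthogonal to
# constant and linear sequences, so both discrete moments vanish.
moment0 = sum(g)                              # ~ integral of psi
moment1 = sum(k * gk for k, gk in enumerate(g))  # ~ first moment of psi
```

Both moments evaluate to zero up to floating-point error, and the low-pass filter is unit-norm, consistent with the orthogonality relations given in the next subsection.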


The function psi(x) has a companion, the scaling function phi(x), and these functions satisfy the following relations:

  phi(x) = sqrt(2) * sum_{k=0}^{L-1} h_k phi(2x - k)

  psi(x) = sqrt(2) * sum_{k=0}^{L-1} g_k phi(2x - k)

where h_k and g_k are the low- and high-pass filter coefficients, respectively, and L is related to the number of vanishing moments k and is always even; for example, L = 2k in the Daubechies wavelets. The high-pass coefficients are obtained from

  g_k = (-1)^k h_{L-k-1},  k = 0, ..., L - 1

and the scaling function is normalized so that

  integral of phi(x) dx = 1

The filter coefficients are assumed to satisfy the orthogonality relations:

  sum_n h_n h_{n+2j} = delta(j)

  sum_n h_n g_{n+2j} = 0

for all j, where delta(0) = 1 and delta(j) = 0 for j != 0.

4.1. Multi-resolution analysis

Multi-resolution analysis (MRA) was formulated based on the study of orthonormal, compactly supported wavelet bases [17]. The wavelet basis induces an MRA on L2(R), i.e. a decomposition of the Hilbert space L2(R) into a chain of closed sub-spaces

  ... contained in V4 contained in V3 contained in V2 contained in V1 contained in V0 ...

such that
  - the intersection of all Vj is {0} and the closure of the union of all Vj is L2(R);
  - f(x) in Vj if and only if f(2x) in V_{j+1};
  - f(x) in V0 if and only if f(x - k) in V0;
  - there exists phi in V0 such that {phi(x - k)}, k an integer, is an orthogonal basis of V0.

In pattern recognition, a 1D pattern f(x) can always be viewed as a signal of finite energy, such that

  integral of |f(x)|^2 dx < +infinity

which is mathematically equivalent to f(x) in L2(R). It means that MRA can be applied to the function f(x) to decompose it in L2(R). In MRA, the closed sub-space Vj can be decomposed orthogonally as:

  Vj = V_{j+1} (+) W_{j+1}    (1)

where V_{j+1} contains the low-frequency signal component of Vj and W_{j+1} contains the high-frequency signal component of Vj. According to the wavelet orthonormal decomposition shown in Eq. (1), Vj is first decomposed orthogonally into a high-frequency sub-space W_{j+1} and a low-frequency sub-space V_{j+1}. The low-frequency sub-space V_{j+1} is further decomposed into V_{j+2} and W_{j+2}, and the process can be continued. The above wavelet orthonormal decomposition can be represented by

  Vj = W_{j+1} (+) V_{j+1} = W_{j+1} (+) W_{j+2} (+) V_{j+2} = W_{j+1} (+) W_{j+2} (+) W_{j+3} (+) V_{j+3} = ...

According to Tang et al. [16], projective operators Aj and Dj are defined as:

  Aj : L2(R) -> Vj, the projective operator from L2(R) to Vj
  Dj : L2(R) -> Wj, the projective operator from L2(R) to Wj

Since f(x) in Vj, a subspace of L2(R):

  f(x) = Aj f(x) = sum_{k} c_{j,k} phi_{j,k}(x) = A_{j+1} f(x) + D_{j+1} f(x)
       = sum_{m} c_{j+1,m} phi_{j+1,m}(x) + sum_{m} d_{j+1,m} psi_{j+1,m}(x)

Also, Tang et al. [16] have proved the following equations:

  c_{j+1,m} = sum_k h_k c_{j,k+2m}    (2)

  d_{j+1,m} = sum_k g_k c_{j,k+2m}    (3)

According to the wavelet orthonormal decomposition shown in Eq. (1), the original signal V0 can be decomposed orthogonally into a high-frequency sub-space W1 and a low-frequency sub-space V1 by using the wavelet transform Eqs. (2) and (3). In the chart pattern recognition process, V0 is the original wave pattern, while V1 and W1 are the wavelet-transformed sub-patterns.

If we want to analyze the current data to determine whether it matches a predefined chart pattern, a template of the chart pattern is needed. Because the input data are noisy, directly comparing the data with the template will lead to an incorrect result. Therefore, wavelet decomposition should be applied to both the input data and the template. An example of matching the input data to a ''head-and-shoulders, top'' pattern is illustrated in Fig. 4.

We can match sub-patterns using either a range of coarse-to-fine scales, or by matching the input data with features in the pattern template. The matching process is only terminated once the target is accepted or rejected; if the result is undetermined, it continues at the next, finer scale. The coarse-scale coefficients obtained from the low-pass filter represent the global features of the signal.

For a high-resolution scale, the intraclass variance will be larger than for a low-resolution scale, so a threshold scale should be defined to determine the acceptance level. For example, let scale n be defined as the lowest resolution, and let the resolution threshold be t, with t > n.
At each resolution t, the root-mean-square error should be greater than another threshold value l, called the level threshold. It is difficult to derive optimal thresholds; therefore, we determine them through empirical testing. Fig. 5 illustrates the details of the process.

Fig. 4. Wavelet decomposition of both the input data and the chart pattern template.

4.2. Radial basis function neural network (RBFNN)

Neural networks are widely used to provide non-linear forecasts [18–20] and have been found to perform well in pattern recognition and classification problems. The radial basis function neural network (RBFNN), whose universal approximation capability has been proven by Park and Sandberg [21,22], is suitable for solving our pattern/signal matching problem [23]. We have created different RBFNNs for recognizing different patterns at different resolution levels. The input of each network is the wavelet-transformed values at a particular resolution.

As shown in Fig. 6, a typical network consists of three layers. The first layer is the input layer, which has two portions: (1) past network outputs that are fed back to the network; (2) major co-relative variables that are concerned with the prediction problem. Past network outputs enter the network by means of a time-delay unit as the first inputs. These outputs are also weighted by a decay factor gamma = alpha * e^(-lambda * k), where lambda is the decay constant, alpha is the normalization constant, and k is the forecast horizon. In general, the time series prediction task of the proposed network is to predict the outcome x1_{t+k} of the sequence at time t + k, based on the past observation sequence of size n, i.e. x1_t, x1_{t-1}, x1_{t-2}, x1_{t-3}, ..., x1_{t-n+1}, and on the major variables that influence the outcome of the time series at time t. The numbers of input nodes in the first and second portions are set to n and m, respectively. The number of hidden nodes is set to p. The number of predictive steps is set to k, so the number of output nodes is k. At time t, the inputs are [x1_t, x1_{t-1}, x1_{t-2}, x1_{t-3}, ..., x1_{t-n+1}] and [x2_1, x2_2, ..., x2_m], respectively. The output is given by x_{t+k}, denoted p_{kt} for simplicity; w_{ijt} denotes the connection weight between the ith node and the jth node at time t.

To simplify the network, the centers of the Gaussian functions are chosen by the K-means algorithm [24]. The variance of each Gaussian is chosen to be equal to the mean distance of its center from the neighboring Gaussian centers. A constructive learning approach is used to select the number of hidden units in the RBFNN: hidden nodes are created one at a time, and at each iteration we add one hidden node and check the new network's error. This procedure is repeated until the error goal is met, or until a preset maximum number of hidden nodes is reached. Normally, the preset maximum should be less than the total number of input patterns; a network with fewer hidden nodes generalizes well, though it may not be accurate enough.

5. Training set collections

Stock chart pattern identification is highly subjective, and humans are far better than machines at recognizing stock patterns which are meaningful to investors.

Fig. 5. The multi-resolution matching.



Fig. 6. Schematic diagram of a typical RBFNN.
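The three-layer structure in Fig. 6 can be sketched numerically as Gaussian hidden units with a linear output layer fitted by least squares (NumPy assumed). For brevity the centers are placed on the training points; the paper instead selects them with K-means, sets the widths from inter-center distances, grows the hidden layer constructively, and adds the time-delay feedback inputs omitted here.

```python
import numpy as np

class RBFNN:
    """Minimal Gaussian radial-basis network with a linear read-out layer."""

    def __init__(self, centers, sigma):
        self.centers = np.asarray(centers, dtype=float)
        self.sigma = float(sigma)
        self.weights = None

    def _hidden(self, X):
        # Hidden activations: exp(-||x - c||^2 / (2 * sigma^2))
        X = np.asarray(X, dtype=float)
        d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-d2 / (2.0 * self.sigma ** 2))

    def fit(self, X, y):
        # Solve the linear output weights by least squares
        H = self._hidden(X)
        self.weights, *_ = np.linalg.lstsq(H, np.asarray(y, dtype=float),
                                           rcond=None)
        return self

    def predict(self, X):
        return self._hidden(X) @ self.weights
```

With the centers on the four XOR points the Gaussian design matrix is non-singular, so the network interpolates the non-linearly separable targets almost exactly; this interpolation property is what makes RBFNNs attractive for the pattern-matching task.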

Moreover, extracting chart patterns from stock time series data is a time-consuming and expensive operation. We have examined five typical stocks for the period 1 January 1995 to 31 December 2001 (see Table 1). A summary of the total numbers of real training data for fourteen different chart patterns is shown in Table 2. The training set of chart patterns is collected, based on the judgment of a human critic following the rules suggested by Thomas [14], from the real and deformed data described in the following. The training set contains 308 records in total, and a quarter of it is extracted as the validation set. We set the wavelet resolution equal to 8. We found that the signal/pattern at resolutions 1–3 was too smooth, and the patterns were so similar to each other at those levels that the network was not able to recognize different patterns well. Therefore, only four RBFNNs were created, trained on the different chart patterns at resolution levels 4–7. The performance of the networks at different resolution levels and the classification results are shown in Section 6.

In our training set, the initial quantity of data is insufficient for training the system well, and extracting over 200 chart patterns from the time series data would be infeasible, time-consuming and expensive. In order to expand the training set, we use a simple but powerful mechanism to generate more training data based on the real data.

To generate more training samples, a radial deformation method is introduced. The major steps of the radial deformation process are:

(a) P = {p1, p2, p3, ..., pn} is a set of data points containing a chart pattern.
(b) Randomly pick i points (i <= n) in set P for deformation.
(c) Randomly generate a set of radial deformation distances D = {d1, d2, ..., di}.
(d) For each picked point in P, a random step dr is taken in a random direction. The deformed pattern is constructed by joining consecutive points with straight lines. Details are depicted in Fig. 7.
(e) Justify the deformed pattern using human critics.

Psychophysical studies [25] tell us that humans are better than machines at recognizing objects which are more meaningful to humans.

Table 1
The five different stocks and their stock IDs

  Stock ID   Stock name
  00341      CAFÉ DE CORAL HOLDINGS Ltd.
  00293      CATHAY PACIFIC AIRWAYS Ltd.
  00011      HANG SENG BANK Ltd.
  00005      HSBC HOLDINGS PLC.
  00016      SUN HUNG KAI PROPERTIES Ltd.

Fig. 7. Radial deformation. (a) An example of an accepted deformed pattern. (b) An example of a NOT accepted deformed pattern.
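Steps (a)–(d) can be sketched as follows; step (e), acceptance by a human critic, necessarily stays outside the code. The function name and its parameters are ours, not the paper's.

```python
import math
import random

def radial_deform(points, max_step, n_deform=None, seed=None):
    """Move randomly chosen points of a chart pattern by a random radial
    step dr (at most max_step) in a random direction; the deformed pattern
    is the polyline joining the resulting points."""
    rng = random.Random(seed)
    pts = [tuple(p) for p in points]                   # (a) pattern points
    n_deform = len(pts) if n_deform is None else n_deform
    for idx in rng.sample(range(len(pts)), n_deform):  # (b) pick i points
        dr = rng.uniform(0.0, max_step)                # (c) radial distance
        angle = rng.uniform(0.0, 2.0 * math.pi)        # (d) random direction
        x, y = pts[idx]
        pts[idx] = (x + dr * math.cos(angle), y + dr * math.sin(angle))
    return pts
```

Every generated pattern keeps the original number of points, and no point moves further than max_step, so small step bounds tend to produce deformations a human critic would still accept as the same chart pattern.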
Table 2
Total numbers of training patterns in fourteen typical chart patterns of five different stocks

In assessing the generated training data, the whole training set (including real and generated data) is accepted and selected based on the opinion of the human critic. In the training set, 64 chart patterns were extracted from five different stocks in the Hong Kong stock market. By applying the radial deformation technique, the 64 real training patterns were extended to a total of 308 patterns. All of the patterns generated by radial deformation must be judged by humans to decide whether a human would accept the deformed pattern or find it meaningful. Fig. 8 illustrates examples of (a) an accepted and (b) a NOT accepted deformed pattern. Fig. 9 shows the training set built from both the real and the deformed chart patterns.

6. Experimental results

Two sets of experiments have been conducted to evaluate the accuracy of the proposed system. The first set evaluates whether the algorithm PXtract is scalable, and the second compares the performance of simple multi-resolution matching with that of RBFNN matching.

Algorithm PXtract uses different time window sizes to locate any occurrence of a specific chart pattern, so the major concern is the performance of the algorithm. To assess the relative performance of the algorithms and to investigate their scale-up properties, we performed experiments on an IBM PC workstation with a 500 MHz CPU and 128 MB of memory.

Table 3
Optimal wavelet and threshold settings found by empirical testing

  Wavelet family     Resolution threshold   Level threshold   Accuracy (%)   Patterns discovered   Processing time (s)
  Daubechies (DB2)   4                      0.3                6.2           8932                   312
                     4                      0.2                7.1           7419
                     4                      0.15              14.2           3936
                     4                      0.1               43.1            543
                     5                      0.3                7.1           7734                   931
                     5                      0.2                9.4           6498
                     5                      0.15              17.4           2096
                     5                      0.1               53.0            420
                     6                      0.3                8.9           7146                  3143
                     6                      0.2               13.5           5942
                     6                      0.15              19.9           1873
                     6                      0.1               56.9            231
                     7                      0.3               10.5           6023                  8328
                     7                      0.2               14.5           5129
                     7                      0.15              18.5           1543
                     7                      0.1               48.3            194
Fig. 8. Accepted and NOT accepted deformed patterns from radial deformation.

To evaluate the performance of the algorithm using RBFNN matching over a large range of window sizes, we used typical stock prices of SUN HUNG KAI and Co. Ltd. (0086) for the period from 2 January 1992 to 31 December 2001. As shown in Fig. 10, the algorithm scales linearly as the size of the time window increases.

In the experiments on wavelet chart pattern recognition, different wavelet families were selected as the filter. The maximum resolution level was set to 7; the highest resolution level, 8, is taken as the raw input. The left-hand side of Fig. 11 shows the price of the stock CATHAY PACIFIC (00293) for the period from 7 June 1999 to 22 July 1999; this period contains the 'Double Tops' pattern. For the identification of the chart patterns, two matching methods were studied: simple multi-resolution (MR) matching and RBFNN matching. For simple MR matching, the similarity between the input and the template is measured by the mean absolute percentage error (MAPE); a low MAPE denotes that they are similar. The performance of simple MR matching was tested in experiments using different resolution thresholds t and different level thresholds l. Table 3 shows the most accurate combinations.
Fig. 9. Training set from both the real and deformed chart patterns.

We note that simple MR matching is not accurate, with an average recognition rate of just 30%. Furthermore, the calculation of the MAPE between the input data and the pattern templates creates a heavy workload. Although it is possible to reach a recognition rate of more than 50% if we set the level threshold to a low value (about 0.1) and the resolution threshold high (above 6), the processing time is then unacceptably long (about 3143 s). This illustrates that simple MR matching is not a good choice for the matching process.

Table 4 illustrates the overall classification results. It shows that the classification rate is over 90% and that the optimal recognition resolution level is 6. Four wavelet families were tested, and their performances were more or less the same, except that the Haar wavelet was found to be unsuitable.

Fig. 10. Execution time of algorithm PXtract using RBFNN matching under different time window sizes.
Fig. 11. Algorithm PXtract using wavelet multi-resolution analysis on the pattern ''double tops'' template.

Having found the appropriate setting for the RBFNN, we applied it to extract all the chart patterns from 10 different stocks over the last 10 years. Table 5 shows the accuracy for the 14 different chart patterns; the RBFNN is on average 81% accurate. Multi-resolution RBFNN matching thus has a high accuracy in recognizing different chart patterns. However, the accuracy of the recognition process is heavily dependent on the resolution level. Once the resolution level has been identified through empirical testing, the proposed method is highly accurate.

Table 4
RBFNNs: accuracy with different wavelet families and at different resolution levels

  Wavelet family     Resolution level   Training set (%)   Validation set (%)
  Haar (DB1)         4                  66                 64
                     5                  75                 72
                     6                  81                 78
                     7                  87                 74
  Daubechies (DB2)   4                  73                 64
                     5                  85                 78
                     6                  95                 91
                     7                  97                 85
  Coiflet (C1)       4                  77                 72
                     5                  86                 81
                     6                  95                 90
                     7                  98                 84
  Symmlet (S8)       4                  75                 68
                     5                  84                 78
                     6                  93                 89
                     7                  96                 82

Table 5
Accuracy of identifying the fourteen different chart patterns using RBFNN extraction methods

  Chart pattern                                         Accuracy (%)
  Broadening bottoms                                    73
  Broadening formations, right-angled and ascending     84
  Broadening formations, right-angled and descending    81
  Broadening tops                                       79
  Broadening wedges, ascending                          86
  Bump-and-run reversal bottoms                         83
  Bump-and-run reversal tops                            82
  Cup with handle                                       63
  Double bottoms                                        92
  Double tops                                           89
  Head-and-shoulders, top                               86
  Head-and-shoulders, bottoms                           87
  Triangles, ascending                                  73
  Triangles, descending                                 76

7. Conclusion and future works

In this paper, we examined the sensitive factors associated with stock forecasting and stressed the importance of chart pattern identification.
We have demonstrated how to automate the process of chart pattern extraction and recognition, which has not been discussed in previous studies. The PXtract algorithm provides a dynamic means of extracting all the possible chart patterns underlying stock price charts, and it is shown that PXtract consistently achieves high accuracy with desirable results.

Currently, we have analyzed only 14 representative chart patterns and templates from a total of 308 training samples. According to Thomas [14], there are in total 47 different chart patterns which can be extracted from the time series data. In order to complete the system, a future direction of work will be to build templates for the remaining chart patterns.

On the other hand, the identification and extraction of the chart patterns enables us to establish cases for interpretation and stock forecasting. We regard these chart patterns as potentially suitable for case representation in a CBR system. It may be worthwhile revisiting the selection of indicators associated with the relevant chart patterns in order to form a feature vector (e.g. time range, RSI, OBV, price moving average, wave pattern). We might then compare feature values v1, v2 in [a, b], such that the similarity of v1 and v2 is computed by the following expression:

  sim(v1, v2) = 1 - |v1 - v2| / (b - a)   for b != a

For the attribute ''class pattern'' in the feature vector, the similarity of the attribute between two cases can be measured by the following expression:

  sim(v1, v2) = 1 if v1 = v2; 0 otherwise

The overall similarity between two cases c1 and c2 is measured by the weighted-sum metric shown below:

  sim(c1, c2) = (sum_{i=1..n} w_i sim(v1i, v2i)) / (sum_{i=1..n} w_i)

The system retrieves the updated stock data and converts them into case knowledge. It then studies the new current status of the cases and appends the new result set to the result database for users' direct query. The system can be set to refer to three successive cases within a series of stock cases as one complete CASE for the purposes of prediction. Further exploration in this area is ongoing; typical examples can be obtained from our previous work [26,27]. In addition, the use of hybrid approaches such as support vector machines with adaptive parameters (e.g. [28]) and evolutionary fuzzy neural networks (e.g. [32]) should help to improve financial forecasts. This will be the subject of future research.

Acknowledgement

The authors would like to acknowledge the partial support of the Hong Kong Polytechnic University via CRG grant G-T375.

References

[1] G. Zhang, B.E. Patuwo, M.Y. Hu, Forecasting with artificial neural networks: the state of the art, Int. J. Forecasting 14 (1998) 32–62.
[2] M. Austin, C. Looney, J. Zhuo, Security market timing using neural networks, New Rev. Appl. Expert Syst. (2000).
[3] P.K.H. Phua, X. Zhu, C.H. Koh, Forecasting stock index increments using neural networks with trust region methods, in: Proceedings of the International Joint Conference on Neural Networks, vol. 1, 2003, pp. 260–265.
[4] S. Heravi, D.R. Osborn, C.R. Birchenhall, Linear versus neural network forecasts for European industrial production series, Int. J. Forecasting 20 (2004) 435–446.
[5] M. Funabashi, A. Maeda, Y. Morooka, K. Mori, Fuzzy and neural hybrid expert systems: synergetic AI, IEEE Expert (1997) 32–40.
[6] H.S. Ng, K.P. Lam, S.S. Lam, Incremental genetic fuzzy expert trading system for derivatives market timing, in: Proceedings of the IEEE 2003 International Conference on Computational Intelligence for Financial Engineering, Hong Kong, 2003.
[7] M. Mohammadian, M. Kingham, An adaptive hierarchical fuzzy logic system for modeling of financial systems, Intell. Syst. Account. Financ. Manag. 12 (1) (2004) 61–82.
[8] K. Boris, V. Evgenii, Data Mining in Finance — Advances in Relational and Hybrid Methods, Kluwer Academic Publishers, 2000.
[9] T. Plummer, Forecasting Financial Markets, Kogan Page Ltd., 1993.
[10] S. Weinstein, Stan Weinstein's Secrets for Profiting in Bull and Bear Markets, McGraw Hill, 1988.
[11] R. Kwong, Intelligent web-based agent system (iWAF) for e-finance application, MPhil thesis, The Hong Kong Polytechnic University, 2004.
[12] E. Gately, Neural Networks for Financial Forecasting — Top Techniques for Designing and Applying the Latest Trading Systems, Wiley Trader's Advantage, 1996.
[13] R. Bensignor, New Thinking in Technical Analysis, Bloomberg Press, 2002.
[14] N.B. Thomas, Encyclopedia of Chart Patterns, John Wiley & Sons, 2000.
[15] J.N.K. Liu, R. Kwong, Chart patterns extraction and recognition in CBR system for financial forecasting, in: Proceedings of the IASTED International Conference ACI2002, Tokyo, Japan, 2002, pp. 227–232.
[16] Y.Y. Tang, L.H. Yang, J.N.K. Liu, H. Ma, Wavelet Theory and Its Application to Pattern Recognition, World Scientific Publishing, River Edge, NJ, 2000.
[17] S. Mallat, Multiresolution approximations and wavelet orthonormal bases of L2(R), Trans. Am. Math. Soc. (1989) 69–87.
[18] R.G. Donaldson, M. Kamstra, Forecast combining with neural networks, J. Forecasting 15 (1996) 49–61.
[19] M. Adya, F. Collopy, How effective are neural networks at forecasting and prediction? A review and evaluation, J. Forecasting 17 (1998) 481–495.
[20] A. Kanas, Non-linear forecasts of stock returns, J. Forecasting 22 (2003) 299–315.
[21] J. Park, I.W. Sandberg, Universal approximation using radial basis function networks, Neural Comput. 3 (1991) 246–257.
[22] J. Park, I.W. Sandberg, Approximation and radial basis function networks, Neural Comput. 5 (1993) 305–316.
[23] F.J. Chang, J.M. Liang, Y.-C. Chen, Flood forecasting using RBF neural networks, IEEE Trans. SMC Part C 31 (4) (2001) 530–535.
[24] J.T. Tou, R.C. Gonzalez, Pattern Recognition Principles, Addison Wesley, Reading, MA, 1974.
[25] W.R. Uttal, T. Baruch, L. Allen, The effect of combinations of image degradations in a discrimination task, Perception Psychophys. 57 (5) (1995) 668–681.
[26] J.N.K. Liu, T.T.S. Leung, A web-based CBR agent for financial forecasting, in: Proceedings of the 4th International Conference on Case-Based Reasoning (workshop program), Vancouver, Canada, 2001, pp. 243–253.
[27] Y. Li, S.C.K. Shiu, S.K. Pal, J.N.K. Liu, Case-base maintenance using soft computing techniques, in: Proceedings of the Second International Conference on Machine Learning and Cybernetics, Xi'an, China, 2–5 November 2003, pp. 1768–1773.
[28] L.J. Cao, F.E.H. Tay, Support vector machine with adaptive parameters in financial time series forecasting, IEEE Trans. Neural Networks 14 (6) (2003) 1506–1518.
[29] P. Blakey, Pattern recognition techniques [in stock price and volumes], IEEE Microwave Mag. 3 (1) (2000) 28–33.
[32] L.Y. Yu, Y.-Q. Zhang, Evolutionary fuzzy neural networks for hybrid financial prediction, IEEE Trans. SMC Part C 35 (2) (2005) 244–249.
