Idriss Tsafack∗
Department of Economics, University of Montreal
Abstract
This paper proposes to analyze the predictability of S&P 500 intraday cumulative returns with a fully functional autoregressive model. This approach is practically important because market participants can use the forecast results to tactically adjust their market-timing exposure for the next day. Because the model involves very high-dimensional data, one of the main challenges is to estimate the autoregressive operator properly. In order to obtain a more stable estimation, four different methods are considered: the Functional Principal Component Analysis (FPCA), the Functional Partial Least Squares (FPLS), the Functional Tikhonov method (FT) and the Functional Landweber-Fridman technique (FLF). The convergence rate of the estimated autoregressive operator is derived for each method. Monte Carlo simulations and a real-data application show that the FLF and FT methods outperform the other methods across most model settings. Moreover, based on the dynamics of the previous day, it is always better for traders to expose their portfolio only during the first opening hour of the market. If traders were more active in the first half of the previous day, it is possible to hold their investment over the whole trading session of the next day; otherwise they should hold at most for the first half of the next day and turn to a reversal strategy in the second half. Using the FPLS estimation approach, it is possible to reach a remarkable R² value of 8% in the first and last hours of a trading session, almost 4 times the value obtained by Gao et al. (2018).
Keywords: Time series momentum, intraday momentum, Functional linear regression, Principal Component Analysis, Regularization Method, return predictability, Big data.
JEL Codes: C01, C49, C53, C55, C62, G12, G13, G15.
∗ Department of Economics, University of Montreal, 3150 rue Jean-Brillant, Montréal (QC) H3T 1N8, Canada (idriss.tsafack.teufack@umontreal.ca)
1 Introduction
Following the idea of Jegadeesh & Titman (1993), there is a very large literature documenting the success of momentum strategies in financial markets. The idea behind these strategies is that market participants should buy the assets that have outperformed and sell those that have underperformed based on their historical evolution. The literature distinguishes two types of momentum strategies: the cross-sectional momentum, widely discussed by authors such as Griffin et al. (2003) and Lehmann (1990), and the time series momentum strategies developed by Moskowitz et al. (2012), Neely et al. (2014) and others. Unfortunately, most of these papers tend to focus on long time horizons, typically a month or longer. The issue with long-term strategies is that they display very low Sharpe ratios and weak backtest statistical significance because of infrequent independent trading signals. Another issue is that long-term momentum usually underperforms in the aftermath of a financial crisis.¹ Research interest in such momentum patterns at the intraday granularity started recently with Gao et al. (2018), who show evidence that the first half-hour return positively predicts the last half-hour return in the U.S. stock market. Furthermore, Sun et al. (2016) and Renault (2017) have identified that intraday stock returns can be predicted by high-frequency investor sentiment.
In the same vein, this paper proposes to analyze the predictability of S&P 500 intraday cumulative returns with a fully functional autoregressive model. In contrast to traditional financial analysis, from an econometrician's standpoint the intraday cumulative returns can be observed as curves in a functional space. The shape of the cumulative return observed at the 5-minute timeframe is then used to predict the next-day shape.² This approach is practically important because market participants can use the forecast results to tactically adjust their market-timing exposure for the next trading day. Furthermore, from an econometric point of view, using functional data analysis is interesting because it makes it possible to take into account additional information, namely the dynamics of the return from one time of the day to another. Moreover, this approach is helpful for getting a good overview of market participants' behavior within a trading day, such as the U-shape of volatility and the most relevant period to take positions.
When using a functional autoregressive model, because the model setting is directly exposed to very
¹ See chapter 7 of Chan (2013).
² The 5-minute timeframe is considered here just for illustration purposes. It is possible to use other timeframes such as 10 minutes, 15 minutes, 1 minute, or tick data.
high-dimensional data, one of the most important challenges is to estimate the autoregressive operator.³ Indeed, if one is not careful about the manipulation of the functional objects, there is a high probability of obtaining very unstable estimators of the autoregressive operator. To control the stability of the estimated parameter, this paper compares four different estimation methods: the FPCA, FPLS, FT and FLF methods. These methods can also be viewed as regularization methods since they all depend on a tuning parameter. To assess the quality of the estimation methods, their respective convergence rates and asymptotic normality are derived and analyzed under some general regularity conditions. These asymptotic normality results would be useful to test the significance of the estimated operator.
To compare the four estimation methods, Monte Carlo simulations are developed. The comparison is based on five criteria: the Mean Squared Error (MSE), the Mean Average Distance (AD) and the Ratio Average Distance (RAD), which measure the quality of the kernel estimation on the one hand, and the Mean Squared Prediction Error measured with two different approaches (En and Rn) on the other hand. Across most of the model settings considered, the simulations show that the FLF and FT methods tend to outperform the others in terms of estimation of the autoregressive operator, whatever the sample size and the error terms of the model are. In terms of prediction, the four methods behave almost the same way for small sample sizes, while for large sample sizes the FPLS approach tends to outperform the others in terms of prediction errors.
The empirical analysis of this paper uses S&P 500 futures data from 01/01/2008 to 12/31/2017 at the 5-minute timeframe and provides several interesting findings. An overview of the empirical findings shows evidence that the cumulative intraday return shape of the current trading day contributes significantly to predicting the next-day cumulative return shape. The different estimation methods provide different results. The FT and FLF display almost identical estimation results for the autoregressive operator, while the FPCA and FPLS tend to capture the prediction results well. Based on the predictive R², the FPLS reaches a remarkable value of 8% at the beginning and the end of the trading session, almost 4 times the value obtained by Gao et al. (2018) and twice the one obtained by Zhang et al. (2019). The FPCA and FT methods tend to reach a predictive R² of 5% and 6% respectively in the morning (nearly twice the value obtained by Gao et al. (2018)), and nearly 2.5% at the end of the trading session for the FT approach. Surprisingly, the FPCA is not able to capture the momentum of the ending trading session. According to the FLF approach, the R² is around 2.5% at the beginning of the trading session, while in the second half of the day it is approximately 1.2%.
³ The autoregressive operator is similar to the slope parameter in the context of a simple OLS model.
An examination of the autoregressive operator shows that a strong momentum in the first half of a trading day is positively correlated to the next-day shape, while a strong momentum in the second half of a trading session is significantly positively correlated to the first half of the next trading day and negatively correlated to the second half of the next trading day. The previous day's first opening hour is significantly positively correlated to the next day's; this result is based on the FLF and FT estimation methods. Moreover, the shape of the previous day contributes to deriving more simply the U-shaped volatility pattern, with high volume and volatility at the beginning and in the last hour of the next trading day. This result is easily observed when the FPLS approach is used and is similar to the one obtained by Gao et al. (2018). Exploring the effect of volatility and trading volume on the intraday momentum, we observe that the intraday momentum is stronger on days with high volatility at the opening and at the close of the trading session. The same result is observed for trading volume. These results can be explained by the portfolio rebalancing pattern, the late-informed investor effect, market manipulation by high-frequency traders, and the often forced sales or purchases of assets by various types of funds.
This paper is related to Gao et al. (2018) and Zhang et al. (2019) in the context of financial markets. The difference with their results is that the cumulative returns are observed as curves instead of working with instantaneous returns. Moreover, this approach makes it easy to document the U-shaped volatility, and the in-sample and out-of-sample R² functions can reach an impressive predictive performance of 8%, 4 times the performance of Gao et al. (2018). The idea of using cumulative returns is inspired by the papers of Kokoszka & Zhang (2012) and Shang (2017). Indeed, they use individual assets and market assets respectively, but their main purpose is to compare forecasting methods for the next-day curve. The difference with their papers is that this paper cares about predictability and forecasting by proposing four different estimation methods. The consistency results of the estimation methods (of the autoregressive operator) are presented, and comparisons of those methods are made based on simulation and empirical analysis. An economic significance analysis is also derived. This paper is also related to the functional linear regression with functional response widely developed by Benatia et al. (2017) and Imaizumi & Kato (2018), and to the functional autoregressive model extensively developed by Bosq (2000), Kargin & Onatski (2008) and Kokoszka
The rest of the paper is organized as follows. Section 2 presents the related literature. The functional econometric model is presented in Section 3. In Section 4, I present how to estimate the model using the four aforementioned methods. Section 5 is devoted to analyzing the convergence rate and the asymptotic normality of the estimated autoregressive operator. Section 6 presents the comparison of the four methods based on Monte Carlo simulations. The real data
2 Related literature
This paper is related to four strands of literature: return predictability, momentum strategies in financial markets, functional data analysis, and the functional autoregressive model.
The momentum strategy became very popular with the well-known seminal work of Jegadeesh & Titman (1993). They developed a cross-sectional momentum strategy at the monthly frequency and showed that buying past winners and selling past losers can generate a significant positive return over the next 3 to 12 months of holding. This work has been widely extended by Griffin et al. (2003), who show evidence that this momentum can be observed in different stock markets such as the U.S.
In contrast to the cross-sectional momentum, Moskowitz et al. (2012) have documented the success of the time series momentum strategy in equity index, currency, commodity, and bond futures. The idea behind this strategy is to look at one asset at a time instead of a group of assets. Indeed, they show evidence that the previous 12-month returns of an asset contribute significantly to predicting its future returns. He & Li (2015) have proposed a continuous-time heterogeneous agent model in order to explain the significance of time series momentum. They show that momentum strategies perform well when the market is dominated by momentum traders. They specifically document that short-term momentum strategies tend to stabilize the market, while the long-term momentum
Furthermore, the recent research by Gao et al. (2018) has demonstrated intraday momentum in the U.S. stock market at the 30-minute frequency. Indeed, they show that the first half-hour return contributes to predicting the last half-hour return, and that the effect is stronger on more volatile days, higher-volume days, recession days and high-impact news release days. In the same line, Zhang et al. (2019) have documented almost the same results in the Chinese stock market and explained that this momentum can be explained not only by infrequent rebalancing or late-informed investors but also by the U-shaped volume pattern. Chu et al. (2019) have identified not only that the last half hour is positively predicted by the first half hour, but also a reversal effect in the second half hour of the trading day in the Chinese stock market. They also find that this momentum and reversal effect is robust when including the previous day's return and day-of-week effects. Besides those authors, Heston et al. (2010) have discovered a striking pattern of return continuation at half-hour intervals that
Concerning the main causes of momentum, Chan (2013) lists four: the persistence of the sign of roll returns for futures; the slow diffusion, analysis, and acceptance of new information; the forced sales or purchases of assets by various types of funds; and market manipulation by high-frequency traders. On the other hand, based on some theoretical analysis, Bogousslavsky (2016) has identified the infrequent rebalancing and late-informed investors' effects. Indeed, the infrequent rebalancing phenomenon is described by the fact that a certain group of investors decides to rebalance their portfolio early in the morning while others decide to do so near the market close. On the other hand, the late-informed investors' effect reflects the fact that news releases are an important factor contributing significantly to return predictability. These hypotheses have been confirmed by Gao et al. (2018) and Zhang et al. (2019) but not by Chu et al. (2019), who argue that noise trading drives intraday return predictability. Besides those causes, high-frequency traders' sentiment can also significantly affect time series momentum, as documented by Sun et al. (2016) and Renault (2017).
The literature on functional data analysis has attracted a lot of attention in the statistical field during the last decade. Some of the pioneers are Ramsay & Silverman (2007), who developed a general framework, Hyndman & Shang (2009), Hörmann & Kokoszka (2012), Kokoszka & Zhang (2012), and Ferraty & Vieu (2006), who developed a semiparametric approach. One of the main challenges is to estimate the slope function (in the context where the response is a scalar) or the operator (if the response variable is a function) because of the high-dimensionality issue. More recently, authors such as Crambes et al. (2013), Benatia et al. (2017) and Imaizumi & Kato (2018) have analyzed the rate of convergence of the estimated parameter in this context.
The functional autoregressive model was made popular by Bosq (2000), who considers a parametric approach for estimation purposes. Besse et al. (2000) considered the same model and adopted a nonparametric estimation approach. Kargin & Onatski (2008) proposed a predictive factor approach to estimate the autoregressive operator; the idea is to project the data on a set of factors that are relevant for the prediction. Didericksen et al. (2012) have compared the method of Kargin & Onatski (2008) with the FPCA and show that the FPCA outperforms the predictive factor approach in terms of estimation, while in terms of prediction they reach the same prediction error. Hyndman & Shang (2009) and Aue et al. (2015) have proposed to use univariate and multivariate time series forecasting methods, since the PCA scores can display temporal dependence.
The usage of the functional autoregressive model, and of functional data analysis more generally, is less common in the financial market field, to the best of our knowledge. The usage of cumulative returns is interesting because the curves are more informative, in the sense that they take into account the dynamics between two discretization points. The only papers that can be identified are Kokoszka & Zhang (2012), who proposed to predict an individual stock by using a functional CAPM, and Shang (2017), who suggested forecasting the U.S. stock market by using a dynamic updating technique. Their main purpose is to compare different forecasting methods, not to analyze return predictability.
In this paper, for each day we observe the shape of the intraday 5-minute S&P 500 futures cumulative return. We use the cumulative intraday returns (CIDRs) of Gabrys et al. (2010) and Kokoszka et al. (2015). Let Pi(tj) be the 5-minute closing price of a financial asset at time
[Figure 1: S&P 500 price evolution within the year 2017 (left panel), univariate instant returns (middle panel, ×10⁻³), and the functional time series of instant returns (right panel), plotted over the time of the day.]
Figure 1 represents the constructed intraday cumulative return of the S&P 500 based on the raw data for the year 2017. Both the instant returns and the functional series on the right-hand side display at least 4 to 5 outliers.
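Since the CIDR formula itself is cut off in the text above, its construction can only be sketched under an assumption: the definition below, R_i(t_j) = 100 [log P_i(t_j) − log P_i(t_1)], follows the convention of Gabrys et al. (2010) cited above, so that each daily curve starts at zero:

```python
import numpy as np

def cidr(prices):
    """Cumulative intraday returns: prices is an (N_days, T) array of 5-minute
    closing prices; R_i(t_j) = 100 * (log P_i(t_j) - log P_i(t_1))."""
    logp = np.log(prices)
    return 100.0 * (logp - logp[:, [0]])   # subtract each day's opening log-price

# Example: two days of three 5-minute closing prices (hypothetical values)
P = np.array([[100.0, 101.0, 100.5],
              [100.5, 100.0, 101.5]])
R = cidr(P)
```

By construction the first column of R is identically zero, which is what makes the daily curves directly comparable as functional observations.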
Let (Xi : i ∈ Z) be an arbitrary stationary functional time series of S&P 500 cumulative return curves. It is assumed that each function Xi is an element of the separable Hilbert space H = L²([0,1]) (the space of square-integrable functions mapping from the compact interval [0,1] to R), endowed with the inner product
\[
\langle f, g\rangle_H = \int_0^1 f(t)g(t)\,dt
\qquad \text{and the norm} \qquad
\|f\|_H = \left(\int_0^1 f(t)^2\,dt\right)^{1/2}.
\]
Each random function is square-integrable, that is, E(||Xi||²) < ∞. All the random functions are defined on the same probability space (Ω, F, P). Therefore, if X ∈ L^p_H = L^p_H(Ω, F, P) with p > 0, then E(||X||^p) < ∞. Each function Xi corresponds to the observed curve of the S&P 500 on day i = 1, ..., N. In this paper, it is assumed that the sequence of H-valued variables {X1, X2, ..., XN} follows a functional autoregressive Hilbertian process
of order 1 (FAR(1)):
\[
X_{n+1}(t) = \int_0^1 \Psi(t,s)\, X_n(s)\,ds + \varepsilon_{n+1}(t), \qquad n \in \mathbb{Z}, \tag{3}
\]
where Ψ is a bounded linear operator and ε = (εn, n ∈ Z) is an H-valued strong white noise, that is, a sequence of H-valued independently and identically distributed random variables such that E(εn) = 0 and E(||εn||²) = σ² < ∞. We can also consider the innovations ε to be a sequence of H-valued martingale differences, since the functional errors do not necessarily follow the same distribution. Without loss of generality, it is assumed that the curves are centered, E(Xn) = 0.
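The FAR(1) dynamics in equation (3) can be mimicked numerically by discretizing the curves on a grid. A minimal sketch, assuming a hypothetical Gaussian kernel for Ψ (rescaled so that the operator norm stays below 1, as stationarity requires) and random-walk noise curves; all names and parameter values here are illustrative, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 78                                    # grid points per day (e.g. 5-minute marks)
t = np.linspace(0.0, 1.0, T)
dt = 1.0 / (T - 1)

# Hypothetical smooth kernel for Psi(t, s), rescaled so ||Psi||_L ~ 0.5 < 1
Psi = np.exp(-((t[:, None] - t[None, :]) ** 2) / 0.1)
Psi *= 0.5 / np.linalg.norm(Psi * dt, 2)  # 2-norm of Psi*dt approximates ||Psi||_L

def far1_sample(N, burn=50):
    """Simulate N curves from X_{n+1}(t) = int_0^1 Psi(t,s) X_n(s) ds + eps_{n+1}(t)."""
    X = np.zeros(T)
    out = []
    for n in range(N + burn):
        eps = np.cumsum(rng.normal(0.0, np.sqrt(dt), T))  # random-walk noise curve
        eps -= eps[0]                                     # each day starts at zero
        X = Psi @ X * dt + eps                            # rectangle-rule quadrature
        if n >= burn:
            out.append(X.copy())
    return np.array(out)                                  # shape (N, T)

curves = far1_sample(250)
```

The burn-in discards the transient so that the retained curves are (approximately) draws from the stationary solution discussed below.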
Figure 2 represents how the predictor and the predicted functions are displayed. The bold blue line is the mean function for the predictor and the response function respectively. From what is observed, it can be deduced that if the functional outliers are removed from the sample, each functional observation can be regarded as generated by the same data-generating process. This idea has been argued by Kokoszka & Young (2016), who developed a KPSS test for functional time series.
[Figure 2: Intraday cumulative return curves of the predictor (left panel) and the response (right panel) over the time of the day (9:30 AM - 4:00 PM).]
Let us denote by L the space of bounded linear operators on H, equipped with the operator norm \(\|\Psi\|_L = \sup_{\|f\|_H \le 1} \|\Psi(f)\|_H\). Under the condition that there exists an integer j₀ ≥ 1 such that \(\|\Psi^{j_0}\|_L < 1\), the FAR(1) equation admits the unique stationary solution
\[
X_n = \sum_{k=0}^{\infty} \Psi^k(\varepsilon_{n-k}), \tag{5}
\]
and the series converges almost surely in H. If it is assumed that the Hilbert-Schmidt norm of the operator Ψ is lower than 1, then the existence and uniqueness of the solution is satisfied (see Lemma
4 Model estimation
The goal of this paper is to forecast the one-day-ahead S&P 500 shape Xn+1. According to the data-generating process, the best linear predictor of Xn+1 given X1, ..., Xn is Ψ(Xn). Typically, Ψ is unknown and should be estimated consistently by an estimator Ψ̂. This section is dedicated to the estimation of the autoregressive operator Ψ by four different strategies, working directly with the fully functional form. Multiplying equation (1) by Xn and taking expectations on both sides leads to the functional Yule-Walker equation relating the covariance operator K and the cross-covariance operator D. Since E[||Xn||²] < ∞, the covariance operator is symmetric, positive, nuclear and, therefore, admits the spectral decomposition
\[
K(v_j) = \lambda_j v_j, \qquad j \ge 1,
\]
with the eigenfunctions v_j forming an orthonormal basis of H and the eigenvalues ordered as λ₁ ≥ λ₂ ≥ ... ≥ 0.
where Ψ* and D* denote the adjoint operators of Ψ and D respectively. The operators K, D and D* are defined at the population level, so they are unknown and can be estimated by K̂, D̂ and D̂* given by
\[
\hat{D}(f) = \frac{1}{N-1}\sum_{n=1}^{N-1} \langle X_{n+1}, f\rangle X_n,
\qquad
\hat{D}^{*}(f) = \frac{1}{N-1}\sum_{n=1}^{N-1} \langle X_n, f\rangle X_{n+1},
\]
and
\[
\hat{K}(f) = \frac{1}{N-1}\sum_{n=1}^{N-1} \langle X_n, f\rangle X_n.
\]
K̂ is endowed with the empirical spectral system (λ̂_j, v̂_j)_{j≥1}, with λ̂₁ ≥ λ̂₂ ≥ ... ≥ 0 and the (v̂_j)_{j≥1} orthonormal.
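On a discrete grid, the empirical operators above reduce to averages of outer products of the observed curves. A sketch, assuming the daily curves are stored as the rows of an (N × T) array with grid spacing dt (this discretized setup is an assumption, not the paper's implementation):

```python
import numpy as np

def empirical_operators(X, dt):
    """X: (N, T) array of daily curves. Returns the discretized kernels of
    K_hat and D*_hat together with the empirical spectral system."""
    N = X.shape[0]
    X0, X1 = X[:-1], X[1:]                 # pairs (X_n, X_{n+1})
    K_hat = X0.T @ X0 / (N - 1)            # kernel of K_hat: mean of X_n(s) X_n(t)
    D_star_hat = X1.T @ X0 / (N - 1)       # kernel of D*_hat: mean of X_{n+1}(t) X_n(s)
    # Eigendecomposition of the covariance *operator* (the matrix scaled by dt)
    lam, v = np.linalg.eigh(K_hat * dt)    # eigh returns ascending eigenvalues
    order = np.argsort(lam)[::-1]
    lam, v = lam[order], v[:, order] / np.sqrt(dt)   # L2([0,1])-orthonormal v_j
    return K_hat, D_star_hat, lam, v
```

Rescaling the eigenvectors by 1/√dt makes them orthonormal in L²([0,1]) rather than in the Euclidean sense, so that the λ̂_j approximate the eigenvalues of the operator K̂ itself.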
Given equation (5), one would like to naively estimate the autoregressive operator by writing Ψ* = K⁻¹D*, as is usually done in the finite-dimensional context. The problem is that the covariance operator does not admit a bounded inverse, which would lead to an unstable estimator. In the inverse problem literature, equation (5) is called an ill-posed problem in the sense that K is only invertible on a subset of H and its inverse is not continuous.
Different approaches have been proposed to estimate a stable autoregressive operator. Bosq (2000) suggested using the Functional Principal Component Analysis (FPCA) and derived its consistency results under some strong assumptions on the eigenvalues. There is a large literature on nonparametric techniques to estimate the autoregressive operator, such as the spline smoothing and interpolation techniques proposed by authors such as Besse & Cardot (1996), Besse et al. (2000) and Ramsay & Silverman (2007). Antoniadis & Sapatinas (2003) suggested a linear wavelet technique to estimate a FAR(1) model. Kargin & Onatski (2008) proposed a predictive factor technique, which consists of finding an estimator of the autoregressive operator such that the prediction error is minimized, by projecting the data on principal components chosen for that goal. Didericksen et al. (2012) compared the FPCA method proposed by Bosq (2000) and the predictive factor technique of Kargin & Onatski (2008) on simulated data and show that, overall, the FPCA performs better.
Other contributions include the univariate and multivariate time series forecasting of the principal component scores proposed by Hyndman & Shang (2009) and Aue et al. (2015) respectively. Basically, their approach consists of first projecting all the data on the most important principal components by running a Functional Principal Component Analysis, then using the FPCA scores to form time series vectors, and then running the usual univariate ARMA models for each score vector, or a VAR model, to get the one-step-ahead score predictions. Once those results are obtained, the Karhunen-Loève expansion is used to transform the predicted score time series back into the predicted functional time series. There is also the contribution of Benatia et al. (2017), who proposed the Tikhonov regularization to estimate the unknown operator in an i.i.d. fully functional regression model. They also derived some consistency results under the source conditions proposed by Carrasco et al. (2007).
The goal of this paper is to consider the functional Yule-Walker equation and estimate the autoregressive operator by four different regularization techniques: the Tikhonov method, the FPCA approach of Bosq (2000), the Functional Partial Least Squares (FPLS) and the Functional Landweber-Fridman (FLF) technique.
4.1 Functional Principal Component Analysis
This approach is one of the most popular methods and was proposed by Bosq (2000) in order to estimate the autoregressive operator on a finite subspace of H. Since the operator K is symmetric and positive, its kernel admits the spectral representation
\[
K(s,t) = \sum_{j=1}^{\infty} \lambda_j v_j(s) v_j(t).
\]
Similarly, the kernel of Ψ can be expanded as
\[
\Psi(s,t) = \sum_{j,k=1}^{\infty} \Psi_{jk}\, v_j(s) v_k(t),
\qquad
\Psi_{jk} = \int_0^1\!\!\int_0^1 \Psi(s,t)\, v_j(s) v_k(t)\,ds\,dt.
\]
Moreover, using the Karhunen-Loève representation of each curve,
\[
E[\langle X_n, v_k\rangle X_{n+1}(t)] = \lambda_k \sum_{j=1}^{\infty} \Psi_{jk}\, v_j(t),
\]
so that
\[
\Psi(s,t) = \sum_{k=1}^{\infty} \frac{E[\langle X_n, v_k\rangle X_{n+1}(t)]}{\lambda_k}\, v_k(s). \tag{8}
\]
If m principal components are selected for the procedure, then the truncated version of Ψ is given by
\[
\Psi_m(s,t) = \sum_{j,k=1}^{m} \Psi_{jk}\, v_j(s) v_k(t). \tag{9}
\]
Since the Ψjk and (vj)j≥1 are unknown, Ψ is consistently estimated using the sample data [X1, ..., XN], and we obtain
\[
\hat{\Psi}_m(s,t) = \sum_{j,k=1}^{m} \hat{\Psi}_{jk}\, \hat{v}_j(s)\hat{v}_k(t), \tag{10}
\]
with
\[
\hat{\Psi}_{jk} = \frac{1}{\hat{\lambda}_k}\,\frac{1}{N-1}\sum_{n=1}^{N-1} \langle X_n, \hat{v}_k\rangle \langle X_{n+1}, \hat{v}_j\rangle. \tag{11}
\]
Equivalently,
\[
\hat{\Psi}_m(s,t) = \frac{1}{N-1}\sum_{n=1}^{N-1}\sum_{j=1}^{m} \frac{\langle X_n, \hat{v}_j\rangle}{\hat{\lambda}_j}\, \hat{v}_j(s)\, X_{n+1}(t), \tag{12}
\]
and
\[
\hat{\Psi}^{*}_m(f) = \frac{1}{N-1}\sum_{j=1}^{N-1}\sum_{n=1}^{N-1} \frac{\hat{Q}_{m,j}}{\hat{\lambda}_j}\, \langle f, X_n\rangle \langle X_{n+1}, \hat{v}_j\rangle\, \hat{v}_j, \quad \text{for each } f \in H, \tag{13}
\]
where the filter factor is \(\hat{Q}_{m,j} = \mathbf{1}\{j \le m\}\).
This procedure was also considered by Crambes et al. (2013) for the i.i.d. model. Another configuration of the FPCA is the one proposed by Imaizumi & Kato (2018). Their approach consists of projecting the predictor and the response functions on the first m principal components, then using the vectors of scores to estimate the Fourier coefficients of the autoregressive operator; the estimated autoregressive operator is then obtained by expanding it on the basis of the m eigenfunctions of the covariance operator, with the estimated scores as Fourier coefficients.
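On the grid, the truncated estimator of equations (10)-(12) amounts to a spectral cut-off. A sketch assuming K_hat and D_star_hat are (T × T) discretizations of K̂ and D̂* (their construction is taken as given here), with m the truncation parameter:

```python
import numpy as np

def fpca_estimator(K_hat, D_star_hat, dt, m):
    """FPCA / spectral cut-off estimator in the spirit of eq. (12):
    Psi_hat_m(s, t) = sum_{j<=m} b_j(t) v_j(s) / lam_j,
    with b_j(t) = (1/(N-1)) sum_n <X_n, v_j> X_{n+1}(t)."""
    lam, v = np.linalg.eigh(K_hat * dt)
    order = np.argsort(lam)[::-1][:m]              # keep the m largest eigenvalues
    lam, v = lam[order], v[:, order] / np.sqrt(dt)
    b = D_star_hat @ v * dt                        # column j holds b_j on the grid
    return (v / lam) @ b.T                         # kernel Psi_hat indexed [s, t]
```

The choice of m trades bias against variance: the small eigenvalues in the denominator are exactly the source of the instability discussed above, and truncation discards them.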
4.2 Tikhonov Method
This technique is widely used in the inverse problem literature. It has been studied recently by Benatia et al. (2017) in the context of fully functional regression; they derive consistency and asymptotic normality results for the estimated operator.
Let α be a positive tuning parameter used for the estimation, such that it converges to zero as n goes to infinity. Then the regularized autoregressive operator is
\[
\Psi^{*}_{\alpha} = \left(\alpha I + K\right)^{-1} D^{*}, \tag{14}
\]
estimated by
\[
\hat{\Psi}^{*}_{\alpha} = \left(\alpha I + \hat{K}\right)^{-1} \hat{D}^{*}. \tag{15}
\]
This estimator can also be characterized in terms of the spectral system of the covariance operator K̂ as follows:
\[
\hat{\Psi}^{*}_{\alpha}(s,t) = \sum_{j=1}^{N-1} \frac{\hat{Q}_{\alpha,j}}{\hat{\lambda}_j}\left(\frac{1}{N-1}\sum_{n=1}^{N-1} \langle X_n, \hat{v}_j\rangle X_{n+1}(t)\right)\hat{v}_j(s), \tag{16}
\]
and
\[
\hat{\Psi}^{*}_{\alpha}(f) = \frac{1}{N-1}\sum_{j=1}^{N-1}\sum_{n=1}^{N-1} \frac{\hat{Q}_{\alpha,j}}{\hat{\lambda}_j}\, \langle f, X_n\rangle \langle X_{n+1}, \hat{v}_j\rangle\, \hat{v}_j, \quad \text{for each } f \in H, \tag{17}
\]
where \(\hat{Q}_{\alpha,j} = \hat{\lambda}_j/(\hat{\lambda}_j + \alpha)\) is called the filter factor. The hard truncation operated by the FPCA is thus replaced by a smooth downweighting of the components associated with small eigenvalues.
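On the grid, the Tikhonov estimator of equation (15) is a single linear solve; the filter factor λ̂_j/(λ̂_j + α) arises automatically from the matrix inverse. A sketch under the same hypothetical discretization as before (K_hat and D_star_hat are precomputed (T × T) kernels):

```python
import numpy as np

def tikhonov_estimator(K_hat, D_star_hat, dt, alpha):
    """Eq. (15) on the grid: Psi*_alpha = (alpha I + K_hat)^{-1} D*_hat.
    K acts on a curve f as K_hat @ f * dt, so alpha*I + K becomes a matrix."""
    A = alpha * np.eye(K_hat.shape[0]) + K_hat * dt
    # Applying A^{-1} to each column of the D*_hat kernel gives the composed kernel
    return np.linalg.solve(A, D_star_hat)
```

Larger α means stronger damping (and more bias); as α → 0 the estimator approaches the unstable naive inverse.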
4.3 Landweber-Fridman Method
The Landweber-Fridman method is basically an iterative method, also very popular in the inverse problem literature. Consider a positive parameter d such that 0 < ||K||_L < 1/d. Then the FLF estimator can be computed iteratively as follows. Take the initial value
\[
\Psi^{*}_{0}(f) = d\, D^{*}(f), \quad \text{for each } f \in H,
\]
and iterate \(\Psi^{*}_{l}(f) = (I - dK)\,\Psi^{*}_{l-1}(f) + d\,D^{*}(f)\), where m is the maximum number of iterations. After m − 1 iterations, the estimated autoregressive operator is
\[
\Psi^{*}_{m}(f) = d \sum_{l=1}^{m} (I - dK)^{l-1} D^{*}(f), \quad \text{for each } f \in H. \tag{18}
\]
Since the operators K and D are not observed, they are consistently estimated by K̂ and D̂ respectively, giving
\[
\hat{\Psi}^{*}_{m}(f) = d \sum_{l=1}^{m} (I - d\hat{K})^{l-1} \hat{D}^{*}(f), \quad \text{for each } f \in H. \tag{19}
\]
This estimator can be written in terms of the eigensystem of the covariance operator K̂ as follows:
\[
\hat{\Psi}^{*}_{m}(s,t) = \sum_{j=1}^{N-1} \frac{\hat{Q}_{m,j}}{\hat{\lambda}_j}\left(\frac{1}{N-1}\sum_{n=1}^{N-1} \langle X_n, \hat{v}_j\rangle X_{n+1}(t)\right)\hat{v}_j(s), \tag{20}
\]
and
\[
\hat{\Psi}^{*}_{m}(f) = \frac{1}{N-1}\sum_{j=1}^{N-1}\sum_{n=1}^{N-1} \frac{\hat{Q}_{m,j}}{\hat{\lambda}_j}\, \langle f, X_n\rangle \langle X_{n+1}, \hat{v}_j\rangle\, \hat{v}_j, \quad \text{for each } f \in H, \tag{21}
\]
where \(\hat{Q}_{m,j} = 1 - (1 - d\hat{\lambda}_j)^m\) is the filter factor.
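The FLF recursion is straightforward to implement on the grid, and the closed-form filter factor 1 − (1 − dλ̂_j)^m of equation (21) can be recovered exactly on a toy diagonal example. A sketch under the same hypothetical discretization as before:

```python
import numpy as np

def flf_estimator(K_hat, D_star_hat, dt, d, m):
    """Landweber-Fridman iteration: start at Psi_0 = d D*, then
    Psi_l = (I - dK) Psi_{l-1} + d D*; after m-1 steps this equals
    d * sum_{l=1}^{m} (I - dK)^{l-1} D*, i.e. eq. (19) on the grid."""
    Kop = K_hat * dt                 # discretized action of the operator K
    Psi = d * D_star_hat
    for _ in range(m - 1):
        Psi = Psi - d * (Kop @ Psi) + d * D_star_hat
    return Psi

# Toy check on a diagonal "operator" (dt = 1): on an eigendirection with
# eigenvalue lam, the output equals (1 - (1 - d*lam)^m) / lam times the D* component
P = flf_estimator(np.diag([0.5, 0.2]), np.eye(2), dt=1.0, d=1.0, m=3)
print(P[0, 0])   # 1.75 == (1 - (1 - 0.5)**3) / 0.5
```

Because only matrix-vector products with K̂ are needed, this iterative scheme avoids any explicit inversion or eigendecomposition.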
4.4 Functional Partial Least Squares
The Functional Partial Least Squares method is based on the idea that the estimator obtained by the FPCA method is not directly tailored to the prediction problem: the extracted principal components may not capture the information most relevant for the prediction. The FPLS may be better adapted in the sense that it extracts the factors that best explain the relation between the predictand and the predictor functions. This method is very popular in the chemometrics field and has been discussed by authors such as Wold et al. (1984), Helland (1988) and Höskuldsson (1988). It was recently introduced in the econometrics field by Groen & Kapetanios (2009), Kelly & Pruitt (2015) and Carrasco & Rossi (2016). In the functional regression context, contributions include Aguilera et al. (2010) and Delaigle et al. (2012).
Practically, for the model setting of this paper, the idea is to identify a new factor \(t_h = \int_0^1 X_n(s) v_h(s)\,ds\) at each step h = 1, ..., m such that the covariance with the response function is maximized:
\[
\max_{v_h, c_h \in L^2([0,1])} \operatorname{cov}^2\!\left(\int_0^1 X_n(s) v_h(s)\,ds,\; \int_0^1 X_{n+1}(t)\, c_h(t)\,dt\right),
\]
where v₁, ..., v_{h−1}, c₁, ..., c_{h−1} have already been obtained in the h − 1 previous steps. By using an extension of the Alternative Partial Least Squares (APLS) approach proposed by Delaigle et al. (2012), the estimated operator is
\[
\hat{\Psi}^{*}_{m}(s,t) = \sum_{l=1}^{m} \hat{\gamma}_{t,l}\, \hat{K}^{l-1}(\hat{D})(s,t)
= \sum_{l=1}^{m} \hat{\gamma}_{t,l} \int_0^1 \hat{K}^{l-1}(s,u)\, \hat{C}_1(u,t)\,du \tag{23}
\]
for each s, t ∈ [0,1], where for each t, \(\hat{\gamma}_t = \hat{R}_t^{-1}\hat{\mu}_t\) is a vector of size m, and \(\hat{R}_t\) is an (m × m) matrix with entries
\[
\hat{R}_{t,j,l} = \int_0^1\!\!\int_0^1 \hat{D}^{*}(t,u)\, \hat{K}^{j+l-1}(u,s)\, \hat{D}^{*}(s,t)\,du\,ds, \tag{24}
\]
\[
\hat{\mu}_{t,l} = \int_0^1\!\!\int_0^1 \hat{D}^{*}(t,u)\, \hat{K}^{l-1}(u,s)\, \hat{D}^{*}(s,t)\,du\,ds. \tag{25}
\]
This estimator can be written in terms of the eigensystem of the empirical covariance operator K̂ as follows:
\[
\hat{\Psi}^{*}_{m}(s,t) = \sum_{j=1}^{N-1} \frac{\hat{Q}_{m,j}}{\hat{\lambda}_j}\left(\frac{1}{N-1}\sum_{n=1}^{N-1} \langle X_n, \hat{v}_j\rangle X_{n+1}(t)\right)\hat{v}_j(s), \tag{26}
\]
and
\[
\hat{\Psi}^{*}_{m}(f) = \frac{1}{N-1}\sum_{j=1}^{N-1}\sum_{n=1}^{N-1} \frac{\hat{Q}_{m,j}}{\hat{\lambda}_j}\, \langle f, X_n\rangle \langle X_{n+1}, \hat{v}_j\rangle\, \hat{v}_j, \quad \text{for each } f \in H, \tag{27}
\]
where
\[
\hat{Q}_{m,j} = 1 - \prod_{l=1}^{m}\left(1 - \frac{\hat{\lambda}_j}{\hat{\theta}_l}\right)
\]
is the filter factor and \(\hat{\theta}_1 > \hat{\theta}_2 > ... > \hat{\theta}_m > 0\) are the eigenvalues of the matrix R̂.
Therefore, considering the previous results, the estimated autoregressive operator can be summarized, with δ denoting the relevant tuning parameter, as
\[
\hat{\Psi}^{*}_{\delta}(s,t) = \sum_{j=1}^{N-1} \frac{\hat{Q}_{\delta,j}}{\hat{\lambda}_j}\left(\frac{1}{N-1}\sum_{n=1}^{N-1} \langle X_n, \hat{v}_j\rangle X_{n+1}(t)\right)\hat{v}_j(s), \tag{28}
\]
and
\[
\hat{\Psi}^{*}_{\delta}(f) = \frac{1}{N-1}\sum_{j=1}^{N-1}\sum_{n=1}^{N-1} \frac{\hat{Q}_{\delta,j}}{\hat{\lambda}_j}\, \langle f, X_n\rangle \langle X_{n+1}, \hat{v}_j\rangle\, \hat{v}_j, \quad \text{for each } f \in H, \tag{29}
\]
where the filter factor is \(\hat{Q}_{\delta,j} \equiv Q(\delta, \hat{\lambda}_j)\), with
\[
Q(m, \hat{\lambda}_j) = 1 - \prod_{l=1}^{m}\left(1 - \frac{\hat{\lambda}_j}{\hat{\theta}_l}\right) \quad \text{for the FPLS method.}
\]
It can also be noticed that if \(\hat{\theta}_l = \hat{\theta}_r = \hat{\theta}_0\) for each l, r = 1, ..., m, then the FPLS is almost identical to the FLF with \(d = 1/\hat{\theta}_0\).
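The four filter factors and the FPLS/FLF equivalence noted above can be checked numerically. A sketch with hypothetical eigenvalues and tuning parameters (the FPCA indicator form reflects its hard truncation):

```python
import numpy as np

def filter_factor(method, lam, **kw):
    """Filter factors Q(delta, lam_j) for the four regularization methods."""
    if method == "FPCA":   # hard truncation: keep the first m components
        return (np.arange(1, lam.size + 1) <= kw["m"]).astype(float)
    if method == "FT":     # Tikhonov: lam / (lam + alpha)
        return lam / (lam + kw["alpha"])
    if method == "FLF":    # Landweber-Fridman: 1 - (1 - d*lam)^m
        return 1.0 - (1.0 - kw["d"] * lam) ** kw["m"]
    if method == "FPLS":   # 1 - prod_l (1 - lam / theta_l)
        return 1.0 - np.prod(1.0 - lam[:, None] / kw["theta"][None, :], axis=1)
    raise ValueError(method)

lam = np.array([1.0, 0.3, 0.05, 0.01])
# With all FPLS eigenvalues theta_l equal to theta_0, FPLS reduces to FLF with d = 1/theta_0
q_pls = filter_factor("FPLS", lam, theta=np.full(4, 2.0))
q_flf = filter_factor("FLF", lam, d=0.5, m=4)
print(np.allclose(q_pls, q_flf))   # True
```

All four factors approach 1 on large eigenvalues and shrink toward 0 on small ones; they differ only in how sharply that transition happens.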
Given the estimated autoregressive operator Ψ̂_m, the best prediction of the one-day-ahead S&P 500 shape is
\[
\hat{X}_{n+1}(t) = \int_0^1 \hat{\Psi}_m(s,t)\, X_n(s)\,ds, \quad \text{for each } t \in [0,1]. \tag{31}
\]
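On the grid, the prediction step of equation (31) is a single quadrature. A sketch, with Psi_hat a hypothetical (T × T) estimated kernel indexed [s, t] as in the estimators above:

```python
import numpy as np

def predict_next_day(Psi_hat, x_today, dt):
    """Eq. (31): X_hat_{n+1}(t) = int_0^1 Psi_hat(s, t) X_n(s) ds, rectangle rule."""
    return Psi_hat.T @ x_today * dt

# Toy usage with a constant placeholder kernel (illustration only)
T = 78
dt = 1.0 / (T - 1)
x_hat = predict_next_day(np.full((T, T), 0.1), np.ones(T), dt)
```

The same function works for any of the four estimators, since they all produce a kernel on the same grid; only the regularization used to build Psi_hat differs.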
5 Asymptotic Results
This section is dedicated to studying the rate of convergence of the estimator Ψ̂ and of the predicted functions X̂n+1 in the context where the eigenvalues of the covariance operator K are bounded and decline gradually to zero. This situation is analyzed because, as far as we are concerned, it encompasses most of the practical case studies in the economic and financial fields. For this purpose, the following assumptions are considered.
Assumption 1 (A1): {X1, ..., XN} is a sequence of zero-mean, square-integrable functions following a functional autoregressive process with E[||Xn||⁴] < +∞, and there exists an integer k₀ ≥ 1 such that \(\|\Psi^{k_0}\|_L < 1\).
Assumption 2 (A2): The innovations εn form an H-valued strong white noise, with E[||εn||² | X] = σ² and E[||Xn||⁴] < ∞. (It can also be assumed that εn is a stationary martingale difference sequence with respect to {εn−1(t), εn−2(t), ..., Xn−1(t), Xn−2(t), ...}.)
that
Ψ∗ = K β/2 R
∞
X < Ψ∗ (f ), vj >2
< +∞ F or all f ∈ H.
j=1 λβj
Assumption 4 (A4): The eigenvalues of the covariance operator K and of the estimated one \hat{K} are distinct: \lambda_1 > \lambda_2 > ... > 0 and \hat{\lambda}_1 > \hat{\lambda}_2 > ... > \hat{\lambda}_N > 0.
Assumption 5 (A5): n\lambda_m \to \infty.

Assumption 1 ensures that the sequence {X_n} is a stationary process and that the model admits a unique solution. Assumption 2 imposes that the innovations \varepsilon_n are homoskedastic and ensures that the operators K and D^* are well defined. Furthermore, since E[||X_n||^4] < +\infty, the operator K is trace-class and thereby Hilbert-Schmidt. Assumption 3 is a source condition ensuring that the Fourier coefficients \langle \Psi^*(f), v_j \rangle go to zero sufficiently fast relative to the eigenvalues \lambda_j^{\beta} as j goes to infinity. This condition guarantees that \Psi^* belongs to the orthogonal complement of the null space of the operator K. The larger \beta is, the smoother \Psi^*(f) is (see Carrasco et al. (2007), Benatia et al. (2017)). This assumption is needed to control the bias term and to show that the bias depends on \beta. It differs from the assumptions of Imaizumi & Kato (2018) and Crambes et al. (2013): their assumptions are more restrictive and bear on the decreasing rate of the eigenvalues \lambda_j, so this source condition is more general. Assumption 4 imposes that the eigenvalues \lambda_j are distinct and consistently estimated by \hat{\lambda}_j. Assumption 5 is the sufficient condition under which the expected estimation error goes to zero; it is needed to control the estimation error so as to keep a good balance between underfitting and overfitting for the FPCA and FPLS estimation methods.
Let us denote by \Psi^*_\delta the regularized version of \Psi^*, where \delta is \alpha for the FT method and m for the FPCA, FPLS and FLF methods. Then, for each function f \in H, \Psi^*_\delta can be written as

\Psi^*_\delta(f) = \sum_{j=1}^{\infty} \frac{Q_{\delta,j}}{\lambda_j} \langle D^*(f), v_j \rangle v_j
= \sum_{j=1}^{\infty} \frac{Q_{\delta,j}}{\lambda_j} \langle K(\Psi^*)(f), v_j \rangle v_j
= \sum_{j=1}^{\infty} Q_{\delta,j} \langle \Psi^*(f), v_j \rangle v_j
The term {\Psi^*_\delta(f) - \Psi^*(f)} represents the bias, which goes to zero as \delta increases, while {\hat{\Psi}^*_\delta(f) - \Psi^*_\delta(f)} is the estimation error. The conditional mean squared error is

MSE = E\left[ \|\hat{\Psi}_\delta - \Psi\|^2 \,\middle|\, X_1, ..., X_n \right].

For the FPCA and FPLS methods,

MSE_{FPCA} = O_p\!\left( \frac{m}{\lambda_m n} \right) + O_p\!\left( \lambda_{m+1}^{\beta} \right), \qquad (32)

MSE_{FPLS} = O_p\!\left( \frac{m}{\theta_m n} \right) + O_p\!\left( \lambda_{m+1}^{\beta} \right), \qquad (33)

where \theta_m is the smallest root of the residual polynomial \bar{Q}_{m,j}. The first O_p term represents the estimation error (variance) and the second the squared bias. For the FLF method,

MSE_{FLF} = O_p\!\left( \frac{m^2}{n} \right) + O_p\!\left( m^{-2\beta} \right). \qquad (34)

For the FT method, if \beta > 1 then

MSE_{FT} = O_p\!\left( \frac{1}{n\alpha} \right) + O_p\!\left( \alpha^{\min\{\beta,2\}} \right), \qquad (35)

and if \beta < 1,

MSE_{FT} = O_p\!\left( \frac{\alpha^{\beta}}{n\alpha^2} \right) + O_p\!\left( \alpha^{\beta} \right). \qquad (36)
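The key step behind these expressions is the spectral identity used in the display above: for the self-adjoint covariance operator, \langle K g, v_j \rangle = \lambda_j \langle g, v_j \rangle, so dividing by \lambda_j removes K on each component. A finite-dimensional sanity check, with a symmetric positive definite matrix standing in for K:

```python
import numpy as np

# Finite-dimensional analogue of <K g, v_j> / lambda_j = <g, v_j>,
# which is why the factor Q_{delta,j}/lambda_j "undoes" K on
# component j; K here is a random symmetric positive definite matrix.
rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
K = A @ A.T + np.eye(6)          # symmetric positive definite "covariance"
lam, V = np.linalg.eigh(K)
g = rng.standard_normal(6)
errs = [abs((K @ g) @ V[:, j] / lam[j] - g @ V[:, j]) for j in range(6)]
```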
Remarks:

• The rates of convergence of FPCA and FPLS depend on the decreasing rate of the eigenvalues of the covariance operator (\lambda_m). Indeed, for the FPLS approach, the smallest eigenvalue of the Hankel matrix (\theta_m) is such that \theta_m < \lambda_m (see for instance Lingjaerde & Christophersen (2000)). Furthermore, since \theta_m decreases at an exponential rate (see Berg & Szwarc (2011)), the FPLS is most of the time expected to present a high estimation error of the autoregressive operator. Moreover, if the eigenvalues do not decrease very smoothly, FPCA and FPLS may underestimate the autoregressive operator.
• In contrast to the FPCA and FPLS methods, the rates of convergence of the FLF and FT do not depend on the decreasing configuration of the eigenvalues. Moreover, due to the saturation property⁴ of the FT method, the FLF approach (which can be viewed as an iterative version of the FT method) should be preferred to FT (see Carrasco et al. (2007)). This pattern should be confirmed by the simulation results.
• Propositions 1 and 2 show that as m increases the squared bias term decreases while the variance increases. Then m should be chosen optimally, i.e. such that the squared bias equals the variance. At the optimum:

• If m \sim n^{1/(2+2\beta)}, then MSE_{FLF} \sim n^{-\beta/(1+\beta)}.

• For \beta > 1, if \alpha \sim n^{-1/(2+2\beta)}, then MSE_{FT} \sim n^{-\min\{\beta,2\}/(1+\min\{\beta,2\})}.
• The upper bounds derived for FPCA and FPLS are more general. In particular, the upper bound obtained for the FPCA differs from the one of Imaizumi & Kato (2018), since their assumptions on the decreasing rate of the eigenvalues \lambda_j are more restrictive. The assumption considered by Crambes et al. (2013) is also different and more restrictive than the one considered in this paper, although the same upper bound is obtained. To say more about the optimal number of functional components m for the FPCA and FPLS, it is necessary to set additional assumptions on the decreasing rate of the eigenvalues.
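The FLF rate claimed in the remarks can be checked directly: plugging m \sim n^{1/(2+2\beta)} into the two terms of (34) balances them,

```latex
m \sim n^{1/(2+2\beta)} \;\Longrightarrow\;
\frac{m^{2}}{n} = n^{\frac{2}{2+2\beta}-1} = n^{-\frac{\beta}{1+\beta}},
\qquad
m^{-2\beta} = n^{-\frac{2\beta}{2+2\beta}} = n^{-\frac{\beta}{1+\beta}},
```

so both the variance and the squared-bias terms are of order n^{-\beta/(1+\beta)}, which is the stated rate.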
Proposition 3: Assume that A1 to A4 hold for the FLF and FT (A1 to A5 for the FPCA). If E[||X_i||^4] < \infty and E[||X_i||^2 ||\varepsilon_i||^2] < \infty, then

\sqrt{n}\,(\hat{\Psi}^*_\delta - \Psi^*_\delta) \to N(0, \Omega_\delta), \qquad (37)

where \delta = m for the FPCA and FLF, \delta = \alpha for the FT, and

\Omega_\delta = E\Big[ \big( (\varepsilon + K_\delta \circ \Psi_\delta(X_i)) \otimes K_\delta(X_i) \big) \,\bar{\otimes}\, \big( (\varepsilon + K_\delta \circ \Psi_\delta(X_i)) \otimes K_\delta(X_i) \big) \Big] - E\big[ K_\delta(\Psi^*) \otimes E(X_i) \big] \,\bar{\otimes}\, E\big[ K_\delta(\Psi^*) \otimes E(X_i) \big]. \qquad (38)

⁴ See chapter 6 of Engl et al. (1996) concerning the saturation property of the Ridge regression.
6 Simulation Results
This section is devoted to comparing the performance of the described estimation methods in a finite-sample context. The comparison is made through Monte Carlo simulations, focusing on the mean squared error of the estimated autoregressive operator and the mean squared prediction error. The data-generating process is

X_{n+1}(t) = \int_0^1 \Psi(t,s)\, X_n(s)\, ds + \varepsilon_{n+1}(t), \quad n = 1, ..., N. \qquad (39)
The three error processes used by Didericksen et al. (2012), \varepsilon^{(1)}(t), \varepsilon^{(2)}(t) and \varepsilon^{(3)}(t), are considered. The Brownian motion W is approximated on a grid by

W(b/B) = \frac{1}{\sqrt{B}} \sum_{\ell=1}^{b} Z_\ell, \quad b = 1, ..., B, \qquad (40)

and

\varepsilon^{(2)}(t) = \xi_1 \sqrt{2}\sin(2\pi t) + \xi_2 \sqrt{2}\,\kappa \cos(2\pi t), \qquad (41)

where \xi_1 and \xi_2 are two independent normally distributed variables and \kappa is a constant. \varepsilon^{(1)}(t) is an infinite series expansion (built from the Brownian motion, with bridge W(t) - tW(1)), \varepsilon^{(2)}(t) is a finite series expansion, and \varepsilon^{(3)}(t) is a combination of the previous two, with a \in [0,1] a constant weighting the strength of the two components.
The theoretical autoregressive operator \Psi is an integral operator mapping from L^2([0,1]) to L^2([0,1]). Several kernels are considered (Gaussian, identity, and sloping planes in s and t); for instance,

Sloping plane (s): \Psi(s,t) = Cs, with (s,t) \in [0,1]^2 and C a constant used to normalize the autoregressive operator.
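A minimal sketch of this data-generating process with the sloping-plane kernel and Brownian-motion innovations. Here the Hilbert-Schmidt norm is normalized to 0.5, an assumption made for simplicity since the paper's operator-norm normalization is not reproduced; all names are ours.

```python
import numpy as np

trapz = getattr(np, "trapezoid", np.trapz)   # NumPy 2.x renamed np.trapz

# Sketch of the DGP in eq. (39) with the sloping-plane kernel
# Psi(t, s) = C s, scaled here so its Hilbert-Schmidt norm is 0.5
# (an assumption -- the paper normalizes an operator norm ||Psi||_L).
rng = np.random.default_rng(2)
T = 200
grid = np.linspace(0.0, 1.0, T)
Psi = np.tile(grid[None, :], (T, 1))         # Psi[t, s] = s before scaling
hs = np.sqrt(trapz(trapz(Psi ** 2, grid, axis=1), grid))
Psi *= 0.5 / hs                              # now ||Psi||_HS = 0.5

def eps_bm(T, rng):
    # epsilon^(1): Brownian motion on the grid, W(b/B) = sum_l Z_l / sqrt(B)
    return rng.standard_normal(T).cumsum() / np.sqrt(T)

N = 100
X = np.zeros((N, T))
X[0] = eps_bm(T, rng)
for n in range(N - 1):
    # X_{n+1}(t) = int_0^1 Psi(t, s) X_n(s) ds + eps_{n+1}(t)
    X[n + 1] = trapz(Psi * X[n][None, :], grid, axis=1) + eps_bm(T, rng)
```

Because the operator norm is below 1, the simulated functional time series is stationary, as required by Assumption 1.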
Two operator norms are considered: ||\Psi||_L = 0.5 and ||\Psi||_L = 0.8. The interval [0,1] is discretized into 1000 equally spaced points. Four sample sizes N are considered: 50, 100, 200 and 500. For numerical integration, the trapezoidal rule is used for all operations in the simulations and the real-data applications. A good estimation and prediction of the model depend on the choice of the tuning parameter, that is, the number of principal components m for the FPCA and FPLS methods, the number of iterations m for the FLF, and the regularization parameter \alpha for the FT. These parameters are chosen so as to minimize the MSE and the MSPE. Predictive cross-validation (PCV) is used within the Monte Carlo experiments to choose the regularization parameter: the optimal tuning parameter is the one that best predicts the discarded observation curves after estimation.
PCV(\delta) = \frac{1}{N-1} \sum_{n=1}^{N-1} \left\| X_{n+1} - \hat{\Psi}_\delta(X_n) \right\|^2,

with \delta equal to m for the FPCA, FPLS and FLF, and to \alpha for the FT. The criterion measures the prediction error of a new curve at time n.
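The PCV selection can be sketched as follows, using a simple FPCA (spectral-cutoff) estimator on synthetic data; the train/hold-out split and all helper names are illustrative assumptions.

```python
import numpy as np

trapz = getattr(np, "trapezoid", np.trapz)   # NumPy 2.x renamed np.trapz

# Sketch of predictive cross-validation (PCV): for each candidate m,
# estimate the operator by FPCA on a training block and score the
# one-step-ahead predictions on held-out pairs.  Data are synthetic.
rng = np.random.default_rng(3)
T, N = 80, 150
t = np.linspace(0.0, 1.0, T)
X = rng.standard_normal((N, T)).cumsum(axis=1) / np.sqrt(T)

def estimate_psi(X, t, m):
    w = t[1] - t[0]
    lam, V = np.linalg.eigh(X[:-1].T @ X[:-1] / (len(X) - 1) * w)
    lam, V = lam[::-1], V[:, ::-1] / np.sqrt(w)      # descending, L2-normed
    Psi = np.zeros((T, T))
    for j in range(m):
        score = trapz(X[:-1] * V[:, j][None, :], t, axis=1)  # <X_n, v_j>
        Psi += np.outer(V[:, j],
                        (score[:, None] * X[1:]).mean(axis=0)) / lam[j]
    return Psi                                        # Psi_hat(s, tau)

def pcv(X, t, m, n_train=100):
    Psi = estimate_psi(X[:n_train], t, m)
    errs = []
    for n in range(n_train, len(X) - 1):
        pred = trapz(Psi * X[n][:, None], t, axis=0)  # integrate over s
        errs.append(trapz((X[n + 1] - pred) ** 2, t))
    return float(np.mean(errs))

scores = {m: pcv(X, t, m) for m in range(1, 8)}
m_star = min(scores, key=scores.get)                  # selected tuning m
```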
To measure the prediction quality, two indicators are considered: the root integrated squared error (En) and the integrated absolute error (Rn). These criteria are given by

E_n = \sqrt{ \int_0^1 \left( \hat{X}_n(t) - X_n(t) \right)^2 dt },

R_n = \int_0^1 \left| \hat{X}_n(t) - X_n(t) \right| dt.

To analyze the estimation error, three criteria are considered: the mean squared error (MSE), the average distance (AD) and the ratio averaged distance (RAD). These criteria are given by

MSE = \sqrt{ \int_0^1 \int_0^1 \left( \hat{\Psi}(s,t) - \Psi(s,t) \right)^2 ds\, dt },

AD = \int_0^1 \int_0^1 \left| \hat{\Psi}(s,t) - \Psi(s,t) \right| ds\, dt,

RAD(\Psi) = \int_0^1 \int_0^1 \frac{ \left| \hat{\Psi}(s,t) - \Psi(s,t) \right| }{ \left| \Psi(s,t) \right| }\, ds\, dt.
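On a grid, these five criteria can be approximated with the trapezoidal rule, as elsewhere in the paper; the test surfaces below are arbitrary.

```python
import numpy as np

trapz = getattr(np, "trapezoid", np.trapz)   # NumPy 2.x renamed np.trapz

# The five evaluation criteria above on a grid.  psi/psi_hat are (T, T)
# surfaces over (s, t) and x/x_hat are curves over t.
t = np.linspace(0.0, 1.0, 101)

def en(x_hat, x, t):
    return np.sqrt(trapz((x_hat - x) ** 2, t))

def rn(x_hat, x, t):
    return trapz(np.abs(x_hat - x), t)

def mse(psi_hat, psi, t):
    return np.sqrt(trapz(trapz((psi_hat - psi) ** 2, t, axis=1), t))

def ad(psi_hat, psi, t):
    return trapz(trapz(np.abs(psi_hat - psi), t, axis=1), t)

def rad(psi_hat, psi, t):
    return trapz(trapz(np.abs(psi_hat - psi) / np.abs(psi), t, axis=1), t)

S, U = np.meshgrid(t, t, indexing="ij")
psi = 1.0 + S * U
psi_hat = psi + 0.1                           # constant error of size 0.1
```

A constant error of 0.1 over the unit square gives MSE = AD = 0.1 exactly, which is a quick way to validate the quadrature.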
The Gaussian kernel model with Brownian-motion innovations is considered first. In the small-sample case (N = 50), the FLF and FT methods outperform the others in terms of estimation, but in terms of prediction all the compared methods show about the same performance, except that the FPLS tends to be sensitive to the presence of functional outliers. Figure 3 displays these results. The fact that FLF and FT outperform the FPCA and FPLS can be explained by the fact that their convergence rates do not depend on the decreasing rate of the eigenvalues \lambda_j, while the others' do. The prediction error is almost the same for all methods because of the smoothing effect of the integral used to compute the predicted function. Furthermore, as the sample size increases, the bias and the estimation error are reduced.
Figure 3: Comparison of the different estimation techniques. Gaussian kernel with N = 50 and ε(1). [Panels: MSE, RAD, AD, Rn and En across FPCA, FPLS, FT and FLF.]
As the sample size grows large (N = 200 and N = 500), the same pattern is observed and clearly perceptible, but the dispersion is reduced for the FPLS method (see figure 4). Another reason for the gap observed between these techniques in terms of estimation performance is that the eigenvalues \lambda_j may decrease very quickly (potentially faster than an exponential rate). This pattern has also been noticed by Didericksen et al. (2012). In that situation, introducing a regularization parameter for each functional component of the FPCA and FPLS (or on the eigenvalues \lambda_j and \theta_j) should improve the results; those regularization parameters should also be optimally chosen.
Smooth innovations

Considering smooth innovation terms amounts to considering smooth data. In this case, for the Gaussian kernel model, the FPLS is still strongly affected by the functional outliers. The gap between the methods, in terms of both estimation and prediction, is not very pronounced, whatever the sample size considered. It may be necessary to handle those outliers explicitly in order to extract more information (see figures 5 and 6).

Figure 4: Comparison of the different estimation techniques. Gaussian kernel with N = 500 and ε(1). [Panels: MSE, RAD, AD, Rn and En across FPCA, FPLS, FT and FLF.]
Mixed innovations

The case of mixed innovations displays almost the same results as the previous ones. Indeed, if more weight is attributed to the smooth innovations, the results of the smooth-innovation case are obtained, while if the weight is attributed to the Brownian-motion innovation, the results of the Brownian-motion case are recovered. When considering a different kernel, for instance the identity kernel, the same pattern is observed in the context of Brownian-motion innovations for the estimation and prediction criteria considered: the gap is still observable in terms of estimation quality, while the prediction performances are almost the same (see figures 7, 16 and 17). When the sample size becomes very large (N = 500), the prediction error of the FPLS becomes smaller than that of the other methods (see figure 8).
Figure 5: Comparison of the different estimation techniques. Gaussian kernel with N = 50 and ε(2). [Panels: MSE, RAD, AD, Rn and En across FPCA, FPLS, FT and FLF.]
Furthermore, as the sample size increases, the bias and the estimation error are also reduced in terms of prediction. The same results are obtained for the slope kernel case, as can be seen in figures 9, 10 and 11.
7.1 Data

The S&P 500 index futures data is used to analyze intraday return predictability. The sample runs from 01/01/2008 to 12/31/2017 and is collected from the website www.backtestmarket.com. This sample is used in different parts of the analysis: to study return predictability, to identify the intraday momentum and its main causes, and to test the robustness of the results in several contexts, such as the volatility effect, the volume effect, the liquidity effect, the aftermath of the
Figure 6: Comparison of the different estimation techniques. Gaussian kernel with N = 200 and ε(2). [Panels: MSE, RAD, AD, Rn and En across FPCA, FPLS, FT and FLF.]
financial crisis, the macroeconomic news release effect (Federal Open Market Committee (FOMC) announcements, Consumer Price Index (CPI) and Gross Domestic Product (GDP) releases), and the infrequent rebalancing effect.
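For concreteness, here is a sketch of how the functional observations X_n(t), the intraday cumulative returns on a 5-minute grid, can be built. The price series is simulated and the column layout is an assumption, not the vendor's actual file format.

```python
import numpy as np

# Illustrative construction of the functional observations X_n(t):
# cumulative intraday log returns on a 5-minute grid.  The prices are
# simulated; with real 5-minute S&P 500 futures data the same
# reshaping applies (the layout here is our assumption).
rng = np.random.default_rng(4)
n_days, bars_per_day = 30, 273                 # 273 five-minute bars per day
log_prices = np.log(100.0) + 0.001 * rng.standard_normal(
    n_days * bars_per_day).cumsum()
log_prices = log_prices.reshape(n_days, bars_per_day)

# 5-minute log returns within each day, then cumulate from the open
intraday_ret = np.diff(log_prices, axis=1)
X = np.concatenate([np.zeros((n_days, 1)),
                    intraday_ret.cumsum(axis=1)], axis=1)  # X_n(0) = 0
```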
To start the empirical analysis, the simple functional autoregressive model is considered, where the current cumulative intraday market return is used to predict the next day's. The years considered are 2015-2017; the results are then tested on the other years of the database. The model is

X_{n+1}(t) = \Psi_0(t) + \int_0^1 \Psi(s,t)\, X_n(s)\, ds + \varepsilon_{n+1}(t), \quad n = 1, ..., 750. \qquad (43)

The sample size for this regression period is N = 750. This sample is split into two parts: the regression and prediction part (N_1 = 650 days) and the validation and testing part (N_2 = 100 days), used to choose the optimal tuning parameter for each estimation method. Each day is represented by 273 five-minute discretization points over the 24 hours of the day. Figure 12
Figure 7: Comparison of the different estimation techniques. Identity kernel with N = 50 and ε(1). [Panels: MSE, RAD, AD, Rn and En across FPCA, FPLS, FT and FLF.]
displays a contour plot representing how correlated the previous day's cumulative return shape is with the next day's on a 5-minute timeframe, that is, the estimated autoregressive operator \hat{\Psi}(s,t). The FT and FLF display almost identical estimates, while the FPCA and FPLS display different results. The yellow areas represent positive correlation, the blue areas negative correlation. The first opening hour of the previous trading day contributes positively to the next day's momentum, while the last hour of the previous day predicts the next-day return positively in the first half and negatively in the second half of the next trading day. This momentum can be explained by many different causes.
The next step is to plot the predictive functional R-squared. From figure 13, it is easy to see that the most important time of the day to buy stocks is around 9:30 AM - 10:30 AM. The result is similar across the estimation methods. The FPLS tends to reach a remarkable value of 8% at the beginning and the end of the trading session, which is almost 4 times the value obtained by Gao et al. (2018) and twice the one obtained by Zhang et al. (2019). The FPCA and FT methods reach a predictive R² of 5% and 6% respectively in the morning (nearly twice the value obtained by Gao et al. (2018)), and the FT approach reaches nearly 2.5% at the end of the trading session.
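One standard pointwise definition of the predictive functional R-squared is R²(t) = 1 − SSE(t)/SST(t), computed separately at each time of day. The paper's exact formula is not reproduced here, so this definition is an assumption, and the arrays below are synthetic.

```python
import numpy as np

# Pointwise predictive functional R-squared, R2(t) = 1 - SSE(t)/SST(t);
# this definition is our assumption, and the forecasts are synthetic.
rng = np.random.default_rng(5)
n_days, T = 100, 273
actual = rng.standard_normal((n_days, T)).cumsum(axis=1)
pred = actual + 0.5 * rng.standard_normal((n_days, T))   # imperfect forecasts

def functional_r2(actual, pred):
    sse = ((actual - pred) ** 2).sum(axis=0)             # per time point t
    sst = ((actual - actual.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - sse / sst

r2_t = functional_r2(actual, pred)                       # curve over the day
```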
Figure 8: Comparison of the different estimation techniques. Identity kernel with N = 500 and ε(1). [Panels: MSE, RAD, AD, Rn and En across FPCA, FPLS, FT and FLF.]
Surprisingly, the FPCA is not able to capture the momentum at the end of the trading session. According to the FLF approach, the R² is around 2.5% at the beginning of the trading session, and lower in the second half. The next step is to examine how this model performs out-of-sample, in order to assess its stability to irregular events and to identify the main causes of momentum in such settings. The new model is

X_{n+1}(t) = \Psi_0(t) + \int_0^1 \Psi(s,t)\, X_n(s)\, ds + \rho_1 I_{e,n} + \rho_2 I_{e,n} R_n + \varepsilon_{n+1}(t), \quad n = 1, ..., 750, \qquad (44)

where I_{e,n} is a dummy variable taking the value 1 if the event of interest happened on day n and 0 otherwise, and R_n is the cumulative return at the close of day n. \rho_2 captures the interaction effect between I_{e,n} and R_n. I_{e,n} can indicate whether day n featured a high-impact news release, belonged to the financial crisis, or was a high-volume or high-volatility day.
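The scalar part of equation (44) can be assembled as follows; the event days and daily returns below are simulated placeholders.

```python
import numpy as np

# Sketch of the scalar covariates in eq. (44): the event dummy I_{e,n}
# and its interaction with the daily closing cumulative return R_n.
# Event days and returns are simulated placeholders, not real dates.
rng = np.random.default_rng(6)
n_days = 750
R = 0.01 * rng.standard_normal(n_days)        # cumulative return at the close
event_days = {10, 97, 250, 603}               # e.g. FOMC announcement days
I = np.array([1.0 if n in event_days else 0.0 for n in range(n_days)])
interaction = I * R                           # the rho_2 regressor

# stack with an intercept for the scalar part of the regression
Z = np.column_stack([np.ones(n_days), I, interaction])
```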
Figure 9: Comparison of the different estimation techniques. Slope kernel (s) with N = 50 and ε(1). [Panels: MSE, RAD, AD, Rn and En across FPCA, FPLS, FT and FLF.]
8 Conclusion
Figure 10: Comparison of the different estimation techniques. Slope kernel (s) with N = 100 and ε(1). [Panels: MSE, RAD, AD, Rn and En across FPCA, FPLS, FT and FLF.]
9 Appendices
(To be completed)

9.4 Graphs
Figure 11: Comparison of the different estimation techniques. Slope kernel (s) with N = 200 and ε(1). [Panels: MSE, RAD, AD, Rn and En across FPCA, FPLS, FT and FLF.]
References
Aguilera, A. M., Escabias, M., Preda, C., & Saporta, G. (2010). Using basis expansions for estimating
functional pls regression: applications with chemometric data. Chemometrics and Intelligent
Antoniadis, A., & Sapatinas, T. (2003). Wavelet methods for continuous-time prediction using
Aue, A., Norinho, D. D., & Hörmann, S. (2015). On the prediction of stationary functional time
Benatia, D., Carrasco, M., & Florens, J.-P. (2017). Functional linear regression with functional
Berg, C., & Szwarc, R. (2011). The smallest eigenvalue of hankel matrices. Constructive
Besse, P. C., & Cardot, H. (1996). Approximation spline de la prévision d’un processus fonctionnel
Figure 12: The estimated autoregressive operator. Year 2015 - 2017. [Contour plots of \hat{\Psi}(s,t) for the four estimation methods; axes: current day, 9:30 AM - 4:00 PM.]
Besse, P. C., Cardot, H., & Stephenson, D. B. (2000). Autoregressive forecasting of some functional
Bogousslavsky, V. (2016). Infrequent rebalancing, return autocorrelation, and seasonality. The Journal
Bosq, D. (2000). Linear processes in function spaces: Theory and applications, volume 149 of lecture
Carrasco, M., Florens, J.-P., & Renault, E. (2007). Linear inverse problems in structural econometrics estimation based on spectral decomposition and regularization. Handbook of Econometrics, 6B, 5633–5751.
Figure 13: The estimated Functional R-Squared. Year 2015 - 2017. [Predictive R²(t) over the trading day, 9:30 AM - 4:00 PM, for the four estimation methods.]
Carrasco, M., & Rossi, B. (2016). In-sample inference and forecasting in misspecified factor models.
Chan, E. (2013). Algorithmic trading: winning strategies and their rationale (Vol. 625). John Wiley
& Sons.
Chu, X., Gu, Z., & Zhou, H. (2019). Intraday momentum and reversal in chinese stock market.
Crambes, C., Mas, A., et al. (2013). Asymptotics of prediction in functional linear regression with
Delaigle, A., Hall, P., et al. (2012). Methodology and theory for partial least squares applied to
Figure 14: Comparison of the different estimation techniques. Gaussian kernel with N = 100 and ε(1). [Panels: MSE, RAD, AD, Rn and En across FPCA, FPLS, FT and FLF.]
Didericksen, D., Kokoszka, P., & Zhang, X. (2012). Empirical properties of forecasts with the functional
Engl, H. W., Hanke, M., & Neubauer, A. (1996). Regularization of inverse problems (Vol. 375).
Ferraty, F., & Vieu, P. (2006). Nonparametric functional data analysis: theory and practice. Springer
Gao, L., Han, Y., Li, S. Z., & Zhou, G. (2018). Market intraday momentum. Journal of Financial
Griffin, J. M., Ji, X., & Martin, J. S. (2003). Momentum investing and business cycle risk: Evidence
Groen, J. J., & Kapetanios, G. (2009). Revisiting useful approaches to data-rich macroeconomic forecasting.
He, X.-Z., & Li, K. (2015). Profitability of time series momentum. Journal of Banking & Finance, 53, 140–157.
Figure 15: Comparison of the different estimation techniques. Gaussian kernel with N = 200 and ε(1). [Panels: MSE, RAD, AD, Rn and En across FPCA, FPLS, FT and FLF.]
Heston, S. L., Korajczyk, R. A., & Sadka, R. (2010). Intraday patterns in the cross-section of stock
Hörmann, S., & Kokoszka, P. (2012). Functional time series. In Handbook of statistics (Vol. 30, pp. 157–186). Elsevier.
Hyndman, R. J., & Shang, H. L. (2009). Forecasting functional time series. Journal of the Korean
Imaizumi, M., & Kato, K. (2018). Pca-based estimation for functional linear regression with functional
Jegadeesh, N., & Titman, S. (1993). Returns to buying winners and selling losers: Implications for
Figure 16: Comparison of the different estimation techniques. Identity kernel with N = 100 and ε(1). [Panels: MSE, RAD, AD, Rn and En across FPCA, FPLS, FT and FLF.]
Kargin, V., & Onatski, A. (2008). Curve forecasting by functional autoregression. Journal of
Kelly, B., & Pruitt, S. (2015). The three-pass regression filter: A new approach to forecasting using
Kokoszka, P., & Young, G. (2016). Kpss test for functional time series. Statistics, 50 (5), 957–973.
Kokoszka, P., & Zhang, X. (2010). Improved estimation of the kernel of the functional autoregressive
Kokoszka, P., & Zhang, X. (2012). Functional prediction of intraday cumulative returns. Statistical
Lehmann, B. N. (1990). Fads, martingales, and market efficiency. The Quarterly Journal of Economics,
Lingjaerde, O. C., & Christophersen, N. (2000). Shrinkage structure of partial least squares.
Figure 17: Comparison of the different estimation techniques. Identity kernel with N = 200 and ε(1). [Panels: MSE, RAD, AD, Rn and En across FPCA, FPLS, FT and FLF.]
Moskowitz, T. J., Ooi, Y. H., & Pedersen, L. H. (2012). Time series momentum. Journal of financial
Neely, C. J., Rapach, D. E., Tu, J., & Zhou, G. (2014). Forecasting the equity risk premium: the role
Ramsay, J. O., & Silverman, B. W. (2007). Applied functional data analysis: methods and case studies. Springer.
Renault, T. (2017). Intraday online investor sentiment and return patterns in the us stock market.
Shang, H. L. (2017). Forecasting intraday s&p 500 index returns: A functional time series approach.
Sun, L., Najand, M., & Shen, J. (2016). Stock return predictability and investor sentiment: A
Figure 18: Comparison of the different estimation techniques. Slope kernel (t) with N = 50 and ε(1). [Panels: MSE, RAD, AD, Rn and En across FPCA, FPLS, FT and FLF.]
Wold, S., Ruhe, A., Wold, H., & Dunn, W., III. (1984). The collinearity problem in linear regression. The partial least squares (PLS) approach to generalized inverses. SIAM Journal on Scientific and Statistical Computing.
Zhang, Y., Ma, F., & Zhu, B. (2019). Intraday momentum and stock return predictability: Evidence
Figure 19: Comparison of the different estimation techniques. Slope kernel (t) with N = 100 and ε(1). [Panels: MSE, RAD, AD, Rn and En across FPCA, FPLS, FT and FLF.]
Figure 20: Comparison of the different estimation techniques. Slope kernel (t) with N = 200 and ε(1). [Panels: MSE, RAD, AD, Rn and En across FPCA, FPLS, FT and FLF.]
Figure 21: The estimated Autoregressive operator. Year 2011 - 2014. [Contour plots of \hat{\Psi}(s,t) for the four estimation methods; axes: current day, 9:30 AM - 4:00 PM.]
Figure 22: The estimated Functional R-Squared. Year 2011 - 2014. [Predictive R²(t) over the trading day, 9:30 AM - 4:00 PM, for the four estimation methods.]