
Intraday Market Momentum: A Functional Econometric Approach

Idriss Tsafack∗1

Department of Economics, University of Montreal

June 12, 2019

Abstract

This paper analyzes the predictability of S&P 500 intraday cumulative returns with a fully
functional autoregressive model. This approach is practically important because market participants
can use the forecast results to tactically adjust their market timing exposure for the next day. Since
the model involves very high-dimensional data, one of the main challenges is to estimate the
autoregressive operator properly. In order to obtain a more stable estimation, four different methods
are suggested: the Functional Principal Component Analysis (FPCA), the Functional Partial Least
Squares (FPLS), the Functional Tikhonov method (FT) and the Functional Landweber-Fridman
technique (FLF). The convergence rate of the estimated autoregressive operator is derived for each
method. Monte Carlo simulations and a real data application show that the FLF and FT methods
outperform the other methods across most model settings. Moreover, based on the dynamics of the
previous day, it is always better for traders to expose their portfolio only during the first opening hour
of the market. If traders were more active in the first half of the previous day, it is possible to hold
the investment over the whole trading session of the next day; otherwise they should hold at most
through the first half of the next day and turn to a reversal strategy in the second half. Using the
FPLS estimation approach, it is possible to reach a remarkable R2 value of 8% in the first and last
hours of a trading session, almost 4 times the one obtained by Gao et al. (2018).

Keywords: Time series momentum, intraday momentum, functional linear regression, principal
component analysis, regularization method, return predictability, big data.
JEL Codes: C01, C49, C53, C55, C62, G12, G13, G15.

Department of Economics, University of Montreal, 3150 rue Jean-Brillant, Montréal, QC H3T 1N8, Canada
(idriss.tsafack.teufack@umontreal.ca)

1 Introduction

Following the idea of Jegadeesh & Titman (1993), a very large literature documents
the success of momentum strategies in financial markets. The idea behind these strategies is
that market participants should buy the assets that have outperformed and sell those that have
underperformed, based on their historical evolution. The literature distinguishes two types of momentum
strategies: cross-sectional momentum, widely discussed by authors such as Griffin et al.
(2003) and Lehmann (1990), and time series momentum, developed by Moskowitz et al.
(2012), Neely et al. (2014) and others. Unfortunately, most of these papers tend to focus on long time
horizons, typically a month or longer. The issue with long-term strategies is that they display very low Sharpe
ratios and weak backtest statistical significance because of infrequent independent trading signals. Another
issue is that long-term momentum usually underperforms in the aftermath of financial crises.[1]
Research on such momentum patterns at the intraday granularity has recently started with Gao
et al. (2018), who show evidence that the first half-hour return positively predicts the last half-hour
return in the U.S. stock market. Furthermore, Sun et al. (2016) and Renault (2017) have identified
that intraday stock returns can be predicted by high-frequency investor sentiment.

In the same vein, this paper proposes to analyze the predictability of S&P 500 intraday cumulative returns
with a fully functional autoregressive model. In contrast to traditional financial analysis, the
intraday cumulative returns are here viewed as curves in a functional space. The
shape of the cumulative return observed at the 5-minute frequency is then used to predict the next-day
shape.[2] This approach is practically important because market participants can use the forecast
results to tactically adjust their market timing exposure for the next trading day. Furthermore, from
an econometric point of view, functional data analysis is interesting since it makes it possible to
take into account additional information, namely the dynamics of the return from one time of the
day to another. Moreover, this approach gives a useful overview of market participants'
behavior within a trading day, such as the U-shape of the volatility and the most relevant periods to take
investment or portfolio rebalancing actions within a trading day.

When using a functional autoregressive model, because the model setting is directly exposed to very

[1] See chapter 7 of Chan (2013).
[2] The 5-minute frequency is considered here just for illustration purposes. It is possible to use other timeframes such as 10
minutes, 15 minutes, 1 minute, or tick data.

high-dimensional data, one of the most important challenges is to estimate the autoregressive
operator.[3] Indeed, if one is not careful about the manipulation of the functional objects, there is a high
probability of obtaining very unstable estimators of the autoregressive operator. To control the stability
of the estimated parameter, this paper compares four different estimation methods: the
FPCA, FPLS, FT and FLF methods. These methods can also be viewed as regularization methods,
since they all depend on a tuning parameter. To assess the quality of the estimation methods, their
respective convergence rates and asymptotic normality are derived and analyzed under some general
regularity conditions. These asymptotic normality results are useful to test the significance of
the predictability and to develop curve interval forecasts.

To compare the four estimation methods, Monte Carlo simulations were conducted. The
comparison is based on five criteria: the Mean Squared Error (MSE), the Mean Average Distance
(AD) and the Ratio Average Distance (RAD), which measure the quality of the estimation of the kernel,
on one hand, and the Mean Squared Prediction Error measured with two different approaches (En and
Rn), on the other hand. Across a large part of the model settings considered, the simulations show
that the FLF and FT methods tend to outperform the others in terms of estimation of the autoregressive
operator, whatever the sample size and the error terms of the model. In terms of prediction, the
four methods behave almost the same way for small sample sizes, while for large sample
sizes the FPLS approach tends to outperform the others in terms of prediction errors. This result is
valid for a large class of error term configurations.

The empirical analysis of this paper uses S&P 500 futures data from 01/01/2008 to 12/31/2017
at the 5-minute frequency and provides several interesting findings. An overview
of the results shows evidence that the cumulative intraday return shape of the current
trading day contributes significantly to predicting the next-day cumulative return shape. The different
estimation methods provide different results. The FT and FLF display almost similar estimates
of the autoregressive operator, while the FPCA and FPLS tend to capture the prediction results well.
Based on the predictive R2, the FPLS reaches a remarkable value of 8% at the beginning and
the end of the trading session, almost 4 times the one obtained by Gao et al. (2018) and twice
the one obtained by Zhang et al. (2019). The FPCA and FT methods tend to reach a predictive R2 of
5% and 6% respectively in the morning (nearly twice the one obtained by Gao et al. (2018))
[3] The autoregressive operator is similar to the slope parameter in the context of a simple OLS model.

and, for the FT approach, nearly 2.5% at the end of the trading session. Surprisingly, the FPCA is
not able to capture the momentum at the end of the trading session. According to the FLF approach,
the R2 is around 2.5% at the beginning of the trading session, while in the second half of the day it is
approximately 1.2%.

An examination of the autoregressive operator shows that a strong momentum in the first half
of a trading day is positively correlated with the next-day shape, while a strong momentum in the
second half of a trading session is significantly positively correlated with the first half of the next trading
day and negatively correlated with the second half of the next trading day. The first opening hour of the previous
day is significantly positively correlated with that of the next day. This result is
based on the FLF and FT estimation methods. Moreover, the shape of the previous day helps
to recover the U-shaped volatility pattern, with high volume and volatility in the
first and last hours of the next trading day. This result is easily observed when the FPLS
approach is used, and is similar to the one obtained by Gao et al. (2018). Exploring
the effect of volatility and trading volume on the intraday momentum, we observe that the
intraday momentum is stronger when volatility is high at the beginning and at the end of the trading
session; the same result is observed for trading volume. These results can be explained
by the portfolio rebalancing pattern, the late-informed investor effect, the market manipulation of
high-frequency traders, and the often forced sales or purchases of assets by various types of funds.

This paper is related to Gao et al. (2018) and Zhang et al. (2019) in the financial market
context. The difference with their work is that the cumulative returns are observed
as curves instead of instantaneous returns. Moreover, this approach makes it easy to document the
U-shaped volatility, and the in-sample and out-of-sample R2 functions can reach an impressive predictive
performance of 8%, that is, 4 times the performance of Gao et al. (2018). The idea of using cumulative
returns is inspired by the papers of Kokoszka & Zhang (2012) and Shang (2017). Indeed, Kokoszka
& Zhang (2012) and Shang (2017) use individual assets and market indices respectively, but their main purpose
is to compare forecasting methods for the next-day curve. The difference with their papers is
that this paper cares about both predictability and forecasting, proposing four different estimation methods.
The consistency results of the estimation methods (for the autoregressive operator) are presented, and
comparisons of those methods are made through simulations and an empirical analysis. An economic

significance analysis is also derived. This paper is also related to the functional linear regression with
functional response, widely developed by Benatia et al. (2017) and Imaizumi & Kato (2018), and to the functional
autoregressive model, extensively developed by Bosq (2000), Kargin & Onatski (2008) and Kokoszka
& Zhang (2012).

The rest of the paper is organized as follows. Section 2 presents the related
literature. The functional econometric model is presented in Section 3. In Section 4, I present how
to estimate the model using the four aforementioned methods. Section 5 is devoted to the
convergence rate and the asymptotic normality of the estimated autoregressive operator. Section
6 presents the comparison of the four methods based on Monte Carlo simulations. The real data
application is developed in Section 7, and Section 8 concludes.

2 Related literature

This paper is related to four strands of literature: return predictability, momentum strategies in
financial markets, functional data analysis, and the functional autoregressive model.

Momentum strategies became very popular with the well-known seminal work of Jegadeesh &
Titman (1993). They developed a cross-sectional momentum strategy at the monthly frequency and
showed that buying past winners and selling past losers can generate a significant positive return over
the next 3 to 12 months of holding. This work has been widely extended by Griffin et al. (2003),
who show evidence that this momentum can be observed in different stock markets such as the U.S.,
Europe, Asia and Australia, but not in the Japanese market.

In contrast to cross-sectional momentum, Moskowitz et al. (2012) documented the success
of the time series momentum strategy in equity index, currency, commodity, and bond futures. The
idea behind this strategy is to look at one asset at a time instead of a group of assets. Indeed, they
show evidence that the previous 12-month returns of an asset contribute significantly to predicting
its future returns. He & Li (2015) proposed a continuous-time heterogeneous agent model in
order to explain the significance of time series momentum. They show that momentum strategies
perform well when the market is dominated by momentum traders. They specifically documented
that short-term momentum strategies tend to stabilize the market, while long-term momentum
tends to destabilize it.

Furthermore, recent research by Gao et al. (2018) has demonstrated intraday momentum
in the U.S. stock market at the 30-minute frequency. Indeed, they show that the first half-hour
return contributes to predicting the last half-hour return, and that the effect is stronger on more volatile
days, higher volume days, recession days and high-impact news release days. In the same line, Zhang
et al. (2019) documented almost the same results in the Chinese stock market and explained that
this momentum can be explained not only by infrequent rebalancing or late-informed investors
but also by the U-shaped volume pattern. Chu et al. (2019) identified not only that the last half
hour is positively predicted by the first half hour, but also a reversal effect in the second
half hour of the trading day in the Chinese stock market. They also find that this momentum and
reversal effect is robust to including the previous day return and day-of-week effects. Besides those authors,
Heston et al. (2010) discovered a striking pattern of return continuation at half-hour intervals that
are exact multiples of a trading day, over a 40-day time horizon.

Concerning the main causes of momentum, Chan (2013) listed four: the persistence of the
sign of roll returns for futures; the slow diffusion, analysis, and acceptance of new information; the
forced sales or purchases of assets by various types of funds; and market manipulation by high-frequency
traders. On the other hand, based on theoretical analysis, Bogousslavsky (2016) identified
infrequent rebalancing and the late-informed investors' effect. Indeed, the infrequent rebalancing
phenomenon is described by the fact that a certain group of investors decides to rebalance their portfolios
early in the morning, while others decide to do so near the market close. On the other hand, the
late-informed investors' effect reflects the fact that news releases are an important factor contributing
significantly to return predictability. These hypotheses have been confirmed by Gao et al. (2018) and
Zhang et al. (2019), but not by Chu et al. (2019), who argue that noise trading drives intraday
return predictability. Besides those causes, high-frequency investor sentiment can also
significantly affect time series momentum, as documented by Sun et al. (2016) and Renault
(2017).

The literature on functional data analysis has attracted a lot of attention in the statistical field
during the last decade. Some of the pioneers are Ramsay & Silverman (2007), who developed the
general framework, Hyndman & Shang (2009), Hörmann & Kokoszka (2012), Kokoszka & Zhang (2012),
and Ferraty & Vieu (2006), who developed a semiparametric approach. One of the main
challenges is to estimate the slope function (when the response is a scalar) or
the operator (when the response variable is a function) because of the high-dimensionality issue. More
recently, authors such as Crambes et al. (2013), Benatia et al. (2017) and Imaizumi & Kato
(2018) have analyzed the rate of convergence of the estimated parameter in the context
of an i.i.d. model using the FT and FPCA methods.

The functional autoregressive model became popular with Bosq (2000), who considers a parametric
approach for estimation. Besse et al. (2000) considered the same model and adopted a
nonparametric estimation approach. Kargin & Onatski (2008) proposed a predictive factor
approach to estimate the autoregressive operator; the idea is to project the data on a
set of factors that are relevant for the prediction. Didericksen et al. (2012) compared the method
of Kargin & Onatski (2008) with the FPCA and showed that the FPCA outperforms the predictive factor
approach in terms of estimation, while the two reach the same prediction error. Hyndman
& Shang (2009) and Aue et al. (2015) proposed univariate and multivariate time series
forecasting methods, since the PCA scores can display temporal dependence.

The use of functional autoregressive models, or of functional data analysis more generally, is less common in
the financial market field, to the best of our knowledge. The use of cumulative returns is
interesting because the curves are more informative, in the sense that they take into account the dynamics
between two discretization points. The only papers that can be identified are the ones by Kokoszka &
Zhang (2012), who proposed to predict an individual stock using a functional CAPM, and Shang
(2017), who suggested forecasting the U.S. stock market using the dynamic updating technique. Their
main purpose is to compare different forecasting methods, not to analyze return predictability.

3 The Model Setting

In this paper, for each day we observe the shape of the intraday 5-minute cumulative return of the
S&P 500 future. We use the cumulative intraday returns (CIDRs) as in Gabrys et al.
(2010) and Kokoszka et al. (2015). Let Pi(tj) be the 5-minute closing price of a financial asset at time
tj, on a given day i, with j = 1, ..., 273. The return is defined as

Ri(tj) = 100 [ln(Pi(tj)) − ln(Pi(t1))]    (1)

[Figure 1, three panels: the S&P 500 price level within the year 2017 (left), the univariate instant
returns (middle), and the functional time series of intraday cumulative returns (right); horizontal
axis: time of the day.]

Figure 1: Intraday cumulative returns of the S&P 500 index in 2017

The continuous curves are constructed by

Xi(t) = Ri(tj) for t ∈ (5(j − 1), 5j].    (2)
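As a toy illustration, the construction in equations (1)-(2) can be sketched in Python; the price values below are hypothetical, and the step-function evaluation assumes the 5-minute grid described above.

```python
import numpy as np

def cidr(prices):
    """Eq. (1): R_i(t_j) = 100 * (ln P_i(t_j) - ln P_i(t_1))."""
    logp = np.log(np.asarray(prices, dtype=float))
    return 100.0 * (logp - logp[0])

def as_curve(r):
    """Eq. (2): step function X_i(t) = R_i(t_j) for t in (5(j-1), 5j] minutes."""
    def X(t_minutes):
        j = max(int(np.ceil(t_minutes / 5.0)), 1)   # index of the 5-minute bin
        return r[j - 1]
    return X

prices = [2300.0, 2302.5, 2301.0, 2305.0]   # one toy day of 5-minute closes
r = cidr(prices)                            # r[0] == 0 by construction
X = as_curve(r)                             # X(t) is constant on each 5-minute bin
```

By construction the curve starts at zero and is piecewise constant between consecutive 5-minute marks.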

Figure 1 represents the constructed intraday cumulative returns of the S&P 500 based on the raw
data for the year 2017. Both the instant returns (middle panel) and the cumulative return curves
(right panel) display at least 4 to 5 outliers.

Let (Xi : i ∈ Z) be an arbitrary stationary functional time series of S&P 500 cumulative return
curves. It is assumed that each function Xi is an element of the separable Hilbert space H = L2([0, 1])
(the space of square-integrable functions mapping from the compact interval [0, 1] to R) endowed with
the inner product < f, g >_H = ∫₀¹ f(t)g(t)dt and the norm ||f||_H = (∫₀¹ f(t)² dt)^{1/2}.
Each random function is then square-integrable, that is, E(||Xi||²) < ∞. All the random functions are
defined on the same probability space (Ω, F, P). Therefore, if X ∈ L^p_H = L^p_H(Ω, F, P) with p > 0,
then E(||Xi||^p) < ∞.

We thus observe a sequence {X1, X2, ..., XN} of realizations of X, where Xi corresponds
to the observed curve of the S&P 500 on day i = 1, ..., N. In this paper, it is assumed that the
sequence of H-valued variables {X1, X2, ..., XN} follows a functional autoregressive Hilbertian process
of order 1 (FAR(1)):

Xn+1(t) = ∫₀¹ Ψ(t, s) Xn(s) ds + εn+1(t),  n ∈ Z    (3)

where for each day n, Xn is a random curve in the Hilbert space H, Ψ : H → H is a bounded
linear operator, and ε = (εn, n ∈ Z) is an H-valued strong white noise, that is, a sequence of H-valued
independently and identically distributed random variables such that E(εn) = 0 and E(||εn||²) = σ² <
∞. We can also consider the innovations ε to be a sequence of H-valued martingale differences, since
the functional errors do not necessarily follow the same distribution. Without loss of generality, it
is assumed that E(Xn) = 0.
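For concreteness, a FAR(1) path satisfying equation (3) can be simulated on a discretized grid. The kernel below is an arbitrary example (not the paper's estimated operator), rescaled so that its Hilbert-Schmidt norm is 0.5 < 1, which guarantees a stationary solution; the grid size, sample size, and noise scale are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
p, N = 50, 300                                    # grid points on [0, 1], number of days
t = np.linspace(0.0, 1.0, p)
Psi = np.exp(-(t[:, None] - t[None, :]) ** 2)     # example kernel Psi(t, s)
Psi *= 0.5 / np.sqrt(np.mean(Psi ** 2))           # rescale so ||Psi||_HS = 0.5 < 1
w = 1.0 / p                                       # quadrature weight for integrals

X = np.zeros((N, p))                              # X[n] is the curve of day n
for n in range(N - 1):
    eps = 0.1 * rng.standard_normal(p)            # crude white-noise innovation
    X[n + 1] = w * Psi @ X[n] + eps               # eq. (3), discretized
```

The matrix-vector product with weight w = 1/p is the Riemann-sum approximation of the integral in eq. (3).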

Figure 2 represents how the predictor and the predicted function are displayed. The bold blue line
is the mean function of the predictor and the response function respectively. From what is
observed, it can be deduced that, once the functional outliers are removed from the sample, each
functional observation is generated from the same data generating process. This idea
has been argued by Kokoszka & Young (2016), who developed a KPSS stationarity test for functional
time series.

[Figure 2, two panels: the functional time series of intraday cumulative returns on day n (left) and
day n+1 (right); horizontal axis: time of the day (9:30 AM - 4:00 PM).]

Figure 2: Functional Predictor and Functional response

Let us denote by L the space of bounded linear operators on H, equipped with the norm

||Ψ||_L = sup{||Ψ(f)|| : ||f|| ≤ 1}.    (4)

Under the condition that there exists an integer j0 ≥ 1 such that ||Ψ^{j0}||_L < 1, equation (3)
has a unique solution, which is a weakly stationary process in H given by

Xn = Σ_{k=0}^{∞} Ψ^k(εn−k),    (5)

and the series converges almost surely in H. If it is assumed that the Hilbert-Schmidt norm of the
operator Ψ is lower than 1, then the existence and uniqueness of the solution are guaranteed (see Lemma
3.1 of Kokoszka & Zhang (2010)).

We propose to estimate the autoregressive operator by four competing approaches, including the
partial least squares approach, and to compare their performance.

4 Model estimation

The goal of this paper is to forecast the one-day-ahead S&P 500 shape Xn+1. According to
the data generating process, the best linear predictor of Xn+1 given X1, ..., Xn is Ψ(Xn).
Typically, Ψ is unknown and should be estimated consistently by an estimator Ψ̂. This section is
dedicated to the estimation of the autoregressive operator Ψ. We propose to estimate this operator
by four different strategies, working directly with the fully functional form. Multiplying equation (3)
by Xn and taking the expectation on both sides leads to the following equation:

E[< Xn+1, f > Xn] = E[< Ψ(Xn), f > Xn],  f ∈ H.

Let us define the covariance operator by

K(f) = E[< Xn, f > Xn].

Since E[||Xn||²] < ∞, the covariance operator is symmetric, positive, nuclear and, therefore,
Hilbert-Schmidt, and its spectral system (vj, λj)j≥1 is defined by

K(vj) = λj vj,  j ≥ 1,

with the eigenfunctions vj forming an orthonormal basis of H and the eigenvalues such that
λ1 ≥ λ2 ≥ ... ≥ 0.

Let us define the cross-covariance operator by

D(f) = E[< Xn+1, f > Xn].

Then, it is easy to see that

D(f) = Ψ(K(f)).    (6)

We can rewrite the previous equation as

D*(f) = K(Ψ*(f)),    (7)

where Ψ* and D* denote the adjoint operators of Ψ and D respectively. The operators K, D and
D* are defined at the population level, so they are unknown and must be estimated by K̂, D̂ and
D̂* respectively, where

D̂(f) = (1/(N−1)) Σ_{n=1}^{N−1} < Xn+1, f > Xn,

D̂*(f) = (1/(N−1)) Σ_{n=1}^{N−1} < Xn, f > Xn+1,

and

K̂(f) = (1/(N−1)) Σ_{n=1}^{N−1} < Xn, f > Xn.

K̂ is endowed with the empirical spectral system (λ̂j, v̂j)j≥1, with λ̂1 ≥ λ̂2 ≥ ... ≥ 0 and (v̂j)j≥1
forming an orthonormal basis of H.
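On an equally spaced grid, the empirical operators above reduce to matrix computations, with integrals approximated through the quadrature weight w = 1/p. A minimal sketch, assuming each row of `X` is one day's curve sampled on p grid points:

```python
import numpy as np

def empirical_operators(X):
    """Kernels of K-hat and D-hat from curves X (N days x p grid points)."""
    N = X.shape[0]
    X0, X1 = X[:-1], X[1:]                        # pairs (X_n, X_{n+1}), n = 1..N-1
    K = X0.T @ X0 / (N - 1)                       # K-hat(s, t) = mean of X_n(s) X_n(t)
    D = X0.T @ X1 / (N - 1)                       # D-hat(s, t) = mean of X_n(s) X_{n+1}(t)
    return K, D

def eigensystem(K, w):
    """Spectral system (lambda_j, v_j) of the covariance operator (weight w = 1/p)."""
    lam, V = np.linalg.eigh(w * K)                # the operator acts as f -> w * K @ f
    order = np.argsort(lam)[::-1]                 # eigenvalues in decreasing order
    return lam[order], V[:, order] / np.sqrt(w)   # rescale so ||v_j||_{L2} = 1
```

The rescaling by 1/sqrt(w) makes the eigenfunctions orthonormal in L2([0,1]) rather than in Euclidean norm.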

Given equation (7), one would like to naively estimate the autoregressive operator by writing
Ψ* = K^{−1} D*, as is usually done in the finite-dimensional context. The problem is that the covariance
operator K is compact and defined on an infinite-dimensional space. Then K^{−1} is an unbounded
operator, which would lead to an unstable estimator. In the inverse problem literature, equation (7)
is called an ill-posed problem, in the sense that K is only invertible on a subset of H and its inverse
is not continuous.

Different approaches have been proposed to obtain a stable estimator of the autoregressive operator. Bosq (2000)
suggested the Functional Principal Component Analysis (FPCA) and derived its consistency
under some strong assumptions on the eigenvalues. There is a large literature on nonparametric
techniques to estimate the autoregressive operator, such as the spline smoothing and interpolation
techniques proposed by authors such as Besse & Cardot (1996), Besse et al. (2000) and Ramsay &
Silverman (2007). Antoniadis & Sapatinas (2003) suggested a linear wavelet technique to estimate a
FAR(1) model. Kargin & Onatski (2008) proposed a predictive factor technique, which consists in finding
an estimator of the autoregressive operator such that the prediction error is minimized, by
projecting the data on principal components chosen for that goal. Didericksen et al. (2012)
compared the FPCA method proposed by Bosq (2000) and the predictive factor technique of Kargin &
Onatski (2008) on simulated data and showed that, overall, the FPCA outperforms.

Other contributions are the univariate and multivariate time series forecasting of the principal
component scores proposed by Hyndman & Shang (2009) and Aue et al. (2015) respectively. The
multivariate prediction is a generalization of the univariate forecasting proposed by Hyndman
& Shang (2009). Basically, their approach consists in first projecting all the data on the most important
principal components by running a Functional Principal Component Analysis, then using the FPCA
scores to form time series vectors and running the usual univariate ARMA models for each score
vector, or a VAR model, to obtain the one-step-ahead score predictions. Once those results are obtained, the
Karhunen-Loève expansion is used to transform the predicted score time series back into the predicted
functional time series. There is also the contribution of Benatia et al. (2017), who proposed the Tikhonov
regularization to estimate the unknown operator of an i.i.d. fully functional regression model. They
also derived some consistency results under the source conditions proposed by Carrasco et al. (2007).

The goal of this paper is to consider the functional Yule-Walker equation and estimate the
autoregressive operator by four different regularization techniques: the Tikhonov method (FT), the FPCA
approach of Bosq (2000), the Functional Partial Least Squares (FPLS) and the Functional Landweber-Fridman
iteration method (FLF).

4.1 The Functional Principal Component Analysis

This approach is one of the most popular methods and was proposed by Bosq (2000) in order to
estimate the autoregressive operator on a finite-dimensional subspace of H. Since the operator K is symmetric
and nuclear, it admits a spectral decomposition, that is,

K(s, t) = Σ_{j=1}^{∞} λj vj(s) vj(t).

The autoregressive operator Ψ is Hilbert-Schmidt. Since {vj ⊗ vk}_{j,k=1}^{∞} is an orthonormal basis of
L²([0, 1]²), it admits a Hilbert-Schmidt decomposition

Ψ(s, t) = Σ_{j,k=1}^{∞} Ψjk vj(s) vk(t),

with Ψjk = ∫₀¹∫₀¹ Ψ(s, t) vj(s) vk(t) ds dt. Moreover, using the Karhunen-Loève representation of each
function Xn, we have for each k ≥ 1

E[< Xn, vk > Xn+1(t)] = λk Σ_{j=1}^{∞} Ψjk vj(t).

Therefore, we obtain the following characterization of Ψ:

Ψ(s, t) = Σ_{k=1}^{∞} (E[< Xn, vk > Xn+1(t)] / λk) vk(s),    (8)

and

Ψjk = E[< Xn, vk >< Xn+1, vj >] / λk.

If m principal components are selected for the procedure, then the (infeasible) estimator of Ψ is
given by

Ψm(s, t) = Σ_{j,k=1}^{m} Ψjk vj(s) vk(t).    (9)

Since Ψjk and (vj)j≥1 are unknown, Ψ is consistently estimated using the sample [X1, ..., XN], and
we obtain

Ψ̂m(s, t) = Σ_{j,k=1}^{m} Ψ̂jk v̂j(s) v̂k(t),    (10)

with

Ψ̂jk = (1/λ̂k) (1/(N−1)) Σ_{n=1}^{N−1} < Xn, v̂k >< Xn+1, v̂j >.    (11)

This estimator can also be written as

Ψ̂m(s, t) = (1/(N−1)) Σ_{n=1}^{N−1} Σ_{j=1}^{m} (< Xn, v̂j > / λ̂j) v̂j(s) Xn+1(t),    (12)

and

Ψ̂*m(f) = (1/(N−1)) Σ_{j=1}^{N−1} Σ_{n=1}^{N−1} (Q̂m,j / λ̂j) < f, Xn >< Xn+1, v̂j > v̂j,  for each f ∈ H,    (13)

where Q̂m,j = 1{j ≤ m} is the filter factor associated with the truncation.

This procedure was also considered by Crambes et al. (2013) for the i.i.d. model. Another configuration
of the FPCA is the one proposed by Imaizumi & Kato (2018). Their approach consists in projecting
the predictor and the response functions on the first m principal components, then using the
score vectors to estimate the Fourier coefficients of the autoregressive operator; the estimated
operator is then obtained by expanding it on the basis of the m eigenfunctions of the covariance
operator with the estimated Fourier coefficients. We will not consider this approach in this paper.
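A discretized sketch of the FPCA estimator of eq. (12), together with the associated one-day-ahead forecast Ψ̂(Xn); the truncation level m is the tuning parameter and is illustrative here, as is the quadrature scheme (weight w = 1/p).

```python
import numpy as np

def fpca_estimator(X, m):
    """FPCA estimate of the kernel in eq. (12): truncate to the first m components."""
    N, p = X.shape
    w = 1.0 / p
    X0, X1 = X[:-1], X[1:]
    K = X0.T @ X0 / (N - 1)                       # empirical covariance kernel
    D = X0.T @ X1 / (N - 1)                       # empirical cross-covariance kernel
    lam, V = np.linalg.eigh(w * K)
    keep = np.argsort(lam)[::-1][:m]              # m largest eigenvalues
    lam_m = lam[keep]
    V_m = V[:, keep] / np.sqrt(w)                 # L2-normalized eigenfunctions
    # Psi-hat(s,t) = sum_{j<=m} (1/lam_j) v_j(s) [ (1/(N-1)) sum_n <X_n, v_j> X_{n+1}(t) ]
    return V_m @ ((w / lam_m)[:, None] * (V_m.T @ D))

def predict_next(Psi_hat, x_today):
    """One-day-ahead forecast: integral of Psi-hat(s, .) x(s) ds."""
    return Psi_hat.T @ x_today / x_today.shape[0]
```

Keeping only the m leading eigen-directions is exactly the truncation that eq. (13) encodes through the filter factor.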

4.2 Tikhonov Method

This technique is widely used in the inverse problem literature. It has been studied recently by
Benatia et al. (2017) in the context of fully functional regression; they derive consistency and
asymptotic normality results for the estimated operator. This technique is commonly justified as a way to
tackle the high-dimensionality problem.

Let α be a positive tuning parameter used for the estimation, such that it converges to zero as N
goes to infinity. Then the regularized autoregressive operator is given by

Ψ*α = (αI + K)^{−1} D*,    (14)

where I is the identity operator. The empirical version is then given by

Ψ̂*α = (αI + K̂)^{−1} D̂*.    (15)

This estimator can also be characterized in terms of the spectral system of the covariance operator K̂
as follows:

Ψ̂*α(s, t) = Σ_{j=1}^{N−1} (Q̂α,j / λ̂j) ((1/(N−1)) Σ_{n=1}^{N−1} < Xn, v̂j > Xn+1(t)) v̂j(s),    (16)

and

Ψ̂*α(f) = (1/(N−1)) Σ_{j=1}^{N−1} Σ_{n=1}^{N−1} (Q̂α,j / λ̂j) < f, Xn >< Xn+1, v̂j > v̂j,  for each f ∈ H,    (17)

where Q̂α,j = λ̂j / (λ̂j + α) is called the filter factor. The truncation operated by the FPCA
method is replaced by the shrinkage effect of the parameter α.
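On the same discretized grid, the Tikhonov estimator of eqs. (15)-(16) amounts to a single linear solve; a minimal sketch (the value of α is illustrative and would in practice be chosen by cross-validation or a similar rule).

```python
import numpy as np

def tikhonov_estimator(X, alpha):
    """Tikhonov estimate: kernel version of (alpha*I + K-hat)^{-1} applied to D-hat."""
    N, p = X.shape
    w = 1.0 / p
    X0, X1 = X[:-1], X[1:]
    K = X0.T @ X0 / (N - 1)                       # empirical covariance kernel
    D = X0.T @ X1 / (N - 1)                       # empirical cross-covariance kernel
    # the operator alpha*I + K-hat acts on a function g as alpha*g + w * K @ g
    return np.linalg.solve(alpha * np.eye(p) + w * K, D)
```

In the eigenbasis of K̂ this is exactly the filter factor λ̂j/(λ̂j + α) of eq. (16): small-eigenvalue directions are shrunk rather than truncated.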

4.3 Functional Landweber-Fridman (FLF)

The Landweber-Fridman method is an iterative method which is also very popular
in the inverse problem literature. Let d be a positive parameter such that 0 < ||K||_L < 1/d.
Then, the FLF estimate can be computed iteratively as follows. Take the initial value

Ψ*0(f) = d D*(f),  for each f ∈ H.

For h = 1, ..., m, set

Ψ*h(f) = (I − dK)(Ψ*h−1(f)) + d D*(f),  for each f ∈ H,

where m is the maximum number of iterations. The resulting operator can be written as a polynomial
in the covariance operator K:

Ψ*m(f) = d Σ_{l=1}^{m} (I − dK)^{l−1} D*(f),  for each f ∈ H.    (18)

Since the operators K and D are not observed, they are consistently estimated by K̂ and D̂
respectively. Then Ψ̂*m is given by

Ψ̂*m(f) = d Σ_{l=1}^{m} (I − dK̂)^{l−1} D̂*(f),  for each f ∈ H.    (19)
l=1

This estimator can be written in terms of the eigensystem of the covariance operator K̂ as follows:

Ψ̂*m(s, t) = Σ_{j=1}^{N−1} (Q̂m,j / λ̂j) ((1/(N−1)) Σ_{n=1}^{N−1} < Xn, v̂j > Xn+1(t)) v̂j(s),    (20)

and

Ψ̂*m(f) = (1/(N−1)) Σ_{j=1}^{N−1} Σ_{n=1}^{N−1} (Q̂m,j / λ̂j) < f, Xn >< Xn+1, v̂j > v̂j,  for each f ∈ H,    (21)

where Q̂m,j = 1 − (1 − dλ̂j)^m is the filter factor.
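The Landweber-Fridman recursion translates directly into a loop on kernel matrices; a minimal sketch matching the closed form in eq. (19), where the default choice of d below simply enforces 0 < ||K̂|| < 1/d.

```python
import numpy as np

def flf_estimator(X, m, d=None):
    """Landweber-Fridman estimate: Psi_m = d * sum_{l=1}^m (I - d*K)^{l-1} D (eq. 19)."""
    N, p = X.shape
    w = 1.0 / p
    X0, X1 = X[:-1], X[1:]
    Kop = w * (X0.T @ X0) / (N - 1)               # discretized covariance operator
    D = X0.T @ X1 / (N - 1)                       # kernel of D-hat
    if d is None:
        d = 0.9 / np.linalg.norm(Kop, 2)          # ensures ||K-hat|| < 1/d
    Psi = d * D                                   # l = 1 term of the sum
    for _ in range(m - 1):
        Psi = Psi - d * (Kop @ Psi) + d * D       # Psi_h = (I - d*K) Psi_{h-1} + d*D
    return Psi
```

The iteration count m plays the role of the regularization parameter: in the eigenbasis it produces the filter factor 1 − (1 − dλ̂j)^m of eqs. (20)-(21).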

4.4 Functional Partial Least Squares (FPLS)

The Functional Partial Least Squares is a method based on the idea that the
estimator obtained by the FPCA is not directly justified by the prediction problem:
the extracted principal components may not capture the information most relevant for
prediction. The FPLS may be better adapted in the sense that it extracts the factors
that best explain the relation between the predictand and the predictor function. This method is very
popular in the chemometrics field and has been discussed by authors such as Wold et
al. (1984), Helland (1988) and Höskuldsson (1988). It was recently introduced in the econometrics field
by Groen & Kapetanios (2009), Kelly & Pruitt (2015) and Carrasco & Rossi (2016). In the functional
regression context, contributions include Aguilera et al. (2010) and Delaigle et al. (2012).
R1
Practically, for the model setting of this paper, the idea is to identify a new factor th = 0 Xn (s)vh (s)ds

at each step h = 1, ..., m such that the covariance withe the response function is maximized.

$$\max_{v_h, c_h \in L^2([0,1])} \ \mathrm{cov}^2\left(\int_0^1 X_n(s)v_h(s)\,ds,\ \int_0^1 X_{n+1}(t)c_h(t)\,dt\right)$$

$$\text{subject to } \|v_h\| = 1,\ \|c_h\| = 1,\ \text{and} \tag{22}$$

$$\int_0^1\!\!\int_0^1 v_\ell(s)K(s,t)v_h(t)\,ds\,dt = 0, \qquad \ell = 1, \dots, h-1,$$

where $v_1,\dots,v_{h-1}$ and $c_1,\dots,c_{h-1}$ have already been obtained in the $h-1$ previous steps. By using an extension of the Alternative Partial Least Squares (APLS) approach proposed by Delaigle et al. (2012), the estimated autoregressive operator is given by:

$$\hat{\Psi}^*_m(s,t) = \sum_{l=1}^{m}\hat{\gamma}_{t,l}\,\hat{K}^{l-1}(\hat{D})(s,t) = \sum_{l=1}^{m}\hat{\gamma}_{t,l}\int_0^1 \hat{K}^{l-1}(s,u)\hat{C}_1(u,t)\,du \tag{23}$$

for each $s,t \in [0,1]$, where, for each $t$, $\hat{\gamma}_t = \hat{R}_t^{-1}\hat{\mu}_t$ is a vector of size $m$. $\hat{R}_t$ is an $(m\times m)$ matrix with

$$\hat{R}_{t,j,l} = \int_0^1\!\!\int_0^1 \hat{D}^*(t,u)\hat{K}^{j+l-1}(u,s)\hat{D}^*(s,t)\,du\,ds \tag{24}$$

and $\hat{\mu}_t = [\hat{\mu}_{t,1},\dots,\hat{\mu}_{t,m}]'$ is a vector of length $m$ with

$$\hat{\mu}_{t,l} = \int_0^1\!\!\int_0^1 \hat{D}^*(t,u)\hat{K}^{l-1}(u,s)\hat{D}^*(s,t)\,du\,ds. \tag{25}$$

This estimator can be written in terms of the eigensystem of the empirical covariance operator $\hat{K}$ as follows:

$$\hat{\Psi}^*_m(s,t) = \sum_{j=1}^{N-1}\frac{\hat{Q}_{m,j}}{\hat{\lambda}_j}\left(\frac{1}{N-1}\sum_{n=1}^{N-1}\langle X_n,\hat{v}_j\rangle X_{n+1}(t)\right)\hat{v}_j(s) \tag{26}$$

and

$$\hat{\Psi}_m(f) = \frac{1}{N-1}\sum_{j=1}^{N-1}\sum_{n=1}^{N-1}\frac{\hat{Q}_{m,j}}{\hat{\lambda}_j}\langle f, X_n\rangle\langle X_{n+1},\hat{v}_j\rangle\hat{v}_j, \quad \text{for each } f \in H, \tag{27}$$

where

$$\hat{Q}_{m,j} = 1 - \prod_{l=1}^{m}\left(1 - \frac{\hat{\lambda}_j}{\hat{\theta}_l}\right)$$

is the filter factor and $\hat{\theta}_1 > \hat{\theta}_2 > \dots > \hat{\theta}_m > 0$ are the eigenvalues of the matrix $\hat{R}$.

Therefore, considering the previous results, the estimated autoregressive operator $\hat{\Psi}^*_\delta$ can be summarized as

$$\hat{\Psi}^*_\delta(s,t) = \sum_{j=1}^{N-1}\frac{\hat{Q}_{\delta,j}}{\hat{\lambda}_j}\left(\frac{1}{N-1}\sum_{n=1}^{N-1}\langle X_n,\hat{v}_j\rangle X_{n+1}(t)\right)\hat{v}_j(s) \tag{28}$$

and

$$\hat{\Psi}^*_\delta(f) = \frac{1}{N-1}\sum_{j=1}^{N-1}\sum_{n=1}^{N-1}\frac{\hat{Q}_{\delta,j}}{\hat{\lambda}_j}\langle f, X_n\rangle\langle X_{n+1},\hat{v}_j\rangle\hat{v}_j, \quad \text{for each } f \in H, \tag{29}$$

where the filter factor is $\hat{Q}_{\delta,j} \equiv Q(\delta,\hat{\lambda}_j)$ with

$$Q(\delta,\hat{\lambda}_j) = \begin{cases} I(j \le m) & \text{for the FPCA method} \\[4pt] \dfrac{\hat{\lambda}_j}{\hat{\lambda}_j+\alpha} & \text{for the FT method} \\[4pt] 1-(1-d\hat{\lambda}_j)^m & \text{for the FLF method} \\[4pt] 1-\displaystyle\prod_{l=1}^{m}\left(1-\frac{\hat{\lambda}_j}{\hat{\theta}_l}\right) & \text{for the FPLS method.} \end{cases} \tag{30}$$

It can also be noticed that if $\hat{\theta}_l = \hat{\theta}_r = \hat{\theta}_0$ for all $l, r = 1,\dots,m$, then FPLS is almost identical to FLF with $d = 1/\hat{\theta}_0$.
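This remark is easy to verify numerically: with equal PLS eigenvalues, the FPLS filter factor collapses exactly to the FLF one. The eigenvalues and constants below are illustrative.

```python
import numpy as np

lam = np.linspace(0.01, 0.5, 10)          # toy eigenvalues lambda_j
theta0, m = 0.8, 7

# FPLS filter with all theta_l equal to theta0
q_fpls = 1.0 - np.prod([1.0 - lam / theta0 for _ in range(m)], axis=0)
# FLF filter with d = 1 / theta0
q_flf = 1.0 - (1.0 - lam / theta0) ** m

print(np.allclose(q_fpls, q_flf))
```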

Given the estimated autoregressive operator $\hat{\Psi}_m$, the best prediction of the one-day-ahead S&P 500 curve is given by

$$\hat{X}_{n+1}(t) = \int_0^1 \hat{\Psi}_m(s,t)X_n(s)\,ds, \quad \text{for each } t \in [0,1]. \tag{31}$$

5 Asymptotic Results

This section studies the rate of convergence of the estimator of the operator $\hat{\Psi}$ and of the predicted functions $\hat{X}_{n+1}$ in the setting where the eigenvalues of the covariance operator $K$ are bounded and decline gradually to zero. This situation is analyzed because, as far as we are concerned, it encompasses most practical case studies in economics and finance. For this purpose, the following assumptions are needed:

Assumption 1 (A1): $\{X_1, \dots, X_N\}$ is a sequence of zero-mean, square-integrable functions following a functional autoregressive process with $E[\|X_n\|^4] < +\infty$, and there exists an integer $k_0 \ge 1$ such that $\|(\Psi^*)^{k_0}\|_L < 1$.

Assumption 2 (A2): $\varepsilon_n$ is an i.i.d. process with $E[\|\varepsilon_n\|^2 \mid X] = \sigma^2$ and $E[\|X_n\|^4] < \infty$. (It can also be assumed that $\varepsilon_n$ is a stationary martingale difference sequence with respect to $\{\varepsilon_{n-1}(t), \varepsilon_{n-2}(t), \dots, X_{n-1}(t), X_{n-2}(t), \dots\}$.)

Assumption 3 (A3): There exist a Hilbert-Schmidt operator $R$ and a positive constant $\beta$ such that

$$\Psi^* = K^{\beta/2} R.$$

This source condition can also be written as

$$\sum_{j=1}^{\infty} \frac{\langle \Psi^*(f), v_j\rangle^2}{\lambda_j^{\beta}} < +\infty \quad \text{for all } f \in H.$$

Assumption 4 (A4): The eigenvalues of the covariance operator $K$ and of the estimated operator $\hat{K}$ are distinct: $\lambda_1 > \lambda_2 > \dots > 0$ and $\hat{\lambda}_1 > \hat{\lambda}_2 > \dots > \hat{\lambda}_N > 0$.

Assumption 5 (A5): $n\lambda_m \to \infty$.

Assumption 1 ensures that the sequence $\{X_n\}$ is a stationary process and that the model admits a unique solution. Assumption 2 imposes that the innovations $\varepsilon_n$ are homoskedastic and ensures that the operators $K$ and $D^*$ are consistently estimated by $\hat{K}$ and $\hat{D}^*$ respectively. Furthermore, since $E[\|X_n\|^4] < +\infty$, the operator $K$ is trace-class and thereby Hilbert-Schmidt. Assumption 3 is a source condition ensuring that the Fourier coefficients $\langle\Psi^*(f), v_j\rangle$ go to zero fast enough relative to the eigenvalues $\lambda_j^\beta$ as $j$ goes to infinity. This condition guarantees that $\Psi^*$ belongs to the orthogonal complement of the null space of the operator $K$. The larger $\beta$ is, the smoother $\Psi^*(f)$ is (see Carrasco et al. (2007), Benatia et al. (2017)). This assumption is needed to control the bias term and will be used to show that the bias term depends on $\beta$. It differs from the assumptions considered by Imaizumi & Kato (2018) and Crambes et al. (2013); indeed, their assumptions are more restrictive and relate to the rate of decrease of the eigenvalues $\lambda_j$, so this source condition is more general. Assumption 4 imposes that the eigenvalues $\lambda_j$ are distinct and consistently estimated by $\hat{\lambda}_j$. Assumption 5 is the sufficient condition under which the expected estimation error goes to zero; it controls the estimation error so that a good balance can be kept between underfitting and overfitting for the FPCA and FPLS estimation methods.

Let us denote by $\Psi^*_\delta$ the regularized version of $\Psi^*$, where $\delta$ is $\alpha$ for the FT method and $m$ for the FPCA, FPLS, and FLF methods. Then, for each function $f \in H$, $\Psi^*_\delta$ can be written as

$$\Psi^*_\delta(f) = \sum_{j=1}^{\infty}\frac{Q_{\delta,j}}{\lambda_j}\langle D^*(f), v_j\rangle v_j = \sum_{j=1}^{\infty}\frac{Q_{\delta,j}}{\lambda_j}\langle K(\Psi^*)(f), v_j\rangle v_j = \sum_{j=1}^{\infty}Q_{\delta,j}\langle \Psi^*(f), v_j\rangle v_j.$$

Then, for each function $f$,

$$\hat{\Psi}^*_\delta(f) - \Psi^*(f) = \{\hat{\Psi}^*_\delta(f) - \Psi^*_\delta(f)\} + \{\Psi^*_\delta(f) - \Psi^*(f)\},$$

where $\{\Psi^*_\delta(f) - \Psi^*(f)\}$ represents the bias term, which goes to zero as the regularization weakens ($m$ increases for FPCA, FPLS, and FLF, or $\alpha$ decreases for FT), and $\{\hat{\Psi}^*_\delta(f) - \Psi^*_\delta(f)\}$ is the estimation error term, which may increase as the regularization weakens.

The conditional MSE is defined by

$$MSE = E\left[\big\|\hat{\Psi}_\delta - \Psi\big\|^2 \,\Big|\, X_1,\dots,X_n\right].$$

Proposition 1: Under Assumptions A1 to A5, if $\|K\|_{op} < 1$, then

$$MSE_{FPCA} = O_p\!\left(\frac{m}{\lambda_m n}\right) + O_p\!\left(\lambda_{m+1}^{\beta}\right), \tag{32}$$

$$MSE_{FPLS} = O_p\!\left(\frac{m}{\theta_m n}\right) + O_p\!\left(\lambda_{m+1}^{\beta}\right), \tag{33}$$

where $\theta_m$ is the smallest root of the residual polynomial $\bar{Q}_{m,j}$. The first $O_p$ term represents the estimation error and the second one the bias term.

Proposition 2: Under Assumptions A1 to A4, if $\|K\|_{op} < 1$, then

$$MSE_{FLF} = O_p\!\left(\frac{m^2}{n}\right) + O_p\!\left(m^{-2\beta}\right). \tag{34}$$

If $\beta > 1$, then

$$MSE_{FT} = O_p\!\left(\frac{1}{n\alpha^2}\right) + O_p\!\left(\alpha^{\min\{\beta,2\}}\right); \tag{35}$$

if $\beta < 1$,

$$MSE_{FT} = O_p\!\left(\frac{\alpha^{\beta}}{n\alpha^2}\right) + O_p\!\left(\alpha^{\beta}\right). \tag{36}$$

Remarks:

• The rates of convergence of FPCA and FPLS depend on the rate of decrease of the eigenvalues of the covariance operator ($\lambda_m$). Indeed, for the FPLS approach, the smallest eigenvalue of the Hankel matrix ($\theta_m$) is such that $\theta_m < \lambda_m$ (see for instance Lingjaerde & Christophersen (2000)). Furthermore, since $\theta_m$ decreases at an exponential rate (see Berg & Szwarc (2011)), the FPLS is most of the time expected to present a high estimation error of the autoregressive operator. Moreover, if the eigenvalues do not decrease very smoothly, FPCA and FPLS may underestimate the autoregressive operator.

• In contrast to the FPCA and FPLS methods, the rates of convergence of the FLF and FT methods do not depend on the decreasing configuration of the eigenvalues. Moreover, due to the saturation property⁴ of the FT method, the FLF approach (which can be viewed as an iterative version of the FT method) should be preferred to FT (see Carrasco et al. (2007)). This pattern should be checked in the simulations.

• Propositions 1 and 2 show that as $m$ increases, the squared bias term decreases while the variance increases. Then $m$ should be chosen optimally, i.e., such that the bias equals the variance. At the optimum:

• If $m \sim n^{1/(2+2\beta)}$, then $MSE_{FLF} \sim n^{-\beta/(1+\beta)}$.

• For $\beta > 1$, if $\alpha \sim n^{-1/(2+2\beta)}$, then $MSE_{FT} \sim n^{-\min\{\beta,2\}/(1+\min\{\beta,2\})}$.

• For $\beta < 1$, if $n\alpha^2 \to \infty$, then $MSE_{FT} \sim \alpha^{\beta}$.

• The upper bounds derived for FPCA and FPLS are more general. In particular, the upper bound obtained for FPCA differs from the one obtained by Imaizumi & Kato (2018), since their assumptions on the rate of decrease of the eigenvalues $\lambda_j$ are more restrictive. The assumption considered by Crambes et al. (2013) is also different and more restrictive than the one considered in this paper, but the same upper bound is obtained. To learn more about the optimal number of functional components $m$ for the FPCA and FPLS, additional assumptions on the eigenvalues are necessary.
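The FLF bias-variance balance in the remarks above can be checked with direct arithmetic: with $m = n^{1/(2+2\beta)}$, the variance order $m^2/n$ and the bias order $m^{-2\beta}$ both equal $n^{-\beta/(1+\beta)}$. The value of $\beta$ below is illustrative.

```python
import numpy as np

beta = 2.0
for n in [1e4, 1e6, 1e8]:
    m = n ** (1.0 / (2.0 + 2.0 * beta))       # optimal number of iterations
    var, bias = m ** 2 / n, m ** (-2.0 * beta)
    target = n ** (-beta / (1.0 + beta))       # claimed MSE rate
    print(np.isclose(var, target) and np.isclose(bias, target))
```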

Proposition 3: Assume that A1 to A4 hold for FLF and FT (A1 to A5 for FPCA). If $E[\|X_i\|^4] < \infty$ and $E[\|X_i\|^2\|\varepsilon_i\|^2] < \infty$, then

$$\sqrt{n}\,\big(\hat{\Psi}^*_\delta - \Psi^*_\delta\big) \to N(0, \Omega_\delta), \tag{37}$$

where $\delta = m$ for the FPCA and FLF and $\delta = \alpha$ for the FT, and

$$\Omega_\delta = E\Big[\big((\varepsilon + K_\delta^{-1}\circ\Psi_\delta(X_i))\otimes K_\delta^{-1}(X_i)\big)\,\bar{\otimes}\,\big((\varepsilon + K_\delta^{-1}\circ\Psi_\delta(X_i))\otimes K_\delta^{-1}(X_i)\big)\Big] - E\Big[K_\delta(\Psi^*)\otimes E(X_i)\Big]\,\bar{\otimes}\,E\Big[K_\delta(\Psi^*)\otimes E(X_i)\Big]. \tag{38}$$

⁴ See chapter 6 of Engl et al. (1996) concerning the saturation property of the ridge regression.

6 Simulation Results

This section compares the performance of the described estimation methods in a finite sample context through Monte Carlo simulations. The main comparisons concern the mean squared error of the estimated autoregressive operator and the mean squared prediction error of the model. The model setting is the FAR(1)

$$X_{n+1}(t) = \int_0^1 \Psi(t,s)X_n(s)\,ds + \varepsilon_{n+1}(t), \qquad n = 1,\dots,N. \tag{39}$$

The three error processes used by Didericksen et al. (2012), $\varepsilon^{(1)}(t)$, $\varepsilon^{(2)}(t)$, and $\varepsilon^{(3)}(t)$, are considered and defined as follows:

$$\varepsilon^{(1)}(t) = W(t) - tW(1) \tag{40}$$

is a Brownian bridge, where $W$ is the standard Wiener process generated as

$$W\!\left(\frac{b}{B}\right) = \frac{1}{\sqrt{B}}\sum_{\ell=1}^{b} Z_\ell, \qquad b = 1,\dots,B,$$

and the $Z_\ell$ are independent standard normal variables with $Z_0 = 0$;

$$\varepsilon^{(2)}(t) = \xi_1\sqrt{2}\sin(2\pi t) + \xi_2\sqrt{2}\,\kappa\cos(2\pi t), \tag{41}$$

where $\xi_1$ and $\xi_2$ are two independent variables following a normal distribution and $\kappa$ is a constant;

and

$$\varepsilon^{(3)}(t) = a\varepsilon^{(1)}(t) + (1-a)\varepsilon^{(2)}(t), \tag{42}$$

where $a \in [0,1]$ is a real constant representing the relative weight of the two components $\varepsilon^{(1)}(t)$ and $\varepsilon^{(2)}(t)$. $\varepsilon^{(1)}(t)$ has an infinite series expansion, $\varepsilon^{(2)}(t)$ is a finite series expansion, and $\varepsilon^{(3)}(t)$ is a combination of the previous two.
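The three innovation processes can be simulated in a few lines; the grid size `B`, `kappa`, and `a` below are illustrative choices, and equation (41) is implemented as reconstructed here.

```python
import numpy as np

rng = np.random.default_rng(2)
B, kappa, a = 200, 1.0, 0.5
t = np.arange(1, B + 1) / B

W = np.cumsum(rng.standard_normal(B)) / np.sqrt(B)       # Wiener process W(b/B)
eps1 = W - t * W[-1]                                     # Brownian bridge, eq. (40)
xi1, xi2 = rng.standard_normal(2)
eps2 = (xi1 * np.sqrt(2) * np.sin(2 * np.pi * t)
        + xi2 * np.sqrt(2) * kappa * np.cos(2 * np.pi * t))  # finite expansion, eq. (41)
eps3 = a * eps1 + (1 - a) * eps2                         # mixture, eq. (42)
print(abs(eps1[-1]) < 1e-12)                             # bridge is pinned at t = 1
```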

The theoretical autoregressive operator $\Psi$ is an integral operator mapping $L^2([0,1])$ to $L^2([0,1])$. Four configurations of $\Psi$ are considered:

Gaussian operator: $\Psi(s,t) = C\exp\!\left[-\frac{t^2+s^2}{2}\right]$;

Identity operator: $\Psi(s,t) = C$;

Sloping plane (t): $\Psi(s,t) = Ct$;

Sloping plane (s): $\Psi(s,t) = Cs$;

with $(s,t) \in [0,1]^2$ and $C$ a constant used to normalize the autoregressive operator.
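Choosing $C$ amounts to rescaling a base kernel to a prescribed norm. The sketch below normalizes the Gaussian kernel to a Hilbert-Schmidt norm of 0.5, as a simple stand-in for the operator-norm normalization used in the simulations; the grid size is illustrative.

```python
import numpy as np

T = 200
s = np.linspace(0.0, 1.0, T)
base = np.exp(-(s[:, None] ** 2 + s[None, :] ** 2) / 2.0)   # Gaussian kernel with C = 1

# Hilbert-Schmidt norm: sqrt of the double integral of the squared kernel
hs_norm = np.sqrt(np.trapz(np.trapz(base ** 2, s, axis=0), s))
C = 0.5 / hs_norm                                            # rescale to norm 0.5
new_norm = np.sqrt(np.trapz(np.trapz((C * base) ** 2, s, axis=0), s))
print(round(new_norm, 6))                                    # → 0.5
```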

Two operator norms are considered, $\|\Psi\|_L = 0.5$ and $\|\Psi\|_L = 0.8$. The interval $[0,1]$ is discretized into 1000 equally-spaced points. Four sample sizes of functional time series $N$ are considered: 50, 100, 200, and 500. For the numerical integration, the trapezoidal rule is used for all operations in the simulations and the real data applications. It has been noticed that a good estimation and prediction of the model depend on the choice of the tuning parameter, namely the number of principal components $m$ for the FPCA and FPLS methods, the number of iterations $m$ for the FLF, and the regularization parameter $\alpha$ for the FT. These parameters are chosen so that the MSE and the MSPE are minimized. Predictive cross-validation (PCV) is performed to run the Monte Carlo experiments and choose the regularization parameter. The PCV chooses the optimal tuning parameter as the one that best predicts the discarded observation curves after estimation:

$$PCV(\delta) = \frac{1}{N-1}\sum_{n=1}^{N-1}\left\|X_{n+1} - \hat{\Psi}_\delta(X_n)\right\|^2,$$

with $\delta$ equal to $m$ for FPCA, FPLS, and FLF, and equal to $\alpha$ for the FT. The selected tuning parameter is then used to compute the prediction error of a new curve at time $n$.
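A minimal PCV loop, sketched here for the FT method only: the estimator `ft_fit` is a simplified stand-in for equation (16), the data are simulated, and the grid of candidate values for $\alpha$ is an assumption.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 80, 60
X = np.cumsum(rng.standard_normal((N, T)), axis=1) / np.sqrt(T)  # toy curves
w = 1.0 / T                                                      # quadrature weight

def ft_fit(Xp, Xf, alpha):
    """Tikhonov kernel estimate on the grid (simplified stand-in for eq. (16))."""
    K = w * Xp.T @ Xp / len(Xp)
    lam, v = np.linalg.eigh(K * w)
    v = v / np.sqrt(w)                       # L2-normalized eigenfunctions
    scores = w * Xp @ v                      # <X_n, v_j>
    D = scores.T @ Xf / len(Xp)
    return (v / (lam + alpha)) @ D           # filter Q/lambda = 1/(lambda + alpha)

def pcv(alpha):
    Psi = ft_fit(X[:-1], X[1:], alpha)
    pred = w * X[:-1] @ Psi                  # integral of Psi(s, t) X_n(s) ds
    return np.mean(w * np.sum((X[1:] - pred) ** 2, axis=1))

alphas = [0.001, 0.01, 0.1, 1.0]
best = min(alphas, key=pcv)                  # PCV-selected tuning parameter
print(best in alphas)
```

In practice the criterion would be evaluated on held-out curves rather than in-sample, but the selection mechanics are the same.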

Indicators for prediction error:

To measure the prediction quality, two indicators are considered: the root integrated squared error ($E_n$) and the integrated absolute error ($R_n$), given by

$$E_n = \sqrt{\int_0^1 \left(\hat{X}_n(t) - X_n(t)\right)^2 dt},$$

$$R_n = \int_0^1 \left|\hat{X}_n(t) - X_n(t)\right| dt.$$
Indicators for estimation error:

To analyze the estimation error, three criteria are considered: the mean squared error (MSE), the average distance (AD), and the ratio averaged distance (RAD), given by

$$MSE = \sqrt{\int_0^1\!\!\int_0^1 \left(\hat{\Psi}(s,t) - \Psi(s,t)\right)^2 ds\,dt},$$

$$AD = \int_0^1\!\!\int_0^1 \left|\hat{\Psi}(s,t) - \Psi(s,t)\right| ds\,dt,$$

$$RAD(\Psi) = \int_0^1\!\!\int_0^1 \frac{\left|\hat{\Psi}(s,t) - \Psi(s,t)\right|}{|\Psi(s,t)|}\,ds\,dt.$$

6.1 The Gaussian kernel model

Brownian motion innovation

The Gaussian kernel model with Brownian motion innovation is considered first. For the small sample size case ($N = 50$), it can be observed that, in terms of estimation, the FLF and FT methods outperform the others, while in terms of prediction all the compared methods seem to reach the same performance, except that the FPLS tends to be sensitive to the presence of functional outliers. Figure 3 displays these results. The fact that FLF and FT outperform the FPCA and FPLS can be explained by the fact that their convergence rates do not depend on the rate of decrease of the eigenvalues $\lambda_j$, while the others' do. Furthermore, the prediction error is almost the same for all the methods because there is a smoothing effect when using the integral to compute the predicted function. Finally, as the sample size increases, the bias and the estimation error are reduced.
Figure 3: Comparison of the different estimation techniques. Gaussian kernel with n = 50 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)
As the sample size increases and becomes very large ($N = 200$ and $N = 500$), the same pattern is observed and clearly perceptible, but the dispersion is reduced for the FPLS method (see Figure 4). Another reason for the gap observed between these techniques in terms of estimation performance is that the eigenvalues $\lambda_j$ may decrease very quickly (potentially faster than an exponential rate). This pattern has also been noticed by Didericksen et al. (2012). In that situation, introducing a regularization parameter for each functional component of the FPCA and FPLS (or on the eigenvalues $\lambda_j$ and $\theta_j$) should improve the results. Those regularization parameters should also be chosen optimally in terms of estimation or prediction.

Smooth innovation

Considering smooth innovation terms amounts to considering the case of smooth data. In this case, for the Gaussian kernel model, it is observed that the FPLS is still strongly affected by the functional outliers. The gap between the methods in terms of estimation
Figure 4: Comparison of the different estimation techniques. Gaussian kernel with n = 500 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

and prediction is not very pronounced, whatever the sample size considered. It may be necessary to find a way to handle those outliers in order to extract more information (see Figures 5 and 6).

Mixed innovations

The case of mixed innovations displays almost the same results as the previous one. Indeed, if more weight is attributed to the smooth innovations, then the same results as in the smooth-innovation case are obtained, while if more weight is attributed to the Brownian motion innovation, the results are similar to that case.

6.2 The Identity kernel

When considering a different kernel, for instance the identity one, the same pattern is observed for the evaluation criteria in the context of Brownian motion innovations, in terms of both estimation and prediction. The gap is still observable in terms of estimation quality, while the prediction performances are almost the same (see Figures 7, 16, and 17).

When the sample size becomes very large ($N = 500$), the prediction error with FPLS becomes smaller than the one obtained by the other methods (see Figure 8). Furthermore, as
Figure 5: Comparison of the different estimation techniques. Gaussian kernel with n = 50 and ε(2). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

the sample size increases, the bias and the estimation error are reduced in terms of prediction. All the results are presented at the optimal tuning parameter.

6.3 The Slope kernel

The same results are obtained for the slope kernel case, as can be seen in Figures 9, 10, and 11.

7 Application to the S&P 500 intraday data

7.1 Data

The S&P 500 index futures data is used to analyze intraday return predictability. The sample runs from 01/01/2008 to 12/31/2017 and is collected from the website www.backtestmarket.com. This sample is used in different parts of the analysis: the return predictability, the identification of the intraday momentum and its main causes, and the robustness of the results in many different contexts such as the volatility effect, the volume effect, the liquidity effect, the aftermath of the
Figure 6: Comparison of the different estimation techniques. Gaussian kernel with n = 200 and ε(2). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

financial crisis, the macroeconomic news release effect (Federal Open Market Committee (FOMC), Consumer Price Index (CPI), Gross Domestic Product (GDP)), the infrequent rebalancing effect, and other ideas.

7.2 Intraday momentum: empirical analysis

To start the empirical analysis, the simple functional autoregressive model is considered, where the current cumulative intraday market return is used to predict the next day's. The years considered are 2015-2017; these results will be tested on the other years of our database. The regression model is given by

$$X_{n+1}(t) = \Psi_0(t) + \int_0^1 \Psi(s,t)X_n(s)\,ds + \varepsilon_{n+1}(t), \qquad n = 1,\dots,750. \tag{43}$$

The sample size for this regression period is $N = 750$. This sample is split into two parts, the regression and prediction part ($N_1 = 650$ days) and the validation and testing part ($N_2 = 100$ days), in order to choose the optimal tuning parameter for each estimation method. Each day is represented by 273 five-minute discretization points over the 24 hours of the day. Figure 12

Figure 7: Comparison of the different estimation techniques. Identity kernel with n = 50 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

displays a contour plot representing how correlated the previous day's cumulative return shape is to the next one on a 5-minute timeframe, that is, the estimated autoregressive operator $\hat{\Psi}(s,t)$. It can be seen that the FT and FLF display almost similar results in terms of estimation, while the FPCA and FPLS display different results. The yellow areas represent positive correlation while the blue ones represent negative correlation. It can be observed that the previous trading day's first opening hour contributes positively to the prediction of the next day's momentum, while the last-hour return of the previous day predicts the next day's return positively in the first half and negatively in the second half of the next trading day. This momentum can be explained by many different causes.

The next step is to plot the predictive functional R-squared. From Figure 13, it is easy to see that the most important time of the day to buy stocks is around 9:30 AM - 10:30 AM. The result is similar for all the estimation methods. The FPLS tends to reach a remarkable value of 8% at the beginning and the end of the trading session, which is almost 4 times the one obtained by Gao et al. (2018) and twice the one obtained by Zhang et al. (2019). The FPCA and FT methods tend to reach a predictive $R^2$ of 5% and 6% respectively in the morning (nearly twice the one obtained by Gao et al. (2018)) and nearly 2.5% at the end of the trading session for the FT approach. Surprisingly,

Figure 8: Comparison of the different estimation techniques. Identity kernel with n = 500 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

the FPCA is not able to capture the momentum at the end of the trading session. According to the FLF approach, the $R^2$ is around 2.5% at the beginning of the trading session, while in the second half of the day it is approximately 1.2%.

7.3 Main causes of the intraday momentum

The next step is to observe how this model performs out-of-sample, in order to examine its stability to irregular events and what the main causes of momentum are in such settings. The new regression model configuration is given by

$$X_{n+1}(t) = \Psi_0(t) + \int_0^1 \Psi(s,t)X_n(s)\,ds + \rho_1 I_{e,n} + \rho_2 I_{e,n} R_n + \varepsilon_{n+1}(t), \qquad n = 1,\dots,750, \tag{44}$$

where $I_{e,n}$ is a dummy variable taking the value 1 if the event of interest happened on day $n$ and 0 otherwise, and $R_n$ is the cumulative return at the close of day $n$. $\rho_2$ captures the interaction effect between $I_{e,n}$ and $R_n$. $I_{e,n}$ can indicate whether there was high-impact news on day $n$, a financial crisis day, a high-volume day, or a high-volatility day.

The Frisch-Waugh theorem is used to estimate this equation.
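The Frisch-Waugh logic can be illustrated in the familiar scalar-regressor case: the coefficient of one regressor in the full regression equals the coefficient obtained after partialling the other regressors out of both sides. The data and variable names below are purely illustrative, not the paper's series.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
x1 = rng.standard_normal(n)
x2 = 0.3 * x1 + rng.standard_normal(n)          # correlated regressors
y = 1.0 * x1 + 2.0 * x2 + rng.standard_normal(n)

# Full regression y ~ 1 + x1 + x2
X = np.column_stack([np.ones(n), x1, x2])
beta_full = np.linalg.lstsq(X, y, rcond=None)[0]

# Frisch-Waugh: partial [1, x1] out of y and x2, then regress residuals
Z = np.column_stack([np.ones(n), x1])
M = np.eye(n) - Z @ np.linalg.pinv(Z)           # annihilator (residual-maker) matrix
beta_fw = (M @ x2) @ (M @ y) / ((M @ x2) @ (M @ x2))
print(np.isclose(beta_full[2], beta_fw))
```

In the paper's setting the same partialling-out is applied with the functional term $\int \Psi(s,t)X_n(s)\,ds$ in the role of the nuisance regressors.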

Figure 9: Comparison of the different estimation techniques. Slope kernel (s) with n = 50 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

8 Conclusion

Figure 10: Comparison of the different estimation techniques. Slope kernel (s) with n = 100 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

9 Appendices

9.1 Proof of Proposition 1

(To be completed)

9.2 Proof of Proposition 2

(To be completed)

9.3 Proof of Proposition 3

(To be completed)

9.4 Graphs

Figure 11: Comparison of the different estimation techniques. Slope kernel (s) with n = 200 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

References

Aguilera, A. M., Escabias, M., Preda, C., & Saporta, G. (2010). Using basis expansions for estimating

functional pls regression: applications with chemometric data. Chemometrics and Intelligent

Laboratory Systems, 104 (2), 289–305.

Antoniadis, A., & Sapatinas, T. (2003). Wavelet methods for continuous-time prediction using Hilbert-valued autoregressive processes. Journal of Multivariate Analysis, 87 (1), 133–158.

Aue, A., Norinho, D. D., & Hörmann, S. (2015). On the prediction of stationary functional time

series. Journal of the American Statistical Association, 110 (509), 378–392.

Benatia, D., Carrasco, M., & Florens, J.-P. (2017). Functional linear regression with functional

response. Journal of Econometrics, 201 (2), 269–291.

Berg, C., & Szwarc, R. (2011). The smallest eigenvalue of hankel matrices. Constructive

Approximation, 34 (1), 107–133.

Besse, P. C., & Cardot, H. (1996). Approximation spline de la prévision d’un processus fonctionnel

autorégressif d’ordre 1. Canadian Journal of Statistics, 24 (4), 467–487.

Figure 12: The estimated autoregressive operator. Years 2015-2017. (Contour plots of the estimated kernels for FPCA, FPLS, FT, and FLF; axes: current day and next day, 9:30 AM - 4:00 PM; figure not reproduced.)

Besse, P. C., Cardot, H., & Stephenson, D. B. (2000). Autoregressive forecasting of some functional

climatic variations. Scandinavian Journal of Statistics, 27 (4), 673–687.

Bogousslavsky, V. (2016). Infrequent rebalancing, return autocorrelation, and seasonality. The Journal

of Finance, 71 (6), 2967–3006.

Bosq, D. (2000). Linear processes in function spaces: Theory and applications, volume 149 of lecture

notes in statistics. Springer-Verlag New York Inc.

Carrasco, M., Florens, J.-P., & Renault, E. (2007). Linear inverse problems in structural econometrics

estimation based on spectral decomposition and regularization. Handbook of econometrics, 6 ,

5633–5751.

Figure 13: The estimated functional R-squared. Years 2015-2017. (Functional R²(t) over the trading day, 9:30 AM - 4:00 PM, for FPCA, FPLS, FT, and FLF; figure not reproduced.)

Carrasco, M., & Rossi, B. (2016). In-sample inference and forecasting in misspecified factor models.

Journal of Business & Economic Statistics, 34 (3), 313–338.

Chan, E. (2013). Algorithmic trading: winning strategies and their rationale (Vol. 625). John Wiley

& Sons.

Chu, X., Gu, Z., & Zhou, H. (2019). Intraday momentum and reversal in Chinese stock market. Finance Research Letters, 30, 83–88.

Crambes, C., Mas, A., et al. (2013). Asymptotics of prediction in functional linear regression with

functional outputs. Bernoulli , 19 (5B), 2627–2651.

Delaigle, A., Hall, P., et al. (2012). Methodology and theory for partial least squares applied to

functional data. The Annals of Statistics, 40 (1), 322–352.

Figure 14: Comparison of the different estimation techniques. Gaussian kernel with n = 100 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

Didericksen, D., Kokoszka, P., & Zhang, X. (2012). Empirical properties of forecasts with the functional

autoregressive model. Computational statistics, 27 (2), 285–298.

Engl, H. W., Hanke, M., & Neubauer, A. (1996). Regularization of inverse problems (Vol. 375).

Springer Science & Business Media.

Ferraty, F., & Vieu, P. (2006). Nonparametric functional data analysis: theory and practice. Springer

Science & Business Media.

Gao, L., Han, Y., Li, S. Z., & Zhou, G. (2018). Market intraday momentum. Journal of Financial

Economics, 129 (2), 394–414.

Griffin, J. M., Ji, X., & Martin, J. S. (2003). Momentum investing and business cycle risk: Evidence

from pole to pole. The Journal of Finance, 58 (6), 2515–2547.

Groen, J. J., & Kapetanios, G. (2009). Revisiting useful approaches to data-rich macroeconomic

forecasting.

He, X.-Z., & Li, K. (2015). Profitability of time series momentum. Journal of Banking & Finance,

53 , 140–157.

Figure 15: Comparison of the different estimation techniques. Gaussian kernel with n = 200 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

Helland, I. S. (1988). On the structure of partial least squares regression. Communications in

statistics-Simulation and Computation, 17 (2), 581–607.

Heston, S. L., Korajczyk, R. A., & Sadka, R. (2010). Intraday patterns in the cross-section of stock

returns. The Journal of Finance, 65 (4), 1369–1407.

Hörmann, S., & Kokoszka, P. (2012). Functional time series. In Handbook of statistics (Vol. 30, pp.

157–186). Elsevier.

Höskuldsson, A. (1988). PLS regression methods. Journal of Chemometrics, 2 (3), 211–228.

Hyndman, R. J., & Shang, H. L. (2009). Forecasting functional time series. Journal of the Korean

Statistical Society, 38 (3), 199–211.

Imaizumi, M., & Kato, K. (2018). PCA-based estimation for functional linear regression with functional responses. Journal of Multivariate Analysis, 163, 15–36.

Jegadeesh, N., & Titman, S. (1993). Returns to buying winners and selling losers: Implications for

stock market efficiency. The Journal of finance, 48 (1), 65–91.

Figure 16: Comparison of the different estimation techniques. Identity kernel with n = 100 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

Kargin, V., & Onatski, A. (2008). Curve forecasting by functional autoregression. Journal of

Multivariate Analysis, 99 (10), 2508–2526.

Kelly, B., & Pruitt, S. (2015). The three-pass regression filter: A new approach to forecasting using

many predictors. Journal of Econometrics, 186 (2), 294–316.

Kokoszka, P., & Young, G. (2016). KPSS test for functional time series. Statistics, 50 (5), 957–973.

Kokoszka, P., & Zhang, X. (2010). Improved estimation of the kernel of the functional autoregressive

process (Tech. Rep.). Technical Report. Utah State University.

Kokoszka, P., & Zhang, X. (2012). Functional prediction of intraday cumulative returns. Statistical

Modelling, 12 (4), 377–398.

Lehmann, B. N. (1990). Fads, martingales, and market efficiency. The Quarterly Journal of Economics,

105 (1), 1–28.

Lingjaerde, O. C., & Christophersen, N. (2000). Shrinkage structure of partial least squares.

Scandinavian Journal of Statistics, 27 (3), 459–473.

Figure 17: Comparison of the different estimation techniques. Identity kernel with n = 200 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

Moskowitz, T. J., Ooi, Y. H., & Pedersen, L. H. (2012). Time series momentum. Journal of Financial Economics, 104(2), 228–250.

Neely, C. J., Rapach, D. E., Tu, J., & Zhou, G. (2014). Forecasting the equity risk premium: The role of technical indicators. Management Science, 60(7), 1772–1791.

Ramsay, J. O., & Silverman, B. W. (2007). Applied functional data analysis: Methods and case studies. Springer.

Renault, T. (2017). Intraday online investor sentiment and return patterns in the US stock market. Journal of Banking & Finance, 84, 25–40.

Shang, H. L. (2017). Forecasting intraday S&P 500 index returns: A functional time series approach. Journal of Forecasting, 36(7), 741–755.

Sun, L., Najand, M., & Shen, J. (2016). Stock return predictability and investor sentiment: A high-frequency perspective. Journal of Banking & Finance, 73, 147–164.

Figure 18: Comparison of the different estimation techniques. Slope kernel (t) with n = 50 and ε(1). [Box plots of the MSE, RAD, AD, Rn, and En criteria for the FPCA, FPLS, FT, and FLF estimators.]

Wold, S., Ruhe, A., Wold, H., & Dunn, W., III. (1984). The collinearity problem in linear regression: The partial least squares (PLS) approach to generalized inverses. SIAM Journal on Scientific and Statistical Computing, 5(3), 735–743.

Zhang, Y., Ma, F., & Zhu, B. (2019). Intraday momentum and stock return predictability: Evidence from China. Economic Modelling, 76, 319–329.

Figure 19: Comparison of the different estimation techniques. Slope kernel (t) with n = 100 and ε(1). [Box plots of the MSE, RAD, AD, Rn, and En criteria for the FPCA, FPLS, FT, and FLF estimators.]

Figure 20: Comparison of the different estimation techniques. Slope kernel (t) with n = 200 and ε(1). [Box plots of the MSE, RAD, AD, Rn, and En criteria for the FPCA, FPLS, FT, and FLF estimators.]

Figure 21: The estimated autoregressive operator, years 2011–2014. [Heat maps of the kernel estimated by FPCA, FPLS, FT, and FLF; horizontal axis: current day (9:30 AM – 4:00 PM), vertical axis: next day (9:30 AM – 4:00 PM).]

Figure 22: The estimated functional R-squared, years 2011–2014. [Pointwise R²(t) over the trading day (9:30 AM – 4:00 PM) for the FPCA, FPLS, FT, and FLF estimators.]
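Figure 22 reports a pointwise functional R-squared over the trading day. As an illustration only (the paper does not supply code), the sketch below computes one standard definition, R²(t) = 1 − SSR(t)/SST(t), separately at each intraday grid point t; the grid size, simulated curves, and toy forecasts are all assumptions, not the paper's data or model.

```python
import numpy as np

# Hypothetical setup: Y[i, j] is the cumulative intraday return of day i at
# grid point j; Yhat[i, j] is a model's one-day-ahead forecast of that value.
rng = np.random.default_rng(0)
n_days, n_grid = 250, 78  # e.g. a 5-minute grid over a 6.5-hour session
Y = np.cumsum(rng.normal(0.0, 1e-3, (n_days, n_grid)), axis=1)
Yhat = 0.2 * Y + rng.normal(0.0, 1e-3, (n_days, n_grid))  # toy forecasts

def functional_r2(Y, Yhat):
    """Pointwise R^2(t) = 1 - SSR(t)/SST(t), one value per grid point."""
    ssr = np.sum((Y - Yhat) ** 2, axis=0)               # residual sum of squares at each t
    sst = np.sum((Y - Y.mean(axis=0)) ** 2, axis=0)     # total sum of squares at each t
    return 1.0 - ssr / sst

r2 = functional_r2(Y, Yhat)  # curve of length n_grid, as plotted in Figure 22
```

The resulting vector `r2` can be plotted against normalized intraday time on [0, 1] to reproduce the shape of the panels in Figure 22.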

