
Intraday Market Momentum: A Functional Econometric Approach

Idriss Tsafack∗1

Department of Economics, University of Montreal

June 12, 2019

Abstract

This paper analyzes the predictability of S&P 500 intraday cumulative returns with a fully
functional autoregressive model. This approach is practically important because market participants
can use the forecast results to tactically adjust their market timing exposure for the next day. Since
the model involves very high-dimensional data, one of the main challenges is to estimate the
autoregressive operator properly. In order to obtain a more stable estimation, four different methods
are suggested: the Functional Principal Component Analysis (FPCA), the Functional Partial Least
Squares (FPLS), the Functional Tikhonov method (FT) and the Functional Landweber-Fridman
technique (FLF). The convergence rate of the estimated autoregressive operator is derived for each
method. Monte Carlo simulations and a real data application show that the FLF and FT methods
outperform the other methods across most model settings. Moreover, based on the dynamics of the
previous day, it is always better for traders to expose their portfolio only during the first opening hour
of the market. If traders were more active in the first half of the previous day, it is possible to hold
the investment over the whole trading session of the next day; otherwise they should hold at most
through the first half of the next day and turn to a reversal strategy in the second half. Using the
FPLS estimation approach, it is possible to reach a remarkable R2 value of 8% in the first and last
hours of a trading session, almost 4 times the one obtained by Gao et al. (2018).

Keywords: Time series momentum, intraday momentum, functional linear regression, principal
component analysis, regularization method, return predictability, big data.
JEL Codes: C01, C49, C53, C55, C62, G12, G13, G15.

Department of Economics, University of Montreal, 3150 rue Jean-Brillant, Montréal, QC H3T 1N8, Canada
(idriss.tsafack.teufack@umontreal.ca)

1 Introduction

Following the idea of Jegadeesh & Titman (1993), a very large literature documents
the success of momentum strategies in financial markets. The idea behind these strategies is
that market participants should buy the assets that have outperformed and sell those that have
underperformed, based on their historical evolution. The literature distinguishes two types of momentum
strategies: cross-sectional momentum, widely discussed by authors such as Griffin et al.
(2003) and Lehmann (1990), and time series momentum, developed by Moskowitz et al.
(2012), Neely et al. (2014) and others. Unfortunately, most of these papers tend to focus on long time
horizons, typically a month or longer. The issue with long-term strategies is that they display very low Sharpe
ratios and weak backtest statistical significance because of infrequent independent trading signals. Another
issue is that long-term momentum usually underperforms in the aftermath of financial crises.[1]
Research on such momentum patterns at the intraday granularity has recently started with Gao
et al. (2018), who show evidence that the first half-hour return positively predicts the last half-hour
return in the U.S. stock market. Furthermore, Sun et al. (2016) and Renault (2017) have identified
that intraday stock returns can be predicted by high-frequency investor sentiment.

In the same vein, this paper proposes to analyze the predictability of S&P 500 intraday cumulative returns
with a fully functional autoregressive model. In contrast to traditional financial analysis, the
intraday cumulative returns are here viewed as curves in a functional space. The
shape of the cumulative return observed at the 5-minute frequency is then used to predict the next-day
shape.[2] This approach is practically important because market participants can use the forecast
results to tactically adjust their market timing exposure for the next trading day. Furthermore, from
an econometric point of view, functional data analysis is interesting since it makes it possible to
take into account additional information, namely the dynamics of the return from one time of the
day to another. Moreover, this approach gives a useful overview of market participants'
behavior within a trading day, such as the U-shape of the volatility and the most relevant periods to take
investment or portfolio rebalancing actions within a trading day.

When using a functional autoregressive model, because the model setting is directly exposed to very

[1] See chapter 7 of Chan (2013).
[2] The 5-minute frequency is considered here just for illustration purposes. It is possible to use other timeframes such as 10
minutes, 15 minutes, 1 minute, or tick data.

high-dimensional data, one of the most important challenges is to estimate the autoregressive
operator.[3] Indeed, if one is not careful about the manipulation of the functional objects, there is a high
probability of obtaining very unstable estimators of the autoregressive operator. To control the stability
of the estimated parameter, this paper compares four different estimation methods: the
FPCA, FPLS, FT and FLF methods. These methods can also be viewed as regularization methods,
since they all depend on a tuning parameter. To assess the quality of the estimation methods, their
respective convergence rates and asymptotic normality are derived and analyzed under some general
regularity conditions. These asymptotic normality results are useful to test the significance of
the predictability and to develop curve interval forecasts.

To compare the four estimation methods, Monte Carlo simulations were conducted. The
comparison is based on five criteria: the Mean Squared Error (MSE), the Mean Average Distance
(AD) and the Ratio Average Distance (RAD), which measure the quality of the estimation of the kernel,
on one hand, and the Mean Squared Prediction Error measured with two different approaches (En and
Rn), on the other hand. Across a large part of the model settings considered, the simulations show
that the FLF and FT methods tend to outperform the others in terms of estimation of the autoregressive
operator, whatever the sample size and the error terms of the model. In terms of prediction, the
four methods behave almost the same way for small sample sizes, while for large sample
sizes the FPLS approach tends to outperform the others in terms of prediction errors. This result is
valid for a large class of error term configurations.

The empirical analysis of this paper uses S&P 500 futures data from 01/01/2008 to 12/31/2017
at the 5-minute frequency and provides several interesting findings. An overview
of the results shows evidence that the cumulative intraday return shape of the current
trading day contributes significantly to predicting the next-day cumulative return shape. The different
estimation methods provide different results. The FT and FLF display almost similar estimates
of the autoregressive operator, while the FPCA and FPLS tend to capture the prediction results well.
Based on the predictive R2, the FPLS reaches a remarkable value of 8% at the beginning and
the end of the trading session, almost 4 times the one obtained by Gao et al. (2018) and twice
the one obtained by Zhang et al. (2019). The FPCA and FT methods tend to reach a predictive R2 of
5% and 6% respectively in the morning (nearly twice the one obtained by Gao et al. (2018))
[3] The autoregressive operator is similar to the slope parameter in the context of a simple OLS model.

and, for the FT approach, nearly 2.5% at the end of the trading session. Surprisingly, the FPCA is
not able to capture the momentum at the end of the trading session. According to the FLF approach,
the R2 is around 2.5% at the beginning of the trading session, while in the second half of the day it is
approximately 1.2%.

An examination of the autoregressive operator shows that a strong momentum in the first half
of a trading day is positively correlated with the next-day shape, while a strong momentum in the
second half of a trading session is significantly positively correlated with the first half of the next trading
day and negatively correlated with the second half of the next trading day. The first opening hour of the previous
day is significantly positively correlated with that of the next day. This result is
based on the FLF and FT estimation methods. Moreover, the shape of the previous day helps
to recover the U-shaped volatility pattern, with high volume and volatility in the
first and last hours of the next trading day. This result is easily observed when the FPLS
approach is used, and is similar to the one obtained by Gao et al. (2018). Exploring
the effect of volatility and trading volume on the intraday momentum, we observe that the
intraday momentum is stronger when volatility is high at the beginning and at the end of the trading
session; the same result is observed for trading volume. These results can be explained
by the portfolio rebalancing pattern, the late-informed investor effect, the market manipulation of
high-frequency traders, and the often forced sales or purchases of assets by various types of funds.

This paper is related to Gao et al. (2018) and Zhang et al. (2019) in the financial market
context. The difference with their work is that the cumulative returns are observed
as curves instead of instantaneous returns. Moreover, this approach makes it easy to document the
U-shaped volatility, and the in-sample and out-of-sample R2 functions can reach an impressive predictive
performance of 8%, that is, 4 times the performance of Gao et al. (2018). The idea of using cumulative
returns is inspired by the papers of Kokoszka & Zhang (2012) and Shang (2017). Indeed, Kokoszka
& Zhang (2012) and Shang (2017) use individual assets and market indices respectively, but their main purpose
is to compare forecasting methods for the next-day curve. The difference with their papers is
that this paper cares about both predictability and forecasting, proposing four different estimation methods.
The consistency results of the estimation methods (for the autoregressive operator) are presented, and
comparisons of those methods are made through simulations and an empirical analysis. An economic

significance analysis is also derived. This paper is also related to the functional linear regression with
functional response, widely developed by Benatia et al. (2017) and Imaizumi & Kato (2018), and to the functional
autoregressive model, extensively developed by Bosq (2000), Kargin & Onatski (2008) and Kokoszka
& Zhang (2012).

The rest of the paper is organized as follows. Section 2 presents the related
literature. The functional econometric model is presented in Section 3. In Section 4, I present how
to estimate the model using the four aforementioned methods. Section 5 is devoted to the
convergence rate and the asymptotic normality of the estimated autoregressive operator. Section
6 presents the comparison of the four methods based on Monte Carlo simulations. The real data
application is developed in Section 7, and Section 8 concludes.

2 Related literature

This paper is related to four strands of literature: return predictability, momentum strategies in
financial markets, functional data analysis, and the functional autoregressive model.

Momentum strategies became very popular with the well-known seminal work of Jegadeesh &
Titman (1993). They developed a cross-sectional momentum strategy at the monthly frequency and
showed that buying past winners and selling past losers can generate a significant positive return over
the next 3 to 12 months of holding. This work has been widely extended by Griffin et al. (2003),
who show evidence that this momentum can be observed in different stock markets such as the U.S.,
Europe, Asia and Australia, but not in the Japanese market.

In contrast to cross-sectional momentum, Moskowitz et al. (2012) documented the success
of the time series momentum strategy in equity index, currency, commodity, and bond futures. The
idea behind this strategy is to look at one asset at a time instead of a group of assets. Indeed, they
show evidence that the previous 12-month returns of an asset contribute significantly to predicting
its future returns. He & Li (2015) proposed a continuous-time heterogeneous agent model in
order to explain the significance of time series momentum. They show that momentum strategies
perform well when the market is dominated by momentum traders. They specifically documented
that short-term momentum strategies tend to stabilize the market, while long-term momentum
tends to destabilize it.

Furthermore, recent research by Gao et al. (2018) has demonstrated intraday momentum
in the U.S. stock market at the 30-minute frequency. Indeed, they show that the first half-hour
return contributes to predicting the last half-hour return, and that the effect is stronger on more volatile
days, higher volume days, recession days and high-impact news release days. In the same line, Zhang
et al. (2019) documented almost the same results in the Chinese stock market and explained that
this momentum can be explained not only by infrequent rebalancing or late-informed investors
but also by the U-shaped volume pattern. Chu et al. (2019) identified not only that the last half
hour is positively predicted by the first half hour, but also a reversal effect in the second
half hour of the trading day in the Chinese stock market. They also find that this momentum and
reversal effect is robust to including the previous day return and day-of-week effects. Besides those authors,
Heston et al. (2010) discovered a striking pattern of return continuation at half-hour intervals that
are exact multiples of a trading day, over a 40-day time horizon.

Concerning the main causes of momentum, Chan (2013) listed four: the persistence of the
sign of roll returns for futures; the slow diffusion, analysis, and acceptance of new information; the
forced sales or purchases of assets by various types of funds; and market manipulation by high-frequency
traders. On the other hand, based on theoretical analysis, Bogousslavsky (2016) identified
infrequent rebalancing and the late-informed investors' effect. Indeed, the infrequent rebalancing
phenomenon is described by the fact that a certain group of investors decides to rebalance their portfolios
early in the morning, while others decide to do so near the market close. On the other hand, the
late-informed investors' effect reflects the fact that news releases are an important factor contributing
significantly to return predictability. These hypotheses have been confirmed by Gao et al. (2018) and
Zhang et al. (2019), but not by Chu et al. (2019), who argue that noise trading drives intraday
return predictability. Besides those causes, high-frequency investor sentiment can also
significantly affect time series momentum, as documented by Sun et al. (2016) and Renault
(2017).

The literature on functional data analysis has attracted a lot of attention in the statistical field
during the last decade. Some of the pioneers are Ramsay & Silverman (2007), who developed the
general framework, Hyndman & Shang (2009), Hörmann & Kokoszka (2012), Kokoszka & Zhang (2012),
and Ferraty & Vieu (2006), who developed a semiparametric approach. One of the main
challenges is to estimate the slope function (when the response is a scalar) or
the operator (when the response variable is a function) because of the high-dimensionality issue. More
recently, authors such as Crambes et al. (2013), Benatia et al. (2017) and Imaizumi & Kato
(2018) have analyzed the rate of convergence of the estimated parameter in the context
of an i.i.d. model using the FT and FPCA methods.

The functional autoregressive model became popular with Bosq (2000), who considers a parametric
approach for estimation. Besse et al. (2000) considered the same model and adopted a
nonparametric estimation approach. Kargin & Onatski (2008) proposed a predictive factor
approach to estimate the autoregressive operator; the idea is to project the data on a
set of factors that are relevant for the prediction. Didericksen et al. (2012) compared the method
of Kargin & Onatski (2008) with the FPCA and showed that the FPCA outperforms the predictive factor
approach in terms of estimation, while the two reach the same prediction error. Hyndman
& Shang (2009) and Aue et al. (2015) proposed univariate and multivariate time series
forecasting methods, since the PCA scores can display temporal dependence.

The use of functional autoregressive models, or of functional data analysis more generally, is less common in
the financial market field, to the best of our knowledge. The use of cumulative returns is
interesting because the curves are more informative, in the sense that they take into account the dynamics
between two discretization points. The only papers that can be identified are the ones by Kokoszka &
Zhang (2012), who proposed to predict an individual stock using a functional CAPM, and Shang
(2017), who suggested forecasting the U.S. stock market using the dynamic updating technique. Their
main purpose is to compare different forecasting methods, not to analyze return predictability.

3 The Model Setting

In this paper, for each day we observe the shape of the intraday 5-minute cumulative return of the
S&P 500 future. We use the cumulative intraday returns (CIDRs) as in Gabrys et al.
(2010) and Kokoszka et al. (2015). Let Pi(tj) be the 5-minute closing price of a financial asset at time
tj, on a given day i, with j = 1, ..., 273. The return is defined as

Ri(tj) = 100 [ln(Pi(tj)) − ln(Pi(t1))]    (1)

[Figure 1, three panels: the S&P 500 price level within the year 2017 (left), the univariate instant
returns (middle), and the functional time series of intraday cumulative returns (right); horizontal
axis: time of the day.]

Figure 1: Intraday cumulative returns of the S&P 500 index in 2017

The continuous curves are constructed by

Xi(t) = Ri(tj) for t ∈ (5(j − 1), 5j].    (2)
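As a toy illustration, the construction in equations (1)-(2) can be sketched in Python; the price values below are hypothetical, and the step-function evaluation assumes the 5-minute grid described above.

```python
import numpy as np

def cidr(prices):
    """Eq. (1): R_i(t_j) = 100 * (ln P_i(t_j) - ln P_i(t_1))."""
    logp = np.log(np.asarray(prices, dtype=float))
    return 100.0 * (logp - logp[0])

def as_curve(r):
    """Eq. (2): step function X_i(t) = R_i(t_j) for t in (5(j-1), 5j] minutes."""
    def X(t_minutes):
        j = max(int(np.ceil(t_minutes / 5.0)), 1)   # index of the 5-minute bin
        return r[j - 1]
    return X

prices = [2300.0, 2302.5, 2301.0, 2305.0]   # one toy day of 5-minute closes
r = cidr(prices)                            # r[0] == 0 by construction
X = as_curve(r)                             # X(t) is constant on each 5-minute bin
```

By construction the curve starts at zero and is piecewise constant between consecutive 5-minute marks.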

Figure 1 represents the constructed intraday cumulative returns of the S&P 500 based on the raw
data for the year 2017. Both the instant returns (middle panel) and the cumulative return curves
(right panel) display at least 4 to 5 outliers.

Let (Xi : i ∈ Z) be an arbitrary stationary functional time series of S&P 500 cumulative return
curves. It is assumed that each function Xi is an element of the separable Hilbert space H = L2([0, 1])
(the space of square-integrable functions mapping from the compact interval [0, 1] to R) endowed with
the inner product < f, g >_H = ∫₀¹ f(t)g(t)dt and the norm ||f||_H = (∫₀¹ f(t)² dt)^{1/2}.
Each random function is then square-integrable, that is, E(||Xi||²) < ∞. All the random functions are
defined on the same probability space (Ω, F, P). Therefore, if X ∈ L^p_H = L^p_H(Ω, F, P) with p > 0,
then E(||Xi||^p) < ∞.

We thus observe a sequence {X1, X2, ..., XN} of realizations of X, where Xi corresponds
to the observed curve of the S&P 500 on day i = 1, ..., N. In this paper, it is assumed that the
sequence of H-valued variables {X1, X2, ..., XN} follows a functional autoregressive Hilbertian process
of order 1 (FAR(1)):

Xn+1(t) = ∫₀¹ Ψ(t, s) Xn(s) ds + εn+1(t),  n ∈ Z    (3)

where for each day n, Xn is a random curve in the Hilbert space H, Ψ : H → H is a bounded
linear operator, and ε = (εn, n ∈ Z) is an H-valued strong white noise, that is, a sequence of H-valued
independently and identically distributed random variables such that E(εn) = 0 and E(||εn||²) = σ² <
∞. We can also consider the innovations ε to be a sequence of H-valued martingale differences, since
the functional errors do not necessarily follow the same distribution. Without loss of generality, it
is assumed that E(Xn) = 0.
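For concreteness, a FAR(1) path satisfying equation (3) can be simulated on a discretized grid. The kernel below is an arbitrary example (not the paper's estimated operator), rescaled so that its Hilbert-Schmidt norm is 0.5 < 1, which guarantees a stationary solution; the grid size, sample size, and noise scale are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
p, N = 50, 300                                    # grid points on [0, 1], number of days
t = np.linspace(0.0, 1.0, p)
Psi = np.exp(-(t[:, None] - t[None, :]) ** 2)     # example kernel Psi(t, s)
Psi *= 0.5 / np.sqrt(np.mean(Psi ** 2))           # rescale so ||Psi||_HS = 0.5 < 1
w = 1.0 / p                                       # quadrature weight for integrals

X = np.zeros((N, p))                              # X[n] is the curve of day n
for n in range(N - 1):
    eps = 0.1 * rng.standard_normal(p)            # crude white-noise innovation
    X[n + 1] = w * Psi @ X[n] + eps               # eq. (3), discretized
```

The matrix-vector product with weight w = 1/p is the Riemann-sum approximation of the integral in eq. (3).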

Figure 2 represents how the predictor and the predicted function are displayed. The bold blue line
is the mean function of the predictor and the response function respectively. From what is
observed, it can be deduced that, once the functional outliers are removed from the sample, each
functional observation is generated from the same data generating process. This idea
has been argued by Kokoszka & Young (2016), who developed a KPSS stationarity test for functional
time series.

[Figure 2, two panels: the functional time series of intraday cumulative returns on day n (left) and
day n+1 (right); horizontal axis: time of the day (9:30 AM - 4:00 PM).]

Figure 2: Functional Predictor and Functional response

Let us denote by L the space of bounded linear operators on H, equipped with the norm

||Ψ||_L = sup{||Ψ(f)|| : ||f|| ≤ 1}.    (4)

Under the condition that there exists an integer j0 ≥ 1 such that ||Ψ^{j0}||_L < 1, equation (3)
has a unique solution, which is a weakly stationary process in H given by

Xn = Σ_{k=0}^{∞} Ψ^k(εn−k),    (5)

and the series converges almost surely in H. If it is assumed that the Hilbert-Schmidt norm of the
operator Ψ is lower than 1, then the existence and uniqueness of the solution are guaranteed (see Lemma
3.1 of Kokoszka & Zhang (2010)).

We propose to estimate the autoregressive operator by four competing approaches, including the
partial least squares approach, and to compare their performance.

4 Model estimation

The goal of this paper is to forecast the one-day-ahead S&P 500 shape Xn+1. According to
the data generating process, the best linear predictor of Xn+1 given X1, ..., Xn is Ψ(Xn).
Typically, Ψ is unknown and should be estimated consistently by an estimator Ψ̂. This section is
dedicated to the estimation of the autoregressive operator Ψ. We propose to estimate this operator
by four different strategies, working directly with the fully functional form. Multiplying equation (3)
by Xn and taking the expectation on both sides leads to the following equation:

E[< Xn+1, f > Xn] = E[< Ψ(Xn), f > Xn],  f ∈ H.

Let us define the covariance operator by

K(f) = E[< Xn, f > Xn].

Since E[||Xn||²] < ∞, the covariance operator is symmetric, positive, nuclear and, therefore,
Hilbert-Schmidt, and its spectral system (vj, λj)j≥1 is defined by

K(vj) = λj vj,  j ≥ 1,

with the eigenfunctions vj forming an orthonormal basis of H and the eigenvalues such that
λ1 ≥ λ2 ≥ ... ≥ 0.

Let us define the cross-covariance operator by

D(f) = E[< Xn+1, f > Xn].

Then, it is easy to see that

D(f) = Ψ(K(f)).    (6)

We can rewrite the previous equation as

D*(f) = K(Ψ*(f)),    (7)

where Ψ* and D* denote the adjoint operators of Ψ and D respectively. The operators K, D and
D* are defined at the population level, so they are unknown and must be estimated by K̂, D̂ and
D̂* respectively, where

D̂(f) = (1/(N−1)) Σ_{n=1}^{N−1} < Xn+1, f > Xn,

D̂*(f) = (1/(N−1)) Σ_{n=1}^{N−1} < Xn, f > Xn+1,

and

K̂(f) = (1/(N−1)) Σ_{n=1}^{N−1} < Xn, f > Xn.

K̂ is endowed with the empirical spectral system (λ̂j, v̂j)j≥1, with λ̂1 ≥ λ̂2 ≥ ... ≥ 0 and (v̂j)j≥1
forming an orthonormal basis of H.
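On an equally spaced grid, the empirical operators above reduce to matrix computations, with integrals approximated through the quadrature weight w = 1/p. A minimal sketch, assuming each row of `X` is one day's curve sampled on p grid points:

```python
import numpy as np

def empirical_operators(X):
    """Kernels of K-hat and D-hat from curves X (N days x p grid points)."""
    N = X.shape[0]
    X0, X1 = X[:-1], X[1:]                        # pairs (X_n, X_{n+1}), n = 1..N-1
    K = X0.T @ X0 / (N - 1)                       # K-hat(s, t) = mean of X_n(s) X_n(t)
    D = X0.T @ X1 / (N - 1)                       # D-hat(s, t) = mean of X_n(s) X_{n+1}(t)
    return K, D

def eigensystem(K, w):
    """Spectral system (lambda_j, v_j) of the covariance operator (weight w = 1/p)."""
    lam, V = np.linalg.eigh(w * K)                # the operator acts as f -> w * K @ f
    order = np.argsort(lam)[::-1]                 # eigenvalues in decreasing order
    return lam[order], V[:, order] / np.sqrt(w)   # rescale so ||v_j||_{L2} = 1
```

The rescaling by 1/sqrt(w) makes the eigenfunctions orthonormal in L2([0,1]) rather than in Euclidean norm.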

Given equation (7), one would like to naively estimate the autoregressive operator by writing
Ψ* = K^{−1} D*, as is usually done in the finite-dimensional context. The problem is that the covariance
operator K is compact and defined on an infinite-dimensional space. Then K^{−1} is an unbounded
operator, which would lead to an unstable estimator. In the inverse problem literature, equation (7)
is called an ill-posed problem, in the sense that K is only invertible on a subset of H and its inverse
is not continuous.

Different approaches have been proposed to obtain a stable estimator of the autoregressive operator. Bosq (2000)
suggested the Functional Principal Component Analysis (FPCA) and derived its consistency
under some strong assumptions on the eigenvalues. There is a large literature on nonparametric
techniques to estimate the autoregressive operator, such as the spline smoothing and interpolation
techniques proposed by authors such as Besse & Cardot (1996), Besse et al. (2000) and Ramsay &
Silverman (2007). Antoniadis & Sapatinas (2003) suggested a linear wavelet technique to estimate a
FAR(1) model. Kargin & Onatski (2008) proposed a predictive factor technique, which consists in finding
an estimator of the autoregressive operator such that the prediction error is minimized, by
projecting the data on principal components chosen for that goal. Didericksen et al. (2012)
compared the FPCA method proposed by Bosq (2000) and the predictive factor technique of Kargin &
Onatski (2008) on simulated data and showed that, overall, the FPCA outperforms.

Other contributions are the univariate and multivariate time series forecasting of the principal
component scores proposed by Hyndman & Shang (2009) and Aue et al. (2015) respectively. The
multivariate prediction is a generalization of the univariate forecasting proposed by Hyndman
& Shang (2009). Basically, their approach consists in first projecting all the data on the most important
principal components by running a Functional Principal Component Analysis, then using the FPCA
scores to form time series vectors and running the usual univariate ARMA models for each score
vector, or a VAR model, to obtain the one-step-ahead score predictions. Once those results are obtained, the
Karhunen-Loève expansion is used to transform the predicted score time series back into the predicted
functional time series. There is also the contribution of Benatia et al. (2017), who proposed the Tikhonov
regularization to estimate the unknown operator of an i.i.d. fully functional regression model. They
also derived some consistency results under the source conditions proposed by Carrasco et al. (2007).

The goal of this paper is to consider the functional Yule-Walker equation and estimate the
autoregressive operator by four different regularization techniques: the Tikhonov method (FT), the FPCA
approach of Bosq (2000), the Functional Partial Least Squares (FPLS) and the Functional Landweber-Fridman
iteration method (FLF).

4.1 The Functional Principal Component Analysis

This approach is one of the most popular methods and was proposed by Bosq (2000) in order to
estimate the autoregressive operator on a finite-dimensional subspace of H. Since the operator K is symmetric
and nuclear, it admits a spectral decomposition, that is,

K(s, t) = Σ_{j=1}^{∞} λj vj(s) vj(t).

The autoregressive operator Ψ is Hilbert-Schmidt. Since {vj ⊗ vk}_{j,k=1}^{∞} is an orthonormal basis of
L²([0, 1]²), it admits a Hilbert-Schmidt decomposition

Ψ(s, t) = Σ_{j,k=1}^{∞} Ψjk vj(s) vk(t),

with Ψjk = ∫₀¹∫₀¹ Ψ(s, t) vj(s) vk(t) ds dt. Moreover, using the Karhunen-Loève representation of each
function Xn, we have for each k ≥ 1

E[< Xn, vk > Xn+1(t)] = λk Σ_{j=1}^{∞} Ψjk vj(t).

Therefore, we obtain the following characterization of Ψ:

Ψ(s, t) = Σ_{k=1}^{∞} (E[< Xn, vk > Xn+1(t)] / λk) vk(s),    (8)

and

Ψjk = E[< Xn, vk >< Xn+1, vj >] / λk.

If m principal components are selected for the procedure, then the (infeasible) estimator of Ψ is
given by

Ψm(s, t) = Σ_{j,k=1}^{m} Ψjk vj(s) vk(t).    (9)

Since Ψjk and (vj)j≥1 are unknown, Ψ is consistently estimated using the sample [X1, ..., XN], and
we obtain

Ψ̂m(s, t) = Σ_{j,k=1}^{m} Ψ̂jk v̂j(s) v̂k(t),    (10)

with

Ψ̂jk = (1/λ̂k) (1/(N−1)) Σ_{n=1}^{N−1} < Xn, v̂k >< Xn+1, v̂j >.    (11)

This estimator can also be written as

Ψ̂m(s, t) = (1/(N−1)) Σ_{n=1}^{N−1} Σ_{j=1}^{m} (< Xn, v̂j > / λ̂j) v̂j(s) Xn+1(t),    (12)

and

Ψ̂*m(f) = (1/(N−1)) Σ_{j=1}^{N−1} Σ_{n=1}^{N−1} (Q̂m,j / λ̂j) < f, Xn >< Xn+1, v̂j > v̂j,  for each f ∈ H,    (13)

where Q̂m,j = 1{j ≤ m} is the filter factor associated with the truncation.

This procedure was also considered by Crambes et al. (2013) for the i.i.d. model. Another configuration
of the FPCA is the one proposed by Imaizumi & Kato (2018). Their approach consists in projecting
the predictor and the response functions on the first m principal components, then using the
score vectors to estimate the Fourier coefficients of the autoregressive operator; the estimated
operator is then obtained by expanding it on the basis of the m eigenfunctions of the covariance
operator with the estimated Fourier coefficients. We will not consider this approach in this paper.
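A discretized sketch of the FPCA estimator of eq. (12), together with the associated one-day-ahead forecast Ψ̂(Xn); the truncation level m is the tuning parameter and is illustrative here, as is the quadrature scheme (weight w = 1/p).

```python
import numpy as np

def fpca_estimator(X, m):
    """FPCA estimate of the kernel in eq. (12): truncate to the first m components."""
    N, p = X.shape
    w = 1.0 / p
    X0, X1 = X[:-1], X[1:]
    K = X0.T @ X0 / (N - 1)                       # empirical covariance kernel
    D = X0.T @ X1 / (N - 1)                       # empirical cross-covariance kernel
    lam, V = np.linalg.eigh(w * K)
    keep = np.argsort(lam)[::-1][:m]              # m largest eigenvalues
    lam_m = lam[keep]
    V_m = V[:, keep] / np.sqrt(w)                 # L2-normalized eigenfunctions
    # Psi-hat(s,t) = sum_{j<=m} (1/lam_j) v_j(s) [ (1/(N-1)) sum_n <X_n, v_j> X_{n+1}(t) ]
    return V_m @ ((w / lam_m)[:, None] * (V_m.T @ D))

def predict_next(Psi_hat, x_today):
    """One-day-ahead forecast: integral of Psi-hat(s, .) x(s) ds."""
    return Psi_hat.T @ x_today / x_today.shape[0]
```

Keeping only the m leading eigen-directions is exactly the truncation that eq. (13) encodes through the filter factor.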

4.2 Tikhonov Method

This technique is widely used in the inverse problem literature. It has been studied recently by
Benatia et al. (2017) in the context of fully functional regression; they derive consistency and
asymptotic normality results for the estimated operator. This technique is commonly justified as a way to
tackle the high-dimensionality problem.

Let α be a positive tuning parameter used for the estimation, such that it converges to zero as N
goes to infinity. Then the regularized autoregressive operator is given by

Ψ*α = (αI + K)^{−1} D*,    (14)

where I is the identity operator. The empirical version is then given by

Ψ̂*α = (αI + K̂)^{−1} D̂*.    (15)

This estimator can also be characterized in terms of the spectral system of the covariance operator K̂
as follows:

Ψ̂*α(s, t) = Σ_{j=1}^{N−1} (Q̂α,j / λ̂j) ((1/(N−1)) Σ_{n=1}^{N−1} < Xn, v̂j > Xn+1(t)) v̂j(s),    (16)

and

Ψ̂*α(f) = (1/(N−1)) Σ_{j=1}^{N−1} Σ_{n=1}^{N−1} (Q̂α,j / λ̂j) < f, Xn >< Xn+1, v̂j > v̂j,  for each f ∈ H,    (17)

where Q̂α,j = λ̂j / (λ̂j + α) is called the filter factor. The truncation operated by the FPCA
method is replaced by the shrinkage effect of the parameter α.
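On the same discretized grid, the Tikhonov estimator of eqs. (15)-(16) amounts to a single linear solve; a minimal sketch (the value of α is illustrative and would in practice be chosen by cross-validation or a similar rule).

```python
import numpy as np

def tikhonov_estimator(X, alpha):
    """Tikhonov estimate: kernel version of (alpha*I + K-hat)^{-1} applied to D-hat."""
    N, p = X.shape
    w = 1.0 / p
    X0, X1 = X[:-1], X[1:]
    K = X0.T @ X0 / (N - 1)                       # empirical covariance kernel
    D = X0.T @ X1 / (N - 1)                       # empirical cross-covariance kernel
    # the operator alpha*I + K-hat acts on a function g as alpha*g + w * K @ g
    return np.linalg.solve(alpha * np.eye(p) + w * K, D)
```

In the eigenbasis of K̂ this is exactly the filter factor λ̂j/(λ̂j + α) of eq. (16): small-eigenvalue directions are shrunk rather than truncated.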

4.3 Functional Landweber-Fridman (FLF)

The Landweber-Fridman method is an iterative method which is also very popular
in the inverse problem literature. Let d be a positive parameter such that 0 < ||K||_L < 1/d.
Then, the FLF estimate can be computed iteratively as follows. Take the initial value

Ψ*0(f) = d D*(f),  for each f ∈ H.

For h = 1, ..., m, set

Ψ*h(f) = (I − dK)(Ψ*h−1(f)) + d D*(f),  for each f ∈ H,

where m is the maximum number of iterations. The resulting operator can be written as a polynomial
in the covariance operator K:

Ψ*m(f) = d Σ_{l=1}^{m} (I − dK)^{l−1} D*(f),  for each f ∈ H.    (18)

Since the operators K and D are not observed, they are consistently estimated by K̂ and D̂
respectively. Then Ψ̂*m is given by

Ψ̂*m(f) = d Σ_{l=1}^{m} (I − dK̂)^{l−1} D̂*(f),  for each f ∈ H.    (19)
l=1

This estimator can be written in terms of the eigensystem of the covariance operator K̂ as follows:

Ψ̂*m(s, t) = Σ_{j=1}^{N−1} (Q̂m,j / λ̂j) ((1/(N−1)) Σ_{n=1}^{N−1} < Xn, v̂j > Xn+1(t)) v̂j(s),    (20)

and

Ψ̂*m(f) = (1/(N−1)) Σ_{j=1}^{N−1} Σ_{n=1}^{N−1} (Q̂m,j / λ̂j) < f, Xn >< Xn+1, v̂j > v̂j,  for each f ∈ H,    (21)

where Q̂m,j = 1 − (1 − dλ̂j)^m is the filter factor.
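The Landweber-Fridman recursion translates directly into a loop on kernel matrices; a minimal sketch matching the closed form in eq. (19), where the default choice of d below simply enforces 0 < ||K̂|| < 1/d.

```python
import numpy as np

def flf_estimator(X, m, d=None):
    """Landweber-Fridman estimate: Psi_m = d * sum_{l=1}^m (I - d*K)^{l-1} D (eq. 19)."""
    N, p = X.shape
    w = 1.0 / p
    X0, X1 = X[:-1], X[1:]
    Kop = w * (X0.T @ X0) / (N - 1)               # discretized covariance operator
    D = X0.T @ X1 / (N - 1)                       # kernel of D-hat
    if d is None:
        d = 0.9 / np.linalg.norm(Kop, 2)          # ensures ||K-hat|| < 1/d
    Psi = d * D                                   # l = 1 term of the sum
    for _ in range(m - 1):
        Psi = Psi - d * (Kop @ Psi) + d * D       # Psi_h = (I - d*K) Psi_{h-1} + d*D
    return Psi
```

The iteration count m plays the role of the regularization parameter: in the eigenbasis it produces the filter factor 1 − (1 − dλ̂j)^m of eqs. (20)-(21).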

4.4 Functional Partial Least Squares (FPLS)

The Functional Partial Least Squares is a method based on the idea that the
estimator obtained by the FPCA is not directly justified by the prediction problem:
the extracted principal components may not capture the information most relevant for
prediction. The FPLS may be better adapted in the sense that it extracts the factors
that best explain the relation between the predictand and the predictor function. This method is very
popular in the chemometrics field and has been discussed by authors such as Wold et
al. (1984), Helland (1988) and Höskuldsson (1988). It was recently introduced in the econometrics field
by Groen & Kapetanios (2009), Kelly & Pruitt (2015) and Carrasco & Rossi (2016). In the functional
regression context, contributions include Aguilera et al. (2010) and Delaigle et al. (2012).
R1
Practically, for the model setting of this paper, the idea is to identify a new factor th = 0 Xn (s)vh (s)ds

at each step h = 1, ..., m such that the covariance withe the response function is maximized.

$$\max_{v_h, c_h \in L^2([0,1])} \ \mathrm{cov}^2\left(\int_0^1 X_n(s)v_h(s)\,ds,\ \int_0^1 X_{n+1}(t)c_h(t)\,dt\right)$$

$$\text{subject to } \|v_h\| = 1,\ \|c_h\| = 1,\ \text{and} \tag{22}$$

$$\int_0^1\!\!\int_0^1 v_\ell(s)K(s,t)v_h(t)\,ds\,dt = 0, \qquad \ell = 1, \dots, h-1,$$

where $v_1,\dots,v_{h-1}$ and $c_1,\dots,c_{h-1}$ have already been obtained in the $h-1$ previous steps. By using an extension of the Alternative Partial Least Squares (APLS) approach proposed by Delaigle et al. (2012), the estimated autoregressive operator is given by:

$$\hat{\Psi}^*_m(s,t) = \sum_{l=1}^{m}\hat{\gamma}_{t,l}\,\hat{K}^{l-1}(\hat{D})(s,t) = \sum_{l=1}^{m}\hat{\gamma}_{t,l}\int_0^1 \hat{K}^{l-1}(s,u)\hat{C}_1(u,t)\,du \tag{23}$$

for each $s,t \in [0,1]$, where, for each $t$, $\hat{\gamma}_t = \hat{R}_t^{-1}\hat{\mu}_t$ is a vector of size $m$. $\hat{R}_t$ is an $(m\times m)$ matrix with

$$\hat{R}_{t,j,l} = \int_0^1\!\!\int_0^1 \hat{D}^*(t,u)\hat{K}^{j+l-1}(u,s)\hat{D}^*(s,t)\,du\,ds \tag{24}$$

and $\hat{\mu}_t = [\hat{\mu}_{t,1},\dots,\hat{\mu}_{t,m}]'$ is a vector of length $m$ with

$$\hat{\mu}_{t,l} = \int_0^1\!\!\int_0^1 \hat{D}^*(t,u)\hat{K}^{l-1}(u,s)\hat{D}^*(s,t)\,du\,ds. \tag{25}$$

This estimator can be written in terms of the eigensystem of the empirical covariance operator $\hat{K}$ as follows:

$$\hat{\Psi}^*_m(s,t) = \sum_{j=1}^{N-1}\frac{\hat{Q}_{m,j}}{\hat{\lambda}_j}\left(\frac{1}{N-1}\sum_{n=1}^{N-1}\langle X_n,\hat{v}_j\rangle X_{n+1}(t)\right)\hat{v}_j(s) \tag{26}$$

and

$$\hat{\Psi}_m(f) = \frac{1}{N-1}\sum_{j=1}^{N-1}\sum_{n=1}^{N-1}\frac{\hat{Q}_{m,j}}{\hat{\lambda}_j}\langle f, X_n\rangle\langle X_{n+1},\hat{v}_j\rangle\hat{v}_j, \quad \text{for each } f \in H, \tag{27}$$

where

$$\hat{Q}_{m,j} = 1 - \prod_{l=1}^{m}\left(1 - \frac{\hat{\lambda}_j}{\hat{\theta}_l}\right)$$

is the filter factor and $\hat{\theta}_1 > \hat{\theta}_2 > \dots > \hat{\theta}_m > 0$ are the eigenvalues of the matrix $\hat{R}$.

Therefore, considering the previous results, the estimated autoregressive operator $\hat{\Psi}^*_\delta$ can be summarized as

$$\hat{\Psi}^*_\delta(s,t) = \sum_{j=1}^{N-1}\frac{\hat{Q}_{\delta,j}}{\hat{\lambda}_j}\left(\frac{1}{N-1}\sum_{n=1}^{N-1}\langle X_n,\hat{v}_j\rangle X_{n+1}(t)\right)\hat{v}_j(s) \tag{28}$$

and

$$\hat{\Psi}^*_\delta(f) = \frac{1}{N-1}\sum_{j=1}^{N-1}\sum_{n=1}^{N-1}\frac{\hat{Q}_{\delta,j}}{\hat{\lambda}_j}\langle f, X_n\rangle\langle X_{n+1},\hat{v}_j\rangle\hat{v}_j, \quad \text{for each } f \in H, \tag{29}$$

where the filter factor is $\hat{Q}_{\delta,j} \equiv Q(\delta,\hat{\lambda}_j)$ with

$$Q(\delta,\hat{\lambda}_j) = \begin{cases} I(j \le m) & \text{for the FPCA method} \\[4pt] \dfrac{\hat{\lambda}_j}{\hat{\lambda}_j+\alpha} & \text{for the FT method} \\[4pt] 1-(1-d\hat{\lambda}_j)^m & \text{for the FLF method} \\[4pt] 1-\displaystyle\prod_{l=1}^{m}\left(1-\frac{\hat{\lambda}_j}{\hat{\theta}_l}\right) & \text{for the FPLS method.} \end{cases} \tag{30}$$

It can also be noticed that if $\hat{\theta}_l = \hat{\theta}_r = \hat{\theta}_0$ for all $l, r = 1,\dots,m$, then FPLS is almost identical to FLF with $d = 1/\hat{\theta}_0$.
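This remark is easy to verify numerically: with equal PLS eigenvalues, the FPLS filter factor collapses exactly to the FLF one. The eigenvalues and constants below are illustrative.

```python
import numpy as np

lam = np.linspace(0.01, 0.5, 10)          # toy eigenvalues lambda_j
theta0, m = 0.8, 7

# FPLS filter with all theta_l equal to theta0
q_fpls = 1.0 - np.prod([1.0 - lam / theta0 for _ in range(m)], axis=0)
# FLF filter with d = 1 / theta0
q_flf = 1.0 - (1.0 - lam / theta0) ** m

print(np.allclose(q_fpls, q_flf))
```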

Given the estimated autoregressive operator $\hat{\Psi}_m$, the best prediction of the one-day-ahead S&P 500 curve is given by

$$\hat{X}_{n+1}(t) = \int_0^1 \hat{\Psi}_m(s,t)X_n(s)\,ds, \quad \text{for each } t \in [0,1]. \tag{31}$$

5 Asymptotic Results

This section studies the rate of convergence of the estimator of the operator $\hat{\Psi}$ and of the predicted functions $\hat{X}_{n+1}$ in the setting where the eigenvalues of the covariance operator $K$ are bounded and decline gradually to zero. This situation is analyzed because, as far as we are concerned, it encompasses most practical case studies in economics and finance. For this purpose, the following assumptions are needed:

Assumption 1 (A1): $\{X_1, \dots, X_N\}$ is a sequence of zero-mean, square-integrable functions following a functional autoregressive process with $E[\|X_n\|^4] < +\infty$, and there exists an integer $k_0 \ge 1$ such that $\|(\Psi^*)^{k_0}\|_L < 1$.

Assumption 2 (A2): $\varepsilon_n$ is an i.i.d. process with $E[\|\varepsilon_n\|^2 \mid X] = \sigma^2$ and $E[\|X_n\|^4] < \infty$. (It can also be assumed that $\varepsilon_n$ is a stationary martingale difference sequence with respect to $\{\varepsilon_{n-1}(t), \varepsilon_{n-2}(t), \dots, X_{n-1}(t), X_{n-2}(t), \dots\}$.)

Assumption 3 (A3): There exist a Hilbert-Schmidt operator $R$ and a positive constant $\beta$ such that

$$\Psi^* = K^{\beta/2} R.$$

This source condition can also be written as

$$\sum_{j=1}^{\infty} \frac{\langle \Psi^*(f), v_j\rangle^2}{\lambda_j^{\beta}} < +\infty \quad \text{for all } f \in H.$$

Assumption 4 (A4): The eigenvalues of the covariance operator $K$ and of the estimated operator $\hat{K}$ are distinct: $\lambda_1 > \lambda_2 > \dots > 0$ and $\hat{\lambda}_1 > \hat{\lambda}_2 > \dots > \hat{\lambda}_N > 0$.

Assumption 5 (A5): $n\lambda_m \to \infty$.

Assumption 1 ensures that the sequence $\{X_n\}$ is a stationary process and that the model admits a unique solution. Assumption 2 imposes that the innovations $\varepsilon_n$ are homoskedastic and ensures that the operators $K$ and $D^*$ are consistently estimated by $\hat{K}$ and $\hat{D}^*$ respectively. Furthermore, since $E[\|X_n\|^4] < +\infty$, the operator $K$ is trace-class and thereby Hilbert-Schmidt. Assumption 3 is a source condition ensuring that the Fourier coefficients $\langle\Psi^*(f), v_j\rangle$ go to zero fast enough relative to the eigenvalues $\lambda_j^\beta$ as $j$ goes to infinity. This condition guarantees that $\Psi^*$ belongs to the orthogonal complement of the null space of the operator $K$. The larger $\beta$ is, the smoother $\Psi^*(f)$ is (see Carrasco et al. (2007), Benatia et al. (2017)). This assumption is needed to control the bias term and will be used to show that the bias term depends on $\beta$. It differs from the assumptions considered by Imaizumi & Kato (2018) and Crambes et al. (2013); indeed, their assumptions are more restrictive and relate to the rate of decrease of the eigenvalues $\lambda_j$, so this source condition is more general. Assumption 4 imposes that the eigenvalues $\lambda_j$ are distinct and consistently estimated by $\hat{\lambda}_j$. Assumption 5 is the sufficient condition under which the expected estimation error goes to zero; it controls the estimation error so that a good balance can be kept between underfitting and overfitting for the FPCA and FPLS estimation methods.

Let us denote by $\Psi^*_\delta$ the regularized version of $\Psi^*$, where $\delta$ is $\alpha$ for the FT method and $m$ for the FPCA, FPLS, and FLF methods. Then, for each function $f \in H$, $\Psi^*_\delta$ can be written as

$$\Psi^*_\delta(f) = \sum_{j=1}^{\infty}\frac{Q_{\delta,j}}{\lambda_j}\langle D^*(f), v_j\rangle v_j = \sum_{j=1}^{\infty}\frac{Q_{\delta,j}}{\lambda_j}\langle K(\Psi^*)(f), v_j\rangle v_j = \sum_{j=1}^{\infty}Q_{\delta,j}\langle \Psi^*(f), v_j\rangle v_j.$$

Then, for each function $f$,

$$\hat{\Psi}^*_\delta(f) - \Psi^*(f) = \{\hat{\Psi}^*_\delta(f) - \Psi^*_\delta(f)\} + \{\Psi^*_\delta(f) - \Psi^*(f)\},$$

where $\{\Psi^*_\delta(f) - \Psi^*(f)\}$ represents the bias term, which goes to zero as the regularization weakens ($m$ increases for FPCA, FPLS, and FLF, or $\alpha$ decreases for FT), and $\{\hat{\Psi}^*_\delta(f) - \Psi^*_\delta(f)\}$ is the estimation error term, which may increase as the regularization weakens.

The conditional MSE is defined by

$$MSE = E\left[\big\|\hat{\Psi}_\delta - \Psi\big\|^2 \,\Big|\, X_1,\dots,X_n\right].$$

Proposition 1: Under Assumptions A1 to A5, if $\|K\|_{op} < 1$, then

$$MSE_{FPCA} = O_p\!\left(\frac{m}{\lambda_m n}\right) + O_p\!\left(\lambda_{m+1}^{\beta}\right), \tag{32}$$

$$MSE_{FPLS} = O_p\!\left(\frac{m}{\theta_m n}\right) + O_p\!\left(\lambda_{m+1}^{\beta}\right), \tag{33}$$

where $\theta_m$ is the smallest root of the residual polynomial $\bar{Q}_{m,j}$. The first $O_p$ term represents the estimation error and the second one the bias term.

Proposition 2: Under Assumptions A1 to A4, if $\|K\|_{op} < 1$, then

$$MSE_{FLF} = O_p\!\left(\frac{m^2}{n}\right) + O_p\!\left(m^{-2\beta}\right). \tag{34}$$

If $\beta > 1$, then

$$MSE_{FT} = O_p\!\left(\frac{1}{n\alpha^2}\right) + O_p\!\left(\alpha^{\min\{\beta,2\}}\right); \tag{35}$$

if $\beta < 1$,

$$MSE_{FT} = O_p\!\left(\frac{\alpha^{\beta}}{n\alpha^2}\right) + O_p\!\left(\alpha^{\beta}\right). \tag{36}$$

Remarks:

• The rates of convergence of FPCA and FPLS depend on the rate of decrease of the eigenvalues of the covariance operator ($\lambda_m$). Indeed, for the FPLS approach, the smallest eigenvalue of the Hankel matrix ($\theta_m$) is such that $\theta_m < \lambda_m$ (see for instance Lingjaerde & Christophersen (2000)). Furthermore, since $\theta_m$ decreases at an exponential rate (see Berg & Szwarc (2011)), the FPLS is most of the time expected to present a high estimation error of the autoregressive operator. Moreover, if the eigenvalues do not decrease very smoothly, FPCA and FPLS may underestimate the autoregressive operator.

• In contrast to the FPCA and FPLS methods, the rates of convergence of the FLF and FT methods do not depend on the decreasing configuration of the eigenvalues. Moreover, due to the saturation property⁴ of the FT method, the FLF approach (which can be viewed as an iterative version of the FT method) should be preferred to FT (see Carrasco et al. (2007)). This pattern should be checked in the simulations.

• Propositions 1 and 2 show that as $m$ increases, the squared bias term decreases while the variance increases. Then $m$ should be chosen optimally, i.e., such that the bias equals the variance. At the optimum:

• If $m \sim n^{1/(2+2\beta)}$, then $MSE_{FLF} \sim n^{-\beta/(1+\beta)}$.

• For $\beta > 1$, if $\alpha \sim n^{-1/(2+2\beta)}$, then $MSE_{FT} \sim n^{-\min\{\beta,2\}/(1+\min\{\beta,2\})}$.

• For $\beta < 1$, if $n\alpha^2 \to \infty$, then $MSE_{FT} \sim \alpha^{\beta}$.

• The upper bounds derived for FPCA and FPLS are more general. In particular, the upper bound obtained for FPCA differs from the one obtained by Imaizumi & Kato (2018), since their assumptions on the rate of decrease of the eigenvalues $\lambda_j$ are more restrictive. The assumption considered by Crambes et al. (2013) is also different and more restrictive than the one considered in this paper, but the same upper bound is obtained. To learn more about the optimal number of functional components $m$ for the FPCA and FPLS, additional assumptions on the eigenvalues are necessary.
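The FLF bias-variance balance in the remarks above can be checked with direct arithmetic: with $m = n^{1/(2+2\beta)}$, the variance order $m^2/n$ and the bias order $m^{-2\beta}$ both equal $n^{-\beta/(1+\beta)}$. The value of $\beta$ below is illustrative.

```python
import numpy as np

beta = 2.0
for n in [1e4, 1e6, 1e8]:
    m = n ** (1.0 / (2.0 + 2.0 * beta))       # optimal number of iterations
    var, bias = m ** 2 / n, m ** (-2.0 * beta)
    target = n ** (-beta / (1.0 + beta))       # claimed MSE rate
    print(np.isclose(var, target) and np.isclose(bias, target))
```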

Proposition 3: Assume that A1 to A4 hold for FLF and FT (A1 to A5 for FPCA). If $E[\|X_i\|^4] < \infty$ and $E[\|X_i\|^2\|\varepsilon_i\|^2] < \infty$, then

$$\sqrt{n}\,\big(\hat{\Psi}^*_\delta - \Psi^*_\delta\big) \to N(0, \Omega_\delta), \tag{37}$$

where $\delta = m$ for the FPCA and FLF and $\delta = \alpha$ for the FT, and

$$\Omega_\delta = E\Big[\big((\varepsilon + K_\delta^{-1}\circ\Psi_\delta(X_i))\otimes K_\delta^{-1}(X_i)\big)\,\bar{\otimes}\,\big((\varepsilon + K_\delta^{-1}\circ\Psi_\delta(X_i))\otimes K_\delta^{-1}(X_i)\big)\Big] - E\Big[K_\delta(\Psi^*)\otimes E(X_i)\Big]\,\bar{\otimes}\,E\Big[K_\delta(\Psi^*)\otimes E(X_i)\Big]. \tag{38}$$

⁴ See chapter 6 of Engl et al. (1996) concerning the saturation property of the ridge regression.

6 Simulation Results

This section compares the performance of the described estimation methods in a finite sample context through Monte Carlo simulations. The main comparisons concern the mean squared error of the estimated autoregressive operator and the mean squared prediction error of the model. The model setting is the FAR(1)

$$X_{n+1}(t) = \int_0^1 \Psi(t,s)X_n(s)\,ds + \varepsilon_{n+1}(t), \qquad n = 1,\dots,N. \tag{39}$$

The three error processes used by Didericksen et al. (2012), $\varepsilon^{(1)}(t)$, $\varepsilon^{(2)}(t)$, and $\varepsilon^{(3)}(t)$, are considered and defined as follows:

$$\varepsilon^{(1)}(t) = W(t) - tW(1) \tag{40}$$

is a Brownian bridge, where $W$ is the standard Wiener process generated as

$$W\!\left(\frac{b}{B}\right) = \frac{1}{\sqrt{B}}\sum_{\ell=1}^{b} Z_\ell, \qquad b = 1,\dots,B,$$

and the $Z_\ell$ are independent standard normal variables with $Z_0 = 0$;

$$\varepsilon^{(2)}(t) = \xi_1\sqrt{2}\sin(2\pi t) + \xi_2\sqrt{2}\,\kappa\cos(2\pi t), \tag{41}$$

where $\xi_1$ and $\xi_2$ are two independent variables following a normal distribution and $\kappa$ is a constant;

and

$$\varepsilon^{(3)}(t) = a\varepsilon^{(1)}(t) + (1-a)\varepsilon^{(2)}(t), \tag{42}$$

where $a \in [0,1]$ is a real constant representing the relative weight of the two components $\varepsilon^{(1)}(t)$ and $\varepsilon^{(2)}(t)$. $\varepsilon^{(1)}(t)$ has an infinite series expansion, $\varepsilon^{(2)}(t)$ is a finite series expansion, and $\varepsilon^{(3)}(t)$ is a combination of the previous two.
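The three innovation processes can be simulated in a few lines; the grid size `B`, `kappa`, and `a` below are illustrative choices, and equation (41) is implemented as reconstructed here.

```python
import numpy as np

rng = np.random.default_rng(2)
B, kappa, a = 200, 1.0, 0.5
t = np.arange(1, B + 1) / B

W = np.cumsum(rng.standard_normal(B)) / np.sqrt(B)       # Wiener process W(b/B)
eps1 = W - t * W[-1]                                     # Brownian bridge, eq. (40)
xi1, xi2 = rng.standard_normal(2)
eps2 = (xi1 * np.sqrt(2) * np.sin(2 * np.pi * t)
        + xi2 * np.sqrt(2) * kappa * np.cos(2 * np.pi * t))  # finite expansion, eq. (41)
eps3 = a * eps1 + (1 - a) * eps2                         # mixture, eq. (42)
print(abs(eps1[-1]) < 1e-12)                             # bridge is pinned at t = 1
```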

The theoretical autoregressive operator $\Psi$ is an integral operator mapping $L^2([0,1])$ to $L^2([0,1])$. Four configurations of $\Psi$ are considered:

Gaussian operator: $\Psi(s,t) = C\exp\!\left[-\frac{t^2+s^2}{2}\right]$;

Identity operator: $\Psi(s,t) = C$;

Sloping plane (t): $\Psi(s,t) = Ct$;

Sloping plane (s): $\Psi(s,t) = Cs$;

with $(s,t) \in [0,1]^2$ and $C$ a constant used to normalize the autoregressive operator.
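Choosing $C$ amounts to rescaling a base kernel to a prescribed norm. The sketch below normalizes the Gaussian kernel to a Hilbert-Schmidt norm of 0.5, as a simple stand-in for the operator-norm normalization used in the simulations; the grid size is illustrative.

```python
import numpy as np

T = 200
s = np.linspace(0.0, 1.0, T)
base = np.exp(-(s[:, None] ** 2 + s[None, :] ** 2) / 2.0)   # Gaussian kernel with C = 1

# Hilbert-Schmidt norm: sqrt of the double integral of the squared kernel
hs_norm = np.sqrt(np.trapz(np.trapz(base ** 2, s, axis=0), s))
C = 0.5 / hs_norm                                            # rescale to norm 0.5
new_norm = np.sqrt(np.trapz(np.trapz((C * base) ** 2, s, axis=0), s))
print(round(new_norm, 6))                                    # → 0.5
```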

Two operator norms are considered, $\|\Psi\|_L = 0.5$ and $\|\Psi\|_L = 0.8$. The interval $[0,1]$ is discretized into 1000 equally-spaced points. Four sample sizes of functional time series $N$ are considered: 50, 100, 200, and 500. For the numerical integration, the trapezoidal rule is used for all operations in the simulations and the real data applications. It has been noticed that a good estimation and prediction of the model depend on the choice of the tuning parameter, namely the number of principal components $m$ for the FPCA and FPLS methods, the number of iterations $m$ for the FLF, and the regularization parameter $\alpha$ for the FT. These parameters are chosen so that the MSE and the MSPE are minimized. Predictive cross-validation (PCV) is performed to run the Monte Carlo experiments and choose the regularization parameter. The PCV chooses the optimal tuning parameter as the one that best predicts the discarded observation curves after estimation:

$$PCV(\delta) = \frac{1}{N-1}\sum_{n=1}^{N-1}\left\|X_{n+1} - \hat{\Psi}_\delta(X_n)\right\|^2,$$

with $\delta$ equal to $m$ for FPCA, FPLS, and FLF, and equal to $\alpha$ for the FT. The selected tuning parameter is then used to compute the prediction error of a new curve at time $n$.
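A minimal PCV loop, sketched here for the FT method only: the estimator `ft_fit` is a simplified stand-in for equation (16), the data are simulated, and the grid of candidate values for $\alpha$ is an assumption.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 80, 60
X = np.cumsum(rng.standard_normal((N, T)), axis=1) / np.sqrt(T)  # toy curves
w = 1.0 / T                                                      # quadrature weight

def ft_fit(Xp, Xf, alpha):
    """Tikhonov kernel estimate on the grid (simplified stand-in for eq. (16))."""
    K = w * Xp.T @ Xp / len(Xp)
    lam, v = np.linalg.eigh(K * w)
    v = v / np.sqrt(w)                       # L2-normalized eigenfunctions
    scores = w * Xp @ v                      # <X_n, v_j>
    D = scores.T @ Xf / len(Xp)
    return (v / (lam + alpha)) @ D           # filter Q/lambda = 1/(lambda + alpha)

def pcv(alpha):
    Psi = ft_fit(X[:-1], X[1:], alpha)
    pred = w * X[:-1] @ Psi                  # integral of Psi(s, t) X_n(s) ds
    return np.mean(w * np.sum((X[1:] - pred) ** 2, axis=1))

alphas = [0.001, 0.01, 0.1, 1.0]
best = min(alphas, key=pcv)                  # PCV-selected tuning parameter
print(best in alphas)
```

In practice the criterion would be evaluated on held-out curves rather than in-sample, but the selection mechanics are the same.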

Indicators for prediction error:

To measure the prediction quality, two indicators are considered: the root integrated squared error ($E_n$) and the integrated absolute error ($R_n$), given by

$$E_n = \sqrt{\int_0^1 \left(\hat{X}_n(t) - X_n(t)\right)^2 dt},$$

$$R_n = \int_0^1 \left|\hat{X}_n(t) - X_n(t)\right| dt.$$
Indicators for estimation error:

To analyze the estimation error, three criteria are considered: the mean squared error (MSE), the average distance (AD), and the ratio averaged distance (RAD), given by

$$MSE = \sqrt{\int_0^1\!\!\int_0^1 \left(\hat{\Psi}(s,t) - \Psi(s,t)\right)^2 ds\,dt},$$

$$AD = \int_0^1\!\!\int_0^1 \left|\hat{\Psi}(s,t) - \Psi(s,t)\right| ds\,dt,$$

$$RAD(\Psi) = \int_0^1\!\!\int_0^1 \frac{\left|\hat{\Psi}(s,t) - \Psi(s,t)\right|}{|\Psi(s,t)|}\,ds\,dt.$$

6.1 The Gaussian kernel model

Brownian motion innovation

The Gaussian kernel model with Brownian motion innovation is considered first. For the small sample size case ($N = 50$), it can be observed that, in terms of estimation, the FLF and FT methods outperform the others, while in terms of prediction all the compared methods seem to reach the same performance, except that the FPLS tends to be sensitive to the presence of functional outliers. Figure 3 displays these results. The fact that FLF and FT outperform the FPCA and FPLS can be explained by the fact that their convergence rates do not depend on the rate of decrease of the eigenvalues $\lambda_j$, while the others' do. Furthermore, the prediction error is almost the same for all the methods because there is a smoothing effect when using the integral to compute the predicted function. Finally, as the sample size increases, the bias and the estimation error are reduced.
Figure 3: Comparison of the different estimation techniques. Gaussian kernel with n = 50 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)
As the sample size increases and becomes very large ($N = 200$ and $N = 500$), the same pattern is observed and clearly perceptible, but the dispersion is reduced for the FPLS method (see Figure 4). Another reason for the gap observed between these techniques in terms of estimation performance is that the eigenvalues $\lambda_j$ may decrease very quickly (potentially faster than an exponential rate). This pattern has also been noticed by Didericksen et al. (2012). In that situation, introducing a regularization parameter for each functional component of the FPCA and FPLS (or on the eigenvalues $\lambda_j$ and $\theta_j$) should improve the results. Those regularization parameters should also be chosen optimally in terms of estimation or prediction.

Smooth innovation

Considering smooth innovation terms amounts to considering the case of smooth data. In this case, for the Gaussian kernel model, it is observed that the FPLS is still strongly affected by the functional outliers. The gap between the methods in terms of estimation
Figure 4: Comparison of the different estimation techniques. Gaussian kernel with n = 500 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

and prediction is not very pronounced, whatever the sample size considered. It may be necessary to find a way to handle those outliers in order to extract more information (see Figures 5 and 6).

Mixed innovations

The case of mixed innovations displays almost the same results as the previous one. Indeed, if more weight is attributed to the smooth innovations, then the same results as in the smooth-innovation case are obtained, while if more weight is attributed to the Brownian motion innovation, the results are similar to that case.

6.2 The Identity kernel

When considering a different kernel, for instance the identity one, the same pattern is observed for the evaluation criteria in the context of Brownian motion innovations, in terms of both estimation and prediction. The gap is still observable in terms of estimation quality, while the prediction performances are almost the same (see Figures 7, 16, and 17).

When the sample size becomes very large ($N = 500$), the prediction error with FPLS becomes smaller than the one obtained by the other methods (see Figure 8). Furthermore, as
Figure 5: Comparison of the different estimation techniques. Gaussian kernel with n = 50 and ε(2). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

the sample size increases, the bias and the estimation error are reduced in terms of prediction. All the results are presented at the optimal tuning parameter.

6.3 The Slope kernel

The same results are obtained for the slope kernel case, as can be seen in Figures 9, 10, and 11.

7 Application to the S&P 500 intraday data

7.1 Data

The S&P 500 index futures data is used to analyze intraday return predictability. The sample runs from 01/01/2008 to 12/31/2017 and is collected from the website www.backtestmarket.com. This sample is used in different parts of the analysis: the return predictability, the identification of the intraday momentum and its main causes, and the robustness of the results in many different contexts such as the volatility effect, the volume effect, the liquidity effect, the aftermath of the
Figure 6: Comparison of the different estimation techniques. Gaussian kernel with n = 200 and ε(2). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

financial crisis, the macroeconomic news release effect (Federal Open Market Committee (FOMC), Consumer Price Index (CPI), Gross Domestic Product (GDP)), the infrequent rebalancing effect, and other ideas.

7.2 Intraday momentum: empirical analysis

To start the empirical analysis, the simple functional autoregressive model is considered, where the current cumulative intraday market return is used to predict the next day's. The years considered are 2015-2017; these results will be tested on the other years of our database. The regression model is given by

$$X_{n+1}(t) = \Psi_0(t) + \int_0^1 \Psi(s,t)X_n(s)\,ds + \varepsilon_{n+1}(t), \qquad n = 1,\dots,750. \tag{43}$$

The sample size for this regression period is $N = 750$. This sample is split into two parts, the regression and prediction part ($N_1 = 650$ days) and the validation and testing part ($N_2 = 100$ days), in order to choose the optimal tuning parameter for each estimation method. Each day is represented by 273 five-minute discretization points over the 24 hours of the day. Figure 12

Figure 7: Comparison of the different estimation techniques. Identity kernel with n = 50 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

displays a contour plot representing how correlated the previous day's cumulative return shape is to the next one on a 5-minute timeframe, that is, the estimated autoregressive operator $\hat{\Psi}(s,t)$. It can be seen that the FT and FLF display almost similar results in terms of estimation, while the FPCA and FPLS display different results. The yellow areas represent positive correlation while the blue ones represent negative correlation. It can be observed that the previous trading day's first opening hour contributes positively to the prediction of the next day's momentum, while the last-hour return of the previous day predicts the next day's return positively in the first half and negatively in the second half of the next trading day. This momentum can be explained by many different causes.

The next step is to plot the predictive functional R-squared. From Figure 13, it is easy to see that the most important time of the day to buy stocks is around 9:30 AM - 10:30 AM. The result is similar for all the estimation methods. The FPLS tends to reach a remarkable value of 8% at the beginning and the end of the trading session, which is almost 4 times the one obtained by Gao et al. (2018) and twice the one obtained by Zhang et al. (2019). The FPCA and FT methods tend to reach a predictive $R^2$ of 5% and 6% respectively in the morning (nearly twice the one obtained by Gao et al. (2018)) and nearly 2.5% at the end of the trading session for the FT approach. Surprisingly,

Figure 8: Comparison of the different estimation techniques. Identity kernel with n = 500 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

the FPCA is not able to capture the momentum at the end of the trading session. According to the FLF approach, the $R^2$ is around 2.5% at the beginning of the trading session, while in the second half of the day it is approximately 1.2%.

7.3 Main causes of the intraday momentum

The next step is to observe how this model performs out-of-sample, in order to examine its stability to irregular events and what the main causes of momentum are in such settings. The new regression model configuration is given by

$$X_{n+1}(t) = \Psi_0(t) + \int_0^1 \Psi(s,t)X_n(s)\,ds + \rho_1 I_{e,n} + \rho_2 I_{e,n} R_n + \varepsilon_{n+1}(t), \qquad n = 1,\dots,750, \tag{44}$$

where $I_{e,n}$ is a dummy variable taking the value 1 if the event of interest happened on day $n$ and 0 otherwise, and $R_n$ is the cumulative return at the close of day $n$. $\rho_2$ captures the interaction effect between $I_{e,n}$ and $R_n$. $I_{e,n}$ can indicate whether there was high-impact news on day $n$, a financial crisis day, a high-volume day, or a high-volatility day.

The Frisch-Waugh theorem is used to estimate this equation.
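The Frisch-Waugh logic can be illustrated in the familiar scalar-regressor case: the coefficient of one regressor in the full regression equals the coefficient obtained after partialling the other regressors out of both sides. The data and variable names below are purely illustrative, not the paper's series.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
x1 = rng.standard_normal(n)
x2 = 0.3 * x1 + rng.standard_normal(n)          # correlated regressors
y = 1.0 * x1 + 2.0 * x2 + rng.standard_normal(n)

# Full regression y ~ 1 + x1 + x2
X = np.column_stack([np.ones(n), x1, x2])
beta_full = np.linalg.lstsq(X, y, rcond=None)[0]

# Frisch-Waugh: partial [1, x1] out of y and x2, then regress residuals
Z = np.column_stack([np.ones(n), x1])
M = np.eye(n) - Z @ np.linalg.pinv(Z)           # annihilator (residual-maker) matrix
beta_fw = (M @ x2) @ (M @ y) / ((M @ x2) @ (M @ x2))
print(np.isclose(beta_full[2], beta_fw))
```

In the paper's setting the same partialling-out is applied with the functional term $\int \Psi(s,t)X_n(s)\,ds$ in the role of the nuisance regressors.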

Figure 9: Comparison of the different estimation techniques. Slope kernel (s) with n = 50 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

8 Conclusion

Figure 10: Comparison of the different estimation techniques. Slope kernel (s) with n = 100 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

9 Appendices

9.1 Proof of Proposition 1

(To be completed)

9.2 Proof of Proposition 2

(To be completed)

9.3 Proof of Proposition 3

(To be completed)

9.4 Graphs

Figure 11: Comparison of the different estimation techniques. Slope kernel (s) with n = 200 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

References

Aguilera, A. M., Escabias, M., Preda, C., & Saporta, G. (2010). Using basis expansions for estimating

functional pls regression: applications with chemometric data. Chemometrics and Intelligent

Laboratory Systems, 104 (2), 289–305.

Antoniadis, A., & Sapatinas, T. (2003). Wavelet methods for continuous-time prediction using Hilbert-valued autoregressive processes. Journal of Multivariate Analysis, 87 (1), 133–158.

Aue, A., Norinho, D. D., & Hörmann, S. (2015). On the prediction of stationary functional time

series. Journal of the American Statistical Association, 110 (509), 378–392.

Benatia, D., Carrasco, M., & Florens, J.-P. (2017). Functional linear regression with functional

response. Journal of Econometrics, 201 (2), 269–291.

Berg, C., & Szwarc, R. (2011). The smallest eigenvalue of hankel matrices. Constructive

Approximation, 34 (1), 107–133.

Besse, P. C., & Cardot, H. (1996). Approximation spline de la prévision d’un processus fonctionnel

autorégressif d’ordre 1. Canadian Journal of Statistics, 24 (4), 467–487.

Figure 12: The estimated autoregressive operator. Years 2015-2017. (Contour plots of the estimated kernels for FPCA, FPLS, FT, and FLF; axes: current day and next day, 9:30 AM - 4:00 PM; figure not reproduced.)

Besse, P. C., Cardot, H., & Stephenson, D. B. (2000). Autoregressive forecasting of some functional

climatic variations. Scandinavian Journal of Statistics, 27 (4), 673–687.

Bogousslavsky, V. (2016). Infrequent rebalancing, return autocorrelation, and seasonality. The Journal

of Finance, 71 (6), 2967–3006.

Bosq, D. (2000). Linear processes in function spaces: Theory and applications, volume 149 of lecture

notes in statistics. Springer-Verlag New York Inc.

Carrasco, M., Florens, J.-P., & Renault, E. (2007). Linear inverse problems in structural econometrics

estimation based on spectral decomposition and regularization. Handbook of econometrics, 6 ,

5633–5751.

Figure 13: The estimated functional R-squared. Years 2015-2017. (Functional R²(t) over the trading day, 9:30 AM - 4:00 PM, for FPCA, FPLS, FT, and FLF; figure not reproduced.)

Carrasco, M., & Rossi, B. (2016). In-sample inference and forecasting in misspecified factor models.

Journal of Business & Economic Statistics, 34 (3), 313–338.

Chan, E. (2013). Algorithmic trading: winning strategies and their rationale (Vol. 625). John Wiley

& Sons.

Chu, X., Gu, Z., & Zhou, H. (2019). Intraday momentum and reversal in Chinese stock market. Finance Research Letters, 30, 83–88.

Crambes, C., Mas, A., et al. (2013). Asymptotics of prediction in functional linear regression with

functional outputs. Bernoulli , 19 (5B), 2627–2651.

Delaigle, A., Hall, P., et al. (2012). Methodology and theory for partial least squares applied to

functional data. The Annals of Statistics, 40 (1), 322–352.

Figure 14: Comparison of the different estimation techniques. Gaussian kernel with n = 100 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

Didericksen, D., Kokoszka, P., & Zhang, X. (2012). Empirical properties of forecasts with the functional

autoregressive model. Computational statistics, 27 (2), 285–298.

Engl, H. W., Hanke, M., & Neubauer, A. (1996). Regularization of inverse problems (Vol. 375).

Springer Science & Business Media.

Ferraty, F., & Vieu, P. (2006). Nonparametric functional data analysis: theory and practice. Springer

Science & Business Media.

Gao, L., Han, Y., Li, S. Z., & Zhou, G. (2018). Market intraday momentum. Journal of Financial

Economics, 129 (2), 394–414.

Griffin, J. M., Ji, X., & Martin, J. S. (2003). Momentum investing and business cycle risk: Evidence

from pole to pole. The Journal of Finance, 58 (6), 2515–2547.

Groen, J. J., & Kapetanios, G. (2009). Revisiting useful approaches to data-rich macroeconomic

forecasting.

He, X.-Z., & Li, K. (2015). Profitability of time series momentum. Journal of Banking & Finance,

53 , 140–157.

Figure 15: Comparison of the different estimation techniques. Gaussian kernel with n = 200 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

Helland, I. S. (1988). On the structure of partial least squares regression. Communications in

statistics-Simulation and Computation, 17 (2), 581–607.

Heston, S. L., Korajczyk, R. A., & Sadka, R. (2010). Intraday patterns in the cross-section of stock

returns. The Journal of Finance, 65 (4), 1369–1407.

Hörmann, S., & Kokoszka, P. (2012). Functional time series. In Handbook of statistics (Vol. 30, pp.

157–186). Elsevier.

Höskuldsson, A. (1988). PLS regression methods. Journal of Chemometrics, 2 (3), 211–228.

Hyndman, R. J., & Shang, H. L. (2009). Forecasting functional time series. Journal of the Korean

Statistical Society, 38 (3), 199–211.

Imaizumi, M., & Kato, K. (2018). PCA-based estimation for functional linear regression with functional responses. Journal of Multivariate Analysis, 163, 15–36.

Jegadeesh, N., & Titman, S. (1993). Returns to buying winners and selling losers: Implications for

stock market efficiency. The Journal of finance, 48 (1), 65–91.

Figure 16: Comparison of the different estimation techniques. Identity kernel with n = 100 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

Kargin, V., & Onatski, A. (2008). Curve forecasting by functional autoregression. Journal of

Multivariate Analysis, 99 (10), 2508–2526.

Kelly, B., & Pruitt, S. (2015). The three-pass regression filter: A new approach to forecasting using

many predictors. Journal of Econometrics, 186 (2), 294–316.

Kokoszka, P., & Young, G. (2016). KPSS test for functional time series. Statistics, 50 (5), 957–973.

Kokoszka, P., & Zhang, X. (2010). Improved estimation of the kernel of the functional autoregressive

process (Tech. Rep.). Technical Report. Utah State University.

Kokoszka, P., & Zhang, X. (2012). Functional prediction of intraday cumulative returns. Statistical

Modelling, 12 (4), 377–398.

Lehmann, B. N. (1990). Fads, martingales, and market efficiency. The Quarterly Journal of Economics,

105 (1), 1–28.

Lingjaerde, O. C., & Christophersen, N. (2000). Shrinkage structure of partial least squares.

Scandinavian Journal of Statistics, 27 (3), 459–473.

Figure 17: Comparison of the different estimation techniques. Identity kernel with n = 200 and ε(1). (Panels: MSE, RAD, AD, Rn, and En for FPCA, FPLS, FT, and FLF; figure not reproduced.)

Moskowitz, T. J., Ooi, Y. H., & Pedersen, L. H. (2012). Time series momentum. Journal of Financial Economics, 104(2), 228–250.

Neely, C. J., Rapach, D. E., Tu, J., & Zhou, G. (2014). Forecasting the equity risk premium: The role of technical indicators. Management Science, 60(7), 1772–1791.

Ramsay, J. O., & Silverman, B. W. (2007). Applied functional data analysis: Methods and case studies. Springer.

Renault, T. (2017). Intraday online investor sentiment and return patterns in the US stock market. Journal of Banking & Finance, 84, 25–40.

Shang, H. L. (2017). Forecasting intraday S&P 500 index returns: A functional time series approach. Journal of Forecasting, 36(7), 741–755.

Sun, L., Najand, M., & Shen, J. (2016). Stock return predictability and investor sentiment: A high-frequency perspective. Journal of Banking & Finance, 73, 147–164.

Figure 18: Comparison of the different estimation techniques. Slope kernel (t) with n = 50 and ε(1). [Box plots of the MSE, RAD, AD, Rn, and En criteria for the FPCA, FPLS, FT, and FLF estimators.]

Wold, S., Ruhe, A., Wold, H., & Dunn, W., III. (1984). The collinearity problem in linear regression: The partial least squares (PLS) approach to generalized inverses. SIAM Journal on Scientific and Statistical Computing, 5(3), 735–743.

Zhang, Y., Ma, F., & Zhu, B. (2019). Intraday momentum and stock return predictability: Evidence from China. Economic Modelling, 76, 319–329.

Figure 19: Comparison of the different estimation techniques. Slope kernel (t) with n = 100 and ε(1). [Box plots of the MSE, RAD, AD, Rn, and En criteria for the FPCA, FPLS, FT, and FLF estimators.]

Figure 20: Comparison of the different estimation techniques. Slope kernel (t) with n = 200 and ε(1). [Box plots of the MSE, RAD, AD, Rn, and En criteria for the FPCA, FPLS, FT, and FLF estimators.]

Figure 21: The estimated autoregressive operator, years 2011–2014. [Heat maps of the kernel estimated by FPCA, FPLS, FT, and FLF; horizontal axis: current day (9:30 AM – 4:00 PM), vertical axis: next day (9:30 AM – 4:00 PM).]

Figure 22: The estimated functional R-squared, years 2011–2014. [Pointwise R²(t) over the trading day (9:30 AM – 4:00 PM) for the FPCA, FPLS, FT, and FLF estimators.]
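Figure 22 reports a pointwise functional R-squared over the trading day. As an illustration only (the paper does not supply code), the sketch below computes one standard definition, R²(t) = 1 − SSR(t)/SST(t), separately at each intraday grid point t; the grid size, simulated curves, and toy forecasts are all assumptions, not the paper's data or model.

```python
import numpy as np

# Hypothetical setup: Y[i, j] is the cumulative intraday return of day i at
# grid point j; Yhat[i, j] is a model's one-day-ahead forecast of that value.
rng = np.random.default_rng(0)
n_days, n_grid = 250, 78  # e.g. a 5-minute grid over a 6.5-hour session
Y = np.cumsum(rng.normal(0.0, 1e-3, (n_days, n_grid)), axis=1)
Yhat = 0.2 * Y + rng.normal(0.0, 1e-3, (n_days, n_grid))  # toy forecasts

def functional_r2(Y, Yhat):
    """Pointwise R^2(t) = 1 - SSR(t)/SST(t), one value per grid point."""
    ssr = np.sum((Y - Yhat) ** 2, axis=0)               # residual sum of squares at each t
    sst = np.sum((Y - Y.mean(axis=0)) ** 2, axis=0)     # total sum of squares at each t
    return 1.0 - ssr / sst

r2 = functional_r2(Y, Yhat)  # curve of length n_grid, as plotted in Figure 22
```

The resulting vector `r2` can be plotted against normalized intraday time on [0, 1] to reproduce the shape of the panels in Figure 22.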

