
Journal of Financial Markets 14 (2011) 625–640

www.elsevier.com/locate/finmar

A computing bias in estimating the probability of informed trading☆

Hsiou-Wei William Lin*, Wen-Chyan Ke
Department of International Business, National Taiwan University, Taipei, Taiwan
Available online 8 March 2011

Abstract

This study identifies a factor that leads to a bias in estimating the probability of informed trading
(PIN), a widely-used microstructure measure. It is shown that, along with the numerical
maximization of the likelihood function for PIN, the floating-point exception (i.e., overflow or
underflow) may eliminate feasible solutions to the actual parameters in the optimization problem.
Approximately 44% of PIN estimates for recent stock market data may have been subject to a
downward bias that is more pronounced for active stocks than for inactive stocks. This study
develops a remedy to mitigate the resulting bias.
© 2011 Elsevier B.V. All rights reserved.

JEL classification: G12; G14; C13

Keywords: Floating-point exception; Informed trading; Market microstructure

1. Introduction

This study identifies a computing bias in estimating the probability of informed trading
(PIN). To obtain PIN using numerical maximum likelihood estimation (MLE), Easley,
Kiefer, O'Hara, and Paperman (1996) and Easley, Hvidkjaer, and O'Hara (2002) propose
a structural model calling for researchers to count the daily numbers of buyer- and seller-
initiated trades (or buys and sells) for a stock.
☆ The authors are indebted to the National Science Council of Taiwan for support under Grant NSC
97-2420-H-002-189-DR. They thank Bruce Lehman and an anonymous referee for helpful comments.
* Correspondence to: National Taiwan University, Department of International Business, College of
Management, Section 4, Roosevelt Road, Taipei 10617, Taiwan. Tel.: +886 2 3366 4967; fax: +886 2 2362 7203.
E-mail addresses: plin@management.ntu.edu.tw (H.-W. William Lin), wenchyan.ke@gmail.com (W.-C. Ke).

1386-4181/$ - see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.finmar.2011.03.001

Daily buys or sells may differ significantly across stocks and even change significantly
over time for the same stocks. In 2007, for example, for active stocks, mean daily buys were
2,142 and sells 2,311, compared to 10 buys and 11 sells for inactive stocks on the New
York Stock Exchange (NYSE). The standard deviation of daily buys or sells may also be
greater than 700 for one stock but less than 10 for another.
Yet, during the PIN estimation process, large buys or sells may trigger the power
function embedded in the likelihood to generate a numerical value that exceeds the range
of real number values that a computer software program can handle. Such a phenomenon
of overflow or underflow is referred to in computing science as the floating-point exception
(FPE; see Hauser, 1996). The FPE may disrupt programs such as SAS or MATLAB before
an optimal solution is obtained.¹
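The overflow and underflow behaviour described above can be reproduced in a few lines. The following is a minimal Python sketch (the paper itself works with SAS and MATLAB); the helper name `safe_exp` is hypothetical and only serves the illustration.

```python
import math
import sys

# Natural log of the largest finite double: the overflow threshold for exp().
EXP_OVERFLOW_THRESHOLD = math.log(sys.float_info.max)  # about 709.78


def safe_exp(x):
    """Return exp(x), or None when the result overflows a double (hypothetical helper)."""
    try:
        return math.exp(x)
    except OverflowError:
        return None


print(EXP_OVERFLOW_THRESHOLD)
print(safe_exp(709.0))   # finite, roughly 8.2e307
print(safe_exp(710.0))   # None: exp(710) overflows
print(math.exp(-800.0))  # 0.0: underflow is silent in Python
```

Note that overflow raises an error in Python while underflow silently returns zero, so the two kinds of FPE can surface differently depending on the software used.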
We also show that FPE narrows the set of feasible solutions in the optimization, and can in
fact eliminate estimates for the actual parameters. Accordingly, the FPE bias can also overstate
the correlation between the PIN estimates and trading frequency, such as the daily number of
trades.
FPE could have affected the estimations of Easley, Hvidkjaer, and O'Hara (2002, 2005)
and Yan and Zhang (2006); its impact appears to have grown as trading has become
more frequent in recent years. Easley, Hvidkjaer, and O'Hara (2002, 2005) and Yan and
Zhang (2006) also note that FPE may affect the estimation of PIN but they do not
systematically describe whether or how FPE leads to biased PIN estimates.
For example, Easley, Hvidkjaer, and O'Hara (2005) suggest that a large daily number of
trades could lead to failures in estimating PIN. In their 2001 sample of 2,037 stocks, they do
not have PIN estimates for 47 (3.6%); the value of these stocks accounts for 23.7% of the
total market capitalization. Similarly, Yan and Zhang (2006) report that for the fourth
quarter of 2004, 3.8% of stocks (65 of 1,690) lack PIN estimates; the market value of these
stocks accounts for 41.8% of the total market capitalization. We conjecture that for large-cap
stocks, PIN estimates are likely to be subject to the bias because of their high trading volume.
The remainder of this paper is organized as follows. In Section 2, we describe the effect
of FPE on the estimation of PIN. We also propose to reformulate the likelihood function
for PIN estimation in order to remedy its shortcomings. Finally, in Sections 3 and 4, we
use both simulated and historical data to demonstrate the significance of such a bias and to
explore the extent to which this bias may be reduced by our proposed measures.

2. Estimating the probability of informed trading: bias and remedy

We first describe the effect of the floating-point exception on the numerical estimation of
the probability of informed trading, and propose a remedy. Drawing on Easley, Hvidkjaer,
and O'Hara (2002), we use buys and sells as proxies for buy orders and sell orders, and
assume that these variables follow Poisson distributions. On day \(i\), the joint probability
density function of \((B_i, S_i)\), the observed numbers of buys and sells, is specified by
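\[
\begin{aligned}
f(B_i, S_i \mid \theta) \equiv{} & \alpha\delta \exp(-\varepsilon_b)\frac{\varepsilon_b^{B_i}}{B_i!}\,\exp(-(\varepsilon_s+\mu))\frac{(\varepsilon_s+\mu)^{S_i}}{S_i!} \\
& + \alpha(1-\delta)\exp(-(\varepsilon_b+\mu))\frac{(\varepsilon_b+\mu)^{B_i}}{B_i!}\,\exp(-\varepsilon_s)\frac{\varepsilon_s^{S_i}}{S_i!} \\
& + (1-\alpha)\exp(-\varepsilon_b)\frac{\varepsilon_b^{B_i}}{B_i!}\,\exp(-\varepsilon_s)\frac{\varepsilon_s^{S_i}}{S_i!},
\end{aligned}
\]
where \(\alpha\) is the probability of an information event; \(\delta\) is the probability of a bad signal;
\(\varepsilon_b\) and \(\varepsilon_s\) are the arrival rates of uninformed buys and sells, respectively; \(\mu\) is the arrival rate
of informed trades; and \(\theta \equiv (\alpha, \delta, \mu, \varepsilon_b, \varepsilon_s)\) consists of these structural parameters.

¹ The floating point is the most common representation today for real numbers by Intel-based PCs, Macintoshes
(Macs), and most UNIX platforms. Its effective range is approximately \(\pm 10^{308.25}\). For the exponential function
\(\exp(\cdot)\), computing a number that exceeds \(10^{308.25}\), for example \(\exp(710)\), results in an overflow (i.e., an FPE).
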
Assuming independent daily information arrivals, the basic expression for the log
joint likelihood given the observed series of \((B_i, S_i)\) over the past \(I\) trading days is
\[
L_B(\theta \mid T) \equiv \sum_{i=1}^{I} L_B(\theta \mid B_i, S_i) = \sum_{i=1}^{I} \log f(B_i, S_i \mid \theta),
\]
where \(L_B(\theta \mid B_i, S_i) \equiv \log f(B_i, S_i \mid \theta)\) and \(T \equiv ((B_1,S_1), (B_2,S_2), (B_3,S_3), \ldots, (B_I,S_I))\).


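Further, the estimate of \(\theta\) from MLE, denoted as \(\hat{\theta} \equiv (\hat{\alpha}, \hat{\delta}, \hat{\mu}, \hat{\varepsilon}_b, \hat{\varepsilon}_s)\), is obtained by
solving the problem
\[
\max_{\theta \in \mathrm{BFS}} \; L_B(\theta \mid T), \tag{1}
\]
where the set of basic feasible solutions, \(\mathrm{BFS} \equiv \{\theta = (\alpha,\delta,\mu,\varepsilon_b,\varepsilon_s) \mid \alpha, \delta \in [0,1] \text{ and } \mu, \varepsilon_b, \varepsilon_s \in [0,\infty)\}\),
is the boundary constraint. Then, the PIN estimate is
\[
\widehat{\mathrm{PIN}} \equiv \frac{\hat{\alpha}\hat{\mu}}{\hat{\alpha}\hat{\mu} + \hat{\varepsilon}_b + \hat{\varepsilon}_s},
\]
which is the ratio of mean informed trades to mean total trades.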
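To make the failure mode concrete, the following Python sketch evaluates the density \(f(B_i, S_i \mid \theta)\) term by term, exactly as it is written above, together with the PIN ratio. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, and the large-count example reuses the 2007 mean daily buys and sells of active stocks quoted in the introduction with arbitrarily chosen parameter values.

```python
import math


def basic_density(b, s, alpha, delta, mu, eps_b, eps_s):
    """Evaluate f(B_i, S_i | theta) naively, one Poisson factor at a time."""
    def poisson(k, rate):
        # exp(-rate) * rate**k / k!: rate**k overflows once k is large
        return math.exp(-rate) * rate ** k / math.factorial(k)

    bad = alpha * delta * poisson(b, eps_b) * poisson(s, eps_s + mu)
    good = alpha * (1 - delta) * poisson(b, eps_b + mu) * poisson(s, eps_s)
    no_event = (1 - alpha) * poisson(b, eps_b) * poisson(s, eps_s)
    return bad + good + no_event


def density_or_none(*args):
    """Return the density, or None when a floating-point exception occurs."""
    try:
        return basic_density(*args)
    except OverflowError:
        return None


def pin(alpha, mu, eps_b, eps_s):
    """Ratio of mean informed trades to mean total trades."""
    return alpha * mu / (alpha * mu + eps_b + eps_s)


# Inactive stock (about 10 buys, 11 sells): evaluates without trouble.
print(density_or_none(10, 11, 0.3, 0.5, 5.0, 10.0, 11.0))
# Active stock (2,142 buys, 2,311 sells): the power function overflows.
print(density_or_none(2142, 2311, 0.3, 0.5, 300.0, 2000.0, 2100.0))
```

In the second call the search point is perfectly admissible in BFS, yet it cannot be evaluated, which is the sense in which FPE shrinks the feasible set.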

2.1. Computing bias in estimating PIN

Given \(T\) and \(L_B(\theta \mid T)\), there is a subset of BFS:
\[
\mathrm{BFS}_{L_B \mid T} \equiv \{\theta \in \mathrm{BFS} \mid L_B(\theta \mid T) \text{ does not lead to the FPE}\},
\]
where \(L_B(\theta \mid T)\) can be accurately expressed as a floating-point number in the computing
process. Moreover, the boundary of \(\mathrm{BFS}_{L_B \mid T}\) varies with respect to \(T\) and \(L_B(\theta \mid T)\).
Therefore, if we obtain a (local) solution to problem (1) via a numerical method with
FPE, we are, in practice, solving the following:
\[
\max_{\theta \in \mathrm{BFS}_{L_B \mid T}} \; L_B(\theta \mid T). \tag{2}
\]
In the presence of FPE, problem (2) is different from problem (1). Depending on \(T\) and
\(L_B(\theta \mid T)\), the relation between \(\hat{\theta}\) and \(\hat{\theta}_{L_B \mid T}\) (the solutions to (1) and (2), respectively) is
complicated. Larger values of \(B_i\) and \(S_i\), which are likely to be observed during periods of
more frequent trading, correspond to a smaller \(\mathrm{BFS}_{L_B \mid T}\), and thus a more pronounced
difference between \(\hat{\theta}\) and \(\hat{\theta}_{L_B \mid T}\).
We can show where FPE leads to selection bias in Easley, Hvidkjaer, and O'Hara (2002,
2005) and Yan and Zhang (2006). Problem (1) is typically solved numerically from a
number of initial solutions as a means to obtain the maximizer,
although numerical solutions may easily fall within \(\mathrm{BFS} \setminus \mathrm{BFS}_{L_B \mid T}\) before the iteration
process generates the maximizer. This failure may imply that there is no locally optimal
(internal) solution \(\hat{\theta}_{L_B \mid T}\) in \(\mathrm{BFS}_{L_B \mid T}\) and thus produce a selection bias with regard to the
estimation of PIN in an empirical test. This selection bias, all the more common for more
frequently traded stocks, may then lead to a loss of representative observations.
In another case, even if one obtains a numerical solution \(\hat{\theta}_{L_B \mid T}\) to problem (2), \(\hat{\theta}_{L_B \mid T}\) is in
\(\mathrm{BFS}_{L_B \mid T}\) but differs from \(\theta\), which is the actual parameter vector and may be in \(\mathrm{BFS} \setminus \mathrm{BFS}_{L_B \mid T}\).
This causes a bias in the estimate of \(\theta\) and thus a downward bias in the PIN estimate.
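To examine this downward bias, consider the inaccurate likelihood expression
\(L_I(\theta \mid T) \equiv \sum_{i=1}^{I} L_I(\theta \mid B_i, S_i)\) proposed by Easley, Hvidkjaer, and O'Hara (2005).
\(L_I(\theta \mid B_i, S_i)\) is obtained by reformulating \(L_B(\theta \mid B_i, S_i)\) as
\[
\begin{aligned}
L_I(\theta \mid B_i, S_i) \equiv{} & \log\bigl[\alpha\delta \exp(-\mu)\, x_b^{B_i - M_i} x_s^{-M_i}
 + \alpha(1-\delta)\exp(-\mu)\, x_b^{-M_i} x_s^{S_i - M_i}
 + (1-\alpha)\, x_b^{B_i - M_i} x_s^{S_i - M_i}\bigr] \\
& + B_i \log(\varepsilon_b + \mu) + S_i \log(\varepsilon_s + \mu) - (\varepsilon_b + \varepsilon_s)
 + M_i(\log x_b + \log x_s) - \log(S_i!\,B_i!),
\end{aligned}
\]
where \(M_i = \min(B_i, S_i) + \max(B_i, S_i)/2\), \(x_b = \varepsilon_b/(\mu + \varepsilon_b)\), and \(x_s = \varepsilon_s/(\mu + \varepsilon_s)\). The last term,
\(\log(S_i!\,B_i!)\), is a constant with respect to the parameter vector \(\theta\) and is dropped in MLE. \(L_I(\theta \mid T)\) and
\(L_B(\theta \mid T)\) are algebraically equivalent but, in practice, when evaluated
on digital computers, may yield different maximizers.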
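A direct transcription of \(L_I(\theta \mid B_i, S_i)\) shows how the terms \(x_b^{-M_i}\) and \(x_s^{-M_i}\) behave for large counts. This is a hedged Python sketch rather than the original implementation: the function names are hypothetical, the constant \(\log(S_i!\,B_i!)\) is dropped as in the text, and the sketch assumes \(\varepsilon_b, \varepsilon_s > 0\).

```python
import math


def loglik_inaccurate(b, s, alpha, delta, mu, eps_b, eps_s):
    """L_I(theta | B_i, S_i) without the constant -log(S_i! B_i!) term."""
    m = min(b, s) + max(b, s) / 2
    x_b = eps_b / (mu + eps_b)
    x_s = eps_s / (mu + eps_s)
    inner = (
        alpha * delta * math.exp(-mu) * x_b ** (b - m) * x_s ** (-m)
        + alpha * (1 - delta) * math.exp(-mu) * x_b ** (-m) * x_s ** (s - m)
        + (1 - alpha) * x_b ** (b - m) * x_s ** (s - m)
    )
    return (
        math.log(inner)
        + b * math.log(eps_b + mu)
        + s * math.log(eps_s + mu)
        - (eps_b + eps_s)
        + m * (math.log(x_b) + math.log(x_s))
    )


def inaccurate_or_none(*args):
    """Return L_I, or None when evaluation hits a floating-point exception."""
    try:
        return loglik_inaccurate(*args)
    except (OverflowError, ValueError):
        return None


print(inaccurate_or_none(10, 11, 0.3, 0.5, 5.0, 10.0, 11.0))            # finite
print(inaccurate_or_none(2142, 2311, 0.3, 0.5, 600.0, 1800.0, 1900.0))  # None
```

In the second call \(M_i = 3297.5\) and \(x_s^{-M_i}\) is roughly \(e^{905}\), well beyond \(e^{710}\), so the expression cannot be evaluated even though the parameter vector lies in BFS.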
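Using \(L_I(\theta \mid T)\) and focusing on deriving the exponential function without overflow, we
construct \(\mathrm{BFS}_{L_I \mid T}(E)\) analogous to \(\mathrm{BFS}_{L_B \mid T}\) as shown below:
\[
\begin{aligned}
\mathrm{BFS}_{L_I \mid T}(E) &\equiv \{\theta \in \mathrm{BFS} \mid L_I(\theta \mid T) \text{ does not lead to the FPE}\} \\
&= \left\{\theta \in \mathrm{BFS} \;\middle|\;
\begin{array}{l}
-M_i \min(\log x_b, \log x_s) < E, \text{ and} \\
\min\{-\mu + B_i \log x_b,\; -\mu + S_i \log x_s,\; B_i \log x_b + S_i \log x_s\} - M_i(\log x_b + \log x_s) < E, \\
\forall i = 1, 2, 3, \ldots, I
\end{array}
\right\}, 
\end{aligned}
\tag{3}
\]
where \(E > 0\) denotes the minimum of numbers leading to the overflow for the exponential
function \(\exp(\cdot)\) in the computing process, or approximately 710 in typical software
programs. Further, we derive
\[
\mathrm{BFS}_{L_I \mid T}(E) \supseteq \mathrm{BFS}^{L}_{L_I \mid T} \equiv \{\theta \in \mathrm{BFS} \mid M_{\max}\,|\log x_b + \log x_s| < E\} \tag{4}
\]
and
\[
\mathrm{BFS}_{L_I \mid T}(E) \subseteq \mathrm{BFS}^{U}_{L_I \mid T} \equiv \{\theta \in \mathrm{BFS} \mid M_{\max} \max(|\log x_b|, |\log x_s|) < E\}, \tag{5}
\]
where \(M_{\max} = \max\{M_i,\; i = 1, 2, 3, \ldots, I\}\). According to (4) and (5), we could argue that
\(L_I(\theta \mid T)\) results in downward-biased PIN estimates for stocks with a large number of
trades. Appendix A presents details of the deduction process.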
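The two bounding sets in (4) and (5) are cheap to check for a given parameter vector. The sketch below, in Python, classifies \(\theta\) relative to them; the function name and the returned labels are hypothetical, and the sketch assumes \(\mu \ge 0\) and \(\varepsilon_b, \varepsilon_s > 0\). Membership in the set of (4) guarantees that \(\theta\) lies in \(\mathrm{BFS}_{L_I \mid T}(E)\), while falling outside the set of (5) rules it out.

```python
import math

E_OVERFLOW = 710.0  # approximate overflow threshold for exp(), as in the text


def classify_theta(trades, mu, eps_b, eps_s, e_max=E_OVERFLOW):
    """Place theta relative to the bounding sets in (4) and (5).

    trades: list of (B_i, S_i) pairs.
    """
    m_max = max(min(b, s) + max(b, s) / 2 for b, s in trades)
    log_xb = math.log(eps_b / (mu + eps_b))
    log_xs = math.log(eps_s / (mu + eps_s))
    if m_max * abs(log_xb + log_xs) < e_max:
        return "inside lower set (4)"
    if m_max * max(abs(log_xb), abs(log_xs)) < e_max:
        return "between (4) and (5)"
    return "outside upper set (5)"


print(classify_theta([(10, 11)], 5.0, 10.0, 11.0))
print(classify_theta([(2142, 2311)], 300.0, 1800.0, 1900.0))
print(classify_theta([(2142, 2311)], 600.0, 1800.0, 1900.0))
```

For the active-stock counts, raising \(\mu\) from 300 to 600 moves \(\theta\) outside the upper set, which illustrates why large \(\mu\) relative to \(\varepsilon_b\) and \(\varepsilon_s\) is lost first when trading is heavy.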

2.2. Remedy for PIN estimation bias

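To reduce the FPE bias, we first present the accurate likelihood expression,
\(L_A(\theta \mid T) \equiv \sum_{i=1}^{I} L_A(\theta \mid B_i, S_i)\), where
\[
\begin{aligned}
L_A(\theta \mid B_i, S_i) \equiv{} & \log\bigl[\alpha\delta \exp(e_{1i} - e_{\max i}) + \alpha(1-\delta)\exp(e_{2i} - e_{\max i}) + (1-\alpha)\exp(e_{3i} - e_{\max i})\bigr] \\
& + B_i \log(\varepsilon_b + \mu) + S_i \log(\varepsilon_s + \mu) - (\varepsilon_b + \varepsilon_s) + e_{\max i} - \log(S_i!\,B_i!),
\end{aligned}
\]
where \(e_{1i} = -\mu - B_i \log(1 + \mu/\varepsilon_b)\), \(e_{2i} = -\mu - S_i \log(1 + \mu/\varepsilon_s)\),
\(e_{3i} = -B_i \log(1 + \mu/\varepsilon_b) - S_i \log(1 + \mu/\varepsilon_s)\),
and \(e_{\max i} = \max(e_{1i}, e_{2i}, e_{3i})\). Again, the last term, \(\log(S_i!\,B_i!)\), is dropped. Even though \(L_A(\theta \mid T)\),
\(L_I(\theta \mid T)\), and \(L_B(\theta \mid T)\) (the accurate, inaccurate, and basic expressions) are algebraically
equivalent in terms of ideal real numbers, \(L_A(\theta \mid T)\) is the most accurate expression for the
computing process.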
\(L_A(\theta \mid T)\) is derived from two principles:

1. In computing \(e^x e^y\) (or \(x e^y\)), the expression \(e^{x+y}\) (or \(\mathrm{sign}(x)\, e^{\log(|x|) + y}\)) is more stable
than \(e^x e^y\) (or \(x e^y\)).
2. In the computer arithmetic process, the absolute computing error of a function \(f(x)\)
increases with the absolute value of its first-order derivative, \(|f'(x)|\) (Elden and
Wittmeyer-Koch, 1990).

Principle 1 suggests that to compute \(\log(e^x e^y + e^z)\) with \(x = 800\), \(y = -400\), and \(z = 900\), the
expression \(\log(e^{x+y} + e^z)\) is more appropriate than \(\log(e^x e^y + e^z)\). The explanation is as
follows. When \(e^{710}\) is the benchmark, computing \(e^x e^y\) leads to an overflow because \(e^x = e^{800}\)
exceeds \(e^{710}\), while computing \(e^{x+y} = e^{400}\) does not.
Further, according to Principle 2, one should avoid computing a large input value
for \(\exp(\cdot)\) and, conversely, a small positive input value for \(\log(\cdot)\). Therefore,
\(\log(e^{(x+y)-m} + e^{z-m}) + m\) with \(m = \max(x+y, z) = 900\) is more accurate than \(\log(e^{x+y} + e^z)\) in
computing. The latter leads to an overflow because \(e^z = e^{900}\) exceeds \(e^{710}\).
The values \((x+y) - m = -500\) and \(z - m = 0\), however, are always less than or equal to zero
and allow accurate computation for \(\exp(\cdot)\); \(e^{(x+y)-m} + e^{z-m} = e^{-500} + e^{0}\) is always greater
than one, and the expression is helpful for computing \(\log(\cdot)\).
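The worked example with \(x = 800\), \(y = -400\), and \(z = 900\) can be checked numerically. The following Python sketch factors out the largest exponent before exponentiating; the function name `log_sum_exp` is an assumption of this illustration, not a name used in the paper.

```python
import math


def log_sum_exp(exponents):
    """Return log(sum(exp(v))) by factoring out the largest exponent m."""
    m = max(exponents)
    # Every shifted exponent v - m is <= 0, so exp() cannot overflow,
    # and the sum is at least exp(0) = 1, so log() receives a safe input.
    return m + math.log(sum(math.exp(v - m) for v in exponents))


x, y, z = 800.0, -400.0, 900.0
stable = log_sum_exp([x + y, z])
print(stable)  # 900.0
# By contrast, math.exp(x) * math.exp(y) + math.exp(z) raises OverflowError,
# because exp(800) and exp(900) both exceed the double range.
```

The result is exactly 900.0 in double precision because \(e^{-500}\) is far below the spacing of floating-point numbers near one.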
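To derive the exponential function without overflow as in the last section, we also
construct \(\mathrm{BFS}_{L_A \mid T}(E)\) analogous to \(\mathrm{BFS}_{L_B \mid T}\) as follows:
\[
\begin{aligned}
\mathrm{BFS}_{L_A \mid T}(E) &\equiv \{\theta \in \mathrm{BFS} \mid L_A(\theta \mid T) \text{ does not lead to the FPE}\} \\
&= \{\theta \in \mathrm{BFS} \mid \max(e_{1i} - e_{\max i},\; e_{2i} - e_{\max i},\; e_{3i} - e_{\max i}) < E,\; i = 1, 2, 3, \ldots, I\},
\end{aligned}
\]
where \(e_{1i} - e_{\max i}\), \(e_{2i} - e_{\max i}\), and \(e_{3i} - e_{\max i}\) are never positive, and are therefore less than
\(E > 0\) for each \(\theta \in \mathrm{BFS}\). Namely, \(\mathrm{BFS}_{L_A \mid T}(E)\) is equal to BFS if one focuses on \(\exp(\cdot)\),
the function subject to FPE; thus \(L_A(\theta \mid T)\) may reduce the FPE bias in computing.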
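The accurate expression translates into code almost line by line. The Python sketch below is one possible transcription, not the authors' SAS implementation; the function name is hypothetical, the constant \(\log(S_i!\,B_i!)\) is dropped, and the sketch assumes \(\varepsilon_b, \varepsilon_s > 0\) and \(0 < \alpha, \delta < 1\).

```python
import math


def loglik_accurate(b, s, alpha, delta, mu, eps_b, eps_s):
    """L_A(theta | B_i, S_i) without the constant -log(S_i! B_i!) term."""
    log_b = math.log(1 + mu / eps_b)
    log_s = math.log(1 + mu / eps_s)
    e1 = -mu - b * log_b
    e2 = -mu - s * log_s
    e3 = -b * log_b - s * log_s
    e_max = max(e1, e2, e3)
    # Each shifted exponent is <= 0, so none of the exp() calls can overflow.
    inner = (
        alpha * delta * math.exp(e1 - e_max)
        + alpha * (1 - delta) * math.exp(e2 - e_max)
        + (1 - alpha) * math.exp(e3 - e_max)
    )
    return (
        math.log(inner)
        + b * math.log(eps_b + mu)
        + s * math.log(eps_s + mu)
        - (eps_b + eps_s)
        + e_max
    )


# Active-stock counts that break the naive expressions evaluate cleanly here.
print(loglik_accurate(2142, 2311, 0.3, 0.5, 600.0, 1800.0, 1900.0))
```

Summing this function over the \(I\) trading days gives \(L_A(\theta \mid T)\), which can then be passed to any bounded numerical optimizer.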
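Below we compare the two different likelihood expressions, \(L_I(\theta \mid T)\) and \(L_A(\theta \mid T)\), based
on both simulation and historical data. To distinguish between the two, we denote
the estimate using \(L_A(\theta \mid T)\) as \(\hat{\theta}_A \equiv (\hat{\alpha}_A, \hat{\delta}_A, \hat{\mu}_A, \hat{\varepsilon}_{b,A}, \hat{\varepsilon}_{s,A})\) and the estimate using \(L_I(\theta \mid T)\) as
\(\hat{\theta}_I \equiv (\hat{\alpha}_I, \hat{\delta}_I, \hat{\mu}_I, \hat{\varepsilon}_{b,I}, \hat{\varepsilon}_{s,I})\). We similarly denote the PIN estimates calculated by \(\hat{\theta}_A\) and \(\hat{\theta}_I\) as
\(\widehat{\mathrm{PIN}}_A\) and \(\widehat{\mathrm{PIN}}_I\). The subscripts \(A\) and \(I\) indicate estimates from \(L_A(\theta \mid T)\) and \(L_I(\theta \mid T)\).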

3. Simulation tests of reformulated likelihood functions

We perform a simulation test with 2,500 hypothetical stocks, each assigned a random
parameter vector \(\theta = (\alpha, \delta, \mu, \varepsilon_b, \varepsilon_s)\). Let \(\alpha \in \{0.1, 0.3, 0.5, 0.7, 0.9\}\) and \(\delta \in \{0.1, 0.3, 0.5, 0.7, 0.9\}\).
Then, for each pair of \(\alpha\) and \(\delta\), we randomly generate 100 combinations of \(\mu\), \(\varepsilon_b\), and \(\varepsilon_s\):
\(\mu\) is drawn from [0, 600] with probability density function \(f(\mu) = 1/600\), and \(\varepsilon_b\) and \(\varepsilon_s\) are
drawn from [0, 1,200] with probability density functions \(f(\varepsilon_b) = 1/1{,}200\) and \(f(\varepsilon_s) = 1/1{,}200\).
The upper bounds of \(\mu\) as well as \(\varepsilon_b\) and \(\varepsilon_s\)
(600 and 1,200, respectively) are based on the results of Yan and Zhang (2006) for NYSE
and AMEX stocks between 1993 and 2004.²
For each hypothetical stock with \(\theta\), we simulate the number of buys and sells \((B_i, S_i)\) for
60 trading days. Then, we maximize both the reformulated likelihood functions \(L_A(\theta \mid T)\)
and \(L_I(\theta \mid T)\) in BFS using a modified Newton–Raphson method with line search in the
SAS NLP procedure.
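The data-generating step for one hypothetical stock can be sketched as follows. This is an illustrative Python version using only the standard library, not the authors' simulation code; the function names, the seed handling, and the Poisson sampler (which counts exponential inter-arrival times within one unit of time) are assumptions of this sketch.

```python
import random


def poisson_draw(rate, rng):
    """Draw a Poisson(rate) count as the number of arrivals in one time unit."""
    if rate <= 0:
        return 0
    count, t = 0, rng.expovariate(rate)
    while t <= 1.0:
        count += 1
        t += rng.expovariate(rate)
    return count


def simulate_trades(alpha, delta, mu, eps_b, eps_s, days=60, seed=0):
    """Simulate (B_i, S_i) for `days` trading days under the PIN model."""
    rng = random.Random(seed)
    trades = []
    for _ in range(days):
        buy_rate, sell_rate = eps_b, eps_s
        if rng.random() < alpha:      # an information event occurs
            if rng.random() < delta:  # bad signal: informed traders sell
                sell_rate += mu
            else:                     # good signal: informed traders buy
                buy_rate += mu
        trades.append((poisson_draw(buy_rate, rng), poisson_draw(sell_rate, rng)))
    return trades


sample = simulate_trades(0.3, 0.5, 300.0, 600.0, 600.0, days=60, seed=1)
print(sample[:3])
```

The sampler avoids computing \(e^{-\text{rate}}\) directly, which would itself underflow for the larger arrival rates considered here.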
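To solve the optimization problems, we adopt initial values of \(\alpha^0\), \(\delta^0\), \(\mu^0\), \(\varepsilon_b^0\), and \(\varepsilon_s^0\)
revised from those proposed by Yan and Zhang (2006):
\[
\begin{aligned}
&\alpha^0 = \alpha_i,\quad \delta^0 = \delta_j,\quad \varepsilon_b^0 = \gamma_k \bar{B},\quad \mu^0 = \frac{\bar{B} - \varepsilon_b^0}{\alpha^0(1 - \delta^0)},\quad \text{and}\quad \varepsilon_s^0 = \bar{S} - \alpha^0\delta^0\mu^0 && \text{if } \bar{B} \le \bar{S}; \\
&\alpha^0 = \alpha_i,\quad \delta^0 = \delta_j,\quad \varepsilon_s^0 = \gamma_k \bar{S},\quad \mu^0 = \frac{\bar{S} - \varepsilon_s^0}{\alpha^0\delta^0},\quad \text{and}\quad \varepsilon_b^0 = \bar{B} - \alpha^0(1 - \delta^0)\mu^0 && \text{if } \bar{B} > \bar{S},
\end{aligned}
\]
where \(\bar{B}\) and \(\bar{S}\) are the sample means, and the three variables, \(\alpha_i\), \(\delta_j\), and \(\gamma_k\), take their
values from the five fractions \(\{0.1, 0.3, 0.5, 0.7, 0.9\}\), resulting in \(5^3 = 125\) combinations.
Certain combinations \((\alpha_i, \delta_j, \gamma_k)\) would lead to a negative \(\varepsilon_b^0\) or \(\varepsilon_s^0\) and are thus excluded.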
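The grid of starting points can be generated mechanically. The sketch below, in Python, follows the two cases above and drops combinations with a negative arrival rate; the function name and the tuple layout are hypothetical choices for this illustration.

```python
from itertools import product

FRACTIONS = (0.1, 0.3, 0.5, 0.7, 0.9)


def initial_values(mean_b, mean_s):
    """Return the admissible (alpha0, delta0, mu0, eps_b0, eps_s0) tuples."""
    starts = []
    for a0, d0, g in product(FRACTIONS, repeat=3):
        if mean_b <= mean_s:
            eps_b0 = g * mean_b
            mu0 = (mean_b - eps_b0) / (a0 * (1 - d0))
            eps_s0 = mean_s - a0 * d0 * mu0
        else:
            eps_s0 = g * mean_s
            mu0 = (mean_s - eps_s0) / (a0 * d0)
            eps_b0 = mean_b - a0 * (1 - d0) * mu0
        # Exclude combinations that imply a negative uninformed arrival rate.
        if eps_b0 >= 0 and eps_s0 >= 0:
            starts.append((a0, d0, mu0, eps_b0, eps_s0))
    return starts


starts = initial_values(100.0, 120.0)
print(len(starts), "of 125 combinations are admissible")
```

By construction each admissible start matches the sample means: \(\varepsilon_b^0 + \alpha^0(1-\delta^0)\mu^0 = \bar{B}\) and \(\varepsilon_s^0 + \alpha^0\delta^0\mu^0 = \bar{S}\).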
Fig. 1 provides a visual comparison of the two estimates with \(L_A(\theta \mid T)\) and \(L_I(\theta \mid T)\), and
Table 1 provides summary statistics and the Wilcoxon signed rank tests for the estimates.
The distributions of estimation errors appear to vary for different parameter vectors, so for
each estimate, we investigate the bias associated with \(L_I(\theta \mid T)\) and \(L_A(\theta \mid T)\) using a simple
no-intercept regression such as Model (6) and test whether \(\beta\) deviates from one:
\[
E(\widehat{\mathrm{PIN}} \mid \mathrm{PIN}) = \beta \times \mathrm{PIN}, \tag{6}
\]
where \(\widehat{\mathrm{PIN}}\) denotes \(\widehat{\mathrm{PIN}}_A\) or \(\widehat{\mathrm{PIN}}_I\), and PIN is the actual value generated in the simulation. The
regression results are reported in Table 2.
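Model (6) has a single regressor and no intercept, so the slope and a heteroscedasticity-robust standard error have closed forms. The Python sketch below uses the basic White (HC0) variance, which is an assumption here since the paper does not state which variant was used; the function names are hypothetical.

```python
import math


def no_intercept_fit(actual, estimate):
    """Fit estimate = beta * actual; return (beta, White standard error)."""
    sxx = sum(x * x for x in actual)
    beta = sum(x * y for x, y in zip(actual, estimate)) / sxx
    # White (HC0) variance with one regressor: sum(x^2 e^2) / (sum(x^2))^2
    meat = sum((x * (y - beta * x)) ** 2 for x, y in zip(actual, estimate))
    return beta, math.sqrt(meat) / sxx


def wald_beta_equals_one(beta, se):
    """Wald statistic for H0: beta = 1 (chi-square with one degree of freedom)."""
    return ((beta - 1.0) / se) ** 2


beta, se = no_intercept_fit([0.1, 0.2, 0.3, 0.4], [0.07, 0.14, 0.21, 0.28])
print(beta)  # approximately 0.7: estimates are 30% below the actual values
print(wald_beta_equals_one(0.7059, 0.0109))  # about 728
```

The last line plugs in the rounded coefficient and standard error that Table 2 reports for \(\widehat{\mathrm{PIN}}_I\); the result is close to the tabulated statistic of 733.26, with the gap attributable to rounding of the inputs.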
Fig. 1 as well as Tables 1 and 2 show that the estimates with \(L_A(\theta \mid T)\) are unbiased. When
\(L_I(\theta \mid T)\) is used, however, the PIN estimate is statistically downward-biased, perhaps
because of underestimation of \(\mu\) and overestimation of \(\varepsilon_b + \varepsilon_s\).
Further, as shown in Table 3, the floating-point exception narrows the feasible region so
that the optimization process eliminates the actual parameters from the set of feasible
solutions. With simulated trading data set \(T\), Table 3 uses \(\mathrm{BFS}_{L_I \mid T}(U)\), which is \(\mathrm{BFS}_{L_I \mid T}(E)\)
in Eq. (3) except for replacing the constant \(E\) by the variable \(U\) with values of 100, 400 and
710, to classify 2,500 hypothetical stocks by actual parameters \(\theta\) into a few subsets.
If the actual parameter \(\theta\) for a hypothetical stock is in \(\mathrm{BFS}_{L_I \mid T}(U)\) with a small
\(U < E \approx 710\), its estimate for the probability of informed trading may not be subject to FPE
bias. Table 3 shows that, in contrast to \(\widehat{\mathrm{PIN}}_A\), 94.64% of \(\widehat{\mathrm{PIN}}_I\) with \(\theta\) in \(\mathrm{BFS} \setminus \mathrm{BFS}_{L_I \mid T}(710)\)
is underestimated. Namely, FPE bias frequently occurs for \(\theta\) in \(\mathrm{BFS} \setminus \mathrm{BFS}_{L_I \mid T}(E) \approx
\mathrm{BFS} \setminus \mathrm{BFS}_{L_I \mid T}(710)\), which is consistent with our initial analysis (see Section 2). We also find
that each \(\hat{\theta}_I\) is in \(\mathrm{BFS}_{L_I \mid T}(E) \approx \mathrm{BFS}_{L_I \mid T}(710)\) for each and every hypothetical stock. That is,
the estimation of PIN is naturally subject to FPE bias even when the constraints embedded
in \(\mathrm{BFS}_{L_I \mid T}(E)\) are not actually imposed.
The simulation result shows that when we use \(L_I(\theta \mid T)\), the FPE bias leads to an
underestimation of \(\mu\) and an overestimation of \(\varepsilon_b + \varepsilon_s\), and the result is an underestimation
² It is appropriate that the range for \(\mu\) is not as wide as that for \(\varepsilon_b\) or \(\varepsilon_s\). There are typically fewer informed
traders than uninformed traders in the market.

[Fig. 1 graphic: six scatter panels, each plotting the estimate (vertical axis, "Estimate") against the actual value (horizontal axis, "Actual Value"), with one series for estimates with \(L_A(\theta \mid T)\) and one for estimates with \(L_I(\theta \mid T)\).]

Fig. 1. Visual comparison of estimates and their actual values for the simulation sample: (A) PIN, (B) \(\alpha\), (C) \(\delta\),
(D) \(\mu\), (E) \(\varepsilon \equiv \varepsilon_b + \varepsilon_s\), and (F) \(\mu/\varepsilon\). In Panel A, most of the PIN estimates using \(L_A(\theta \mid T)\) are located along the 45° line.
By contrast, most PIN estimates using \(L_I(\theta \mid T)\) are located to the right of the 45° line and are systematically
underestimated. Moreover, Panels D, E, and F show that the underestimation of PIN results from the
underestimation of \(\mu\) and the overestimation of \(\varepsilon \equiv \varepsilon_b + \varepsilon_s\) if these parameters are estimated with \(L_I(\theta \mid T)\).

of PIN. The parameter estimates based on 1993–2004 real-world data also suggest that the
results in Easley, Hvidkjaer, and O'Hara (2002, 2005) and Yan and Zhang (2006) may
have been contaminated by FPE bias. Even in a simulation test for a reformulated \(L_I(\theta \mid T)\)
with \(\varepsilon_b = \varepsilon_s\), proposed by Easley, Engle, O'Hara, and Wu (2001), we find the presence of
FPE bias.³

³ The constraint that \(\varepsilon_b = \varepsilon_s\) may reduce the effect of FPE, but it still may not be appropriate for empirical studies
in certain situations. For instance, in a credit crunch, liquidity traders may submit more sell orders because of
their funding needs; therefore, assuming \(\varepsilon_b = \varepsilon_s\) may lead to an overestimation of PIN.

Table 1
Summary and test statistics for the simulation data.
We conduct multiple simulations using 2,500 hypothetical stocks, each of which corresponds to a random
parameter vector \(\theta\). With \(L_A(\theta \mid T)\), the estimate of PIN is downward biased, but the estimates of parameters are
unbiased according to the Wilcoxon signed rank test (p < 0.01), except for \(\alpha\). The summary statistics also imply
that \(\widehat{\mathrm{PIN}}_A\) is more precise than \(\widehat{\mathrm{PIN}}_I\) with its smaller standard error of the mean (SEM) and that \(\hat{\alpha}_A\) and \(\hat{\alpha}_I\) may
not be significantly different from each other with their close ranges, defined as the difference between the
maximum and the minimum. Moreover, with \(L_I(\theta \mid T)\), PIN is statistically significantly underestimated (p < 0.01).
Such an underestimation may result from the underestimation of \(\mu\) and the overestimation of \(\varepsilon\), where \(\varepsilon \equiv \varepsilon_b + \varepsilon_s\).

Variable                                                          Mean       SEMᵃ     Min         Median     Max        Testᵇ

\(\widehat{\mathrm{PIN}}_A - \mathrm{PIN}\)ᶜ                      -0.0011    0.0003   -0.1145     0.0002     0.0951     0.0064*
\(\widehat{\mathrm{PIN}}_I - \mathrm{PIN}\)ᵈ                      -0.0299    0.0012   -0.6123     -0.0069    0.0951     <0.0001*
\(\hat{\alpha}_A - \alpha\)                                       0.0156     0.0028   -0.9000     0.0000ᵉ    0.9000     0.0007*
\(\hat{\alpha}_I - \alpha\)                                       0.0145     0.0031   -0.8578     0.0000     0.9000     0.8372
\(\hat{\delta}_A - \delta\)                                       0.0049     0.0033   -0.9000     0.0000     0.9000     0.3616
\(\hat{\delta}_I - \delta\)                                       0.0052     0.0037   -0.9000     0.0000     0.9000     0.3883
\(\hat{\mu}_A - \mu\)                                             1.3877     0.2872   -106.3085   0.1031     324.4225   0.0288
\(\hat{\mu}_I - \mu\)                                             -73.5955   2.2565   -547.3298   -9.9742    324.4225   <0.0001*
\(\hat{\varepsilon}_A - \varepsilon\)ᶠ                            1.6607     0.2362   -148.8821   0.1057     32.0964    0.0104
\(\hat{\varepsilon}_I - \varepsilon\)                             32.0990    1.3010   -148.8823   5.7255     421.4701   <0.0001*
\(\hat{\mu}_A/\hat{\varepsilon}_A - \mu/\varepsilon\)             0.0009     0.0004   -0.4242     0.0001     0.1512     0.0188
\(\hat{\mu}_I/\hat{\varepsilon}_I - \mu/\varepsilon\)             -0.1189    0.0135   -30.4464    -0.0088    0.1512     <0.0001*

* Indicates significance at p < 0.01.
ᵃ SEM denotes the standard error of the mean.
ᵇ The column shows the p-value of the Wilcoxon signed rank test with the null hypothesis of zero mean.
ᶜ The right subscripted A denotes that the estimate is from MLE with the accurate function \(L_A(\theta \mid T)\).
ᵈ The right subscripted I denotes that the estimate is from MLE with the inaccurate function \(L_I(\theta \mid T)\).
ᵉ All zeros are due to rounding off and truncation.
ᶠ \(\varepsilon \equiv \varepsilon_b + \varepsilon_s\) and \(\hat{\varepsilon} \equiv \hat{\varepsilon}_b + \hat{\varepsilon}_s\), where the extra right subscripted A or I is omitted.

4. Empirical evidence

Next, we use a regression model to explore the extent to which the floating-point exception
affects the validity of the estimate for the probability of informed trading in predicting the
opening spread. Easley, Kiefer, O'Hara, and Paperman (1996) propose the regression model:

\[
S = \beta_0 + \beta_1 \times V \times \mathrm{PIN} + \beta_2 \times \mathrm{VOL} + \eta,
\]

where \(S\) is the opening spread, \(V\) is the stock price, VOL denotes the mean daily dollar volume,
and \(\eta\) is the error term. Over the sample period, \(S\) is calculated as the mean daily opening spread
for each stock as in Easley, Kiefer, O'Hara, and Paperman (1996); \(V\) is calculated as the mean
daily closing price; and VOL is the mean daily number of trades multiplied by \(V\).
In the regression model, the linear relation between \(S\) and \(V \times \mathrm{PIN}\) is derived from the
PIN model under certain assumptions that simplify the analysis (see Easley, Kiefer,
O'Hara, and Paperman, 1996). Easley, Kiefer, O'Hara, and Paperman (1996) include VOL
as a control, because the inventory effect is not incorporated into the PIN model. They
also add a constant term, \(\beta_0\), because the PIN model ignores any costs the market maker
incurs beyond losses to informed traders.

Table 2
Test of unbiasedness using the regression method for simulation data.
We conduct multiple simulations using 2,500 hypothetical stocks, each of which corresponds to a random
parameter vector \(\theta\). Given the simple no-intercept regression model, the regression coefficient should be one
if the estimates are unbiased. Moreover, many of these regressions are subject to heteroscedasticity.
Accordingly, the table uses the White estimator for the variance to test the null hypothesis that the regression
coefficient is one. At the significance level 0.01, the estimates of parameters are not significantly biased for \(L_A(\theta \mid T)\)
with the exception of \(\varepsilon\), while the estimates of parameters other than \(\alpha\) are biased for \(L_I(\theta \mid T)\). The
coefficient in the regression \(E(\hat{\varepsilon}_A \mid \varepsilon) = \beta \times \varepsilon\) appears to be different from one; a potential explanation is that its
standard error of 0.0002 is very small. Moreover, for \(L_I(\theta \mid T)\), the underestimation of PIN may be due to both the
underestimation of \(\mu\) and the overestimation of \(\varepsilon\), where \(\varepsilon \equiv \varepsilon_b + \varepsilon_s\).

Model                                                                                      β        White S.E.   White testᵃ (χ²)   H₀: β = 1ᵇ

\(E(\widehat{\mathrm{PIN}}_A \mid \mathrm{PIN})\)ᶜ \(= \beta \times \mathrm{PIN}\)         0.9988   0.0023       12.85*             0.27
\(E(\widehat{\mathrm{PIN}}_I \mid \mathrm{PIN})\)ᵈ \(= \beta \times \mathrm{PIN}\)         0.7059   0.0109       25.42*             733.26*
\(E(\hat{\alpha}_A \mid \alpha) = \beta \times \alpha\)                                    1.0074   0.0039       14.50*             3.49
\(E(\hat{\alpha}_I \mid \alpha) = \beta \times \alpha\)                                    0.9951   0.0040       34.85*             1.49
\(E(\hat{\delta}_A \mid \delta) = \beta \times \delta\)                                    0.9910   0.0055       2.70               2.69
\(E(\hat{\delta}_I \mid \delta) = \beta \times \delta\)                                    0.9787   0.0060       3.21               12.54*
\(E(\hat{\mu}_A \mid \mu) = \beta \times \mu\)                                             1.0001   0.0005       9.04*              0.04
\(E(\hat{\mu}_I \mid \mu) = \beta \times \mu\)                                             0.6986   0.0063       156.30*            2,268.51*
\(E(\hat{\varepsilon}_A \mid \varepsilon) = \beta \times \varepsilon\)ᵉ                    0.9985   0.0002       14.16*             42.56*
\(E(\hat{\varepsilon}_I \mid \varepsilon) = \beta \times \varepsilon\)                     1.0195   0.0009       32.73*             430.54*
\(E(\hat{\mu}_A/\hat{\varepsilon}_A \mid \mu/\varepsilon) = \beta \times \mu/\varepsilon\) 0.9934   0.0035       1.54               3.57
\(E(\hat{\mu}_I/\hat{\varepsilon}_I \mid \mu/\varepsilon) = \beta \times \mu/\varepsilon\) 0.3332   0.1318       1.18               25.60*

* Indicates significance at p < 0.01.
ᵃ The \(\chi^2\) for the White test of heteroscedasticity.
ᵇ The Wald statistic based on the White estimator for the covariance matrix.
ᶜ The right subscripted A denotes that the estimate is from MLE with the accurate \(L_A(\theta \mid T)\).
ᵈ The right subscripted I denotes that the estimate is from MLE with the inaccurate \(L_I(\theta \mid T)\).
ᵉ \(\varepsilon \equiv \varepsilon_b + \varepsilon_s\) and \(\hat{\varepsilon} \equiv \hat{\varepsilon}_b + \hat{\varepsilon}_s\), where the right subscripted A or I is omitted.

Our sample is constructed using the NYSE's publicly available Trade and Quote data
(TAQ) and the Center for Research in Security Prices (CRSP) database. The initial
procedure is selection of 1,317 NYSE-listed stocks during the fourth quarter of 2007. First,
we select the common stocks with initial public offering dates before October 1, 2004, that
are traded every day during the sample period. Next, we eliminate infrequently traded
stocks with mean daily volumes below $20,000, fewer than 20 mean daily trades, and prices
below $3 in any trade during the sample period. Even then the large number of buys and
sells still leads to failure in terms of obtaining \(\widehat{\mathrm{PIN}}_I\) in 2007. Therefore, we focus on the
1,056 stocks for which both \(\widehat{\mathrm{PIN}}_A\) and \(\widehat{\mathrm{PIN}}_I\) can be obtained.
To estimate PIN, we first count the daily numbers of buys and sells for each stock.
Following Easley, Kiefer, O'Hara, and Paperman (1996), Easley, O'Hara, and Paperman
(1998), Easley, Hvidkjaer, and O'Hara (2002), as well as Boehmer, Grammig, and Theissen
(2007), we adopt the Lee and Ready (1991) algorithm, which relies on a combination of
prevailing quotes and past transaction prices to infer trade direction during regular trading
hours (between 9:30 am and 4:00 pm). We use the first quote during the regular trading
hours as the opening quote and calculate the mean daily opening spread for each
stock. Finally, we retrieve from CRSP both \(V\) and VOL.

Table 3
Regions subject to the floating-point exception bias.
We conduct multiple simulations using 2,500 hypothetical stocks, each of which corresponds to a random parameter
vector \(\theta\). We classify them into different groups according to \(\mathrm{BFS}_{L_I \mid T}(U)\), and examine the relation between the regions
covering the true parameter and the underestimated PIN. For a hypothetical stock, if its true parameter \(\theta\) is in \(\mathrm{BFS}_{L_I \mid T}(U)\)
with a small \(U < E \approx 710\), then its PIN estimate may not be subject to bias. This table shows that 94.64% of \(\widehat{\mathrm{PIN}}_I\) are
underestimated when true parameters are in \(\mathrm{BFS} \setminus \mathrm{BFS}_{L_I \mid T}(710) \approx \mathrm{BFS} \setminus \mathrm{BFS}_{L_I \mid T}(E)\), in which the FPE occurs with a high
frequency. Moreover, the percentage of convergence increases from 19.92% to 29.22% for different given initial values
and \(L_I(\theta \mid T)\) when actual parameters are distant from \(\mathrm{BFS} \setminus \mathrm{BFS}_{L_I \mid T}(710)\). By contrast, \(\widehat{\mathrm{PIN}}_A\) is not significantly
underestimated, and almost all iterations converge for the optimization of \(L_A(\theta \mid T)\) in different regions. Namely, the bias
of \(\widehat{\mathrm{PIN}}_I\) is primarily caused by FPE, and when convergences of MLE are sensitive to initial values, the PIN estimate
tends to be downward biased. Moreover, small percentages of underestimated \(\widehat{\mathrm{PIN}}_A\) and \(\widehat{\mathrm{PIN}}_I\) with \(\theta\) in \(\mathrm{BFS}_{L_I \mid T}(100)\)
are due to most observations with the true \(\mu\) being too small relative to \(\varepsilon_b\) and \(\varepsilon_s\).

Regions covering                                                     Sample     #(\(\widehat{\mathrm{PIN}}_A\) < PIN)ᵃ/(1)   #(\(\widehat{\mathrm{PIN}}_I\) < PIN)/(1)   Number of initial   #(Con.)ᵇ/(2) for \(L_A(\theta \mid T)\)   #(Con.)/(2) for \(L_I(\theta \mid T)\)
true parameter \(\theta\)                                            size (1)   % of (1)                                      % of (1)                                     values (2)          % of (2)                                   % of (2)

\(\mathrm{BFS}_{L_I \mid T}(100)\)ᶜ                                  188        29.79                                         20.74                                        18,845              99.79                                      29.22
\(\mathrm{BFS}_{L_I \mid T}(400) \setminus \mathrm{BFS}_{L_I \mid T}(100)\)   547        51.37                                         51.19                                        54,280              99.72                                      25.90
\(\mathrm{BFS}_{L_I \mid T}(710) \setminus \mathrm{BFS}_{L_I \mid T}(400)\)   609        48.28                                         48.77                                        60,520              99.80                                      22.13
\(\mathrm{BFS} \setminus \mathrm{BFS}_{L_I \mid T}(710)\)            1,156      51.12                                         94.64ᵈ                                       121,765             99.86                                      19.62

ᵃ Number of downward biased PIN estimates.
ᵇ Number of convergences for given initial values.
ᶜ
\[
\mathrm{BFS}_{L_I \mid T}(U) \equiv \left\{\theta \in \mathrm{BFS} \;\middle|\;
\begin{array}{l}
-M_i \min(\log x_b, \log x_s) < U, \text{ and} \\
\min\{-\mu + B_i \log x_b,\; -\mu + S_i \log x_s,\; B_i \log x_b + S_i \log x_s\} - M_i(\log x_b + \log x_s) < U, \\
\forall i = 1, 2, 3, \ldots, I
\end{array}
\right\}.
\]
ᵈ The value is not equal to 100 because terms of \(\alpha\) and \(\delta\) in \(L_I(\theta \mid T)\) may mitigate FPE.

Before the regression analysis, we examine the estimates of \(\theta\) and PIN using \(L_A(\theta \mid T)\) and
\(L_I(\theta \mid T)\). Fig. 2 and Table 4 show that most values of \(\widehat{\mathrm{PIN}}_I\) are less than \(\widehat{\mathrm{PIN}}_A\) values.
Taking \(\hat{\alpha}_A\), \(\hat{\mu}_A\), and \(\widehat{\mathrm{PIN}}_A\) as unbiased benchmarks, we show that the underestimation of
\(\widehat{\mathrm{PIN}}_I\) results mainly from both upward bias in \(\hat{\alpha}_I\) and downward bias in \(\hat{\mu}_I\).
Next, we examine how much the difference between \(\widehat{\mathrm{PIN}}_A\) and \(\widehat{\mathrm{PIN}}_I\) helps explain the spread
via the regression model. Table 5 shows that \(\widehat{\mathrm{PIN}}_A\) outperforms \(\widehat{\mathrm{PIN}}_I\). FPE bias significantly
dampens the explanatory power of PIN estimates for the spread. We also conduct a regression
analysis referred to as the marginal effect model. The result demonstrates that
\(\widehat{\mathrm{PIN}}_{bias} \equiv \widehat{\mathrm{PIN}}_I - \widehat{\mathrm{PIN}}_A\) provides incremental explanatory power for spread regardless of whether we use
\(\widehat{\mathrm{PIN}}_A\) or \(\widehat{\mathrm{PIN}}_I\) in the regression; namely, FPE leads to the difference between \(\widehat{\mathrm{PIN}}_A\) and \(\widehat{\mathrm{PIN}}_I\)
even though the two estimates measure the same probability of informed trading.
\(\widehat{\mathrm{PIN}}_{bias}\) is negligible for stocks with fewer than about 1,000 maximum daily trades
(\(T_{\max}\)), as shown in Fig. 3A. This result is consistent with the idea that the difference in the
PIN estimate is caused by FPE, along with the large daily number of trades. Fig. 3A also
shows that the more active the stocks, the more pronounced the absolute values of \(\widehat{\mathrm{PIN}}_{bias}\).

[Fig. 2 graphic: six scatter panels, each plotting the estimate from the inaccurate expression (vertical axis, "Inaccurate") against the estimate from the accurate expression (horizontal axis, "Accurate").]

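Fig. 2. Scatter plots of estimates using \(L_A(\theta \mid T)\) and \(L_I(\theta \mid T)\) for TAQ data: (A) \(\widehat{\mathrm{PIN}}_I\) vs. \(\widehat{\mathrm{PIN}}_A\), (B) \(\hat{\alpha}_I\) vs. \(\hat{\alpha}_A\), (C)
\(\hat{\delta}_I\) vs. \(\hat{\delta}_A\), (D) \(\hat{\mu}_I\) vs. \(\hat{\mu}_A\), (E) \(\hat{\varepsilon}_I \equiv \hat{\varepsilon}_{b,I} + \hat{\varepsilon}_{s,I}\) vs. \(\hat{\varepsilon}_A \equiv \hat{\varepsilon}_{b,A} + \hat{\varepsilon}_{s,A}\), and (F) \(\hat{\mu}_I/\hat{\varepsilon}_I\) vs. \(\hat{\mu}_A/\hat{\varepsilon}_A\). The dots should be located
around the 45° line if the two estimates using \(L_A(\theta \mid T)\) and \(L_I(\theta \mid T)\) are not significantly different from each other.
Panel A shows that most dots are located to the right of the 45° line, suggesting that most values of \(\widehat{\mathrm{PIN}}_I\) are
underestimated relative to those of \(\widehat{\mathrm{PIN}}_A\). Further, Panels B through F show that the underestimation emerges when
most values of \(\hat{\alpha}_I\) are overestimated relative to those of \(\hat{\alpha}_A\), and most values of \(\hat{\mu}_I\) are underestimated relative to those
of \(\hat{\mu}_A\). Specifically, the values of \(\hat{\mu}_I\) are bounded by 300, but the values of \(\hat{\mu}_A\) are spread in a wide range from zero to
2,100. Moreover, \(\hat{\delta}_I\) and \(\hat{\varepsilon}_I\) are not significantly different from \(\hat{\delta}_A\) and \(\hat{\varepsilon}_A\) in Panels C and E. In particular, the dots are
located approximately along the 45° line in Panel E, suggesting that \(\hat{\varepsilon}_A\) and \(\hat{\varepsilon}_I\) are close to each other.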

In untabulated analysis, we derive the approximate upper bound of PIN estimates with a
given \(T_{\max}\) for the reformulated \(L_I(\theta \mid T)\) with \(\varepsilon_b = \varepsilon_s\), as proposed by Easley, Engle, O'Hara, and
Wu (2001). The upper bound is denoted by \(\overline{\mathrm{PIN}}_I(T_{\max}) \equiv 1/[1 + 2/(\exp(4E/T_{\max}) - 1)]\) and
plotted in Fig. 3B.⁴ With \(\overline{\mathrm{PIN}}_I(T_{\max})\), if the empirical upper bound is 0.4, the FPE bias may

4
Details can be obtained at SSRN: http://ssrn.com/abstract=1500828. The upper bound is PIN I Tmax
1=1 2=expE=Tmax 1 1=1 2=exp4E=3Tmax 1=2 for the original LI hjT with eb=es.

Table 4
Summary and comparison of estimates for 1,056 NYSE-listed stocks from TAQ.
We estimate the probability of informed trading (PIN) based on both \(L_A(\theta \mid T)\) and \(L_I(\theta \mid T)\) for a sample of 1,056
NYSE-listed stocks in the fourth quarter of 2007 from TAQ. To estimate PIN, we use Lee and Ready's (1991)
algorithm to determine the number of buys and sells. This table shows that 86.17% of \(\widehat{\mathrm{PIN}}_I\) are underestimated relative
to \(\widehat{\mathrm{PIN}}_A\), and the means of \(\widehat{\mathrm{PIN}}_{bias} \equiv \widehat{\mathrm{PIN}}_I - \widehat{\mathrm{PIN}}_A\) and relative \(\widehat{\mathrm{PIN}}_{bias} \equiv \widehat{\mathrm{PIN}}_{bias}/\widehat{\mathrm{PIN}}_A\) are \(-0.0447\) and \(-0.3179\),
respectively. Moreover, the values of \(\hat{\mu}_A\) scatter around their median of 425.0530 over a wide range from approximately
zero to 2,100. By contrast, the median of \(\hat{\mu}_I\) is 188.9032 and the values of \(\hat{\mu}_I\) are bounded by approximately 310. The
result is consistent with the notion that the underestimation of \(\widehat{\mathrm{PIN}}_I\) results from the underestimation of \(\hat{\mu}_I\).

Estimate                                                                                               Mean         Min        Median     Max

\(\widehat{\mathrm{PIN}}_A\)ᵃ                                                                          0.1396       0.0468     0.1341     0.3398
\(\widehat{\mathrm{PIN}}_I\)ᵇ                                                                          0.0949       0.0286     0.0786     0.2991
\(\widehat{\mathrm{PIN}}_{bias} \equiv \widehat{\mathrm{PIN}}_I - \widehat{\mathrm{PIN}}_A\)           -0.0447      -0.2342    -0.0426    0.0394
Relative \(\widehat{\mathrm{PIN}}_{bias} \equiv \widehat{\mathrm{PIN}}_{bias}/\widehat{\mathrm{PIN}}_A\)  -0.3179   -0.8302    -0.3523    0.4665
Percentage of \(\widehat{\mathrm{PIN}}_I\) underestimated relative to \(\widehat{\mathrm{PIN}}_A\)     86.17%
\(\hat{\alpha}_A\)                                                                                     0.3638       0.0312     0.3594     0.7656
\(\hat{\alpha}_I\)                                                                                     0.4898       0.0313     0.4660     0.8974
\(\hat{\delta}_A\)                                                                                     0.5299       0.0000ᶜ    0.5529     1.0000
\(\hat{\delta}_I\)                                                                                     0.4723       0.0000     0.4535     1.0000
\(\hat{\mu}_A\)                                                                                        472.2234     18.6807    425.0530   2,097.9190
\(\hat{\mu}_I\)                                                                                        182.6015     18.6807    188.9032   309.1559
\(\hat{\varepsilon}_A\)                                                                                1,125.9189   18.8944    928.7753   3,888.0867
\(\hat{\varepsilon}_I\)                                                                                1,198.8276   18.8944    979.2572   4,109.5795
\(\hat{\mu}_A/\hat{\varepsilon}_A\)ᵈ                                                                   0.5028       0.2155     0.4327     6.3139
\(\hat{\mu}_I/\hat{\varepsilon}_I\)                                                                    0.2740       0.0602     0.1904     2.3425

ᵃ The right subscripted A denotes that the estimate is from MLE with the accurate \(L_A(\theta \mid T)\).
ᵇ The right subscripted I denotes that the estimate is from MLE with the inaccurate \(L_I(\theta \mid T)\).
ᶜ The zero is due to rounding off and truncation.
ᵈ \(\varepsilon \equiv \varepsilon_b + \varepsilon_s\) and \(\hat{\varepsilon} \equiv \hat{\varepsilon}_b + \hat{\varepsilon}_s\), where the right subscripted A or I is omitted.

occur for stocks with $T_{\max}$ greater than 3,350. Therefore, even if we adopt this reformulated likelihood with $\varepsilon_b = \varepsilon_s$, which reduces the impact of FPE, $T_{\max}$ exceeds 3,350 for approximately 44.44% of the PIN estimates, which may thus be subject to the FPE bias. Further, Fig. 3C and D show that the $R^2$ between the natural log of the mean daily number of trades, denoted as ln(Trade), and $\widehat{PIN}_A$ is 0.1487, while that between ln(Trade) and $\widehat{PIN}_I$ is 0.6566.
We conclude that the floating-point exception bias significantly influences the regression results, as well as results in earlier empirical studies related to the probability of informed trading. A study examining the factors of information risk using a regression method for the PIN metric, for example, may be subject to this type of bias (e.g., Aslan, Easley, Hvidkjaer, and O'Hara, 2007).
We demonstrate that FPE bias leads to an underestimation of PIN, especially for stocks with a large number of trades. Therefore, FPE bias may overstate the correlation between the PIN estimates and measures sensitive to the trading frequency.
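The thresholds of 970 and 3,350 trades quoted for an empirical upper bound of 0.4 can be checked numerically. The sketch below evaluates the two approximate upper-bound expressions as reconstructed from the text and footnote 4, assuming $E \approx 710$. The function names are hypothetical, and the guard for very large exponents is an addition of this sketch, not part of the paper.

```python
import math

E = 710.0  # assumed overflow threshold of exp() in double precision


def _bound_term(c: float) -> float:
    """Evaluate 1 / (1 + 2 / (exp(c) - 1)) for c > 0."""
    if c > 700.0:
        # exp(c) would itself overflow; the term equals 1 to machine precision.
        return 1.0
    return 1.0 / (1.0 + 2.0 / math.expm1(c))


def pin_bound_reformulated(t_max: float, e: float = E) -> float:
    """Approximate PIN upper bound for the reformulated L_I with eps_b = eps_s."""
    return _bound_term(4.0 * e / t_max)


def pin_bound_original(t_max: float, e: float = E) -> float:
    """Approximate PIN upper bound for the original L_I with eps_b = eps_s."""
    return 0.5 * (_bound_term(e / t_max) + _bound_term(4.0 * e / (3.0 * t_max)))


# Both bounds should be close to 0.4 at the thresholds quoted in the text.
print(round(pin_bound_reformulated(3350), 3))
print(round(pin_bound_original(970), 3))
```

Both functions decrease in `t_max`, so a stock with more trades on its busiest day faces a tighter ceiling on the PIN estimate that the inaccurate likelihood can return.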

5. Discussion and conclusions

We posit that there is a bias in estimating the probability of informed trading that occurs
because of the floating-point exception. A biased PIN estimate directly impacts analyses of

Table 5
Regression test for the influence of floating-point exception bias.
This table presents the results from estimating the linear regressions $S = \beta_0 + \beta_1 \cdot V \cdot PIN + \beta_2 \cdot VOL + \eta$ and $S = \beta_0 + \beta_1 \cdot V \cdot PIN + \beta_{1,bias} \cdot V \cdot \widehat{PIN}_{bias} + \beta_2 \cdot VOL + \eta$, of which the former is proposed by Easley, Kiefer, O'Hara, and Paperman (1996) for exploring their PIN model. The dependent variable $S$ is the mean quoted opening spread calculated for the period of 10/1/2007 to 12/31/2007. The mean price, $V$, is obtained by averaging the CRSP daily closing prices for the same period. The dollar volume, $VOL$, is the mean daily number of shares traded multiplied by $V$. Moreover, $\widehat{PIN}_{bias}$ is equal to $\widehat{PIN}_I$ minus $\widehat{PIN}_A$. Finally, the PIN in the regression is replaced by $\widehat{PIN}_A$ or $\widehat{PIN}_I$, alternately. The regressions use ordinary least squares, and the t-statistics are reported in parentheses. The table shows that $\widehat{PIN}_A$ outperforms $\widehat{PIN}_I$ in explaining the spread because it leads to a greater adjusted $R^2$ in both the general model and the model restricted to $\beta_2 = 0$. In particular, $\widehat{PIN}_A$ improves the adjusted $R^2$ from 0.3801 to 0.5753 for the model restricted to $\beta_2 = 0$. Moreover, with the marginal effect model, this study demonstrates that $\widehat{PIN}_{bias}$ provides incremental explanatory power for the spread regardless of whether we control for $\widehat{PIN}_A$ or $\widehat{PIN}_I$. The explanation is that $\widehat{PIN}_A$ and $\widehat{PIN}_I$ are substantially different measures for the probability of informed trading.

| Coeff. | General model, $\widehat{PIN}_A$ | General model, $\widehat{PIN}_I$ | Restriction to $\beta_2 = 0$, $\widehat{PIN}_A$ | Restriction to $\beta_2 = 0$, $\widehat{PIN}_I$ | Marginal effect, $\widehat{PIN}_A$ | Marginal effect, $\widehat{PIN}_I$ |
|---|---|---|---|---|---|---|
| Intercept | 0.9766 (28.9437*) | 0.9367 (24.0413*) | 0.9772 (28.9526*) | 1.1448 (27.5519*) | 0.9002 (25.1248*) | 0.9002 (25.1248*) |
| $V \cdot PIN$ | 0.2162 (30.3282*) | 0.2712 (25.6081*) | 0.2217 (37.8259*) | 0.2995 (25.4660*) | 0.2551 (26.0859*) | 0.2551 (26.0859*) |
| $V \cdot \widehat{PIN}_{bias}$ | | | | | 0.0915 (5.7156*) | -0.1637 (-14.1339*) |
| $VOL$ | 8.84E-10 (1.3307*) | 9.95E-09 (16.7623*) | | | 2.86E-09 (3.8644*) | 2.86E-09 (3.8644*) |
| Adj. $R^2$ | 0.5757 | 0.5102 | 0.5753 | 0.3801 | 0.5880 | 0.5880 |
| F-value | 716.5800* | 550.5300* | 1,430.3500* | 647.9600* | 502.9800* | 502.9800* |

* Indicates significance at p < 0.01.

stock market microstructure. While Yan and Zhang (2006) and Boehmer, Grammig, and Theissen (2007) have investigated the bias in PIN estimations, few studies have explored the issue of FPE. Boehmer, Grammig, and Theissen (2007) focus on mitigating a bias in PIN estimation due to the misclassification of trades, but we investigate both the extent to which FPE produces a bias in estimating PIN and how we can reduce such a bias using the reformulated likelihood function.
Our results based on both simulation and historical data show that the bias may be more pronounced for active stocks. Such a bias may have contaminated the results of prior studies, as failing to identify it may lead to overstatement of the relation between PIN and measures sensitive to trading frequency. Easley, Hvidkjaer, and O'Hara (2005), for example, note that due to a strong negative correlation between (firm) size and PIN, independent sorts into size and PIN portfolios provide too few firms in the large-sized, high-PIN, and small-sized, low-PIN cells.
FPE bias may also lead to confounding conclusions. If both firm size and liquidity are highly correlated with trading frequency, results in Easley, Hvidkjaer, and O'Hara (2002, 2005), Aslan, Easley, Hvidkjaer, and O'Hara (2007), and Duarte and Young (2009) may be

[Fig. 3: plot panels omitted. Panels A and B plot $\widehat{PIN}_{bias}$ and the PIN upper bounds against $T_{\max}$; Panels C and D plot the PIN estimates against ln(Trade). Fitted line in Panel C: $\widehat{PIN}_I = -0.0399\,\ln(Trade) + 0.3672$, $R^2 = 0.6566$. Fitted line in Panel D: $\widehat{PIN}_A = -0.0149\,\ln(Trade) + 0.2414$, $R^2 = 0.1487$.]

Fig. 3. PIN estimates with $L_A(\theta|T)$ and $L_I(\theta|T)$ against the daily number of trades: (A) $\widehat{PIN}_{bias}$ vs. $T_{\max}$, (B) the two upper-bound functions $\overline{PIN}_I(T_{\max})$ vs. $T_{\max}$, (C) $\widehat{PIN}_I$ vs. ln(Trade), and (D) $\widehat{PIN}_A$ vs. ln(Trade). Panel A shows that the absolute value of $\widehat{PIN}_{bias} \equiv \widehat{PIN}_I - \widehat{PIN}_A$ increases with the maximum daily number of trades, denoted by $T_{\max}$, for the 2007 sample. When $T_{\max}$ is less than approximately 1,000, the difference between the two estimates is negligible. Namely, the deviations of PIN estimates from the actual measure may emerge for firms with frequent trades. Further, assuming $\varepsilon_b = \varepsilon_s$ and certain simplifying conditions, we derive the approximate upper bound function of PIN estimates with a given $T_{\max}$, denoted by $\overline{PIN}_I(T_{\max})$, for the reformulated $L_I(\theta|T)$ with $\varepsilon_b = \varepsilon_s$, proposed by Easley, Engle, O'Hara, and Wu (2001). Similarly, we derive the corresponding upper bound for the original $L_I(\theta|T)$ with $\varepsilon_b = \varepsilon_s$. The two upper bound functions are plotted in Panel B. In Panel B, if the empirical upper bound is 0.4, the FPE bias may be significant for stocks with $T_{\max}$ greater than 970 or 3,350, depending on the adopted likelihood function. Moreover, Panels C and D show that the $R^2$ between $\widehat{PIN}_I$ and the natural log of the mean daily number of trades, denoted as ln(Trade), is 0.6566, whereas between $\widehat{PIN}_A$ and ln(Trade) it is 0.1487. In sum, Panels A through D show that FPE bias may overstate the correlation between the PIN estimates and measures sensitive to trading frequency, such as the daily number of trades. This result implies that the FPE bias is not merely random but rather is systematic for stocks with large numbers of trades.

contaminated. Specifically, Easley, Hvidkjaer, and O'Hara (2002, 2005) show that PIN serves as an explanatory variable for cross-sectional security returns, and Duarte and Young (2009) further use an extended PIN model to demonstrate that the explanatory power of PIN for security returns may be due to liquidity concerns unrelated to information asymmetry. Duarte and Young (2009) also propose a computing procedure for the Poisson distribution, which nevertheless shows another instance of FPE, in this case underflow. With underflow, the maximum likelihood estimation process tends to yield a numerical likelihood value of zero when the number of trades is large, because floating-point arithmetic cannot represent very small positive fractions.
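The underflow described above is easy to reproduce. The following minimal sketch (variable names are illustrative) evaluates a Poisson probability for a large arrival rate. The factor $e^{-\lambda}$ underflows to zero in double precision, whereas the same probability computed on the log scale is an unremarkable number.

```python
import math

lam = 1000.0  # arrival rate of the order seen for active stocks
k = 1000      # observed number of trades

# Direct evaluation: exp(-1000) is below the smallest positive double,
# so the leading factor of the Poisson probability is returned as 0.0.
leading_factor = math.exp(-lam)

# Log-scale evaluation of the same Poisson probability:
# log P(k) = -lam + k*log(lam) - log(k!)
log_prob = -lam + k * math.log(lam) - math.lgamma(k + 1)

print(leading_factor)  # 0.0
print(log_prob)        # a moderate negative number
```

Any likelihood built by multiplying such factors directly collapses to zero, which is why working with logarithms (or factoring out the dominant exponent) matters for active stocks.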

Our work does more than demonstrate the significance of FPE bias; we also present measures for mitigating the bias. Our remedy may be applied to most extended PIN models (e.g., Grammig, Schiereck, and Theissen, 2001; Duarte and Young, 2009). Although some software packages such as Mathematica and Maple allow researchers to use high precision to reduce FPE at the cost of more CPU time when necessary, our findings should remind users of other packages such as SAS, GAUSS, MATLAB, and FORTRAN of the significance of the inherent bias and provide a simple remedy for estimating PIN. Basically, the results indicate that the underestimation of PIN is caused primarily by a significant downward bias in the arrival rate of informed trades from the inaccurate likelihood expression, which is subject to the floating-point exception.
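One generic way to avoid both overflow and underflow in such a likelihood is to work with the logarithm of each mixture component and factor out the largest one (the "log-sum-exp" device). The sketch below applies this to the standard three-regime daily likelihood of Easley, Kiefer, O'Hara, and Paperman (1996). It is an illustration under stated assumptions and is not claimed to be identical to the expression $L_A(\theta|T)$ used in this study. The function names are hypothetical, and the parameters are assumed to satisfy $0 < \alpha < 1$, $0 < \delta < 1$, and positive arrival rates.

```python
import math


def log_poisson(k: int, lam: float) -> float:
    """Log of the Poisson probability of k arrivals with rate lam > 0."""
    return -lam + k * math.log(lam) - math.lgamma(k + 1)


def daily_log_likelihood(buys: int, sells: int, alpha: float, delta: float,
                         mu: float, eps_b: float, eps_s: float) -> float:
    """Log-likelihood of one day's (buys, sells) under the three-regime mixture."""
    log_terms = [
        # No information event.
        math.log(1.0 - alpha) + log_poisson(buys, eps_b) + log_poisson(sells, eps_s),
        # Bad news: informed traders sell.
        math.log(alpha * delta) + log_poisson(buys, eps_b) + log_poisson(sells, mu + eps_s),
        # Good news: informed traders buy.
        math.log(alpha * (1.0 - delta)) + log_poisson(buys, mu + eps_b) + log_poisson(sells, eps_s),
    ]
    # Factor out the largest term so that every exp() argument is <= 0.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


# Trade counts in the thousands pose no difficulty on the log scale.
print(daily_log_likelihood(3000, 2500, alpha=0.4, delta=0.5,
                           mu=500.0, eps_b=2700.0, eps_s=2400.0))
```

Summing `daily_log_likelihood` over days gives a sample log-likelihood that a numerical optimizer can maximize without restricting the feasible parameter region.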

Appendix A. Floating-point exception narrows set of basic feasible solutions

In computing $e^x e^y e^z$, permutations of $e^x$, $e^y$, and $e^z$ such as $e^x e^y e^z$ and $e^z e^y e^x$ may differ in sensitivity to the floating-point exception. For instance, suppose that $e^{710}$ leads to an overflow, and let $x = y = 400$ and $z = -400$. Then the expression $e^x e^y e^z$ is subject to the FPE, given $e^x e^y = e^{800}$. Given $e^z e^x = e^0 = 1$ and $e^0 e^y = e^{400}$, however, computing $e^z e^x e^y$ is not subject to the FPE. Therefore, we focus on the order such as $\exp(-\mu)\, x_b^{B_i - M_i}\, x_s^{-M_i}$ in the analysis of FPE for $L_I(\theta|T)$, and require that $-\mu$, $(B_i - M_i)\log(x_b)$, $-M_i\log(x_s)$, $-\mu + (B_i - M_i)\log(x_b)$, and $-\mu + (B_i - M_i)\log(x_b) - M_i\log(x_s)$ all be less than $E$ to avoid the FPE.
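The ordering example can be reproduced directly in double precision. In the sketch below (variable names are illustrative), the left-to-right product passes through $e^{800}$ and is returned as infinity, while the reordered product stays finite. Note that how an overflow is reported depends on the language and operation: in Python, an overflowing multiplication silently yields `inf`, whereas `math.exp` raises `OverflowError`.

```python
import math

x, y, z = 400.0, 400.0, -400.0

# Left to right: e^x * e^y = e^800 exceeds the double-precision range,
# so the intermediate product becomes inf and stays inf.
overflowing = math.exp(x) * math.exp(y) * math.exp(z)

# Reordered: e^z * e^x is about 1, so every intermediate value is representable.
safe = math.exp(z) * math.exp(x) * math.exp(y)

print(overflowing)                          # inf
print(math.isclose(safe, math.exp(400.0)))  # True

# A single exponential with argument 710 cannot be represented at all.
try:
    math.exp(710.0)
    exp_710_overflows = False
except OverflowError:
    exp_710_overflows = True
print(exp_710_overflows)                    # True
```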
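Accordingly, we obtain

$$
\begin{aligned}
BFS_{L_I}(T,E) &\equiv \left\{ \theta \in BFS \;\middle|\;
\begin{array}{l}
-\mu < E; \\
(B_i - M_i)\log(x_b) < E;\quad -M_i\log(x_s) < E; \\
-\mu + (B_i - M_i)\log(x_b) < E; \\
-\mu + (B_i - M_i)\log(x_b) - M_i\log(x_s) < E; \\
-M_i\log(x_b) < E;\quad (S_i - M_i)\log(x_s) < E; \\
-\mu - M_i\log(x_b) < E; \\
-\mu - M_i\log(x_b) + (S_i - M_i)\log(x_s) < E; \text{ and} \\
(B_i - M_i)\log(x_b) + (S_i - M_i)\log(x_s) < E; \\
\forall i = 1,2,3,\ldots,I
\end{array}
\right\} \\
&= \left\{ \theta \in BFS \;\middle|\;
\begin{array}{l}
-M_i \min\{\log(x_b),\ \log(x_s)\} < E; \\
-\min\left\{ \mu - B_i\log(x_b);\ \mu - S_i\log(x_s);\ \text{and } -B_i\log(x_b) - S_i\log(x_s) \right\} - M_i[\log(x_b) + \log(x_s)] < E; \\
\forall i = 1,2,3,\ldots,I
\end{array}
\right\},
\end{aligned}
\tag{A1}
$$

where $E \equiv \min\{e > 0 \mid \text{input value } e \text{ leads to overflows for the exponential function } \exp(\cdot) \text{ in the computing process}\} \approx 710$.
Moreover, eliminating the terms $\mu$, $B_i\log(x_b)$, and $S_i\log(x_s)$ from $BFS_{L_I}(T,E)$ in (A1), we obtain

$$
\begin{aligned}
BFS_{L_I}(T,E) \supseteq BFS^{L}_{L_I}(T) &\equiv \left\{ \theta \in BFS \mid -M_i[\log(x_b) + \log(x_s)] < E,\ \forall i = 1,2,3,\ldots,I \right\} \\
&= \left\{ \theta \in BFS \mid M_{\max}\,|\log(x_b) + \log(x_s)| < E \right\},
\end{aligned}
\tag{A2}
$$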

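and, eliminating the last inequality from $BFS_{L_I}(T,E)$, we obtain:

$$
\begin{aligned}
BFS_{L_I}(T,E) \subseteq BFS^{U}_{L_I}(T) &\equiv \left\{ \theta \in BFS \mid -M_i \min\{\log(x_b),\ \log(x_s)\} < E,\ \forall i = 1,2,3,\ldots,I \right\} \\
&= \left\{ \theta \in BFS \mid M_{\max}\max\{|\log(x_b)|,\ |\log(x_s)|\} < E \right\},
\end{aligned}
\tag{A3}
$$

where $M_{\max} = \max\{M_i,\ i = 1,2,3,\ldots,I\}$.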
With (A2) and (A3), we can conclude that a large number of trades will result in a downward-biased probability of informed trading if $L_I(\theta|T)$ is adopted. When $M_{\max}$ is large, $|\log(x_b)|$, $|\log(x_s)|$, and $|\log(x_b) + \log(x_s)|$ must all be close to zero for the inequalities embedded in $BFS^{L}_{L_I}(T)$ and $BFS^{U}_{L_I}(T)$ to hold. Namely, $\mu$ is small relative to $\varepsilon_b$ and $\varepsilon_s$ in $BFS^{L}_{L_I}(T)$ or $BFS^{U}_{L_I}(T)$, and the estimated PIN may be downward-biased in both regions. Moreover, if $|\log(x_b)|$, $|\log(x_s)|$, and $|\log(x_b) + \log(x_s)|$ are close to zero, then $BFS^{L}_{L_I}(T) \approx BFS_{L_I}(T,E) \approx BFS^{U}_{L_I}(T)$. The estimated PIN may also be downward-biased in $BFS_{L_I}(T,E)$ for a stock with a large number of trades.
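To make the last step concrete, the sketch below turns the inequalities in (A2) and (A3) into explicit ceilings on $\mu/\varepsilon_b$. It assumes $\varepsilon_b = \varepsilon_s$ and $x_b = x_s = \varepsilon_b/(\mu + \varepsilon_b)$, so that $|\log(x_b)| = \log(1 + \mu/\varepsilon_b)$. This definition of $x_b$ comes from the reformulated likelihood and is an assumption here, since it is not restated in this appendix. The function name and the guard against overflow are hypothetical additions.

```python
import math

E = 710.0  # assumed overflow threshold of exp() in double precision


def mu_over_eps_ceilings(m_max: float, e: float = E) -> tuple[float, float]:
    """Ceilings on mu/eps_b implied by (A2) and (A3) when eps_b == eps_s.

    (A2): 2 * m_max * log(1 + mu/eps_b) < e  ->  mu/eps_b < exp(e / (2*m_max)) - 1
    (A3):     m_max * log(1 + mu/eps_b) < e  ->  mu/eps_b < exp(e / m_max) - 1
    """
    def ceiling(c: float) -> float:
        # For very large c the ceiling is effectively unbounded.
        return math.inf if c > 700.0 else math.expm1(c)

    return ceiling(e / (2.0 * m_max)), ceiling(e / m_max)


for m_max in (500, 1000, 3000):
    lower_region, upper_region = mu_over_eps_ceilings(m_max)
    print(m_max, round(lower_region, 4), round(upper_region, 4))
```

Both ceilings shrink toward zero as `m_max` grows, which is the sense in which $\mu$ is forced to be small relative to $\varepsilon_b$ and $\varepsilon_s$ for actively traded stocks.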

References

Aslan, H., Easley, D., Hvidkjaer, S., O'Hara, M., 2007. Firm characteristics and informed trading: implications for asset pricing. Working Paper, University of Houston.
Boehmer, E., Grammig, J., Theissen, E., 2007. Estimating the probability of informed trading - does trade misclassification matter? Journal of Financial Markets 10, 26–47.
Duarte, J., Young, L., 2009. Why is PIN priced? Journal of Financial Economics 91, 119–138.
Easley, D., Engle, R.F., O'Hara, M., Wu, L., 2001. Time-varying arrival rates of informed and uninformed trades. Working Paper, Cornell University.
Easley, D., Hvidkjaer, S., O'Hara, M., 2002. Is information risk a determinant of asset returns? Journal of Finance 57, 2185–2221.
Easley, D., Hvidkjaer, S., O'Hara, M., 2005. Factoring information into returns. Working Paper, Cornell University.
Easley, D., Kiefer, N., O'Hara, M., Paperman, J., 1996. Liquidity, information, and infrequently traded stocks. Journal of Finance 51, 1405–1436.
Easley, D., O'Hara, M., Paperman, J., 1998. Financial analysts and information-based trade. Journal of Financial Markets 1 (2), 175–201.
Elden, L., Wittmeyer-Koch, L., 1990. Numerical Analysis: An Introduction. Academic Press, Boston.
Grammig, J., Schiereck, D., Theissen, E., 2001. Knowing me, knowing you: trader anonymity and informed trading in parallel markets. Journal of Financial Markets 4, 385–412.
Hauser, J.R., 1996. Handling floating-point exceptions in numerical programs. ACM Transactions on Programming Languages and Systems 18 (2), 139–174.
Lee, C., Ready, M., 1991. Inferring trade direction from intraday data. Journal of Finance 46, 733–746.
Yan, Y., Zhang, S., 2006. An improved estimation method and empirical properties of the probability of informed trading. Working Paper, University of Pennsylvania.