Team number: 6906
Problem Chosen: B
Advisor Name: Deng Weijun
School of Electrical and Mechanical, Changsha, Hunan, P.R. China
Changsha, Hunan 410083
Phone: 8613517486903
Fax: 86-0731-88660172
Email: yeatszone@gmail.com
Home Phone: 86-0731-88660731
Team Members: Wang Jiaqi, Luo Hao, Yang Li
Abstract
This paper determines the geographical profile of a suspected serial criminal from
the locations of past crimes and predicts the possible locations of the next crime from
the times and locations of past crime scenes. To answer these questions we consider
the distance from the offender's residence to the target, the features of the crime
sites, and related factors. We then establish the following mathematical models to
solve the problem.
Scheme 1, Hit score model
First, we discuss the alternative distance decay functions, which describe the
relationship between a serial criminal's target preference and the distance he or she
must travel from the residence to the target, following existing approaches such as
those of Rossmo and Canter and the linear function. After comparing the strengths
and weaknesses of the existing functions, and taking practice into account, we chose
the truncated negative exponential. Then we introduce a hit score function (9).
Combining these, we model the function S(y) (10). Finally, we illustrate the
implementation of the scheme using collected data (in 2.4).
Scheme 2, Probability model
From a different angle, we take the probability that a crime occurs at each location
as the starting point of the forecast and regard the predicted anchor point as a
random variable. After a rigorous mathematical derivation, each point is assigned a
probability of crime; the points with relatively high probability form the
geographical profile. We also illustrate the implementation of this scheme using
collected data.
Team #6906
page 2 of 26
Combination
First, model 1 assigns a hit score to every point with a criminal record; the points
with high scores form a possible anchor region, recorded as region 1. Likewise,
model 2 assigns a probability to every point with a criminal record; the points with
high probability form another possible anchor region, recorded as region 2.
Second, a point with a criminal record that lies in both region 1 and region 2 is a
most likely anchor point, and these points together form the anchor region.
Third, according to the circumference theory of criminal psychology, we draw a
circle that encloses these most likely anchor points. The points within the circle are
the predicted locations where the next crime is most likely to be committed. To
improve forecast accuracy, we also consider the characteristics of the sites and
introduce a scoring function for location-specific characteristics. With the scoring
function we select the location features that are likely to lead to crime, then identify
the sites within the circle that have these features; these sites need special attention.
Key words:
Circumference theory
The conflict between the need for a large amount of location data to locate the
criminal and the need to stop the criminal as early as possible.
The conflict between the need to search every possible place to find the criminal
and the need to reduce the cost.
The criminal may not have an anchor point.
How to account for the influence of different geographic features on the criminal.
Content
Abstract
The executive summary
I. Introduction:
1.1 Geographic Profiling
1.2 Spatial Event Prediction
1.3 Object
II. Scheme 1 Hit score model
2.1 Assumptions:
2.2 Notations and Definitions:
2.3 Geographic profiling methods
2.3.1 Decay functions
2.3.2 Hit score function
2.3.3 The foundation of the model
2.4 Example: John Duffy, the Railway Killer
2.5 Evaluation of the model
2.5.1 Shortcomings:
2.5.2 Advantages:
III. Scheme 2 Probability model
3.1 Model overview
3.2 Assumptions
3.3 The building of the model
3.4 Example: burglary in Los Angeles
3.5 Evaluation of the model
3.5.1 Strengths of this Framework
3.5.2 Weaknesses
IV. Combination and prediction
4.1 Overview
4.2 Combination
4.3 Prediction
4.4 The improvement of the prediction
4.4.1 Model Overview
4.4.2 Building the model
4.4.3 The realization of the model
V. References
I. Introduction:
Although tactical crime analysis has been continually improving investigation efforts,
serial crimes still pose a great challenge to police officers and investigators alike.
These cases often go unsolved because arduous investigations are required. The
primary reason for the investigation complexity is that the offender is often a stranger
to the victim. Devoid of any tie between the victim and the offender, detectives are
left with no substantial leads, thus forcing them to consider large populations of
potential suspects. Such large suspect populations strain police resources, lead to
resource allocation problems, and lower the likelihood of apprehending the
perpetrator.
1.3 Object
The primary goal of this paper is to define a geographic profiling methodology that
improves on existing methodologies by combining known journey-to-crime theories
and methodologies with criminal forecasting theories and methodologies.
$$f(d) = \begin{cases} \dfrac{k}{d^{h}} & \text{if } d \ge B, \\[1ex] \dfrac{k B^{g-h}}{(2B-d)^{g}} & \text{if } d < B. \end{cases} \tag{1}$$
Figure 1
We remark that Rossmo also considers the possibility of forming hit scores by
multiplication; see (Rossmo, 2000, p. 200)
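2) The method described in Canter, Coffey, Huntley, and Missen (2000) is to use a
Euclidean distance, and to choose either a decay function in the form
$$f(d) = A e^{-\beta d} \tag{2}$$
or one in the form
$$f(d) = \begin{cases} 0 & \text{if } d < A, \\ 1 & \text{if } A \le d < B, \\ C e^{-\beta d} & \text{if } d \ge B. \end{cases} \tag{3}$$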
Figure 2
distance and gives the user a number of choices for the decay function, including
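Linear:
$$f(d) = A + Bd, \tag{4}$$
Negative exponential:
$$f(d) = A e^{-\beta d}, \tag{5}$$
Normal:
$$f(d) = \frac{A}{\sqrt{2\pi S^{2}}} \exp\left[-\frac{(d - \bar{d})^{2}}{2S^{2}}\right], \tag{6}$$
Lognormal:
$$f(d) = \frac{A}{d\sqrt{2\pi S^{2}}} \exp\left[-\frac{(\ln d - \bar{d})^{2}}{2S^{2}}\right], \tag{7}$$
and
Truncated negative exponential:
$$f(d) = \begin{cases} Bd & \text{if } d < d_p, \\ A e^{-\beta d} & \text{if } d \ge d_p. \end{cases} \tag{8}$$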
CrimeStat also allows the user to use empirical data to create a different decay
function matching a set of provided data, as well as the use of indirect distances.
The linear function, the simplest type of distance model, postulates that the
likelihood of committing a crime at any particular location declines by a constant
amount with distance from the offender's home. It is highest near the offender's
home but drops off by a constant amount for each unit of distance until it falls to
zero. The negative exponential also peaks at the offender's home, but it declines by
a constant proportion per unit of distance and so approaches zero without reaching it.
The normal distribution and the lognormal function assume the peak likelihood is
at some optimal distance from the offender's home base. Thus, the function rises
to that distance and then declines. For the normal distribution, the rate of increase
prior to the optimal distance and the rate of decrease from that distance are
symmetrical; the lognormal function is skewed, with a longer tail at large distances.
The truncated negative exponential is a joined function made up of two distinct
mathematical functions: the linear and the negative exponential. This function is
the closest approximation to the Rossmo model; however, it differs in several
mathematical properties. First, the near-home-base function is linear rather than
non-linear. It assumes a simple increase in travel likelihood with distance from the
home base, up to the edge of the safety zone. Second, the distance decay part of the
function is a negative exponential rather than an inverse distance function;
consequently, it is more stable when distances are very close to zero (e.g., for a
crime where there is no near-home-base offset).
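To make the comparison of the decay functions concrete, the following is a minimal Python sketch of equations (4) to (8). The function and parameter names are illustrative assumptions, not part of the original paper; in particular the exponent of the exponential is written `beta`, and `mean_d` stands for the mean that appears in the normal and lognormal forms.

```python
import math

# Illustrative sketch of the decay functions (4)-(8).
# All names (beta, mean_d, d_p, ...) are assumed for this example.

def linear(d, A, B):
    """Equation (4): likelihood changes by the constant amount B per unit distance."""
    return A + B * d

def negative_exponential(d, A, beta):
    """Equation (5): likelihood decays by a constant proportion per unit distance."""
    return A * math.exp(-beta * d)

def normal(d, A, mean_d, S):
    """Equation (6): symmetric peak at the distance mean_d with spread S."""
    return A / math.sqrt(2 * math.pi * S**2) * math.exp(-(d - mean_d) ** 2 / (2 * S**2))

def lognormal(d, A, mean_d, S):
    """Equation (7): skewed peak; only defined for d > 0."""
    return A / (d * math.sqrt(2 * math.pi * S**2)) * math.exp(
        -(math.log(d) - mean_d) ** 2 / (2 * S**2)
    )

def truncated_negative_exponential(d, A, B, beta, d_p):
    """Equation (8): linear rise up to d_p, negative exponential decay from d_p on."""
    if d < d_p:
        return B * d
    return A * math.exp(-beta * d)
```

Evaluating these functions over a range of distances with trial parameters is a quick way to see the buffer-zone shape of the truncated negative exponential next to the monotone linear and negative exponential forms.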
In practice, an offender usually keeps a buffer zone around the home base when
choosing a crime site, and the resulting preference resembles the truncated negative
exponential. Hence, we use the truncated negative exponential.
Following the existing algorithms, we then model a hit score function in the form
of (9).
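Combining these, we then obtain the expression
$$S(y) = \sum_{i=1}^{n} f(d(x_i, y)) = \begin{cases} \sum_{i=1}^{n} Bd & \text{if } d < d_p, \\ \sum_{i=1}^{n} A e^{-\beta d} & \text{if } d \ge d_p, \end{cases} \tag{10}$$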
where d is the distance from the home base, B is the slope of the linear function,
and for the negative exponential function A is a coefficient and $\beta$ is the
exponent. Since the negative exponential only starts at a particular distance, $d_p$,
A is assumed to be the intercept if the Y-axis were transposed to that distance.
Similarly, the slope of the linear function is estimated from the peak distance, $d_p$,
by a peak likelihood function.
Regions with a high hit score are considered to be more likely to contain the
offender's anchor point than regions with a low hit score.
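The hit score (10) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the distance is taken to be Euclidean, the exponent is named `beta`, and the grid helper `hit_score_surface` is a hypothetical convenience that does not appear in the paper.

```python
import math

def truncated_negative_exponential(d, A, B, beta, d_p):
    """Decay function (8): linear inside the buffer distance d_p, exponential beyond it."""
    if d < d_p:
        return B * d
    return A * math.exp(-beta * d)

def hit_score(y, crime_sites, A, B, beta, d_p):
    """Hit score (10): sum of the decay function over the distances from y to each crime site.

    Assumption: Euclidean distance between y and each site.
    """
    return sum(
        truncated_negative_exponential(math.dist(x, y), A, B, beta, d_p)
        for x in crime_sites
    )

def hit_score_surface(crime_sites, xs, ys, **params):
    """Hypothetical helper: evaluate the hit score on a grid, one row per y value."""
    return [[hit_score((gx, gy), crime_sites, **params) for gx in xs] for gy in ys]
```

For example, `hit_score_surface(sites, xs, ys, A=1.0, B=0.5, beta=0.5, d_p=2.0)` returns a nested list that can be contoured or thresholded to outline the regions with high hit scores.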
Alternatively, the probability surface can be viewed from a top-down perspective and
Figure.6
Some types of crime, on the other hand, are very difficult to fit. Figure 7
shows the distribution of bank robberies. Partly because there were a limited
number of cases (N=176) and partly because it is a complex pattern, the truncated
negative exponential gave the best fit, but not a particularly good one. As can be
seen, the linear (near home) function underestimates some of the near-distance
likelihoods while the negative exponential drops off too quickly; in fact, to make
this function even plausible, the regression was run only up to 21 miles
(otherwise, it underestimated even more).
Figure.6
Figure.7
2.5.2 Advantages:
1) The decay function we chose is reasonable.
In practice, an offender usually keeps a buffer zone around the home base when
choosing a crime site, and the resulting preference resembles the truncated negative
exponential. Hence, we use the truncated negative exponential.
2) The model provides a search strategy for law enforcement.
By examining which type of function best fits a certain type of crime, police can
target their search efforts more efficiently. The model is relatively easy to
implement and is practical.
3) The mathematical formulation is stable.
Unlike the inverse distance function in the Rossmo model, the truncated negative
exponential (8) does not have problems with distances that are close to 0.
4) The model provides a search strategy for identifying an offender.
It is a useful tool for law enforcement officers, particularly as they frame a search
for a serial offender.
III. Scheme 2 Probability model
3.1 Model overview
A deeper understanding of the probability of crime locations would provide valuable
insight into the geographic profiling problem. By creating a framework that
incorporates the contributions of location and time, we can estimate the probability
of a crime being committed at each location. The model achieves an important
objective: from these probabilities we can generate a geographical profile.
3.2 Assumptions
In order to streamline our model we have made several key assumptions:
1) The crime events are independent and all crimes were committed by the same
person.
2) The criminals are uniformly distributed amongst all residences within the city,
and furthermore all houses are equally attractive.
(11)
where $R_t$ is white noise, i.e. $\langle R_t \rangle = 0$, and D is the diffusion parameter. The drift
term can be neglected in the case of unbiased motion or could be used to describe
more complex criminal behaviors.
For instance, it has been suggested that criminals may modify their movements
towards regions of higher attractiveness when selecting their targets . This type of
behavior could be incorporated into (11) through a gradient term of the form
0 ( x z )
(12)
(13)
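Integrating (12)-(13) in time, the probability density of where the crime is committed
is then determined by
$$P(x \mid z) = A(x)\,\bar{\rho}(x \mid z), \tag{14}$$
where $\bar{\rho}(x \mid z) = \int_0^\infty \rho(x, t \mid z)\,dt$ satisfies
$$\nabla \cdot (D \nabla \bar{\rho}) - \nabla \cdot (\mu(x)\,\bar{\rho}) - A(x)\,\bar{\rho} = -\delta(x - z). \tag{15}$$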
We will refer to this equation as the "forward" equation. Given the prior distribution
of criminal anchor points, P(z), the geographic profiling distribution can then be
determined using Bayes' Theorem,
$$P(z \mid x) = \frac{P(x \mid z) P(z)}{\int_{\mathbb{R}^2} P(x \mid z) P(z)\,dz} = \frac{A(x)\,\bar{\rho}(x \mid z) P(z)}{\int_{\mathbb{R}^2} A(x)\,\bar{\rho}(x \mid z) P(z)\,dz} = \frac{\bar{\rho}(x \mid z) P(z)}{\int_{\mathbb{R}^2} \bar{\rho}(x \mid z) P(z)\,dz}. \tag{16}$$
Since $\bar{\rho}(x \mid z)$ is the Green's function corresponding to the linear operator on the
left side of (15), for fixed x and varying z the function $f(z) = \bar{\rho}(x \mid z)$ solves the
backward or adjoint equation,
$$\nabla \cdot (D \nabla f) + \mu(z) \cdot \nabla f - A(z) f = -\delta(z - x). \tag{17}$$
where $\nabla$ is now taken with respect to the variable z. Thus the geographic profiling
density can be efficiently computed in practice by solving the backward equation
given by (17), where the point mass on the right hand side is located at the scene of
the crime, and then multiplying by the prior distribution of anchor points and
normalizing. We note that the drift term changes sign going from the forward
equation (15) to the backward equation (17). This has practical implications, for if
criminals move up gradients of attractiveness then police investigations starting
from the scene of the crime should move down gradients of attractiveness.
A similar procedure can be carried out for multiple crimes. Based on
assumption 1), the geographic profiling density for multiple events is given by
$$P(z \mid x_1, \dots, x_N) = \frac{P(x_1 \mid z) \cdots P(x_N \mid z) P(z)}{\int_{\mathbb{R}^2} P(x_1 \mid z) \cdots P(x_N \mid z) P(z)\,dz} = \frac{\prod_{i=1}^{N} f_i(z)\,P(z)}{\int_{\mathbb{R}^2} \prod_{i=1}^{N} f_i(z)\,P(z)\,dz}, \tag{18}$$
where $f_i(z)$ solves
$$\nabla \cdot (D \nabla f_i) + \mu(z) \cdot \nabla f_i - A(z) f_i = -\delta(z - x_i). \tag{19}$$
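Once each backward solution has been computed on a grid, the normalization in (18) reduces to a cell-wise product followed by a sum. The sketch below shows only that last step in plain Python. It assumes the grids of $f_i$ values and of the prior are already available as nested lists of equal shape; solving the backward equation (19) itself is not attempted here, and the name `profile_density` and the `cell_area` parameter are illustrative.

```python
def profile_density(f_list, prior, cell_area=1.0):
    """Discrete version of (18): normalize prod_i f_i(z) * P(z) over a grid.

    f_list    -- list of 2D grids (nested lists), one per crime, holding f_i(z)
    prior     -- 2D grid holding the prior P(z) on the same cells
    cell_area -- area of one grid cell, used to approximate the integral
    """
    rows, cols = len(prior), len(prior[0])
    # Start from the prior and multiply in each f_i cell by cell.
    unnormalized = [[prior[r][c] for c in range(cols)] for r in range(rows)]
    for f_i in f_list:
        for r in range(rows):
            for c in range(cols):
                unnormalized[r][c] *= f_i[r][c]
    # Approximate the integral over the region by a Riemann sum.
    total = sum(map(sum, unnormalized)) * cell_area
    if total <= 0:
        raise ValueError("posterior mass is zero; check the f_i grids and the prior")
    return [[value / total for value in row] for row in unnormalized]
```

With many crimes the product of small values can underflow, so a practical implementation might accumulate logarithms instead; that refinement is left out to keep the sketch close to (18).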
Also, a buffer zone could be incorporated into the forward equation through
$$P(x \mid z) = 1_{\{|x - z| > r\}}\,A(x)\,\bar{\rho}(x \mid z). \tag{20}$$
(21)
Here the idea is that criminals may leave a buffer zone of radius r around their anchor
point, within which they do not commit any crimes. The backward equation is then
given by,
*( D ) ( z)* f 1{|x z|r} A( x) ( x z).
(22)
{| x z| r }
(23)
(24)
as discussed in [14]. In the models we also assume that criminals diffuse without drift
($\mu = 0$). Based on assumption 2, both the attractiveness field
$A(x) = A_0 H(x)$
max log( P( zi | xi ))
xi
i 1
(25)
$$P(z_i \mid x_i) = \frac{f_i(z_i) P(z_i)}{\int f_i(z) P(z)\,dz} \tag{26}$$
$$P(z_i \mid x_i) = \frac{f_i(z_i) P(z_i)}{\int f_i(z) P(z)\,dz} \tag{27}$$
In Table 1 we list the maximized log likelihood values for the model. As expected,
taking A to be homogeneous severely limits the accuracy of the model since a great
deal of probability density is distributed in areas where there are no houses. Thus it is
important to model criminal target selection accurately, instead of using an ad
hoc approach that only incorporates geographic inhomogeneities through P(z).
Figure 8: Housing density for the 18 km by 18 km region of the San Fernando Valley used in this
study. Center regions void of housing correspond to a commercial area and a park and lower
regions void of houses correspond to mountains.
Figure 10: Geographic profiles (plotted on a logarithmic scale) for two crimes (top) and one crime
with a buffer zone (bottom) using model 3 with best fit parameters.
Figure 11: Histogram of distances to crime for 2004 data (top) and simulated data
using model with best fit parameters (bottom).
All of the assumptions on criminal behavior are made in the open. They can be
challenged, tested, discussed and compared.
3.5.2 Weaknesses
4.2 Combination
Step 1. Obtain the anchor points from the result of the first scheme.
According to the first scheme, we can produce a three-dimensional surface once
the hit score of every point on the map is calculated. This surface can be
represented by an isopleth or fishnet map with different scores on the Z-axis
representing probability density (Garson & Biggs, 1992, pp. 48-52). Such maps, a
form of virtual reality (in the term's original sense), may be generated through
computer-aided mathematical visualization techniques. We choose, based on
experience, a constant hit score as the warning threshold; the points whose hit
score reaches this threshold form the geographical profile we want.
Step 2. Obtain the anchor points from the result of the second scheme.
As in Step 1, we can also produce a three-dimensional surface once the
probability of every point on the map is calculated. We choose, based on
experience, a constant probability as the warning threshold; the points whose
probability reaches this threshold form the geographical profile we want.
Step 3. Combine the two sets of anchor points using intersection and union.
First, search the intersection of the results from the first scheme and the second
one. If the offender is not found there, then search the union of the results from the
two schemes.
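The three steps can be sketched with Python sets. This is a minimal illustration, assuming each scheme's output is a dictionary from grid points to values and that the two thresholds are supplied by the analyst; the function name and the choice to exclude already-searched points from the second pass are assumptions of this sketch, not part of the paper.

```python
def candidate_regions(scores, probabilities, score_threshold, probability_threshold):
    """Return (first_pass, second_pass) search regions from the two schemes.

    scores        -- dict mapping a point to its hit score (scheme 1)
    probabilities -- dict mapping a point to its probability (scheme 2)
    """
    # Steps 1 and 2: keep the points that reach each threshold.
    region_1 = {p for p, s in scores.items() if s >= score_threshold}
    region_2 = {p for p, q in probabilities.items() if q >= probability_threshold}
    # Step 3: search the intersection first ...
    first_pass = region_1 & region_2
    # ... then the rest of the union (assumption: skip points already searched).
    second_pass = (region_1 | region_2) - first_pass
    return first_pass, second_pass
```

Searching the intersection first keeps the initial search area small, while falling back to the union avoids missing an anchor point that only one of the two schemes flags.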
4.3 Prediction
Find the centroid of the zone obtained in Step 3.
We can find the centroid of the zone by using the function
$$z_{\text{centroid}} = \frac{1}{n} \sum_{i=1}^{n} x_i \tag{28}$$
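As a small worked illustration of (28), the centroid of a set of planar points is the coordinate-wise mean. The sketch below assumes the zone is given as a list of (x, y) pairs; the function name is illustrative.

```python
def centroid(points):
    """Equation (28): coordinate-wise mean of the points in the zone."""
    n = len(points)
    if n == 0:
        raise ValueError("the zone contains no points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    return (mean_x, mean_y)
```

For the three points (0, 0), (2, 0) and (1, 3) this gives (1.0, 1.0).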
4.4.1 Model Overview
A deeper understanding of the characteristics of crime sites would provide valuable
insight into anchor points. By scoring the characteristics of crime sites, we can
identify the characteristics of the location of anchor points.
Our objective is to find the smallest feature subset (of the initial feature set) that
accounts for the underlying pattern of criminal event occurrences (hot spots). This is
a model search problem we call feature selection. The selected feature subset is
called the key feature set, and the feature subspace defined by the key feature set is
called the key feature space.
$s_{ij}$ as follows
$$s_{ij} = \frac{1}{1 + \alpha d_{ij}}, \tag{29}$$
where $\alpha = 1/\bar{d}$ and $\bar{d}$ is the average inter-event distance, where distance refers to
differences in the $g_k$ value of an independent variable. Define the Gini index between
these two events as $\gamma_{ij}$. Then
$$I_g = \frac{2 \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \gamma_{ij}}{n(n-1)}. \tag{31}$$
The smaller the value of the $I_g$ index is, the higher the level of point-pattern
cohesiveness or the better the set of features that define the point pattern.
In general,
$$r_k = \frac{\max_{x_{ik},\,x_{jk} \in E_k} |x_{ik} - x_{jk}|}{\max_{x_{ik},\,x_{jk} \in P_k} |x_{ik} - x_{jk}|} \tag{32}$$
large sample of locations that are chosen uniformly over the study region. We call the
set of the feature values at the sample locations the prior feature data set. As the first
feature selection step, we calculate the ratio of the observed range to the full range of
each feature dimension to see whether there are any dimensions that do not exhibit
enough variation in the event feature data set. This ratio for feature $f_k$ is defined by
$$r_k = \frac{\max_{x_{ik},\,x_{jk} \in E_k} |x_{ik} - x_{jk}|}{\max_{x_{ik},\,x_{jk} \in P_k} |x_{ik} - x_{jk}|} \tag{33}$$
where $E_k$ and $P_k$ are the event and the prior feature data sets for feature $f_k$
(i.e., containing only the dimension $f_k$), respectively.
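Because the largest pairwise difference in a one-dimensional data set equals its maximum minus its minimum, the ratio $r_k$ can be computed without enumerating pairs. The following is a minimal sketch under that observation; the function name and the list-based inputs are illustrative assumptions.

```python
def range_ratio(event_values, prior_values):
    """Ratio r_k of the observed range (event data) to the full range (prior data).

    The maximum of |x_i - x_j| over all pairs equals max(values) - min(values).
    """
    event_range = max(event_values) - min(event_values)
    prior_range = max(prior_values) - min(prior_values)
    if prior_range == 0:
        raise ValueError("prior feature data set has no variation")
    return event_range / prior_range
```

For example, event values [2, 3, 5] against prior values [0, 10] give a ratio of 0.3.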
If the ratio $r_k$ is considered sufficiently small, we will not calculate the $I_g(E_k)$ score
for feature $f_k$. Otherwise, we calculate the adjusted $I_g$ for feature $f_k$, or the adjusted
$I_g^{(k)}$, defined as follows
$$\text{Adjust}\,I_g^{(k)} = \frac{I_g(E_k)}{I_g(P_k)} \tag{34}$$
where $I_g(E_k)$ and $I_g(P_k)$ are the $I_g$ values of the event feature data set $E_k$ and
the prior feature data set $P_k$, respectively. The rationale for this adjustment scheme
is that $I_g(P_k)$ indicates how much the prior distribution of $f_k$ deviates from the
uniform distribution. The smaller $I_g(P_k)$ is, or the further the prior distribution is
from the uniform distribution, the more $I_g(E_k)$ is adjusted.
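The selection rule around (34) can be sketched as follows. The text does not say what counts as a "sufficiently small" ratio, so the cutoff `min_ratio` and its default value are purely assumptions of this example, as is the function name.

```python
def adjusted_gini(gini_event, gini_prior, range_ratio, min_ratio=0.1):
    """Adjusted I_g from (34), or None when the range ratio is too small.

    gini_event  -- I_g computed on the event feature data set E_k
    gini_prior  -- I_g computed on the prior feature data set P_k
    range_ratio -- the ratio r_k for the same feature
    min_ratio   -- assumed cutoff below which r_k counts as sufficiently small
    """
    if range_ratio < min_ratio:
        # The feature is skipped: no I_g score is calculated for it.
        return None
    if gini_prior == 0:
        raise ValueError("prior Gini index is zero; the adjustment is undefined")
    return gini_event / gini_prior
```

Returning `None` for skipped features keeps them distinguishable from features that were scored, which makes it easier to rank only the scored ones afterwards.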
V. References
[1] Barton, G. (1989). Elements of Green's Functions, Waves, and Propagation: Potentials,
Diffusion, and Waves. Clarendon Press: Oxford.
[2] Brantingham, P. J. and Tita, G. (2008). Offender mobility and crime pattern formation from
first principles. Artificial Crime Analysis System, edited by Lin Liu and John Eck. IGI
Global: Hershey, PA.
[3] Briggs, W. L., Henson, V. E., and McCormick, S. F. (2000). A Multigrid Tutorial. SIAM.
[4] Estep, D. (2004). A short course on duality, adjoint operators, Green's functions, and a
posteriori error analysis. www.math.colostate.edu/estep/research/preprints/adjointcourse final.pdf
[5] Holcman, D., Marchewka, A., and Schuss, Z. (2005). Survival probability of diffusion with
trapping in cellular neurobiology. Physical Review E, 72(3), 031910.
[6] Johnson, S. D., Summers, L., and Pease, K. (2009). Offender as Forager? A Direct Test of the
Boost Account of Victimization. Journal of Quantitative Criminology, in press.
[7] Keats, A., Yee, E., and Lien, F.-S. (2007). Bayesian inference for source
determination with applications to a complex urban environment. Atmospheric Environment, 41,
465-479.
[8] O'Leary, M. (2009). The mathematics of geographic profiling. Preprint.
[9] Schuss, Z. (1980). Theory and Applications of Stochastic Differential Equations. Wiley Series
in Probability and Statistics: New York.
[10] Short, M. B., D'Orsogna, M. R., Pasour, V. B., Tita, G. E., Brantingham, P. J., Bertozzi, A. L.,
and Chayes, L. (2008). A Statistical Model of Criminal Behavior. M3AS, 18, 1249-1267.