Abstract. Tools that perform pattern recognition analysis of crimes, combining forecasting, clustering, and recommendations on real data (such as patrolling routes), are not fully integrated: their modules are developed separately, and a single workflow providing all the steps necessary for this analysis has not been reported. In this paper, we propose forecasting criminal activity in a particular region by using supervised classification; then using this information to automatically cluster and find important hotspots; and finally optimizing patrolling routes for personnel working in public security. The proposed forecasting model (CR-Ω+) is based on the family of Kora-Ω logical-combinatorial algorithms, operating on large data volumes from several heterogeneous sources using an inductive learning process. We perform two analyses, punctual prediction and tendency analysis, which show that it is possible to punctually predict one out of four crimes to be perpetrated (crime family, in a specific space and time), and the place of crime two out of three times, despite the noise in the dataset. The forecasted crimes are then clustered using a density-based clustering algorithm, and finally patrolling routes are crafted using an ant-colony optimization algorithm. For three different patrolling requirements, we were always able to find optimal routes in shorter time than commonly used random walk algorithms. We present a case study based on real crime data from the municipality of Cuautitlán Izcalli, in Mexico.
Keywords: Forecasting models for crime analysis, public security, patrolling routes optimization, ant-colony systems, spatio-temporal similarity function, pattern recognition, supervised classification, clustering
1. Introduction
Public security and crime fighting are among the most important social priorities in the great cities of the world. Despite the surprisingly large quantity of human and material resources that governments assign to this matter, there is still an evident need for alternative mechanisms that increase the effectiveness and efficiency of police forces [12]. One of the main variables limiting this effectiveness is the response time to crime events. In particular, immediate-reaction events show that the marginal improvements obtained in this matter are not enough to reduce the general criminal incidence of the zone, or to substantially modify the perception of insecurity among citizens [19].
A better perspective on this situation can be achieved if the problem is moved to the sphere of prevention instead of reaction. If public forces were able to anticipate when and where criminal activity of a specific kind might increase, a double benefit could be achieved. On the one hand, it would be possible to concentrate the resources and logistic activity necessary to fight that specific kind of criminal activity in the anticipated place and time. On the other hand, it would be possible to establish dynamically, and on solid foundations, several of the common parameters of everyday work in public security, such as the specific design of surveillance rounds, the distribution of forces in time and space, and, of course, the development of security operations, or even information and prevention campaigns through mass communication media [28].

Corresponding author: Hiram Calvo, Centro de Investigación en Computación, Instituto Politécnico Nacional, Av. Juan de Dios Bátiz s/n esq. Manuel Othón de Mendizábal, 07738, Mexico City, Mexico. E-mail: hcalvo@cic.ipn.mx.
1088-467X/17/$35.00 © 2017 IOS Press and the authors. All rights reserved
Galley Proof 14/04/2017; 11:14 File: ida883.tex; BOKCTP/xhs p. 2

Fig. 1. General architecture of the computerized system to support decision-making processes in public security.
Several systems have already been created to help crime analysts in their duties, for example STAC (Spatial and Temporal Analysis of Crime) [w1] and CrimeLink (PCI Precision Computing Intelligence) [w2]; the latter provides an Event-Time graph and a Pattern Analysis wheel. Other systems are CrimeView (Omega Group) [w3] and ArcGIS (Crime Analysis Extension), which provides hotspot analysis on ArcGIS 9, main centroid identification, and probable crime direction identification. More recently, A.T.A.C (Automated Tactical Analysis of Crime) [w4] identifies criminal patterns through data ordering; it provides analysis tools such as time-series-based prediction, Google Earth integration, mapping and density analysis. These systems are of great help in crime analysis and prevention; however, being mostly built for commercial purposes, they do not disclose their forecasting algorithms, making it difficult to adapt them to the needs of a particular region. Moreover, these systems are not designed to provide recommendations, such as patrolling route models, that guarantee adequate coverage of the identified hotspots.
In view of this, our work has a twofold objective: first, to study the spatial and temporal decisions made by criminals by identifying hotspots where criminal activity is concentrated; and second, once these activities are found and properly clustered, to provide a flexible schema, adapted to real needs and to the availability of resources, for designing patrolling routes that optimally cover these hotspots.
In the following section (Section 2) we focus on the forecasting of criminal activities; in Section 3 we tackle the problem of clustering and GIS-mapping these activities for the analyst. In Section 4 we cover the problem of patrolling route design. In Section 5 we present a set of experiments and results for each of the modules comprising this framework; finally, in Section 6 we draw our conclusions.
The forecasting, clustering and routing model reported here is a framework designed to prevent and react to crime. This framework is made up of several layers, as shown in Fig. 1.
The function of the first layer is to gather, standardize and analyze data from six established information sources. In terms of pattern recognition, layer one constitutes the supervision sample. The second layer contains several prediction algorithms (see Section 2). The third layer takes as input the set of predictions made by the algorithms of the previous layer, and identifies, clusters and maps important hotspots (see Section 3). The fourth layer generates recommendations for addressing the forecasted scenarios; in this paper we discuss, in particular, the patrolling recommendations (see Section 4).
Careful structuring of the input data is of utmost importance for the forecasting model to be efficient and to reach an acceptable level of precision [13]. Crime prediction and recommendation generation require several different information sources, all directly related to public security but not easily accessible. For this project, we have chosen six information sources arranged into four categories: information on (1) crimes committed, (2) citizens' reports, (3) resources and police activities, and (4) socioeconomic data of the region under study. In each of these categories, information should be precisely located in time and space. For our study, we use data from the municipality of Cuautitlán Izcalli, State of Mexico, and data from the Sacramento, California (CA) Police Department.
There are several works devoted to the study of spatial and temporal decisions made by criminals, i.e., to identifying hotspots where criminal activity is concentrated; see [1,2,8,26,31,33,35-37,39]. A widely used method is the Spatial and Temporal Analysis of Crime program (STAC) [5], which clusters crime points within ellipses [3]. Jefferis [24] surveys additional hotspot methods, the most sophisticated of which employs kernel density estimation [27]. Nevertheless, the main disadvantage of statistical methods is that they do not offer additional semantic information for describing the phenomenon under study. In the specific case of crime prediction, this kind of information is highly desirable, as it is needed to support decision-making processes and, in general, to prepare preventive and corrective policies. Because of this, we have selected inductive classification methods over statistical ones, in order to generate an inductive description of each type of criminal activity studied. These descriptions by themselves constitute valuable information that provides a general overview of the criminal activity scenario. Furthermore, by using these inductive definitions, it is possible to identify the expected increase or decrease in specific criminal activities that will most likely occur in specific geographic areas and times.
This section deals with the design of the forecasting model within the proposed framework: the analysis of criminal activity within a specific time period and location using several different supervised classification techniques. We present details of our forecasting model, from general forecasting with inductive supervised classification (Section 2.1) to our particular implementations of CR-Ω+ (Section 2.2) and CR-Ω+M with general discrimination [21] (Section 2.3).
One of the most interesting tasks of the pattern recognition discipline is the study of forecasting models [10,30]. The forecasting problem can be treated as a classification problem, which allows taking advantage of the large number of available classification algorithms. The great majority of current forecasting models are statistical in nature and are mainly devoted to time series analysis (for an exhaustive review of these methods, refer to [10,22]). As stated previously, we are interested in the particular semantics of crime analysis, so we propose an inductive forecasting model.
The forecasting model is expressed as a supervised classification problem in the following terms. Given a database containing a set of spatio-temporally labeled patterns corresponding to crimes perpetrated within the region under study, group such patterns into crime families, where a crime family consists of all the crime patterns corresponding to similar crimes that are fought with the same resources. These families form the supervision sample to be used by the classification algorithms. Afterwards, a learning process is carried out to describe each family in both positive and negative ways. In order to predict a specific criminal scenario (i.e., the time, location and type of criminal activity to be predicted), a pattern containing all the relevant data is constructed and submitted for classification in accordance with the previously assembled supervision sample. The classification algorithm returns the degree of membership of such a pattern to each of the established families. Each degree of membership is then interpreted as the forecasted increase or decrease in the criminal activity for the specified time and location.
To make forecasts, a careful design of the classification problem semantics is required. Specifically, this includes three basic aspects of the problem: (1) the objects under study and the attributes or features that will be used to describe them; (2) the number of classes and how patterns will be classified; and (3) what kind of learning the classification algorithm will use. Each one of these aspects is discussed below.
Fig. 2. The CR-Ω+M classification algorithm will consider the X features as negative for class three with $\beta_1^{-} = 3$, whereas the CR-Ω+ algorithm will not.
contained in each one of the classes in the supervision sample. The inductive definition corresponding to each class $C_i$ is an expression of the form:

$$\mathrm{Def}(C_i) = \bigcup_{m} P_m(C_i) \qquad (1)$$

where each $P_m(C_i)$ is a property identified among the patterns pertaining to the class $C_i$ [6]. The properties are subsets of descriptive features associated with specific values. For each class, a positive description $\mathrm{Def}^{+}(C_i)$ and a negative description $\mathrm{Def}^{-}(C_i)$ are made. Once the descriptions are obtained, we can use an inductive classification algorithm.
We use a combination of the Kora-Ω algorithm proposed in [21], which is an extension of the KORA-3 algorithm [6,15-17], and the Representative Sets (CR+) algorithm [3,9]; both share the notion of property in the form of a subset of features associated with specific values of these features.
A property $P_m$ identified in the class $C_i$ has the form shown in Eq. (2):

$$P_m = \begin{pmatrix} x_p, \ldots, x_q \\ \langle v_p \rangle, \ldots, \langle v_q \rangle \end{pmatrix} \qquad (2)$$

where $x_j$, $j = p, \ldots, q$, are features used to describe the objects under study, and each $\langle v_j \rangle$, $j = p, \ldots, q$, is a specific value in the domain of the feature $x_j$ observed among the patterns in the class $C_i$.
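As a rough illustration of the notion of property in Eq. (2), the following sketch treats a pattern as a mapping from feature names to values, and a property as a set of (feature, value) pairs. The feature names, the sample patterns and the helper names `candidate_properties` and `support` are hypothetical; they are not taken from the original implementation, and the limit of two features per property is only an assumption made to keep the example small.

```python
from collections import Counter
from itertools import combinations

def candidate_properties(patterns, max_size=2):
    """Count every (feature, value) subset of up to max_size features
    observed among the patterns of one class."""
    counts = Counter()
    for pattern in patterns:
        items = sorted(pattern.items())
        for size in range(1, max_size + 1):
            for subset in combinations(items, size):
                counts[frozenset(subset)] += 1
    return counts

def support(patterns, prop):
    """Number of patterns that show every (feature, value) pair of prop."""
    return sum(all(p.get(f) == v for f, v in prop) for p in patterns)

# Hypothetical patterns of a single crime family.
robbery = [
    {"quadrant": "Q8", "month": "Jan", "zone": "Atlanta"},
    {"quadrant": "Q8", "month": "Apr", "zone": "Atlanta"},
    {"quadrant": "Q4", "month": "Apr", "zone": "Centro Urbano"},
]

counts = candidate_properties(robbery)
prop = frozenset({("quadrant", "Q8"), ("zone", "Atlanta")})
print(counts[prop], support(robbery, prop))
```

In this toy sample the property (quadrant = Q8, zone = Atlanta) is observed in two of the three patterns, which is the kind of count that the thresholds of the classification algorithms are compared against.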
The CR-Ω+ Modified algorithm (henceforth referred to as CR-Ω+M) is based on the previous algorithm, modifying the counting of features present in other classes, i.e., the counting related to the $\beta_1^{\prime+}$ and $\beta_1^{\prime-}$ thresholds.
In the previous algorithm, the positive description of $C_i$ was calculated by counting the feature sets present at least $\beta_1^{+}$ times in $C_i$ and no more than $\beta_1^{\prime+}$ times in any other class $C_j$ with $j \neq i$. For the CR-Ω+M algorithm, we modify this last part, now requiring no more than $\beta_1^{\prime+}$ occurrences in the union of the other classes, that is, in $\bigcup_{j \neq i} C_j$. The same occurs when calculating the negative description of $C_i$, which now requires that the feature set be present at least $\beta_1^{-}$ times in $\bigcup_{j \neq i} C_j$ and no more than $\beta_1^{\prime-}$ times in $C_i$.
In Fig. 2 we exemplify the effect of this modification. Given a threshold $\beta_1^{-} = 3$ for $C_3$ and the feature X, under the first algorithm (CR-Ω+) the negative description of $C_3$ would be empty, whereas under the CR-Ω+M algorithm the cardinality of the features would be 4, yielding a negative description of $C_3$ equal to $\{X\}$.
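The difference between the two counting rules can be sketched in a few lines. This is only an illustration under stated assumptions: the per-class rule is read here as "at least the threshold number of times in some single other class", the sample data are invented so that X occurs twice in each of two other classes, and the function and parameter names (`negative_properties`, `beta_neg`, `beta_neg_prime`, `pooled`) are hypothetical.

```python
def support(patterns, prop):
    """Number of patterns that show every (feature, value) pair of prop."""
    return sum(all(p.get(f) == v for f, v in prop) for p in patterns)

def negative_properties(classes, target, candidates,
                        beta_neg, beta_neg_prime, pooled):
    """Properties that describe `target` negatively.

    pooled=False: per-class counting (assumed reading of the first algorithm).
    pooled=True:  counting over the union of all other classes (modified rule).
    """
    others = [pats for name, pats in classes.items() if name != target]
    selected = []
    for prop in candidates:
        if support(classes[target], prop) > beta_neg_prime:
            continue  # too frequent inside the target class
        if pooled:
            frequent_outside = sum(support(p, prop) for p in others) >= beta_neg
        else:
            frequent_outside = any(support(p, prop) >= beta_neg for p in others)
        if frequent_outside:
            selected.append(prop)
    return selected

# Hypothetical sample: X occurs twice in C1, twice in C2 and never in C3.
X = frozenset({("feature", "X")})
classes = {
    "C1": [{"feature": "X"}, {"feature": "X"}, {"feature": "Y"}],
    "C2": [{"feature": "X"}, {"feature": "X"}],
    "C3": [{"feature": "Z"}],
}
per_class_result = negative_properties(classes, "C3", [X], 3, 1, pooled=False)
pooled_result = negative_properties(classes, "C3", [X], 3, 1, pooled=True)
print(per_class_result, pooled_result)
```

With a threshold of 3, no single class reaches the required count, so the per-class rule selects nothing, while the pooled count of 4 makes X a negative property of C3.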
Once the classifiers are trained, it is necessary to cluster the spatiotemporal features in order to visualize and find important hotspots. This is discussed in the next section.
Table 1
Example of criminal data [32]

Crime type  Suspect's race  Suspect's sex  Suspect's age group  Victim's age group  Weapon
Robbery     B               M              Middle               Elderly             Knife
Robbery     W               M              Young                Middle              Bat
Robbery     B               M              ?                    Elderly             Knife
Robbery     B               F              Middle               Young               Piston
We used an approach based on Nath [32], who uses a k-means algorithm to identify crime patterns. A crime pattern is described as a specific group of criminal actions with similar modus operandi (MO) characteristics; he calls such a group or cluster of crimes a pattern. He shows results of experiments made on a small sample of data, shown in Table 1.
By applying a k-means clustering algorithm to the datasets, the author groups crimes with similar MO. He explains that in the robbery sample (Table 1) pattern behavior may be observed in rows 1 and 3, where the suspect's description matches, as well as the victim's profile. However, no explanation is given about the intra-class diversity of the crimes, or about why k-means was chosen as the clustering algorithm. Figure 3 exemplifies the results published by Nath.
For the clustering layer, we propose the use of a clustering technique based on pattern density, together with a space-time similarity function, to identify areas with a high concentration of crime (hotspots). We then compare the results obtained with our similarity function against those obtained with the similarity function used in the original ST-DBSCAN paper [4]. The comparison criteria used by the space-time similarity function, and its specific use to cluster criminal activities, are the main contributions of this clustering method.
Table 2
Data sample (out of 80 patterns) of the burglary dataset

Weapon            Location              Date
Firearm           Ensueños              22/08
Not specified     Cumbria               21/08
Sharp instrument  Arcos de la Hacienda  13/09
Banned weapon     San Isidro Labrador   01/03
The purpose of using density-based clustering techniques in the context of crime analysis is to achieve a non-statistical identification of the observed spatial and temporal trends in the commission of crimes, as well as to isolate exceptional cases that do not fit into those trends. This information is useful for the crime analyst in developing specific strategies, both to fight and to prevent delinquency, in the medium and long term.
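To make the idea of density-based trend identification concrete, the following is a minimal DBSCAN-style sketch driven by a similarity function rather than a distance. It is not the ST-DBSCAN implementation used in this work; the function names, the toy similarity, the thresholds `min_sim` and `min_pts`, and the sample events are all assumptions chosen for illustration.

```python
def density_cluster(patterns, similarity, min_sim, min_pts):
    """DBSCAN-style clustering driven by a similarity function.

    Two patterns are neighbours when similarity(a, b) >= min_sim.
    Returns one label per pattern: 0, 1, ... for clusters, -1 for noise.
    """
    n = len(patterns)
    neighbours = [
        [j for j in range(n) if similarity(patterns[i], patterns[j]) >= min_sim]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # j is a core point: keep expanding
    return labels

def toy_similarity(a, b):
    """Hypothetical criterion: same sector and at most one month apart."""
    same_sector = 1.0 if a[0] == b[0] else 0.0
    close_in_time = 1.0 if abs(a[1] - b[1]) <= 1 else 0.0
    return (same_sector + close_in_time) / 2

# Hypothetical events as (sector, month number).
events = [("S1", 1), ("S1", 2), ("S1", 2),
          ("S2", 8), ("S2", 9), ("S2", 9), ("S3", 5)]
labels = density_cluster(events, toy_similarity, min_sim=1.0, min_pts=3)
print(labels)
```

The two dense groups become clusters, and the isolated event in sector S3 is labelled as noise, which corresponds to the exceptional cases mentioned above.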
Table 3
Data sample (out of 126 patterns) from the robbery dataset

Robbery type          Weapon            No. of members  Location             Month
Break-in              Sharp instrument  2               El Rosario           DEC
Auto-parts theft      Without weapons   4               Adolfo López Mateos  FEB
Robbery to passer-by  Firearm           3               El Rosario           FEB
Robbery to passer-by  Sharp instrument  1               Cofradía-III         DEC
The trend-identification process starts by analyzing a set of crime patterns, each with the same level of detail, that occurred within a limited geographic location and a given time interval. Table 2 contains a sample of burglary patterns obtained from the Cuautitlán Izcalli area, in the State of Mexico. Figure 4 shows how such patterns are plotted on the map of the corresponding area, divided into surveillance sectors.
In Table 3 we show another sample of the same dataset, with a different level of detail from the one shown in the former sample. The differences between the two sets can be observed in the temporal and crime-specific components of the patterns. This second set contains robbery patterns, and their location is another surveillance center within the same district of Cuautitlán Izcalli, Mexico.
The complete first dataset (burglary) contains 80 patterns, while the second one (robbery) contains 126 patterns.
Fig. 5. Surveillance sectors. (a) Euclidean comparison; (b) Comparison by sector division.
Fig. 5(a), patterns p4 and p6), while patterns which are geographically close to each other, even if they belong to different sectors, will be considered similar (see Fig. 5(a), patterns p1 and p2).
The spatial comparison criterion based on regions divided by surveillance sectors strongly suggests that the maximum space similarity should be achieved by patterns belonging to the same sector (see Fig. 5(b), patterns p3 to p6), followed by patterns belonging to contiguous sectors (see Fig. 5(b), patterns p1 and p2). This kind of clustering yields much more useful clusters, because patrolling routines, as well as investigative teams, usually schedule their operations by sector. Of course, different comparison criteria can be considered depending on the needs of each crime analyst.
Our proposed similarity function is defined by the following equation:

$$f(O_i, O_j) = \frac{1}{r} \sum_{s=1}^{r} \left( w_s \, C_{cs}(O_i, O_j) \right) \qquad (3)$$

where $r$ is the number of features that make up the pattern; $O_i$, $O_j$ are the patterns being compared; $w_s$ is the weighting factor of feature $s$; and $C_{cs}(\cdot)$ is the space-time and attribute comparison criterion for feature $s$.
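A small sketch may help to read Eq. (3) as a weighted mean of per-feature comparison criteria. The numeric scores of the sector criterion (1.0 for the same sector, 0.5 for contiguous sectors, 0.0 otherwise), the sector adjacency map, the unit weights and all function names are assumptions for illustration; the text above only states that the same sector should score highest, followed by contiguous sectors.

```python
def sector_criterion(a, b, adjacency):
    """Assumed scores: 1.0 same sector, 0.5 contiguous sectors, 0.0 otherwise."""
    if a == b:
        return 1.0
    if b in adjacency.get(a, set()):
        return 0.5
    return 0.0

def equality_criterion(a, b):
    """Simple attribute comparison: 1.0 when both values match."""
    return 1.0 if a == b else 0.0

def similarity(o_i, o_j, criteria, weights):
    """Weighted mean of the per-feature comparison criteria, as in Eq. (3)."""
    r = len(criteria)
    total = sum(weights[s] * criteria[s](o_i[s], o_j[s]) for s in criteria)
    return total / r

# Hypothetical sector layout and patterns.
adjacency = {"S1": {"S2"}, "S2": {"S1", "S3"}, "S3": {"S2"}}
criteria = {
    "sector": lambda a, b: sector_criterion(a, b, adjacency),
    "weapon": equality_criterion,
    "month": equality_criterion,
}
weights = {"sector": 1.0, "weapon": 1.0, "month": 1.0}
p1 = {"sector": "S1", "weapon": "Firearm", "month": "FEB"}
p2 = {"sector": "S2", "weapon": "Firearm", "month": "DEC"}
print(similarity(p1, p2, criteria, weights))  # (0.5 + 1.0 + 0.0) / 3 = 0.5
```

Swapping `sector_criterion` for a criterion based on Euclidean distance would give the alternative spatial comparison discussed above without changing the rest of the function.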
Experiments in Section 5 will show that this similarity function, based on our space comparison criterion, produces better results than the space comparison criterion based on Euclidean distance. Once the data are properly clustered, important hotspots can be identified. These hotspots are in turn surveillance points that must be considered in a patrolling route. The next section deals with the construction of such routes.
Several methods have been developed for tackling the problem of route optimization. The field of multi-robot cooperative tasks provides an interesting set of examples; see [1] for a thorough compendium of several models. Within this approach, we found two major drawbacks. The first is that some of these models are designed for small devices [34]; the second is that they are designed for automatic execution, and usually do not allow incorporating certain restrictions pertaining to real-world human driving and wide-area sectorization. Other approaches are based on workload balancing models [40], local search techniques [43], and agents [7]. However, to our knowledge, ant colony systems, while known to be effective for finding optimal routes [20,42], have scarcely been applied to police patrol route planning considering three real needs¹: (a) finding the optimal route for a patrol to attend an emergency call; (b) finding the optimal route between the current location of a patrol and a set of nearby streets that require surveillance; and (c) finding the optimal route for a patrol, so that it can survey different points of major criminal incidence in a specified neighborhood. In Section 4.1 we present a short introduction to ant colony systems; then, in Section 4.2, we describe our proposed method.
Ant Colony Optimization algorithms are models inspired by real ant colonies. Studies show how animals that are almost blind, such as ants, can follow the shortest path to their food supplies [11]. This is due to the ants' ability to exchange information: each ant, while moving, leaves a trace of a substance called pheromone along its path. Thus, while an isolated ant moves essentially at random, the agents of an ant colony detect the pheromone trace left by other ants and tend to follow it. These ants, in turn, leave their own pheromone along the travelled path, making it more attractive, since the pheromone trace has been reinforced. With time, the pheromone evaporates, causing the trace to weaken. In short, the process is characterized by positive feedback, in which the probability that an ant chooses a path increases with the number of ants that have previously chosen the same path. One of the first known applications of the ant colony system was the travelling salesman problem (TSP) [18], with favorable results. Building on that algorithm, several heuristics have been developed to improve it, and have been applied to other problems such as the vehicle routing problem (VRP) [14] and the quadratic assignment problem (QAP) [29].
In this section, we present results of a heuristic based on an improved version of the ant colony optimization (ACO) algorithm called MMAS (MAX-MIN Ant System) [41].
The ACO algorithms are iterative processes. In each iteration, a colony of $m$ ants is deployed, and each ant represents a solution to the problem. Ants build solutions probabilistically, guided by a trace of artificial pheromone and by heuristic information calculated a priori. The probabilistic rule for traversing nodes on a graph is:

$$p_{ij}^{k}(t) = \frac{[\tau_{ij}(t)]^{\alpha} \, [\eta_{ij}]^{\beta}}{\sum_{l \in N_i^k} [\tau_{il}(t)]^{\alpha} \, [\eta_{il}]^{\beta}}, \quad j \in N_i^k \qquad (4)$$

where $p_{ij}^{k}(t)$ is the probability that, at iteration $t$ of the algorithm, ant $k$, currently situated in city $i$, chooses city $j$ as the next stop; $N_i^k$ is the set of cities not yet visited by ant $k$; $\tau_{ij}(t)$ is the amount of pheromone accumulated on the arc $(i, j)$ of the network at iteration $t$; $\eta_{ij}$ is the heuristic information, which in the case of the TSP is the inverse of the distance between cities $i$ and $j$; and $\alpha$ and $\beta$ are parameters of the algorithm to be adjusted.
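As a worked example of the probabilistic rule just described, the sketch below assumes its standard Ant System form: each unvisited node receives a weight equal to its pheromone raised to alpha times its heuristic value raised to beta, and the weights are normalized to sum to one. The function name, the default values of `alpha` and `beta`, and the three-node data are hypothetical.

```python
def transition_probabilities(current, unvisited, tau, eta, alpha=1.0, beta=2.0):
    """Probability of moving from `current` to each unvisited node."""
    weights = {
        j: (tau[current][j] ** alpha) * (eta[current][j] ** beta)
        for j in unvisited
    }
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}

# Hypothetical example: equal pheromone, eta = 1 / distance (distances 2 and 4).
tau = {0: {1: 1.0, 2: 1.0}}
eta = {0: {1: 1.0 / 2.0, 2: 1.0 / 4.0}}
probs = transition_probabilities(0, [1, 2], tau, eta)
print(probs)  # node 1 is twice as close, so with beta = 2 it is four times as likely
```

With equal pheromone the choice is driven only by distance; as pheromone accumulates on good arcs, the first factor progressively shifts the probabilities towards previously successful routes.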
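When all ants have built a solution, the pheromone must be updated on each arc. The formula for this is:

$$\tau_{ij}(t+1) = (1 - \rho)\,\tau_{ij}(t) + \Delta\tau_{ij}^{best}, \qquad \Delta\tau_{ij}^{best} = \begin{cases} 1/L_{best} & \text{if the arc } (i, j) \text{ belongs to } T^{best} \\ 0 & \text{otherwise} \end{cases} \qquad (5)$$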
¹ Direct communication from the personnel of the Emergency Central C4 of the municipality of Cuautitlán Izcalli.
where $\rho$ is the pheromone evaporation coefficient. $T^{best}$ can be the best solution found so far, or the best solution found in the current iteration. The level of pheromone should stay within a range $[\tau_{min}, \tau_{max}]$. These limits are established in order to avoid stagnation in the search for solutions. All pheromone is initialized to $\tau_{max}$. After updating the pheromone, a new iteration can be started. The final result is the best solution found over all iterations.
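The overall loop can be summarized in a compact sketch of a MAX-MIN style ant system for the plain TSP. This is a simplified illustration and not the patrolling adaptation of this work: it assumes the best-so-far tour is the one that deposits pheromone, and the parameter values (`n_ants`, `n_iter`, `rho`, `tau_min`, `tau_max`, `seed`) and function names are arbitrary choices made for the example.

```python
import math
import random

def tour_length(tour, dist):
    """Length of the closed tour, returning to the starting node."""
    return sum(dist[tour[k]][tour[(k + 1) % len(tour)]] for k in range(len(tour)))

def build_tour(n, tau, dist, alpha, beta, rng):
    """One ant builds a closed tour starting at node 0."""
    tour = [0]
    unvisited = set(range(1, n))
    while unvisited:
        i = tour[-1]
        nodes = sorted(unvisited)
        weights = [(tau[i][j] ** alpha) * ((1.0 / dist[i][j]) ** beta) for j in nodes]
        j = rng.choices(nodes, weights=weights)[0]
        tour.append(j)
        unvisited.remove(j)
    return tour

def mmas(dist, n_ants=10, n_iter=50, alpha=1.0, beta=2.0, rho=0.1,
         tau_min=0.01, tau_max=1.0, seed=0):
    rng = random.Random(seed)
    n = len(dist)
    tau = [[tau_max] * n for _ in range(n)]  # all pheromone starts at tau_max
    best_tour, best_len = None, math.inf
    for _ in range(n_iter):
        for _ in range(n_ants):
            tour = build_tour(n, tau, dist, alpha, beta, rng)
            length = tour_length(tour, dist)
            if length < best_len:
                best_tour, best_len = tour, length
        # Evaporate everywhere, then let only the best-so-far tour deposit.
        best_arcs = {(best_tour[k], best_tour[(k + 1) % n]) for k in range(n)}
        for i in range(n):
            for j in range(n):
                deposit = 1.0 / best_len if (i, j) in best_arcs else 0.0
                tau[i][j] = (1.0 - rho) * tau[i][j] + deposit
                tau[i][j] = min(tau_max, max(tau_min, tau[i][j]))  # clamp to limits
    return best_tour, best_len

# Hypothetical instance: three points forming a 3-4-5 right triangle.
points = [(0, 0), (3, 0), (0, 4)]
dist = [[math.dist(p, q) for q in points] for p in points]
tour, length = mmas(dist)
print(tour, length)
```

Clamping the pheromone after each update is what keeps every arc selectable with non-zero probability, which is the stagnation safeguard described above.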
This gives us a global view of the MMAS algorithm. In the next section, we present its application to the problem of allocating human and material resources to patrolling routes. Our purpose is threefold: (a) to find the optimal route between a patrol's current location and a point where a call for help has been raised; (b) to find optimal routes for patrolling a small set of nearby streets in a neighborhood; and (c) to find optimal routes for patrolling different points of major criminal incidence in a specified neighborhood.
We illustrate our methodology with the example case of a neighborhood of the municipality of Cuautitlán Izcalli, Mexico. This neighborhood was selected considering the current geographic level for the assignment of patrolling routes. Its structure at street level can be seen in Fig. 6. Patrols must cover the points considered to be the most important ones.
Then, the street structure is transformed into a directed graph G = (V, E), where V is a set of vertices or nodes [25]; in our case, these are the crossings between streets. E is a set of arcs connecting the nodes; the arcs represent the streets that make up the neighborhood, and each arc represents the direction a street has. The final graph can be seen in Fig. 7.
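The transformation from streets to a directed graph can be sketched as follows. The segment format (from-crossing, to-crossing, length in metres, one-way flag), the crossing names and the function name are hypothetical; they only illustrate how crossings become nodes and street directions become arcs.

```python
def build_street_graph(streets):
    """Turn street segments into a directed graph stored as adjacency maps.

    Each segment is (from_crossing, to_crossing, length_m, one_way).
    """
    graph = {}
    for a, b, length, one_way in streets:
        graph.setdefault(a, {})[b] = length
        graph.setdefault(b, {})
        if not one_way:
            graph[b][a] = length  # two-way streets get an arc in each direction
    return graph

# Hypothetical neighborhood with three crossings.
streets = [
    ("A", "B", 120.0, False),  # two-way street
    ("B", "C", 80.0, True),    # one-way from B to C
    ("C", "A", 150.0, True),   # one-way from C to A
]
graph = build_street_graph(streets)
print(graph)
```

Storing arc lengths in the adjacency maps gives the route-construction step direct access to the distances needed for the heuristic information.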
Our solution employs the MAX-MIN Ant System algorithm [41], with modifications to the restrictions of the TSP for which it was originally presented. In contrast to the original TSP, which has only one individual, we are interested in having N ants with certain routes, where N represents the number of available units. We adapted the MMAS as shown in Fig. 8.
Following the architecture of our framework shown in Fig. 1, in this section we present results for each of the layers, applied to real data from a Mexican municipality, in order to validate their effectiveness. For the forecasting model (Section 5.1), we compare two inductive classification methods (namely CR-Ω+ and its variation CR-Ω+M) against a traditional KORA-Ω classifier. Then, we compare the tendency of our prediction, using RMSE, against a Bayesian forecasting method. For the clustering model (Section 5.2), we implemented ST-DBSCAN with a standard Euclidean distance measure, and then compare its results with our proposed measure based on sector space division. Finally, for the patrolling route recommendations (Section 5.3), we compare our proposed method with a random walk algorithm.
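For reference, the root mean squared error used in the tendency comparison can be computed as in the following sketch. The two series are invented monthly crime counts, and the function name is an assumption; the usual definition of RMSE (square root of the mean squared difference) is assumed.

```python
import math

def rmse(predicted, observed):
    """Root mean squared error between two equally long series."""
    if len(predicted) != len(observed):
        raise ValueError("series must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical forecasted and observed monthly counts.
print(rmse([10, 12, 9, 14], [12, 12, 8, 10]))  # sqrt((4 + 0 + 1 + 16) / 4)
```

A lower value indicates that the forecasted tendency stays closer to the observed one, which is how two forecasting methods can be ranked on the same data.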
In the Cuautitlán Izcalli district, located in Mexico, the local government launched the Centro de Emergencias Cuautitlán (CERCA, Cuautitlán Emergency Center) in 2007. An important part of its function is gathering the information corresponding to the first three categories of information sources. Therefore, this district was selected as a case study and test field for the forecasting, clustering, and patrolling recommendation model reported herein.
We perform two analyses: punctual hotspot prediction (Sections 5.1.1 and 5.1.2) and tendency analysis (Section 5.1.3). For the first analysis, we use data from the municipality of Cuautitlán Izcalli, State of Mexico. Within this analysis, we perform experiments on the spatial and temporal location of crime (Section 5.1.2.1) and on the expected family of crime (Section 5.1.2.2). For the tendency analysis (Section 5.1.3), we use data from the Sacramento, California (CA) Police Department.
Table 4
Pre-processed report used as input to the algorithms (Mode 1)

Time quadrant  Date  Residential zone     Pub. road (Class 1)  Home (Class 2)  Shops (Class 3)
Q8             Jan   Arcos del Alba       1                    0               0
Q4             Apr   Atlanta              1                    0               0
Q1             Jun   Bosques de la Hda.   0                    1               0
Q6             Apr   Bosques del Lago     0                    1               0
Q6             May   Centro Urbano        0                    1               0
Q5             Mar   Hacienda del Parque  0                    0               1
Q5             Jun   Infonavit Norte      0                    0               1
Table 5
Pre-processed report used as input to the algorithms (Mode 2)

Time quadrant  Date  Residential zone     Robbery (Class 1)  Injury (Class 2)  Homicide (Class 3)  Property damage (Class 4)
Q8             Jan   Arcos del Alba       1                  0                 0                   0
Q4             Apr   Atlanta              1                  0                 0                   0
Q1             Jun   Bosques de la Hda.   0                  1                 0                   0
Q6             Apr   Bosques del Lago     0                  1                 0                   0
Q6             May   Centro Urbano        0                  0                 1                   0
Q5             Mar   Hacienda del Parque  0                  0                 0                   1
Q5             Jun   Infonavit Norte      0                  0                 0                   1
Table 6
Comparison of recall measures

                 CR-Ω+ (2+ features)  CR-Ω+M (2+ features)  CR-Ω+ (1+ features)  CR-Ω+M (1+ features)
Recall           Train    Test        Train    Test         Train    Test        Train    Test
I. April 2007    77.0%    22%         78.0%    23%          77.0%    24%         77.0%    24%
II. July 2007    79.0%    23%         77.0%    30%          77.0%    29%         77.0%    30%
III. April 2008  79.0%    23%         79.0%    21%          79.0%    23%         77.0%    24%
able to use all records. We will use this information as the input of the KORA-Ω, CR-Ω, and CR-Ω+ algorithms. The results of the classification for both analyses are reported in the following sections.
Mode 1 has three classes: (1) public roads and highways; (2) homes; (3) stores and shops. Using the KORA-Ω algorithm, we calculate the characteristic features and the complementary features of the sample, applying it to a set of 160 patterns (78% of the 205 records of the whole sample); these 160 records were distributed as follows: public roads, 105; homes, 35; stores, 20. The learning percentage of the algorithm on the known data is 88%. To calculate the prediction rate, we used a sample of 45 patterns (22% of the 205 records of the whole sample), divided as follows: 30 patterns for crimes on public roads, 8 patterns for home crimes, and 7 patterns for shop crimes. The algorithm had an effectiveness of 66% on the test set.
We used the CR-Ω+ algorithm on the same 160 patterns used in the previous experiment. The learning percentage of the algorithm on the known data rose to 92.5%. For the prediction rate, we used the same 45 test patterns from the previous experiment. The algorithm had a prediction rate of 69% on the real-data test set. This means that it was possible to predict, in more than two thirds of the cases, the place where crimes are likely to have a greater incidence.
One of the main disadvantages of the first analysis is the low number of records that can be used for prediction, although it allowed predicting the place where crimes are more likely to be perpetrated. For this experiment, we grouped the sample data into the following classes: (1) robbery in all its modalities, (2) homicide, (3) injury, and (4) property damage. This analysis allows using 1231 records out of 1551. To account for the heterogeneous distribution of the data between different dates, we tested the algorithms against:
I. 150 new events corresponding to April 2007.
II. 123 new events corresponding to July 2007.
III. 321 new events corresponding to April 2008.
In contrast with the previous experiment, where the 1551 records from January 1st, 2007 to July 31st, 2007 were split into 78% for training and 22% for testing, in this experiment we used 100% of such data for training, and the additional patterns of I, II and III as different test sets. Note that these new events were not included in the previous records. The purpose of testing against different test sets is to examine the performance of the algorithm given the heterogeneity of the provided data. It can be seen, for example, that the newest data, from April 2008, has about double the number of records of the earlier test sets (April 2007 and July 2007).
Table 6 shows the results, in terms of recall, of applying the CR-Ω+ and CR-Ω+M algorithms
to the different test sets. We also explore limiting the number of features in a feature set to at
least two, and applying no limitation (so that feature sets can be composed of only one feature). For
all experiments, the empirically chosen values for β were: $\beta_1^{+} = 3$, $\beta_{1'}^{+} = 1$, $\beta_1^{-} = 3$ and $\beta_{1'}^{-} = 1$,
$\beta_2^{+} = 0$, $\beta_2^{-} = 3$ for CR-Ω+, and $\beta_2^{-} = 1$ for CR-Ω+M.
The CR-Ω+ classifier was tested by classifying patterns not previously contained in the supervision
sample [3]. We compared results against those achieved using the standard KORA-Ω algorithm, and
obtained an improvement in the learning rate as well as in the test rate. The original KORA-Ω algorithm
obtained 88% and 66% for learning rate and test rate, respectively, whereas the proposed CR-Ω+
algorithm obtained 92.5% and 69%, respectively, when classifying data into the following classes: (1) public
roads and highways, (2) homes, (3) stores and shops. This suggests that we are able to predict the kind
of crime spatially and temporally for two out of three crimes. We evaluated with test data for crimes
perpetrated from January 1st, 2007 to July 31st, 2007. Approximately 78% was used for training and
approximately 22% for testing.
In our second analysis, we used the whole data set from January 1st, 2007 to July 31st, 2007 for training,
while we selected three different data sets for testing: (I) 150 patterns corresponding to
April 2007, (II) 123 patterns corresponding to July 2007 and (III) 321 patterns corresponding
to April 2008. We used three different datasets to evaluate the homogeneity of
the data. We obtained 77% recall in learning rate (up to 87% in precision), and 30% recall in forecast.
This suggests that we are able to punctually predict the kind of crime, given a spatio-temporal location,
for at least one out of every four crimes perpetrated.
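The recall and precision figures quoted above can be computed per class as in the following minimal sketch. It assumes that the true and predicted crime classes are available as two parallel lists; the function name and the example labels are illustrative and are not part of the original system.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence, Tuple


def recall_precision(
    true_labels: Sequence[Hashable], predicted_labels: Sequence[Hashable]
) -> Dict[Hashable, Tuple[float, float]]:
    """Return {class: (recall, precision)} for every class in true_labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    actual = Counter(true_labels)          # patterns per true class
    predicted = Counter(predicted_labels)  # patterns per predicted class
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    scores = {}
    for label, n_actual in actual.items():
        recall = hits[label] / n_actual
        # Precision is reported as 0.0 when the class was never predicted.
        precision = hits[label] / predicted[label] if predicted[label] else 0.0
        scores[label] = (recall, precision)
    return scores


# Hypothetical toy labels, only to show the call.
scores = recall_precision(
    ["robbery", "robbery", "homicide", "injury"],
    ["robbery", "injury", "homicide", "injury"],
)
```

In this toy example, one of the two robberies is recovered (recall 0.5), and the only pattern predicted as robbery is correct (precision 1.0).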
We have shown how both algorithms, CR-Ω+ and its variation CR-Ω+M, perform better than the
classical algorithm (KORA-Ω). In particular, it can be seen from Table 6 that CR-Ω+M generally improves
the forecast recall; for example, for test II, using 2+ features, it raises recall from 23% to 30%.
Table 7
Comparison with the systems presented by Ivaha et al.
          NFM     OLS-NI   OLS-PC   Ours
STRMSE    1.57    1.139    1.131    0.97
of forecast errors, and it is based on the root mean squared error, divided by the number of days of the
sample. Two or more models may be compared using STRMSE as a measure of how well they explain
a given set of observations: the unbiased model with the smallest STRMSE is generally interpreted as
best explaining the variability in the observations. STRMSE is calculated as shown in Eq. (6), where n is the
total number of days forecasted and m is the total number of samples.
\[
\mathrm{STRMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{m} \frac{(O_i - \hat{O}_i)^2}{m}} \tag{6}
\]
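As a minimal sketch, Eq. (6) can be evaluated as follows. It assumes that the equation is read as the mean squared error over the m samples, divided by the n forecasted days, under a square root, and that the two series of crime counts are given as equal-length sequences. The function and argument names are illustrative, not the authors' implementation.

```python
import math
from typing import Sequence


def strmse(observed: Sequence[float], forecast: Sequence[float], n_days: int) -> float:
    """STRMSE under the reading sqrt((1 / n) * sum((O_i - F_i)^2) / m)."""
    if len(observed) != len(forecast):
        raise ValueError("observed and forecast must have the same length")
    m = len(observed)  # total number of samples
    if m == 0 or n_days <= 0:
        raise ValueError("need at least one sample and a positive number of days")
    squared_error = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    return math.sqrt(squared_error / (n_days * m))


# Hypothetical counts for two samples forecasted over two days.
example = strmse([3, 5], [1, 5], n_days=2)
```

Here the squared error is 4, so the result is sqrt(4 / (2 * 2)) = 1.0.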
To test the proposed forecasting algorithm, we used the Sacramento dataset. This dataset contains
152,812 registered crimes and was made available by the Sacramento, CA police department (see footnote 2). All crimes
were committed within 19 surveillance sectors (space-units), over a period from January 2004 to
December 2008 (time-units).
By analyzing only records from the five years 2004 to 2008, a forecast was calculated for the time-unit
January 2009, for all registered crime-families and within all 19 surveillance sectors. The forecasted
number of crimes was then compared with the real-life police records from that same space-time unit
(2,219 crimes during January 2009).
Using the aforementioned method, all positive and negative characteristic space-time properties for
each crime-family were found. For the training set from 1/1/2004 to 31/12/2008, and the test set from
January 2009, the STRMSE of the Bayes (NFM) forecast was 7.05, while ours was 0.90. A similar behavior
was observed for the test set of February 2009 (using the same training set): the Bayes STRMSE was
8.45, while we obtained 0.97. We can (indirectly) compare with the system presented by Ivaha et al. [23].
Their results are shown in Table 7, along with ours. NFM is the Naïve Forecasting Method, OLS-NI is
the Ordinary Least Square method on Number of Incidences, and OLS-PC is the Ordinary Least Square
method on Percentages of Crime. For details on how the NFM, OLS-NI and OLS-PC results are
obtained, please refer to [23].
These results show that the proposed method is highly effective, with an STRMSE below
1.0 when forecasting all space-units during January 2009 (a total of 2,219 crimes). This means that,
on average, the proposed method fails in fewer than five occurrences of each crime-family. Such
precision is acceptable for automated crime-analysis systems and might constitute a useful tool
for planning preventive police operations.
The values of Eps and MinPts in our implementation of ST-DBSCAN were calculated with Eqs (7)
and (8):
\[
\mathrm{Eps} = 1 - \min\bigl(f(o_i, o_j)\bigr) \tag{7}
\]
2 http://www.sacpd.org/crime/stats/reports/.
Fig. 9. ST-DBSCAN results: (a) with Euclidean space similarity function (left), (b) with space similarity function based on
sector division (right).
where f(o_i, o_j) is the similarity function with i = 1, 2, ..., n and j ≠ i, n being the total number of
patterns.
Table 8
Results of clustering of the robbery type of crime data
Cluster     Type of crime          Month                       Year
            Robbery to passer-by   July, August & September    2006, 2007 & 2008
            Auto-parts theft       July & September            2007 & 2008
            Break-in               January & April             2008
            Auto-parts theft       February & September        2006 & 2007
? (noise)   Break-in               December                    2007
firearm, while the patterns that make up Cluster B were committed without weapons. This result turns
out to be very important because, following this path, crimes that were probably perpetrated by the same
aggressors can be semantically identified.
In the second experiment we worked with patterns that have a higher level of detail, that is,
more descriptive features. Also, each component (set of related features) in the crime-patterns was
weighted as follows: 60% space-time, 30% crime-specifics and 10% crime features, since
some features are more important than others.
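The component weighting can be illustrated with the sketch below. It assumes a simple linear combination of per-component similarities, each already scaled to [0, 1]; the text does not spell out how the weights are combined, so the combination rule, the function name and the dictionary keys are assumptions made for illustration.

```python
from typing import Mapping

# Component weights taken from the text: 60% space-time,
# 30% crime-specifics and 10% crime features.
WEIGHTS = {"space_time": 0.6, "crime_specifics": 0.3, "crime_features": 0.1}


def weighted_similarity(
    component_sims: Mapping[str, float],
    weights: Mapping[str, float] = WEIGHTS,
) -> float:
    """Weighted sum of per-component similarities (assumed to lie in [0, 1])."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    total = 0.0
    for name, weight in weights.items():
        sim = component_sims[name]
        if not 0.0 <= sim <= 1.0:
            raise ValueError(f"similarity for {name!r} must be in [0, 1]")
        total += weight * sim
    return total


# Hypothetical pair of patterns: same place and time, partly similar
# crime-specifics, and no crime features in common.
sim = weighted_similarity(
    {"space_time": 1.0, "crime_specifics": 0.5, "crime_features": 0.0}
)
```

With these weights, two patterns that coincide in space and time already reach a similarity of 0.6, which reflects the dominant role given to the space-time component.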
This experiment is more closely related to the work performed by the preventive police, since the family of
crimes related to robbery is the one under study. This crime-family is made up of: robbery to passer-by,
break-in, and auto-parts theft.
The weighting may be obtained from a criminology expert. The objective is to identify trends by
taking advantage of the expert's knowledge in criminology. Figure 6 shows the results achieved.
Table 8 shows the results of our last experiment. Of the two residential areas studied in the Northern area,
the one with the higher number of crimes from the robbery family is the Santa Barbara residential
area, which belongs to the sector of the same name. In addition, we found that the months of the year
with the highest incidence of crime are July and September, so prevention programs
and campaigns to fight this type of crime should be undertaken in that sector during that season of
the year.
In this section, we present the results of three selected cases using MMAS to solve patrolling route
optimization problems, as described in previous paragraphs. After several tests, we found the best-performing
parameters, shown in Table 9. We compared our results against a random walk baseline, which
basically consists of using the algorithm shown in Fig. 9 (see Section 4.2) without Eqs (1) and (2), i.e.,
using a plain random roulette with equal probabilities and no pheromones at all.
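The difference between the two approaches can be sketched as follows. Eqs (1) and (2) are not reproduced in this section, so the pheromone-guided branch assumes the usual ant-system rule, in which the weight of a candidate vertex is its pheromone level raised to alpha times its heuristic desirability raised to beta. All names and default exponents are hypothetical; the baseline branch is the equal-probability roulette described above.

```python
import random
from typing import Dict, Hashable, Optional, Sequence


def roulette_choice(weights: Dict[Hashable, float], rng: random.Random) -> Hashable:
    """Pick a key with probability proportional to its non-negative weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    threshold = rng.random() * total
    cumulative = 0.0
    for node, weight in weights.items():
        cumulative += weight
        if threshold < cumulative:
            return node
    return node  # guard against floating-point round-off


def next_vertex(
    neighbors: Sequence[Hashable],
    pheromone: Optional[Dict[Hashable, float]] = None,
    heuristic: Optional[Dict[Hashable, float]] = None,
    alpha: float = 1.0,
    beta: float = 2.0,
    rng: Optional[random.Random] = None,
) -> Hashable:
    """Choose the next vertex of a route.

    Without pheromone information this is the random walk baseline;
    with it, the choice is biased by pheromone and heuristic values
    (assumed ant-system rule, not necessarily the paper's Eqs (1)-(2)).
    """
    rng = rng or random.Random()
    if pheromone is None or heuristic is None:
        weights = {v: 1.0 for v in neighbors}  # equal probabilities
    else:
        weights = {v: (pheromone[v] ** alpha) * (heuristic[v] ** beta) for v in neighbors}
    return roulette_choice(weights, rng)
```

Calling `next_vertex(neighbors)` reproduces the baseline behavior, while passing `pheromone` and `heuristic` dictionaries keyed by candidate vertex switches to the guided choice.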
Table 9
Parameters used in the MMAS algorithm
Parameter                                    Value
Number of vertices of the generated graph    128
Number of undirected graphs                  26
Number of directed arcs                      38
Number of ants generated in each iteration   20
Number of iterations                         50
Pheromone evaporation constant               0.98
Limits [Tmin, Tmax]                          [0.0078, 1/p]

Table 10
Results obtained from experiment A
                                   Ant colony MMAS           Random walk baseline
Alert #  Start point  End point    Cost   Time  Optimal?     Cost   Time  Optimal?
1        1            31           716    23    Yes          819    32    No
2        1            109          674    35    Yes          674    35    Yes
3        1            105          780    35    Yes          1432   43    No
4        1            89           1197   37    Yes          1781   57    No
5        1            64           1964   96    Yes          N/A
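To show how the evaporation constant and the pheromone limits of Table 9 interact, the following sketch applies one MAX-MIN style update: every trail is multiplied by the constant, the arcs of the best route receive a deposit, and the result is clamped to the limits. It assumes that 0.98 acts as a persistence factor and that the deposit is the inverse of the route cost. Because the upper limit in Table 9 is not legible, `tau_max=1.0` is only a placeholder. None of these names come from the original implementation.

```python
from typing import Dict, Sequence, Tuple

Edge = Tuple[int, int]


def update_pheromones(
    pheromone: Dict[Edge, float],
    best_route: Sequence[int],
    best_cost: float,
    persistence: float = 0.98,  # Table 9 constant, assumed to be a persistence factor
    tau_min: float = 0.0078,    # lower limit from Table 9
    tau_max: float = 1.0,       # placeholder: upper limit not legible in Table 9
) -> Dict[Edge, float]:
    """Return new pheromone levels after one evaporation and deposit step."""
    if best_cost <= 0:
        raise ValueError("best_cost must be positive")
    reinforced = set(zip(best_route, best_route[1:]))  # arcs on the best route
    deposit = 1.0 / best_cost
    updated = {}
    for edge, tau in pheromone.items():
        tau = persistence * tau        # evaporation on every arc
        if edge in reinforced:
            tau += deposit             # reinforcement of the best route only
        updated[edge] = min(tau_max, max(tau_min, tau))  # clamp to the limits
    return updated


# Hypothetical two-arc graph; the best route uses only arc (1, 2).
updated = update_pheromones({(1, 2): 1.0, (2, 3): 0.001}, best_route=[1, 2], best_cost=100.0)
```

The lower limit keeps rarely used arcs selectable, which is what allows the colony to keep exploring alternatives instead of stagnating on the first good route.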
Fig. 10. Clusters identified with ST-DBSCAN with a space similarity function based on sector division (Northern area).
Table 11
Optimal routes for experiment B
#  Start point  Route points          Route cost  Time  Optimal route
1  1            59-61-73-75-86-88     2,387       66    1-2-3-6-7-12-18-22-27-26-25-31-41-45-49-50-78-77-76-
                                                        75-74-73-88-87-86-75-74-73-72-67-68-61-60-59
2  1            31-32-33-39-40-45     1,190       148   1-2-3-6-7-12-18-22-27-33-35-39-108-109-38-34-30-27-
                                                        33-35-36-32-26-25-31-41-45-49-80-79-40
3  1            91-92-93-95-120-122   2,136       62    1-20-29-37-127-126-125-124-123-122-121-120-95-94-
                                                        93-69-92-91
routes. Each one of them must pass through three different surveillance areas. Each area is made up
of 4 nearby points, randomly selected from the studied neighborhood. All routes depart from a
common initial point. See Fig. 13.
Table 12
MMAS vs. random walk routes for experiment B
          Ant colony MMAS            Random walk baseline
Alert #   Cost    Time   Optimal?    Cost    Time   Optimal?
1         2,387   66     Yes         N/A
2         1,190   148    Yes         3,215   217    No
3         2,136   62     Yes         4,384   250    No
Table 13
Optimal routes for experiment C
#  Start point  Route points      Route cost  Time   Optimal route
1  1            59-61-86-88       4,546       3,421  1-2-3-4-10-15-16-23-25-31-32-33-35-39-108-107-106-105-
2               32-33-39-40                          104-103-102-101-100-99-98-97-96-95-94-93-122-121-120-
3               93-95-120-122                        95-94-93-122-121-120-119-96-90-89-88-87-86-75-74-73-72-
                                                     68-61-60-59-56-54-53-78-82-81-80-79-40
4  1            11-18-21-22       4,126       990    1-2-3-6-7-12-18-22-27-28-21-19-11-5-2-1-20-29-37-127-126-
5               47-48-52-53                          127-37-127-126-125-124-123-122-121-120-119-118-117-116-
6               113-115-124-125                      115-114-113-102-84-77-54-53-51-48-47-52
All optimal routes shown in Table 13 were found by our ant colony algorithm, while the random
walk algorithm was not able to find a route covering the requested route points within the specified
number of iterations. In general, several routes were calculated for all neighborhoods in the municipality
of Cuautitlán Izcalli, always finding optimal routes, which suggests that the proposed algorithm is a reliable way
of calculating patrolling routes given important points to be covered. These points can be obtained from
the daily patrolling route planning sessions.
We have presented a framework for forecasting, clustering and recommending patrol routes in order to
prevent crime incidents. To our knowledge, this is the first work comprising a single workflow from
raw crime-event data to patrolling route recommendation. Each stage of this framework has been
validated against methods commonly used by commercial state-of-the-art crime analysis systems, such
as Bayesian tendency analysis, Euclidean distance-based clustering, and random walk route generation.
In all cases, we were able to provide a performance improvement, as well as other advantages such
as obtaining valuable information for describing the criminal scenarios under study by using inductive
definitions. We added further flexibility, such as the use of thresholds, which allow us to determine the
level of precision we want in the inductive description of each class, and the handling of several restrictions
to cover real patrolling needs.
We performed two analyses: punctual prediction and tendency analysis, which show that it is possible
to punctually predict one out of four crimes to be perpetrated (crime family, in a specific space and time),
and to predict the place of crime in 66% of cases, despite the noise in the dataset. The tendency analysis
yielded an STRMSE (Spatio-Temporal RMSE) of less than 1.0.
For clustering relevant hotspots, we implemented the ST-DBSCAN algorithm, proposing a space
similarity function based on sector division. This generates better results than the standard one based on
Euclidean distance, taking the following aspects into account: (1) the semantics adapt better to reality
in the context of the type of analysis made, and (2) a higher percentage of noise identification
contributes to reducing the number of elements to analyze.
Our recommendations on route patrolling were based on ant colony systems, which we found to be
efficient and effective for optimizing several kinds of routes. In all cases, we were able to find an optimal
route within a limited number of iterations, while the random walk algorithm found an optimal route
in only a few cases. For Patrolling Area Optimization, the random walk algorithm was not able to find
a patrolling route within the specified number of iterations. These experiments show that computing
the transition probability of an ant based on a pheromone component improves the ability of an
exploration algorithm to find a feasible solution in a short time.
The problems tackled in our experiments can be extended to cover many problems currently arising
in large urban areas around the world. Around 50 iterations were needed to find an optimal route with our
method. The compared method was not able to find an optimal route within this number of iterations.
As future work, there are several paths to explore in this project. First, it is necessary to incorporate
other available information sources. Second, it is of the utmost importance to calculate the optimal
thresholds for the learning process. A statistical analysis of the data included in the supervision sample
would make this task easier, as would exploring evolutionary techniques [38].
Further experimentation with the baseline algorithm to find the number of iterations needed to
obtain an optimal route (if possible) has been left as future work.
Also, as future work, we plan to consider traffic factors affecting patrolling maneuvers, as well as
other factors that impede free vehicular transit and thus affect the response time of a patrol.
Acknowledgments
We acknowledge the support of the Mexican Government (SNI, SIP-IPN, COFAA-IPN, and BEIFI-IPN), and
CONACYT, Red TTL.
References
[1] N. Agmon, Multi-robot patrolling and other multi-robot cooperative tasks: An algorithmic approach, Diss., Bar Ilan University, 2009.
[2] J. Baldwin and A. Bottoms, The urban criminal: A study in Sheffield, London: Tavistock Publications, 1976.
[3] L.V. Baskakova and Y.I. Zhuravlëv, Recognition algorithm models with representative sets and supporting sets systems (in Russian), Zh Vichislitielnoi Matematiki i Matematicheskoi Fiziki 21(5) (1981), 1264–1275.
[4] D. Birant and A. Kut, ST-DBSCAN: An algorithm for clustering spatial-temporal data, Data & Knowledge Engineering 60(1) (2007), 208–221.
[5] C. Block, STAC hot-spot areas: A statistical tool for law enforcement decisions, in: Crime analysis through computer mapping, C.R. Block, M. Dabdoub and S. Fregly, eds, Washington, DC: Police Executive Research Forum, 1995, pp. 15–32.
[6] M.N. Bongard, Solving geological problems using recognition programs, Journal Soviet Geology C 6 (1963), 147–165.
[7] R. Calvo, J.R. de Oliveira, M. Figueiredo and R.A. Romero, Parametric investigation of a distributed strategy for multiple agents systems applied to cooperative tasks, in: Proceedings of the 29th Annual ACM Symposium on Applied Computing (2014), 207–212.
[8] D.L. Capone and W.W. Nichols, Jr., Urban structure and criminal mobility, American Behavioral Scientist 20(2) (1976), 199–213.
[9] J.A. Carrasco-Ochoa, Representative-sets-based Classifiers, Master's Thesis, CINVESTAV-IPN, Mexico, 1994.
[10] E.N. Cheremesina and J. Ruiz-Shulcloper, Cuestiones metodológicas de la aplicación de modelos matemáticos de Reconocimiento de Patrones en zonas del conocimiento poco formalizadas, Revista Ciencias Matemáticas 13(2) (1992), 93–108.
[11] A.M. Colorni, M. Dorigo and V. Maniezzo, Distributed optimization by ant colonies, Actes de la première conférence européenne sur la vie artificielle, Paris, France, Elsevier Publishing, 1992, pp. 134–142.
[12] R.M. Kramer and T.R. Tyler, Trust in organizations: Frontiers of theory and research, Sage (1996).
[13] N. Cressie, Statistics for spatial data, John Wiley & Sons, 2015.
[14] G.B. Dantzig and J.H. Ramser, The truck dispatching problem, Management Science 6(1) (1959), 80–91.
[15] L.A. De-la-Vega-Doria, Extension to the fuzzy case of the KORA-3 algorithm (in Spanish), Master's Thesis, CINVESTAV-IPN, Mexico, 1994.
[16] L.A. De-la-Vega-Doria, J.A. Carrasco-Ochoa and J. Ruiz-Shulcloper, Fuzzy KORA-Ω algorithm, Proceedings of the 6th European Congress on Intelligent Techniques and Soft Computing, EUFIT, Aachen, Germany (1998), 7–10.
[17] E.V. Diukova, On a parametric model of KORA based recognition algorithms (in Russian), Soovshenia po prikladmoi matematiki, Russia, 1998.
[18] M. Dorigo and L.M. Gambardella, Ant colony system: A cooperative learning approach to the traveling salesman problem, IEEE Transactions on Evolutionary Computation 1(1) (1997), 53–66.
[19] J.E. Eck and E.R. Maguire, Have changes in policing reduced violent crime? An assessment of the evidence, The Crime Drop in America (2000), 207–228.
[20] E.S. Fard, K. Monfaredi and M.H. Nadimi, Application methods of ant colony algorithm, Am J Softw Eng Appl 3(2) (2014), 12–20.
[21] S. Godoy-Calderón, H. Calvo, V.M. Martínez-Hernández and M.A. Moreno-Armendáriz, The CR-Ω+ classification algorithm for spatio-temporal prediction of criminal activity, Journal of Applied Research and Technology 8(1) (2010), 5–23.
[22] L. Goldfarb, A new approach to pattern recognition, Progress in Pattern Recognition 2 (1985), 241–402.
[23] C. Ivaha, H. Al-Madfai, G. Higgs, A. Ware and J. Corcoran, The simple spatial disaggregation approach to spatio-temporal crime forecasting, International Journal of Innovative Computing Information and Control 3(3) (2007), 509–523.
[24] E.S. Jefferis, A multi-method exploration of crime hot spots: SaTScan results, National Institute of Justice, Crime Mapping Research Center (1998).
[25] B. Jiang and C. Claramunt, A structural approach to the model generalization of an urban street network, GeoInformatica 8(2) (2004), 157–171.
[26] J.L. LeBeau, The journey to rape: Geographic distance and the rapist's method of approaching the victim, Journal of Police Science & Administration (1987).
[27] N. Levine, Hot Spot analysis using CrimeStat kernel density interpolation, in: Presentation at the Annual Meeting of the Academy of Criminal Justice Sciences (1998), 10–14.
[28] L. Liu, ed., Artificial Crime Analysis Systems: Using Computer Simulations and Geographic Information Systems, IGI Global, 2008.
[29] V. Maniezzo and A. Colorni, The ant system applied to the quadratic assignment problem, IEEE Transactions on Knowledge and Data Engineering 11(5) (1999), 769–778.
[30] J.F. Martínez-Trinidad and A. Guzmán-Arenas, The logical combinatorial approach to pattern recognition, an overview through selected works, Pattern Recognition 34(4) (2001), 741–751.
[31] T. Molumby, Patterns of crime in a university housing project, American Behavioral Scientist 20(2) (1976), 247–259.
[32] S.V. Nath, Crime pattern detection using data mining, in: Web Intelligence and Intelligent Agent Technology Workshops, 2006, WI-IAT 2006 Workshops, 2006 IEEE/WIC/ACM International Conference, IEEE (2006), 41–44.
[33] O. Newman, Defensible space: Crime prevention through urban design, Ekistics 1 (1973), 325–332.
[34] D. Portugal and R.P. Rocha, Cooperative multi-robot patrol in an indoor infrastructure, in: Human Behavior Understanding in Networked Sensing, Springer International Publishing (2014), 339–358.
[35] T.A. Repetto, Residential crime, Ballinger, Springfield, IL, 1974.
[36] D.K. Rossmo, Target patterns of serial murders: A methodological model, American Journal of Criminal Justice 17(2) (1993), 1–21.
[37] D.K. Rossmo, Targeting victims: Serial killers and the urban environment, Serial and Mass Murder: Theory, Research and Policy (1996), 133–153.
[38] G. Sanchez-Diaz, G. Diaz-Sanchez, M. Mora-Gonzalez, I. Piza-Davila, C.A. Aguirre-Salado, G. Huerta-Cuellar, O. Reyes-Cardenas and A. Cardenas-Tristan, An evolutionary algorithm with acceleration operator to generate a subset of typical testors, Pattern Recognition Letters 41 (2014), 34–42.
[39] H.A. Scarr, J.L. Pinsky and D.S. Wyatt, Patterns of burglary, Washington, DC: National Institute of Law Enforcement and Criminal Justice, 1973.
[40] A. Shafahi and A. Haghani, Balanced routing of patrolling vehicles focusing on areas with historical crime, in: Transportation Research Board 94th Annual Meeting 2015 (2015), (No. 15-4387).
[41] T. Stützle and H.H. Hoos, MAX-MIN ant system, Future Generation Computer Systems 16(8) (2000), 889–914.
[42] N.E. Toklu, L.M. Gambardella and R. Montemanni, A multiple ant colony system for a vehicle routing problem with time windows and uncertain travel times, Journal of Traffic and Logistics Engineering 2(1) (2014), 52–58.
[43] T. Watanabe and M. Takamiya, Police patrol routing on network voronoi diagram, Proceedings of the 8th International Conference on Ubiquitous Information Management and Communication, ACM (2014).