




Bloomberg ranked Denver No. 8 among the most-moved-to cities in the
United States from April 2020 to October 2020. The analysis compared
the number of people leaving and entering zip codes among the 174
million U.S.-based LinkedIn users. For every person who moved out of
Denver in that period, 1.34 people moved in, the study found. Reasons
for relocation range from remote office work rendering working in any
one place meaningless to seeking fewer


Demand for basic necessities has grown during the pandemic,
especially in Denver City, where the population is increasing and,
with it, the needs of every neighborhood. Crime still occurs in every
part of Denver even during the pandemic, so it is one of the factors
to consider when establishing a business. Most people leave their
homes only to buy food and groceries at the nearest supermarket or
grocery store. Because demand for basic necessities is high, most
stores are full of customers, and many people spend a lot of time
queuing outside. This may be a good reason for a stakeholder to be
interested in opening a supermarket or grocery store in Denver City,
Colorado.


This project aims to find a safe and secure location to start a
business in Denver City, Colorado during the pandemic. The first step
is to choose an area with the lowest crime rate by analyzing crime
data. The next step is to shortlist neighborhoods where similar
businesses are not common. In this report, we use data science tools
to analyze the data, select the safest place to open a commercial
establishment, explore its neighborhoods, and get the 10 most common
venues in each neighborhood where similar businesses are not common
in the area.



2.1. Data Sources

Most of the data used in this project comes from Denver City’s Open
Data Portal, which is free to use. Other sources, such as Wikipedia,
were also used.

City to be analyzed in this project: Denver City, Colorado

The following data are needed to answer the business problem:


Crime Data in Denver City – data set from Denver City’s Open Data
Portal (https://www.denvergov.org/), which is free to use. This data
set is needed to locate and analyze which specific areas have the
lowest crime rate. Each record stores the type of crime committed and
its location, given as longitude and latitude.

Dataset source:




The dataset contains the crimes committed in Denver from 2015 to
2020. It has almost 500,000 rows and 19 columns. Because of its size,
we will process only the records from 2019 to 2020,

as we all know that we are getting the dataset since pandemic


These are the columns needed to answer the business problem:

• FIRST_OCCURRENCE_DATE - date on which the crime first occurred
• OFFENSE_TYPE_ID - type of crime committed
• GEO_LON - longitude of the area where the crime was committed
• GEO_LAT - latitude of the area where the crime was committed
• NEIGHBORHOOD_ID - name of the neighborhood where the crime was
committed
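As a minimal sketch of the filtering step described above, the snippet below keeps only the five listed columns and only the rows whose first occurrence falls in 2019 or 2020. It uses only the Python standard library. The sample rows, the `DATE_FORMAT` string, and the helper name `filter_crimes` are assumptions made for illustration and are not taken from the actual dataset.

```python
import csv
import io
from datetime import datetime

# Columns named in the report as needed for the analysis.
NEEDED_COLUMNS = [
    "FIRST_OCCURRENCE_DATE",
    "OFFENSE_TYPE_ID",
    "GEO_LON",
    "GEO_LAT",
    "NEIGHBORHOOD_ID",
]

# Assumed timestamp layout; adjust it to match the real export.
DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def filter_crimes(rows, years=(2019, 2020)):
    """Keep only the needed columns for rows that occurred in the given years."""
    kept = []
    for row in rows:
        occurred = datetime.strptime(row["FIRST_OCCURRENCE_DATE"], DATE_FORMAT)
        if occurred.year in years:
            kept.append({col: row[col] for col in NEEDED_COLUMNS})
    return kept


# Tiny made-up sample standing in for the real CSV file.
SAMPLE_CSV = """FIRST_OCCURRENCE_DATE,OFFENSE_TYPE_ID,GEO_LON,GEO_LAT,NEIGHBORHOOD_ID,OTHER
6/15/2016 11:31:00 PM,theft-other,-104.99,39.74,five-points,x
3/2/2019 8:05:00 AM,burglary,-104.98,39.73,capitol-hill,y
9/20/2020 1:15:00 PM,theft-other,-104.88,39.77,stapleton,z
"""

crimes = filter_crimes(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(len(crimes), "rows kept")
```

With the real file, the `io.StringIO` wrapper would be replaced by an opened file handle; the rest of the logic stays the same.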



List of Neighborhoods in Denver City – this data set also comes from
Denver City’s Open Data Portal (https://www.denvergov.org/). It
contains the official list of neighborhoods in Denver and will be
used to map the existing data so that each neighborhood can be
assigned to its respective borough.

Dataset source:





Neighborhood coordinates (longitude and latitude) – After we get the
list of neighborhoods in Denver City, we will get the coordinates of
each neighborhood. To fetch this data, we will use Geocoder, then
explore the neighborhoods by plotting them on maps using Folium and
perform exploratory data analysis.


Shortlist of commercial establishments in the neighborhoods of
Denver, with coordinates – This data will be fetched using the
Foursquare API to explore the neighborhood venues. We will then apply
a machine learning algorithm to cluster the neighborhoods and present
the findings by plotting them on maps using Folium.


Figure 1. Geographical Map of Denver City with Neighborhoods
labeled with blue circles

Figure 2. Top Five Neighborhoods with highest crime counts in

Denver City, Colorado

Figure 3. Top Five Neighborhoods with lowest crime counts in

Denver City, Colorado

Figure 2 shows the five neighborhoods with the highest crime counts
recorded in 2020: Five Points, Capitol Hill, Stapleton (now Central
Park), CBD (Central Business District), and Montbello. Both the
results and their visualization show that Stapleton has one of the
highest crime counts, which may be related to the Black Lives Matter
movement in September. In 2020, in the wake of the Black Lives Matter
movement and to condemn the legacy of former Denver mayor Benjamin F.
Stapleton, a member of the Ku Klux Klan, the community formally
changed the name of the neighborhood from 'Stapleton' to 'Central
Park'.

Figure 3 shows the five neighborhoods with the lowest crime counts in
2020: Skyland, Rosedale, Country Club, Indian Creek, and Wellshire.
Although this data already indicates where the safest places to
establish a business are, we still need a more in-depth analysis to
decide based on the data we gathered.
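The rankings behind Figures 2 and 3 can be sketched as a simple count per neighborhood. The example below is an illustration only: the neighborhood labels `a` to `h` and their counts are made up, and the report does not state which tool produced the actual figures.

```python
from collections import Counter

# Made-up neighborhood labels, one entry per recorded crime.
crime_neighborhoods = (
    ["a"] * 9 + ["b"] * 7 + ["c"] * 6 + ["d"] * 5
    + ["e"] * 4 + ["f"] * 3 + ["g"] * 2 + ["h"] * 1
)

counts = Counter(crime_neighborhoods)

# most_common() sorts from the highest to the lowest count.
ranked = counts.most_common()
highest_five = ranked[:5]
lowest_five = sorted(ranked, key=lambda item: item[1])[:5]

print("Highest:", highest_five)
print("Lowest:", lowest_five)
```

A neighborhood with no recorded crimes would not appear in the counter at all, so the "lowest five" here are the lowest among neighborhoods with at least one record.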

The City and County of Denver, capital of the U.S. state of

Colorado, has 78 official neighborhoods. In addition to the

official administrative neighborhoods, many residents have names

for local neighborhoods that may not conform to the boundaries of

official neighborhoods. Denver City has 9 boroughs: Central, East,
North, Northeast, Northwest, South, Southeast, Southwest, and West.

Figure 4. Boroughs in Denver City with Highest Crime counts
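Borough-level totals like those in Figure 4 could be obtained by assigning each neighborhood to its borough and summing the counts. This is a sketch under stated assumptions: the `borough_of` lookup, the neighborhood names, and the counts are hypothetical placeholders, not the real mapping or data.

```python
from collections import defaultdict

# Hypothetical neighborhood-to-borough lookup; the real one would come
# from the official neighborhood list.
borough_of = {
    "neighborhood-a": "Central",
    "neighborhood-b": "Central",
    "neighborhood-c": "East",
    "neighborhood-d": "North",
}

# Made-up crime counts per neighborhood.
crime_counts = {
    "neighborhood-a": 120,
    "neighborhood-b": 80,
    "neighborhood-c": 45,
    "neighborhood-d": 60,
}

borough_totals = defaultdict(int)
for neighborhood, count in crime_counts.items():
    borough_totals[borough_of[neighborhood]] += count

# Sort boroughs from most to fewest recorded crimes.
ranked_boroughs = sorted(
    borough_totals.items(), key=lambda kv: kv[1], reverse=True
)
print(ranked_boroughs)
```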

We use the Foursquare API to explore nearby establishments (venues)
and build a data frame that contains the most popular venues around
each neighborhood in Denver City.

Figure 5. Using Foursquare API

Due to HTTP request limitations, the number of places per
neighborhood is set to 100 and the radius parameter is set to 500.
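To make the two parameters concrete, here is a minimal sketch of how a request URL with `limit=100` and `radius=500` might be assembled. The endpoint is an assumption: it is the legacy Foursquare v2 "explore" route, which the report does not name explicitly. The function name `build_explore_url`, the version date, the credentials, and the coordinates are placeholders. No request is actually sent.

```python
from urllib.parse import urlencode

# Assumed endpoint: the legacy Foursquare v2 "explore" route.
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lon, client_id, client_secret,
                      version="20201231", radius=500, limit=100):
    """Build a venue-search URL for one neighborhood; no request is sent."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,           # API version date (assumed value)
        "ll": f"{lat},{lon}",   # latitude,longitude of the neighborhood
        "radius": radius,       # search radius, in meters for this API
        "limit": limit,         # maximum venues returned per request
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"


# Placeholder credentials and approximate coordinates for central Denver.
url = build_explore_url(39.7392, -104.9903, "YOUR_ID", "YOUR_SECRET")
print(url)
```

Sending the request would additionally require an HTTP client and valid credentials, and the legacy endpoint may not be available to newer accounts.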

Data modeling is a process used to define and analyze the data
requirements needed to support the business processes within the
scope of corresponding information systems in organizations. In this
project, to analyze the data we gathered, we must model it to
identify the relationships among the variables and to apply the
available data model patterns.

Figure 6. Data Modelling

To compare the similarities of different neighborhoods in Denver
City, we explore the neighborhoods, segment them, and group them into
clusters to find the top establishments around each neighborhood. To
do that, we cluster the data using the k-means clustering algorithm,
a form of unsupervised machine learning.
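Finding the top establishments around each neighborhood amounts to counting venue categories per neighborhood and keeping the most frequent ones. The sketch below illustrates this with made-up (neighborhood, category) pairs; the helper name `top_categories` and all sample values are assumptions.

```python
from collections import Counter, defaultdict

# Made-up (neighborhood, venue category) pairs standing in for API results.
venues = [
    ("neighborhood-a", "Coffee Shop"),
    ("neighborhood-a", "Coffee Shop"),
    ("neighborhood-a", "Grocery Store"),
    ("neighborhood-a", "Park"),
    ("neighborhood-b", "Park"),
    ("neighborhood-b", "Park"),
    ("neighborhood-b", "Pizza Place"),
]

# One Counter of categories per neighborhood.
category_counts = defaultdict(Counter)
for neighborhood, category in venues:
    category_counts[neighborhood][category] += 1


def top_categories(neighborhood, n=10):
    """Return up to n most common venue categories for a neighborhood."""
    return [cat for cat, _ in category_counts[neighborhood].most_common(n)]


for name in list(category_counts):
    print(name, top_categories(name))
```

A neighborhood where "Grocery Store" or "Supermarket" is absent from this list would be a candidate location under the report's criterion that similar businesses should not be common in the area.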

K-means clustering is one of the simplest and most popular
unsupervised machine learning algorithms. The objective of K-means is
simple: group similar data points together and discover underlying
patterns. To achieve this objective, K-means looks for a fixed number
(k) of clusters in a dataset. A cluster refers to a collection of
data points aggregated together because of certain similarities.

Figure 7. Using K-means Clustering Approach

Figure 8. K-means Clustering Approach (Most Common Venues Around Each
Neighborhood in Denver City)

Figure 9. The Generated Map by using K-means Clustering Approach
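For readers unfamiliar with the algorithm, the following is a minimal pure-Python sketch of k-means (Lloyd's algorithm). It is not the code used to produce the figures above; the report clusters the neighborhoods into five groups, while this toy example uses k=2 on made-up two-dimensional points so the result is easy to check. The function names and the sample points are assumptions.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm: returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the input points
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return centroids, labels


# Made-up feature vectors, e.g. shares of two venue categories.
points = [
    (0.9, 0.1), (0.8, 0.2), (1.0, 0.0),
    (0.1, 0.9), (0.2, 0.8), (0.0, 1.0),
]
centroids, labels = kmeans(points, k=2)
print(labels)
```

In practice each neighborhood would be represented by a longer vector of venue-category frequencies, and a library implementation would normally be preferred over a hand-written loop.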


This project shows that graphical representation and interpretation
of data can help us decide how to answer a problem. Here, we wanted
to know the best place to open a business such as a supermarket or
grocery store that sells basic necessities during the pandemic. By
applying data science tools and a statistical approach, we can see
how much a large data set can tell us. In this project, we can see
which neighborhoods have the fewest crimes committed, and we also
know which boroughs and neighborhoods had the highest number of
crimes in 2020. We also know what kinds of businesses are already
established in every neighborhood. Through k-means clustering, we
grouped the neighborhoods into five clusters and showed them on a
map. In this way, a stakeholder will not have a hard time choosing
where to build a business, because this project breaks down the best
possible choices. These choices are not simply our own selection;
they are based on the data set we cleaned, processed, analyzed, and
interpreted.
