DAT 375 Project One
School: Southern New Hampshire University
Course: 375
Subject: Industrial Engineering
Date: Dec 6, 2023
Type: docx
Pages: 3
Nathan Cumbo
Data Analysis Process Job Aid
This job aid is intended as a direct reference for newly hired data analysts at the Miami Police
Department when working on the Storm and Crime Data Report (SCDR).
In 1984, John Naisbitt said, “We are drowning in information, but starved for knowledge.”
Nearly four decades later, this is still true, if not more so. A large part of the reason is not
a lack of data, but rather a lack of skilled data analysts in the field. According to a McKinsey
report, there will be a shortage of the talent organizations need to take advantage of big data
(Larose & Larose, 2015). Data mining and analysis are incredibly useful, but they require human
direction to be as efficient and effective as possible.
Large companies around the world use data mining and predictive analysis to increase
productivity, efficiency, accuracy, and precision, which leads to less wasted time and fewer
wasted resources. So, what exactly does that entail? Data mining is defined as “the process of
discovering patterns and trends in large data sets.” This is where you, our newly hired data
analysts, come into play.
Let’s talk about the types of analysis we are going to use to create a Storm and Crime
Data Report (SCDR). Since we are using data from a few years back to build a prediction model
of future crime rates, this calls for historical and predictive analyses, but above all for
exploratory analysis. Predictive analysis is defined as “the process of extracting information
from large data sets in order to make predictions and estimates about future outcomes,” whereas
exploratory data analysis (EDA) allows an analyst to delve into the data set, examine the
interrelationships among the attributes, identify interesting subsets of the observations, and
develop an initial idea of possible associations among the predictors, as well as between the
predictors and the target variable.
Next, let’s explore two specific angles of analysis: “Are certain storms connected with an
increase in certain crimes?” and “Is the severity of a storm correlated with the severity of a
crime?” To answer these questions, we require both qualitative and quantitative analyses; the
quantifiable data will include the percentage increase in crime rate during storms.
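As a rough illustration of the quantitative side, the percentage increase in crime rate during storms can be computed from a baseline rate and a storm-period rate. The sketch below is only an assumption about how that figure might be calculated; the function name `percent_increase` and the example rates are hypothetical and are not taken from the SCDR data.

```python
def percent_increase(baseline_rate: float, storm_rate: float) -> float:
    """Return the percentage change of storm_rate relative to baseline_rate."""
    if baseline_rate <= 0:
        # A zero or negative baseline makes the percentage undefined.
        raise ValueError("baseline_rate must be positive")
    return (storm_rate - baseline_rate) / baseline_rate * 100.0


# Hypothetical figures for illustration only:
# 40 crimes per day on non-storm days versus 46 crimes per day on storm days.
example = percent_increase(40.0, 46.0)
print(f"{example:.1f}% change in crime rate during storms")
```

A negative result would indicate that the crime rate fell during storms rather than rose.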
When we run the analysis in MySQL, the script should select only events where both
StormEventID and CrimeEventID are not null, which represents a crime that occurred during a
storm. Then we need to narrow the search further; in this scenario, the search parameters
include only crimes in the city of Miami, Florida, during October 2019.
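The filter described above might look something like the following sketch. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for the MySQL server so that it runs anywhere; the WHERE clause uses only standard SQL. Only StormEventID and CrimeEventID come from this job aid. The table name `events`, the other column names, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL server.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: only StormEventID and CrimeEventID are named in the job aid.
conn.execute(
    """
    CREATE TABLE events (
        StormEventID INTEGER,
        CrimeEventID INTEGER,
        City TEXT,
        State TEXT,
        EventDate TEXT,
        CrimeType TEXT
    )
    """
)

# Made-up sample rows for illustration; dates are ISO 'YYYY-MM-DD' strings.
sample_rows = [
    (1, 101, "Miami", "Florida", "2019-10-03", "Burglary"),
    (None, 102, "Miami", "Florida", "2019-10-05", "Burglary"),       # no storm
    (2, None, "Miami", "Florida", "2019-10-07", None),               # no crime
    (3, 103, "Orlando", "Florida", "2019-10-09", "Violent Crimes"),  # other city
    (4, 104, "Miami", "Florida", "2019-11-02", "Violent Crimes"),    # other month
    (5, 105, "Miami", "Florida", "2019-10-20", "Violent Crimes"),
]
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?, ?)", sample_rows)

# Keep only crimes that occurred during a storm in Miami, Florida, in October 2019.
query = """
    SELECT CrimeEventID, StormEventID, CrimeType, EventDate
    FROM events
    WHERE StormEventID IS NOT NULL
      AND CrimeEventID IS NOT NULL
      AND City = 'Miami'
      AND State = 'Florida'
      AND EventDate >= '2019-10-01'
      AND EventDate < '2019-11-01'
    ORDER BY EventDate
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The half-open date range (on or after October 1, before November 1) avoids having to know the last day of the month and behaves the same way if the date column also stores a time of day.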
After running this script, we see that the three most common crimes in this sample are
‘Violent Crimes’, ‘Burglary’, and ‘Murder and non-negligent manslaughter’. However, we also
need to ensure the validity of this data. This leads to some important questions: Is a single
month enough data to establish correlations? Do we need a longer time period? How valid is this
data? In other words, where did this data come from?
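One way to tally the most common crimes in a result set, and to keep the sample size in view when asking whether one month is enough, is sketched below. The list `crime_types` is a made-up stand-in for the crime-type column returned by the script; the counts are illustrative only and are not the actual SCDR results.

```python
from collections import Counter

# Hypothetical crime-type labels standing in for the script's output.
crime_types = (
    ["Violent Crimes"] * 4
    + ["Burglary"] * 3
    + ["Murder and non-negligent manslaughter"] * 2
    + ["Arson"] * 1
)

# Count each crime type and keep the three most frequent.
counts = Counter(crime_types)
top_three = counts.most_common(3)

# Report each count alongside its share of the sample, so a small
# sample size is visible next to the ranking.
total = len(crime_types)
for crime, count in top_three:
    share = count / total * 100.0
    print(f"{crime}: {count} of {total} events ({share:.0f}%)")
```

Printing the total next to each count makes it easier to judge whether a ranking rests on enough events to be meaningful.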