Dave Foote / CAPP30254-1
Commit 4ce3f49b authored Apr 07, 2019 by Dave Foote
can i push from here?
Showing 1 changed file with 60 additions and 0 deletions
pa1.py 0 → 100644
'''
Programming Assignment #1: Diagnostic
Dave Foote
'''
import pandas as pd
import matplotlib.pyplot as plt
from sodapy import Socrata

#data download helpers
def get_full_dataset(data_set_id):
    #goes to Chicago Open Data Portal and downloads full crime dataset for a
    #given year
    domain = 'data.cityofchicago.org'
    client = Socrata(domain, None)
    offset = 0
    limit = 50000
    new_downloads = make_new_downloads(data_set_id, client, limit, offset)
    rv = []
    while len(new_downloads) > 0:
        offset += limit
        rv.extend(new_downloads)
        new_downloads = make_new_downloads(data_set_id, client, limit, offset)

    return rv

def make_new_downloads(data_set_id, client, limit, offset):
    #downloads the next `limit` rows, starting at row `offset`
    return client.get(data_set_id, limit=limit, offset=offset)

def create_df(list_of_dicts):
    #takes the full list of crime records and makes a df out of them
    return pd.DataFrame(list_of_dicts)

#get the data
id_17 = 'd62x-nvdr'
id_18 = '3i3m-jwuy'
df_17 = create_df(get_full_dataset(id_17))
df_18 = create_df(get_full_dataset(id_18))

#summary statistics:
print('5 Most Common Chicago Crimes in 2017: ',
      df_17.primary_type.value_counts().head())
print('5 Most Common Chicago Crimes in 2018: ',
      df_18.primary_type.value_counts().head())
print('5 Wards with Highest Volume of Crime in 2017: ',
      df_17.ward.value_counts().head())
print('5 Wards with Highest Volume of Crime in 2018: ',
      df_18.ward.value_counts().head())
arrest_rates_17 = df_17.arrest.value_counts()
arrest_rates_18 = df_18.arrest.value_counts()
print('Arrests Per Stop in 2017: ',
      (arrest_rates_17[1] / arrest_rates_17[0]))
print('Arrests Per Stop in 2018: ',
      (arrest_rates_18[1] / arrest_rates_18[0]))
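
The paging loop in get_full_dataset can be exercised without touching the Chicago Open Data Portal. The sketch below is not part of the commit: `fetch_all` and `FakeClient` are hypothetical names introduced here, and `FakeClient` only imitates the `get(data_set_id, limit=..., offset=...)` call that pa1.py makes on the sodapy client. It shows the same stop condition (an empty page ends the loop) on a tiny in-memory dataset.

```python
def fetch_all(get_page, limit=50000):
    """Collect every row from a paged source.

    get_page is any callable taking (limit, offset) and returning a list of
    rows; an empty list is treated as the end of the dataset.
    """
    rows = []
    offset = 0
    while True:
        page = get_page(limit, offset)
        if not page:
            break
        rows.extend(page)
        offset += limit
    return rows


class FakeClient:
    """Hypothetical stand-in for the sodapy client, for offline testing only."""

    def __init__(self, records):
        self.records = records
        self.calls = 0  # number of page requests made

    def get(self, data_set_id, limit, offset):
        # Mimics paging by slicing an in-memory list; data_set_id is ignored.
        self.calls += 1
        return self.records[offset:offset + limit]


client = FakeClient([{"id": i} for i in range(7)])
rows = fetch_all(
    lambda limit, offset: client.get("fake-id", limit=limit, offset=offset),
    limit=3,
)
print(len(rows), client.calls)  # prints "7 4"
```

Seven records at three per page take four requests: three pages with data and one empty page that ends the loop. Against the real portal, paging without an explicit sort order may not be stable between requests, so passing an ordering to the request is worth considering; that is a suggestion, not something the commit does.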