Today, we'll use a real dataset and lots of motivating questions to illustrate key DataFrame manipulation techniques.
pandas
¶pandas
.pandas
is the tool for doing data science in Python.pandas
is not so cute...¶babypandas
!¶pandas
called babypandas
.pandas
code.babypandas
🐼¶babypandas
(and pandas
) are called "DataFrames."babypandas
. (We'll need numpy
as well.)import babypandas as bpd
import numpy as np
data/get-it-done-jan-16.csv
contains service requests made on January 16, 2023 through the Get It Done program. We can read in a CSV using bpd.read_csv(...)
. Give it the path to a file relative to your notebook (if the file is in the same folder as your notebook, this is just the name of the file).
jan_16 = bpd.read_csv('data/get-it-done-jan-16.csv')
jan_16
service_request_id | date_requested | neighborhood | service | status | street_address | public_description | |
---|---|---|---|---|---|---|---|
0 | 4065025 | 2023-01-16T00:03:00 | Pacific Beach | Parking | open | 4630 Bayard St | Cars blocking alleyway, not on private propert... |
1 | 4065026 | 2023-01-16T00:33:00 | Otay Mesa-Nestor | Encampment | open | 615 Saturn Blvd, San Diego, CA 92154, USA | There are encampments located behind I - hop, ... |
2 | 4065027 | 2023-01-16T00:37:00 | Otay Mesa-Nestor | Encampment | open | 1862 Palm Ave, San Diego, CA 92154, USA | Encampments with large group of encampment ind... |
3 | 4065028 | 2023-01-16T00:46:00 | Old Town San Diego | Street Sweeping | open | 4400 TAYLOR ST, 92110 | ROCKSLIDE- COVERING BOTH LANES - BOBCAT NEEDED... |
4 | 4065029 | 2023-01-16T00:52:00 | Mid-City:City Heights | Other | open | 3860 Cherokee Ave | House near me has a car that multiple times a... |
... | ... | ... | ... | ... | ... | ... | ... |
1355 | 4066532 | 2023-01-16T22:09:00 | Navajo | Other | closed | 6735 Wandermere Dr | Cars doing donuts in st Dunsteen's church park... |
1356 | 4066544 | 2023-01-16T22:19:00 | Mid-City:City Heights | Other | closed | 3949 33rd Street | Destruction of property. Man camped on 33rd St... |
1357 | 4066547 | 2023-01-16T22:21:00 | Torrey Pines | Pothole | closed | I-5, Del Mar, CA 92014, USA | There is a big pot hole that made me pop my ti... |
1358 | 4066564 | 2023-01-16T22:43:00 | Southeastern San Diego | Pothole | closed | CA-94, San Diego, CA 92102, USA | Pothole is getting bigger with the rain and yo... |
1359 | 4066565 | 2023-01-16T22:43:00 | Greater Golden Hill | Pothole | closed | CA-94, San Diego, CA 92102, USA | Pothole on the off-ramp to 30th. Popped my tir... |
1360 rows × 7 columns
type
.'neighborhood'
and 'status'
.# This DataFrame has 1360 rows and 7 columns
jan_16
service_request_id | date_requested | neighborhood | service | status | street_address | public_description | |
---|---|---|---|---|---|---|---|
0 | 4065025 | 2023-01-16T00:03:00 | Pacific Beach | Parking | open | 4630 Bayard St | Cars blocking alleyway, not on private propert... |
1 | 4065026 | 2023-01-16T00:33:00 | Otay Mesa-Nestor | Encampment | open | 615 Saturn Blvd, San Diego, CA 92154, USA | There are encampments located behind I - hop, ... |
2 | 4065027 | 2023-01-16T00:37:00 | Otay Mesa-Nestor | Encampment | open | 1862 Palm Ave, San Diego, CA 92154, USA | Encampments with large group of encampment ind... |
3 | 4065028 | 2023-01-16T00:46:00 | Old Town San Diego | Street Sweeping | open | 4400 TAYLOR ST, 92110 | ROCKSLIDE- COVERING BOTH LANES - BOBCAT NEEDED... |
4 | 4065029 | 2023-01-16T00:52:00 | Mid-City:City Heights | Other | open | 3860 Cherokee Ave | House near me has a car that multiple times a... |
... | ... | ... | ... | ... | ... | ... | ... |
1355 | 4066532 | 2023-01-16T22:09:00 | Navajo | Other | closed | 6735 Wandermere Dr | Cars doing donuts in st Dunsteen's church park... |
1356 | 4066544 | 2023-01-16T22:19:00 | Mid-City:City Heights | Other | closed | 3949 33rd Street | Destruction of property. Man camped on 33rd St... |
1357 | 4066547 | 2023-01-16T22:21:00 | Torrey Pines | Pothole | closed | I-5, Del Mar, CA 92014, USA | There is a big pot hole that made me pop my ti... |
1358 | 4066564 | 2023-01-16T22:43:00 | Southeastern San Diego | Pothole | closed | CA-94, San Diego, CA 92102, USA | Pothole is getting bigger with the rain and yo... |
1359 | 4066565 | 2023-01-16T22:43:00 | Greater Golden Hill | Pothole | closed | CA-94, San Diego, CA 92102, USA | Pothole on the off-ramp to 30th. Popped my tir... |
1360 rows × 7 columns
.set_index(column_name)
..set_index
returns a new DataFrame; it does not modify the original DataFrame.jan_16.set_index('service_request_id')
date_requested | neighborhood | service | status | street_address | public_description | |
---|---|---|---|---|---|---|
service_request_id | ||||||
4065025 | 2023-01-16T00:03:00 | Pacific Beach | Parking | open | 4630 Bayard St | Cars blocking alleyway, not on private propert... |
4065026 | 2023-01-16T00:33:00 | Otay Mesa-Nestor | Encampment | open | 615 Saturn Blvd, San Diego, CA 92154, USA | There are encampments located behind I - hop, ... |
4065027 | 2023-01-16T00:37:00 | Otay Mesa-Nestor | Encampment | open | 1862 Palm Ave, San Diego, CA 92154, USA | Encampments with large group of encampment ind... |
4065028 | 2023-01-16T00:46:00 | Old Town San Diego | Street Sweeping | open | 4400 TAYLOR ST, 92110 | ROCKSLIDE- COVERING BOTH LANES - BOBCAT NEEDED... |
4065029 | 2023-01-16T00:52:00 | Mid-City:City Heights | Other | open | 3860 Cherokee Ave | House near me has a car that multiple times a... |
... | ... | ... | ... | ... | ... | ... |
4066532 | 2023-01-16T22:09:00 | Navajo | Other | closed | 6735 Wandermere Dr | Cars doing donuts in st Dunsteen's church park... |
4066544 | 2023-01-16T22:19:00 | Mid-City:City Heights | Other | closed | 3949 33rd Street | Destruction of property. Man camped on 33rd St... |
4066547 | 2023-01-16T22:21:00 | Torrey Pines | Pothole | closed | I-5, Del Mar, CA 92014, USA | There is a big pot hole that made me pop my ti... |
4066564 | 2023-01-16T22:43:00 | Southeastern San Diego | Pothole | closed | CA-94, San Diego, CA 92102, USA | Pothole is getting bigger with the rain and yo... |
4066565 | 2023-01-16T22:43:00 | Greater Golden Hill | Pothole | closed | CA-94, San Diego, CA 92102, USA | Pothole on the off-ramp to 30th. Popped my tir... |
1360 rows × 6 columns
jan_16
service_request_id | date_requested | neighborhood | service | status | street_address | public_description | |
---|---|---|---|---|---|---|---|
0 | 4065025 | 2023-01-16T00:03:00 | Pacific Beach | Parking | open | 4630 Bayard St | Cars blocking alleyway, not on private propert... |
1 | 4065026 | 2023-01-16T00:33:00 | Otay Mesa-Nestor | Encampment | open | 615 Saturn Blvd, San Diego, CA 92154, USA | There are encampments located behind I - hop, ... |
2 | 4065027 | 2023-01-16T00:37:00 | Otay Mesa-Nestor | Encampment | open | 1862 Palm Ave, San Diego, CA 92154, USA | Encampments with large group of encampment ind... |
3 | 4065028 | 2023-01-16T00:46:00 | Old Town San Diego | Street Sweeping | open | 4400 TAYLOR ST, 92110 | ROCKSLIDE- COVERING BOTH LANES - BOBCAT NEEDED... |
4 | 4065029 | 2023-01-16T00:52:00 | Mid-City:City Heights | Other | open | 3860 Cherokee Ave | House near me has a car that multiple times a... |
... | ... | ... | ... | ... | ... | ... | ... |
1355 | 4066532 | 2023-01-16T22:09:00 | Navajo | Other | closed | 6735 Wandermere Dr | Cars doing donuts in st Dunsteen's church park... |
1356 | 4066544 | 2023-01-16T22:19:00 | Mid-City:City Heights | Other | closed | 3949 33rd Street | Destruction of property. Man camped on 33rd St... |
1357 | 4066547 | 2023-01-16T22:21:00 | Torrey Pines | Pothole | closed | I-5, Del Mar, CA 92014, USA | There is a big pot hole that made me pop my ti... |
1358 | 4066564 | 2023-01-16T22:43:00 | Southeastern San Diego | Pothole | closed | CA-94, San Diego, CA 92102, USA | Pothole is getting bigger with the rain and yo... |
1359 | 4066565 | 2023-01-16T22:43:00 | Greater Golden Hill | Pothole | closed | CA-94, San Diego, CA 92102, USA | Pothole on the off-ramp to 30th. Popped my tir... |
1360 rows × 7 columns
jan_16 = jan_16.set_index('service_request_id')
jan_16
date_requested | neighborhood | service | status | street_address | public_description | |
---|---|---|---|---|---|---|
service_request_id | ||||||
4065025 | 2023-01-16T00:03:00 | Pacific Beach | Parking | open | 4630 Bayard St | Cars blocking alleyway, not on private propert... |
4065026 | 2023-01-16T00:33:00 | Otay Mesa-Nestor | Encampment | open | 615 Saturn Blvd, San Diego, CA 92154, USA | There are encampments located behind I - hop, ... |
4065027 | 2023-01-16T00:37:00 | Otay Mesa-Nestor | Encampment | open | 1862 Palm Ave, San Diego, CA 92154, USA | Encampments with large group of encampment ind... |
4065028 | 2023-01-16T00:46:00 | Old Town San Diego | Street Sweeping | open | 4400 TAYLOR ST, 92110 | ROCKSLIDE- COVERING BOTH LANES - BOBCAT NEEDED... |
4065029 | 2023-01-16T00:52:00 | Mid-City:City Heights | Other | open | 3860 Cherokee Ave | House near me has a car that multiple times a... |
... | ... | ... | ... | ... | ... | ... |
4066532 | 2023-01-16T22:09:00 | Navajo | Other | closed | 6735 Wandermere Dr | Cars doing donuts in st Dunsteen's church park... |
4066544 | 2023-01-16T22:19:00 | Mid-City:City Heights | Other | closed | 3949 33rd Street | Destruction of property. Man camped on 33rd St... |
4066547 | 2023-01-16T22:21:00 | Torrey Pines | Pothole | closed | I-5, Del Mar, CA 92014, USA | There is a big pot hole that made me pop my ti... |
4066564 | 2023-01-16T22:43:00 | Southeastern San Diego | Pothole | closed | CA-94, San Diego, CA 92102, USA | Pothole is getting bigger with the rain and yo... |
4066565 | 2023-01-16T22:43:00 | Greater Golden Hill | Pothole | closed | CA-94, San Diego, CA 92102, USA | Pothole on the off-ramp to 30th. Popped my tir... |
1360 rows × 6 columns
.shape
returns the number of rows and columns in a given DataFrame.[]
: .shape[0]
for rows..shape[1]
for columns.# There were 7 columns before, but one of them became the index, and the index is not a column!
jan_16.shape
(1360, 6)
# Number of rows
jan_16.shape[0]
1360
# Number of columns
jan_16.shape[1]
6
data/get-it-done-requests.csv
contains a summary of all Get It Done requests submitted so far in 2023.'closed'
) versus unresolved ('open'
).requests = bpd.read_csv('data/get-it-done-requests.csv')
requests
neighborhood | service | closed | open | |
---|---|---|---|---|
0 | Balboa Park | Dead Animal | 4 | 0 |
1 | Balboa Park | Encampment | 22 | 16 |
2 | Balboa Park | Environmental Services Code Compliance | 0 | 1 |
3 | Balboa Park | Graffiti - Code Enforcement | 0 | 1 |
4 | Balboa Park | Graffiti - Public | 62 | 37 |
... | ... | ... | ... | ... |
1057 | Uptown | Tree Maintenance | 19 | 29 |
1058 | Uptown | Vegetation Encroachment | 1 | 2 |
1059 | Uptown | Waste on Private Property | 3 | 1 |
1060 | Uptown | Weed Cleanup | 1 | 0 |
1061 | Via De La Valle | Pothole | 0 | 2 |
1062 rows × 4 columns
Key concepts: Accessing columns, performing operations with them, and adding new columns.
.get(column_name)
.requests
neighborhood | service | closed | open | |
---|---|---|---|---|
0 | Balboa Park | Dead Animal | 4 | 0 |
1 | Balboa Park | Encampment | 22 | 16 |
2 | Balboa Park | Environmental Services Code Compliance | 0 | 1 |
3 | Balboa Park | Graffiti - Code Enforcement | 0 | 1 |
4 | Balboa Park | Graffiti - Public | 62 | 37 |
... | ... | ... | ... | ... |
1057 | Uptown | Tree Maintenance | 19 | 29 |
1058 | Uptown | Vegetation Encroachment | 1 | 2 |
1059 | Uptown | Waste on Private Property | 3 | 1 |
1060 | Uptown | Weed Cleanup | 1 | 0 |
1061 | Via De La Valle | Pothole | 0 | 2 |
1062 rows × 4 columns
requests.get('closed')
0 4 1 22 2 0 3 0 4 62 .. 1057 19 1058 1 1059 3 1060 1 1061 0 Name: closed, Length: 1062, dtype: int64
requests.get('closed')
0 4 1 22 2 0 3 0 4 62 .. 1057 19 1058 1 1059 3 1060 1 1061 0 Name: closed, Length: 1062, dtype: int64
requests.get('open')
0 0 1 16 2 1 3 1 4 37 .. 1057 29 1058 2 1059 1 1060 0 1061 2 Name: open, Length: 1062, dtype: int64
requests.get('closed') + requests.get('open')
0 4 1 38 2 1 3 1 4 99 .. 1057 48 1058 3 1059 4 1060 1 1061 2 Length: 1062, dtype: int64
.assign(name_of_column=data_in_series)
to assign a Series (or array, or list) to a DataFrame.name_of_column
.requests.assign(
total=requests.get('closed') + requests.get('open')
)
neighborhood | service | closed | open | total | |
---|---|---|---|---|---|
0 | Balboa Park | Dead Animal | 4 | 0 | 4 |
1 | Balboa Park | Encampment | 22 | 16 | 38 |
2 | Balboa Park | Environmental Services Code Compliance | 0 | 1 | 1 |
3 | Balboa Park | Graffiti - Code Enforcement | 0 | 1 | 1 |
4 | Balboa Park | Graffiti - Public | 62 | 37 | 99 |
... | ... | ... | ... | ... | ... |
1057 | Uptown | Tree Maintenance | 19 | 29 | 48 |
1058 | Uptown | Vegetation Encroachment | 1 | 2 | 3 |
1059 | Uptown | Waste on Private Property | 3 | 1 | 4 |
1060 | Uptown | Weed Cleanup | 1 | 0 | 1 |
1061 | Via De La Valle | Pothole | 0 | 2 | 2 |
1062 rows × 5 columns
requests
neighborhood | service | closed | open | |
---|---|---|---|---|
0 | Balboa Park | Dead Animal | 4 | 0 |
1 | Balboa Park | Encampment | 22 | 16 |
2 | Balboa Park | Environmental Services Code Compliance | 0 | 1 |
3 | Balboa Park | Graffiti - Code Enforcement | 0 | 1 |
4 | Balboa Park | Graffiti - Public | 62 | 37 |
... | ... | ... | ... | ... |
1057 | Uptown | Tree Maintenance | 19 | 29 |
1058 | Uptown | Vegetation Encroachment | 1 | 2 |
1059 | Uptown | Waste on Private Property | 3 | 1 |
1060 | Uptown | Weed Cleanup | 1 | 0 |
1061 | Via De La Valle | Pothole | 0 | 2 |
1062 rows × 4 columns
requests = requests.assign(
total=requests.get('closed') + requests.get('open')
)
requests
neighborhood | service | closed | open | total | |
---|---|---|---|---|---|
0 | Balboa Park | Dead Animal | 4 | 0 | 4 |
1 | Balboa Park | Encampment | 22 | 16 | 38 |
2 | Balboa Park | Environmental Services Code Compliance | 0 | 1 | 1 |
3 | Balboa Park | Graffiti - Code Enforcement | 0 | 1 | 1 |
4 | Balboa Park | Graffiti - Public | 62 | 37 | 99 |
... | ... | ... | ... | ... | ... |
1057 | Uptown | Tree Maintenance | 19 | 29 | 48 |
1058 | Uptown | Vegetation Encroachment | 1 | 2 | 3 |
1059 | Uptown | Waste on Private Property | 3 | 1 | 4 |
1060 | Uptown | Weed Cleanup | 1 | 0 | 1 |
1061 | Via De La Valle | Pothole | 0 | 2 | 2 |
1062 rows × 5 columns
Key concept: Computing statistics of columns using Series methods.
Series, like arrays, have helpful methods, including .min()
, .max()
, and .mean()
.
requests.get('total').max()
842
What is it that people are reporting so frequently, and where? We'll see how to find out shortly!
Other statistics:
requests.get('total').mean()
17.718455743879474
requests.get('total').median()
5.0
requests.get('open').mean()
10.002824858757062
requests.get('open').median()
2.0
Key concepts: Sorting. Accessing using integer positions.
.sort_values(by=column_name)
method to sort.by=
can be omitted, but helps readability.requests.sort_values(by='total')
neighborhood | service | closed | open | total | |
---|---|---|---|---|---|
483 | Miramar Ranch North | Weed Cleanup | 1 | 0 | 1 |
627 | Old Town San Diego | Missed Collection | 0 | 1 | 1 |
876 | Serra Mesa | Street Light Maintenance | 0 | 1 | 1 |
167 | Downtown | Right-of-Way Code Enforcement | 0 | 1 | 1 |
866 | Serra Mesa | Environmental Services Code Compliance | 1 | 0 | 1 |
... | ... | ... | ... | ... | ... |
916 | Southeastern San Diego | Illegal Dumping | 338 | 50 | 388 |
108 | Clairemont Mesa | Pothole | 46 | 381 | 427 |
173 | Downtown | Street Light Maintenance | 10 | 420 | 430 |
318 | Mid-City:City Heights | Illegal Dumping | 451 | 31 | 482 |
155 | Downtown | Encampment | 315 | 527 | 842 |
1062 rows × 5 columns
This sorts, but in ascending order (small to large). We want the opposite!
.sort_values(by=column_name, ascending=False)
to sort in descending order.ascending
is an optional argument. If omitted, it will be set to True
by default.ascending=
.ordered_requests = requests.sort_values(by='total', ascending=False)
ordered_requests
neighborhood | service | closed | open | total | |
---|---|---|---|---|---|
155 | Downtown | Encampment | 315 | 527 | 842 |
318 | Mid-City:City Heights | Illegal Dumping | 451 | 31 | 482 |
173 | Downtown | Street Light Maintenance | 10 | 420 | 430 |
108 | Clairemont Mesa | Pothole | 46 | 381 | 427 |
916 | Southeastern San Diego | Illegal Dumping | 338 | 50 | 388 |
... | ... | ... | ... | ... | ... |
876 | Serra Mesa | Street Light Maintenance | 0 | 1 | 1 |
878 | Serra Mesa | Traffic Engineering | 0 | 1 | 1 |
470 | Mira Mesa | Waste on Private Property | 0 | 1 | 1 |
880 | Serra Mesa | Traffic Signal Issue | 1 | 0 | 1 |
531 | Mission Valley | Sidewalk Repair Issue | 0 | 1 | 1 |
1062 rows × 5 columns
# We must specify the role of False by using ascending=,
# otherwise Python does not know how to interpret this.
ordered_requests = requests.sort_values(by='total', False)
ordered_requests
File "/var/folders/ch/hyjw6whx3g9gshnp58738jc80000gp/T/ipykernel_12154/1443585216.py", line 3 ordered_requests = requests.sort_values(by='total', False) ^ SyntaxError: positional argument follows keyword argument
'Encampment'
in 'Downtown'
, but how do we extract that information using code?.iloc[integer_position]
.ordered_requests
neighborhood | service | closed | open | total | |
---|---|---|---|---|---|
155 | Downtown | Encampment | 315 | 527 | 842 |
318 | Mid-City:City Heights | Illegal Dumping | 451 | 31 | 482 |
173 | Downtown | Street Light Maintenance | 10 | 420 | 430 |
108 | Clairemont Mesa | Pothole | 46 | 381 | 427 |
916 | Southeastern San Diego | Illegal Dumping | 338 | 50 | 388 |
... | ... | ... | ... | ... | ... |
876 | Serra Mesa | Street Light Maintenance | 0 | 1 | 1 |
878 | Serra Mesa | Traffic Engineering | 0 | 1 | 1 |
470 | Mira Mesa | Waste on Private Property | 0 | 1 | 1 |
880 | Serra Mesa | Traffic Signal Issue | 1 | 0 | 1 |
531 | Mission Valley | Sidewalk Repair Issue | 0 | 1 | 1 |
1062 rows × 5 columns
ordered_requests.get('neighborhood')
155 Downtown 318 Mid-City:City Heights 173 Downtown 108 Clairemont Mesa 916 Southeastern San Diego ... 876 Serra Mesa 878 Serra Mesa 470 Mira Mesa 880 Serra Mesa 531 Mission Valley Name: neighborhood, Length: 1062, dtype: object
ordered_requests.get('neighborhood').iloc[0]
'Downtown'
ordered_requests.get('service').iloc[0]
'Encampment'
Key concept: Accessing using row labels.
On January 16, you submitted service request 4065843. Has the issue been resolved?
This cannot be answered from the annual summary data, but must be answered from the detailed data about January 16.
jan_16
date_requested | neighborhood | service | status | street_address | public_description | |
---|---|---|---|---|---|---|
service_request_id | ||||||
4065025 | 2023-01-16T00:03:00 | Pacific Beach | Parking | open | 4630 Bayard St | Cars blocking alleyway, not on private propert... |
4065026 | 2023-01-16T00:33:00 | Otay Mesa-Nestor | Encampment | open | 615 Saturn Blvd, San Diego, CA 92154, USA | There are encampments located behind I - hop, ... |
4065027 | 2023-01-16T00:37:00 | Otay Mesa-Nestor | Encampment | open | 1862 Palm Ave, San Diego, CA 92154, USA | Encampments with large group of encampment ind... |
4065028 | 2023-01-16T00:46:00 | Old Town San Diego | Street Sweeping | open | 4400 TAYLOR ST, 92110 | ROCKSLIDE- COVERING BOTH LANES - BOBCAT NEEDED... |
4065029 | 2023-01-16T00:52:00 | Mid-City:City Heights | Other | open | 3860 Cherokee Ave | House near me has a car that multiple times a... |
... | ... | ... | ... | ... | ... | ... |
4066532 | 2023-01-16T22:09:00 | Navajo | Other | closed | 6735 Wandermere Dr | Cars doing donuts in st Dunsteen's church park... |
4066544 | 2023-01-16T22:19:00 | Mid-City:City Heights | Other | closed | 3949 33rd Street | Destruction of property. Man camped on 33rd St... |
4066547 | 2023-01-16T22:21:00 | Torrey Pines | Pothole | closed | I-5, Del Mar, CA 92014, USA | There is a big pot hole that made me pop my ti... |
4066564 | 2023-01-16T22:43:00 | Southeastern San Diego | Pothole | closed | CA-94, San Diego, CA 92102, USA | Pothole is getting bigger with the rain and yo... |
4066565 | 2023-01-16T22:43:00 | Greater Golden Hill | Pothole | closed | CA-94, San Diego, CA 92102, USA | Pothole on the off-ramp to 30th. Popped my tir... |
1360 rows × 6 columns
Your service request is buried in the middle of the DataFrame. Only the first few rows and last few rows are shown, so you can't tell just by looking at the DataFrame.
To pull out one particular entry of a DataFrame corresponding to a row and column with certain labels:
.get(column_name)
to extract the entire column as a Series..loc[]
to access the element of a Series with a particular row label.In this class, we'll always first access a column, then a row (but row, then column is also possible).
jan_16.get('status')
service_request_id 4065025 open 4065026 open 4065027 open 4065028 open 4065029 open ... 4066532 closed 4066544 closed 4066547 closed 4066564 closed 4066565 closed Name: status, Length: 1360, dtype: object
jan_16.get('status').loc[4065843]
'open'
Oh no, your service request 4065843 has still not been resolved! What was the problem again?
Write one line of code that evaluates to the full description of the problem, as you described it in your service request.
...
Ellipsis
.loc[]
uses the row label..iloc[]
uses the integer position..loc[]
is more convenient, but each is best for different scenarios.bpd.read_csv
.bpd.read_csv('data/get-it-done-jan-16.csv')
service_request_id | date_requested | neighborhood | service | status | street_address | public_description | |
---|---|---|---|---|---|---|---|
0 | 4065025 | 2023-01-16T00:03:00 | Pacific Beach | Parking | open | 4630 Bayard St | Cars blocking alleyway, not on private propert... |
1 | 4065026 | 2023-01-16T00:33:00 | Otay Mesa-Nestor | Encampment | open | 615 Saturn Blvd, San Diego, CA 92154, USA | There are encampments located behind I - hop, ... |
2 | 4065027 | 2023-01-16T00:37:00 | Otay Mesa-Nestor | Encampment | open | 1862 Palm Ave, San Diego, CA 92154, USA | Encampments with large group of encampment ind... |
3 | 4065028 | 2023-01-16T00:46:00 | Old Town San Diego | Street Sweeping | open | 4400 TAYLOR ST, 92110 | ROCKSLIDE- COVERING BOTH LANES - BOBCAT NEEDED... |
4 | 4065029 | 2023-01-16T00:52:00 | Mid-City:City Heights | Other | open | 3860 Cherokee Ave | House near me has a car that multiple times a... |
... | ... | ... | ... | ... | ... | ... | ... |
1355 | 4066532 | 2023-01-16T22:09:00 | Navajo | Other | closed | 6735 Wandermere Dr | Cars doing donuts in st Dunsteen's church park... |
1356 | 4066544 | 2023-01-16T22:19:00 | Mid-City:City Heights | Other | closed | 3949 33rd Street | Destruction of property. Man camped on 33rd St... |
1357 | 4066547 | 2023-01-16T22:21:00 | Torrey Pines | Pothole | closed | I-5, Del Mar, CA 92014, USA | There is a big pot hole that made me pop my ti... |
1358 | 4066564 | 2023-01-16T22:43:00 | Southeastern San Diego | Pothole | closed | CA-94, San Diego, CA 92102, USA | Pothole is getting bigger with the rain and yo... |
1359 | 4066565 | 2023-01-16T22:43:00 | Greater Golden Hill | Pothole | closed | CA-94, San Diego, CA 92102, USA | Pothole on the off-ramp to 30th. Popped my tir... |
1360 rows × 7 columns
bpd.read_csv('data/get-it-done-jan-16.csv').get('public_description').loc[561]
'Sidewalk covered in mud'
bpd.read_csv('data/get-it-done-jan-16.csv').get('public_description').iloc[561]
'Sidewalk covered in mud'
requests.get('open').max()
.jan_16.shape[0]
.jan_16.sort_values(by='date_requested', ascending=False).get('public_description').iloc[0]
.Moving forward, let's just focus on the requests
DataFrame. As a reminder, here's what it looks like:
requests
neighborhood | service | closed | open | total | |
---|---|---|---|---|---|
0 | Balboa Park | Dead Animal | 4 | 0 | 4 |
1 | Balboa Park | Encampment | 22 | 16 | 38 |
2 | Balboa Park | Environmental Services Code Compliance | 0 | 1 | 1 |
3 | Balboa Park | Graffiti - Code Enforcement | 0 | 1 | 1 |
4 | Balboa Park | Graffiti - Public | 62 | 37 | 99 |
... | ... | ... | ... | ... | ... |
1057 | Uptown | Tree Maintenance | 19 | 29 | 48 |
1058 | Uptown | Vegetation Encroachment | 1 | 2 | 3 |
1059 | Uptown | Waste on Private Property | 3 | 1 | 4 |
1060 | Uptown | Weed Cleanup | 1 | 0 | 1 |
1061 | Via De La Valle | Pothole | 0 | 2 | 2 |
1062 rows × 5 columns
'Tree Maintenance'
requests?'University'
neighborhood (near UCSD)?'Downtown'
neighborhood, how many open service requests are there?The common thread between these questions is that they all involve only a subset of the rows in our DataFrame.
'Tree Maintenance'
requests? 🌳¶Key concept: Querying
'Tree Maintenance'
requests if we had a DataFrame consisting of only these type of requests.'total'
column in descending order, then extract the neighborhood name in the first row.requests[requests.get('service') == 'Tree Maintenance']
neighborhood | service | closed | open | total | |
---|---|---|---|---|---|
21 | Balboa Park | Tree Maintenance | 2 | 7 | 9 |
41 | Barrio Logan | Tree Maintenance | 1 | 2 | 3 |
57 | Black Mountain Ranch | Tree Maintenance | 2 | 0 | 2 |
95 | Carmel Valley | Tree Maintenance | 6 | 2 | 8 |
122 | Clairemont Mesa | Tree Maintenance | 14 | 15 | 29 |
... | ... | ... | ... | ... | ... |
974 | Torrey Highlands | Tree Maintenance | 0 | 1 | 1 |
985 | Torrey Hills | Tree Maintenance | 1 | 1 | 2 |
1006 | Torrey Pines | Tree Maintenance | 1 | 2 | 3 |
1029 | University | Tree Maintenance | 5 | 9 | 14 |
1057 | Uptown | Tree Maintenance | 19 | 29 | 48 |
43 rows × 5 columns
Use ==
to check equality. Not =
, as that's for assignment of a value to a variable.
'Tree Maintenance' == 'Weed Cleanup'
False
'Tree Maintenance' == 'Tree Maintenance'
True
We can broadcast the equality check to each element of a Series. The comparison happens element-wise.
requests.get('service') == 'Tree Maintenance'
0 False 1 False 2 False 3 False 4 False ... 1057 True 1058 False 1059 False 1060 False 1061 False Name: service, Length: 1062, dtype: bool
True
or False
.bool
is a data type in Python, just like int
, float
, and str
. True
or False
.5 == 6
False
True
True
There are several types of comparisons we can make.
symbol | meaning |
---|---|
== |
equal to |
!= |
not equal to |
< |
less than |
<= |
less than or equal to |
> |
greater than |
>= |
greater than or equal to |
When comparing an entire Series to a value, the result is a Series of bool
s.
requests.get('service') == 'Tree Maintenance'
0 False 1 False 2 False 3 False 4 False ... 1057 True 1058 False 1059 False 1060 False 1061 False Name: service, Length: 1062, dtype: bool
To select only certain rows of requests
:
True
s (keep) and False
s (toss), usually by making a comparison.requests[sequence_goes_here]
.requests[requests.get('service') == 'Tree Maintenance']
neighborhood | service | closed | open | total | |
---|---|---|---|---|---|
21 | Balboa Park | Tree Maintenance | 2 | 7 | 9 |
41 | Barrio Logan | Tree Maintenance | 1 | 2 | 3 |
57 | Black Mountain Ranch | Tree Maintenance | 2 | 0 | 2 |
95 | Carmel Valley | Tree Maintenance | 6 | 2 | 8 |
122 | Clairemont Mesa | Tree Maintenance | 14 | 15 | 29 |
... | ... | ... | ... | ... | ... |
974 | Torrey Highlands | Tree Maintenance | 0 | 1 | 1 |
985 | Torrey Hills | Tree Maintenance | 1 | 1 | 2 |
1006 | Torrey Pines | Tree Maintenance | 1 | 2 | 3 |
1029 | University | Tree Maintenance | 5 | 9 | 14 |
1057 | Uptown | Tree Maintenance | 19 | 29 | 48 |
43 rows × 5 columns
This time, we'll find the neighborhoods and services with at least one open request.
requests
neighborhood | service | closed | open | total | |
---|---|---|---|---|---|
0 | Balboa Park | Dead Animal | 4 | 0 | 4 |
1 | Balboa Park | Encampment | 22 | 16 | 38 |
2 | Balboa Park | Environmental Services Code Compliance | 0 | 1 | 1 |
3 | Balboa Park | Graffiti - Code Enforcement | 0 | 1 | 1 |
4 | Balboa Park | Graffiti - Public | 62 | 37 | 99 |
... | ... | ... | ... | ... | ... |
1057 | Uptown | Tree Maintenance | 19 | 29 | 48 |
1058 | Uptown | Vegetation Encroachment | 1 | 2 | 3 |
1059 | Uptown | Waste on Private Property | 3 | 1 | 4 |
1060 | Uptown | Weed Cleanup | 1 | 0 | 1 |
1061 | Via De La Valle | Pothole | 0 | 2 | 2 |
1062 rows × 5 columns
requests.get('open') > 1
0 False 1 True 2 False 3 False 4 True ... 1057 True 1058 True 1059 False 1060 False 1061 True Name: open, Length: 1062, dtype: bool
requests[requests.get('open') > 1]
neighborhood | service | closed | open | total | |
---|---|---|---|---|---|
1 | Balboa Park | Encampment | 22 | 16 | 38 |
4 | Balboa Park | Graffiti - Public | 62 | 37 | 99 |
5 | Balboa Park | Illegal Dumping | 3 | 4 | 7 |
6 | Balboa Park | Other | 3 | 4 | 7 |
7 | Balboa Park | Parking | 2 | 2 | 4 |
... | ... | ... | ... | ... | ... |
1055 | Uptown | Traffic Signal Timing | 4 | 2 | 6 |
1056 | Uptown | Trash/Recycling Collection | 1 | 2 | 3 |
1057 | Uptown | Tree Maintenance | 19 | 29 | 48 |
1058 | Uptown | Vegetation Encroachment | 1 | 2 | 3 |
1061 | Via De La Valle | Pothole | 0 | 2 | 2 |
644 rows × 5 columns
'Tree Maintenance'
requests?¶Strategy:
'Tree Maintenance'
requests.'total'
in descending order.'neighborhood'
column.tree_maintenance_only = requests[requests.get('service') == 'Tree Maintenance']
tree_maintenance_sorted = tree_maintenance_only.sort_values(by='total', ascending=False)
tree_maintenance_sorted
neighborhood | service | closed | open | total | |
---|---|---|---|---|---|
935 | Southeastern San Diego | Tree Maintenance | 12 | 44 | 56 |
596 | North Park | Tree Maintenance | 24 | 30 | 54 |
1057 | Uptown | Tree Maintenance | 19 | 29 | 48 |
179 | Downtown | Tree Maintenance | 15 | 28 | 43 |
712 | Pacific Beach | Tree Maintenance | 22 | 15 | 37 |
... | ... | ... | ... | ... | ... |
497 | Mission Bay Park | Tree Maintenance | 0 | 1 | 1 |
963 | Tijuana River Valley | Tree Maintenance | 0 | 1 | 1 |
974 | Torrey Highlands | Tree Maintenance | 0 | 1 | 1 |
638 | Old Town San Diego | Tree Maintenance | 1 | 0 | 1 |
518 | Mission Beach | Tree Maintenance | 1 | 0 | 1 |
43 rows × 5 columns
tree_maintenance_sorted.get('neighborhood').iloc[0]
'Southeastern San Diego'
requests[requests.get('service') == 'Car Maintenance']
neighborhood | service | closed | open | total |
---|
Which expression below evaluates to the total number of service requests in the 'Downtown'
neighborhood?
A. requests[requests.get('neighborhood') == 'Downtown'].get('total').sum()
B. requests.get('total').sum()[requests.get('neighborhood') == 'Downtown']
C. requests['Downtown'].get('total').sum()
D. More than one of the above.
...
Ellipsis
Question: What is the most commonly requested service in the 'University'
neighborhood (near UCSD)?
Write one line of code that evaluates to the answer.
...
Ellipsis
babypandas
notes, and the babypandas
documentation when working on assignments..groupby
, for organizing rows of a DataFrame with the same value in a particular column.