# Run this cell to set up packages for lecture.
from lec10_imports import *
Announcements¶
- Homework 2 is due tomorrow at 11:59PM.
- Lab 3 is due Saturday at 11:59PM.
- Quiz 2 scores are released!
- Focus on improvement! Two of six quiz scores are dropped, so it's ok to make mistakes.
- Scored below 20/24? Meet with a tutor one-on-one to review your mistakes and brainstorm strategies for improvement. Sign up here.
Midterm Project released, due Thursday, February 15th at 11:59PM¶
- In the project, you'll explore Taylor Swift's music and lyrics and implement some fun tools. You'll make a song recommender that suggests the Taylor Swift songs that are similar to your favorite song. Here's a sneak peek:
Agenda¶
- Booleans.
- Conditional statements (i.e. `if`-statements).
- Iteration (i.e. `for`-loops).
Note:
- We've finished introducing new DataFrame manipulation techniques.
- Today we'll cover some foundational programming tools, which will be very relevant as we start to cover more ideas in statistics in the second half of the class.
Booleans¶
Recap: Booleans¶
- `bool` is a data type in Python, just like `int`, `float`, and `str`.
    - It stands for "Boolean", named after George Boole, a 19th-century mathematician.
- There are only two possible Boolean values: `True` or `False`.
    - Yes or no.
    - On or off.
    - 1 or 0.
- Comparisons result in Boolean values.
dept = 'DSC'
course = 10
course < 20
True
type(course < 20)
bool
The `in` operator¶
Sometimes, we'll want to check if a particular element is in a list/array, or a particular substring is in a string. The `in` operator can do this for us, and it also results in a Boolean value.
course in [10, 20, 30]
True
'DS' in dept
True
'DS' in 'Data Science'
False
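The last result can look surprising, since both "D" and "S" appear in 'Data Science'. On strings, `in` looks for a contiguous, case-sensitive substring. Here is a small sketch of that behavior; the variable `phrase` is introduced only for this illustration.

```python
phrase = 'Data Science'

# 'D' and 'S' both appear, but never side by side, so 'DS' is not a substring.
print('DS' in phrase)            # False
print('Data' in phrase)          # True

# Substring checks are case-sensitive.
print('data' in phrase)          # False
print('data' in phrase.lower())  # True

# With a list, `in` looks for an element equal to the value, not a substring.
print('DS' in ['DSC', 'CSE'])    # False
```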
Boolean operators; `not`¶
There are three operators that allow us to perform arithmetic with Booleans: `not`, `and`, and `or`.
`not` flips `True` ↔️ `False`.
dept == 'DSC'
True
not dept == 'DSC'
False
The `and` operator¶
The `and` operator is placed between two `bool`s. It is `True` if both are `True`; otherwise, it's `False`.
80 < 30 and course < 20
False
80 > 30 and course < 20
True
The `or` operator¶
The `or` operator is placed between two `bool`s. It is `True` if at least one is `True`; otherwise, it's `False`.
course in [10, 20, 30, 80] or type(course) == str
True
# Both are True!
course in [10, 20, 30, 80] or type(course) == int
True
# Both are False!
course == 80 or type(course) == str
False
course == 10 or (dept == 'DSC' and dept == 'CSE')
True
# Different meaning!
(course == 10 or dept == 'DSC') and dept == 'CSE'
False
# With no parentheses, "and" has precedence.
course == 10 or dept == 'DSC' and dept == 'CSE'
True
Note: `&` and `|` vs. `and` and `or`¶
- Use the `&` and `|` operators between two Series. Arithmetic will be done element-wise (separately for each row).
    - This is relevant when writing DataFrame queries, e.g. `courses[(courses.get('dept') == 'DSC') & (courses.get('course') == 10)]`.
- Use the `and` and `or` operators between two individual Booleans.
    - e.g. `dept == 'DSC' and course == 10`.
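To make the element-wise behavior concrete, here is a minimal sketch that uses NumPy arrays as stand-ins for two Series. The arrays `depts` and `course_numbers` are made up for illustration and are not part of the lecture's data.

```python
import numpy as np

# Hypothetical columns, stored as arrays for illustration.
depts = np.array(['DSC', 'DSC', 'CSE'])
course_numbers = np.array([10, 20, 10])

# & combines the two Boolean arrays position by position.
# The parentheses matter: & binds more tightly than ==.
is_dsc_10 = (depts == 'DSC') & (course_numbers == 10)
print(is_dsc_10)      # [ True False False]

# | is True wherever at least one side is True.
is_dsc_or_10 = (depts == 'DSC') | (course_numbers == 10)
print(is_dsc_or_10)   # [ True  True  True]

# and / or are for two individual Booleans.
dept = 'DSC'
course = 10
print(dept == 'DSC' and course == 10)  # True
```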
Conditionals¶
`if`-statements¶
- Often, we'll want to run a block of code only if a particular conditional expression is `True`.
- The syntax for this is as follows (don't forget the colon!):
if <condition>:
    <body>
- Indentation matters!
capstone = 'finished'
capstone
'finished'
if capstone == 'finished':
    print('Looks like you are ready to graduate!')
Looks like you are ready to graduate!
`else`¶
If you want to do something else if the specified condition is `False`, use the `else` keyword.
capstone = 'finished'
capstone
'finished'
if capstone == 'finished':
    print('Looks like you are ready to graduate!')
else:
    print('Before you graduate, you need to finish your capstone project.')
Looks like you are ready to graduate!
`elif`¶
- What if we want to check more than one condition? Use `elif`.
- `elif`: if the specified condition is `False`, check the next condition.
- If that condition is `False`, check the next condition, and so on, until we see a `True` condition.
    - After seeing a `True` condition, it evaluates the indented code and stops.
- If none of the conditions are `True`, the `else` body is run.
capstone = 'in progress'
units = 123
if capstone == 'finished' and units >= 180:
    print('Looks like you are ready to graduate!')
elif capstone != 'finished' and units < 180:
    print('Before you graduate, you need to finish your capstone project and take',
          180 - units, 'more units.')
elif units >= 180:
    print('Before you graduate, you need to finish your capstone project.')
else:
    print('Before you graduate, you need to take', 180 - units, 'more units.')
Before you graduate, you need to finish your capstone project and take 57 more units.
What if we use `if` instead of `elif`?
if capstone == 'finished' and units >= 180:
    print('Looks like you are ready to graduate!')
if capstone != 'finished' and units < 180:
    print('Before you graduate, you need to finish your capstone project and take',
          180 - units, 'more units.')
if units >= 180:
    print('Before you graduate, you need to finish your capstone project.')
else:
    print('Before you graduate, you need to take', 180 - units, 'more units.')
Before you graduate, you need to finish your capstone project and take 57 more units.
Before you graduate, you need to take 57 more units.
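One way to see the difference is to collect the messages each version produces instead of printing them. The sketch below is illustrative only: the function names `with_elif` and `with_if` are hypothetical, and the messages are shortened.

```python
def with_elif(capstone, units):
    '''Returns the messages produced by a single if/elif/else chain.'''
    messages = []
    if capstone == 'finished' and units >= 180:
        messages.append('ready')
    elif capstone != 'finished' and units < 180:
        messages.append('need capstone and units')
    elif units >= 180:
        messages.append('need capstone')
    else:
        messages.append('need units')
    return messages


def with_if(capstone, units):
    '''Same conditions, but each one starts its own independent if-statement.'''
    messages = []
    if capstone == 'finished' and units >= 180:
        messages.append('ready')
    if capstone != 'finished' and units < 180:
        messages.append('need capstone and units')
    if units >= 180:
        messages.append('need capstone')
    else:
        # This else belongs only to the `if units >= 180` directly above it.
        messages.append('need units')
    return messages


print(with_elif('in progress', 123))  # ['need capstone and units']
print(with_if('in progress', 123))    # ['need capstone and units', 'need units']
```

With `elif`, exactly one branch of the chain runs. With separate `if`s, every condition is checked, so more than one message can appear.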
Example: Percentage to letter grade¶
Below, complete the implementation of the function, `grade_converter`, which takes in a percentage grade (`grade`) and returns the corresponding letter grade, according to this table:
Letter | Range |
---|---|
A | [90, 100] |
B | [80, 90) |
C | [70, 80) |
D | [60, 70) |
F | [0, 60) |
Your function should work on these examples:
>>> grade_converter(84)
'B'
>>> grade_converter(60)
'D'
✅ Solution (try it yourself first):
def grade_converter(grade):
    if grade >= 90:
        return 'A'
    elif grade >= 80:
        return 'B'
    elif grade >= 70:
        return 'C'
    elif grade >= 60:
        return 'D'
    else:
        return 'F'
def grade_converter(grade):
    ...
grade_converter(84)
grade_converter(60)
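The solution works because the thresholds are checked from highest to lowest: an `elif` is only reached when every condition before it was `False`. The sketch below contrasts it with a deliberately buggy variant; the name `grade_converter_wrong_order` is hypothetical and exists only to show what goes wrong when the order is reversed.

```python
def grade_converter(grade):
    '''Thresholds are checked from highest to lowest.'''
    if grade >= 90:
        return 'A'
    elif grade >= 80:
        # Only reached when grade < 90, so this branch covers [80, 90).
        return 'B'
    elif grade >= 70:
        return 'C'
    elif grade >= 60:
        return 'D'
    else:
        return 'F'


def grade_converter_wrong_order(grade):
    '''Buggy on purpose: the lowest threshold is checked first.'''
    if grade >= 60:
        return 'D'   # Every grade of 60 or more stops here.
    elif grade >= 70:
        return 'C'   # Never reached.
    elif grade >= 80:
        return 'B'   # Never reached.
    elif grade >= 90:
        return 'A'   # Never reached.
    else:
        return 'F'


print(grade_converter(84))              # B
print(grade_converter_wrong_order(84))  # D
```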
Extra Practice¶
def mystery(a, b):
    if (a + b > 4) and (b > 0):
        return 'bear'
    elif (a * b >= 4) or (b < 0):
        return 'triton'
    else:
        return 'bruin'
Without running code:
- What does `mystery(2, 2)` return?
- Find inputs so that calling `mystery` will produce `'bruin'`.
Iteration¶
`for`-loops¶
import time
print('Launching in...')
for x in [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]:
    print('t-minus', x)
    time.sleep(0.5) # Pauses for half a second.
print('Blast off! 🚀')
Launching in...
t-minus 10
t-minus 9
t-minus 8
t-minus 7
t-minus 6
t-minus 5
t-minus 4
t-minus 3
t-minus 2
t-minus 1
Blast off! 🚀
`for`-loops¶
- Loops allow us to repeat the execution of code. There are two types of loops in Python; the `for`-loop is one of them.
- The syntax of a `for`-loop is as follows:
for <element> in <sequence>:
    <for body>
- Read this as: "for each element of this sequence, repeat this code."
- Lists, arrays, and strings are all examples of sequences.
- Like with `if`-statements, indentation matters!
Example: Squares¶
num = 4
print(num, 'squared is', num ** 2)
num = 2
print(num, 'squared is', num ** 2)
num = 1
print(num, 'squared is', num ** 2)
num = 3
print(num, 'squared is', num ** 2)
4 squared is 16
2 squared is 4
1 squared is 1
3 squared is 9
# The loop variable can be anything!
list_of_numbers = [4, 2, 1, 3]
for num in list_of_numbers:
    print(num, 'squared is', num ** 2)
4 squared is 16
2 squared is 4
1 squared is 1
3 squared is 9
The line `print(num, 'squared is', num ** 2)` is run four times:
- On the first iteration, `num` is 4.
- On the second iteration, `num` is 2.
- On the third iteration, `num` is 1.
- On the fourth iteration, `num` is 3.
This happens, even though there is no `num =` anywhere.
Activity¶
Using the array `colleges`, write a `for`-loop that prints:
Revelle College
John Muir College
Thurgood Marshall College
Earl Warren College
Eleanor Roosevelt College
Sixth College
Seventh College
Eighth College
✅ Solution (try it yourself first):
for college in colleges:
    print(college + ' College')
colleges = np.array(['Revelle', 'John Muir', 'Thurgood Marshall',
'Earl Warren', 'Eleanor Roosevelt', 'Sixth', 'Seventh', 'Eighth'])
...
Ranges¶
- Recall, each element of a list/array has a numerical position.
    - The position of the first element is 0, the position of the second element is 1, etc.
- We can write a `for`-loop that accesses each element in an array by using its position.
- `np.arange` will come in handy.
actions = np.array(['ate', 'slept', 'ran'])
feelings = np.array(['content 🙂', 'energized 😃', 'exhausted 😓'])
len(actions)
3
for i in np.arange(len(actions)):
    print(i)
0
1
2
for i in np.arange(len(actions)):
    print('I', actions[i], 'and I felt', feelings[i])
I ate and I felt content 🙂
I slept and I felt energized 😃
I ran and I felt exhausted 😓
Example: Goldilocks and the Three Bears¶
We don't have to use the loop variable!
for i in np.arange(3):
    print('🐻')
print('👧🏼')
🐻
🐻
🐻
👧🏼
Randomization and iteration¶
- In the next few lectures, we'll learn how to simulate random events, like flipping a coin.
- Often, we will:
    1. Run an experiment, e.g. "flip 10 coins."
    2. Compute some statistic, e.g. "number of heads," and write it down somewhere.
    3. Repeat steps 1 and 2 many, many times using a `for`-loop.
`np.append`¶
- This function takes two inputs:
- An array.
- An element to add on to the end of the array.
- It returns a new array. It does not modify the input array.
- We typically use it like this to extend an array by one element:
name_of_array = np.append(name_of_array, element_to_add)
- ⚠️ Remember to store the result!
some_array = np.array([])
np.append(some_array, 'hello')
array(['hello'], dtype='<U32')
some_array
array([], dtype=float64)
# Need to save the new array!
some_array = np.append(some_array, 'hello')
some_array
array(['hello'], dtype='<U32')
some_array = np.append(some_array, 'there')
some_array
array(['hello', 'there'], dtype='<U32')
Example: Coin flipping¶
The function `flip(n)` flips `n` fair coins and returns the number of heads it saw. (Don't worry about how it works for now.)
def flip(n):
    '''Returns the number of heads in n simulated coin flips, using randomness.'''
    return np.random.multinomial(n, [0.5, 0.5])[0]
# Run this cell a few times – you'll see different results!
flip(10)
2
Let's repeat the act of flipping 10 coins, 10000 times.
- Each time, we'll use the `flip` function to flip 10 coins and compute the number of heads we saw.
- We'll store these numbers in an array, `heads_array`.
- Every time we use our `flip` function to flip 10 coins, we'll add an element to the end of `heads_array`.
# heads_array starts empty – before the simulation, we haven't flipped any coins!
heads_array = np.array([])
for i in np.arange(10000):
    # Flip 10 coins and count the number of heads.
    num_heads = flip(10)
    # Add the number of heads seen to heads_array.
    heads_array = np.append(heads_array, num_heads)
Now, `heads_array` contains 10000 numbers, each corresponding to the number of heads in 10 simulated coin flips.
heads_array
array([3., 3., 6., ..., 6., 6., 6.])
len(heads_array)
10000
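The counts show up as `3.`, `6.`, and so on because `np.array([])` creates an empty array of floats, and each appended count is converted to match. The sketch below shows this on a single value. Starting from an empty `int` array is shown only as an optional alternative and is an assumption of this example, not something the lecture requires.

```python
import numpy as np

empty = np.array([])
print(empty.dtype)   # float64

# Appending an int to a float array gives a float array.
as_floats = np.append(empty, 3)
print(as_floats)     # [3.]

# Optional: starting from an empty int array keeps the counts as ints.
as_ints = np.append(np.array([], dtype=int), 3)
print(as_ints)       # [3]
```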
(bpd.DataFrame().assign(num_heads=heads_array)
.plot(kind='hist', density=True, bins=np.arange(0, 12), ec='w', legend=False,
title = 'Distribution of the number of heads in 10 coin flips')
);
The accumulator pattern¶
- To store our results, we'll typically use an `int` or an array.
- If using an `int`, we define an `int` variable (usually set to `0`) before the loop, then use `+` to add to it inside the loop.
    - Think of this like using a tally.
- If using an array, we create an array (usually empty) before the loop, then use `np.append` to add to it inside the loop.
    - Think of this like writing the results on a piece of paper.
- This pattern, of repeatedly adding to an `int` or an array, is called the accumulator pattern.
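Both kinds of accumulator can appear in the same loop. The following sketch reuses the lecture's `flip` function; the number of repetitions and the "8 or more heads" question are arbitrary choices made for this illustration.

```python
import numpy as np

def flip(n):
    '''Returns the number of heads in n simulated fair coin flips.'''
    return np.random.multinomial(n, [0.5, 0.5])[0]

repetitions = 1000           # Assumed value, chosen only for illustration.

many_heads_count = 0         # int accumulator: a tally.
heads_array = np.array([])   # Array accumulator: a record of every result.

for i in np.arange(repetitions):
    num_heads = flip(10)
    # Array accumulator: write down this result.
    heads_array = np.append(heads_array, num_heads)
    # int accumulator: add one to the tally when we see 8 or more heads.
    if num_heads >= 8:
        many_heads_count = many_heads_count + 1

print(len(heads_array))      # One entry per repetition.
print(many_heads_count)      # Varies from run to run.
```

The array keeps every result, so the tally could also be recomputed from it afterwards; the `int` only remembers the running total.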
`for`-loops in DSC 10¶
- Almost every `for`-loop in DSC 10 will use the accumulator pattern.
- Do not use `for`-loops to perform mathematical operations on every element of an array or Series.
    - Instead use DataFrame manipulations and built-in array or Series methods.
- Helpful video 🎥: For Loops (and when not to use them) in DSC 10.
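As an illustration of why a loop is unnecessary for element-wise math, the sketch below converts temperatures both ways. The array `temps_f` is made-up data, and the conversion is just an example operation.

```python
import numpy as np

temps_f = np.array([68, 72, 59, 80])   # Hypothetical temperatures in Fahrenheit.

# Discouraged: a loop that does arithmetic on one element at a time.
temps_c_loop = np.array([])
for temp in temps_f:
    temps_c_loop = np.append(temps_c_loop, (temp - 32) * 5 / 9)

# Preferred: array arithmetic applies the operation to every element at once.
temps_c = (temps_f - 32) * 5 / 9

print(np.allclose(temps_c_loop, temps_c))  # True
```

Both approaches give the same values, but the array version is shorter and leaves no accumulator to manage.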
Working with strings¶
Strings are sequences, so we can iterate over them, too!
for letter in 'uc san diego':
    print(letter.upper())
U C S A N D I E G O
'california'.count('a')
2
Example: Vowel count¶
Below, complete the implementation of the function `vowel_count`, which returns the number of vowels in the input string `s` (including repeats). Example behavior is shown below.
>>> vowel_count('king triton')
3
>>> vowel_count('i go to uc san diego')
8
✅ Solution (try it yourself first):
def vowel_count(s):
    # We need to keep track of the number of vowels seen so far. Before we start, we've seen zero vowels.
    number = 0
    # For each of the 5 vowels:
    for vowel in 'aeiou':
        # Count the number of occurrences of this vowel in s.
        num_vowel = s.count(vowel)
        # Add this count to the variable number.
        number = number + num_vowel
    # Once we've gotten through all 5 vowels, return the answer.
    return number
def vowel_count(s):
    # We need to keep track of the number of vowels seen so far. Before we start, we've seen zero vowels.
    number = 0
    # For each of the 5 vowels:
        # Count the number of occurrences of this vowel in s.
        # Add this count to the variable number.
    # Once we've gotten through all 5 vowels, return the answer.
vowel_count('king triton')
vowel_count('i go to uc san diego')
Summary, next time¶
Summary¶
- `if`-statements allow us to run pieces of code depending on whether certain conditions are `True`.
- `for`-loops are used to repeat the execution of code for every element of a sequence.
    - Lists, arrays, and strings are examples of sequences.
Next time¶
- Probability.
- A math lesson – no code!