Lecture 3 – Strings, Lists, and Arrays

DSC 10, Fall 2023

Announcements

Resources 🤝

Agenda

Data types

int and float

int

float

The pitfalls of float
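As a small illustration of one common pitfall (the specific numbers are chosen here for illustration, not taken from the lecture): floats are stored with limited precision, so arithmetic that looks exact on paper can be slightly off.

```python
import math

# 0.1 and 0.2 cannot be represented exactly in binary floating point,
# so their sum is very slightly different from 0.3.
total = 0.1 + 0.2
is_exact = (total == 0.3)

# A safer comparison allows for a tiny tolerance.
is_close = math.isclose(total, 0.3)

print(total, is_exact, is_close)
```

Here `is_exact` is `False` while `is_close` is `True`, which is why comparing floats with `==` is usually a bad idea.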

Converting between int and float
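A minimal sketch of the conversions, using made-up values: `int` drops the fractional part of a float rather than rounding, and `float` turns a whole number into one with a decimal point.

```python
# int() truncates toward zero; it does not round.
whole = int(3.9)          # 3
negative = int(-3.9)      # -3

# float() converts an int to the equivalent float.
as_float = float(3)       # 3.0

# Division with / always produces a float, even when it divides evenly.
quotient = 6 / 2          # 3.0

# round() is the tool to use when rounding is what you want.
rounded = round(3.9)      # 4

print(whole, negative, as_float, quotient, rounded)
```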

Strings 🧶

String arithmetic

When using the + symbol between two strings, the operation is called "concatenation".
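For instance, with two illustrative strings (the variable names below are hypothetical):

```python
first = 'data'
second = 'science'

# + glues the strings together exactly as written; no space is added for us.
no_space = first + second            # 'datascience'
with_space = first + ' ' + second    # 'data science'

# Multiplying a string by an int repeats it.
repeated = 'ha' * 3                  # 'hahaha'

print(no_space, with_space, repeated)
```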

String methods
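A quick sketch of a few built-in string methods; which methods the lecture demonstrated is not shown here, so the selection below is an assumption. Methods are called with a dot after the string, and they return a new string rather than changing the original.

```python
city = 'san diego'

titled = city.title()                           # 'San Diego'
shouted = city.upper()                          # 'SAN DIEGO'
swapped = city.replace('diego', 'francisco')    # 'san francisco'

# len is a function, not a method; it counts characters, including spaces.
length = len(city)                              # 9

# The original string is untouched.
print(city, titled, shouted, swapped, length)
```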

Type conversion to and from strings
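A short sketch with illustrative values: `str` turns a number into text, while `int` and `float` turn suitable text into numbers. Whether `+` adds or concatenates depends on the types on each side.

```python
count = 12
label = '3'

# Both sides are strings, so + concatenates.
as_text = str(count) + label        # '123'

# Both sides are ints, so + adds.
as_number = count + int(label)      # 15

# float() can read a string that contains a decimal point.
price = float('2.5')                # 2.5

print(as_text, as_number, price)
```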

Concept Check ✅ – Answer at cc.dsc10.com

Assume you have run the following statements:

x = 3
y = '4'
z = '5.6'

Choose the expression that will be evaluated without an error.

A. x + y

B. x + int(y + z)

C. str(x) + int(y)

D. str(x) + z

E. All of them have errors

Means and medians

Describing numerical data

The mean (i.e. average)

The mean is a one-number summary of a collection of numbers.

For example, the mean of $1$, $4$, $7$, and $12$ is $\frac{1 + 4 + 7 + 12}{4} = 6$.
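The same calculation can be sketched in Python by adding up the numbers and dividing by how many there are:

```python
values = [1, 4, 7, 12]

# Sum of the values divided by the number of values.
mean = sum(values) / len(values)

print(mean)
```

Note that the result is the float `6.0`, since `/` always produces a float.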

Observe that the mean:

The median

Like the mean, the median is a one-number summary of a collection of numbers.
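The definition of the median is not spelled out here, so the sketch below assumes the standard one: sort the numbers and take the middle value, or the average of the two middle values when there is an even count. The helper name `median` is hypothetical.

```python
def median(values):
    """Return the middle value of the sorted data (standard definition, assumed)."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: a single middle element exists.
        return ordered[middle]
    # Even count: average the two middle elements.
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_case = median([7, 1, 4])
even_case = median([1, 4, 7, 12])

print(odd_case, even_case)
```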

Mean vs. median

Activity

  1. Find two different datasets that have the same mean and different medians.

  2. Find two different datasets that have the same median and different means.

  3. Find two different datasets that have the same median and the same mean.

Means and medians are just summaries; they don't tell the whole story about a dataset!

In a few weeks, we'll learn how to visualize the distribution of a collection of numbers using a histogram.

These two distributions have different means but the same median!

Lists

Average temperature for a week

How would we store the temperatures for a week to compute the average temperature?

Our best solution right now is to create a separate variable for each day of the week.

This technically allows us to do things like compute the average temperature:

avg_temperature = 1/7 * (
    temp_sunday
    + temp_monday
    + temp_tuesday
    + ...)

Imagine a whole month's data, or a whole year's data. It seems like we need a better solution.

Lists in Python

In Python, a list is used to store multiple values within a single value. To create a new list from scratch, we use [square brackets].

Notice that the elements in a list don't need to be unique!
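As an illustration, here is a list of seven made-up temperatures. The name `temperature_list` and the values are assumptions, not the lecture's data.

```python
# Square brackets create the list; commas separate the elements.
temperature_list = [68, 72, 65, 64, 62, 61, 64]

# 64 appears twice, which is perfectly fine.
number_of_days = len(temperature_list)
times_64_appears = temperature_list.count(64)

print(temperature_list, number_of_days, times_64_appears)
```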

Lists make working with sequences easy!

To find the average temperature, we just need to divide the sum of the temperatures by the number of temperatures recorded:
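A minimal sketch, again using a hypothetical `temperature_list` with made-up values:

```python
temperature_list = [68, 72, 65, 64, 62, 61, 64]

# sum adds up the elements; len counts them.
avg_temperature = sum(temperature_list) / len(temperature_list)

print(avg_temperature)
```

The same line works unchanged for a month or a year of data, because nothing in it depends on how many temperatures the list holds.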

Types

The type of a list is... list.

Within a list, you can store elements of different types.
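A small sketch with a hypothetical list named `mixed`:

```python
# An int, a string, a float, and a bool, all in one list.
mixed = [3, 'three', 3.0, True]

list_type = type(mixed)          # the list itself has type list
second_type = type(mixed[1])     # the element 'three' has type str

print(list_type, second_type)
```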

There's a problem...

Arrays

NumPy

Arrays

Think of NumPy arrays (just "arrays" from now on) as fancy, faster lists.

To create an array, we pass a list as input to the np.array function.
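For example, assuming NumPy is imported under its conventional name `np` and using made-up temperatures:

```python
import numpy as np

temperature_list = [68, 72, 65, 64, 62, 61, 64]

# np.array takes the list and returns a new array with the same elements.
temperature_array = np.array(temperature_list)

print(temperature_array)
print(type(temperature_array))
```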

Positions

When people wait in line, each person has a position.

Similarly, each element of an array (or list) has a position.

Accessing elements by position
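A brief sketch with made-up values: positions start at 0, not 1, and square brackets after the array's name pick out the element at a given position.

```python
import numpy as np

temperature_array = np.array([68, 72, 65, 64, 62, 61, 64])

first = temperature_array[0]     # position 0 is the first element
third = temperature_array[2]     # position 2 is the third element
last = temperature_array[-1]     # negative positions count from the end

# Lists are accessed the same way.
temperature_list = [68, 72, 65, 64, 62, 61, 64]
second_in_list = temperature_list[1]

print(first, third, last, second_in_list)
```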

Types

Earlier in the lecture, we saw that lists can store elements of multiple types.

This is not true of arrays – all elements in an array must be of the same type.
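To see what this means in practice, here is a sketch with illustrative values: when the input list mixes types, NumPy converts every element to a single type that can represent all of them.

```python
import numpy as np

# An int and a float: everything becomes a float.
mixed_numbers = np.array([1, 2.5])

# An int and a string: everything becomes a string.
mixed_text = np.array([1, 'two'])

print(mixed_numbers, mixed_numbers.dtype)
print(mixed_text, mixed_text.dtype)
```

In the second array the number 1 is stored as the string `'1'`, so arithmetic on it would no longer behave like arithmetic on numbers.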

Array-number arithmetic

Arrays make it easy to perform the same operation on every element. This behavior is formally known as "broadcasting".
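A minimal sketch, assuming `temperature_array` holds made-up Fahrenheit temperatures:

```python
import numpy as np

temperature_array = np.array([68, 72, 65, 64, 62, 61, 64])

# The single number is applied to every element of the array.
warmer = temperature_array + 10
halved = temperature_array / 2

# Several operations can be chained, e.g. Fahrenheit to Celsius.
celsius = (temperature_array - 32) * 5 / 9

print(warmer)
print(halved)
print(celsius)
```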

Note: In none of the above cells did we actually modify temperature_array! Each of those expressions created a new array.

To actually change temperature_array, we need to reassign it to a new array.
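The difference can be sketched as follows, with made-up values:

```python
import numpy as np

temperature_array = np.array([68, 72, 65, 64, 62, 61, 64])

# This computes a new array, but the result is not saved anywhere.
temperature_array + 5
unchanged = temperature_array.tolist()

# Assigning the result back to the same name is what changes the variable.
temperature_array = temperature_array + 5

print(unchanged)
print(temperature_array)
```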

Element-wise arithmetic
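As a sketch of the idea, with hypothetical arrays `highs` and `lows` of made-up values: when two arrays of the same length are combined with an arithmetic operator, the operation is applied to the elements at matching positions.

```python
import numpy as np

highs = np.array([75, 78, 70])
lows = np.array([60, 62, 58])

# Position 0 pairs with position 0, position 1 with position 1, and so on.
daily_range = highs - lows
midpoints = (highs + lows) / 2

print(daily_range)
print(midpoints)
```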

Summary, next time

Summary

Next time

We'll learn more about arrays and we'll see how to use Python to work with real-world tabular data.