Principles of Data Science

DSC 10, Fall 2023 at UC San Diego

Rod Albuyeh

Lecture(s): MWF 8-8:50AM (D), Peterson 104

Suraj Rampure

Lecture(s): MWF 1-1:50PM (C), Mandeville B-210

Janine Tiefenbruck

Lecture(s): MWF 9-9:50AM (B), 10-10:50AM (A), Mandeville B-210

The Final Exam is on Saturday from 7-10PM. Read this Ed post for more details, and check your assigned room and seat here. In lecture on Wednesday, we will take up the solutions to the Spring 2023 Final Exam, so you should work on it before then.

If at least 85% of the class fills out both SETs and the End-of-Quarter Survey by Saturday at 8AM, then we will add 1% of extra credit to everyone’s overall grade. We appreciate your feedback!

The solutions to the Spring 2023 Final Exam have been posted; video walkthroughs of some problems (taken from Wednesday’s lectures) can be found at the top.

Week 0 – Welcome to DSC 10!

Fri Sep 29

LEC 1 Introduction ✏️

CIT 1.0-1.3

Keywords: data science, course structure, policies, syllabus, Little Women demo

SUR Welcome Survey

Week 1 – Python Basics

Mon Oct 2

LEC 2 Expressions and Data Types ✏️

BPD 1-6

Keywords: Jupyter notebooks, expressions, variables, assignment, functions, int, float

Wed Oct 4

LEC 3 Strings, Lists, and Arrays ✏️

BPD 7-8, CIT 14.1

Keywords: string methods, mean, median, lists, arrays, array arithmetic

DIS 1 Getting Started with Jupyter Notebooks (problems)

Fri Oct 6

LEC 4 Arrays and DataFrames ✏️

BPD 9-10

Keywords: array methods, np.arange, .read_csv, .get, .assign, .sort_values, .iloc, .loc, index

Sat Oct 7

Lab 0 Expressions and Data Types

Week 2 – DataFrames and Visualization

Mon Oct 9

LEC 5 Querying and Grouping ✏️

BPD 10-11

Keywords: .set_index, Booleans, querying, .shape, &, |, .take, .groupby, aggregation

Wed Oct 11

LEC 6 Grouping and Data Visualization ✏️

CIT 7.0-7.1

Keywords: .groupby, numerical vs. categorical, scatter plot, line plot, bar chart

DIS 2 Arrays and DataFrames

QUIZ 1 Solutions

Thu Oct 12

Lab 1 Arrays and DataFrames

Fri Oct 13

LEC 7 Distributions and Histograms ✏️

CIT 7.2, 7.3

Keywords: distributions, density histograms, binning, total area, overlaid plots

Sat Oct 14

HW 1 Basic Python, Arrays, and DataFrames

Week 3 – Functions and Control Flow

Mon Oct 16

LEC 8 Functions and Applying ✏️

BPD 6, 12

Keywords: functions, arguments, print vs. return, .apply, .reset_index

Wed Oct 18

LEC 9 Grouping on Multiple Columns, Merging ✏️

BPD 11, 13

Keywords: .groupby([col_1, col_2, …]), subgroups, MultiIndex, .merge, number of rows

DIS 3 Querying, Grouping, and Plotting

Thu Oct 19

Lab 2 Data Visualizations and Functions

Fri Oct 20

LEC 10 Conditional Statements and Iteration ✏️

CIT 9.0-9.2

Keywords: in, not, and, or, if, else, elif, for-loops, np.append, accumulator pattern

Sat Oct 21

HW 2 DataFrames, Data Visualization, and Functions

Week 4 – Probability and Simulation

Mon Oct 23

LEC 11 Probability (annotated: 8AM • 1PM)

CIT 9.5

Keywords: event, conditional prob., multiplication and addition rules, independence

Wed Oct 25

LEC 12 Simulation ✏️

CIT 9.3-9.4

Keywords: np.random.choice, replacement, np.count_nonzero, coin flipping, Monty Hall

DIS 4 DataFrames, Control Flow, and Probability

QUIZ 2 Solutions

Thu Oct 26

Lab 3 DataFrames, Control Flow, and Probability

Fri Oct 27

LEC 13 Midterm Review (annotated: 8AM • 9AM • 10AM • 1PM)

Sat Oct 28

HW 3 DataFrames, Control Flow, and Probability

Week 5 – Midterm Exam

Mon Oct 30

EXAM Midterm Exam (in registered lecture section)

Wed Nov 1

LEC 14 Distributions and Sampling ✏️

CIT 10.0-10.4

Keywords: probability vs. empirical distribution, SRS, .sample, parameter, statistic

DIS 5 Midterm Exam Walkthrough

Fri Nov 3

LEC 15 Bootstrapping and Confidence Intervals ✏️

CIT 13.0-13.2

Keywords: inference, bootstrapping, resample, np.percentile, confidence interval

Week 6 – Confidence Intervals and the Normal Distribution

Mon Nov 6

LEC 16 Confidence Intervals, Center, and Spread ✏️

CIT 13.3-13.4

Keywords: interpreting CIs, robust vs. sensitive, center, standard deviation, Chebyshev

PROJ Midterm Project: Taylor Swift (see partner guidelines)

Wed Nov 8

LEC 17 Standardization and the Normal Distribution ✏️

CIT 14.2-14.3

Keywords: Chebyshev, standard units, normal distribution, CDF, inflection points

DIS 6 Sampling, Bootstrapping, and Confidence Intervals

Thu Nov 9

Lab 4 Simulation, Sampling, & Bootstrapping

Fri Nov 10

No Lecture (Veterans Day 🎖️)

Sat Nov 11

HW 4 Simulation, Sampling, Bootstrapping

SUR Mid-Quarter Survey

Week 7 – Central Limit Theorem

Mon Nov 13

LEC 18 The Central Limit Theorem ✏️

CIT 14.4-14.5

Keywords: interpreting CIs, robust vs. sensitive, center, standard deviation, Chebyshev

Wed Nov 15

LEC 19 Choosing Sample Sizes, Statistical Models ✏️

CIT 14.6, 11.1

Keywords: standard deviation of 0s and 1s, np.random.multinomial, Robert Swain jury panel

DIS 7 Standardization and the Normal Distribution

QUIZ 3 In Discussion, Covers Lectures 14-17

Fri Nov 17

LEC 20 Hypothesis Testing ✏️

CIT 11.3

Keywords: null and alternative hypotheses, test statistic, fair or unfair coin

Sat Nov 18

Lab 5 Variability and the Normal Distribution

Week 10 – Review

Mon Dec 4

LEC 26 Residuals and Inference ✏️

CIT 15.5-16.3

Tue Dec 5

PROJ Final Project: Meteorites (see partner guidelines)

Wed Dec 6

LEC 27 Review of the Spring 2023 Final Exam (annotated: 10AM + 1PM) 🎥

DIS 10 Regression

Thu Dec 7

Lab 7 Regression

Fri Dec 8

LEC 28 Review, Conclusion ✏️ (review problems: blank)

Sat Dec 9

EXAM Final Exam (7-10PM, see location here and details here)

SUR SETs and End-of-Quarter Survey (due 8AM)