Principles of Data Science

DSC 10, Winter 2024 at UC San Diego

Janine Tiefenbruck
she/her

jlobue@ucsd.edu

Lecture(s): MWF 9-9:50AM (A), 10-10:50AM (B), 11-11:50AM (C) in Mandeville B-202

The Final Exam is this Saturday 3/16 from 7-10PM in Catalyst 0125 near Plant Power. Join us for a collaborative study session on Friday 3/15 from 5-8PM in Solis 104.

If at least 75% of the class fills out both SETs and the internal End-of-Quarter Survey, then the entire class will have 1% of extra credit added to their overall grade. The deadline is Saturday 3/16 at 8AM.

Week 1 – Python Basics

Mon Jan 8

LEC 1 Introduction ✏️

CIT 1.0-1.3

Keywords: data science, course structure, policies, syllabus, Little Women demo

DISC Getting Started with Jupyter Notebooks

SUR Welcome Survey

Wed Jan 10

LEC 2 Expressions and Data Types ✏️

BPD 1-6

Keywords: Jupyter notebooks, expressions, variables, assignment, functions, int, float

Fri Jan 12

LEC 3 Strings, Lists, and Arrays ✏️

BPD 7-8, CIT 14.1

Keywords: string methods, mean, median, lists, arrays, array arithmetic

PRAC Extra Practice Session

Sat Jan 13

LAB 0 Expressions and Data Types

Week 2 – DataFrames

Mon Jan 15

No Lecture (MLK Day)

Wed Jan 17

LEC 4 Arrays and DataFrames ✏️

BPD 9

Keywords: array methods, np.arange, .read_csv, .get, .assign, .sort_values, .iloc, .loc, index

Fri Jan 19

LEC 5 Querying and Grouping ✏️

BPD 10-11

Keywords: .set_index, Booleans, querying, .shape, &, |, .take, .groupby, aggregation

PRAC Extra Practice Session

Sat Jan 20

LAB 1 Arrays and DataFrames

Week 3 – Data Visualization and Functions

Mon Jan 22

LEC 6 Grouping and Data Visualization ✏️

CIT 7.0-7.1

Keywords: .groupby, numerical vs. categorical, scatter plot, line plot, bar chart

QUIZ 1 Quiz 1 covers Lectures 1-4

Wed Jan 24

LEC 7 Distributions and Histograms ✏️

CIT 7.2-7.3

Keywords: distributions, density histograms, binning, total area, overlaid plots

Thu Jan 25

HW 1 Basic Python, Arrays, and DataFrames

Fri Jan 26

LEC 8 Functions and Applying ✏️

BPD 6, 12

Keywords: functions, arguments, print vs. return, .apply, .reset_index

PRAC Extra Practice Session

Week 4 – Control Flow and Probability

Mon Jan 29

LEC 9 Grouping on Multiple Columns, Merging ✏️

BPD 11, 13

Keywords: .groupby([col_1, col_2, …]), subgroups, MultiIndex, .merge, number of rows

QUIZ 2 Quiz 2 covers Lectures 5-7

Tue Jan 30

LAB 2 Data Visualizations and Python Functions

Wed Jan 31

LEC 10 Conditional Statements and Iteration ✏️

CIT 9.0-9.2

Keywords: in, not, and, or, if, else, elif, for-loops, np.append, accumulator pattern

Thu Feb 1

HW 2 DataFrames, Data Visualization, and Functions

Fri Feb 2

LEC 11 Probability (blank, 9AM, 10AM, 11AM)

CIT 9.5

Keywords: event, conditional prob., multiplication and addition rules, independence

PRAC Extra Practice Session

Sat Feb 3

LAB 3 DataFrames, Control Flow, and Probability

Week 5 – Simulation, Sampling, and Confidence Intervals

Mon Feb 5

LEC 12 Simulation ✏️

CIT 9.3-9.4

Keywords: np.random.choice, replacement, np.count_nonzero, coin flipping, Monty Hall

QUIZ 3 Quiz 3 covers Lectures 8-11

Wed Feb 7

LEC 13 Distributions and Sampling ✏️

CIT 10.0-10.4

Keywords: probability vs. empirical distribution, SRS, .sample, parameter, statistic

Thu Feb 8

HW 3 DataFrames, Control Flow, and Probability

Fri Feb 9

LEC 14 Bootstrapping and Confidence Intervals ✏️

CIT 13.0-13.2

Keywords: inference, bootstrapping, resample, np.percentile, confidence interval

PRAC Extra Practice Session

Week 6 – Midterm Exam and the Normal Distribution

Mon Feb 12

EXAM Midterm Exam

DISC Exam Solutions Review

Wed Feb 14

LEC 15 Confidence Intervals, Center, and Spread ✏️

CIT 13.3-13.4

Keywords: interpreting CIs, robust vs. sensitive, center, standard deviation, Chebyshev

Thu Feb 15

PROJ Midterm Project

Fri Feb 16

LEC 16 Standardization and the Normal Distribution ✏️

CIT 14.2-14.3

Keywords: Chebyshev, standard units, normal distribution, CDF, inflection points

PRAC Extra Practice Session

Sat Feb 17

LAB 4 Simulation, Sampling, & Bootstrapping

Week 7 – Central Limit Theorem

Mon Feb 19

No Lecture (Presidents' Day)

Wed Feb 21

LEC 17 The Central Limit Theorem ✏️

CIT 14.4-14.5

Keywords: distribution of the sample mean, square root law, CLT-based CIs

Thu Feb 22

HW 4 Simulation, Sampling, Bootstrapping

Fri Feb 23

LEC 18 Choosing Sample Sizes, Statistical Models ✏️

CIT 14.6, 11.1

Keywords: standard deviation of 0s and 1s, np.random.multinomial, Robert Swain jury

PRAC Extra Practice Session

Week 8 – Hypothesis and Permutation Testing

Mon Feb 26

LEC 19 Hypothesis Testing ✏️

CIT 11.3

Keywords: null and alternative hypotheses, test statistic, fair or unfair coin

QUIZ 4 Quiz 4 covers Lectures 13-17

Tue Feb 27

LAB 5 Variability and the Normal Distribution

Wed Feb 28

LEC 20 Hypothesis Testing and Total Variation Distance ✏️

CIT 11.2, 11.4

Keywords: fair or unfair coin, p-value, midterm exam scores, Alameda County jury, TVD

Thu Feb 29

HW 5 The Normal Distribution and the Central Limit Theorem

Fri Mar 1

LEC 21 TVD, Hypothesis Testing, and Permutation Testing ✏️

CIT 12.0-12.1

Keywords: confidence intervals for hypothesis testing, body temperature, smoking/babies

PRAC Extra Practice Session

Sat Mar 2

LAB 6 Hypothesis Testing

Week 9 – Prediction

Mon Mar 4

LEC 22 Permutation Testing ✏️

CIT 12.3

Keywords: smoking and birth weight, np.random.permutation, shuffling, Deflategate

QUIZ 5 Quiz 5 covers Lectures 18-21 (excluding Permutation Testing)

Wed Mar 6

LEC 23 Correlation ✏️

CIT 15.0-15.2

Keywords: association, correlation coefficient (r), predicting heights, regression line (standard units)

Thu Mar 7

HW 6 Hypothesis Testing and Permutation Testing

Fri Mar 8

LEC 24 Regression and Least Squares ✏️ Recording 🎥

CIT 15.2-15.4

Keywords: regression line in original units, outliers, errors, RMSE, best fit, least squares

PRAC Extra Practice Session

Week 10 – Review

Mon Mar 11

LEC 25 Residuals and Inference ✏️

CIT 15.5-16.3

Keywords: residuals, residual plots, patterns, datasaurus dozen, prediction intervals

QUIZ 6 Quiz 6 covers Lectures 21-24

Tue Mar 12

PROJ Final Project

Wed Mar 13

LEC 26 Review - Annotated 9AM, 10AM, 11AM

Thu Mar 14

LAB 7 Regression

Fri Mar 15

LEC 27 Review, Conclusion ✏️ - Blank - Annotated 9AM, 10AM, 11AM

STUDY Collaborative Study Session (5-8PM in Solis 104)

Sat Mar 16

EXAM Final Exam (7-10PM)

SUR SETs and End-of-Quarter Survey (due 8AM)