DSC 80 – The Practice and Application of Data Science


📜 Syllabus

Welcome to DSC 80 in Fall 2022! This page should answer most of the questions you might have about how the course is run; check out the frequently asked questions for answers to some common ones.

Instructor

  • Dr. Justin Eldridge (or just Justin)
    jeldridge@ucsd.edu
    webpage

Modality

UCSD's plan for the 2022-23 academic year is to fully return to on-campus instruction. That said, lectures in DSC 80 will be recorded and available online. Attendance is appreciated, but not required. On the other hand, exams will be fully in-person, and physical attendance is required for those.

Getting Started

To get started in DSC 80, you'll need to set up accounts on a couple of websites.

Ed

We'll be using Ed as our course message board. Ed is like Piazza, but unlike Piazza, Ed does not sell student data to third parties. You should have received an invitation via email, but if not you should get in touch with a course staff member as soon as possible, as we'll be making all course announcements via Ed.

If you have a question about anything to do with the course (if you're stuck on an assignment problem, want clarification on the logistics, or just have a general question about data science), you can make a post on Ed. We only ask that if your question includes some or all of an answer, please make your post private so that others cannot see it. You can also post anonymously if you would prefer.

Course staff will regularly check Ed and try to answer any questions that you have. You're also encouraged to answer a question asked by another student if you feel that you know the answer.

Gradescope

We'll be using Gradescope for assignment submission and grading. Most of the assignments will be coding assignments. Parts of these assignments will be manually graded, but most of them will be autograded. You should have received an email invitation for Gradescope, but if not please let us know as soon as possible (preferably via Ed).

Canvas

We won't be using Canvas. All course materials will be available at dsc80.com or Gradescope.

Required Materials

No materials are required for this course; we'll use lecture slides as the main resource, as well as the course notes written by Prof. Aaron Fraenkel.

Lectures

Lectures will be held in-person at the regularly-scheduled time and place, but they will be podcasted and posted online for remote viewing. Attendance is appreciated, but not required.

Office Hours

Course staff, including tutors, TAs, and instructors, will hold office hours regularly throughout the week. Office hours will be offered in a mix of in-person and remote modalities. Please see the office hours page for the schedule and for instructions.

Discussions

Discussions will be held in-person at 03:00 pm PST on Fridays in CSB 002.

Attendance is recommended, but not required. The discussions will be podcasted, but because discussion sections usually involve a large amount of group work, the podcasted discussion might not be as useful as attending in person.

During discussion you will work through a short notebook or coding problem designed to prepare you for the coming week's assignments. Attending discussion and submitting the discussion assignment are not mandatory. However, each discussion assignment submitted will earn you 0.3% of extra credit 🎉 applied towards your overall score at the end of the quarter. Since there will be 10 discussions, there will be 3% of extra credit made available. This has the potential to boost your grade by half a letter.

Discussion assignments will be due via Gradescope at midnight on the Saturday after discussion.

Labs

There will be nine lab assignments due weekly throughout the quarter. Each lab assignment will be a mixture of coding and free response questions. Coding questions will ask you to fill in the body of a function. Doctests are usually provided so that you can make sure that you're on the right track (a la DSC 20); however, your submission will be graded using a private autograder with hidden tests.
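As a rough illustration of what "filling in the body of a function" with provided doctests can look like, here is a minimal sketch. The function name and task are hypothetical and are not taken from any actual lab; real lab questions and the hidden autograder tests will differ.

```python
def count_missing(values):
    """
    Return the number of None entries in a list.

    The examples below are doctests: they show the expected output
    for a few sample inputs and can be run automatically.

    >>> count_missing([1, None, 3, None])
    2
    >>> count_missing([])
    0
    """
    # This is the part a student would write.
    return sum(value is None for value in values)


if __name__ == "__main__":
    import doctest

    # Runs every doctest in this file; no output means all of them passed.
    doctest.testmod()
```

Passing the provided doctests is a useful sanity check, but it does not guarantee full credit, since the hidden tests may cover additional cases.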

Each lab is worth the same amount, but the lowest lab will be dropped when calculating your final score.

Lab assignments are eligible for redemption with the following scheme. If you re-submit your lab within one week of the due date, we'll regrade it with a 10% penalty. If you re-submit it between one and two weeks after the due date, there will be a 30% penalty. Beyond this we cannot accept redemption requests. A special Gradescope assignment will be created for each lab redemption -- to submit your redemption, just submit your updated solutions to this Gradescope assignment. Note that the lab solutions will be posted while you work on your redemption, and while you're encouraged to look at the solutions, you should never copy code. We recommend spending at least an hour between looking at the solutions and working on your redemption.
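The redemption schedule above can be sketched as a small lookup. This is only an illustration: the function name is hypothetical, and treating the one-week and two-week boundaries as inclusive is an assumption, since the policy does not say how a submission exactly on the boundary is handled.

```python
from datetime import timedelta


def redemption_penalty(time_after_due):
    """Return the penalty (as a fraction) for a lab redemption.

    time_after_due is a timedelta measured from the lab's original due date.
    Returns None when the redemption window has closed.
    """
    if time_after_due < timedelta(0):
        raise ValueError("a redemption is submitted after the due date")
    if time_after_due <= timedelta(weeks=1):  # boundary inclusive: assumption
        return 0.10
    if time_after_due <= timedelta(weeks=2):  # boundary inclusive: assumption
        return 0.30
    return None  # beyond two weeks, redemptions are not accepted
```

For example, `redemption_penalty(timedelta(days=10))` returns `0.30`, and `redemption_penalty(timedelta(days=15))` returns `None`.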

In DSC 80, we want to get as much practice as possible with the tools of the trade, including git. Therefore, all assignments -- labs included -- are obtained by pulling from the course GitHub repository.

Projects

There will be five projects due every other week throughout the quarter. Like labs, projects consist of coding and free response questions. As their name implies, however, projects are more open-ended and allow you to simulate applying your data science skills in practical situations. You can think of the projects as being mini-take-home-exams that track your practical skills throughout the quarter (whereas the exams themselves test for conceptual understanding).

Projects are due every other week. However, the week before a project is due, there will (usually) be a project checkpoint. This checkpoint will ensure that you're on-track to complete the project on time, and should (hopefully) be a source of easy points.

Note that, unlike labs, projects are not eligible for redemption, nor is the lowest project dropped. Like all assignments, you can obtain the project by pulling from the course GitHub repository.

The last project will be due during finals week, and can be thought of as a practical component of the final exam.

Pair Programming

You may work together on projects using the paradigm of pair programming. In pair programming, you may work with one partner of your choosing to complete the assignment. You must both be present (physically or virtually), and working on the same piece of code simultaneously. One person types while the other watches for errors. Pair programming is not where one person does Question 1 while the other person does Question 2.

If you choose to pair program a project, you must submit the project as a pair (even if you do just a single problem together). You must also submit the project checkpoint together (as otherwise you'd be working separately on part of the project).

Note that you may not pair program on lab assignments.

Slip Days

You have five slip days to use throughout the quarter on any deadline, including a discussion assignment, lab, project, or project checkpoint. A slip day extends the deadline of any one assignment by 24 hours. Slip days cannot be "stacked" or "combined" to extend the deadline further — the latest any one assignment can be submitted is 24 hours after the deadline. Slip days are applied automatically at the end of the quarter, but it's your responsibility to keep track of how many you have left.

Slip days are designed to be a transparent and predictable source of leniency in deadlines. You can use a slip day if you are too busy to complete an assignment on its original due date (or if you forgot about it). But slip days are also meant for things like the internet going down at 11:58 PM just as you go to submit your assignment. Slip days are to be used in exceptional circumstances, so you probably shouldn't get close to using all five. If you feel that you will need that many, send me a message and we'll figure something out.
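Since keeping track of slip days is your responsibility, a small script along the following lines may help. It is a sketch under stated assumptions: the function names are hypothetical, and it assumes that a submission more than 24 hours late is simply not accepted and consumes no slip day, which the policy above does not spell out.

```python
from datetime import datetime, timedelta

TOTAL_SLIP_DAYS = 5                    # from the policy above
SLIP_DAY_LENGTH = timedelta(hours=24)  # one slip day extends a deadline by 24 hours


def slip_days_needed(deadline, submitted_at):
    """Return 0 if on time, 1 if within 24 hours of the deadline, else None."""
    if submitted_at <= deadline:
        return 0
    if submitted_at <= deadline + SLIP_DAY_LENGTH:
        return 1  # slip days cannot be stacked, so at most one per assignment
    return None   # more than 24 hours late


def remaining_slip_days(submissions):
    """Count slip days left, given (deadline, submitted_at) pairs."""
    used = 0
    for deadline, submitted_at in submissions:
        needed = slip_days_needed(deadline, submitted_at)
        if needed is not None:  # too-late submissions use no slip day (assumption)
            used += needed
    return TOTAL_SLIP_DAYS - used
```

A negative result from `remaining_slip_days` would mean more late submissions than slip days; what happens in that case is not covered by this sketch.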

Regrade Requests

Most of the projects and labs are autograded, but some questions are manually graded. If you feel that there is an error in the autograder or that the manual grader has made a mistake, you may submit a regrade request within one week of the grades being released. To do so, please submit a private post to Ed. Note that part of your grade is clarity, so if your answer was mostly right but unclear you may still not be eligible for full credit.

Catastrophic Regrades

The autograder is very picky: it expects your assignments to have exactly the correct file names, all functions must be named correctly, etc. If these are wrong, your code may not run and the autograder may assign zero points. This is a grading catastrophe 😧.

Grading catastrophes are preventable! After submitting your assignment, always wait around to see the output of the Gradescope grader and ensure that it runs properly. Also, be sure to submit your assignment (or at least part of it) to Gradescope with enough time before the deadline to get help if there is a strange autograder problem.

In the case that you submit code that doesn't run and discover this at a later date, you have some options:

  1. If it is still before the late deadline, you may use a slip day to fix your code and re-submit. Note that you're free to do this even if your code runs -- this is just making use of the normal slip day mechanism to submit an assignment late.
  2. If it is past the late deadline and your code requires only minor fixes (e.g., the file name is wrong) we will fix your code at the cost of two slip days. You must have two full slip days to use this option!

To submit a catastrophic regrade request, please make a private Ed post.

Exams

There will be a midterm exam and a final exam:

  • Midterm: Thursday, November 03 (covers Lectures 01 — 09)
  • Final: Tuesday, December 06 (cumulative)

The exams will be held in-person during the regularly-scheduled lecture times.

The final exam is cumulative. If your score on the final exam is higher than your midterm score, your final exam score will replace your midterm score. Note, however, that the final exam covers things that the midterm does not, so it is mandatory (you cannot skip the final exam even if you did well on the midterm).

Grading Scheme

We'll be using the following grading scheme:

  • 25%: Labs (lowest dropped)
  • 30%: Projects
  • 5%: Project Checkpoints
  • 15%: Midterm (see note below)
  • 25%: Final
  • 3% extra credit: Discussions

Note: if you score higher on the cumulative final exam than you did on the midterm, your final score will replace your midterm score.
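To see how the pieces of the grading scheme combine, here is a minimal sketch of the calculation. The function name is hypothetical, and two details are assumptions made for illustration rather than stated policy: that projects and checkpoints are averaged with equal weight within their categories, and that discussion extra credit is added as percentage points on top of the weighted total.

```python
from statistics import mean

# Category weights from the grading scheme above, as fractions of the overall score.
WEIGHTS = {
    "labs": 0.25,
    "projects": 0.30,
    "checkpoints": 0.05,
    "midterm": 0.15,
    "final": 0.25,
}
EXTRA_CREDIT_PER_DISCUSSION = 0.3  # percentage points per submitted discussion


def overall_score(labs, projects, checkpoints, midterm, final, discussions_submitted=0):
    """Estimate an overall score; all inputs are percentages from 0 to 100."""
    # Drop the lowest lab score before averaging.
    labs_kept = sorted(labs)[1:] if len(labs) > 1 else list(labs)

    # The final exam score replaces the midterm score if it is higher.
    midterm_effective = max(midterm, final)

    base = (
        WEIGHTS["labs"] * mean(labs_kept)
        + WEIGHTS["projects"] * mean(projects)        # equal weighting: assumption
        + WEIGHTS["checkpoints"] * mean(checkpoints)  # equal weighting: assumption
        + WEIGHTS["midterm"] * midterm_effective
        + WEIGHTS["final"] * final
    )
    # Extra credit added on top of the weighted total: assumption.
    return base + EXTRA_CREDIT_PER_DISCUSSION * discussions_submitted
```

For instance, a student with perfect labs, projects, and checkpoints who scores 60 on the midterm and 80 on the final would have the midterm counted as 80 under this sketch.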

Support and Resources

As an instructor, my job is to foster an environment where everyone, regardless of identity, feels welcome and is able to focus on learning. If there is something we can do to further this mission, or if there is something preventing you from succeeding in the class, please let us know. If you feel uncomfortable speaking with us or are searching for help on a specific concern, there are also several campus resources available to you.

More generally, if you have any concerns about your ability to focus or succeed in this course, or just need someone to talk to, please contact us ASAP and we'll figure something out.

Illness

Because of the pandemic, we must prepare for the unfortunate possibility that you will get sick and be unable to participate in this class for long periods of time. The university has a mechanism for helping in this situation: the Incomplete. If you are unable to complete the course because of reasons outside of your control, you may be given an Incomplete instead of a letter grade. This simply means that you will complete the rest of the work at a later time. Once you have done so, your overall grade is calculated and your Incomplete grade is replaced.

An Incomplete does not allow you to re-do work that has already been completed, only to do work that hasn't been completed.

Frequently Asked Questions

Is this class curved?

The course is not usually curved at the end of the quarter. Instead, I try to design the assignments so that your final grade is predictable. In this quarter, the extra credit from doing discussion assignments serves the same purpose as a curve. When I assign letter grades, the standard grading scale (where an A is 93+, A- is 90+, B+ is 87+, etc.) will be used as a starting point, but once all scores are in, I will run a clustering algorithm to automatically find the best cutoffs for each letter grade. These cutoffs can only be lowered. For instance, the threshold for an "A" will never be higher than 93%. A+ grades are awarded to the top 5% of students by grade.