
📖 Syllabus

Table of contents

  1. About 🧐
  2. Getting Started 💻
    1. Websites
    2. Development Environment
    3. Forms
  3. Communication 💬
  4. Course Components 🍎
    1. Lectures
    2. Discussions and Lab Reflections
    3. Labs
    4. Projects
    5. Office Hours
    6. Weekly Schedule
  5. Exams 📝
    1. Redemption Policy
  6. Policies 💯
    1. Grading
    2. Late Policy, Slip Days, and Drops
    3. Regrade Requests
    4. Incompletes
    5. A note on letter grades
  7. Collaboration Policy and Academic Integrity 🤝
    1. Why is academic integrity important?
    2. What counts as cheating?
    3. Use of Generative Artificial Intelligence
  8. Support 🫂
    1. Accommodations
    2. Diversity and Inclusion
    3. Campus Resources
  9. Acknowledgements 🙏

About 🧐

DSC 80 serves as a bridge between lower-division and upper-division data science courses. In DSC 80, students will gain proficiency with the data science life cycle and learn many of the fundamental principles and techniques of data science spanning algorithms, statistics, machine learning, visualization, and data systems.

After DSC 80, students will be prepared for data science internships and interviews, will have the tools to create their own data science portfolios, and will have the maturity necessary to succeed in upper-division machine learning and statistics courses.

Prerequisites: DSC 30 and DSC 40A.


Getting Started 💻

The course website, dsc80.com, will contain links to all course content. There are also a few things you’ll need to do to get set up.

Websites

You’ll need to make accounts on the following sites:

  • Ed: We’ll be using Ed as our course message and discussion board. More details are in the Communication section below. If you didn’t already get an invitation to our Ed course, sign up here.

  • Gradescope: You’ll submit all assignments and exams to Gradescope. This is where all of your grades will live as well. Most of the assignments will be coding assignments. Parts of these assignments will be manually graded, but most of them will be autograded. You should have received an email invitation for Gradescope, but if not please let us know as soon as possible (preferably via Ed).

  • GitHub: Like in DSC 30, you’ll access all course content (lecture slides and assignments) by pulling our course GitHub repository. That repo is here: github.com/dsc-courses/dsc80-2024-wi. In most assignments, you won’t need to push anything to GitHub. However, in Projects 3 and 5 you will, so you’ll need to have an account by then.

Note that we will not be using Canvas for anything this quarter.

Development Environment

As soon as you are able to, follow the steps on the Tech Support page of the course website to set up your development environment for the course.

Forms

Please fill out the Welcome Survey to tell us a bit more about yourself and tell us if you need an alternate exam.


Communication 💬

This quarter, we’ll be using Ed as our course message board. You will be added to Ed automatically; use the invite link in the section above if you weren’t added.

If you have a question about anything to do with the course — if you’re stuck on a problem, didn’t understand something from lecture, want clarification on course logistics, or just have a general question about data science — you can make a post on Ed. We only ask that if your question includes some or all of an answer (even if you’re not sure it’s right), please make your post private so that others cannot see it. You can also post anonymously to other students if you prefer.

Course staff will regularly check Ed and try to answer any questions that you have. You’re also encouraged to answer questions asked by other students. Explaining something is a great way to solidify your understanding of it!

Please don’t email individual staff members; make a private or public Ed post instead.


Course Components 🍎

Lectures

Lectures will be held in-person on Tuesdays and Thursdays from 3:30-4:50PM in Pepper Canyon Hall 109. Attendance is not required, though you are encouraged to attend in-person if you are able to. Lectures will be podcasted.

Lecture notebooks will be your main resource in this class. You can access them, along with all course materials, by pulling from the course GitHub repository, github.com/dsc-courses/dsc80-2024-wi. We will also link HTML previews of each lecture notebook from the course homepage; you can use these to annotate the lecture notebooks with a tablet, if you’d like.

New: Before lecture, we may post a “pre-lecture reading” which contains an introduction to the material being covered in the lecture. These should only take ~20 minutes to read. If everyone comes to class having read the pre-lecture reading, we’ll be able to spend more time in class working through challenging problems, like you may see in labs or projects, rather than having to listen to Suraj talk 😮. This is a new experiment, and we may tweak the experience during the quarter. We appreciate your feedback!

Supplementary readings (which are different from pre-lecture readings) will primarily come from Learning Data Science, a freely-available textbook written by another DSC 80 instructor, Sam Lau. It can be found at learningds.org. Some readings will come from notes.dsc80.com, a set of notes that were originally written to supplement DSC 80. Supplementary readings are not required, in that you won’t be tested on anything that appears only in the readings but not in lectures or assignments, but you should still complete them to supplement your understanding!

Discussions and Lab Reflections

Discussions will be held in-person on Wednesdays from 7-7:50PM in Pepper Canyon Hall 109, the same room as lecture. Discussion sections will be podcasted.

You’ll spend the vast majority of your time in this course on labs and projects, which you’ll read more about in the sections below. The labs you complete each week will give you hands-on practice with the tools and techniques introduced in lectures. While completing the labs is important, it’s also important to reflect on your lab work once grades are released, and think about how you could have approached problems differently (e.g. more efficiently).

Therefore, in discussion sections on Wednesdays, we discuss solutions to a subset of the lab that was due that Monday (2 days before discussion). When you attend, you’ll have a chance to discuss your implementation with course staff and hear how others attempted the problems. Hopefully, you’ll leave with a stronger understanding of the learning objectives of the lab.

To encourage you to attend and reflect, we will offer extra credit to those who do all 3 of the following:

  1. Submit the lab.
  2. Attend discussion section in-person on Wednesday.
  3. Satisfactorily complete a Lab Reflection form for the lab that was taken up in discussion by Thursday at 11:59PM (the next day). The reflection form, hosted on Gradescope, will ask you to comment on how your answers compared to the solution and how you could’ve approached the problems differently (even if you received full credit on the problems that were taken up).

Since there are 9 labs, there will be 9 lab reflections and 9 discussion sections in which we take up lab solutions. For each lab that you submit, if you attend the corresponding discussion section and complete the lab reflection form, you will receive 0.2% of extra credit added to your overall grade. On Wednesday of Week 1, since no lab will have been due yet, we will award this extra credit to anyone who attends discussion. Thus, there are 10 extra credit opportunities available, which could amount to 0.2% * 10 = 2% of extra credit for your overall grade.

Note that:

  • To earn extra credit for a particular lab, you must submit the lab, attend discussion, and complete the reflection form. If you fail to do all 3 of these things, you won’t receive extra credit.
  • We will be manually grading the Lab Reflection forms for completion. In order to receive credit, you’ll need to provide meaningful responses. Simply saying “I could’ve made my code more efficient” is not enough to receive credit – what was suboptimal about your implementation? What benefits and drawbacks are there to the solutions you heard in discussion?
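As a rough illustration of the extra credit arithmetic described above, here is a minimal Python sketch. The function and variable names are hypothetical (they are not part of any course tooling), and the sketch assumes that each opportunity is simply worth 0.2 percentage points when all of its requirements are met.

```python
EC_PER_OPPORTUNITY = 0.2  # percentage points per opportunity, as stated in the syllabus


def earned_lab_extra_credit(submitted_lab, attended_discussion, completed_reflection):
    """Return True only if all 3 requirements for a lab were met."""
    return submitted_lab and attended_discussion and completed_reflection


def total_extra_credit(attended_week1, lab_records):
    """Total extra credit in percentage points.

    attended_week1: whether the student attended the Week 1 discussion.
    lab_records: one (submitted, attended, reflected) tuple of booleans per lab.
    """
    opportunities = 1 if attended_week1 else 0
    opportunities += sum(1 for record in lab_records if earned_lab_extra_credit(*record))
    # Rounding avoids floating-point noise such as 0.6000000000000001.
    return round(opportunities * EC_PER_OPPORTUNITY, 2)


# Attending Week 1 and meeting all 3 requirements for all 9 labs.
print(total_extra_credit(True, [(True, True, True)] * 9))
```

Missing any one of the three requirements for a lab means that lab contributes nothing, which is why the check uses `and` across all three flags.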

Labs

There will be 9 lab assignments due weekly throughout the quarter. Each lab assignment will consist of a mixture of coding and free response questions. Coding questions will ask you to fill in the body of a function. Public tests are usually provided so that you can make sure that you're on the right track (similar to DSC 20). However, your submission will be graded using a private autograder with hidden tests.

Each lab is worth the same amount, but the lowest lab will be dropped when calculating your final score. Labs will usually be released on Tuesdays and due on Mondays at 11:59PM (except in Weeks 2 and 7, in which Monday is a holiday and the lab is due on Wednesday at 5PM).

You will access labs (and projects) by pulling the course GitHub repository.

Projects

There will be 4 projects due every other week throughout the quarter. Like labs, projects consist of coding and free response questions. As their name implies, however, projects are more open-ended and allow you to simulate applying your data science skills in practical situations. You can think of the projects as being mini-take-home-exams that track your practical skills throughout the quarter (whereas the exams themselves test for conceptual understanding).

Projects are due bi-weekly. However, the week before a project is due, there will often be a project checkpoint. This checkpoint will ensure that you're on-track to complete the project on time, and should (hopefully) be a source of easy points.

The last project, Project 4, will be due during finals week, and can be thought of as a practical component of the Final Exam.

Note that, unlike labs, the lowest project score is not dropped. Projects and project checkpoints will usually be due on Thursdays at 11:59PM.

Working in Pairs

You may work together on projects (and projects only!) with a partner. If you work with a partner, you are both required to actively contribute to all parts of the project. You must both be working on the assignment at the same time together, either physically or virtually on a Zoom call. You are encouraged to follow the pair programming model, in which you work on just a single computer and alternate who writes the code and who thinks about the problems at a high level.

In particular, you cannot split up the project and each work on separate parts independently.

If you work with a partner:

  • Only one partner needs to submit the project on Gradescope; this partner should add the other partner to their submission.
  • You must also submit the checkpoint together.
  • You and your partner will receive the same score on any submissions you make together.

If you are unhappy with your partnership (e.g., if your partner does not keep in touch, does not come prepared to work on the assignment, or does not seem to be engaged in the process), please first address your concerns to your partner, and try to agree on what should be done to make the partnership work well for both of you. If that approach is not successful, explain the issues to the instructors, who will work with you and your partner to improve the situation.

You may use different partners on different projects.

Note that you may not work with partners on lab assignments. However, you’re encouraged to discuss all assignments with others at a conceptual level in office hours and study groups.

Office Hours

To get help on assignments and concepts, course staff will be hosting several office hours per week. All office hours will be held in person. See the Calendar tab of the course website for the most up-to-date schedule and instructions.

Weekly Schedule

To summarize all of the events and deadlines, refer to this general weekly schedule (which is subject to change in any given week):

Day | Events
Sunday | -
Monday | Lab due
Tuesday | Lecture
Wednesday | Discussion
Thursday | Lecture; Project/checkpoint due; Lab reflection due (extra credit)
Friday | -
Saturday | -

Exams 📝

This class has one Midterm Exam and one Final Exam. Exams are cumulative, though the Final Exam will emphasize material after the Midterm Exam.

  • Midterm Exam: Thursday, February 8th, 3:30-4:50PM, Pepper Canyon Hall 109 (during lecture)

  • Final Exam: Tuesday, March 19th, 3-6PM, Pepper Canyon Hall 109

Both exams will be administered in-person. If you have conflicts with either of the exams, please let us know on the Welcome Survey.

Redemption Policy

The Final Exam will consist of two parts: a “Midterm” section and a “post-Midterm” section. If you do better on the “Midterm” section of the Final Exam than you did on the original Midterm Exam, your score on the “Midterm” section will replace your original Midterm Exam score. This lowers the stakes of the Midterm Exam and gives you two opportunities to demonstrate your understanding of the content from the first half of the course. This also means that you can miss the Midterm Exam for any reason and have the score be replaced by your score on the “Midterm” section of the Final Exam (though we do not recommend this).

You must take the Final Exam to pass the course.
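The redemption rule above amounts to taking the better of two scores. Here is a minimal Python sketch of that idea. The function name is hypothetical, and the sketch assumes both scores are compared as percentages of the points available on each part, which the syllabus does not spell out.

```python
def effective_midterm_score(midterm_pct, final_midterm_section_pct):
    """Return the Midterm Exam score used in the grade, as a percentage.

    midterm_pct: score on the original Midterm Exam, or None if it was missed.
    final_midterm_section_pct: score on the "Midterm" section of the Final Exam.
    """
    if midterm_pct is None:
        # A missed Midterm Exam is replaced by the "Midterm" section of the Final Exam.
        return final_midterm_section_pct
    # Otherwise the "Midterm" section only replaces the original score if it is higher.
    return max(midterm_pct, final_midterm_section_pct)


print(effective_midterm_score(70, 85))    # the Final Exam section was better
print(effective_midterm_score(90, 80))    # the original Midterm Exam score stands
print(effective_midterm_score(None, 75))  # the Midterm Exam was missed
```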


Policies 💯

Grading

Here is how we’ll compute your grade:

Component | Weight | Notes
Labs | 25% | 3.125% per lab, lowest dropped
Projects | 30% | 6% each for Projects 1-3, 12% for Project 4
Project Checkpoints | 5% | 1% each for Projects 1-3, 2% for Project 4
Midterm Exam | 15% | see the Redemption Policy above
Final Exam | 25% |
Discussion Attendance + Lab Reflections | 2% (extra credit) | 0.2% per week
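To see how the weights in the table combine, here is a minimal Python sketch of the overall grade calculation. The function name and argument layout are hypothetical, and the sketch assumes every component score is expressed as a fraction between 0 and 1 and that the midterm score passed in already reflects the Redemption Policy. It is an illustration of the arithmetic, not the staff's actual grading script.

```python
PROJECT_WEIGHTS = [6, 6, 6, 12]    # Projects 1-3 and Project 4, in percentage points
CHECKPOINT_WEIGHTS = [1, 1, 1, 2]  # checkpoints for Projects 1-3 and Project 4
LAB_WEIGHT = 3.125                 # per counted lab; 8 counted labs * 3.125 = 25


def overall_grade(lab_scores, project_scores, checkpoint_scores,
                  midterm, final, extra_credit=0.0):
    """Return the overall grade in percentage points.

    lab_scores: 9 fractions in [0, 1], one per lab.
    project_scores, checkpoint_scores: 4 fractions each, in project order.
    midterm, final: fractions in [0, 1].
    extra_credit: percentage points (at most 2 per the syllabus).
    """
    # Drop the single lowest lab score before weighting.
    counted_labs = sorted(lab_scores)[1:]
    labs = LAB_WEIGHT * sum(counted_labs)
    projects = sum(w * s for w, s in zip(PROJECT_WEIGHTS, project_scores))
    checkpoints = sum(w * s for w, s in zip(CHECKPOINT_WEIGHTS, checkpoint_scores))
    exams = 15 * midterm + 25 * final
    return labs + projects + checkpoints + exams + extra_credit


# One missed lab is absorbed by the drop, so this student still earns full lab credit.
print(overall_grade([1] * 8 + [0], [1] * 4, [1] * 4, midterm=1, final=1))
```

Note that the non-extra-credit weights sum to 25 + 30 + 5 + 15 + 25 = 100.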

Late Policy, Slip Days, and Drops

All assignments must be submitted by 11:59PM San Diego time on the due date to be considered on time, with the exception of Labs 1 and 6 (due in Weeks 2 and 7 on Wednesday at 5PM). You may turn them in as many times as you like before the deadline, and only the most recent submission will be graded, so it’s a good habit to submit early and often. If you make a submission after the deadline, your assignment will be counted as late.

You have 7 “slip days” (up from 6) to use throughout the quarter. A slip day extends the deadline of an assignment by 24 hours. The number of slip days you can use on an assignment depends on the kind of assignment:

  • On labs, you may use up to 1 slip day. Labs will not be accepted more than 24 hours after the deadline. Note that you will not be able to use slip days on Labs 1 and 6, but their deadlines will be extended.
  • On projects and project checkpoints, you may use up to 2 slip days. These assignments will not be accepted more than 48 hours after the deadline. Note that you will not be able to use slip days on Project 4 (which is due on Thursday, March 21st, no exceptions).
  • You may not use slip days on lab reflection assignments. These assignments will not be accepted after the original deadline.

Slip days are designed to be a transparent and predictable source of leniency in deadlines. You can use a slip day if you are too busy to complete an assignment on its original due date (or if you forgot about it). But slip days are also meant for things like the internet going down at 11:58PM just as you go to submit your assignment. Slip days are meant to be used in exceptional circumstances, so you probably should not need to use all 7, but if you have something going on in your life that is impeding your ability to do your classwork on time, please reach out to us as soon as possible so we can work something out.

Slip days are applied automatically at the end of the quarter, and you don’t need to ask in order to use one. It’s your responsibility to keep track of how many you have left. If you’ve run out of slip days and submit an assignment late, that assignment may still be graded, but you will receive a 0 on it when we calculate grades at the end of the quarter. Specifically, in the event that you use all 7 days and submit another assignment late, we will allocate your slip days first to your projects (in chronological order), then to your labs (in chronological order), and then to other assignments. This is done to prevent you from receiving a 0 on, say, Project 3, if you’ve used all of your slip days on labs up until that point; in such a case, you’d instead receive a 0 on an earlier lab that isn’t weighted as much in your grade.
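The allocation order described above can be sketched in Python as follows. All names here (`LateSubmission`, `allocate_slip_days`, the `order` field) are hypothetical, and the sketch makes two assumptions that the syllabus does not state: a late submission is either fully covered by slip days or receives a 0, and allocation continues down the list even after one submission cannot be covered. The per-assignment caps (1 day for labs, 2 for projects) are not modeled.

```python
from dataclasses import dataclass

TOTAL_SLIP_DAYS = 7  # per the syllabus
PRIORITY = {"project": 0, "lab": 1, "other": 2}  # projects first, then labs, then the rest


@dataclass
class LateSubmission:
    name: str
    kind: str       # "project", "lab", or "other"
    order: int      # chronological position of the deadline within the quarter
    days_late: int  # slip days needed to cover this submission


def allocate_slip_days(late_submissions, budget=TOTAL_SLIP_DAYS):
    """Return (covered, zeroed): names of submissions that do and do not get slip days."""
    covered, zeroed = [], []
    # Sort by kind priority first, then chronologically within each kind.
    ranked = sorted(late_submissions, key=lambda s: (PRIORITY[s.kind], s.order))
    for sub in ranked:
        if sub.days_late <= budget:
            budget -= sub.days_late
            covered.append(sub.name)
        else:
            # Assumption: a submission that cannot be fully covered receives a 0.
            zeroed.append(sub.name)
    return covered, zeroed


late = [
    LateSubmission("Lab 2", "lab", 1, 1),
    LateSubmission("Project 1", "project", 2, 2),
    LateSubmission("Lab 3", "lab", 3, 1),
    LateSubmission("Lab 4", "lab", 4, 1),
    LateSubmission("Project 2", "project", 5, 2),
    LateSubmission("Lab 5", "lab", 6, 1),
    LateSubmission("Project 3", "project", 7, 2),
]
covered, zeroed = allocate_slip_days(late)
print(covered)  # projects are covered before any lab
print(zeroed)   # labs left without slip days
```

In this made-up example the student needs 10 slip days in total, so the three projects take 6 of the 7 available days and only one lab can be covered, which matches the intent that a 0 lands on a lab rather than on a more heavily weighted project.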

Regrade Requests

Most of the labs and projects are autograded, but some questions are manually graded. If you feel that there is an error in the autograder or that the manual grader has made a mistake, you may submit a regrade request within three days of the grades being released. If you do not submit a regrade request within three days, your original grade will be final.

Regrade Requests for Manually Graded Problems

To submit a regrade request for a manually graded problem, make the regrade request directly on Gradescope. Note that part of your grade is clarity, so if your answer was mostly right but unclear you may still not be eligible for full credit.

Regrade Requests for Autograded Problems

To submit an autograder regrade request, please fill out the Autograder Regrade Request Form.

The autograder is very picky: it expects your assignments to have exactly the correct file names, all functions must be named correctly, etc. If these are wrong, your code may not run and the autograder may assign zero points. This is a grading catastrophe 😧.

Grading catastrophes are preventable! After submitting your assignment, always wait around to see the output of the Gradescope grader and ensure that it runs properly. Also, be sure to submit your assignment (or at least part of it) to Gradescope with enough time before the deadline to get help if there is a strange autograder problem.

In the case that you submit code that doesn’t run and discover this at a later date, you have some options:

  1. If it is still before the late deadline, you may use slip days to fix your code and re-submit. Note that you’re free to do this even if your code runs – this is just making use of the normal slip day mechanism to submit an assignment late.
  2. If it is past the late deadline and your code requires only minor fixes (e.g., the file name is wrong) we will fix your code at the cost of 2 slip days. Note that these slip days are in addition to any slip days you already used on the assignment. You can submit a catastrophe regrade request the same way you submit a regular autograder regrade request, by filling out the Autograder Regrade Request Form.

Incompletes

In the unfortunate circumstance that you become sick, suffer a loss, or otherwise experience a significant setback that is outside of your control, you may be eligible for an Incomplete grade, which allows you to complete the rest of the work at a later time. If you are experiencing challenges due to circumstances outside your control, please contact me ASAP and we can discuss the best course of action. Note that an Incomplete does not allow you to re-do work that has already been completed, only to do work that hasn’t been completed, so it’s best to reach out right away.

A note on letter grades

The following is adapted from CSE 160 at the University of Washington.

Grading for this class is not curved in the sense that the average is set at (say) a B+ and half of the class must receive a grade lower than that. If everyone does well and shows mastery of the material, everyone can receive an A (this would be awesome!). If no one does well (this is unlikely), then everyone can receive a C.

Grading for this class is curved in the sense that we do not have a pre-defined mapping from project and exam scores to a final GPA. There is no pre-determined score (e.g., 90% of all possible points) that earns an A or a B or a C or any other grade. To determine the final grade, we will ask questions like “Did this student master the material?”. With that said, grades will not be any stricter than the standard grading scale (where an A+ is a 97+, A is 93+, A- is 90+, etc). For instance, the threshold for an “A” will never be higher than 93%.

Try your best not to worry about grades, and we’ll reciprocate by being fair. We’re in this together 😎.


Collaboration Policy and Academic Integrity 🤝

DSC 80 is known for being a rigorous but rewarding course. While you will be challenged this quarter, we will be offering you plenty of support through office hours and Ed. Make good use of these resources, and you will be able to succeed in this course.

There is no excuse for cheating in this course. If you do cheat, we will enforce the UCSD Policy on Integrity of Scholarship. This means you will likely fail the course and the Dean of your college will put you on probation or suspend or dismiss you from UCSD. Students agree that by taking this course, their assignments may be submitted to third-party software to help detect plagiarism.

Why is academic integrity important?

Academic integrity is an issue that is pertinent to all students on campus. When students act unethically by copying someone’s work, taking an exam for someone else, plagiarizing, etc., these students are misrepresenting their academic abilities. This makes it impossible for instructors to give grades (and for the University to give degrees) that reflect student knowledge. This devalues the worth of a UCSD degree for all students, making it imperative for the campus as a whole to enforce that all members of this community are honest and ethical. We want your degree to be meaningful and we want you to be proud to call yourself a graduate of UCSD!

The UCSD Policy on Integrity of Scholarship and this syllabus list some of the standards by which you are expected to complete your academic work, but your good ethical judgment (or asking us for advice) is also expected as we cannot list every behavior that is unethical or not in the spirit of academic integrity. Ignorance of the rules will not excuse you from any violations.

What counts as cheating?

In DSC 80, you can read books, surf the web, talk to your friends and the DSC 80 staff to get help understanding the concepts you need to know to complete your assignments. However, all code must be written by you (or, in the instance of projects, together with your partner).

The following activities are considered cheating and are not allowed in DSC 80 (not an exhaustive list):

  • Using or submitting code acquired from other students (except from your pair programming partner during projects), the web, or any other resource not officially sanctioned by this course
  • Posting your code online, including on Ed, unless privately to instructors only
  • Having any other person complete any part of your assignment on your behalf
  • Completing an assignment on behalf of someone else
  • Providing code, exam questions, or solutions to any other student in the course
  • Splitting up project questions with your pair programming partner and each working on different questions
  • Collaborating with others on exams

The following activities are examples of appropriate collaboration and are allowed in DSC 80 (not an exhaustive list):

  • Discussing the general approach to solving labs or projects
  • Talking about problem-solving strategies or issues you ran into and how you solved them
  • Discussing the answers to exams with other students who have already taken the exam after the exam is complete
  • Using code provided in class, by the textbook or any other assigned reading or video, with attribution
  • Google searching for documentation on Python or pandas
  • Working together with other students on assignments without copying or sharing answers
  • Posting a question about your approach to a problem on Ed, without sharing your code

The best way to avoid problems is by using your best judgement and remembering to act with Honesty, Trust, Fairness, Respect, Responsibility, and Courage. Here are some suggestions for completing your work:

  • Don’t look at or discuss the details of another student’s code for an assignment you are working on, and don’t let another student look at your code.
  • Don’t start with someone else’s code and make changes to it, or in any way share code with other students.
  • If you are talking to another student about an assignment, don’t take notes, and wait an hour afterward before you write any code.

Use of Generative Artificial Intelligence

Generative Artificial Intelligence (GenAI) describes tools, such as ChatGPT and GitHub Copilot, that are trained to generate responses to user-defined prompts, or questions. The existence of such tools is a major milestone in machine learning, and an impressive application of data science in the real world.

Our course policy on the use of GenAI tools for coursework is simple: you may use these tools to build an understanding of course material and to assist you on assignments, keeping in mind that no tool is a substitute for a strong understanding of course concepts.

Be mindful of how you are using GenAI tools. These tools can be very useful to help you preview material before lecture, summarize material after lecture, explain concepts you didn’t understand, and explore how different concepts are related. “Explain it like I’m five” can be a helpful prompt to give you a basic understanding of new concepts before being exposed to them in lecture. Consolidating your knowledge after learning something new and relating it to other things you know is important for learning and retention.

Unfortunately, GenAI tools are not a consistently reliable source of quality information. Because of how GenAI tools are trained, they often provide answers and write code that look correct, but aren’t actually correct. A goal of your education is to develop an ability to identify and produce information that actually is correct and doesn’t just sound correct. Human supervision of GenAI tools is always necessary.

Proceed with caution when using tools to assist you with your assignments. DSC 80 is a foundational class for your study of data science; you need to master the skills and concepts of this course if you want to use data science effectively. Through exams, you will be tested on your independent ability to apply course material to novel problems. Labs and projects are meant to prepare you for these assessments, so overreliance on GenAI for assignments will rob you of opportunities to learn and make it hard for you to perform well on exams.

If you do use GenAI to assist you on assignments, keep these guidelines in mind:

  • Design your prompts carefully. Don’t just ask one question; ask a follow-up question based on the output to the first. To use these tools effectively, you need to engineer your prompts carefully.
  • Test the outputs. GenAI tools can and do make mistakes, and being able to verify the correctness of a proposed answer is an important skill for you to develop. Validate the output against course-provided references, or follow up with a search on Google or Stack Overflow. Remember that GenAI tools provide crowdsourced likely answers, not necessarily correct answers.
  • Don’t submit any code that you don’t understand, or that uses content not taught in this class. In our experience last quarter, students who used ChatGPT to help with assignments ended up with code that was difficult for both them and the teaching staff to understand. If you answer questions with out-of-scope content, you are not practicing the foundational skills that the course is meant to teach you. Be careful!

If your assignment submission includes any content generated by an AI tool, it should be cited to acknowledge the source of the material. In each assignment, you will be provided with a space to explain and reflect on your use of GenAI tool(s).


Support 🫂

Accommodations

From the Office for Students with Disabilities (OSD):

OSD works with students with documented disabilities to review documentation and determine reasonable accommodations. Disabilities can occur in these areas: psychological, psychiatric, learning, attention, chronic health, physical, vision, hearing, and acquired brain injuries, and may occur at any time during a student’s college career. We encourage you to contact the OSD as soon as you become aware of a condition that is disabling so that we can work with you.

If you already have accommodations via OSD, please make sure that we receive your Authorization for Accommodation (AFA) letter by the end of Week 1 so that we can make arrangements for accommodations. Share your AFA letter with the instructor and the Data Science OSD Liaison, who can be reached at dscstudent@ucsd.edu.

Diversity and Inclusion

We are committed to an inclusive learning environment that respects our diversity of perspectives, experiences, and identities. Our goal is to create a diverse and inclusive learning environment where all students feel comfortable and can thrive. If you have any suggestions as to how we could create a more inclusive setting, please let us know. We also expect that you, as a student in this course, will honor and respect your classmates, abiding by the UCSD Principles of Community. Please understand that others’ backgrounds, perspectives and experiences may be different than your own, and help us to build an environment where everyone is respected and feels comfortable.

Campus Resources

If there is an issue you feel uncomfortable discussing with us, or if you are searching for help with a specific concern, there are several campus resources available to you, including:


Acknowledgements 🙏

This offering of DSC 80 builds off of prior offerings by Sam Lau, Tauhidur Rahman, Suraj Rampure, Justin Eldridge, Marina Langlois, and Aaron Fraenkel. Along with the help of their tutors and TAs, they developed much of the content that we will use in this course.