DSC 40B – Theoretical Foundations of Data Science II


📜 Syllabus

Welcome to DSC 40B in Fall 2024! This page should answer most of the questions you might have about how the course is run; check out the frequently asked questions for answers to some common ones. If you don't find what you're looking for here, feel free to make a post on Campuswire.


Instructor

  • Dr. Justin Eldridge (you can just call me Justin)
    jeldridge@ucsd.edu
    webpage

There will be two different lecture times and two different discussion times, but they will cover the same content on the same schedule. You may attend whichever lecture section you would like after Week 02.

Getting Started

To get started in DSC 40B, you'll need to set up accounts on a couple of websites.

Campuswire

We'll be using Campuswire as our course message board. You should have received an invitation via email, but if not, you should be able to join by clicking the link above and using the access code 6036. Be sure to join Campuswire as soon as possible, since all course communication will be done through it.

If you have a question about anything to do with the course — if you're stuck on a homework problem, want clarification on the logistics, or just have a general question about data science — you can make a post on Campuswire. We only ask that if your question includes some or all of an answer, please make your post private so that others cannot see it. You can also post anonymously if you would prefer.

Course staff will regularly check Campuswire and try to answer any questions that you have. You're also encouraged to answer a question asked by another student if you feel that you know the answer.

Gradescope

We'll be using Gradescope for homework submission and grading. Most of the assignments will be a mixture of math and coding, and the coding parts are usually autograded via Gradescope. You should have received an email invitation for Gradescope, but if not, you can join with code XGKP8P.

Canvas

We will not be using Canvas. All course materials will be available at dsc40b.com or Gradescope.

Required Materials

You will not need to purchase any materials for this course; we'll use lecture slides as the main resource, as well as our own course notes. If you'd like additional textbooks to study from, we can recommend these:

  • Dasgupta, Papadimitriou, Vazirani; Algorithms
  • Cormen, Leiserson, Rivest, Stein; Introduction to Algorithms

These books are also excellent resources for preparing for coding interviews.

Lectures

Lectures will be held in-person at the regularly-scheduled time and place, but they will be podcasted and posted online for remote viewing. Attendance is appreciated, but not required.

Since there are two sections of the course, there will be two different lecture times, but they will cover the same content on the same schedule. The lecture times are:

  • 11:00 AM on T/Th in WLH 2111
  • 2:00 PM on T/Th in CSB 001

You may attend whichever lecture section you would like after Week 02.

You will be able to find the lecture recordings at podcast.ucsd.edu.

Office Hours

Course staff, including tutors, TAs, and instructors, will hold office hours regularly throughout the week. Please see the office hours page for the schedule and for instructions.

Discussions

There are two discussion times, but they will cover the same content on the same schedule.

  • 9:00 AM on Friday in PODEM 1A18
  • 6:00 PM on Monday in PODEM 1A19

The discussions review the materials from that week's lectures and prepare you for the homework. Just as with lecture, topics and techniques introduced in discussion might appear on the homework and in exams. In particular, some of the more difficult homework problems may be partially solved in discussion section to give you a good start.

Discussions will also serve as midterm reviews in the weeks leading up to the exams.

Attendance is recommended, but not required. The discussions will be podcasted, but because discussion sections usually involve a large amount of group work, the podcasted discussion might not be as useful as in-person attendance.

Labs

There will be two types of assignments in DSC 40B: labs and homeworks. Labs help develop essential knowledge, while homeworks test your ability to apply that knowledge to solve more difficult problems. You can think of labs as a quick check on your understanding before you head into the homework.

Labs consist of a small number of autograded multiple choice or numerical answer questions. They will be posted on Gradescope weekly. The exams will mostly consist of questions similar in format and difficulty to those on the labs. However, the exams will have a time limit, while the labs have no time limit.

In previous iterations of DSC 40B, these "essential" questions were actually a part of the homeworks. We have decided to move these essential problems to their own lab assignment, therefore making the homeworks shorter. This has a big benefit: because the labs are autograded and due before the homeworks, you'll get your lab grade before heading into the homework. This gives you an opportunity to patch up any misunderstandings.

Lab Redemptions

You should think of the labs as a first practice towards the goal of mastering the topics in DSC 40B. But the first time you practice anything, you're not going to be perfect. The key is to learn from the mistakes.

To encourage this, DSC 40B uses the concept of "redemption" on lab assignments. Under this policy, you may regain 85% of the credit for a lab problem that was previously answered incorrectly by submitting an explanation of your mistake along with a correction. This policy encourages you to revisit lab mistakes in order to correct your understanding, and allows us to give quick, targeted feedback through grading.

For a problem to be eligible for redemption, you must have submitted an answer to the problem.

If you got the problem correct, you'll receive the total number of points for it. If you didn't get the problem correct, even due to a relatively small mistake, you'll receive no credit until you submit a redemption request (see below). If your redemption request is accepted, you'll be given 85% of the credit for the problem. You can think of the 15% deduction as the cost of requiring a tutor to look over your redemption request — or, if you're a glass-half-full kind of person, as an incentive to get the problem correct the first time around.
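To make the arithmetic concrete, here is a minimal sketch of how the credit for a single lab problem works out under this policy. The function name, its parameters, and the 4-point problem are made up for illustration; this is not the actual grading code.

```python
REDEMPTION_RATE = 0.85  # fraction of credit regained, as stated in the policy


def lab_problem_score(points_possible, correct, answered=True,
                      redemption_accepted=False):
    """Return the points earned on one lab problem.

    Hypothetical helper for illustration only.
    """
    if correct:
        return points_possible
    # Redemption requires that an answer was submitted in the first place.
    if answered and redemption_accepted:
        return REDEMPTION_RATE * points_possible
    return 0.0


# A made-up 4-point problem under the three possible outcomes.
print(lab_problem_score(4, correct=True))                             # 4
print(lab_problem_score(4, correct=False))                            # 0.0
print(lab_problem_score(4, correct=False, redemption_accepted=True))  # 3.4
```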

Redemption Requests

There are two ways to submit a redemption request for a lab problem:

  • Option 1. Come to any tutor, TA, or instructor office hours and discuss the problem (preferred)
  • Option 2. Submit a regrade request on Gradescope.

Whichever method you choose, you should answer the following questions:

  1. What was the main misconception or misunderstanding that led to your answer being wrong?
  2. How did this misconception cause the wrong answer?
  3. How does fixing the misunderstanding lead you to the right answer?

The next section contains an example of a good redemption request.

The amount of detail needed in your request depends on how complex your mistake was; if it was a simple one, only one or two sentences may be necessary. A grader will review your request shortly (as long as you submit it within a week of the homework scores being posted, your regrade request will be reviewed). If you aren't able to identify what you did incorrectly, you'll be asked to attend a grader's office hours in order to discuss the problem in more detail.

Note that we will not be able to handle redemption requests which are submitted more than a week after you have received your grade. However, as long as your request is submitted within a week, we will process it.

Example

Here's a simple example to demonstrate the redemption process. Suppose you're given the following simple problem:

Question: What is 3 + 5 * 2?

Let's say you misapplied the order of operations, giving you an incorrect answer of 16 (the correct answer is, of course, 13). Here's a good redemption request that uses the template above:

1) I misapplied the order of operations. 2) I added before multiplying, so I got (3 + 5) * 2 = 8 * 2 = 16. 3) Multiplication should be done first so that we get 3 + (5 * 2) = 3 + 10 = 13.

Again, the key isn't just giving the right answer — that's published in the solutions, after all. The important part (according to the research) is identifying why you made the mistake.
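If you'd like to double-check the arithmetic in this example, a quick sketch in Python shows both the correct evaluation and the mistaken one. Python follows the usual precedence rules, so multiplication binds tighter than addition; the variable names are just for illustration.

```python
# Multiplication is applied before addition.
correct = 3 + 5 * 2       # evaluated as 3 + (5 * 2)
mistaken = (3 + 5) * 2    # the misapplied order from the example

print(correct)    # 13
print(mistaken)   # 16
```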

Homeworks

There will be eight homeworks assigned throughout the quarter, plus one "super homework" (described below). Homeworks will be a mixture of written problems (which are manually graded by our tutor staff) and coding problems (which are autograded). Each homework will be due via Gradescope at 11:59 PM on the Tuesday after it is assigned unless otherwise noted, and you'll have roughly a week to complete each assignment from the time it is posted.

The homework due date is carefully chosen to fit within a one week "cycle". A "week" in DSC 40B will start with Tuesday's lecture, followed by Thursday's. That week's discussion on Friday will review the lecture topics with an eye towards practical application. The lab is then due on Friday, giving you some practice before the homework. The homework is then due on the next Tuesday, giving you some time after the discussion and lab to complete it.

Regrade Requests

If you feel that the grader has made a mistake, you may submit a regrade request via Gradescope within one week of the grades being released. Note that part of your grade is clarity, so if your answer was mostly right but unclear you may still not receive full credit.

Note that regrade requests are not the same thing as redemption requests (though both are submitted on Gradescope in the same way). Unfortunately, we cannot offer redemption requests for homework problems as we do with lab problems — homework problems are typically more complex and require more time to grade, and regrading them would take more resources than we have available.

The "Super Homework"

Instead of a comprehensive final exam, we'll have a comprehensive "Super Homework". The super homework will focus on the content from the last two weeks of the quarter, but it will also contain material from throughout DSC 40B. It will be about twice as long as a typical homework.

Because the super homework covers twice as much material as a usual homework, it will be worth roughly twice as much. However, you may still collaborate on the super homework as long as you write up solutions in your own words.

The super homework will be due during finals week (the exact date is yet to be determined).

Collaboration and AI

You are highly encouraged to think about the lab and homework problems together, but you must turn in your own solutions written in your own words. We feel that discussing homework problems is an excellent way to learn, but writing the solutions in your own words promotes a deeper, more solid understanding than discussion alone.

We recommend the following way of working on the labs and homeworks. First, meet with your partner to discuss the solutions, but don't leave the meeting with anything written down. Wait an hour or so, then write up the solutions in your own words working from memory. In that hour, you will inevitably have forgotten some of the details of the solution. If you find that you have trouble filling them in, it's a sign that you might not have understood the solution as well as you first thought!

You're also encouraged to use AI (ChatGPT, etc.) in a similar way: you can talk to ChatGPT about a problem, but don't copy its answer verbatim. Instead, wait about an hour and put the answer in your own words. Keep in mind that ChatGPT is infamous for being very confidently wrong, so be critical of its output. Also keep in mind that you won't have ChatGPT on the exams, so you'll need to understand the fundamental concepts for yourself in order to do well.

If you have any questions or worries about whether your collaboration constitutes a violation of academic integrity, feel free to ask us on Campuswire.

Slip Days

You have five slip days to use throughout the quarter on any lab or homework (including the super homework). A slip day extends the deadline by 24 hours. Slip days cannot be "stacked" or "combined" to extend the deadline further — the latest any assignment can be submitted is 24 hours after the deadline. Slip days are applied automatically at the end of the quarter, but it's your responsibility to keep track of how many you have left.

Slip days are designed to be a transparent and predictable source of leniency in deadlines. You can use a slip day if you are too busy to complete an assignment on its original due date (or if you forgot about it). But slip days are also meant for things like the internet going down at 11:58 PM just as you go to submit your homework. Slip days are to be used in exceptional circumstances, so you probably shouldn't get close to using all of them. If you do get close to using that many, we will likely reach out to make sure that everything is OK.

Note that slip days are not designed to help in the case of a serious illness or other unfortunate event that severely disrupts your ability to participate in the class. If something like that should arise, please let us know ASAP!

Lastly, a technical note: some homeworks are broken up into multiple parts that are submitted separately (in particular, homeworks with programming problems will have one separate Gradescope submission for each programming problem in addition to the Gradescope submission for the written problems). Slip days are applied to the entire homework, not to individual parts of the homework, meaning that you only need to use one slip day per homework, no matter how many parts there are.
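Since it's your responsibility to keep track of your slip days, here is a rough sketch of how you might count them yourself. It assumes a homework's lateness is determined by its latest-submitted part, and that a submission exactly 24 hours after the deadline still counts; the function names and the dates are made up, and this is not how Gradescope or the course staff actually compute anything.

```python
from datetime import datetime, timedelta

TOTAL_SLIP_DAYS = 5                    # allowance stated in the policy
SLIP_EXTENSION = timedelta(hours=24)   # one slip day; they cannot be stacked


def slip_days_needed(deadline, part_submission_times):
    """Return 0 or 1 slip days for one assignment, or None if it is too late.

    All parts of a homework share a single slip day, so only the latest
    submission time matters (an assumption of this sketch).
    """
    latest = max(part_submission_times)
    if latest <= deadline:
        return 0
    if latest <= deadline + SLIP_EXTENSION:
        return 1
    return None  # more than 24 hours late: a slip day cannot cover it


def slip_days_remaining(assignments):
    """assignments: iterable of (deadline, part_submission_times) pairs."""
    used = 0
    for deadline, times in assignments:
        needed = slip_days_needed(deadline, times)
        if needed is None:
            raise ValueError("submitted more than 24 hours after the deadline")
        used += needed
    return TOTAL_SLIP_DAYS - used


# Made-up example: a homework with two parts, one of them submitted late.
hw_deadline = datetime(2024, 10, 8, 23, 59)
hw_parts = [datetime(2024, 10, 8, 21, 0), datetime(2024, 10, 9, 10, 30)]
print(slip_days_needed(hw_deadline, hw_parts))         # 1
print(slip_days_remaining([(hw_deadline, hw_parts)]))  # 4
```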

Exams

Midterms

There will be two midterm exams:

  • Midterm 01: Tuesday, October 29 (focuses on Lectures 01 — 08)
  • Midterm 02: Tuesday, November 26 (focuses on Lectures 09 — 15)

The exams will be held in-person during the regularly-scheduled lecture times.

For each midterm, you'll be allowed one sheet of notes on standard 8.5 by 11 inch paper, front and back. The notes can be handwritten, typed up, painted, etched with a laser, whatever, but it must be on paper (i.e., you can't use an iPad to display your notes during the exam).

The midterm questions themselves will be most similar to the practice problems at dsc40b.com/practice; in fact, all of the practice problems are former exam problems. More details about the midterm will be sent out about one week beforehand.

Final Exam

The final exam for DSC 40B is a "no fault" final split into two sections:

  1. An optional Midterm 01 "Redemption" section focusing on Lectures 01 — 08
  2. An optional Midterm 02 "Redemption" section focusing on Lectures 09 — 15

If your score on the midterm redemption section is higher than your score on the original midterm, it will replace that grade. Getting a lower score on a redemption section cannot hurt you (but it will make us sad). As a consequence, the redemption sections are effectively optional.

Under this policy, a bad performance on an earlier exam can be erased by good performance on the same material in a later exam.

Example: You got an "F" on Midterm 1 and a "B" on Midterm 2. You decide to take only the first redemption section on the final (though you could have taken both), and you receive an "A". Your midterm scores are now "A" and "B".
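The same example can be sketched with numbers. The scores below are made-up stand-ins for the letter grades above, and the function name is hypothetical; the point is just that each midterm counts as the larger of the original score and the redemption score.

```python
def final_midterm_score(original, redemption=None):
    """Score that counts for one midterm.

    redemption=None means the redemption section was skipped.
    """
    if redemption is None:
        return original
    # A lower redemption score can never hurt you.
    return max(original, redemption)


# Made-up percentages: a low Midterm 01 redeemed, Midterm 02 left alone.
print(final_midterm_score(55, redemption=95))  # 95
print(final_midterm_score(85))                 # 85
# Doing worse on a redemption section leaves the original score in place.
print(final_midterm_score(85, redemption=70))  # 85
```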

The redemption exams will be held on the date scheduled by the registrar: Saturday, December 07.

Note that the topics from Lectures 16, 17, and 18 are not on any exam. These will instead be tested in the Super Homework.

For the redemption exams, you're allowed one sheet of notes per exam that you're taking.

Exam Pass Criterion

In order to pass the class, the mean of your two midterm scores (after redemption is taken into account) must be 60% or greater.

The reason for this policy is that the exams are the only assessment in this class which you are sure to complete by yourself, and so they are (in theory) the purest measure of your individual understanding. This policy is not meant to be punitive: If your exam scores are not above passing after several attempts, it indicates that you might be better served by retaking the class with a fresh start before moving on to later courses which will draw upon the material from DSC 40B.

Grading

We'll be using the following grading scheme:

  • 12.5%: Labs
  • 30%: Homeworks
  • 7.5%: "Super Homework"
  • 25%: Midterm 01 (or Redemption Midterm 01, whichever is larger)
  • 25%: Midterm 02 (or Redemption Midterm 02, whichever is larger)

In a typical quarter, the midterm redemption policy has the same effect as a traditional "curve", therefore replacing the need for one. The standard grading scale (where an A is 93+, A- is 90+, B+ is 87+, etc.) will be used as a starting point, but once all scores are in, we will run a clustering algorithm to automatically find the best cutoffs for each letter grade. These cutoffs can only be lowered. For instance, the threshold for an "A" will never be higher than 93%.

A+ grades are not awarded according to a threshold. Instead, A+'s are awarded to the top 5% of students by overall grade.

Note that in order to pass the class, the mean of your two midterm scores (after redemption is taken into account) must be 60% or greater.
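If you'd like to estimate your overall score, here is a minimal sketch of the weighted average described above, together with the exam pass criterion. It assumes every score is a percentage between 0 and 100 and that the midterm scores already reflect any redemption; the names and the sample scores are made up, and the final letter-grade cutoffs are not modeled here.

```python
# Weights from the grading scheme above.
WEIGHTS = {
    "labs": 0.125,
    "homeworks": 0.30,
    "super_homework": 0.075,
    "midterm_01": 0.25,
    "midterm_02": 0.25,
}

EXAM_PASS_THRESHOLD = 60.0  # required mean of the two midterm scores


def overall_grade(scores):
    """Weighted average of the component percentages."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def meets_exam_criterion(midterm_01, midterm_02):
    """True if the mean of the two midterm scores is 60% or greater."""
    return (midterm_01 + midterm_02) / 2 >= EXAM_PASS_THRESHOLD


# Made-up scores for illustration.
scores = {
    "labs": 90,
    "homeworks": 80,
    "super_homework": 70,
    "midterm_01": 60,
    "midterm_02": 100,
}
print(round(overall_grade(scores), 2))                                     # 80.5
print(meets_exam_criterion(scores["midterm_01"], scores["midterm_02"]))    # True
```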

Support and Resources

As instructors, our job is to foster an environment where everyone, regardless of identity, feels welcome and is able to focus on learning. If there is something we can do in this mission, or if there is something preventing you from succeeding in the class, please let us know. If you feel uncomfortable speaking with us or are searching for help on a specific concern, there are several campus resources available to you.

More generally, if you have any concerns about your ability to focus or succeed in this course, or just need someone to talk to, please contact us ASAP and we'll figure something out.

OSD Exam Accommodations

If you have exam accommodations from the OSD, you should receive an email from the data science program that will ask you to provide your availability for your accommodated exam. The program will then schedule the exam and notify the instructor of its time and location. If you do not receive such an email by the end of the second week of classes, please let us know!

Please be sure to respond to the email from the data science program; if the program does not hear back from you, they will be unable to schedule your accommodated exam.

Waitlist

If you're on the waitlist, make sure you participate in the class just as if you were enrolled (for example, by doing all of the assignments) so that if you do get in, you're not behind.

Often, people will ask about their chances of making it off the waitlist. Unfortunately, that can be hard to answer! In some quarters, the waitlist moves a lot; in others, not at all.

FAQ

Is this class curved?

In a typical quarter, the midterm redemption policy has the same effect as a traditional "curve", therefore replacing the need for one. The standard grading scale (where an A is 93+, A- is 90+, B+ is 87+, etc.) will be used as a starting point, but once all scores are in, we will run a clustering algorithm to automatically find the best cutoffs for each letter grade. These cutoffs can only be lowered. For instance, the threshold for an "A" will never be higher than 93%.