User-centric Systems for Data Science (CS 599 L1)
This project is maintained by jliagouris
Make sure to become familiar with the Official Semester Dates. Some of the critical Semester Dates are:
Students are expected to attend each lecture in person, in accordance with the BU safety guidelines. All course material will be posted on Piazza. Ultimately, students are responsible for their own learning and, thus, for keeping up with the material.
Date | Topic | Note |
---|---|---|
09/06 | Lec 0: Course introduction | |
09/08 | Lec 1: Database Concepts | Overview of basic concepts in Part I |
09/13 | Lec 2: Data Provenance | Read: Provenance: What's Next? |
09/15 | Lec 3: Introduction to Ray | Read: Ray: A Distributed Framework for Emerging AI Applications (Ray is the system we will be using in the assignments) |
09/20 | Lec 4: Discussion on Assignments #1 and #2 | Read: Explaining Collaborative Filtering Recommendations |
09/22 | Hacking Day | |
09/27 | Lec 5: Explaining Non-Answers | Read: Why not? |
09/29 | Lec 6: Data Causality | Read: Causality in Databases |
10/04 | Lec 7: Dataflow Provenance | Read: Explaining outputs in modern data analytics; Optional: Provenance for generalized Map and Reduce workflows |
10/04 | DUE DATE: Assignment #1 | |
10/06 | Lec 8: Discussion on Assignment #1; Quiz #1 (during lecture) | Common mistakes in Assignment #1 |
10/11 | No Lecture | Substitute Monday |
10/13 | Lec 9: Machine Learning Concepts | Overview of basic concepts in Part II |
10/18 | Lec 10: Generalized Additive Models | Read: Intelligible Models for Classification and Regression; Watch: The Science Behind InterpretML: Explainable Boosting Machine |
10/20 | Lec 11: Explaining Classification Results (LIME) | Read: “Why should I trust you?” Explaining the predictions of any classifier; Watch: The Science Behind InterpretML: LIME |
10/21 | DUE DATE: Assignment #2 | |
10/25 | Lec 12: Interpreting Model Predictions (SHAP) | Watch: The Science Behind InterpretML: SHAP; Optional: A unified approach to interpreting model predictions |
10/27 | Lec 13: Discussion on Assignment #3; Quiz #2 (during lecture) | |
11/01 | Lec 14: Guest Lecture by Bojan Karlas (Harvard Medical School) | Data Systems for Managing and Debugging Machine Learning Development Workflows |
11/03 | Lec 15: Distributed Systems Concepts | Overview of basic concepts in Part III |
11/08 | Lec 16: Causal Profiling | Read: Coz: finding code that counts with causal profiling; Optional: SOSP'15 talk |
11/10 | Lec 17: Distributed System Tracing | Read: Dapper: A large-scale distributed systems tracing infrastructure; Optional: X-Trace: A pervasive network tracing framework |
11/11 | DUE DATE: Assignment #3 | |
11/15 | Lec 18: Distributed System Tracing (cont.); Discussion on Assignment #4 | Read: Pivot Tracing: Dynamic causal monitoring for distributed systems |
11/17 | Hacking Day | Office hours during lecture time |
11/22 | Lec 19: Blocked Time Analysis | Read: Making sense of performance in data analytics frameworks; Optional: Scalability! But at what COST? |
11/24 | No Lecture | Thanksgiving Recess |
11/29 | Lec 20: Critical Path Analysis | Read: SnailTrail: Generalizing critical paths for online analysis of distributed dataflows |
12/01 | Lec 21: Guest Lecture | TBA |
12/06 | Lec 22: Log-based Performance Analysis | Read: The Mystery Machine: End-to-end performance analysis of large-scale internet services |
12/08 | Lec 23: Black-Box Performance Analysis; Quiz #3 (during lecture) | Read: Performance debugging for distributed systems of black boxes |
12/09 | DUE DATE: Assignment #4 | |