Week# Date Title
Week 1 Jan 17 Course outline and Overview of Mini Projects - Slides
a) Computer System Analytics - Blue Waters System Monitoring
b) Healthcare analytics - Genomics/Cancer Example
c) Healthcare analytics – Neuroscience Example
Resilience of large-scale systems
Week 2 Jan 22 Lecture 1 : Probability Basics Overview (Recorded Lecture) - Slides
Jan 24 Lecture 2 : Greg Bauer’s lecture on monitoring Blue Waters, a 13.3 PF supercomputer - Slides
Implementing end-to-end workflow for resiliency analysis
(i) Data management for resiliency logs, selection of feature pruning methods, dimensionality reduction methods
(ii) Domain component: Reliability/availability modeling, Dependent and common-mode failures and their characterizations
(iii) Introduction to LogDiver toolchain for error-data acquisition and preliminary filtering of Blue Waters datasets
(i) Raw measurements to initial features using coalescing techniques (temporal reduction and data management)
(ii) Dimensionality reduction (Spatial reduction of dataset, PCA, correlation techniques, and sensitivity analysis)
(iii) Applying joint probabilities and conditional/marginal probabilities to understand dependent failures
Week 3 Jan 29 Lecture 1 : Failure/Reliability/Availability of Compute System, Series-Parallel Systems, ECC - Slides
Jan 31 Lecture 2 : Chipkill, Importance of Filtering in Data Analysis, Coalescing techniques, Mini-project 1 discussion & release - Slides
Describing project and related analytical methods:
1. Measuring reliability and availability of Blue Waters system and applications
2. Diagnosing the cause of failures
Week 4 Feb 5 Lecture 1 : Introduction to Bayesian Networks - Slides
Feb 7 Mini Project 1
Lecture 2 : Bayesian Networks - Slides
Week 5 Feb 12 Lecture 1 : In-class activity on Bayesian Networks - Homework 2 and In Class Activity
Feb 14 Lecture 2 : In-class lab - Notebook
Healthcare analytics: Quantifying drug response and disease progression
(i) Data management for genomics and related health records, capturing the unique characteristics of these records
(ii) Data filtering, feature selection, fitting distributions, selection of clustering methods (GMMs, K-means), longitudinal data analysis
(iii) Domain component: gene expression, dependence between gene expression and genetic diseases (in particular breast cancer), understanding disease progression
(iv) Introduction to MiMoSA toolchain and publicly available breast cancer dataset
(i) Raw measurements to initial features using sensitivity analysis
(ii) Identifying data modalities based on the source of the dataset and the demographics represented in the dataset
(iii) Using k-means and GMMs for clustering on the processed dataset to understand gene expression. Developing insight into the differences between these clustering methods
(iv) Using developed framework to understand disease progression
Week 6 Feb 19 Lecture 1: Introduction to health-care domain: disease models, drug response, forecasting disease progression - Slides
Feb 21 Lecture 2: MP1 In Class Presentations
Week 7 Feb 26 Lecture 1: Data filtering, feature selection, fitting distributions, selection of clustering methods (GMMs, K-means, Linear Regression) - Slides
Feb 28 Lecture 2: Clustering - Hierarchical Clustering - Slides
Week 8 Mar 5 Lecture 1: In Class Activity 2
Mar 7 Lecture 2: Principal Component Analysis, Guest Lecture by a Mayo clinician from Center for Individualized Medicine - Slides
Mar 10 Midterm Review - Slides
Week 9 Mar 12 Lecture 1: MIDTERM - Rubric
Mar 14 Lecture 2: In Class Lab on PCA - Notebook + Datasets
Autonomous Security Monitoring for Enterprise Systems: Data-driven learning and inference
Week 10 Mar 26 Lecture 1: Introduction to security monitoring for enterprise systems
Mar 28 Lecture 2: Guest Lecture
Week 11 Apr 2 Implementing end-to-end workflow for scheduling on a heterogeneous system
Apr 4 Implementing end-to-end workflow for security monitoring for enterprise systems
(i) Data acquisition using Bro Intrusion Detection System (IDS), Syslogs, Network logs
(ii) Using probabilistic graphical models for depicting attack evolution and malicious payload execution
(iii) Domain component: Security vulnerabilities, IDS tools
(iv) Introduction to container-based AttackTagger test-bed
Week 12 Apr 9 Lecture 1: Mini project 3 / Describing project
(i) Data acquisition by running selected benchmarks on personal laptops
(ii) Probabilistic graphical model parameterization and generation of attack evolution
(iii) Attack prediction and response
Apr 11 Lecture 2: In-class lab on mini-project 3
Week 13 Apr 16 Lecture 1 : In class Lab
Apr 18 Lecture 2 : Graphical Models
Mini project 3 presentation (out of class)
Validation methods and review
Week 14 Apr 23 Lecture 1: Validation techniques - Fault injection based approaches and test bed simulation
Apr 25 Lecture 2: Review
FINAL Wednesday, May 9, 7-10pm (as per the exam calendar)