Lectures


Schedule

Week# Title
Week 1 Course outline and Overview of Mini Projects
a) Computer System Analytics - Blue Waters System Monitoring
b) Healthcare analytics - Genomics/Cancer Example
c) Healthcare analytics - Neuroscience Example
Resilience of large-scale systems
Week 2 Lecture 1 : Probability Basics Overview (Recorded Lecture)
Lecture 2 : Greg Bauer’s lecture on monitoring Blue Waters, a 13.3 PF supercomputer
Week 3 Implementing end-to-end workflow for resiliency analysis
(i) Data management for resiliency logs, selection of feature-pruning and dimensionality-reduction methods
(ii) Domain component: Reliability/availability modeling, dependent and common-mode failures and their characterization
(iii) Introduction to LogDiver toolchain for error-data acquisition and preliminary filtering of Blue Waters datasets
Details:
(i) Raw measurements to initial features using coalescing techniques (temporal reduction and data management)
(ii) Dimensionality reduction (Spatial reduction of dataset, PCA, correlation techniques, and sensitivity analysis)
(iii) Applying joint probabilities and conditional/marginal probabilities to understand dependent failures
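As a rough illustration of the temporal coalescing mentioned in item (i) above, the sketch below groups error events that occur close together in time into a single cluster. It is not the LogDiver implementation; the function name `coalesce`, the 60-second window, and the sample timestamps are all assumptions made for the example.

```python
def coalesce(timestamps, window=60.0):
    """Group event timestamps (in seconds) into clusters.

    A new cluster starts whenever the gap to the previous event
    exceeds `window`. The window size is an assumed tuning parameter.
    """
    groups = []
    for t in sorted(timestamps):
        if groups and t - groups[-1][-1] <= window:
            groups[-1].append(t)   # close enough: same cluster
        else:
            groups.append([t])     # gap too large: start a new cluster
    return groups


# Hypothetical error-log timestamps, in seconds.
events = [0, 10, 50, 200, 230, 1000]
clusters = coalesce(events, window=60)
print(clusters)
```

Each resulting cluster can then be treated as one candidate failure event, which reduces the raw log volume before any feature extraction.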
Lecture 1 : Failure/Reliability/Availability of Compute System, Series-Parallel Systems, ECC
Lecture 2 : Chipkill, Importance of Filtering in Data Analysis, Coalescing techniques, Mini-project 1 discussion & release
Describing project and related analytical methods:
1. Measuring reliability and availability of Blue Waters system and applications
2. Diagnosing the cause of failures
Week 4 Lecture 1 : Introduction to LogDiver and its application
Mini Project 1
Lecture 2 : Introduction to Bayesian Networks
Week 5 Lecture 1 : In-class lab on mini-project 1
Lecture 2 : In-class quiz (30 minutes)
Mini project 1 presentation (report submission)
Healthcare analytics: Quantifying drug response and disease progression
Week 6 Lecture 1: Introduction to healthcare domain: disease models, drug response, forecasting disease progression
Lecture 2: Guest Lecture by a Mayo clinician from the Center for Individualized Medicine
Week 7 Implementing end-to-end workflow for understanding breast-cancer-causing genetic factors
(i) Data management for genomics and related health records, capturing the unique characteristics of these records
(ii) Data filtering, feature selection, fitting distributions, selection of clustering methods (GMMs, K-means), longitudinal data analysis
(iii) Domain component: gene expression, dependence between gene expression and genetic diseases (in particular breast cancer), understanding disease progression
(iv) Introduction to MiMoSA toolchain and publicly available breast cancer dataset
Details:
(i) Raw measurements to initial features using sensitivity analysis
(ii) Identifying data modalities based on the source of the dataset and the demographics represented in the dataset
(iii) Using k-means and GMMs to cluster the processed dataset for understanding gene expression. Developing insight into the differences between these clustering methods
(iv) Using the developed framework to understand disease progression
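To make the contrast in item (iii) above concrete, the sketch below compares a k-means style hard assignment with a GMM style soft assignment for a single 1-D point. It is a minimal illustration only: the cluster parameters are hypothetical and fixed by hand rather than fitted, and the helper names are assumptions, not part of any course toolchain.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def kmeans_assign(x, centers):
    """Hard assignment: index of the nearest center (distance only)."""
    return min(range(len(centers)), key=lambda k: abs(x - centers[k]))


def gmm_responsibilities(x, weights, means, stds):
    """Soft assignment: posterior probability of each component given x."""
    joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical parameters: a narrow cluster at 0 and a wide cluster at 10.
means, stds, weights = [0.0, 10.0], [1.0, 5.0], [0.5, 0.5]
x = 3.0
hard = kmeans_assign(x, means)
soft = gmm_responsibilities(x, weights, means, stds)
print(hard, [round(p, 3) for p in soft])
```

The point at 3.0 is nearer to the center at 0, so the k-means rule picks cluster 0, while the mixture model gives most of the responsibility to the wide component because it accounts for each cluster's spread.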
Week 8 Lecture 1 : Mini project 2
Describing the specific project and related analytical methods: Gene expression analysis with focus on breast cancer
(i) Estimating distributions of gene expression using mixture models
(ii) Studying the efficacy of mixture-model clusters compared with standard clustering approaches that assume normality in the data
Lecture 2 : MIDTERM
Week 9 Clustering Techniques
Week 10 Lecture 1: In-class Lab
Lecture 2: Clustering Techniques
Mini project 2 presentation (out of class)
Autonomous Security Monitoring for Enterprise Systems: Data-driven learning and inference
Week 11 Lecture 1: Introduction to security monitoring for enterprise systems
Lecture 2: Guest Lecture
Week 12 Implementing end-to-end workflow for scheduling on a heterogeneous system
Implementing end-to-end workflow for security monitoring for enterprise systems
(i) Data acquisition using Bro Intrusion Detection System (IDS), Syslogs, Network logs
(ii) Using probabilistic graphical models for depicting attack evolution and malicious payload execution
(iii) Domain component: Security vulnerabilities, IDS tools
(iv) Introduction to container-based AttackTagger test-bed
Week 13 Lecture 1: Mini project 3 / Describing project
(i) Data acquisition by running selected benchmarks on personal laptops
(ii) Probabilistic graphical model parameterization and generation of attack evolution
(iii) Attack prediction and response
Lecture 2: In-class lab on mini-project 3
Week 14 Lecture 1 : In-class Lab
Lecture 2 : Graphical Models
Mini project 3 presentation (out of class)
Validation methods and review
Week 15 Lecture 1: Validation techniques - Fault injection based approaches and test bed simulation
Lecture 2: Review
FINAL As per the exam calendar