10-Jan Webpage Up!

Course Description

Explosive changes in applications have been observed with the influx of machine intelligence (AI) into systems with critical resilience requirements. This new generation of systems, while managed by AI, also relies on conventional fault models that depend on traditional methods of redundancy, coding, checkpointing, and storage management. This course will address both classical and AI-centric fault tolerance techniques via three applications with significant societal impact.

Autonomous Vehicles: Autonomy is already a part of modern vehicles, and newer methods are increasingly being adopted, which has led to significant safety concerns.

Hybrid Cloud Infrastructure: The performance and resilience of cloud systems are rapidly being automated, driving new fault models and recovery methods tied to performance and resilience objectives, referred to as service-level metrics.

Resource Disaggregation: Disaggregation is an emerging paradigm in datacenters in which resources such as computing, storage, and memory are decoupled from their physical limits, allowing dynamic allocation based on real-time demands and workloads. Resource disaggregation introduces new fault models due to communication between traditionally colocated components.

Course Components
  • Lectures on classical and AI-centric fault tolerance techniques
  • Guest lectures from industry and academia
  • Student-led presentations and discussion
  • In-class design innovation activity
  • A project focused on resilience assessment and design

More details here

  • Class Timings: Tue/Thu 12:30pm - 1:50pm (CT) at 2013 ECE Building. Live lectures and discussion.
  • Paper discussions and class announcements will be made on Campuswire. Students should enroll in the “ECE 542 Fault Tolerant Digital Systems Design” course.
  • Presentations, reviews, and reports should be submitted on Canvas.
  • Signup Deadlines: Presentation (TBD), Project (TBD)
  • Evaluation: Details here
  • Academic Accommodation: DRES requirements must be reported to the instructor/TA by the end of the first week (1/19/2024)


Instructor
Ravishankar K. Iyer
Prof. Iyer
Office Hours: 255 Coordinated Science Lab; 10:00am - 11:00am Monday
Email:

Teaching Assistant
Archit Patke
Archit
Office Hours: 245 Coordinated Science Lab; 10:00am - 11:00am Wednesday
Email:

Academic Integrity Policy