Why do systems fail? How do we determine what went wrong? How do we learn from failure to build better systems and prevent similar problems in the future? In this course we will examine a variety of ways that software and hardware systems can fail, along with their causes, impacts and (where applicable) remediation. We will learn about tools and techniques that can be used to debug, analyze and simulate failures, and will conduct a series of experiments in which we observe various forms of failure. The content and direction of the course will be determined, to some extent, by participants’ skills and interests.