Chaos Engineering: The fine art of breaking stuff in production

Traditional monitoring solutions are dead. In the era of microservices and distributed systems, your complex landscape is never 100% up. If the system is built well, this shouldn’t matter. But how do you test that? There is really only one place to test high availability, failsafes, and the other mechanisms you implement to keep your system up and running all the time: your production environment. Chaos Engineering uses a number of practices to run experiments in your production environment without impacting end users. This way you know that your system can recover when something goes wrong, instead of only having an idea of what would happen and finding out for certain when a real failure occurs. In this session we’ll cover the basics of chaos engineering, practices for creating an environment that allows for experimentation, and how you can start doing this in your own production environments.

