CHAOS ENGINEERING

Gabriel Castro
3 min read · Oct 16, 2020

What is it?

Let’s say you’ve just built your awesome application to help cats find their soul mates using Ruby on Rails and React. You put in countless hours making sure your MVC, serializers, APIs, seeds and schema all talk to each other correctly. You’ve pored over the styling with Bootstrap, CSS and Google Maps styling. It’s production day and time for the world’s cats to find the love of their lives! Only then do you find that failures are hitting your application that you didn’t quite anticipate. This is where chaos engineering comes into play: the practice of experimenting on a system by injecting failures to identify weak points, which helps create a better end-user experience.

Give ’Em the Pickle!

Bob Farrell’s principle is simple: give your customers those special extra things that make them happy. How can we accomplish this in the engineering field? By testing our applications to their fullest, not only in development but even after they have gone to production. Therein lies the rub, the ghost in the machine.

Principles of Chaos Engineering

  1. Start by defining ‘steady state’ as some measurable output of a system that indicates normal behavior.
  2. Hypothesize that this steady state will continue in both the control group and the experimental group.
  3. Introduce variables that reflect real world events like servers that crash, hard drives that malfunction, network connections that are severed, etc.
  4. Try to disprove the hypothesis by looking for a difference in steady state between the control group and the experimental group.
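To make the four principles concrete, here is a minimal sketch of how the comparison between a control group and an experimental group might look. It assumes the steady-state metric is a request success rate and that a 5% difference is the cutoff; `success_rate`, `evaluate`, `ExperimentResult` and the sample data are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ExperimentResult:
    control_rate: float
    experimental_rate: float
    hypothesis_disproved: bool


def success_rate(outcomes: list[bool]) -> float:
    """Steady-state metric (assumed here): fraction of requests that succeeded."""
    if not outcomes:
        raise ValueError("need at least one observation")
    return sum(outcomes) / len(outcomes)


def evaluate(control: list[bool], experimental: list[bool],
             tolerance: float = 0.05) -> ExperimentResult:
    """Compare steady state between the two groups (principles 2 and 4)."""
    c = success_rate(control)
    e = success_rate(experimental)
    # The hypothesis is disproved when the groups differ by more than the tolerance.
    return ExperimentResult(c, e, abs(c - e) > tolerance)


# Made-up observations: True means the request succeeded.
control = [True] * 99 + [False]             # no fault injected
experimental = [True] * 80 + [False] * 20   # fault injected (principle 3)
result = evaluate(control, experimental)
print(result)
```

With this made-up data the two groups differ by well over the tolerance, so the hypothesis that steady state holds is disproved and a weak point has been found.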

In Practice

Alright, so we have our soul-mate-finding application for cats. We know that we want to do Bob Farrell proud and make our cat experience better than any other pet dating site out there! Now we can get to testing. Let’s say we want to run a simple test: bringing our server down while the application is live to simulate what would happen in the real world. Armed with our principles, we can do exactly that.

1. Steady State

Simple: our steady state is our front end communicating with our back end. We can observe this while the application is running by watching real-time data being transmitted to the server.
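One way to turn that steady state into a number is to probe the back end repeatedly and record how often it answers. The sketch below assumes the probe is any callable that returns `True` on success; `probe_steady_state` and `healthy_backend` are hypothetical stand-ins, not part of Rails or React.

```python
from typing import Callable


def probe_steady_state(ping: Callable[[], bool], attempts: int = 10) -> float:
    """Return the fraction of pings in which the back end answered."""
    successes = 0
    for _ in range(attempts):
        try:
            if ping():
                successes += 1
        except ConnectionError:
            # A refused or dropped connection counts as a failed probe.
            pass
    return successes / attempts


def healthy_backend() -> bool:
    # Stand-in for a real request from the front end to the back end.
    return True


print(probe_steady_state(healthy_backend))  # prints 1.0
```

In a real application the `ping` callable would wrap an actual request to a health endpoint; here it is kept abstract so the measurement logic stands on its own.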

2. Hypothesize

Remember back in grade school when you had to write a hypothesis for science class, guessing the outcome of an experiment you were about to run? It’s all about to pay off now. Our hypothesis is simply that when our server is offline, our application will not be able to function correctly.

3. Inject Chaos

We can introduce something like a security group rule that simply blocks the connection between our back end and our front end, and that we can apply and remove as needed for testing.
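The important property of such a rule is that it can be switched on for the experiment and is always switched off afterwards. The sketch below imitates that with an in-process flag rather than a real security group; `Backend`, `fetch_matches` and `blocked_connection` are hypothetical names used only to show the apply-and-remove pattern.

```python
from contextlib import contextmanager


class Backend:
    """Stand-in for the real back end; `blocked` mimics a deny rule."""

    def __init__(self) -> None:
        self.blocked = False

    def fetch_matches(self) -> list[str]:
        if self.blocked:
            raise ConnectionError("connection blocked by injected fault")
        return ["Whiskers", "Mittens"]


@contextmanager
def blocked_connection(backend: Backend):
    """Apply the fault for the duration of the block, then always remove it."""
    backend.blocked = True
    try:
        yield backend
    finally:
        # Runs even if the experiment raises, so the fault never lingers.
        backend.blocked = False


backend = Backend()
with blocked_connection(backend):
    try:
        backend.fetch_matches()
    except ConnectionError as err:
        print("during experiment:", err)
print("after experiment:", backend.fetch_matches())
```

Using a context manager here is a design choice: the cleanup in `finally` means a crashed experiment cannot leave the fault in place.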

4. Attempt to Disprove

Time to test our theory! If our application continues to run without any issue, then great! Our application is better than expected at handling attempts to take it offline. However, if our application does go offline, we would want to show our user something better than the dreaded 404 or 500 error, which means nothing to them. We can do this by implementing something like an “Offline Mode” that simply tells the user on the front page that the system is currently down.
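An “Offline Mode” can be as small as catching the connection failure and returning a friendly notice instead of letting the error reach the user. This is a minimal sketch under that assumption; `render_front_page`, `OFFLINE_MESSAGE` and the two fake back ends are hypothetical, and a real front end would do the equivalent in its own framework.

```python
from typing import Callable

OFFLINE_MESSAGE = "We're temporarily offline. Please check back soon!"


def render_front_page(fetch_matches: Callable[[], list[str]]) -> str:
    """Show matches when the back end answers, else a friendly notice."""
    try:
        matches = fetch_matches()
    except ConnectionError:
        # Offline mode: tell the user what is happening instead of a raw 500.
        return OFFLINE_MESSAGE
    return "Today's matches: " + ", ".join(matches)


def backend_up() -> list[str]:
    return ["Whiskers", "Mittens"]


def backend_down() -> list[str]:
    raise ConnectionError("back end unreachable")


print(render_front_page(backend_up))
print(render_front_page(backend_down))
```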

Conclusion

As you can see, this is a very simplified example of introducing chaos engineering into your own applications. Below I’ve left some resources that I found useful for better understanding the concept. It’s also easy to see how the same approach can be used to test a system against larger threats such as malware, DDoS attacks, etc.

Chaos Engineering at Netflix

Give ’Em the Pickle!

Gabriel Castro

Full Stack, Software Engineer. Focus in Rails, JavaScript and React.