Circuit Breakers and Application Resilience

For most people, circuit breakers are a concept that belongs to the world of electricity. They are usually a box of fuses under the stairs that prevents the tangle of extension cords from turning into an unexpected open fireplace behind the TV. But the circuit breaker is also a concept that we can apply to software and software services.

Martin Fowler described a software circuit breaker as follows:

“… a protected function call in a circuit breaker object, which monitors for failures. Once the failures reach a certain threshold, the circuit breaker trips, and all further calls to the circuit breaker return with an error, without the protected call being made at all”

Background

OpenTable runs on many microservices that depend on one another to deliver the OpenTable site. These services can have dependencies on data stores such as MongoDB and Elasticsearch, as well as on services such as RabbitMQ and Redis. Sometimes these dependencies have performance or availability issues, such as when somebody accidentally drops an index on a collection in Mongo.

This actually happened: your author accidentally dropped a critical index while working with a DB synchronization tool, which resulted in downtime. Hence this article.

When this happens, calls to the dependency may fail or take a long time to complete, which causes the dependent service to behave in the same way. Other services further upstream can then suffer the same problem. This is known as a cascading failure.

Ideally, what we would like to happen at this point is for our service to recognize what is happening, fail gracefully, stop calling the service that already can’t keep up with the load placed upon it, and alert us to the problem. This is where a software circuit breaker can help us.
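To make the idea concrete, here is a minimal sketch of the pattern Fowler describes, written in Python purely for illustration. It is not Polly's API, and the names used (CircuitBreaker, CircuitOpenError, failure_threshold, reset_timeout) are hypothetical. It assumes that a fixed number of consecutive failures trips the breaker, and that after a cool-down period a single trial call is allowed through to check whether the dependency has recovered.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the protected call is skipped."""


class CircuitBreaker:
    """Illustrative breaker: trips after N consecutive failures."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # failures before tripping
        self.reset_timeout = reset_timeout          # seconds to stay open
        self.clock = clock                          # injectable for testing
        self.failure_count = 0
        self.opened_at = None                       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: fail fast without touching the struggling dependency.
                raise CircuitOpenError("circuit open; call skipped")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failure_count += 1
            trial_failed = self.opened_at is not None
            if trial_failed or self.failure_count >= self.failure_threshold:
                self.opened_at = self.clock()  # trip, or re-trip after a failed trial
            raise
        # Success: close the circuit and forget earlier failures.
        self.failure_count = 0
        self.opened_at = None
        return result
```

A caller would wrap each call to the dependency in breaker.call(...) and treat CircuitOpenError as the signal to fail gracefully, for example by returning a cached or default response. The point at which the breaker trips is also a natural place to raise an alert.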

Polly to the rescue

[Read More]

Hosting external events

This June, OpenTable hosted the London Machine Learning Study Group to great success. We provided the food, drinks, and space for an extremely talented speaker and over 100 RSVPs. It was also my first experience hosting an external meetup at OpenTable - and I learned a lot!

My Experience

OpenTable has frequently hosted meetups in our office such as Web Platform London and WEBdeLDN. However, acting as an event host was a complete unknown for me, so leading up to the machine learning event, I made a point of sitting down with someone who had organised one before.

We documented the key steps in setting up an external event, and then published what we’d discussed on our internal wiki so as to avoid siloing useful information for future use. Following the steps documented in the wiki proved invaluable in reducing my stress on the day.

After our most recent event, we also looked back on how the last few had gone.

Retrospective

We have hosted a spate of meetups at OpenTable recently, including the London Machine Learning Study Group, so after the most recent one we held a retrospective. This was extremely useful for highlighting our shared experiences, as well as the things we want to consider for our next external event. Here are a few areas we will look to improve.

Push social media more

We want as many people as possible to know about these events ahead of time, which means not relying on just one platform. Instead, we aim to advertise our meetups on every channel available to us to reach as diverse an audience as possible. We will tweet about our events more via @OpenTableTechUK and we will use this blog to advertise upcoming events.

A lot of people don’t turn up

[Read More]