What is the SAGA Pattern?
The Saga pattern has been around for over 30 years and is widely used for distributed transaction management. It is one of the ways to maintain data consistency in distributed systems. In microservices-based architectures, data consistency quickly becomes a headache for developers: since each service can have its own database and run in a different environment, a single ACID transaction can no longer span the whole operation. A saga gives up the all-or-nothing atomicity of one global transaction and relies on eventual consistency instead.
When to use it?
Use it when strict atomicity is not a necessity in a distributed system, such as a microservices-based architecture with a database per service. For example, when you make an online holiday reservation, the system first returns a success message to the customer and then sequentially makes the individual reservations (flights, including any connecting flights, hotel, car, etc.) using different subsystems.
SAGA vs. 2PC
2PC works as a single commit and aims to perform ACID transactions on distributed systems. It is used wherever strong consistency is important. On the other hand, SAGA works sequentially, not as a single commit. Each operation gets committed before the subsequent one, and this makes the data eventually consistent. Thus, Saga consists of multiple steps whereas 2PC acts like a single request.
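One important consideration with 2PC is that its coordinator can become a single point of failure: if it goes down in the middle of the protocol, the participants stay blocked, holding their locks, until it recovers. This is known as the “blocking problem”.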
Choreography-Based Saga
In this approach, there is no orchestrator in the system. Each service performs its own step and, if the result is successful, fires a success event so that the next step can continue. In case of a failure, it fires a failure event for the previous step, so the services that have already completed their steps can perform their rollbacks one after another, in reverse order.
This approach is useful for flows that consist of only a few steps. As the number of steps grows, the events and the overall design become more complicated to manage.

- Order Service creates the order and sends “order_created_event” into the message queue.
- Payment Service first receives the message, creates the payment and then sends “payment_billed_event” into the message queue.
- Stock Service receives this message from Payment Service, performs the required processes, and sends “stock_prepared_event” to the message queue.
- Delivery Service runs into an error while processing this event and sends back a “delivery_failed_event” message to roll back the entire process.
- Stock Service receives this failure message, uses the transaction_id provided in it to find the affected work, performs a compensation process, and sends back “stock_failed_event”.
- Finally, Payment Service receives the failure message, performs its compensation process, and sends “payment_failed_event” to Order Service.
- Order Service for such cases can use a retry mechanism with some delay, and if the error persists, it can use some warning mechanisms for a manual check.
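The flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a production design: the `EventBus` class is a hypothetical in-memory stand-in for a real message queue, the handler names and the `state` dictionary are invented for the example, and `delivery_scheduled_event` is an assumed name for the success event of the last step. The other event names come from the list above.

```python
from collections import defaultdict, deque


class EventBus:
    """Hypothetical in-memory stand-in for a message queue."""

    def __init__(self):
        self.handlers = defaultdict(list)
        self.queue = deque()
        self.log = []  # every event name, in publication order

    def subscribe(self, event_name, handler):
        self.handlers[event_name].append(handler)

    def publish(self, event_name, transaction_id):
        self.log.append(event_name)
        self.queue.append((event_name, transaction_id))

    def run(self):
        # Deliver queued events until no service publishes anything new.
        while self.queue:
            event_name, transaction_id = self.queue.popleft()
            for handler in self.handlers[event_name]:
                handler(transaction_id)


def run_choreography_saga(delivery_fails):
    bus = EventBus()
    state = {}  # what each service has committed locally

    def create_order(tx):
        state["order"] = "created"
        bus.publish("order_created_event", tx)

    def bill_payment(tx):
        state["payment"] = "billed"
        bus.publish("payment_billed_event", tx)

    def prepare_stock(tx):
        state["stock"] = "prepared"
        bus.publish("stock_prepared_event", tx)

    def deliver(tx):
        if delivery_fails:
            bus.publish("delivery_failed_event", tx)
        else:
            state["delivery"] = "scheduled"
            bus.publish("delivery_scheduled_event", tx)  # assumed name

    # Compensations: each service undoes its own local commit.
    def release_stock(tx):
        state["stock"] = "released"
        bus.publish("stock_failed_event", tx)

    def refund_payment(tx):
        state["payment"] = "refunded"
        bus.publish("payment_failed_event", tx)

    def mark_order_failed(tx):
        # A real Order Service could retry or raise a warning here.
        state["order"] = "failed"

    # No central coordinator: services only react to each other's events.
    bus.subscribe("order_created_event", bill_payment)
    bus.subscribe("payment_billed_event", prepare_stock)
    bus.subscribe("stock_prepared_event", deliver)
    bus.subscribe("delivery_failed_event", release_stock)
    bus.subscribe("stock_failed_event", refund_payment)
    bus.subscribe("payment_failed_event", mark_order_failed)

    create_order("tx-1")
    bus.run()
    return state, bus.log
```

Calling `run_choreography_saga(delivery_fails=True)` walks through the failure scenario: the three forward events are followed by the three failure events, and each service ends up in its compensated state. Note that every extra step needs a new event pair and two more subscriptions, which is why this style gets harder to follow as the flow grows.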
Orchestration-Based Saga
In this approach, on the other hand, there would be an orchestrator to manage the entire operation from one center. An orchestrator receives a start command from a source and commences calling related services sequentially. After each successful response, it makes the next call to the following service. If one of the steps fails and the service returns a failure message, the orchestrator makes rollback calls for each previous step/service.
Because the orchestrator brings along some scalability issues and the risk of a single point of failure, developers should plan the necessary recovery actions or simply go for a choreography-based architecture if they can.

- Order Service creates the order and invokes the Orchestrator.
- Orchestrator sends “order_created_event” to Payment Service.
- Payment Service creates the payment and sends “payment_billed_event” to the Orchestrator.
- This time, the Orchestrator sends “payment_billed_event” to Stock Service.
- Stock Service runs into an error while processing this event and sends “stock_failed_event” to the Orchestrator.
- Orchestrator initiates the rollback cycle and sends “rollback_event” to the previous service, which is the Payment Service in this scenario.
- Payment Service performs a compensation process after receiving this message.
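The same scenario can be sketched with a central coordinator. Again, this is a minimal illustration under assumptions: `SagaOrchestrator`, `StepFailed`, and the service functions are hypothetical names, and direct function calls stand in for the events exchanged between the Orchestrator and the services.

```python
class StepFailed(Exception):
    """Raised by a service when it cannot complete its step."""


class SagaOrchestrator:
    def __init__(self, steps):
        # steps: list of (name, action, compensation) tuples, in execution order
        self.steps = steps
        self.log = []

    def execute(self, transaction_id):
        completed = []
        for name, action, compensation in self.steps:
            try:
                action(transaction_id)
            except StepFailed:
                self.log.append(f"{name}:failed")
                # Roll back every step that already committed, newest first.
                for done_name, undo in reversed(completed):
                    undo(transaction_id)
                    self.log.append(f"{done_name}:compensated")
                return False
            self.log.append(f"{name}:done")
            completed.append((name, compensation))
        return True


payments = {}  # stand-in for the Payment Service's own database


def bill_payment(tx):
    payments[tx] = "billed"


def refund_payment(tx):
    payments[tx] = "refunded"


def prepare_stock(tx):
    # Simulates the Stock Service failure from the scenario above.
    raise StepFailed("stock could not be prepared")


def release_stock(tx):
    pass  # nothing to undo, because the step never committed


saga = SagaOrchestrator([
    ("payment", bill_payment, refund_payment),
    ("stock", prepare_stock, release_stock),
])
succeeded = saga.execute("tx-1")
```

Here the whole flow lives in one place, the `steps` list, which makes it easier to read and extend than a web of event subscriptions. The trade-off is the one mentioned above: everything depends on the orchestrator staying available.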
Thanks for reading 🙂
