R Systems
Chaos Engineering for Azure Web Application Resiliency
Pages
6
Time to read
5 mins
Publication
Language
English
This technical report details the implementation of a disaster recovery solution for a business-critical web application deployed across multiple Azure regions. The application manages procurement, invoicing, and customer interactions, so it required high availability to prevent financial loss and customer dissatisfaction during outages. The report outlines the architecture: Azure Traffic Manager for traffic routing and health monitoring, Azure Firewalls for security, and SQL database replication for failover. It then describes the chaos engineering approach used to test the system's resiliency through simulated failures such as network disruptions and virtual machine shutdowns. The chaos experiments were run during Game Days to assess whether the application remained accessible under stress. The findings confirmed that the solution kept downtime minimal and preserved business continuity, demonstrating the system's robustness against regional outages.
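The failover behavior that the experiments probe can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the report's implementation and not an Azure API: it assumes priority-style routing (one of the routing methods Traffic Manager offers, though the summary does not say which was used), and the names `Endpoint`, `route`, `run_experiment`, `primary-region`, and `secondary-region` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    """A hypothetical regional deployment behind the traffic router."""
    name: str
    priority: int          # lower number = preferred region (assumed policy)
    healthy: bool = True   # result of the most recent health probe


def route(endpoints):
    """Return the healthy endpoint with the lowest priority number, or None."""
    candidates = [e for e in endpoints if e.healthy]
    return min(candidates, key=lambda e: e.priority) if candidates else None


def run_experiment(endpoints, fault_target):
    """Inject a fault into one region and report where traffic is routed."""
    before = route(endpoints)
    for e in endpoints:
        if e.name == fault_target:
            # Stand-in for a simulated VM shutdown or network disruption.
            e.healthy = False
    after = route(endpoints)
    return {
        "before": before.name if before else None,
        "after": after.name if after else None,
        "still_accessible": after is not None,
    }


if __name__ == "__main__":
    regions = [Endpoint("primary-region", 1), Endpoint("secondary-region", 2)]
    print(run_experiment(regions, "primary-region"))
```

In this toy model, a fault in the preferred region shifts traffic to the next healthy one, which is the property a Game Day experiment would try to confirm against the real deployment. The model ignores probe intervals, DNS caching, and database failover, all of which affect real recovery time.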