Where To Start
How to get started practicing Chaos Engineering in 5 simple steps
Identify your top 5 critical services
Choose one of these critical services
Whiteboard the service with your team
Select the Recommended Gremlin Scenario (based on your use cases)
Determine the magnitude: the number of servers and the length of time
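One way to capture the outcome of these five steps is a small experiment plan that records the chosen service, the scenario, and the magnitude. The Python sketch below is illustrative only: the `ExperimentPlan` class, its field names, and the example values are assumptions made for this example and are not part of Gremlin's product or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentPlan:
    """Hypothetical record of one planned Chaos Engineering experiment."""
    service: str           # the critical service chosen in step 2
    scenario: str          # the scenario selected in step 4
    fleet_size: int        # total servers running the service
    target_percent: int    # share of the fleet to impact (1-100)
    duration_seconds: int  # how long the experiment runs

    def __post_init__(self) -> None:
        # Reject plans whose magnitude is undefined or out of range.
        if self.fleet_size < 1:
            raise ValueError("fleet_size must be at least 1")
        if not 1 <= self.target_percent <= 100:
            raise ValueError("target_percent must be between 1 and 100")
        if self.duration_seconds < 1:
            raise ValueError("duration_seconds must be at least 1")

    def target_hosts(self) -> int:
        """Servers to impact, rounded up so at least one host is targeted."""
        return (self.fleet_size * self.target_percent + 99) // 100

    def host_seconds(self) -> int:
        """Rough magnitude figure: impacted servers times duration."""
        return self.target_hosts() * self.duration_seconds


# Example values are made up for illustration.
plan = ExperimentPlan(
    service="checkout",
    scenario="slow dependency",
    fleet_size=40,
    target_percent=5,
    duration_seconds=120,
)
print(plan.target_hosts(), plan.host_seconds())
```

Starting with a small percentage and a short duration keeps the first experiment contained; both values can be raised in later runs once the team is confident in the results.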
What use cases are you focusing on for the next 12 months?
Slow dependency
Unavailable dependency
Starved resources
Auto-scaling for peak traffic
Host failure within a fleet or cluster
Disaster recovery (region failover)
Alerting and monitoring observability
Time-based issues (certificates and DST)
Training for on-call
Application request failures (ALFI)
Playbook validation
What business goals are you focusing on for the next 12 months?
Regulatory compliance
Increase brand trust and reduce customer churn
Validate disaster recovery
Reduce incidents
Automate Chaos Engineering experiments in a build pipeline
Reduce downtime for customers
Migration to a new technology platform
Migration to a Public Cloud
What is your business justification for practicing Chaos Engineering?
Utilize Chaos Engineering as part of your automated test coverage
Lower costs from timely migration or DR testing
Increase engineering velocity
Reduce churned customers
Reduce lost revenue from outages