In this episode, we cover the following topics:
- AWS Well-Architected Framework
  - Provides a consistent approach to evaluating systems against cloud best practices
  - Helps identify the changes needed to align a specific architecture with best practices
  - Composed of 3 components:
    - Design Principles
    - Pillars
      - Operational Excellence
      - Security
      - Reliability
      - Performance Efficiency
      - Cost Optimization
    - Questions
- General design principles
  - Cloud-native has changed everything. In the cloud, you can:
    - Stop guessing capacity needs
    - Test at scale
    - Automate all the things to make experimentation easier
    - Allow for evolutionary architectures (you are never stuck with a particular technology)
    - Drive architectures using data (allows you to make fact-based decisions on how to improve your workload)
    - Improve through game days
- Pillars in depth
  - Operational Excellence
    - "Ability to run and monitor systems to deliver business value and to continuously improve supporting processes and procedures"
    - Design principles
      - Perform operations as code
      - Annotate documentation
      - Make frequent, small, reversible changes
      - Refine operations procedures frequently
      - Anticipate failure
      - Learn from all operational failures
    - Key service: CloudFormation
    - Focus areas
      - Prepare
        - Services: AWS Config, AWS Config Rules
      - Operate
        - Services: CloudWatch, X-Ray, CloudTrail, VPC Flow Logs
      - Evolve
        - Services: Elasticsearch (for searching log data to gain insights), CloudWatch Logs Insights
    - Best practices
      - Prepare
        - Implement telemetry for:
          - Application
          - Workload
          - User activity
          - Dependencies
        - Implement transaction traceability
      - Operate
        - Any event for which you raise an alert should have an associated runbook
          - The runbook defines triggers for escalations
        - Users should be notified when the system is impacted
        - Communicate status through dashboards
          - Provide dashboards to communicate the current operating status of the business and provide metrics of interest
      - Evolve
        - Feedback loops
          - Identify areas for improvement
          - Gauge impact of changes to the system (i.e., did it make an improvement?)
        - Perform operations metrics reviews
          - Retrospective analysis of operations metrics
          - Use these reviews to identify opportunities for improvement, potential courses of action, and share lessons learned
    - Key points
      - Runbooks, playbooks
      - Document environments
      - Make small changes through automation
      - Monitor workload with business metrics
      - Exercise your response to failures
      - Have well-defined escalation management
- In future episodes, we'll cover the remaining 4 pillars
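
One way to picture the "every alert should have an associated runbook" practice from the Operate notes is a small registry that ties each alert to its runbook and escalation trigger. The Python sketch below is purely illustrative and is not from the episode: the alert names, the `Runbook` fields, the contacts, and the time thresholds are all hypothetical, and a real setup would more likely keep this data alongside the alarm definitions themselves.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Runbook:
    """A runbook tied to one alert (all fields are hypothetical examples)."""
    title: str
    steps: Tuple[str, ...]
    escalate_after_minutes: int  # escalation trigger defined by the runbook
    escalation_contact: str


# Hypothetical registry: alert name -> runbook.
RUNBOOKS: Dict[str, Runbook] = {
    "HighErrorRate": Runbook(
        title="Investigate elevated 5xx responses",
        steps=("Check recent deployments", "Roll back the last change"),
        escalate_after_minutes=15,
        escalation_contact="on-call-lead",
    ),
    "QueueBacklog": Runbook(
        title="Drain a growing work queue",
        steps=("Check consumer health", "Scale out consumers"),
        escalate_after_minutes=30,
        escalation_contact="platform-team",
    ),
}


def alerts_missing_runbooks(alert_names: Iterable[str],
                            runbooks: Dict[str, Runbook] = RUNBOOKS) -> List[str]:
    """Return the alerts that are raised but have no runbook, sorted by name."""
    return sorted(set(alert_names) - set(runbooks))


def should_escalate(alert_name: str, minutes_open: int,
                    runbooks: Dict[str, Runbook] = RUNBOOKS) -> bool:
    """True once an alert has been open at least as long as its runbook allows."""
    return minutes_open >= runbooks[alert_name].escalate_after_minutes
```

A check like `alerts_missing_runbooks` could run in a build pipeline so that a new alert cannot ship without a runbook, which fits the "perform operations as code" principle.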
End Song:
30 Days & 30 Nights by Fortune Finder