Problem:
A T3 or T2 burstable instance hosting a web server becomes slow and unresponsive during high traffic.
The instance works well during low traffic but performance degrades under load.
There is a suspicion that CPU credits are being exhausted.
Step-by-Step Troubleshooting:
Review CloudWatch Metrics:
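Check the CPUCreditBalance metric in CloudWatch to see whether the credit balance drops to zero during peak load.
Monitor CPUCreditUsage over time to confirm that spikes in credit consumption line up with the performance issues.
Review CPUUtilization to verify that the instance is hitting its CPU limit. Once credits run out in standard mode, utilization is held near the instance's baseline rather than reaching 100%.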
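The checks above can be sketched in Python. This is a minimal sketch, not a definitive diagnostic. It assumes the datapoints have already been pulled from CloudWatch (CPUCreditBalance and CPUUtilization in the AWS/EC2 namespace) into plain lists of (timestamp, value) pairs. The function name find_credit_exhaustion, the low_balance threshold, and the sample data are hypothetical.

```python
from datetime import datetime, timedelta

def find_credit_exhaustion(balance_points, utilization_points, low_balance=1.0):
    """Return (timestamp, utilization) pairs for every datapoint where the
    credit balance is at or below low_balance."""
    utilization_by_time = dict(utilization_points)
    exhausted = []
    for timestamp, balance in sorted(balance_points):
        if balance <= low_balance:
            exhausted.append((timestamp, utilization_by_time.get(timestamp)))
    return exhausted

# Made-up sample data: 5-minute datapoints during a traffic spike.
start = datetime(2024, 1, 1, 12, 0)
times = [start + timedelta(minutes=5 * i) for i in range(4)]
balance = list(zip(times, [40.0, 12.0, 0.0, 0.0]))
utilization = list(zip(times, [85.0, 90.0, 20.0, 20.0]))

for when, cpu in find_credit_exhaustion(balance, utilization):
    print(f"{when:%H:%M} credits exhausted, CPUUtilization={cpu}%")
```

If the timestamps it reports match the periods when the site was slow, credit exhaustion is a likely cause.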
Analyze Workload Patterns:
Identify whether the traffic load is consistently high or only spikes occasionally.
Determine whether the workload fits the burstable instance model, which suits workloads with low average CPU usage and occasional spikes.
Look for any sudden unexpected increases in CPU demand.
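To judge whether a traffic pattern fits the burstable model, it can help to simulate the credit balance. The sketch below is a simplified model of standard mode only, under stated assumptions: credits are earned at a constant rate, one credit is one vCPU at 100% for one minute, and the instance is held to baseline once the balance reaches zero. The figures used (2 vCPUs, 24 credits per hour, maximum balance 576) are assumed to match a t3.small and should be checked against the current AWS documentation. The function name and the traffic profile are hypothetical.

```python
def simulate_credit_balance(utilization_per_minute, vcpus, credits_per_hour,
                            max_balance, starting_balance):
    """Track the CPU credit balance minute by minute in standard mode.

    utilization_per_minute: demanded CPU utilization per vCPU (0.0 to 1.0).
    Returns the list of balances after each minute.
    """
    baseline = credits_per_hour / (60.0 * vcpus)  # baseline utilization per vCPU
    earn_per_minute = credits_per_hour / 60.0
    balance = starting_balance
    history = []
    for demanded in utilization_per_minute:
        # With no credits left, the instance is held to its baseline.
        allowed = demanded if balance > 0 else min(demanded, baseline)
        spent = allowed * vcpus  # one credit = one vCPU at 100% for one minute
        balance = min(max_balance, max(0.0, balance + earn_per_minute - spent))
        history.append(balance)
    return history

# Hypothetical profile: 30 quiet minutes at 5% CPU, then a 60-minute spike at 90%.
profile = [0.05] * 30 + [0.9] * 60
history = simulate_credit_balance(profile, vcpus=2, credits_per_hour=24,
                                  max_balance=576, starting_balance=60)
if 0.0 in history:
    print(f"Credits run out {history.index(0.0) + 1} minutes into the profile")
else:
    print("Credits never run out for this profile")
```

A profile that drains the balance well before the spike ends suggests the workload is sustained rather than bursty.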
Evaluate Instance Mode:
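Check whether the instance is in standard mode or unlimited mode. T2 instances launch in standard mode by default, while T3 instances launch in unlimited mode by default.
In standard mode, the instance is throttled to its baseline CPU performance when CPU credits are depleted.
In unlimited mode, the instance can keep using CPU above baseline after its credits run out by spending surplus credits, which can incur additional charges when average utilization stays above the baseline.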
If in standard mode and performance is critical, consider switching to unlimited mode.
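The mode check and switch can be sketched as follows. This is a minimal sketch that assumes the boto3 EC2 client, whose describe_instance_credit_specifications and modify_instance_credit_specification calls are used here. The client is passed in as a parameter so the logic can be exercised without AWS access. The function name ensure_unlimited and the instance ID in the usage comment are hypothetical.

```python
def ensure_unlimited(ec2_client, instance_id):
    """Switch the instance to unlimited mode if it is currently in standard mode.

    Returns the credit mode the instance was in before the call.
    """
    response = ec2_client.describe_instance_credit_specifications(
        InstanceIds=[instance_id]
    )
    current = response["InstanceCreditSpecifications"][0]["CpuCredits"]
    if current == "standard":
        ec2_client.modify_instance_credit_specification(
            InstanceCreditSpecifications=[
                {"InstanceId": instance_id, "CpuCredits": "unlimited"}
            ]
        )
    return current

# Usage (requires boto3 and AWS credentials; the instance ID is a placeholder):
#   import boto3
#   ensure_unlimited(boto3.client("ec2"), "i-0123456789abcdef0")
```

Switching to unlimited mode removes the throttling but not the underlying CPU demand, so the bill should be watched after the change.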
Consider Scaling Options:
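If the workload requires sustained CPU performance, vertical scaling or a change of instance family may be necessary.
Move to a fixed-performance instance family such as M5 or C5, which does not rely on CPU credits. A larger T2 or T3 size raises the baseline and the credit earn rate but is still credit-based.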
Alternatively, use horizontal scaling by adding more instances behind a load balancer to spread the load.
Configure Auto Scaling policies to handle traffic spikes automatically.
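As an illustration of an Auto Scaling policy, the sketch below builds the parameters for a target tracking policy on average CPU utilization. It assumes the boto3 Auto Scaling client and its put_scaling_policy call. The function name, the policy name, and the 50% target are hypothetical choices, not recommendations. On burstable instances the target probably needs to be chosen with the instance baseline in mind.

```python
def build_cpu_target_tracking_policy(group_name, target_cpu_percent=50.0):
    """Build keyword arguments for a target tracking scaling policy that
    keeps the group's average CPU utilization near target_cpu_percent."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-cpu-target",  # hypothetical naming scheme
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": float(target_cpu_percent),
        },
    }

params = build_cpu_target_tracking_policy("web-asg", target_cpu_percent=40)
print(params["PolicyName"], params["TargetTrackingConfiguration"]["TargetValue"])

# Usage (requires boto3, AWS credentials, and an existing Auto Scaling group):
#   import boto3
#   boto3.client("autoscaling").put_scaling_policy(**params)
```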
Review Application Behavior:
Analyze the application for inefficient code that could cause high CPU usage.
Check for unnecessary background tasks or services running on the instance.
Optimize the application to reduce CPU load, such as implementing caching or optimizing database queries.
Ensure that the application is properly tuned for performance.
Key AWS Terms:
CPU Credits: Units that burstable instances earn over time and spend to run above baseline. One CPU credit equals one vCPU running at 100% utilization for one minute.
CloudWatch: AWS monitoring tool for viewing metrics like CPU credit balance and usage.
Burstable Instances: Instance types such as T2 and T3 that provide a baseline level of CPU performance and use CPU credits to burst above it.
Standard Mode: The instance is throttled to its baseline performance when CPU credits run out.
Unlimited Mode: The instance can sustain CPU usage above baseline after its credits run out, for an additional charge, without being throttled.
Interview Insight:
When I see degraded performance on a T3 or T2 instance, I first check CloudWatch for CPU credit exhaustion.
If credits are consistently depleted, I consider switching to unlimited mode or moving to a non-burstable instance.
I also examine the workload to determine if scaling or optimization is needed to handle traffic efficiently.