🧩 Right-Sizing Resources for Azure & OCI
Shrink your bill, not your performance. If your cloud costs feel oversized but your apps aren’t, you’re probably due for a little right-sizing TLC. 🍳 This guide serves up Azure & OCI-specific tips, smart defaults, and habits to right-size once—and keep it that way. Let’s turn “more RAM just in case” into “just enough for the job.” 🛠️
Table of Contents
- ⚙️ What is Right-Sizing? (No, it’s not a gym thing)
- 🎯 Why It Matters (Especially for SMBs)
- 📊 Key Metrics for Right-Sizing
- Azure Right-sizing Implementation
  - Virtual Machine Right-sizing
  - Database Right-sizing
  - Storage Right-sizing
- OCI Right-sizing Implementation
  - Compute Instance Right-sizing
  - Database Right-sizing
  - Storage Right-sizing
- Right-sizing Automation
- 🧠 Best Practices for Right-Sizing
- 🛠️ Common Right-Sizing Challenges
⚙️ What is Right-Sizing? (No, it’s not a gym thing)
Right-sizing means matching your compute and storage to what your workload actually needs — not what sounded good at provisioning time. It’s the cloud equivalent of swapping your unused 8-seater SUV for a sleek hybrid: same destination, way less fuel burned.
🧮 The payoff? Most teams save 20–40% just by adjusting instance types, sizes, and volumes. No refactor required. Just smarter choices.
🎯 Why It Matters (Especially for SMBs)
💸 1. Cost Optimization
Overprovisioned = Overpaid. Right-sizing is the easiest way to stop funding idle VMs and give your budget a breather. Estimated savings: 20–40% without even touching your architecture.
🚀 2. Performance Boost
Right-size ≠ underpowered. It means your workloads get exactly what they need — no more, no less. Think fewer hiccups, faster response times, and happier users.
🌍 3. Environmental Impact
Yes, the cloud is virtual — but those servers are still drawing power somewhere. Shrink the waste, and you shrink your carbon footprint. Less compute = less energy = more planet-friendly cloud.
🧹 4. Operational Efficiency
Right-sized resources are easier to scale, monitor, and manage. You spend less time babysitting and more time building. Clean infra = clear mind.
When “just in case” becomes “just wasting cash.” Studies show most cloud setups are like ordering a triple espresso for a toddler:
- 📦 40–45% of VMs are one size tier too big
- 🧠 25–30% of databases run at 2–4x needed capacity
- 🧪 50–60% of non-prod environments are built like prod (why tho?)
That’s not “extra safe” — that’s extra spend. And when every oversized resource piles up, we’re talking billions in global cloud waste. TL;DR: Oversizing is easy. Cost-effective sizing takes intention — and a little Nerdy know-how.
📊 Key Metrics for Right-Sizing
Because you can’t shrink what you don’t measure. Right-sizing isn’t guesswork — it’s about watching the right dials and knowing when something’s way overbuilt for the job. Here’s what to track:
🧠 CPU Utilization
What it tells you: How much brainpower your resource is really using.
🧠 Memory Utilization
What it tells you: How much memory your resource actually needs vs hoards “just in case.”
💽 I/O Operations
What it tells you: How busy your storage and databases are with reads/writes and network traffic.
Because high efficiency means nothing if your app cries under pressure. When right-sizing, it’s not just about trimming the fat — it’s about keeping your app smooth and snappy too.
- 🧮 Utilization Metrics = What your resource is doing (CPU%, memory usage, IOPS — the raw fuel burn).
- 🧪 Performance Metrics = How your app is feeling (Response time, latency, throughput — aka: “Is it slow, or just chill?”).
🎯 Nerd Rule of Thumb: Always check both. High utilization with good performance? Nice. Low utilization but laggy UX? You might’ve right-sized too hard.
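The "check both" rule can be sketched in a few lines of Python. This is a minimal illustration, not a production policy: the `Reading` type, the 30% utilization cutoff, and the 500 ms latency target are all assumptions picked for the example.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    avg_cpu_pct: float      # utilization metric: what the resource is doing
    p95_latency_ms: float   # performance metric: how the app is feeling


def sizing_verdict(r: Reading, low_util: float = 30.0, latency_slo_ms: float = 500.0) -> str:
    """Combine one utilization metric and one performance metric into a verdict."""
    idle = r.avg_cpu_pct < low_util
    slow = r.p95_latency_ms > latency_slo_ms
    if idle and slow:
        # Low utilization but laggy: CPU is not the bottleneck, so look elsewhere
        return "investigate"
    if idle:
        return "downsize candidate"
    if slow:
        return "upsize or scale out"
    return "leave as is"


print(sizing_verdict(Reading(avg_cpu_pct=12, p95_latency_ms=120)))
```

Swap in whichever metrics matter for your workload (memory, IOPS, throughput); the point is that neither dial decides alone.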
Azure Right-sizing Implementation
Azure provides several tools and features to help you identify and implement right-sizing opportunities.
Virtual Machine Right-sizing
Azure VMs are often the largest component of cloud spend and present significant right-sizing opportunities.
Azure: VM Right-sizing Process
- Collect utilization data: Use Azure Monitor to collect CPU, memory, disk, and network metrics for at least 14 days.
- Analyze usage patterns: Look for consistent patterns of underutilization (average CPU < 30% and peak CPU < 50%).
- Identify right-sizing candidates: Use Azure Advisor or create custom queries to identify VMs that can be downsized.
- Plan the change: Determine the right VM size based on actual usage patterns and performance requirements.
- Implement and validate: Resize the VM during a maintenance window and monitor performance after the change.
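The "analyze usage patterns" step boils down to a small check. Here is a minimal sketch, assuming you have already exported CPU samples (in percent) from your monitoring tool; the function name and parameters are hypothetical, while the thresholds come from the process above.

```python
from statistics import mean


def is_downsize_candidate(cpu_samples, days_covered,
                          avg_limit=30.0, peak_limit=50.0, min_days=14):
    """Flag a VM only when both average and peak CPU stay under the limits."""
    if days_covered < min_days or not cpu_samples:
        return False  # not enough data to decide safely
    return mean(cpu_samples) < avg_limit and max(cpu_samples) < peak_limit


# Quiet VM: low average and low peak over two weeks
print(is_downsize_candidate([10, 20, 45], days_covered=14))
# Spiky VM: low average, but the peak rules it out
print(is_downsize_candidate([10, 20, 80], days_covered=14))
```

Requiring both conditions keeps a bursty workload with a low average from being flagged by mistake.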
Azure Advisor provides built-in recommendations for VM right-sizing. You can access these recommendations in the Azure Portal:
- Navigate to Azure Advisor in the Azure Portal
- Select the "Cost" tab
- Review the "Right-size or shutdown underutilized virtual machines" recommendations
Azure VM Right-sizing Query
Azure Resource Graph holds resource inventory, not metric data, so this kind of query has to run in a Log Analytics workspace instead. The version below assumes VM insights is enabled and sending data to the InsightsMetrics table:
// Find underutilized VMs based on CPU metrics (run in Log Analytics)
let timeRange = 14d;
let cpuThreshold = 30;
InsightsMetrics
| where TimeGenerated > ago(timeRange)
| where Namespace == "Processor" and Name == "UtilizationPercentage"
| summarize AvgCPU = avg(Val), PeakCPU = max(Val) by _ResourceId, Computer
| where AvgCPU < cpuThreshold
| project Computer, _ResourceId, AvgCPU, PeakCPU
| order by AvgCPU asc
Database Right-sizing
Azure database services like Azure SQL Database and Azure Cosmos DB offer flexible scaling options that can be optimized based on usage.
Azure: Database Right-sizing
Azure SQL Database
For Azure SQL Database, monitor these key metrics:
- DTU/vCore utilization (target: average < 40%)
- Storage utilization (target: used storage < 70% of provisioned)
- Query performance (ensure response times meet requirements)
Right-sizing options for Azure SQL Database:
- Scale down service tier (Premium → Standard → Basic)
- Reduce DTUs or vCores within the same tier
- Switch between provisioned and serverless models based on usage patterns
- Consider Azure SQL Elastic Pools for multiple databases with variable workloads
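Why pools help with variable workloads: databases that peak at different times can share capacity. A minimal sketch of that arithmetic, with made-up utilization numbers (the function name and the sample data are hypothetical):

```python
def pool_benefit(per_db_usage):
    """Compare capacity needed when sized separately vs. pooled.

    per_db_usage maps a database name to equally spaced usage samples
    (for example DTUs consumed per hour).
    """
    # Sized one by one, each database must cover its own peak
    sum_of_peaks = sum(max(series) for series in per_db_usage.values())
    # Pooled, capacity only has to cover the busiest combined moment
    combined = [sum(values) for values in zip(*per_db_usage.values())]
    return sum_of_peaks, max(combined)


usage = {
    "orders":  [80, 10, 10],
    "reports": [10, 80, 10],
    "archive": [10, 10, 80],
}
separate, pooled = pool_benefit(usage)
print(f"Sized separately: {separate}, pooled: {pooled}")
```

If the two numbers come out close, the databases peak together and a pool will not save much.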
Storage Right-sizing
Azure Storage offers multiple tiers that can be optimized based on access patterns and data lifecycle.
Azure: Storage Right-sizing
Azure Storage right-sizing strategies:
- Tier optimization: Move infrequently accessed data from Hot to Cool or Archive tiers
- Lifecycle management: Implement automatic tiering based on access patterns
- Redundancy level: Adjust redundancy (LRS, ZRS, GRS) based on data criticality
Example Azure Storage lifecycle policy to automatically tier data:
{
  "rules": [
    {
      "name": "MoveToCoolTier",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": [ "blockBlob" ]
        },
        "actions": {
          "baseBlob": {
            "tierToCool": { "daysAfterModificationGreaterThan": 30 }
          }
        }
      }
    },
    {
      "name": "MoveToArchiveTier",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": [ "blockBlob" ]
        },
        "actions": {
          "baseBlob": {
            "tierToArchive": { "daysAfterModificationGreaterThan": 90 }
          }
        }
      }
    }
  ]
}
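To sanity-check what a policy like this would do before applying it, the tiering decision can be mirrored in plain Python. This is a rough sketch of the rule logic only (the function name is hypothetical and it does not call any Azure API); the 30 and 90 day cutoffs match the policy above.

```python
from datetime import datetime, timedelta, timezone


def target_tier(last_modified, now=None, cool_after_days=30, archive_after_days=90):
    """Return the tier a blob would land in, based on days since last modification."""
    now = now or datetime.now(timezone.utc)
    age_days = (now - last_modified).days
    if age_days > archive_after_days:
        return "Archive"
    if age_days > cool_after_days:
        return "Cool"
    return "Hot"


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
for days in (5, 45, 200):
    print(days, target_tier(now - timedelta(days=days), now=now))
```

Running this over an inventory export gives a preview of how much data each rule would move.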
OCI Right-sizing Implementation
Oracle Cloud Infrastructure (OCI) offers flexible compute shapes and autoscaling capabilities that facilitate right-sizing.
Compute Instance Right-sizing
OCI's flexible compute shapes allow you to independently specify CPU and memory, enabling precise right-sizing.
OCI: Compute Right-sizing Process
- Collect utilization data: Use OCI Monitoring to collect CPU, memory, and network metrics for at least 14 days.
- Analyze usage patterns: Look for consistent patterns of underutilization (average CPU < 30% and peak CPU < 50%).
- Identify right-sizing candidates: Use OCI Cloud Advisor or create custom monitoring queries to identify instances that can be downsized.
- Plan the change: Determine the right compute shape based on actual usage patterns and performance requirements.
- Implement and validate: Resize the instance during a maintenance window and monitor performance after the change.
OCI's flexible shapes allow you to specify:
- Exact number of OCPUs (CPU cores)
- Precise memory allocation
- Network bandwidth requirements
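Because OCPUs and memory are set independently, each can be sized from its own peak. A minimal sketch, assuming you already have peak CPU and memory utilization percentages; the function name, the 25% headroom, and the "at least 1 GB per OCPU" floor are assumptions for illustration, so check the limits of the shape you actually use.

```python
import math


def recommend_flex_shape(current_ocpus, current_memory_gb,
                         peak_cpu_pct, peak_mem_pct, headroom=0.25):
    """Suggest OCPUs and memory that cover observed peaks plus headroom."""
    needed_ocpus = current_ocpus * peak_cpu_pct / 100 * (1 + headroom)
    needed_mem = current_memory_gb * peak_mem_pct / 100 * (1 + headroom)
    ocpus = max(1, math.ceil(needed_ocpus))
    # Assumption: keep at least 1 GB of memory per OCPU
    memory_gb = max(ocpus, math.ceil(needed_mem))
    # Never recommend more than what is provisioned today
    return min(ocpus, current_ocpus), min(memory_gb, current_memory_gb)


ocpus, memory_gb = recommend_flex_shape(8, 128, peak_cpu_pct=35, peak_mem_pct=33)
print(f"Suggested: {ocpus} OCPUs, {memory_gb} GB")
```

Sizing from peaks rather than averages is the conservative choice; lower the headroom only once you trust the data.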
OCI Compute Right-sizing Script
You can use this Python script to identify right-sizing candidates in OCI:
import datetime

import oci

# Initialize OCI clients from the default config file (~/.oci/config)
config = oci.config.from_file()
monitoring_client = oci.monitoring.MonitoringClient(config)
compute_client = oci.core.ComputeClient(config)

# Set parameters
compartment_id = "ocid1.compartment.oc1..example"
days_to_analyze = 14
cpu_threshold = 30  # Average CPU utilization threshold (%)

# Analysis window in UTC: now back to 14 days ago
end_time = datetime.datetime.now(datetime.timezone.utc)
start_time = end_time - datetime.timedelta(days=days_to_analyze)

# Get all instances in the compartment, following pagination
instances = oci.pagination.list_call_get_all_results(
    compute_client.list_instances, compartment_id
).data

# Check CPU utilization for each instance
right_size_candidates = []
for instance in instances:
    # Skip instances that are not in RUNNING state
    if instance.lifecycle_state != "RUNNING":
        continue

    # Daily mean CPU utilization. MQL caps the interval at 1d and expects
    # the dimension filter before the statistic.
    query = f'CpuUtilization[1d]{{resourceId = "{instance.id}"}}.mean()'
    metric_data = monitoring_client.summarize_metrics_data(
        compartment_id=compartment_id,
        summarize_metrics_data_details=oci.monitoring.models.SummarizeMetricsDataDetails(
            namespace="oci_computeagent",
            query=query,
            start_time=start_time,
            end_time=end_time,
            resolution="1d",
        ),
    ).data

    # Skip instances with no datapoints (for example, monitoring disabled)
    if not metric_data or not metric_data[0].aggregated_datapoints:
        continue

    # Average the daily means over the analysis window
    datapoints = metric_data[0].aggregated_datapoints
    avg_cpu = sum(point.value for point in datapoints) / len(datapoints)

    # Check if instance is a right-sizing candidate
    if avg_cpu < cpu_threshold:
        shape_details = None
        if instance.shape_config is not None:
            shape_details = (
                f"{instance.shape_config.ocpus} OCPUs, "
                f"{instance.shape_config.memory_in_gbs} GB memory"
            )
        right_size_candidates.append({
            "name": instance.display_name,
            "id": instance.id,
            "shape": instance.shape,
            "shape_details": shape_details,
            "avg_cpu": avg_cpu,
        })

# Print right-sizing candidates
print(f"Found {len(right_size_candidates)} right-sizing candidates:")
for candidate in right_size_candidates:
    print(f"Instance: {candidate['name']}")
    print(f"  Shape: {candidate['shape']}")
    if candidate["shape_details"]:
        print(f"  Details: {candidate['shape_details']}")
    print(f"  Avg CPU: {candidate['avg_cpu']:.2f}%")
    print("  Recommendation: Consider reducing OCPUs or changing to a smaller shape")
    print("")
Database Right-sizing
OCI offers several database services, including Autonomous Database, that can be right-sized based on workload requirements.
OCI: Database Right-sizing
Autonomous Database
For OCI Autonomous Database, monitor these key metrics:
- CPU utilization (target: average < 40%)
- Storage utilization (target: used storage < 70% of provisioned)
- Query performance (ensure response times meet requirements)
Right-sizing options for OCI Autonomous Database:
- Scale down CPU cores
- Adjust storage allocation
- Switch between dedicated and shared infrastructure
- Enable auto-scaling for variable workloads
Storage Right-sizing
OCI Storage services offer multiple tiers that can be optimized based on access patterns and data lifecycle.
OCI: Storage Right-sizing
OCI Storage right-sizing strategies:
- Tier optimization: Move infrequently accessed data to the Infrequent Access or Archive Storage tier
- Object lifecycle management: Implement automatic archiving based on object age
- Block volume performance: Adjust volume performance units (VPUs) based on IOPS and throughput requirements
Example OCI Object Storage lifecycle policy to automatically archive objects:
{
  "items": [
    {
      "name": "ArchiveOldObjects",
      "action": "ARCHIVE",
      "timeAmount": 90,
      "timeUnit": "DAYS",
      "isEnabled": true
    }
  ]
}
Right-sizing Automation
Automating the right-sizing process helps maintain optimal resource allocation as workloads change over time.
Azure Automation Options
- Azure Advisor for automated recommendations
- Azure Automation runbooks for scheduled right-sizing
- Azure Functions for event-driven right-sizing
- Azure Monitor Autoscale for dynamic scaling
OCI Automation Options
- OCI Cloud Advisor for automated recommendations
- OCI Functions for scheduled right-sizing
- OCI Events for event-driven right-sizing
- OCI Instance Pools with autoscaling for dynamic scaling
Azure: Automated Right-sizing Script
Here's a PowerShell script to automatically right-size Azure VMs based on utilization:
# Connect to Azure
Connect-AzAccount

# Parameters
$cpuThreshold = 30
$daysToAnalyze = 14
$resourceGroupName = "your-resource-group"

# Get all VMs in the resource group
$vms = Get-AzVM -ResourceGroupName $resourceGroupName

foreach ($vm in $vms) {
    # Sizes this VM can be resized to, smallest first
    $availableSizes = Get-AzVMSize -VMName $vm.Name -ResourceGroupName $resourceGroupName |
        Sort-Object NumberOfCores, MemoryInMB

    # Get VM size information
    $vmSize = $vm.HardwareProfile.VmSize
    $vmSizeInfo = $availableSizes | Where-Object { $_.Name -eq $vmSize }

    # Get hourly CPU metrics for the VM
    $endTime = Get-Date
    $startTime = $endTime.AddDays(-$daysToAnalyze)
    $timeGrain = "01:00:00"
    $metrics = Get-AzMetric -ResourceId $vm.Id -MetricName "Percentage CPU" -StartTime $startTime -EndTime $endTime -TimeGrain $timeGrain

    # Calculate average CPU utilization
    $datapoints = $metrics.Data
    $avgCpu = ($datapoints | Measure-Object -Property Average -Average).Average

    Write-Output "VM: $($vm.Name)"
    Write-Output "Current Size: $vmSize (Cores: $($vmSizeInfo.NumberOfCores), Memory: $($vmSizeInfo.MemoryInMB / 1024) GB)"
    Write-Output "Average CPU: $($avgCpu)%"

    # Check if VM is a right-sizing candidate (skip VMs with no metric data)
    if ($null -ne $avgCpu -and $avgCpu -lt $cpuThreshold) {
        # Find the smallest size with fewer cores that keeps projected CPU
        # utilization at or below 50%. This looks at CPU only, so check
        # memory needs separately before resizing.
        $recommendedSize = $null
        foreach ($size in $availableSizes) {
            if ($size.NumberOfCores -lt $vmSizeInfo.NumberOfCores -and
                $size.NumberOfCores -ge [math]::Ceiling($vmSizeInfo.NumberOfCores * $avgCpu / 100 * 2)) {
                $recommendedSize = $size
                break
            }
        }

        if ($recommendedSize) {
            Write-Output "Recommended Size: $($recommendedSize.Name) (Cores: $($recommendedSize.NumberOfCores), Memory: $($recommendedSize.MemoryInMB / 1024) GB)"

            # Uncomment to actually resize the VM
            # $vm.HardwareProfile.VmSize = $recommendedSize.Name
            # Update-AzVM -VM $vm -ResourceGroupName $resourceGroupName
            # Write-Output "VM has been resized"
        } else {
            Write-Output "No smaller size available that meets requirements"
        }
    } else {
        Write-Output "VM is appropriately sized"
    }

    Write-Output "------------------------"
}
OCI: Automated Right-sizing Script
Here's a Python script to automatically right-size OCI Autonomous Databases based on utilization:
import datetime
import math

import oci

# Initialize OCI clients from the default config file (~/.oci/config)
config = oci.config.from_file()
monitoring_client = oci.monitoring.MonitoringClient(config)
database_client = oci.database.DatabaseClient(config)

# Set parameters
compartment_id = "ocid1.compartment.oc1..example"
days_to_analyze = 14
cpu_threshold = 30  # Average CPU utilization threshold (%)

# Analysis window in UTC: now back to 14 days ago
end_time = datetime.datetime.now(datetime.timezone.utc)
start_time = end_time - datetime.timedelta(days=days_to_analyze)

# Get all Autonomous Databases in the compartment, following pagination
autonomous_dbs = oci.pagination.list_call_get_all_results(
    database_client.list_autonomous_databases, compartment_id
).data

# Check CPU utilization for each database
right_size_candidates = []
for db in autonomous_dbs:
    # Skip databases that are not AVAILABLE
    if db.lifecycle_state != "AVAILABLE":
        continue
    # cpu_core_count may be unset on databases that use a different
    # compute model; skip those rather than guess
    if not db.cpu_core_count:
        continue

    # Daily mean CPU utilization. MQL caps the interval at 1d and expects
    # the dimension filter before the statistic.
    query = f'CpuUtilization[1d]{{resourceId = "{db.id}"}}.mean()'
    metric_data = monitoring_client.summarize_metrics_data(
        compartment_id=compartment_id,
        summarize_metrics_data_details=oci.monitoring.models.SummarizeMetricsDataDetails(
            namespace="oci_autonomous_database",
            query=query,
            start_time=start_time,
            end_time=end_time,
            resolution="1d",
        ),
    ).data

    # Skip databases with no datapoints
    if not metric_data or not metric_data[0].aggregated_datapoints:
        continue

    # Average the daily means over the analysis window
    datapoints = metric_data[0].aggregated_datapoints
    avg_cpu = sum(point.value for point in datapoints) / len(datapoints)

    # Check if database is a right-sizing candidate
    if avg_cpu < cpu_threshold:
        right_size_candidates.append({
            "name": db.display_name,
            "id": db.id,
            "cpu_core_count": db.cpu_core_count,
            "data_storage_size_in_tbs": db.data_storage_size_in_tbs,
            "avg_cpu": avg_cpu,
            # Aim for about 50% utilization; round up so we never undershoot
            "recommended_cpu": max(1, math.ceil(db.cpu_core_count * avg_cpu / 50)),
        })

# Print right-sizing candidates and optionally resize
print(f"Found {len(right_size_candidates)} right-sizing candidates:")
for candidate in right_size_candidates:
    print(f"Database: {candidate['name']}")
    print(f"  Current CPU Cores: {candidate['cpu_core_count']}")
    print(f"  Avg CPU: {candidate['avg_cpu']:.2f}%")
    print(f"  Recommended CPU Cores: {candidate['recommended_cpu']}")

    # Uncomment to actually resize the database
    # if candidate['recommended_cpu'] < candidate['cpu_core_count']:
    #     print("  Resizing database...")
    #     database_client.update_autonomous_database(
    #         autonomous_database_id=candidate['id'],
    #         update_autonomous_database_details=oci.database.models.UpdateAutonomousDatabaseDetails(
    #             cpu_core_count=candidate['recommended_cpu']
    #         )
    #     )
    #     print("  Database has been resized")
    print("")
When implementing right-sizing automation, consider these factors:
- Always test automation scripts in non-production environments first
- Implement safety thresholds to prevent excessive downsizing
- Include approval workflows for production environment changes
- Monitor performance closely after automated right-sizing
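A safety threshold and an approval gate can be as simple as a guard function that every automated recommendation passes through. A minimal sketch, assuming a 50% maximum reduction per change and an environment label of "prod" for production (both are illustrative choices, and the function name is hypothetical):

```python
import math


def guarded_target(current_cores, recommended_cores, environment,
                   max_reduction=0.5, min_cores=1):
    """Clamp a recommendation and flag whether a human must approve it."""
    # Never drop below this floor in a single change
    floor = max(min_cores, math.ceil(current_cores * (1 - max_reduction)))
    target = max(recommended_cores, floor)
    # This guard only handles downsizing, so never exceed the current size
    target = min(target, current_cores)
    needs_approval = environment == "prod" and target < current_cores
    return target, needs_approval


target, needs_approval = guarded_target(16, 2, environment="prod")
print(f"Resize to {target} cores, approval required: {needs_approval}")
```

Stepping down in stages like this means one bad recommendation costs you one conservative resize, not an outage.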
Best Practices for Right-Sizing
🧪 Planning & Analysis
- 📊 Collect at least 14 days of performance data
- 🗓️ Account for seasonal workload spikes (month-end, holidays, Black Friday, etc.)
- 🔍 Analyze both average and peak utilization — not just one or the other
- 📈 Leave 20–30% buffer for “uh-oh” moments
Talk Nerdy Rule: Plan like a pessimist, optimize like a boss.
🛠️ Implementation
- 🚦Begin with non-prod environments — lower risk, faster feedback
- 🕒 Time changes for maintenance windows
- 📝 Document everything — what changed, why, and who approved it
- 🧯 Have a rollback plan (because stuff happens)
📡 Monitoring & Validation
- 👀 Watch performance for 48–72 hours
- 🚨 Set up alerts for any hiccups (latency, CPU spikes, etc.)
- 💰 Validate that the juice was worth the squeeze — check those savings
- 📚 Log lessons learned so future-you (and your team) get smarter
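Validating savings is plain arithmetic, and worth scripting so every resize gets the same check. A minimal sketch; the hourly prices are made up, and 730 hours per month is a common approximation rather than an exact figure.

```python
HOURS_PER_MONTH = 730  # approximation: 8,760 hours per year / 12


def monthly_savings(old_hourly, new_hourly, instance_count=1):
    """Estimated monthly savings for always-on instances."""
    return (old_hourly - new_hourly) * HOURS_PER_MONTH * instance_count


def savings_pct(old_hourly, new_hourly):
    """Savings as a percentage of the old price."""
    return (old_hourly - new_hourly) / old_hourly * 100


# Hypothetical prices: three VMs moved from $0.40/hour to $0.20/hour
print(f"${monthly_savings(0.40, 0.20, instance_count=3):.2f} per month")
print(f"{savings_pct(0.40, 0.20):.0f}% cheaper")
```

Compare the estimate against the next invoice; reserved capacity, licensing, or storage charges can shrink the real number.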
🏛️ Governance & Culture
- 📆 Run monthly or quarterly review cycles
- 🧾 Create policies that guide right-sizing at provisioning time
- 📣 Educate teams — show why sizing right beats sizing safe
- 🎉 Celebrate every successful right-size like a win on the cloud scoreboard
📌 Talk Nerdy Tip: Right-sizing isn’t just cost control — it’s cloud craftsmanship. Keep it lean, keep it clean, and make it a team sport. 🧢💡
You wouldn’t go to the gym once and call it a transformation, right? Right-sizing isn’t a checkbox — it’s a habit. Cloud environments evolve, and so should your resource sizing. Here’s how to keep it tight (and cost-light):
- 📆 Review Regularly: Monthly or quarterly cycles are your friend. Mark it on the calendar like a team ritual: “Budget Brunch & Resize Review” anyone?
- 🤖 Automate What You Can: Let scripts do the snooping. Use monitoring tools to auto-flag idle or oversized resources. Azure Monitor, OCI custom metrics — they’re your cloud sous-chefs.
- 📣 Adjust Based on Feedback: Look at your results. Talk to teams. Fine-tune your strategy. Optimization without communication = chaos in a trench coat.
- 🆕 Stay Cloud-Curious: New SKUs and pricing models pop up faster than you can say “VM.Standard.E4.Flex.” Make checking for newer, leaner instance types part of your routine.
📌 Talk Nerdy Tip: Right-sizing isn’t about cutting corners — it’s about fitting the cloud to your needs right now. Then doing it again next quarter. And the next.
🛠️ Common Right-Sizing Challenges
🚧 Performance Panic
The fear: “What if we shrink it and the app tanks?”
The fix: Start small. Begin with non-critical workloads and monitor closely. Once you’ve got proof that performance holds steady, scale the wins. 🧪 Pro tip: Keep a “Right-Sizing Hall of Fame” doc with before/after metrics.
📉 Missing Metrics, Missing Context
The problem: Your monitoring stack doesn’t track everything — looking at you, memory usage.
The fix: Install proper agents. Azure? Use Azure Monitor Agent. OCI? Tap into custom metrics via OCI Monitoring. No data = no decisions.
🔄 Workload Whiplash (aka Variability)
The struggle: Some apps are chill on Monday and on fire by Friday.
The fix: Don’t hard-size — auto-scale. Use Azure VM Scale Sets or OCI Instance Pools. Set thresholds to scale based on real demand. That way, your infra breathes with your load.
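Under the hood, threshold-based autoscaling is a small decision rule. Here is a rough sketch of that rule in Python; it is not the Azure or OCI autoscale API, and the function name, thresholds, and instance limits are assumptions for illustration.

```python
def desired_instance_count(current, avg_cpu_pct,
                           scale_out_above=70.0, scale_in_below=30.0,
                           minimum=2, maximum=10):
    """Step the instance count up or down by one based on average CPU."""
    if avg_cpu_pct > scale_out_above:
        target = current + 1   # busy: add an instance
    elif avg_cpu_pct < scale_in_below:
        target = current - 1   # quiet: remove an instance
    else:
        target = current       # inside the comfort band: do nothing
    # Always stay within the configured pool limits
    return max(minimum, min(maximum, target))


for cpu in (85, 50, 10):
    print(cpu, desired_instance_count(current=3, avg_cpu_pct=cpu))
```

The gap between the two thresholds matters: set them too close together and the pool flaps between sizes.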
🕒 Downtime Drama
The blocker: “We can’t shut this down just to resize!”
The fix: Use maintenance windows or pull off a blue-green right-sizing. Spin up right-sized clones. Test them. Flip the switch. Zero downtime, max optimization.