🧩 Right-Sizing Resources for Azure & OCI

Shrink your bill, not your performance. If your cloud costs feel oversized but your apps aren’t, you’re probably due for a little right-sizing TLC. 🍳 This guide serves up Azure & OCI-specific tips, smart defaults, and habits to right-size once—and keep it that way. Let’s turn “more RAM just in case” into “just enough for the job.” 🛠️

⚙️ What is Right-Sizing? (No, it’s not a gym thing)

Right-sizing means matching your compute and storage to what your workload actually needs — not what sounded good at provisioning time. It’s the cloud equivalent of swapping your unused 8-seater SUV for a sleek hybrid: same destination, way less fuel burned.

🧮 The payoff? Most teams save 20–40% just by adjusting instance types, sizes, and volumes. No refactor required. Just smarter choices.

🎯 Why It Matters (Especially for SMBs)

💸 1. Cost Optimization

Overprovisioned = Overpaid. Right-sizing is the easiest way to stop funding idle VMs and give your budget a breather. Estimated savings: 20–40% without even touching your architecture.

🚀 2. Performance Boost

Right-size ≠ underpowered. It means your workloads get exactly what they need — no more, no less. Think fewer hiccups, faster response times, and happier users.

🌍 3. Environmental Impact

Yes, the cloud is virtual — but those servers are still drawing power somewhere. Shrink the waste, and you shrink your carbon footprint. Less compute = less energy = more planet-friendly cloud.

🧹 4. Operational Efficiency

Right-sized resources are easier to scale, monitor, and manage. You spend less time babysitting and more time building. Clean infra = clear mind.

💰 The Cost of Oversizing

When “just in case” becomes “just wasting cash.” Studies show most cloud setups are like ordering a triple espresso for a toddler:

  • 📦 40–45% of VMs are one size tier too big
  • 🧠 25–30% of databases run at 2–4x needed capacity
  • 🧪 50–60% of non-prod environments are built like prod (why tho?)

That’s not “extra safe” — that’s extra spend. And when every oversized resource piles up, we’re talking billions in global cloud waste. TL;DR: Oversizing is easy. Cost-effective sizing takes intention — and a little Nerdy know-how.

📊 Key Metrics for Right-Sizing

Because you can’t shrink what you don’t measure. Right-sizing isn’t guesswork — it’s about watching the right dials and knowing when something’s way overbuilt for the job. Here’s what to track:

🧠 CPU Utilization

What it tells you: How much brainpower your resource is really using.

Optimization threshold: 🔍 Nerdy Threshold: If average CPU < 30% and max < 50% over 2 weeks, it’s time to size down that overthinking VM.

🧠 Memory Utilization

What it tells you: How much memory your resource actually needs vs hoards “just in case.”

Optimization threshold: 🔍 Nerdy Threshold: If average RAM usage < 40% over 2 weeks, you’re probably paying for digital hoarding.

💽 I/O Operations

What it tells you: How busy your storage and databases are with reads/writes and network traffic.

Optimization threshold: 🔍 Nerdy Threshold: If peak disk or network usage is under 50% of what you’ve provisioned, you’ve got room to downsize.

⚖️ Utilization vs. Performance

Because high efficiency means nothing if your app cries under pressure. When right-sizing, it’s not just about trimming the fat — it’s about keeping your app smooth and snappy too.

  • 🧮 Utilization Metrics = What your resource is doing (CPU%, memory usage, IOPS — the raw fuel burn).
  • 🧪 Performance Metrics = How your app is feeling (Response time, latency, throughput — aka: “Is it slow, or just chill?”).

🎯 Nerd Rule of Thumb: Always check both. High utilization with good performance? Nice. Low utilization but laggy UX? You might’ve right-sized too hard.
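
To make the "check both" rule concrete, here is a minimal Python sketch that combines the utilization thresholds from this section with a performance check. The function name `sizing_verdict`, the latency inputs, and the idea of comparing a p95 latency against a target are illustrative assumptions, not part of any Azure or OCI API.

```python
def sizing_verdict(avg_cpu, peak_cpu, avg_mem, p95_latency_ms, latency_target_ms):
    """Classify a resource using utilization and performance together.

    Utilization thresholds: avg CPU < 30%, peak CPU < 50%, avg RAM < 40%.
    CPU must pass both the average and the peak check.
    The latency comparison is an assumed stand-in for "how the app is feeling".
    """
    underutilized = avg_cpu < 30 and peak_cpu < 50 and avg_mem < 40
    healthy = p95_latency_ms <= latency_target_ms

    if underutilized and healthy:
        return "downsize candidate"
    if underutilized and not healthy:
        # Idle hardware but slow responses: the bottleneck is somewhere else.
        return "investigate: slow despite idle resources"
    if healthy:
        return "leave as is"
    return "consider scaling up"
```

For example, `sizing_verdict(18, 42, 35, 120, 200)` returns `"downsize candidate"`, while the same utilization with a p95 latency of 450 ms returns the "investigate" verdict instead.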

Azure Right-sizing Implementation

Azure provides several tools and features to help you identify and implement right-sizing opportunities.

Virtual Machine Right-sizing

Azure VMs are often the largest component of cloud spend and present significant right-sizing opportunities.

Azure: VM Right-sizing Process

  1. Collect utilization data: Use Azure Monitor to collect CPU, memory, disk, and network metrics for at least 14 days.
  2. Analyze usage patterns: Look for consistent patterns of underutilization (average CPU < 30% and peak CPU < 50%).
  3. Identify right-sizing candidates: Use Azure Advisor or create custom queries to identify VMs that can be downsized.
  4. Plan the change: Determine the right VM size based on actual usage patterns and performance requirements.
  5. Implement and validate: Resize the VM during a maintenance window and monitor performance after the change.

Azure Advisor provides built-in recommendations for VM right-sizing. You can access these recommendations in the Azure Portal:

  1. Navigate to Azure Advisor in the Azure Portal
  2. Select the "Cost" tab
  3. Review the "Right-size or shutdown underutilized virtual machines" recommendations

Azure VM Right-sizing Query

Azure Resource Graph describes resources but does not store metric values, so a utilization query has to run against a Log Analytics workspace instead. Assuming VM insights is enabled on the VMs (which populates the InsightsMetrics table), this query lists right-sizing candidates:

Kusto Query
// Find underutilized VMs based on CPU metrics.
// Run in a Log Analytics workspace; assumes VM insights is collecting data.
let timeRange = 14d;
let cpuThreshold = 30;
InsightsMetrics
| where TimeGenerated > ago(timeRange)
| where Namespace == "Processor" and Name == "UtilizationPercentage"
| summarize AvgCPU = avg(Val), MaxCPU = max(Val) by Computer, _ResourceId
| where AvgCPU < cpuThreshold
| project Computer, _ResourceId, AvgCPU, MaxCPU
| order by AvgCPU asc

Database Right-sizing

Azure database services like Azure SQL Database and Azure Cosmos DB offer flexible scaling options that can be optimized based on usage.

Azure: Database Right-sizing

Azure SQL Database

For Azure SQL Database, monitor these key metrics:

  • DTU/vCore utilization (downsizing signal: average < 40%)
  • Storage utilization (downsizing signal: used storage < 70% of provisioned)
  • Query performance (ensure response times meet requirements)

Right-sizing options for Azure SQL Database:

  • Scale down service tier (Premium → Standard → Basic)
  • Reduce DTUs or vCores within the same tier
  • Switch between provisioned and serverless models based on usage patterns
  • Consider Azure SQL Elastic Pools for multiple databases with variable workloads
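
One way to reason about the provisioned-versus-serverless choice is to look at how often the database is actually busy. The sketch below is a rough heuristic in Python; the function name, the 5% "idle" cutoff, and the 50% idle-share threshold are assumptions chosen for illustration, so tune them against your own billing data.

```python
def suggest_compute_model(hourly_cpu_percent, idle_cutoff=5.0,
                          idle_share_for_serverless=0.5):
    """Suggest 'serverless' or 'provisioned' from hourly average CPU samples.

    hourly_cpu_percent: list of hourly average CPU percentages.
    idle_cutoff: hours at or below this CPU % count as idle (assumed value).
    idle_share_for_serverless: share of idle hours above which an
        intermittent-usage model looks attractive (assumed value).
    """
    if not hourly_cpu_percent:
        raise ValueError("need at least one sample")
    idle_hours = sum(1 for cpu in hourly_cpu_percent if cpu <= idle_cutoff)
    idle_share = idle_hours / len(hourly_cpu_percent)
    model = "serverless" if idle_share > idle_share_for_serverless else "provisioned"
    return model, round(idle_share, 2)
```

A database that is busy for 8 hours and near-idle for the other 16 comes out as `("serverless", 0.67)`, while one that sits at a steady 35% comes out as `("provisioned", 0.0)`.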

Storage Right-sizing

Azure Storage offers multiple tiers that can be optimized based on access patterns and data lifecycle.

Azure: Storage Right-sizing

Azure Storage right-sizing strategies:

  • Tier optimization: Move infrequently accessed data from Hot to Cool or Archive tiers
  • Lifecycle management: Implement automatic tiering based on access patterns
  • Redundancy level: Adjust redundancy (LRS, ZRS, GRS) based on data criticality

Example Azure Storage lifecycle policy to automatically tier data:

JSON
{
  "rules": [
    {
      "name": "MoveToCoolTier",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": [ "blockBlob" ]
        },
        "actions": {
          "baseBlob": {
            "tierToCool": { "daysAfterModificationGreaterThan": 30 }
          }
        }
      }
    },
    {
      "name": "MoveToArchiveTier",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": [ "blockBlob" ]
        },
        "actions": {
          "baseBlob": {
            "tierToArchive": { "daysAfterModificationGreaterThan": 90 }
          }
        }
      }
    }
  ]
}
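
To sanity-check what a policy like this would do before applying it, the same two rules can be mirrored in a few lines of Python. This is only a local sketch of the thresholds above (30 and 90 days since last modification); the function name `expected_tier` is hypothetical, it assumes the blob starts in the Hot tier, and the code does not call Azure.

```python
from datetime import datetime, timedelta, timezone


def expected_tier(last_modified, now=None, cool_after_days=30, archive_after_days=90):
    """Return the tier a block blob would end up in under the example policy."""
    now = now or datetime.now(timezone.utc)
    age = now - last_modified
    # Check the longer threshold first so old blobs land in Archive, not Cool.
    if age > timedelta(days=archive_after_days):
        return "Archive"
    if age > timedelta(days=cool_after_days):
        return "Cool"
    return "Hot"
```

Running it over a listing of blob modification times gives a quick estimate of how much data each rule would move.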

OCI Right-sizing Implementation

Oracle Cloud Infrastructure (OCI) offers flexible compute shapes and autoscaling capabilities that facilitate right-sizing.

Compute Instance Right-sizing

OCI's flexible compute shapes allow you to independently specify CPU and memory, enabling precise right-sizing.

OCI: Compute Right-sizing Process

  1. Collect utilization data: Use OCI Monitoring to collect CPU, memory, and network metrics for at least 14 days.
  2. Analyze usage patterns: Look for consistent patterns of underutilization (average CPU < 30% and peak CPU < 50%).
  3. Identify right-sizing candidates: Use OCI Cloud Advisor or create custom monitoring queries to identify instances that can be downsized.
  4. Plan the change: Determine the right compute shape based on actual usage patterns and performance requirements.
  5. Implement and validate: Resize the instance during a maintenance window and monitor performance after the change.

OCI's flexible shapes allow you to specify:

  • Exact number of OCPUs (CPU cores)
  • Precise memory allocation
  • Network bandwidth indirectly (it scales with the OCPU count rather than being set on its own)
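
Because OCPUs and memory are set independently on a flexible shape, each can be sized from its own observed peak. The following Python sketch shows one way to do that arithmetic; the function name, the 25% default headroom, and the floor of 1 GB of memory per OCPU are assumptions for illustration, so check the limits of the specific shape you use.

```python
import math


def recommend_flex_shape(ocpus, memory_gb, peak_cpu_pct, peak_mem_pct, headroom=0.25):
    """Size OCPUs and memory separately from observed peaks plus headroom."""
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    # Capacity actually used at peak, in absolute units.
    peak_ocpus_used = ocpus * peak_cpu_pct / 100
    peak_mem_used = memory_gb * peak_mem_pct / 100
    # Divide by (1 - headroom) so the peak fills (1 - headroom) of the new size.
    new_ocpus = max(1, math.ceil(peak_ocpus_used / (1 - headroom)))
    new_memory = math.ceil(peak_mem_used / (1 - headroom))
    # Assumed floor: at least 1 GB of memory per OCPU.
    new_memory = max(new_memory, new_ocpus)
    # This helper only looks for reductions, so never exceed the current size.
    return min(new_ocpus, ocpus), min(new_memory, memory_gb)
```

An 8 OCPU / 128 GB instance peaking at 40% CPU and 30% memory would come out at 5 OCPUs and 52 GB with the default headroom.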

OCI Compute Right-sizing Script

You can use this Python script to identify right-sizing candidates in OCI:

Python
import oci
import datetime
from datetime import timedelta

# Initialize OCI clients (reads credentials from the default ~/.oci/config profile)
config = oci.config.from_file()
monitoring_client = oci.monitoring.MonitoringClient(config)
compute_client = oci.core.ComputeClient(config)

# Set parameters
compartment_id = "ocid1.compartment.oc1..example"
days_to_analyze = 14
cpu_threshold = 30  # Average CPU utilization threshold

# Get current time in UTC and start time (14 days ago)
end_time = datetime.datetime.now(datetime.timezone.utc)
start_time = end_time - timedelta(days=days_to_analyze)

# Get all instances in the compartment (paginated so large compartments are fully listed)
instances = oci.pagination.list_call_get_all_results(
    compute_client.list_instances, compartment_id
).data

# Check CPU utilization for each instance
right_size_candidates = []

for instance in instances:
    # Skip instances that are not in RUNNING state
    if instance.lifecycle_state != "RUNNING":
        continue
        
    # Get CPU utilization metrics
    metric_data = monitoring_client.summarize_metrics_data(
        compartment_id=compartment_id,
        summarize_metrics_data_details=oci.monitoring.models.SummarizeMetricsDataDetails(
            namespace="oci_computeagent",
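            # MQL intervals top out at 1d; the 14-day window comes from start_time/end_time.
            # The dimension filter goes before the statistic.
            query=f'CpuUtilization[1d]{{resourceId = "{instance.id}"}}.mean()',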
            start_time=start_time,
            end_time=end_time,
            resolution="1d"
        )
    ).data
    
    # Calculate average CPU utilization
    if metric_data and len(metric_data) > 0 and len(metric_data[0].aggregated_datapoints) > 0:
        datapoints = metric_data[0].aggregated_datapoints
        avg_cpu = sum(point.value for point in datapoints) / len(datapoints)
        
        # Check if instance is a right-sizing candidate
        if avg_cpu < cpu_threshold:
            shape_details = None
            if hasattr(instance.shape_config, "ocpus"):
                shape_details = f"{instance.shape_config.ocpus} OCPUs, {instance.shape_config.memory_in_gbs} GB memory"
            
            right_size_candidates.append({
                "name": instance.display_name,
                "id": instance.id,
                "shape": instance.shape,
                "shape_details": shape_details,
                "avg_cpu": avg_cpu
            })

# Print right-sizing candidates
print(f"Found {len(right_size_candidates)} right-sizing candidates:")
for candidate in right_size_candidates:
    print(f"Instance: {candidate['name']}")
    print(f"  Shape: {candidate['shape']}")
    if candidate['shape_details']:
        print(f"  Details: {candidate['shape_details']}")
    print(f"  Avg CPU: {candidate['avg_cpu']:.2f}%")
    print(f"  Recommendation: Consider reducing OCPUs or changing to a smaller shape")
    print("")
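
The script above filters on average CPU only, while the process steps also call for a peak check. A small helper for building the Monitoring Query Language (MQL) strings keeps the two queries consistent. This is a sketch: `build_mql` is a hypothetical helper, and it assumes the `metric[interval]{dimension = "value"}.statistic()` form of MQL with `resourceId` as the dimension.

```python
def build_mql(metric, interval, resource_id, statistic):
    """Build an MQL query string for a single resource.

    Resulting shape: CpuUtilization[1h]{resourceId = "ocid1..."}.max()
    """
    allowed = {"mean", "max", "min", "sum", "count"}
    if statistic not in allowed:
        raise ValueError(f"unsupported statistic: {statistic}")
    return f'{metric}[{interval}]{{resourceId = "{resource_id}"}}.{statistic}()'


instance_id = "ocid1.instance.oc1..example"  # placeholder OCID
avg_query = build_mql("CpuUtilization", "1h", instance_id, "mean")
peak_query = build_mql("CpuUtilization", "1h", instance_id, "max")
```

Either string can be passed as the `query` argument of `SummarizeMetricsDataDetails`; an instance would then count as a candidate only if it passes both the average and the peak threshold.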

Database Right-sizing

OCI offers several database services, including Autonomous Database, that can be right-sized based on workload requirements.

OCI: Database Right-sizing

Autonomous Database

For OCI Autonomous Database, monitor these key metrics:

  • CPU utilization (downsizing signal: average < 40%)
  • Storage utilization (downsizing signal: used storage < 70% of provisioned)
  • Query performance (ensure response times meet requirements)

Right-sizing options for OCI Autonomous Database:

  • Scale down CPU cores
  • Adjust storage allocation
  • Switch between dedicated and shared infrastructure
  • Enable auto-scaling for variable workloads

Storage Right-sizing

OCI Storage services offer multiple tiers that can be optimized based on access patterns and data lifecycle.

OCI: Storage Right-sizing

OCI Storage right-sizing strategies:

  • Tier optimization: Move infrequently accessed data to Archive Storage
  • Object lifecycle management: Implement automatic archiving based on object age
  • Block volume performance: Adjust volume performance units (VPUs) based on IOPS and throughput requirements

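Example OCI Object Storage lifecycle rule to automatically archive objects (one entry in the policy's items list, written with the camelCase keys the REST API and CLI expect):

JSON
{
  "name": "ArchiveOldObjects",
  "action": "ARCHIVE",
  "timeAmount": 90,
  "timeUnit": "DAYS",
  "isEnabled": true
}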

Right-sizing Automation

Automating the right-sizing process helps maintain optimal resource allocation as workloads change over time.

Azure Automation Options

  • Azure Advisor for automated recommendations
  • Azure Automation runbooks for scheduled right-sizing
  • Azure Functions for event-driven right-sizing
  • Azure Monitor Autoscale for dynamic scaling

OCI Automation Options

  • OCI Cloud Advisor for automated recommendations
  • OCI Functions for scheduled right-sizing
  • OCI Events for event-driven right-sizing
  • OCI Instance Pools with autoscaling for dynamic scaling

Azure: Automated Right-sizing Script

Here's a PowerShell script that reports right-sizing candidates among Azure VMs based on utilization. The resize step is commented out so you can review the recommendations first:

PowerShell
# Connect to Azure
Connect-AzAccount

# Parameters
$cpuThreshold = 30
$daysToAnalyze = 14
$resourceGroupName = "your-resource-group"

# Get all VMs in the resource group
$vms = Get-AzVM -ResourceGroupName $resourceGroupName

foreach ($vm in $vms) {
    # Get VM size information
    $vmSize = $vm.HardwareProfile.VmSize
    $vmSizeInfo = Get-AzVMSize -VMName $vm.Name -ResourceGroupName $resourceGroupName | Where-Object { $_.Name -eq $vmSize }
    
    # Get CPU metrics for the VM
    $endTime = Get-Date
    $startTime = $endTime.AddDays(-$daysToAnalyze)
    $timeGrain = "01:00:00"
    
    $metrics = Get-AzMetric -ResourceId $vm.Id -MetricName "Percentage CPU" -StartTime $startTime -EndTime $endTime -TimeGrain $timeGrain
    
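    # Calculate average CPU utilization
    $datapoints = $metrics.Data
    $avgCpu = ($datapoints | Measure-Object -Property Average -Average).Average

    # Skip VMs with no metric data; $null would otherwise compare as below the threshold
    if ($null -eq $avgCpu) {
        Write-Output "VM: $($vm.Name) has no CPU data for the period, skipping"
        Write-Output "------------------------"
        continue
    }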
    
    Write-Output "VM: $($vm.Name)"
    Write-Output "Current Size: $vmSize (Cores: $($vmSizeInfo.NumberOfCores), Memory: $($vmSizeInfo.MemoryInMB / 1024) GB)"
    Write-Output "Average CPU: $($avgCpu)%"
    
    # Check if VM is a right-sizing candidate
    if ($avgCpu -lt $cpuThreshold) {
        # Get the sizes this VM can be resized to, smallest first
        $availableSizes = Get-AzVMSize -VMName $vm.Name -ResourceGroupName $resourceGroupName |
            Sort-Object NumberOfCores, MemoryInMB
        
        # Find a smaller size that meets requirements
        $recommendedSize = $null
        foreach ($size in $availableSizes) {
            # Pick the first size with fewer cores that would still run at or below ~50% CPU
            # (cores in use = current cores * avgCpu / 100, doubled for headroom).
            # Note: this check looks at CPU only and ignores memory.
            if ($size.NumberOfCores -lt $vmSizeInfo.NumberOfCores -and 
                $size.NumberOfCores -ge [math]::Ceiling($vmSizeInfo.NumberOfCores * $avgCpu / 100 * 2)) {
                $recommendedSize = $size
                break
            }
        }
        
        if ($recommendedSize) {
            Write-Output "Recommended Size: $($recommendedSize.Name) (Cores: $($recommendedSize.NumberOfCores), Memory: $($recommendedSize.MemoryInMB / 1024) GB)"
            
            # Uncomment to actually resize the VM (resizing restarts a running VM)
            # $vm.HardwareProfile.VmSize = $recommendedSize.Name
            # Update-AzVM -VM $vm -ResourceGroupName $resourceGroupName
            # Write-Output "VM has been resized"
        } else {
            Write-Output "No smaller size available that meets requirements"
        }
    } else {
        Write-Output "VM is appropriately sized"
    }
    
    Write-Output "------------------------"
}

OCI: Automated Right-sizing Script

Here's a Python script that reports right-sizing candidates among OCI Autonomous Databases based on utilization. The resize call is commented out so you can review the recommendations first:

Python
import oci
import datetime
from datetime import timedelta

# Initialize OCI clients
config = oci.config.from_file()
monitoring_client = oci.monitoring.MonitoringClient(config)
database_client = oci.database.DatabaseClient(config)

# Set parameters
compartment_id = "ocid1.compartment.oc1..example"
days_to_analyze = 14
cpu_threshold = 30  # Average CPU utilization threshold

# Get current time in UTC and start time (14 days ago)
end_time = datetime.datetime.now(datetime.timezone.utc)
start_time = end_time - timedelta(days=days_to_analyze)

# Get all Autonomous Databases in the compartment (paginated)
autonomous_dbs = oci.pagination.list_call_get_all_results(
    database_client.list_autonomous_databases, compartment_id
).data

# Check CPU utilization for each database
right_size_candidates = []

for db in autonomous_dbs:
    # Skip databases that are not AVAILABLE
    if db.lifecycle_state != "AVAILABLE":
        continue
        
    # Get CPU utilization metrics
    metric_data = monitoring_client.summarize_metrics_data(
        compartment_id=compartment_id,
        summarize_metrics_data_details=oci.monitoring.models.SummarizeMetricsDataDetails(
            namespace="oci_autonomous_database",
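            # MQL intervals top out at 1d; the 14-day window comes from start_time/end_time.
            # The dimension filter goes before the statistic.
            query=f'CpuUtilization[1d]{{resourceId = "{db.id}"}}.mean()',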
            start_time=start_time,
            end_time=end_time,
            resolution="1d"
        )
    ).data
    
    # Calculate average CPU utilization
    if metric_data and len(metric_data) > 0 and len(metric_data[0].aggregated_datapoints) > 0:
        datapoints = metric_data[0].aggregated_datapoints
        avg_cpu = sum(point.value for point in datapoints) / len(datapoints)
        
        # Check if database is a right-sizing candidate
        if avg_cpu < cpu_threshold:
            right_size_candidates.append({
                "name": db.display_name,
                "id": db.id,
                "cpu_core_count": db.cpu_core_count,
                "data_storage_size_in_tbs": db.data_storage_size_in_tbs,
                "avg_cpu": avg_cpu,
                # Aim for ~50% utilization; ceiling division rounds up so the target is not overshot
                "recommended_cpu": max(1, -int(-db.cpu_core_count * avg_cpu // 50))
            })

# Print right-sizing candidates and optionally resize
print(f"Found {len(right_size_candidates)} right-sizing candidates:")
for candidate in right_size_candidates:
    print(f"Database: {candidate['name']}")
    print(f"  Current CPU Cores: {candidate['cpu_core_count']}")
    print(f"  Avg CPU: {candidate['avg_cpu']:.2f}%")
    print(f"  Recommended CPU Cores: {candidate['recommended_cpu']}")
    
    # Uncomment to actually resize the database
    # if candidate['recommended_cpu'] < candidate['cpu_core_count']:
    #     print(f"  Resizing database...")
    #     database_client.update_autonomous_database(
    #         autonomous_database_id=candidate['id'],
    #         update_autonomous_database_details=oci.database.models.UpdateAutonomousDatabaseDetails(
    #             cpu_core_count=candidate['recommended_cpu']
    #         )
    #     )
    #     print(f"  Database has been resized")
    
    print("")

Automation Considerations

When implementing right-sizing automation, consider these factors:

  • Always test automation scripts in non-production environments first
  • Implement safety thresholds to prevent excessive downsizing
  • Include approval workflows for production environment changes
  • Monitor performance closely after automated right-sizing
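
The safety-threshold and approval ideas above can be expressed as a small guard that sits between a recommendation and the actual resize call. A minimal Python sketch follows; the function name, the 50% maximum step, and the `"production"` environment label are illustrative assumptions.

```python
import math


def plan_resize(current_cores, recommended_cores, environment,
                max_reduction=0.5, min_cores=1):
    """Clamp a downsizing recommendation and flag whether it needs approval."""
    # Never remove more than max_reduction of the current capacity in one step.
    step_floor = max(min_cores, math.ceil(current_cores * (1 - max_reduction)))
    target = max(recommended_cores, step_floor)
    # This guard only handles downsizing; a larger recommendation is a no-op.
    target = min(target, current_cores)
    changed = target != current_cores
    return {
        "target_cores": target,
        "changed": changed,
        "needs_approval": changed and environment == "production",
    }
```

With these defaults, a recommendation to take a 16-core production machine down to 2 cores is limited to 8 cores and marked as needing approval, so an aggressive estimate gets applied gradually and with a human in the loop.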

Best Practices for Right-Sizing

🧪 Planning & Analysis

  • 📊 Collect at least 14 days of performance data
  • 🗓️ Account for seasonal workload spikes (month-end, holidays, Black Friday, etc.)
  • 🔍 Analyze both average and peak utilization — not just one or the other
  • 📈 Leave 20–30% buffer for “uh-oh” moments

Talk Nerdy Rule: Plan like a pessimist, optimize like a boss.

🛠️ Implementation

  • 🚦 Begin with non-prod environments: lower risk, faster feedback
  • 🕒 Time changes for maintenance windows
  • 📝 Document everything — what changed, why, and who approved it
  • 🧯 Have a rollback plan (because stuff happens)

📡 Monitoring & Validation

  • 👀 Watch performance for 48–72 hours
  • 🚨 Set up alerts for any hiccups (latency, CPU spikes, etc.)
  • 💰 Validate that the juice was worth the squeeze — check those savings
  • 📚 Log lessons learned so future-you (and your team) get smarter
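
Checking the savings is simple arithmetic once you have the before and after hourly rates from your bill or the provider's pricing page. Here is a small Python sketch; the rates are placeholder inputs you supply, and 730 hours is the usual approximation for one month.

```python
def monthly_savings(old_hourly_rate, new_hourly_rate, hours_per_month=730):
    """Return (savings per month, savings as a percentage of the old cost)."""
    if old_hourly_rate <= 0:
        raise ValueError("old_hourly_rate must be positive")
    difference = old_hourly_rate - new_hourly_rate
    saved = difference * hours_per_month
    percent = difference / old_hourly_rate * 100
    return round(saved, 2), round(percent, 1)
```

Comparing the percentage against the 20–40% range quoted earlier is a quick way to tell whether a resize delivered what was expected.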

🏛️ Governance & Culture

  • 📆 Run monthly or quarterly review cycles
  • 🧾 Create policies that guide right-sizing at provisioning time
  • 📣 Educate teams — show why sizing right beats sizing safe
  • 🎉 Celebrate every successful right-size like a win on the cloud scoreboard

📌 Talk Nerdy Tip: Right-sizing isn’t just cost control — it’s cloud craftsmanship. Keep it lean, keep it clean, and make it a team sport. 🧢💡

🔁 Right-Sizing: Not a One-and-Done Deal

You wouldn’t go to the gym once and call it a transformation, right? Right-sizing isn’t a checkbox — it’s a habit. Cloud environments evolve, and so should your resource sizing. Here’s how to keep it tight (and cost-light):

  • 📆 Review Regularly: Monthly or quarterly cycles are your friend. Mark it on the calendar like a team ritual: “Budget Brunch & Resize Review” anyone?
  • 🤖 Automate What You Can: Let scripts do the snooping. Use monitoring tools to auto-flag idle or oversized resources. Azure Monitor, OCI custom metrics — they’re your cloud sous-chefs.
  • 📣 Adjust Based on Feedback: Look at your results. Talk to teams. Fine-tune your strategy. Optimization without communication = chaos in a trench coat.
  • 🆕 Stay Cloud-Curious: New SKUs and pricing models pop up faster than you can say “E4-Standard.” Make checking for newer, leaner instance types part of your routine.

📌 Talk Nerdy Tip: Right-sizing isn’t about cutting corners — it’s about fitting the cloud to your needs right now. Then doing it again next quarter. And the next.

🛠️ Common Right-Sizing Challenges

🚧 Performance Panic

The fear: “What if we shrink it and the app tanks?”

The fix: Start small. Begin with non-critical workloads and monitor closely. Once you’ve got proof that performance holds steady, scale the wins. 🧪 Pro tip: Keep a “Right-Sizing Hall of Fame” doc with before/after metrics.

📉 Missing Metrics, Missing Context

The problem: Your monitoring stack doesn’t track everything — looking at you, memory usage.

The fix: Install proper agents. Azure? Use Azure Monitor Agent. OCI? Tap into custom metrics via OCI Monitoring. No data = no decisions.

🔄 Workload Whiplash (aka Variability)

The struggle: Some apps are chill on Monday and on fire by Friday.

The fix: Don’t hard-size — auto-scale. Use Azure VM Scale Sets or OCI Instance Pools. Set thresholds to scale based on real demand. That way, your infra breathes with your load.
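
Conceptually, threshold-based autoscaling follows a rule like the one sketched below in Python. This is a simplified model for reasoning about thresholds, not the configuration format of Azure VM Scale Sets or OCI Instance Pools; the function name and the 70%/30% defaults are assumptions.

```python
def desired_instance_count(current, avg_cpu, min_count=1, max_count=10,
                           scale_out_above=70, scale_in_below=30):
    """Return the next instance count for a simple CPU-threshold rule."""
    if avg_cpu > scale_out_above:
        target = current + 1   # under pressure: add one instance
    elif avg_cpu < scale_in_below:
        target = current - 1   # mostly idle: remove one instance
    else:
        target = current       # inside the comfort band: do nothing
    # Keep the pool inside its configured bounds.
    return max(min_count, min(max_count, target))
```

The gap between the two thresholds matters: if they sit too close together, the pool can flip between adding and removing instances on every evaluation.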

🕒 Downtime Drama

The blocker: “We can’t shut this down just to resize!”

The fix: Use maintenance windows or pull off a blue-green right-sizing. Spin up right-sized clones. Test them. Flip the switch. Zero downtime, max optimization.
