🧩 Right-Sizing Resources for Azure & OCI

Shrink your bill, not your performance. If your cloud costs feel oversized but your apps aren’t, you’re probably due for a little right-sizing TLC. 🍳 This guide serves up Azure & OCI-specific tips, smart defaults, and habits to right-size once—and keep it that way. Let’s turn “more RAM just in case” into “just enough for the job.” 🛠️

⚙️ What is Right-Sizing? (No, it’s not a gym thing)
🎯 Why It Matters (Especially for SMBs)
📊 Key Metrics for Right-Sizing
Azure Right-sizing Techniques
Azure VM Right-sizing
Azure Database Right-sizing
Azure Storage Right-sizing
OCI Right-sizing Techniques
OCI Compute Right-sizing
OCI Database Right-sizing
OCI Storage Right-sizing
Automating Right-sizing
🧠 Best Practices for Right-Sizing
🛠️ Common Right-Sizing Challenges

⚙️ What is Right-Sizing? (No, it’s not a gym thing)

Right-sizing means matching your compute and storage to what your workload actually needs — not what sounded good at provisioning time. It’s the cloud equivalent of swapping your unused 8-seater SUV for a sleek hybrid: same destination, way less fuel burned.

🧮 The payoff? Most teams save 20–40% just by adjusting instance types, sizes, and volumes. No refactor required. Just smarter choices.

🎯 Why It Matters (Especially for SMBs)

💸 1. Cost Optimization

Overprovisioned = Overpaid. Right-sizing is the easiest way to stop funding idle VMs and give your budget a breather. Estimated savings: 20–40% without even touching your architecture.

🚀 2. Performance Boost

Right-size ≠ underpowered. It means your workloads get exactly what they need — no more, no less. Think fewer hiccups, faster response times, and happier users.

🌍 3. Environmental Impact

Yes, the cloud is virtual — but those servers are still drawing power somewhere. Shrink the waste, and you shrink your carbon footprint. Less compute = less energy = more planet-friendly cloud.

🧹 4. Operational Efficiency

Right-sized resources are easier to scale, monitor, and manage. You spend less time babysitting and more time building. Clean infra = clear mind.

💰 The Cost of Oversizing

When “just in case” becomes “just wasting cash.” Studies show most cloud setups are like ordering a triple espresso for a toddler:

📦 40–45% of VMs are one size tier too big
🧠 25–30% of databases run at 2–4x needed capacity
🧪 50–60% of non-prod environments are built like prod (why tho?)

That’s not “extra safe” — that’s extra spend. And when every oversized resource piles up, we’re talking billions in global cloud waste. TL;DR: Oversizing is easy. Cost-effective sizing takes intention — and a little Nerdy know-how.

📊 Key Metrics for Right-Sizing

Because you can’t shrink what you don’t measure. Right-sizing isn’t guesswork — it’s about watching the right dials and knowing when something’s way overbuilt for the job. Here’s what to track:

🧠 CPU Utilization

What it tells you: How much brainpower your resource is really using.

🔍 Nerdy Threshold: If average CPU stays below 30% and peak CPU stays below 50% over 2 weeks, it’s time to size down that overthinking VM.

🧠 Memory Utilization

What it tells you: How much memory your resource actually needs vs hoards “just in case.”

🔍 Nerdy Threshold: If average RAM usage < 40% over 2 weeks, you’re probably paying for digital hoarding.

💽 I/O Operations

What it tells you: How busy your storage and databases are with reads/writes and network traffic.

🔍 Nerdy Threshold: If peak disk or network usage is under 50% of what you’ve provisioned, you’ve got room to downsize.
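The three thresholds above can be folded into one check. This is a minimal sketch, not a rule engine: the function name, the input shape, and the sample numbers are hypothetical, and the CPU test uses the stricter reading where both average and peak must be low.

```python
from statistics import mean

# Thresholds quoted in this guide (percent); tune them to your own risk appetite.
CPU_AVG_MAX = 30.0
CPU_PEAK_MAX = 50.0
MEM_AVG_MAX = 40.0
IO_PEAK_MAX = 50.0  # peak I/O as a percentage of provisioned capacity


def downsize_signals(cpu_pct, mem_pct, io_pct_of_provisioned):
    """Return the metrics that point to an oversized resource.

    Each argument is a list of percentage samples covering the
    observation window (this guide suggests about two weeks).
    """
    signals = []
    if mean(cpu_pct) < CPU_AVG_MAX and max(cpu_pct) < CPU_PEAK_MAX:
        signals.append("cpu")
    if mean(mem_pct) < MEM_AVG_MAX:
        signals.append("memory")
    if max(io_pct_of_provisioned) < IO_PEAK_MAX:
        signals.append("io")
    return signals


# Hypothetical samples for one VM.
print(downsize_signals([12, 18, 25, 41], [30, 35, 38], [10, 22, 47]))
```

A resource that trips all three signals is a strong candidate; one that trips only a single signal probably needs a different shape rather than simply a smaller one.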

⚖️ Utilization vs. Performance

Because high efficiency means nothing if your app cries under pressure. When right-sizing, it’s not just about trimming the fat — it’s about keeping your app smooth and snappy too.

🧮 Utilization Metrics = What your resource is doing (CPU%, memory usage, IOPS — the raw fuel burn).
🧪 Performance Metrics = How your app is feeling (Response time, latency, throughput — aka: “Is it slow, or just chill?”).

🎯 Nerd Rule of Thumb: Always check both. High utilization with good performance? Nice. Low utilization but laggy UX? You might’ve right-sized too hard.
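The "check both" rule can be written as a small decision helper. This is a sketch under stated assumptions: the function name, the choice of p95 latency as the performance metric, and the verdict labels are all hypothetical, and the 30% cutoff is the CPU threshold from earlier in this section.

```python
def sizing_verdict(avg_cpu_pct, p95_latency_ms, latency_slo_ms, low_util_pct=30.0):
    """Combine one utilization metric with one performance metric."""
    under_used = avg_cpu_pct < low_util_pct
    healthy = p95_latency_ms <= latency_slo_ms

    if under_used and healthy:
        return "downsize candidate"
    if under_used and not healthy:
        # Idle CPU but slow responses: the bottleneck may be another
        # resource (memory, disk, network), so do not shrink blindly.
        return "investigate: slow despite idle capacity"
    if healthy:
        return "well sized"
    return "undersized or saturated"


print(sizing_verdict(avg_cpu_pct=15, p95_latency_ms=120, latency_slo_ms=200))
```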

Azure Right-sizing Implementation

Azure provides several tools and features to help you identify and implement right-sizing opportunities.

Virtual Machine Right-sizing

Azure VMs are often the largest component of cloud spend and present significant right-sizing opportunities.

Azure: VM Right-sizing Process

Collect utilization data: Use Azure Monitor to collect CPU, memory, disk, and network metrics for at least 14 days. Memory is a guest-level metric, so it requires the Azure Monitor Agent on the VM.
Analyze usage patterns: Look for consistent patterns of underutilization (average CPU < 30% and peak CPU < 50%).
Identify right-sizing candidates: Use Azure Advisor or create custom queries to identify VMs that can be downsized.
Plan the change: Determine the right VM size based on actual usage patterns and performance requirements.
Implement and validate: Resize the VM during a maintenance window and monitor performance after the change.
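The "plan the change" step is mostly arithmetic, sketched below. The size catalog is made up for illustration (real names and core counts come from the provider's size list), and the 70% peak target is an assumption that mirrors the 20–30% buffer recommended later in this guide.

```python
import math

# Hypothetical size catalog: (name, vCPUs), smallest first.
CATALOG = [("small", 2), ("medium", 4), ("large", 8), ("xlarge", 16)]


def pick_target_size(current_vcpus, peak_cpu_pct, target_peak_pct=70.0):
    """Smallest catalog size whose projected peak CPU stays within target.

    Assumes CPU demand scales linearly with core count, which is only
    a first approximation.
    """
    # Cores actually busy at peak, then cores needed to keep that
    # demand at or below the target utilization.
    busy_cores = current_vcpus * peak_cpu_pct / 100
    needed = math.ceil(busy_cores / (target_peak_pct / 100))
    for name, vcpus in CATALOG:
        if vcpus >= needed:
            return name
    return CATALOG[-1][0]  # nothing big enough: fall back to the largest size


print(pick_target_size(current_vcpus=8, peak_cpu_pct=30))
```

Sizing from peak rather than average CPU is the conservative choice here; memory and disk limits of the target size still need a separate check.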

Azure Advisor provides built-in recommendations for VM right-sizing. You can access these recommendations in the Azure Portal:

Navigate to Azure Advisor in the Azure Portal
Select the "Cost" tab
Review the "Right-size or shutdown underutilized virtual machines" recommendations

Azure VM Right-sizing Query

Azure Resource Graph holds resource inventory, not metric time series, so CPU history has to come from a Log Analytics workspace. Assuming VM insights is enabled (it writes guest metrics to the InsightsMetrics table), a query along these lines lists right-sizing candidates:

Kusto Query

// Find underutilized VMs based on CPU metrics collected by VM insights.
// Run this in the Log Analytics workspace that receives the VM data.
let timeRange = 14d;
let cpuThreshold = 30;
InsightsMetrics
| where TimeGenerated > ago(timeRange)
| where Namespace == "Processor" and Name == "UtilizationPercentage"
// One row per VM with its average and peak CPU over the window
| summarize AvgCPU = avg(Val), MaxCPU = max(Val) by _ResourceId, Computer
| where AvgCPU < cpuThreshold
| project Computer, _ResourceId, AvgCPU, MaxCPU
| order by AvgCPU asc

Database Right-sizing

Azure database services like Azure SQL Database and Azure Cosmos DB offer flexible scaling options that can be optimized based on usage.

Azure: Database Right-sizing

Azure SQL Database

For Azure SQL Database, monitor these key metrics:

DTU/vCore utilization (downsizing signal: average below 40%)
Storage utilization (downsizing signal: used storage below 70% of provisioned)
Query performance (ensure response times meet requirements)

Right-sizing options for Azure SQL Database:

Scale down service tier (Premium → Standard → Basic)
Reduce DTUs or vCores within the same tier
Switch between provisioned and serverless models based on usage patterns
Consider Azure SQL Elastic Pools for multiple databases with variable workloads

Storage Right-sizing

Azure Storage offers multiple tiers that can be optimized based on access patterns and data lifecycle.

Azure: Storage Right-sizing

Azure Storage right-sizing strategies:

Tier optimization: Move infrequently accessed data from Hot to Cool or Archive tiers
Lifecycle management: Implement automatic tiering based on access patterns
Redundancy level: Adjust redundancy (LRS, ZRS, GRS) based on data criticality

Example Azure Storage lifecycle policy to automatically tier data:

JSON

{
  "rules": [
    {
      "name": "MoveToCoolTier",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": [ "blockBlob" ]
        },
        "actions": {
          "baseBlob": {
            "tierToCool": { "daysAfterModificationGreaterThan": 30 }
          }
        }
      }
    },
    {
      "name": "MoveToArchiveTier",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": [ "blockBlob" ]
        },
        "actions": {
          "baseBlob": {
            "tierToArchive": { "daysAfterModificationGreaterThan": 90 }
          }
        }
      }
    }
  ]
}
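To preview what the policy above would do before enabling it, the two age rules can be mirrored in a few lines. This is only a local sketch: the function name and dates are hypothetical, it assumes every blob starts in the Hot tier, and it does not call Azure.

```python
from datetime import datetime, timedelta, timezone


def tier_for(last_modified, now, cool_after_days=30, archive_after_days=90):
    """Tier a blob would land in under the two age rules above."""
    age_days = (now - last_modified).days
    # The policy uses "greater than", so the boundary day itself does not move.
    if age_days > archive_after_days:
        return "Archive"
    if age_days > cool_after_days:
        return "Cool"
    return "Hot"


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
for days in (5, 45, 200):
    print(days, tier_for(now - timedelta(days=days), now))
```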

OCI Right-sizing Implementation

Oracle Cloud Infrastructure (OCI) offers flexible compute shapes and autoscaling capabilities that facilitate right-sizing.

Compute Instance Right-sizing

OCI's flexible compute shapes allow you to independently specify CPU and memory, enabling precise right-sizing.

OCI: Compute Right-sizing Process

Collect utilization data: Use OCI Monitoring to collect CPU, memory, and network metrics for at least 14 days.
Analyze usage patterns: Look for consistent patterns of underutilization (average CPU < 30% and peak CPU < 50%).
Identify right-sizing candidates: Use OCI Cloud Advisor or create custom monitoring queries to identify instances that can be downsized.
Plan the change: Determine the right compute shape based on actual usage patterns and performance requirements.
Implement and validate: Resize the instance during a maintenance window and monitor performance after the change.

OCI's flexible shapes allow you to specify:

Exact number of OCPUs (CPU cores)
Precise memory allocation
Network bandwidth requirements
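Because OCPUs and memory are set independently, a flexible-shape target can be computed for each dimension separately. The sketch below is illustrative only: the function name, the 50% CPU target, the 25% memory headroom, and the per-OCPU memory bounds are assumptions to replace with the limits of the actual shape.

```python
import math


def flex_shape_target(ocpus, memory_gb, avg_cpu_pct, peak_mem_pct,
                      target_cpu_pct=50.0, mem_headroom=0.25,
                      min_gb_per_ocpu=1, max_gb_per_ocpu=64):
    """Suggest OCPU and memory values for a flexible shape.

    The per-OCPU memory bounds are placeholders; check the limits of
    the specific shape before applying a result.
    """
    # Scale OCPUs so today's average load would sit at the target utilization.
    new_ocpus = max(1, math.ceil(ocpus * avg_cpu_pct / target_cpu_pct))
    # Size memory from observed peak usage plus headroom.
    used_gb = memory_gb * peak_mem_pct / 100
    new_mem = math.ceil(used_gb * (1 + mem_headroom))
    # Clamp memory into the allowed range for the new OCPU count.
    new_mem = max(new_ocpus * min_gb_per_ocpu,
                  min(new_mem, new_ocpus * max_gb_per_ocpu))
    return new_ocpus, new_mem


print(flex_shape_target(ocpus=8, memory_gb=128, avg_cpu_pct=20, peak_mem_pct=25))
```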

OCI Compute Right-sizing Script

You can use this Python script to identify right-sizing candidates in OCI:

Python

import oci
import datetime
from datetime import timedelta

# Initialize OCI clients (reads credentials from ~/.oci/config by default)
config = oci.config.from_file()
monitoring_client = oci.monitoring.MonitoringClient(config)
compute_client = oci.core.ComputeClient(config)

# Set parameters
compartment_id = "ocid1.compartment.oc1..example"
days_to_analyze = 14
cpu_threshold = 30  # Average CPU utilization threshold

# Get current time and start time (14 days ago), both in UTC
end_time = datetime.datetime.now(datetime.timezone.utc)
start_time = end_time - timedelta(days=days_to_analyze)

# Get all instances in the compartment (follows pagination so none are missed)
instances = oci.pagination.list_call_get_all_results(
    compute_client.list_instances, compartment_id
).data

# Check CPU utilization for each instance
right_size_candidates = []

for instance in instances:
    # Skip instances that are not in RUNNING state
    if instance.lifecycle_state != "RUNNING":
        continue
        
    # Get CPU utilization metrics
    metric_data = monitoring_client.summarize_metrics_data(
        compartment_id=compartment_id,
        summarize_metrics_data_details=oci.monitoring.models.SummarizeMetricsDataDetails(
            namespace="oci_computeagent",
            # MQL order is metric[interval]{dimensions}.statistic(); the interval is capped at 1d
            query=f'CpuUtilization[1d]{{resourceId = "{instance.id}"}}.mean()',
            start_time=start_time,
            end_time=end_time,
            resolution="1d"
        )
    ).data
    
    # Calculate average CPU utilization
    if metric_data and len(metric_data) > 0 and len(metric_data[0].aggregated_datapoints) > 0:
        datapoints = metric_data[0].aggregated_datapoints
        avg_cpu = sum(point.value for point in datapoints) / len(datapoints)
        
        # Check if instance is a right-sizing candidate
        if avg_cpu < cpu_threshold:
            shape_details = None
            if hasattr(instance.shape_config, "ocpus"):
                shape_details = f"{instance.shape_config.ocpus} OCPUs, {instance.shape_config.memory_in_gbs} GB memory"
            
            right_size_candidates.append({
                "name": instance.display_name,
                "id": instance.id,
                "shape": instance.shape,
                "shape_details": shape_details,
                "avg_cpu": avg_cpu
            })

# Print right-sizing candidates
print(f"Found {len(right_size_candidates)} right-sizing candidates:")
for candidate in right_size_candidates:
    print(f"Instance: {candidate['name']}")
    print(f"  Shape: {candidate['shape']}")
    if candidate['shape_details']:
        print(f"  Details: {candidate['shape_details']}")
    print(f"  Avg CPU: {candidate['avg_cpu']:.2f}%")
    print(f"  Recommendation: Consider reducing OCPUs or changing to a smaller shape")
    print("")

Database Right-sizing

OCI offers several database services, including Autonomous Database, that can be right-sized based on workload requirements.

OCI: Database Right-sizing

Autonomous Database

For OCI Autonomous Database, monitor these key metrics:

CPU utilization (downsizing signal: average below 40%)
Storage utilization (downsizing signal: used storage below 70% of provisioned)
Query performance (ensure response times meet requirements)

Right-sizing options for OCI Autonomous Database:

Scale down CPU cores
Adjust storage allocation
Switch between dedicated and shared infrastructure
Enable auto-scaling for variable workloads

Storage Right-sizing

OCI Storage services offer multiple tiers that can be optimized based on access patterns and data lifecycle.

OCI: Storage Right-sizing

OCI Storage right-sizing strategies:

Tier optimization: Move infrequently accessed data to Archive Storage
Object lifecycle management: Implement automatic archiving based on object age
Block volume performance: Adjust performance units based on IOPS and throughput requirements

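Example OCI Object Storage lifecycle policy to automatically archive objects (field names use the Object Storage API's camelCase, with rules wrapped in an items array):

JSON

{
  "items": [
    {
      "name": "ArchiveOldObjects",
      "action": "ARCHIVE",
      "timeAmount": 90,
      "timeUnit": "DAYS",
      "isEnabled": true
    }
  ]
}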

Right-sizing Automation

Automating the right-sizing process helps maintain optimal resource allocation as workloads change over time.

Azure Automation Options

Azure Advisor for automated recommendations
Azure Automation runbooks for scheduled right-sizing
Azure Functions for event-driven right-sizing
Azure Monitor Autoscale for dynamic scaling

OCI Automation Options

OCI Cloud Advisor for automated recommendations
OCI Functions for scheduled right-sizing
OCI Events for event-driven right-sizing
OCI Instance Pools with autoscaling for dynamic scaling

Azure: Automated Right-sizing Script

Here's a PowerShell script that flags underutilized Azure VMs and recommends a smaller size. The resize itself is commented out, so by default the script only reports:

PowerShell

# Connect to Azure
Connect-AzAccount

# Parameters
$cpuThreshold = 30
$daysToAnalyze = 14
$resourceGroupName = "your-resource-group"

# Get all VMs in the resource group
$vms = Get-AzVM -ResourceGroupName $resourceGroupName

foreach ($vm in $vms) {
    # Get VM size information
    $vmSize = $vm.HardwareProfile.VmSize
    $vmSizeInfo = Get-AzVMSize -VMName $vm.Name -ResourceGroupName $resourceGroupName | Where-Object { $_.Name -eq $vmSize }
    
    # Get CPU metrics for the VM
    $endTime = Get-Date
    $startTime = $endTime.AddDays(-$daysToAnalyze)
    $timeGrain = "01:00:00"
    
    $metrics = Get-AzMetric -ResourceId $vm.Id -MetricName "Percentage CPU" -StartTime $startTime -EndTime $endTime -TimeGrain $timeGrain -AggregationType Average
    
    # Calculate average CPU utilization
    $datapoints = $metrics.Data
    $avgCpu = ($datapoints | Measure-Object -Property Average -Average).Average
    
    Write-Output "VM: $($vm.Name)"
    Write-Output "Current Size: $vmSize (Cores: $($vmSizeInfo.NumberOfCores), Memory: $($vmSizeInfo.MemoryInMB / 1024) GB)"
    Write-Output "Average CPU: $($avgCpu)%"
    
    # Check if VM is a right-sizing candidate
    if ($avgCpu -lt $cpuThreshold) {
        # Get available VM sizes for this VM
        $availableSizes = Get-AzVMSize -VMName $vm.Name -ResourceGroupName $resourceGroupName
        
        # Find the smallest size that keeps projected CPU at or below roughly 50%
        $recommendedSize = $null
        foreach ($size in ($availableSizes | Sort-Object NumberOfCores)) {
            # Needed cores = cores busy today (current cores x average CPU), doubled,
            # so the same load would run at about 50% on the new size
            if ($size.NumberOfCores -lt $vmSizeInfo.NumberOfCores -and 
                $size.NumberOfCores -ge [math]::Ceiling($vmSizeInfo.NumberOfCores * $avgCpu / 100 * 2)) {
                $recommendedSize = $size
                break
            }
        }
        
        if ($recommendedSize) {
            Write-Output "Recommended Size: $($recommendedSize.Name) (Cores: $($recommendedSize.NumberOfCores), Memory: $($recommendedSize.MemoryInMB / 1024) GB)"
            
            # Uncomment to actually resize the VM
            # $vm.HardwareProfile.VmSize = $recommendedSize.Name
            # Update-AzVM -VM $vm -ResourceGroupName $resourceGroupName
            # Write-Output "VM has been resized"
        } else {
            Write-Output "No smaller size available that meets requirements"
        }
    } else {
        Write-Output "VM is appropriately sized"
    }
    
    Write-Output "------------------------"
}

OCI: Automated Right-sizing Script

Here's a Python script that flags underutilized OCI Autonomous Databases and recommends a CPU core count. The resize call is commented out, so by default the script only reports:

Python

import oci
import datetime
from datetime import timedelta

# Initialize OCI clients
config = oci.config.from_file()
monitoring_client = oci.monitoring.MonitoringClient(config)
database_client = oci.database.DatabaseClient(config)

# Set parameters
compartment_id = "ocid1.compartment.oc1..example"
days_to_analyze = 14
cpu_threshold = 30  # Average CPU utilization threshold

# Get current time and start time (14 days ago), both in UTC
end_time = datetime.datetime.now(datetime.timezone.utc)
start_time = end_time - timedelta(days=days_to_analyze)

# Get all Autonomous Databases in the compartment (follows pagination so none are missed)
autonomous_dbs = oci.pagination.list_call_get_all_results(
    database_client.list_autonomous_databases, compartment_id
).data

# Check CPU utilization for each database
right_size_candidates = []

for db in autonomous_dbs:
    # Skip databases that are not AVAILABLE
    if db.lifecycle_state != "AVAILABLE":
        continue
        
    # Get CPU utilization metrics
    metric_data = monitoring_client.summarize_metrics_data(
        compartment_id=compartment_id,
        summarize_metrics_data_details=oci.monitoring.models.SummarizeMetricsDataDetails(
            namespace="oci_autonomous_database",
            # MQL order is metric[interval]{dimensions}.statistic(); the interval is capped at 1d
            query=f'CpuUtilization[1d]{{resourceId = "{db.id}"}}.mean()',
            start_time=start_time,
            end_time=end_time,
            resolution="1d"
        )
    ).data
    
    # Calculate average CPU utilization
    if metric_data and len(metric_data) > 0 and len(metric_data[0].aggregated_datapoints) > 0:
        datapoints = metric_data[0].aggregated_datapoints
        avg_cpu = sum(point.value for point in datapoints) / len(datapoints)
        
        # Check if database is a right-sizing candidate
        if avg_cpu < cpu_threshold:
            right_size_candidates.append({
                "name": db.display_name,
                "id": db.id,
                "cpu_core_count": db.cpu_core_count,
                "data_storage_size_in_tbs": db.data_storage_size_in_tbs,
                "avg_cpu": avg_cpu,
                # Aim for about 50% utilization; ceiling division rounds up so the
                # projected load never lands above that target
                "recommended_cpu": max(1, int(-(-(db.cpu_core_count * avg_cpu) // 50)))
            })

# Print right-sizing candidates and optionally resize
print(f"Found {len(right_size_candidates)} right-sizing candidates:")
for candidate in right_size_candidates:
    print(f"Database: {candidate['name']}")
    print(f"  Current CPU Cores: {candidate['cpu_core_count']}")
    print(f"  Avg CPU: {candidate['avg_cpu']:.2f}%")
    print(f"  Recommended CPU Cores: {candidate['recommended_cpu']}")
    
    # Uncomment to actually resize the database
    # if candidate['recommended_cpu'] < candidate['cpu_core_count']:
    #     print(f"  Resizing database...")
    #     database_client.update_autonomous_database(
    #         autonomous_database_id=candidate['id'],
    #         update_autonomous_database_details=oci.database.models.UpdateAutonomousDatabaseDetails(
    #             cpu_core_count=candidate['recommended_cpu']
    #         )
    #     )
    #     print(f"  Database has been resized")
    
    print("")

Automation Considerations

When implementing right-sizing automation, consider these factors:

Always test automation scripts in non-production environments first
Implement safety thresholds to prevent excessive downsizing
Include approval workflows for production environment changes
Monitor performance closely after automated right-sizing
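Two of these considerations, safety thresholds and approval workflows, can be expressed as a guard that every automated proposal passes through. This is a sketch with hypothetical names and limits (a 50% maximum reduction per step, and an environment label of "prod"), not a feature of either cloud.

```python
import math


def guarded_target(current_cores, proposed_cores, environment,
                   max_reduction=0.5, min_cores=1):
    """Apply safety limits to an automated downsizing proposal.

    Returns (cores_to_apply, needs_approval).
    """
    # Never shrink by more than max_reduction in a single step.
    floor = max(min_cores, math.ceil(current_cores * (1 - max_reduction)))
    cores = max(proposed_cores, floor)
    # This guard only handles downsizing, so never exceed the current size.
    cores = min(cores, current_cores)
    # Any actual change in production waits for a human sign-off.
    needs_approval = environment == "prod" and cores < current_cores
    return cores, needs_approval


print(guarded_target(current_cores=8, proposed_cores=1, environment="dev"))
```

Capping each step means an aggressive recommendation is reached over several review cycles, with monitoring in between, instead of in one jump.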

Best Practices for Right-Sizing

🧪 Planning & Analysis

📊 Collect at least 14 days of performance data
🗓️ Account for seasonal workload spikes (month-end, holidays, Black Friday, etc.)
🔍 Analyze both average and peak utilization — not just one or the other
📈 Leave 20–30% buffer for “uh-oh” moments
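As a worked example of the buffer rule, the sketch below sizes from the highest observed peak (including seasonal spikes) and adds headroom on top. The function name and the numbers are hypothetical, and it reads "buffer" as a percentage added above peak usage, which is one reasonable interpretation.

```python
import math


def capacity_with_buffer(observed_peaks, buffer=0.25, step=1):
    """Capacity to provision: highest peak plus a buffer, rounded up to a size step."""
    raw = max(observed_peaks) * (1 + buffer)
    return math.ceil(raw / step) * step


# Weekly memory peaks in GB, with a month-end spike of 10 GB;
# sizes are assumed to come in 4 GB steps.
print(capacity_with_buffer([6, 7, 10], buffer=0.25, step=4))
```

Here 10 GB plus 25% is 12.5 GB, which rounds up to the next 4 GB step, 16 GB.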

Talk Nerdy Rule: Plan like a pessimist, optimize like a boss.

🛠️ Implementation

🚦Begin with non-prod environments — lower risk, faster feedback
🕒 Time changes for maintenance windows
📝 Document everything — what changed, why, and who approved it
🧯 Have a rollback plan (because stuff happens)

📡 Monitoring & Validation

👀 Watch performance for 48–72 hours
🚨 Set up alerts for any hiccups (latency, CPU spikes, etc.)
💰 Validate that the juice was worth the squeeze — check those savings
📚 Log lessons learned so future-you (and your team) get smarter
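The validation step boils down to a before/after comparison. The sketch below assumes you have captured p95 latency and monthly cost for both states; the dictionary keys, the 10% regression tolerance, and the figures are hypothetical.

```python
def validate_resize(before, after, max_latency_regression=0.10):
    """Compare before/after snapshots of a resize.

    Each snapshot is a dict with 'p95_latency_ms' and 'monthly_cost'.
    """
    latency_change = (
        (after["p95_latency_ms"] - before["p95_latency_ms"])
        / before["p95_latency_ms"]
    )
    savings = before["monthly_cost"] - after["monthly_cost"]
    return {
        "latency_change_pct": round(latency_change * 100, 1),
        "monthly_savings": round(savings, 2),
        # Keep the change only if it saved money without hurting latency too much.
        "keep_change": latency_change <= max_latency_regression and savings > 0,
    }


before = {"p95_latency_ms": 200, "monthly_cost": 400.0}
after = {"p95_latency_ms": 210, "monthly_cost": 250.0}
print(validate_resize(before, after))
```

When keep_change comes back False, that is the cue to use the rollback plan from the implementation checklist.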

🏛️ Governance & Culture

📆 Run monthly or quarterly review cycles
🧾 Create policies that guide right-sizing at provisioning time
📣 Educate teams — show why sizing right beats sizing safe
🎉 Celebrate every successful right-size like a win on the cloud scoreboard

📌 Talk Nerdy Tip: Right-sizing isn’t just cost control — it’s cloud craftsmanship. Keep it lean, keep it clean, and make it a team sport. 🧢💡

🔁 Right-Sizing: Not a One-and-Done Deal

You wouldn’t go to the gym once and call it a transformation, right? Right-sizing isn’t a checkbox — it’s a habit. Cloud environments evolve, and so should your resource sizing. Here’s how to keep it tight (and cost-light):

📆 Review Regularly: Monthly or quarterly cycles are your friend. Mark it on the calendar like a team ritual: “Budget Brunch & Resize Review” anyone?
🤖 Automate What You Can: Let scripts do the snooping. Use monitoring tools to auto-flag idle or oversized resources. Azure Monitor, OCI custom metrics — they’re your cloud sous-chefs.
📣 Adjust Based on Feedback: Look at your results. Talk to teams. Fine-tune your strategy. Optimization without communication = chaos in a trench coat.
🆕 Stay Cloud-Curious: New SKUs and pricing models pop up faster than you can say “E4-Standard.” Make checking for newer, leaner instance types part of your routine.

📌 Talk Nerdy Tip: Right-sizing isn’t about cutting corners — it’s about fitting the cloud to your needs right now. Then doing it again next quarter. And the next.

🛠️ Common Right-Sizing Challenges

🚧 Performance Panic

The fear: “What if we shrink it and the app tanks?”

The fix: Start small. Begin with non-critical workloads and monitor closely. Once you’ve got proof that performance holds steady, scale the wins. 🧪 Pro tip: Keep a “Right-Sizing Hall of Fame” doc with before/after metrics.

📉 Missing Metrics, Missing Context

The problem: Your monitoring stack doesn’t track everything — looking at you, memory usage.

The fix: Install proper agents. Azure? Use Azure Monitor Agent. OCI? Tap into custom metrics via OCI Monitoring. No data = no decisions.

🔄 Workload Whiplash (aka Variability)

The struggle: Some apps are chill on Monday and on fire by Friday.

The fix: Don’t hard-size — auto-scale. Use Azure VM Scale Sets or OCI Instance Pools. Set thresholds to scale based on real demand. That way, your infra breathes with your load.
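The logic behind a threshold-based autoscale rule is simple enough to sketch. The thresholds and instance limits below are hypothetical, and the managed services add things this sketch leaves out, such as cooldown periods between scaling actions.

```python
def desired_instances(current, avg_cpu_pct, min_n=1, max_n=10,
                      scale_out_above=70.0, scale_in_below=30.0):
    """One evaluation step of a simple threshold autoscale rule."""
    if avg_cpu_pct > scale_out_above:
        target = current + 1   # busy: add an instance
    elif avg_cpu_pct < scale_in_below:
        target = current - 1   # idle: remove an instance
    else:
        target = current       # inside the comfortable band: hold steady
    # Stay within the configured pool limits.
    return max(min_n, min(max_n, target))


print(desired_instances(current=3, avg_cpu_pct=85))
```

Keeping a gap between the scale-in and scale-out thresholds stops the pool from flapping when load hovers around a single value.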

🕒 Downtime Drama

The blocker: “We can’t shut this down just to resize!”

The fix: Use maintenance windows or pull off a blue-green right-sizing. Spin up right-sized clones. Test them. Flip the switch. Zero downtime, max optimization.
