Understanding Disk Storage and Capacity Planning in OpenStack
Storage decisions in OpenStack directly affect your workload's performance, cost, and resilience. The platform offers multiple storage options, each designed for different use cases. Choosing the right combination of disk type (HDD or NVMe) and storage architecture (local or distributed) determines whether your applications run efficiently and your data remains protected.
This guide explains the differences between storage types available in OpenStack, how to select the right option for your workload, and how to estimate your storage needs including backup considerations.
Storage Types: HDD vs NVMe
The physical media underlying your storage determines baseline performance characteristics. OpenStack environments typically offer two disk types: traditional hard disk drives (HDD) and solid-state NVMe drives.
HDD Storage
Hard disk drives use spinning magnetic platters and mechanical read/write heads. This technology has been the standard for decades and remains cost-effective for large capacity requirements.
Performance characteristics:
- Sequential read/write speeds typically range from 100 to 200 MB/s
- Random I/O performance is limited by seek time (the physical movement of the read head), typically 50 to 150 IOPS
- Higher latency compared to flash storage, often 5 to 15 milliseconds per operation
Best suited for:
- Archival data and cold storage
- Log aggregation and long-term retention
- Backup storage where retrieval speed is not critical
- Large datasets with primarily sequential access patterns (media files, data lakes)
- Development and test environments where performance is not the primary concern
Cost profile: Lower cost per gigabyte, making HDD the economical choice when you need large volumes of storage and can tolerate slower access.
NVMe Storage
NVMe (Non-Volatile Memory Express) drives are solid-state devices connected via the PCIe bus, eliminating the mechanical limitations of HDDs entirely.
Performance characteristics:
- Sequential read/write speeds can exceed 3,000 MB/s on modern drives
- Random I/O performance reaches tens of thousands to hundreds of thousands of IOPS
- Latency measured in microseconds rather than milliseconds
Best suited for:
- Production databases (MySQL, PostgreSQL, MongoDB)
- High-transaction applications
- Real-time analytics and data processing
- Boot volumes where fast startup matters
- Any workload where I/O latency impacts user experience or application throughput
Cost profile: Higher cost per gigabyte than HDD. The premium is justified when your application's performance depends on storage speed.
Quick Comparison: HDD vs NVMe
| Attribute | HDD | NVMe |
|---|---|---|
| Sequential throughput | 100 to 200 MB/s | 3,000+ MB/s |
| Random IOPS | 50 to 150 | 50,000 to 500,000+ |
| Latency | 5 to 15 ms | 0.1 to 0.5 ms |
| Cost per GB | Lower | Higher |
| Durability | Sensitive to vibration/shock | No moving parts |
| Best for | Archival, logs, cold data | Databases, production apps |
Storage Architecture: Local vs Distributed (Ceph)
Beyond the physical disk type, OpenStack offers two fundamentally different storage architectures: local storage and distributed block storage (typically Ceph).
Local Storage
Local storage refers to disk space physically attached to the compute host running your instance. When you select a flavor (instance size), the "disk" value shown represents local storage allocated from the hypervisor's directly attached drives. This storage is persistent: your data survives instance reboots and remains intact as long as the instance exists.
Note: Local storage is different from ephemeral storage. Ephemeral storage is non-persistent and disappears when an instance is terminated. Local storage, as described here, persists with the instance.
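If you want to check programmatically how much local disk a given flavor provides, the openstacksdk Python library can list it. A minimal sketch, assuming openstacksdk is installed and a clouds.yaml entry is configured (the cloud name `mycloud` is a placeholder):

```python
# List each flavor's local disk allocation with openstacksdk.
# "mycloud" is a hypothetical clouds.yaml entry; adjust to your environment.
import openstack

conn = openstack.connect(cloud="mycloud")

for flavor in conn.compute.flavors():
    print(f"{flavor.name}: {flavor.vcpus} vCPU, "
          f"{flavor.ram} MB RAM, {flavor.disk} GB local disk")
```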
Characteristics:
- Storage is tied to a specific compute host
- Data is persistent and survives reboots
- Data does not replicate automatically to other hosts
- If the host fails, data on its local disks is inaccessible until the host recovers, and may be lost outright if the hardware cannot be restored
- No network overhead for I/O operations
- Performance limited only by the underlying drive (HDD or NVMe)
Advantages:
- Lowest latency possible since no network hop is involved
- Maximum throughput of the underlying disk is available
- Simple to understand: the disk size in your flavor is what you get
- Data persists across instance reboots
Limitations:
- No built-in redundancy: If the physical host experiences a hardware failure, data on local storage is at risk until the host is recovered
- Host-bound: Your instance and its data are tied to a specific physical server
- Limited live migration options: Moving an instance with local storage to another host requires copying the entire disk, which can take considerable time
Best suited for:
- Workloads requiring maximum I/O performance
- Applications that manage their own replication (like clustered databases)
- High-performance computing where network latency is unacceptable
- Scenarios where host-level failure risk is acceptable or mitigated by application-level redundancy
Distributed Storage (Ceph Volumes)
Ceph is a distributed storage system that spreads data across multiple nodes in the cluster. In OpenStack, Ceph backs the Cinder volume service, meaning volumes you create and attach to instances are stored in Ceph.
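To make the workflow concrete, here is a minimal openstacksdk sketch that creates a volume and attaches it to a running instance. The cloud, volume, and server names are hypothetical placeholders:

```python
# Create a Cinder (Ceph-backed) volume and attach it to an instance.
# "mycloud", "app-data", and "app-server-01" are hypothetical names.
import openstack

conn = openstack.connect(cloud="mycloud")

# Cinder places the volume in the Ceph pool, replicated across storage
# nodes and independent of any single compute host.
volume = conn.create_volume(size=100, name="app-data", wait=True)

# Attach it; the guest sees a new block device it can format and mount.
server = conn.get_server("app-server-01")
conn.attach_volume(server, volume)
```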
Characteristics:
- Data is replicated across multiple storage nodes (typically three copies)
- Volumes exist independently of any single compute host
- Accessible from any instance in the same availability zone
- I/O traverses the network between compute and storage nodes
Advantages:
- High availability: If a storage node fails, data remains accessible from other replicas
- Persistence: Volumes survive instance termination and can be reattached to different instances
- Snapshots and backups: Cinder provides integrated snapshot and backup capabilities for Ceph volumes
- Live migration friendly: Instances using only Ceph volumes can be live-migrated without copying disk data
Limitations:
- Network latency adds overhead compared to local disk
- Throughput is constrained by network bandwidth between compute and storage tiers
- A more complex architecture means more components that can fail or require troubleshooting
Best suited for:
- Production databases and stateful applications
- Any data that must survive instance or host failures
- Workloads requiring snapshot or backup capabilities
- Environments where live migration and high availability are requirements
Quick Comparison: Local vs Ceph
| Attribute | Local Storage | Ceph Volumes |
|---|---|---|
| Persistence | Yes (survives reboots) | Yes |
| Redundancy | None (single host) | Typically 3x replication |
| Survives host failure | At risk until host recovers | Yes |
| Independent of instance | No (tied to instance) | Yes (volumes persist independently) |
| Live migration | Slow (requires disk copy) | Fast (no disk copy needed) |
| Latency | Lowest | Network-dependent |
| Snapshots/Backups | Limited | Full Cinder support |
| Best for | High-performance, HPC | Stateful apps, HA requirements |
How to Choose the Right Storage
Selecting storage involves two decisions: disk type (HDD vs NVMe) and architecture (local vs Ceph). Use these guidelines to match each decision to your workload requirements; a short sketch after the lists condenses the logic into code.
Decision 1: HDD vs NVMe
Choose HDD when:
- Cost per gigabyte is the primary concern
- Workload is not I/O-sensitive (archival, logs, backups)
- Access patterns are primarily sequential
- You need large capacity without premium performance
Choose NVMe when:
- Application performance depends on storage speed
- Workload involves random I/O (databases, transactional systems)
- Low latency is critical (real-time applications)
- Boot time matters for rapid scaling
Decision 2: Local Storage vs Ceph Volumes
Choose local storage when:
- You need maximum possible I/O performance with zero network overhead
- Application handles its own data replication (clustered databases, distributed systems)
- Host-level failure risk is acceptable or mitigated by application-level redundancy
- Live migration speed is not a priority
Choose Ceph volumes when:
- Data must be portable between instances
- High availability at the storage layer is required
- You need snapshot and backup capabilities through Cinder
- Live migration support matters for your operational model
- Compliance or business requirements mandate storage-level redundancy
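The two decisions can be condensed into a simple helper. This is an illustrative sketch of the guidelines above, not an official API; the boolean flags are simplified stand-ins for a real requirements review:

```python
# Illustrative decision helper condensing the guidelines above.
def recommend_storage(io_sensitive: bool,
                      app_handles_replication: bool,
                      needs_snapshots_or_ha: bool) -> str:
    disk = "NVMe" if io_sensitive else "HDD"
    if needs_snapshots_or_ha or not app_handles_replication:
        arch = "Ceph volumes"
    else:
        arch = "local storage"
    return f"{disk} + {arch}"

# A production database: I/O-sensitive, relies on storage-layer HA.
print(recommend_storage(True, False, True))   # -> NVMe + Ceph volumes
# A scratch HPC workload that replicates at the application layer.
print(recommend_storage(True, True, False))   # -> NVMe + local storage
```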
Common Scenarios
Production database (PostgreSQL, MySQL):
- NVMe-backed Ceph volumes
- Provides fast I/O for queries, persistence across failures, snapshot capabilities for backups
Web application servers:
- Local storage with HDD or NVMe depending on performance needs
- Good choice when application data lives elsewhere (database, object storage) and instances can be rebuilt from images if needed
Development and testing:
- Local HDD storage
- Cost-effective, acceptable performance for non-production workloads
Data analytics processing:
- Local NVMe for high-speed working storage during processing
- Ceph volumes for input datasets and output storage that need to be portable or backed up
Archival and backup storage:
- HDD-backed Ceph volumes
- Cost-effective for large volumes, redundancy protects archived data
Estimating Storage Needs
Proper capacity planning prevents both over-provisioning (wasted cost) and under-provisioning (performance degradation or out-of-space failures).
Typical Storage Requirements by Workload
| Workload Type | Typical Storage Range | Notes |
|---|---|---|
| Small web application | 20 to 50 GB | OS, application code, logs |
| Medium application server | 50 to 200 GB | Application data, local caching |
| Production database (small) | 100 to 500 GB | Data files, logs, temp space |
| Production database (medium to large) | 500 GB to 2 TB+ | Scale based on actual data volume |
| Development instance | 20 to 100 GB | Varies by project |
| Log aggregation | 500 GB to 5 TB | Depends on retention policy |
| Data processing (scratch) | 100 GB to 1 TB | Temporary storage for jobs |
Calculating Your Requirements
When estimating storage needs, account for these factors:
- Operating system and base software: Linux distributions typically require 5 to 15 GB depending on installed packages
- Application code and dependencies: Usually 1 to 10 GB for most applications
- Application data: This is workload-specific. For databases, calculate based on current data size plus expected growth. For file storage, audit existing usage patterns.
- Logs and temporary files: Allocate 10 to 20% additional space for logs, swap, and temp files unless these are directed elsewhere
- Growth margin: Add 20 to 30% headroom for growth. Running volumes at 90%+ capacity can degrade performance and leaves no room for unexpected spikes. The example below walks through this arithmetic.
Example calculation for a database server:
- OS and packages: 15 GB
- Database engine: 2 GB
- Current data: 150 GB
- Expected 6-month growth: 50 GB
- Logs and temp: 30 GB
- Subtotal: 247 GB
- With 20 to 30% headroom: approximately 300 GB volume
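The same arithmetic as a short Python sketch, using the example's illustrative figures:

```python
# Volume sizing: sum the components, then add 20-30% growth headroom.
def required_volume_gb(components_gb: dict, headroom: float = 0.25) -> float:
    subtotal = sum(components_gb.values())
    return subtotal * (1 + headroom)

db_server = {
    "os_and_packages": 15,
    "database_engine": 2,
    "current_data": 150,
    "six_month_growth": 50,
    "logs_and_temp": 30,
}

# 247 GB subtotal * 1.25 ~= 309 GB -> provision roughly a 300 GB volume.
print(f"{required_volume_gb(db_server):.0f} GB")
```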
Backup Storage Considerations
Understanding where backups reside and how they consume storage is critical for capacity planning.
Volume Snapshots
Snapshots are stored within the Cinder backend (Ceph). On InMotion Cloud, this means snapshot storage comes from the same Ceph cluster as your volumes.
Storage impact:
- Snapshots use copy-on-write, so the initial snapshot consumes minimal additional space
- As the source volume changes, snapshot storage grows to preserve the original blocks
- Long-running snapshots of active volumes can consume significant storage over time
Quota accounting: Snapshot storage counts against your block storage quota.
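Creating a snapshot is a single call via openstacksdk. A minimal sketch; the cloud and volume names are hypothetical placeholders:

```python
# Snapshot a volume; the snapshot lives in the same Ceph pool as the
# volume and counts against the block storage quota.
import openstack

conn = openstack.connect(cloud="mycloud")
volume = conn.get_volume("app-data")  # hypothetical volume name

# force=True permits snapshotting a volume that is currently attached.
snapshot = conn.create_volume_snapshot(volume.id, name="app-data-snap",
                                       force=True, wait=True)
print(snapshot.status)
```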
Volume Backups
Backups are written to object storage (Swift), which is separate from the Ceph block storage pool.
Storage impact:
- Initial full backup consumes storage equal to the used space on the volume (not the allocated size)
- Incremental backups store only changed blocks, reducing subsequent backup sizes
- Backup storage is independent of block storage quota
Quota accounting: Backup storage counts against your object storage quota, not block storage.
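A matching sketch for backups, assuming your deployment exposes the Cinder backup service; names are again hypothetical placeholders:

```python
# Back up a volume; Cinder writes backups to Swift object storage,
# a separate pool and quota from volumes and snapshots.
import openstack

conn = openstack.connect(cloud="mycloud")
volume = conn.get_volume("app-data")  # hypothetical volume name

# The first backup of a volume must be full; incremental=True stores
# only blocks changed since the most recent backup.
backup = conn.create_volume_backup(volume.id, name="app-data-backup",
                                   force=True, incremental=True, wait=True)
print(backup.status)
```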
Do Backups Consume the Same Pool?
No. Backups and volumes use different storage pools:
| Data Type | Storage Pool | Quota Type |
|---|---|---|
| Volumes | Ceph block storage | Block storage quota |
| Snapshots | Ceph block storage | Block storage quota |
| Backups | Swift object storage | Object storage quota |
This separation is intentional. Backups in object storage remain accessible even if the block storage tier experiences issues, which is why backups provide disaster recovery protection that snapshots alone cannot.
Planning Backup Storage Capacity
When planning backup capacity, consider:
- Number of volumes to back up: List all volumes requiring backup protection
- Full backup size: Sum the used space across all volumes (check volume details for actual usage versus allocated size)
- Retention period: How many days/weeks of backups will you keep?
- Backup frequency and change rate: If you take daily incremental backups and your data changes 5% daily, each incremental is roughly 5% of the full volume size
Example backup storage calculation:
- 5 volumes, averaging 200 GB used space each = 1 TB total
- Weekly full backups, daily incrementals
- 30-day retention (4 full backups + approximately 26 incrementals)
- Assume 5% daily change rate
Backup storage needed:
- 4 full backups: 4 TB
- 26 incremental backups at 5% of 1 TB: approximately 1.3 TB
- Total backup storage: approximately 5.3 TB object storage
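The same arithmetic as a short Python sketch, under the stated assumptions (weekly fulls, daily incrementals, 30-day retention, 5% daily change):

```python
# Backup capacity: retained fulls plus retained incrementals sized by
# the daily change rate. Figures mirror the example above.
def backup_storage_tb(used_tb: float, fulls: int,
                      incrementals: int, daily_change: float) -> float:
    return fulls * used_tb + incrementals * used_tb * daily_change

# 5 volumes x 200 GB used = 1 TB; 4 fulls + 26 incrementals at 5%/day.
print(backup_storage_tb(used_tb=1.0, fulls=4,
                        incrementals=26, daily_change=0.05))  # -> 5.3 TB
```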
Summary
Storage selection in OpenStack comes down to matching your workload requirements to the right combination of disk type and architecture:
- HDD for cost-effective, high-capacity storage where performance is secondary
- NVMe for performance-critical workloads requiring low latency and high IOPS
- Local storage for workloads needing maximum raw performance where application-level redundancy handles failure scenarios
- Ceph volumes for portable, highly available storage with snapshot and backup support
Plan capacity by auditing actual needs, adding growth margin, and separately accounting for backup storage in object storage versus block storage quotas.
For help selecting the right storage configuration for your specific workload, contact InMotion Cloud support.