Understanding Affinity and Anti-Affinity Policies in OpenStack

What Are Server Groups and Placement Policies?

Server groups in OpenStack provide precise control over where your instances (virtual machines) run on the underlying physical infrastructure. By applying affinity or anti-affinity policies, you tell the Nova scheduler exactly how to distribute instances across compute nodes (physical servers).

This level of control becomes critical when architecting resilient applications. The wrong placement strategy can mean a single hardware failure takes down your entire application stack. The right strategy ensures availability, meets compliance requirements, and optimizes performance.

The Four Core Placement Policies

OpenStack provides four distinct policies that govern instance placement behavior. Each serves different architectural goals and carries specific tradeoffs.

Anti-Affinity Policy

Anti-affinity guarantees that instances in a server group will never run on the same physical host. If you have three instances with an anti-affinity policy, they will be distributed across three different compute nodes.

Use case: High availability for critical services. If you run three API servers with anti-affinity, a physical host failure affects only one instance. Your load balancer continues routing traffic to the remaining two servers.

Hard constraint: If you request a fourth instance but only have three compute nodes available, the request fails. The scheduler refuses to violate the anti-affinity rule.

Affinity Policy

Affinity does the opposite. It forces all instances in a server group onto the same physical host. This creates a shared-fate configuration where all instances succeed or fail together.

Use case: Workloads that communicate heavily and benefit from local network speeds. Database primary and read replica pairs, tightly coupled microservices, or distributed computing tasks where inter-node latency degrades performance.

Performance benefit: Network traffic between instances stays within the host's internal network bridge, avoiding the physical network entirely. For chatty workloads, this can cut round-trip latency substantially compared with traffic that has to cross physical switches.

Risk: Complete loss of redundancy. A single host failure takes down all instances in the group.

Soft Anti-Affinity Policy

Soft anti-affinity attempts to separate instances across different hosts but allows exceptions when necessary. If the scheduler cannot find separate hosts, it will place instances on the same host rather than fail the request.

Use case: Development environments, test deployments, or situations where availability is preferred but not mandatory. You want separation when possible but need flexibility when resources are constrained.

Tradeoff: You sacrifice the high availability guarantee of hard anti-affinity in exchange for deployment reliability. This policy never fails due to resource constraints, but you may end up with instances on shared hardware during peak usage.

Soft Affinity Policy

Soft affinity tries to place instances together on the same host but tolerates separation if required. The scheduler prefers co-location but prioritizes successful deployment over strict adherence.

Use case: Workloads that benefit from proximity but can tolerate distribution. Batch processing jobs, non-critical services, or applications where performance optimization is desirable but not required.

Behavior: During normal operation, instances land on the same host. Under resource pressure, they may spread across multiple hosts without failing the deployment.

How OpenStack Scheduling Works with Server Groups

When you create an instance that belongs to a server group, the Nova scheduler evaluates available compute nodes against the policy constraints.

Hard policies (affinity and anti-affinity):

  1. Scheduler filters compute nodes that satisfy the policy
  2. If no nodes meet the requirements, the instance enters ERROR state
  3. The deployment fails with a "No valid host found" error
  4. You must resolve resource constraints before retrying

Soft policies (soft-affinity and soft-anti-affinity):

  1. Scheduler first attempts to honor the policy preference
  2. If preferred placement is impossible, it falls back to any available host
  3. The instance deploys successfully regardless of policy satisfaction
  4. No deployment failures occur due to placement constraints
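The fallback difference can be sketched as a toy shell function. This is a simplified illustration only: the host names are made up and no real Nova API calls are involved.

```shell
# Toy model of hard vs. soft scheduling: a hard policy fails when no host
# satisfies the constraint, while a soft policy falls back to any host.
schedule() {
  policy=$1
  valid_hosts=$2   # hosts that satisfy the policy (may be empty)
  all_hosts=$3     # every enabled compute node
  if [ -n "$valid_hosts" ]; then
    echo "placed on ${valid_hosts%% *}"
  elif [ "$policy" = "soft-affinity" ] || [ "$policy" = "soft-anti-affinity" ]; then
    echo "placed on ${all_hosts%% *} (policy not satisfied)"
  else
    echo "No valid host found"
  fi
}

schedule anti-affinity "" "node1 node2"        # hard policy: fails
schedule soft-anti-affinity "" "node1 node2"   # soft policy: falls back
```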

This difference makes hard policies appropriate for production workloads with strict requirements, while soft policies suit flexible or development environments.

Real-World Deployment Patterns

Pattern 1: Highly Available Web Application

You run a three-tier web application requiring 99.9% uptime. Deploy each tier using anti-affinity policies:

Frontend tier: Three Nginx instances in an anti-affinity server group behind a load balancer. Each instance runs on a separate physical host.

Application tier: Four application servers in an anti-affinity group. Loss of one host reduces capacity by 25% but maintains service availability.

Database tier: Two-node database cluster with anti-affinity. Primary and replica never share a physical host, ensuring data availability during hardware maintenance or failure.

Result: Any single compute node failure affects only one instance per tier. The application continues serving traffic with reduced capacity until failed nodes are recovered.

Pattern 2: High-Performance Analytics Cluster

You process large datasets using distributed computing where network latency between nodes directly impacts job completion time.

Configuration: Deploy all compute workers in an affinity server group. A job with 16 single-core workers, for example, lands all 16 instances on the same physical host.

Performance gain: Inter-instance communication uses local virtual networking at near-native speeds. Network-intensive shuffle operations complete faster than they would under distributed placement.

Limitation: Cluster size cannot exceed the resources of a single compute node. This pattern works for workloads that fit on large hosts but does not scale horizontally across multiple nodes.

Pattern 3: Development Environment with Best-Effort Separation

Your development team runs multiple staging environments that mirror production architecture but require cost optimization.

Configuration: Use soft anti-affinity for application instances. The policy attempts separation but tolerates co-location when compute resources are limited.

Benefit during normal load: Instances spread across hosts, approximating production behavior for testing.

Benefit during high utilization: New deployments succeed even when compute capacity is exhausted. Developers can continue testing without waiting for infrastructure scaling.

Tradeoff: Testing may not catch issues related to cross-host communication or failure scenarios when instances end up co-located.

Creating and Managing Server Groups

Using Horizon

You can create and assign server groups through the Horizon dashboard. Note that the Horizon UI only supports affinity and anti-affinity policies. Soft affinity and soft anti-affinity are only available via CLI.

Create a server group:

  1. Navigate to Project > Compute > Server Groups.
  2. Click Create Server Group in the upper right corner.
  3. Enter a descriptive Name (e.g., ha-web-servers).
  4. Select a Policy from the dropdown: Anti-Affinity or Affinity.
  5. Click Create Server Group.

Assign an instance to a server group at launch:

  1. Navigate to Project > Compute > Instances and click Launch Instance.
  2. Complete the Details, Source, Flavor, and Networks tabs.
  3. Open the Server Groups tab.
  4. Click the + button next to your server group to move it to the Allocated list.
  5. Finish any remaining tabs and click Launch Instance.

View server group details:

  1. Navigate to Project > Compute > Server Groups.
  2. Click a server group name to see its policy, ID, and member instances.

Using the CLI

For soft affinity and soft anti-affinity policies, or for scripting and automation, use the OpenStack CLI.

Create a Server Group

openstack server group create \
  --policy anti-affinity \
  ha-web-servers

This creates a server group named "ha-web-servers" with an anti-affinity policy. Instances added to this group will never share a physical host.

Available policies: affinity, anti-affinity, soft-affinity, soft-anti-affinity. The two soft policies require Nova API microversion 2.15 or later.

Launch Instances into a Server Group

openstack server create \
  --flavor m1.medium \
  --image ubuntu-22.04 \
  --network private-net \
  --hint group=<server-group-id> \
  web-01

The --hint group=<server-group-id> parameter tells the scheduler to apply the server group's policy when selecting a compute node.

Repeat this command for additional instances. Each will follow the same placement policy.
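When launching several members, a small loop can generate the repeated commands. The sketch below only prints them as a dry run so you can review before executing; the flavor, image, network, and `<server-group-id>` placeholder match the example above.

```shell
# Print (rather than execute) the launch commands for three group members.
# <server-group-id> is a placeholder; substitute your real group UUID.
group="<server-group-id>"
for i in 1 2 3; do
  printf 'openstack server create --flavor m1.medium --image ubuntu-22.04 --network private-net --hint group=%s web-%02d\n' "$group" "$i"
done
```

Once the printed commands look right, pipe the output to `sh` to run them.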

Verify Instance Placement

openstack server list --long

Check the Host column to confirm instances are distributed according to your policy. For anti-affinity groups, each instance should show a different host value.

List Server Groups

openstack server group list

Shows all server groups, their policies, and member instance counts.

View Server Group Details

openstack server group show <server-group-id>

Displays the policy, member instances, and metadata for a specific group.

Common Mistakes and How to Avoid Them

Mistake 1: Insufficient Compute Capacity for Hard Policies

You create an anti-affinity group for six instances but have only four compute nodes. The fifth and sixth instances fail with "No valid host found" errors.

Solution: Before deploying with hard policies, verify you have sufficient compute nodes. For an anti-affinity group with N instances, you need at least N compute nodes.

Command to check available nodes:

openstack hypervisor list
openstack hypervisor show <hypervisor-id>

Review the state and available resources on each compute node.
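The capacity rule itself is simple arithmetic; a pre-deployment sanity check might look like the sketch below, where the counts are example values (in practice, take the node count from the hypervisor listing above).

```shell
# Hard anti-affinity needs at least one enabled compute node per instance.
instances=6   # planned size of the anti-affinity group (example value)
nodes=4       # enabled compute nodes (example value)
if [ "$nodes" -ge "$instances" ]; then
  echo "OK: $nodes nodes can host $instances separated instances"
else
  echo "SHORT: need $((instances - nodes)) more compute nodes"
fi
```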

Mistake 2: Forgetting About Maintenance Windows

You have five instances in an anti-affinity group across five compute nodes. During planned maintenance, you need to evacuate instances from one node. With no spare capacity, you cannot maintain anti-affinity while consolidating onto remaining nodes.

Solution: Maintain N+1 compute node capacity for anti-affinity groups. If you need five separated instances, provision six compute nodes. This provides headroom for maintenance operations without violating policy constraints.
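The same arithmetic extends to the N+1 rule; a sketch with example values:

```shell
# N+1 rule: provision one spare node beyond your largest anti-affinity group
# so maintenance can proceed without violating the policy. Example values.
group_size=5
nodes=5
required=$((group_size + 1))
if [ "$nodes" -ge "$required" ]; then
  echo "headroom OK: $nodes nodes for a group of $group_size"
else
  echo "add $((required - nodes)) node(s) for maintenance headroom"
fi
```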

Mistake 3: Using Affinity Without Understanding Single Points of Failure

You deploy a database cluster with three nodes using an affinity policy to maximize replication speed. When the physical host experiences a hardware failure, you lose all three database nodes simultaneously.

Solution: Affinity creates a shared-fate architecture by design. Only use it when the performance benefit justifies losing all redundancy, and ensure you have backup and recovery procedures that account for total cluster failure.

Mistake 4: Assuming Soft Policies Guarantee Separation

You use soft anti-affinity expecting instances to spread across hosts. During a capacity crunch, the scheduler places multiple instances on the same node. Your application experiences correlated failures you thought were prevented.

Solution: Soft policies are best-effort only. For critical workloads requiring guaranteed separation, use hard anti-affinity. Reserve soft policies for environments where separation is preferred but not required.

Monitoring and Troubleshooting Placement Issues

Check Scheduler Logs

When instance creation fails with "No valid host found", the Nova scheduler logs explain why:

grep "No valid host" /var/log/nova/nova-scheduler.log

Common reasons include:

  • Insufficient hosts for anti-affinity policy
  • All compute nodes lacking required resources
  • Filter mismatches between instance requirements and available hosts

Audit Current Placement

Use the compute host list to verify physical placement matches your policy expectations:

for instance in $(openstack server list -f value -c ID); do
  echo "Instance: $instance"
  openstack server show "$instance" -f value -c "OS-EXT-SRV-ATTR:host"
done

This shows which physical host each instance runs on. For anti-affinity groups, no host should appear more than once.
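To flag violations automatically, pipe the collected host names through `sort | uniq -d`: any name that survives the filter hosts more than one group member. The sketch below uses sample host names in place of the loop's real output.

```shell
# Sample host column standing in for real `openstack server show` output.
# `uniq -d` prints only duplicated lines, i.e. hosts with 2+ group members.
printf '%s\n' node-a node-b node-a | sort | uniq -d
```

An empty result means every instance sits on its own host, as anti-affinity requires.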

Validate Server Group Membership

openstack server group show <group-id> -c members

Confirms which instances belong to the server group. If an instance is not a member, it will not follow the group's placement policy.
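A scripted membership check can grep that member list for an instance ID. The IDs below are placeholders standing in for real UUIDs.

```shell
# Sample member list standing in for the `-c members` output above.
members="id-111 id-222 id-333"
instance="id-444"
# $members is deliberately unquoted so it word-splits into one ID per line.
if printf '%s\n' $members | grep -qx "$instance"; then
  echo "$instance is a member"
else
  echo "$instance is NOT a member; its placement ignores the group policy"
fi
```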

When to Use Each Policy

Use hard anti-affinity when:

  • Application requires guaranteed high availability
  • Compliance mandates physical separation
  • Service-level agreements require resilience to single host failures
  • You have sufficient compute capacity to support separation

Use soft anti-affinity when:

  • Separation is preferred but not mandatory
  • Development or test environments need flexibility
  • Deployment reliability matters more than strict placement
  • Compute capacity fluctuates and you want graceful degradation

Use hard affinity when:

  • Performance requires low-latency local networking
  • Workload size fits on a single large compute node
  • Shared-fate architecture is acceptable
  • Network bandwidth between instances is a bottleneck

Use soft affinity when:

  • Performance benefits from co-location but workload tolerates distribution
  • Workload may exceed single-node capacity during scaling
  • You want performance optimization without risking deployment failures

Best Practices

Start with soft policies in development: Test your application behavior under both separated and co-located scenarios before enforcing hard policies in production.

Document your topology: Maintain clear records of which server groups exist, their policies, and which instances belong to each. This prevents confusion during incident response.

Monitor actual placement: Regularly audit that instances are placed according to policy. Soft policies may drift from intended placement over time as the infrastructure changes.

Plan capacity for N+1 redundancy: Always provision more compute nodes than your largest anti-affinity group requires. This provides operational flexibility during maintenance and scaling events.

Combine policies with availability zones: Use anti-affinity within an availability zone and distribute server groups across multiple zones for defense in depth.

Review policies during scaling: As applications grow, server group policies may need adjustment. A policy that worked for three instances may not scale to twenty.

Conclusion

Affinity and anti-affinity policies give you precise control over instance placement in OpenStack. Hard policies guarantee separation or co-location but may fail deployments when resources are constrained. Soft policies prioritize deployment success while attempting to honor placement preferences.

For production workloads requiring high availability, hard anti-affinity prevents correlated failures across your infrastructure. For performance-sensitive applications where instances communicate heavily, hard affinity optimizes local network speeds at the cost of shared-fate risks.

Understanding these tradeoffs allows you to design cloud architectures that balance availability, performance, and operational flexibility. Choose policies that align with your application's requirements and your infrastructure's capacity constraints.