Fix IP Allocation Bug: Reserved Range Not Detected #2657
src/bosh-director/lib/bosh/director/deployment_plan/ip_provider/ip_repo.rb
…r/ip_repo.rb Co-authored-by: Ivaylo Ivanov <ivaylogi98@gmail.com>
…_provider/ip_repo.rb" This reverts commit 6f689a9.
# Sort by address first, then by prefix (smaller prefix = larger block = earlier)
sorted_ips = addresses_we_cant_allocate.sort_by { |ip| [ip.to_i, ip.prefix] }

# Remove IPs contained within larger CIDR blocks
sorted_ips = sorted_ips.reject.with_index do |ip, index|
  sorted_ips[0...index].any? do |other_ip|
    other_ip.prefix < ip.prefix && other_ip.include?(ip)
  rescue StandardError
    false
Why is this whole change needed? Is it for performance reasons? Logic-wise, it seems to produce the same result as the previous code, or am I wrong?
The idea of the sorting is to potentially avoid comparisons like this:
10.0.11.32/32.include?(10.0.11.32/30)
That is the main change, along with the two-way comparison further down:
blocking_ip = filtered_ips.find do |ip|
  current_prefix.include?(ip) || ip.include?(current_prefix)
end
I am not confident enough in the "fix" and we haven't been able to isolate the issue well enough. I will move the PR to draft for now.
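The difference between the one-way and the two-way comparison discussed above can be reproduced in isolation. The following is only a sketch: it uses Ruby's standard IPAddr class rather than BOSH's own IP classes, and contains? and cidr_overlap? are hypothetical helper names that do not appear in ip_repo.rb. Containment is computed from the range bounds so the result does not depend on how a particular IPAddr version implements include?.

```ruby
require 'ipaddr'

# Hypothetical helper: true when every address of `inner` lies inside `outer`.
def contains?(outer, inner)
  o = outer.to_range
  i = inner.to_range
  o.begin.to_i <= i.begin.to_i && i.end.to_i <= o.end.to_i
end

# Hypothetical helper: the two-way comparison, mirroring
# `current_prefix.include?(ip) || ip.include?(current_prefix)`.
def cidr_overlap?(a, b)
  contains?(a, b) || contains?(b, a)
end

candidate = IPAddr.new('10.0.11.32/32') # single address being allocated
reserved  = IPAddr.new('10.0.11.32/30') # reserved block of four addresses

puts contains?(candidate, reserved)     # one-way check: the /32 cannot contain the /30
puts cidr_overlap?(candidate, reserved) # two-way check: the /30 contains the /32
```

With only the one-way check, the reserved /30 never blocks the /32 candidate; the two-way check reports the overlap.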
sorted_restricted_ips = restricted_ips.to_a.sort_by { |ip| [ip.to_i, ip.prefix] }

deduplicated_ips = sorted_restricted_ips.reject.with_index do |ip, index|
  sorted_restricted_ips[0...index].any? do |other_ip|
    other_ip.prefix < ip.prefix && other_ip.include?(ip)
  rescue StandardError
    false
Same as above. Why is this needed?
Moving to draft for the moment. Needs further attention.
@neddp I think you discovered a legitimate bug. We have noticed sporadic deployment failures of the following kind:
{
  "error": {
    "code": "PrivateIPAddressInReservedRange",
    "message": "Private static IP address 192.168.11.2 falls within reserved IP range of subnet prefix 192.168.11.0/24.",
    "details": []
  }
}
The subnet prefix in this example belonged to the following bosh network:
networks:
- name: compilation_network
  subnets:
  - az: z1
    cloud_properties:
      application_security_groups:
      - asg-bosh
      subnet_name: subnet-bosh-z1
      virtual_network_name: vnet-bosh
    dns:
    - 168.63.129.16
    gateway: 192.168.11.1
    range: 192.168.11.0/24
    reserved:
    - 192.168.11.0 - 192.168.11.141
    - 192.168.11.242 - 192.168.11.255
If you look at the first reserved range (
Do you plan to continue to work on this change?
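For reference, a quick way to confirm that the address from the error message lies inside the reserved ranges of the manifest above. This is a standalone sketch using Ruby's standard IPAddr class; parse_reserved_range and in_reserved_range? are hypothetical helper names, not BOSH code.

```ruby
require 'ipaddr'

# Hypothetical helper: turns "first - last" (as written under `reserved:`)
# into an integer range of addresses.
def parse_reserved_range(text)
  first, last = text.split('-').map { |part| IPAddr.new(part.strip).to_i }
  (first..last)
end

# Hypothetical helper: true when `address` falls in any reserved entry.
def in_reserved_range?(address, reserved_entries)
  value = IPAddr.new(address).to_i
  reserved_entries.any? { |entry| parse_reserved_range(entry).cover?(value) }
end

reserved = ['192.168.11.0 - 192.168.11.141', '192.168.11.242 - 192.168.11.255']

puts in_reserved_range?('192.168.11.2', reserved)   # address from the error message
puts in_reserved_range?('192.168.11.200', reserved) # address between the two ranges
```

The first call reports that 192.168.11.2 is covered by the first reserved range; the second address lies between the two ranges and is not reserved.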
s4heid
left a comment
Perhaps it would also make sense to add an integration / regression test so that something critical like this does not slip through again.
deduplicated_ips = sorted_restricted_ips.reject.with_index do |ip, index|
  sorted_restricted_ips[0...index].any? do |other_ip|
    other_ip.prefix < ip.prefix && other_ip.include?(ip)
  rescue StandardError
This is very broad. Perhaps rescuing IPAddr::InvalidAddressError would be sufficient?
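A narrower rescue along the lines suggested here could look like the sketch below. It relies on Ruby's standard IPAddr, which raises IPAddr::InvalidAddressError when a string cannot be parsed as an address. Whether the objects handled in ip_repo.rb raise this same error class is an assumption, and safe_include? is a hypothetical helper name.

```ruby
require 'ipaddr'

# Hypothetical helper: treats an unparsable address as "not included",
# but lets every other kind of error propagate to the caller.
def safe_include?(block, other)
  block.include?(other)
rescue IPAddr::InvalidAddressError
  false
end

block = IPAddr.new('10.0.11.32/30')

puts safe_include?(block, '10.0.11.33') # valid address inside the block
puts safe_include?(block, 'not-an-ip')  # parsing fails, rescued as false
```

Compared with rescuing StandardError, this keeps unrelated failures such as a NoMethodError from a typo visible instead of silently turning them into false.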
Hi @s4heid, We plan on continuing to work on it, but any help is appreciated in the meantime. Thanks!

What is this change about?
This PR fixes a bug where BOSH allocates IPs from reserved CIDR ranges, causing CPI failures with "Address is in subnet's reserved address range" errors.

Root Cause: The overlap check in find_next_available_ip was unidirectional. It only checked whether the candidate IP includes a blocking IP, not whether a blocking IP includes the candidate.

When allocating a /32 (single IP) and the reserved range is a /30 (4 IPs), the check candidate.include?(blocking) always returns false because a smaller block cannot include a larger one. The reserved range was invisible to the algorithm.

The Bug
Example:
Candidate: 10.0.11.32/32
Reserved range: 10.0.11.32/30
Check: 10.0.11.32/32.include?(10.0.11.32/30) → false

The Fix
Now 10.0.11.32/30.include?(10.0.11.32/32) → true → correctly blocked.

Additional Improvement: Deterministic Deduplication
The deduplication logic that removes redundant IPs (e.g., /32 entries already covered by a /30 block) has been refactored to sort IPs before processing. This ensures deterministic behavior regardless of Set iteration order, making the code easier to test and debug.

What tests have you run?

Release notes
Fixed: IP allocation now correctly detects reserved CIDR ranges. Previously, when allocating single IPs (/32) from a subnet with reserved ranges specified as larger blocks (e.g., /30), the algorithm failed to detect the overlap and attempted to allocate reserved IPs, causing CPI errors.

Breaking change?
No. This is a bug fix.
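The sort-then-deduplicate step described under "Additional Improvement" can be tried out on its own. The following is a sketch under stated assumptions, not the code from the PR: it uses Ruby's standard IPAddr instead of BOSH's IP classes, the names covers?, deduplicate and cidr_strings are hypothetical, and each block is compared only against blocks already kept rather than against every earlier entry, which gives the same result because a dropped entry is always covered by a kept, larger block.

```ruby
require 'ipaddr'
require 'set'

# Hypothetical helper: true when `outer` spans every address of `inner`.
def covers?(outer, inner)
  o = outer.to_range
  i = inner.to_range
  o.begin.to_i <= i.begin.to_i && i.end.to_i <= o.end.to_i
end

# Sort by network address, then by prefix length, so a larger block always
# comes before the smaller blocks it contains; then drop covered entries.
def deduplicate(blocks)
  sorted = blocks.sort_by { |ip| [ip.to_i, ip.prefix] }
  sorted.each_with_object([]) do |ip, kept|
    covered = kept.any? { |other| other.prefix < ip.prefix && covers?(other, ip) }
    kept << ip unless covered
  end
end

# Renders blocks as "address/prefix" for display.
def cidr_strings(blocks)
  blocks.map { |ip| "#{ip}/#{ip.prefix}" }
end

restricted = Set.new(
  %w[10.0.11.33/32 10.0.11.32/30 10.0.11.40/32 10.0.11.32/32].map { |s| IPAddr.new(s) }
)

puts cidr_strings(deduplicate(restricted)).inspect
# prints ["10.0.11.32/30", "10.0.11.40/32"]
```

Because the input is sorted first, the result is the same whatever order the Set yields its elements in, which is what makes the step straightforward to assert on in a unit test.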