| 1 | +# DASH Private Link and Private Link NSG HLD |
| 2 | + |
| 3 | +| Rev | Date | Author | Change Description | |
| 4 | +| --- | ---- | ------ | ------------------ | |
| 5 | +| 0.1 | 03/29/2024 | Riff Jiang | Initial version | |
| 6 | + |
| 7 | +1. [1. Terminology](#1-terminology) |
| 8 | +2. [2. Background](#2-background) |
| 9 | +3. [3. SDN transformation](#3-sdn-transformation) |
| 10 | + 1. [3.1. Private Link](#31-private-link) |
| 11 | + 1. [3.1.1. VM-to-PLS direction](#311-vm-to-pls-direction) |
| 12 | + 2. [3.1.2. PLS-to-VM direction](#312-pls-to-vm-direction) |
| 13 | + 2. [3.2. Private Link NSG](#32-private-link-nsg) |
| 14 | + 1. [3.2.1. VM-to-PLS direction](#321-vm-to-pls-direction) |
| 15 | + 2. [3.2.2. PLS-to-VM direction](#322-pls-to-vm-direction) |
| 16 | + 3. [3.3. Load balancer fast path support](#33-load-balancer-fast-path-support) |
| 17 | + 4. [3.4. Non-required features](#34-non-required-features) |
| 18 | +4. [4. Resource modeling, requirement, and SLA](#4-resource-modeling-requirement-and-sla) |
| 19 | + 1. [4.1. Scaling requirements](#41-scaling-requirements) |
| 20 | + 2. [4.2. Reliability requirements](#42-reliability-requirements) |
| 21 | +5. [5. SAI API design](#5-sai-api-design) |
| 22 | + 1. [5.1. DASH ENI attributes](#51-dash-eni-attributes) |
| 23 | + 2. [5.2. DASH CA-PA mapping attributes](#52-dash-ca-pa-mapping-attributes) |
| 24 | + 3. [5.3. DASH tunnel table and attributes](#53-dash-tunnel-table-and-attributes) |
| 25 | +6. [6. DASH pipeline behavior](#6-dash-pipeline-behavior) |
| 26 | + 1. [6.1. VM-to-PLS direction (Outbound)](#61-vm-to-pls-direction-outbound) |
| 27 | + 1. [6.1.1. Private Link](#611-private-link) |
| 28 | + 2. [6.1.2. Private Link NSG](#612-private-link-nsg) |
| 29 | + 2. [6.2. PLS-to-VM direction](#62-pls-to-vm-direction) |
| 30 | +7. [7. DASH database schema](#7-dash-database-schema) |
| 31 | + |
| 32 | +## 1. Terminology |
| 33 | + |
| 34 | +| Term | Explanation | |
| 35 | +| --- | --- | |
| 36 | +| PL | Private Link: <https://azure.microsoft.com/en-us/products/private-link>. | |
| 37 | +| NSG | Network Security Group. | |
| 38 | +| PE | Private endpoint. | |
| 39 | +| PLS | Private Link Service. This is the term for the private endpoint as seen from the server side. Customers can create their own private link service, then expose it to their VNETs as a private endpoint. | |
| 40 | + |
| 41 | +## 2. Background |
| 42 | + |
| 43 | +Azure Private Link provides private connectivity from a virtual network to Azure platform-as-a-service (PaaS) offerings by providing a 1-to-1 VNET mapping to the service. |
| 44 | + |
| 45 | +This document captures the requirements for implementing Private Link and Private Link NSG in the context of the DASH APIs. |
| 46 | + |
| 47 | +## 3. SDN transformation |
| 48 | + |
| 49 | +### 3.1. Private Link |
| 50 | + |
| 51 | +#### 3.1.1. VM-to-PLS direction |
| 52 | + |
| 53 | +When a packet from the VM is sent to the PLS, it is transformed as shown below: |
| 54 | + |
| 55 | + |
| 56 | + |
| 57 | +#### 3.1.2. PLS-to-VM direction |
| 58 | + |
| 59 | +The return packet from the PLS to the VM is transformed as shown below: |
| 60 | + |
| 61 | + |
| 62 | + |
| 63 | +### 3.2. Private Link NSG |
| 64 | + |
| 65 | +#### 3.2.1. VM-to-PLS direction |
| 66 | + |
| 67 | +When the NSG appliance is enabled, the VM-to-PLS packet gets an additional outer encap that tunnels the packet to the NSG appliance, as shown below: |
| 68 | + |
| 69 | + |
| 70 | + |
| 71 | +#### 3.2.2. PLS-to-VM direction |
| 72 | + |
| 73 | +The return packet is the same as in the Private Link case: it comes directly from the PLS, bypassing the NSG appliance. |
| 74 | + |
| 75 | +### 3.3. Load balancer fast path support |
| 76 | + |
| 77 | +The fast path here is not the DASH hardware fast path, but the [load balancer fast path ICMP flow redirection](../load-bal-service/fast-path-icmp-flow-redirection.md). |
| 78 | + |
| 79 | +1. If PL NSG is not used, the fast path changes the flow just as in the regular PL case. |
| 80 | +2. If PL NSG is used, the fast path updates the PL encap and **removes** the outer NSG encap. |
| 81 | + |
| 82 | +For more information on how Fast Path works, please refer to [Fast Path design doc](../load-bal-service/fast-path-icmp-flow-redirection.md). |
| 83 | + |
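As a rough illustration of the two fast path cases listed above, the sketch below models a flow's encap state and applies a redirect to it. The `flow_t` struct, its fields, and `apply_fast_path` are hypothetical names introduced only for this example and are not part of the DASH or SAI APIs. Treating "updates the PL encap" as "point the PL encap at the redirected destination" is also an assumption; the fast path design doc is the authority on what is actually rewritten.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow state; field names are illustrative only. */
typedef struct {
    uint32_t pl_underlay_dip;   /* destination IP of the PL encap (IPv4, host order) */
    bool     has_nsg_encap;     /* true while the outer NSG tunnel encap is present */
    uint32_t nsg_underlay_dip;  /* destination IP of the NSG encap, if present */
} flow_t;

/* Apply a load balancer fast path redirect to an existing flow.
 * Assumption: the redirect supplies a new destination for the PL encap.
 * When PL NSG is in use, the outer NSG encap is removed as well. */
static void apply_fast_path(flow_t *flow, uint32_t redirected_dip)
{
    assert(flow != NULL);
    flow->pl_underlay_dip = redirected_dip;
    if (flow->has_nsg_encap) {
        flow->has_nsg_encap = false;
        flow->nsg_underlay_dip = 0;
    }
}
```

For a regular PL flow only the first assignment has any effect, which matches case 1; for a PL NSG flow the function also clears the NSG encap, which matches case 2.
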
| 84 | +### 3.4. Non-required features |
| 85 | + |
| 86 | +- RST on connection idle timeout. |
| 87 | + |
| 88 | +## 4. Resource modeling, requirement, and SLA |
| 89 | + |
| 90 | +### 4.1. Scaling requirements |
| 91 | + |
| 92 | +| Metric | Requirement | |
| 93 | +| --- | --- | |
| 94 | +| # of ENIs per DPU | 32 | |
| 95 | +| # of VNET mappings per ENI | 64K | |
| 96 | +| # of PPS | 64M | |
| 97 | +| VNET mapping change rate (CRUD) | (TBD) | |
| 98 | +| # of fast path packets | Same as CPS. 3M per card. | |
| 99 | +| # of tunnels | (TBD) | |
| 100 | +| # of next hops in each tunnel | (TBD) | |
| 101 | + |
| 102 | +### 4.2. Reliability requirements |
| 103 | + |
| 104 | +Flow replication follows the SmartSwitch HA design. |
| 105 | + |
| 106 | +For more information, please refer to [SmartSwitch HA design doc](https://github.com/sonic-net/SONiC/blob/master/doc/smart-switch/high-availability/smart-switch-ha-hld.md). |
| 107 | + |
| 108 | +## 5. SAI API design |
| 109 | + |
| 110 | +The SAI API changes listed below cover only the updates needed to set up the PL / PL NSG scenarios. |
| 111 | + |
| 112 | +### 5.1. DASH ENI attributes |
| 113 | + |
| 114 | +The following attribute will be added to the ENI: |
| 115 | + |
| 116 | +| Attribute name | Type | Description | |
| 117 | +| --- | --- | --- | |
| 118 | +| SAI_ENI_ATTR_PL_UNDERLAY_SIP | sai_ip_address_t | Underlay source IP used for the private link routing type. | |
| 119 | + |
| 120 | +### 5.2. DASH CA-PA mapping attributes |
| 121 | + |
| 122 | +The following attributes will be added to the CA-to-PA entry to support service rewrites for PL / PL NSG: |
| 123 | + |
| 124 | +| Attribute name | Type | Description | |
| 125 | +| --- | --- | --- | |
| 126 | +| SAI_OUTBOUND_CA_TO_PA_ENTRY_ATTR_OVERLAY_SIP_MASK | sai_ip_address_t | Used with the overlay SIP to support source prefix rewrite. | |
| 127 | +| SAI_OUTBOUND_CA_TO_PA_ENTRY_ATTR_OVERLAY_DIP_MASK | sai_ip_address_t | Used with the overlay DIP to support destination prefix rewrite. | |
| 128 | +| SAI_OUTBOUND_CA_TO_PA_ENTRY_ATTR_TUNNEL_ID | sai_object_id_t | Specifies the tunnel. It can be a tunnel next hop ID or a tunnel ID, depending on whether multiple DIPs are required as an ECMP group. | |
| 129 | + |
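To make the mask attributes more concrete, here is a minimal sketch of a masked prefix rewrite. It assumes that bits set in the mask are taken from the overlay address and all remaining bits are kept from the original address; this merge rule is inferred from the example values in section 6.1.1, not quoted from the SAI headers. The function name `rewrite_with_mask` and the byte-array representation are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IPV6_LEN 16

/* Merge two 128-bit addresses under a mask (assumed semantics):
 * bits set in `mask` come from `overlay`, all other bits come from `orig`. */
static void rewrite_with_mask(const uint8_t orig[IPV6_LEN],
                              const uint8_t overlay[IPV6_LEN],
                              const uint8_t mask[IPV6_LEN],
                              uint8_t out[IPV6_LEN])
{
    assert(orig != NULL && overlay != NULL && mask != NULL && out != NULL);
    for (size_t i = 0; i < IPV6_LEN; i++) {
        out[i] = (uint8_t)((orig[i] & (uint8_t)~mask[i]) | (overlay[i] & mask[i]));
    }
}
```

Under these assumptions, and further assuming the VM's IPv4 address `10.0.0.1` is placed in the low 32 bits of the 128-bit value, an overlay SIP of `9988::` with the 96-bit mask `FFFF:FFFF:FFFF:FFFF:FFFF:FFFF::` yields `9988::a00:1`. An all-ones mask replaces the whole address with the overlay value.
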
| 130 | +PL and PL NSG share the same routing type on the CA-PA mapping entry: |
| 131 | + |
| 132 | +```c |
| 133 | +typedef enum _sai_outbound_ca_to_pa_entry_action_t |
| 134 | +{ |
| 135 | + // ... |
| 136 | + |
| 137 | + SAI_OUTBOUND_CA_TO_PA_ENTRY_ACTION_SET_PRIVATE_LINK_MAPPING, |
| 138 | + |
| 139 | + // ... |
| 140 | +} sai_outbound_ca_to_pa_entry_action_t; |
| 141 | +``` |
| 142 | + |
| 143 | +### 5.3. DASH tunnel table and attributes |
| 144 | + |
| 145 | +A new tunnel next hop table needs to be added with the following attributes: |
| 146 | + |
| 147 | +| Attribute name | Type | Description | |
| 148 | +| --- | --- | --- | |
| 149 | +| SAI_DASH_TUNNEL_ENTRY_ATTR_DASH_ENCAPSULATION | sai_dash_encapsulation_t | Encapsulation type, such as VxLan, NvGRE. Optional. If not specified, the encap from tunnel will be used. | |
| 150 | +| SAI_DASH_TUNNEL_ENTRY_ATTR_VNI | sai_uint32_t | VNI used in the encap. Optional. If not specified, the VNI from tunnel will be used. | |
| 151 | +| SAI_DASH_TUNNEL_NEXT_HOP_ENTRY_ATTR_DIP | sai_ip_address_t | Destination IP of the next hop. | |
| 152 | + |
| 153 | +When multiple destination IPs are required as an ECMP group, the tunnel table and tunnel member table are used to specify a tunnel with multiple next hops: |
| 154 | + |
| 155 | +- A new tunnel table needs to be added with the following attributes: |
| 156 | + |
| 157 | + | Attribute name | Type | Description | |
| 158 | + | --- | --- | --- | |
| 159 | + | SAI_DASH_TUNNEL_ENTRY_ATTR_DASH_ENCAPSULATION | sai_dash_encapsulation_t | Encapsulation type, such as VxLan, NvGRE. | |
| 160 | + | SAI_DASH_TUNNEL_ENTRY_ATTR_VNI | sai_uint32_t | VNI used in the encap. | |
| 161 | + |
| 162 | +- A new tunnel member table needs to be added to create the bindings between tunnel and next hop: |
| 163 | + |
| 164 | + | Attribute name | Type | Description | |
| 165 | + | --- | --- | --- | |
| 166 | + | SAI_DASH_TUNNEL_MEMBER_ENTRY_ATTR_TUNNEL_ID | sai_object_id_t | Tunnel Id | |
| 167 | + | SAI_DASH_TUNNEL_MEMBER_ENTRY_ATTR_TUNNEL_NEXT_HOP_ID | sai_object_id_t | Tunnel next hop id | |
| 168 | + |
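The relationship between the tunnel, its members, and the next hops can be sketched as follows. All type and function names here (`tunnel_t`, `tunnel_next_hop_t`, `pick_member`, `resolve_encap`) are hypothetical stand-ins for the SAI objects above. The fallback from next hop to tunnel follows the "Optional" notes in the next hop table; choosing a member by hash modulo is an assumption, since this document does not define the ECMP selection method.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for sai_dash_encapsulation_t. */
typedef enum { ENCAP_VXLAN, ENCAP_NVGRE } encap_t;

/* Tunnel-level defaults. */
typedef struct {
    encap_t  encap;
    uint32_t vni;
} tunnel_t;

/* Next hop with optional per-hop overrides of encap and VNI. */
typedef struct {
    bool     has_encap;
    encap_t  encap;
    bool     has_vni;
    uint32_t vni;
    uint32_t dip;   /* IPv4 destination, host byte order */
} tunnel_next_hop_t;

/* Fully resolved encap parameters for one packet. */
typedef struct {
    encap_t  encap;
    uint32_t vni;
    uint32_t dip;
} resolved_encap_t;

/* Pick one member of the ECMP group (hash-modulo is an assumption). */
static const tunnel_next_hop_t *pick_member(const tunnel_next_hop_t *members,
                                            size_t count, uint32_t flow_hash)
{
    assert(members != NULL && count > 0);
    return &members[flow_hash % count];
}

/* Use the next hop's encap and VNI when set, else fall back to the tunnel's. */
static resolved_encap_t resolve_encap(const tunnel_t *tunnel,
                                      const tunnel_next_hop_t *nh)
{
    assert(tunnel != NULL && nh != NULL);
    resolved_encap_t r;
    r.encap = nh->has_encap ? nh->encap : tunnel->encap;
    r.vni   = nh->has_vni ? nh->vni : tunnel->vni;
    r.dip   = nh->dip;
    return r;
}
```

Keeping the overrides on the next hop lets a single tunnel carry shared defaults while individual members still differ where needed.
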
| 169 | +## 6. DASH pipeline behavior |
| 170 | + |
| 171 | +### 6.1. VM-to-PLS direction (Outbound) |
| 172 | + |
| 173 | +The VM-to-PLS direction is modeled as the outbound pipeline in DASH. |
| 174 | + |
| 175 | +To demonstrate how the DASH pipeline works, suppose we have a VM with IP 10.0.0.1 trying to reach a Private Endpoint in its VNET with IP 10.0.1.1, and the VM outbound VNI is 1000000. |
| 176 | + |
| 177 | +#### 6.1.1. Private Link |
| 178 | + |
| 179 | +For private link, the packet goes through the pipeline with the following setup: |
| 180 | + |
| 181 | +1. **Direction Lookup**: First, we look up the VNI to determine the packet direction. In this case, we treat all packets from the on-premises network as outbound from the floating NIC's perspective. |
| 182 | + |
| 183 | + | SAI field name | Type | Value | |
| 184 | + | --- | --- | --- | |
| 185 | + | entry.vni | `sai_uint32_t` | `1000000` | |
| 186 | + | entry_attr.SAI_DIRECTION_LOOKUP_ENTRY_ATTR_ACTION | `sai_direction_lookup_entry_action_t` | `SAI_DIRECTION_LOOKUP_ENTRY_ACTION_SET_OUTBOUND_DIRECTION` | |
| 187 | + |
| 188 | +2. **ENI Lookup**: Next, we use the inner MAC address to find the ENI pipeline. The outer encap is then decapsulated, leaving the inner packet to go through the rest of the pipeline. |
| 189 | + |
| 190 | + First, we use the ENI ether address map table to find the ENI ID: |
| 191 | + |
| 192 | + | SAI field name | Type | Value | |
| 193 | + | --- | --- | --- | |
| 194 | + | entry.address | `sai_mac_t` | `11-22-33-44-55-66` | |
| 195 | + | entry_attr.SAI_ENI_ETHER_ADDRESS_MAP_ENTRY_ATTR_ENI_ID | `sai_object_id_t` | (SAI object ID of the ENI) | |
| 196 | + |
| 197 | + Then, we use the ENI id to find the ENI, which contains the PL underlay source IP as below: |
| 198 | + |
| 199 | + | SAI field name | Type | Value | |
| 200 | + | --- | --- | --- | |
| 201 | + | entry_attr.SAI_ENI_ATTR_PL_UNDERLAY_SIP | `sai_ip_address_t` | `2.2.2.1` | |
| 202 | + |
| 203 | +3. **Conntrack Lookup**: If the flow already exists, we directly apply the transformation from the flow; otherwise, we move on. |
| 204 | +4. **ACL**: The ACL stage is unchanged and works just as in the other cases. |
| 205 | +5. **Routing**: The inner destination IP (a.k.a. overlay DIP) is used to find the route entry. This triggers the maprouting action, which sends the packet into the Mapping stage. |
| 206 | + |
| 207 | + The routing stage can also define an underlay source IP, but `PL_UNDERLAY_SIP` takes precedence whenever the routing type is set to `privatelink`. |
| 208 | + |
| 209 | + The outbound routing entry looks like this: |
| 210 | + |
| 211 | + | SAI field name | Type | Value | |
| 212 | + | --- | --- | --- | |
| 213 | + | entry.eni_id | `sai_object_id_t` | (SAI object ID of the ENI) | |
| 214 | + | entry.destination | `sai_ip_prefix_t` | `10.0.1.0/24` | |
| 215 | + | entry_attr.SAI_OUTBOUND_ROUTING_ENTRY_ATTR_ACTION | `sai_outbound_routing_entry_action_t` | `SAI_OUTBOUND_ROUTING_ENTRY_ACTION_ROUTE_VNET` | |
| 216 | + | entry_attr.SAI_OUTBOUND_ROUTING_ENTRY_ATTR_DST_VNET_ID | `sai_object_id_t` | (SAI object ID of the destination VNET) | |
| 217 | + | entry_attr.SAI_OUTBOUND_ROUTING_ENTRY_ATTR_METER_CLASS | `sai_uint16_t` | `60000` | |
| 218 | + |
| 219 | +6. **Mapping - VNET**: The inner destination IP is used to find the outbound CA-PA mapping entry, whose routing type is set to private link. |
| 220 | + |
| 221 | + | SAI field name | Type | Value | |
| 222 | + | --- | --- | --- | |
| 223 | + | entry.dst_vnet_id | `sai_object_id_t` | (SAI object ID of the destination VNET) | |
| 224 | + | entry.dip | `sai_ip_address_t` | `10.0.1.1` | |
| 225 | + | entry_attr.SAI_OUTBOUND_CA_TO_PA_ENTRY_ATTR_ACTION | `sai_outbound_ca_to_pa_entry_action_t` | `SAI_OUTBOUND_CA_TO_PA_ENTRY_ACTION_SET_PRIVATE_LINK_MAPPING` | |
| 226 | + | entry_attr.SAI_OUTBOUND_CA_TO_PA_ENTRY_ATTR_UNDERLAY_DIP | `sai_ip_address_t` | `3.3.3.1` | |
| 227 | + | entry_attr.SAI_OUTBOUND_CA_TO_PA_ENTRY_ATTR_OVERLAY_DMAC | `sai_mac_t` | `99-88-77-66-55-44` | |
| 228 | + | entry_attr.SAI_OUTBOUND_CA_TO_PA_ENTRY_ATTR_OVERLAY_SIP | `sai_ip_address_t` | `9988::` | |
| 229 | + | entry_attr.SAI_OUTBOUND_CA_TO_PA_ENTRY_ATTR_OVERLAY_SIP_MASK | `sai_ip_address_t` | `FFFF:FFFF:FFFF:FFFF:FFFF:FFFF::` | |
| 230 | + | entry_attr.SAI_OUTBOUND_CA_TO_PA_ENTRY_ATTR_OVERLAY_DIP | `sai_ip_address_t` | `1122:3344:5566:7788::303:301/128` | |
| 231 | + | entry_attr.SAI_OUTBOUND_CA_TO_PA_ENTRY_ATTR_OVERLAY_DIP_MASK | `sai_ip_address_t` | `FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF` | |
| 232 | + | entry_attr.SAI_OUTBOUND_ROUTING_ENTRY_ATTR_METER_CLASS | `sai_uint16_t` | `60001` | |
| 233 | + |
| 234 | +7. **Metering**: The last lookup finds the corresponding metering rule. |
| 235 | +8. **Conntrack Update**: Both the forward and reverse flows are created in this stage. |
| 236 | +9. **Metering Update**: The metering counter is updated based on the rules found earlier. |
| 237 | +10. **Underlay routing**: Underlay routing moves the packet to the right port and forwards it out. |
| 238 | + |
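The underlay source IP precedence described in the Routing step can be sketched as a small selection function. The names `eni_t`, `routing_type_t`, and `select_underlay_sip` are hypothetical and exist only for this illustration; IPv4 addresses are held as host-order integers for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative routing types; only the private link case matters here. */
typedef enum { ROUTING_TYPE_VNET, ROUTING_TYPE_PRIVATELINK } routing_type_t;

/* Hypothetical view of the ENI, holding only the PL underlay source IP. */
typedef struct {
    uint32_t pl_underlay_sip;
} eni_t;

/* Choose the underlay source IP for the outer header.
 * For the private link routing type the ENI's PL underlay SIP wins;
 * otherwise the value defined by the routing stage is used. */
static uint32_t select_underlay_sip(const eni_t *eni, routing_type_t type,
                                    uint32_t route_underlay_sip)
{
    assert(eni != NULL);
    return (type == ROUTING_TYPE_PRIVATELINK) ? eni->pl_underlay_sip
                                              : route_underlay_sip;
}
```

With the ENI value `2.2.2.1` from the ENI Lookup step, a private link mapping would use `2.2.2.1` as the underlay source IP regardless of what the routing entry carries.
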
| 239 | +#### 6.1.2. Private Link NSG |
| 240 | + |
| 241 | +The changes needed for PL NSG are mostly the same as for PL. In addition, the VNET mapping must provide the extra tunnel info: |
| 242 | + |
| 243 | +| SAI field name | Type | Value | |
| 244 | +| --- | --- | --- | |
| 245 | +| entry_attr.SAI_OUTBOUND_CA_TO_PA_ENTRY_ATTR_TUNNEL_ID | `sai_object_id_t` | (SAI object ID of the NSG tunnel) | |
| 246 | + |
| 247 | +The tunnel next hop table then specifies the tunnel information: |
| 248 | + |
| 249 | +| SAI field name | Type | Value | |
| 250 | +| --- | --- | --- | |
| 251 | +| entry_attr.SAI_DASH_TUNNEL_ENTRY_ATTR_DASH_ENCAPSULATION | `sai_dash_encapsulation_t` | `SAI_DASH_ENCAPSULATION_VXLAN` | |
| 252 | +| entry_attr.SAI_DASH_TUNNEL_ENTRY_ATTR_VNI | `sai_uint32_t` | `2000000` | |
| 253 | +| entry_attr.SAI_DASH_TUNNEL_NEXT_HOP_ENTRY_ATTR_DIP | `sai_ip_address_t` | `100.0.1.1` | |
| 254 | + |
| 255 | +### 6.2. PLS-to-VM direction |
| 256 | + |
| 257 | +On the return path, we will leverage the reverse flow that is created by the outbound side to process the packet and forward it back to the original source. |
| 258 | + |
| 259 | +Because the packet sent back to the VM in the PL NSG scenario is exactly the same as in regular PL, and the reverse flow created in the PL NSG scenario is also the same, nothing needs to change for the PL NSG case. |
| 260 | + |
| 261 | +The packet will go through the DASH pipeline as below: |
| 262 | + |
| 263 | +1. **Direction Lookup**: First, we use the VNI to determine the packet direction. Since the Private Link key is not in the outbound VNI list, we treat all packets from the PLS side as inbound. |
| 264 | + |
| 265 | +2. **ENI Lookup**: We use the inner destination MAC address to find the ENI pipeline. Once found, the outer encap is decapsulated, exposing the inner packet for later processing. |
| 266 | + |
| 267 | + The ENI entry is the same as before and is therefore omitted here. |
| 268 | + |
| 269 | +3. **Conntrack Lookup**: The return packet transformation is handled by the reverse flow. |
| 270 | +4. **Metering Update**: The metering counter is updated based on the rules saved in the reverse flow. |
| 271 | +5. **Underlay routing**: Underlay routing moves the packet to the right port and forwards it out. |
| 272 | + |
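The direction decision used in both walkthroughs can be sketched as a membership check against the outbound VNI list. The names `direction_t` and `lookup_direction` are hypothetical, and the linear scan stands in for what would be a table lookup in the real pipeline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { DIRECTION_INBOUND, DIRECTION_OUTBOUND } direction_t;

/* Classify a packet by its outer VNI: a VNI present in the outbound VNI
 * list means outbound; anything else (such as the Private Link key on
 * return traffic) is treated as inbound. */
static direction_t lookup_direction(const uint32_t *outbound_vnis, size_t count,
                                    uint32_t vni)
{
    assert(outbound_vnis != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (outbound_vnis[i] == vni) {
            return DIRECTION_OUTBOUND;
        }
    }
    return DIRECTION_INBOUND;
}
```

With the single outbound VNI `1000000` from section 6.1, packets from the VM classify as outbound, while return packets from the PLS side carry a different key and fall through to inbound.
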
| 273 | +## 7. DASH database schema |
| 274 | + |
| 275 | +For the DASH database schema, please refer to the [SONIC-DASH HLD](https://github.com/sonic-net/SONiC/blob/master/doc/dash/dash-sonic-hld.md). |