Neutron L3 HA: an application analysis of ECMP with BFD

1. Neutron L3 HA implementation topology

Reference: https://docs.openstack.org/networking-ovn/latest/admin/routing.html


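A BFD session has two ends. One end is the LRP (logical router port), which can float between multiple gateway chassis nodes. The other end can be a NIC or a bridge on a physical node (with bfdd implementing that end of the BFD session).

L3 HA therefore means that ECMP-based (or BGP-based) routing changes routes or makes routing decisions according to BFD link detection. HA has to cover both the router's LRP and the LRP's peer (the bridge or NIC on the node).

A bridge exposes the host to the outside, whereas a NIC inside a network namespace (an internal port) does not.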

A reliable, full explanation of the L3 HA implemented by Neutron OVN: https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html-single/networking_with_open_virtual_network/index#:~:text=to%20master.-,2.5.%20Layer%203%20high%20availability%20with%20OVN,-OVN%20supports%20Layer

2.5. Layer 3 high availability with OVN
OVN supports Layer 3 high availability (L3 HA) without any special configuration. OVN automatically schedules the router port to all available gateway nodes that can act as an L3 gateway on the specified external network. OVN L3 HA uses the gateway_chassis column in the OVN Logical_Router_Port table. Most functionality is managed by OpenFlow rules with bundled active_passive outputs. The ovn-controller handles the Address Resolution Protocol (ARP) responder and router enablement and disablement. Gratuitous ARPs for FIPs and router external addresses are also periodically sent by the ovn-controller.

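OVN supports HA for L3 LRPs without any extra configuration: it automatically schedules the LRP onto the healthy gateway nodes. The gateway_chassis column of the Logical_Router_Port table records the gateway chassis that can host the LRP. The feature is implemented with OpenFlow rules combined with bundled active_passive output port lists.

ovn-controller decides whether the LRP answers ARP requests, and it periodically sends gratuitous ARPs for FIPs and router external addresses.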

Note
L3HA uses OVN to balance the routers back to the original gateway nodes to avoid any nodes becoming a bottleneck.

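L3 HA uses OVN to balance public traffic across the gateway nodes. If a gateway node fails and later recovers, traffic automatically switches back to the original gateway node.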
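
The failover and failback behaviour described above can be sketched as a priority-based choice among reachable gateway chassis. This is a simplified model for illustration, not OVN's code: the GatewayChassis class, the chassis names, and the pick_active_gateway helper are hypothetical, and it assumes that a higher priority value means a more preferred chassis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GatewayChassis:
    name: str       # hypothetical chassis name
    priority: int   # assumption: higher value = more preferred
    alive: bool     # whether the chassis is currently considered reachable


def pick_active_gateway(candidates):
    """Return the name of the highest-priority reachable chassis, or None."""
    reachable = [c for c in candidates if c.alive]
    if not reachable:
        return None
    return max(reachable, key=lambda c: c.priority).name


healthy = [
    GatewayChassis("gw-1", priority=3, alive=True),
    GatewayChassis("gw-2", priority=2, alive=True),
    GatewayChassis("gw-3", priority=1, alive=True),
]
# Same set of chassis, but gw-1 has failed.
degraded = [GatewayChassis("gw-1", priority=3, alive=False)] + healthy[1:]

before_failure = pick_active_gateway(healthy)
during_failure = pick_active_gateway(degraded)
after_recovery = pick_active_gateway(healthy)
print(before_failure, during_failure, after_recovery)
```

Because the choice depends only on priority and reachability, the router moves to the next chassis while gw-1 is down and returns to gw-1 as soon as it is reachable again.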

BFD monitoring

OVN uses the Bidirectional Forwarding Detection (BFD) protocol to monitor the availability of the gateway nodes. This protocol is encapsulated on top of the Geneve tunnels established from node to node.

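OVN uses the BFD protocol to monitor the availability of all gateway nodes. The protocol is encapsulated on top of the Geneve tunnels, and the nodes probe each other over them.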

Each gateway node monitors all the other gateway nodes in a star topology in the deployment. Gateway nodes also monitor the compute nodes to let the gateways enable and disable routing of packets and ARP responses and announcements.

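In a deployment, every gateway node monitors all other gateway nodes in a star topology. Gateway nodes also monitor the compute nodes, so that the gateways can enable or disable packet routing as well as ARP responses and announcements.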

Each compute node uses BFD to monitor each gateway node and automatically steers external traffic, such as source and destination Network Address Translation (SNAT and DNAT), through the active gateway node for a given router. Compute nodes do not need to monitor other compute nodes.

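Each compute node also uses BFD to monitor every gateway node and automatically steers external traffic, such as SNAT and DNAT traffic. For a given router, this traffic can only leave through the router's LRP on the active gateway node. Compute nodes do not monitor other compute nodes.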
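
The monitoring relationships described above can be enumerated with a few lines of Python. This is only a sketch under the rules stated in the quoted text (gateways monitor each other and the compute nodes, compute nodes monitor gateways only); the node names and the bfd_sessions helper are made up for illustration.

```python
from itertools import combinations


def bfd_sessions(gateways, computes):
    """Return the set of node pairs that run a BFD session.

    Each pair is a frozenset because a session is symmetric.
    """
    sessions = set()
    # Every gateway monitors every other gateway.
    for a, b in combinations(gateways, 2):
        sessions.add(frozenset((a, b)))
    # Every gateway/compute pair has a session; compute nodes never pair up.
    for gw in gateways:
        for cn in computes:
            sessions.add(frozenset((gw, cn)))
    return sessions


sessions = bfd_sessions(["gw-1", "gw-2"], ["cn-1", "cn-2", "cn-3"])
print(len(sessions))
```

With G gateways and C compute nodes this yields G*(G-1)/2 + G*C sessions, so the session count grows with the number of gateways rather than with the square of the total node count.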

Note
External network failures are not detected as would happen with an ML2-OVS configuration.

Unlike an ML2-OVS configuration, external network failures cannot be detected.

L3 HA for OVN supports the following failure modes:

  1. The gateway node becomes disconnected from the network (tunneling interface).
  2. ovs-vswitchd stops (ovs-vswitchd is responsible for BFD signaling).
  3. ovn-controller stops (ovn-controller removes itself as a registered node).

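L3 HA tolerates the following failure modes:

  1. The gateway node loses connectivity (tunnel interface failure).
  2. The ovs-vswitchd process goes down. This service is responsible for BFD signaling.
  3. The ovn-controller process goes down. This service removes itself from the registered nodes (the records related to the hypervisor's gateway chassis).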
    Note
    This BFD monitoring mechanism only works for link failures, not for routing failures.

The BFD mechanism only detects link failures, not routing failures.

1.1 Analysis of the Neutron OVN feature list

Source (GitHub): neutron\doc\source\admin\ovn\features.rst

Open Virtual Network (OVN) offers the following virtual network
services:

  • Layer-2 (switching)

    Native implementation. Replaces the conventional Open vSwitch (OVS)
    agent.

  • Layer-3 (routing)

    Native implementation that supports distributed routing. Replaces the
    conventional Neutron L3 agent. This includes transparent L3HA :doc:`routing`
    support, based on BFD monitorization integrated in core OVN.

As this shows, Neutron's "transparent L3 HA" is built on the BFD monitoring integrated into core OVN.

  • DHCP

    Native distributed implementation. Replaces the conventional Neutron DHCP
    agent. Note that the native implementation does not yet support DNS
    features.

  • DPDK

    OVN and the OVN mechanism driver may be used with OVS using either the Linux
    kernel datapath or the DPDK datapath.

  • Trunk driver

    Uses OVN's functionality of parent port and port tagging to support trunk
    service plugin. One has to enable the 'trunk' service plugin in neutron
    configuration files to use this feature.

  • VLAN tenant networks

    The OVN driver does support VLAN tenant networks when used
    with OVN version 2.11 (or higher).

  • DNS

    Native implementation. Since the version 2.8 OVN contains a built-in
    DNS implementation.

  • Port Forwarding

    The OVN driver supports port forwarding as an extension of floating
    IPs. Enable the 'port_forwarding' service plugin in neutron configuration
    files to use this feature.

  • Packet Logging

    Packet logging service is designed as a Neutron plug-in that captures network
    packets for relevant resources when the registered events occur. OVN supports
    this feature based on security groups.

  • Segments

    Allows for Network segments ranges to be used with OVN. Requires OVN
    version 20.06 or higher.

.. TODO What about tenant networks?

  • Routed provider networks

    Allows for multiple localnet ports to be attached to a single Logical
    Switch entry. This work also assumes that only a single localnet
    port (of the same Logical Switch) is actually mapped to a given
    hypervisor. Requires OVN version 20.06 or higher.


+----------------------------------+---------------------------------+
| Extension Name                   | Extension Alias                 |
+==================================+=================================+
| Allowed Address Pairs            | allowed-address-pairs           |
+----------------------------------+---------------------------------+
| Auto Allocated Topology Services | auto-allocated-topology         |
+----------------------------------+---------------------------------+
| Availability Zone                | availability_zone               |
+----------------------------------+---------------------------------+
| Default Subnetpools              | default-subnetpools             |
+----------------------------------+---------------------------------+
| DNS Integration                  | dns-integration                 |
+----------------------------------+---------------------------------+
| DNS domain for ports             | dns-domain-ports                |
+----------------------------------+---------------------------------+
| DNS domain names with keywords   | dns-integration-domain-keywords |
+----------------------------------+---------------------------------+
| Subnet DNS publish fixed IP      | subnet-dns-publish-fixed-ip     |
+----------------------------------+---------------------------------+
| Multi Provider Network           | multi-provider                  |
+----------------------------------+---------------------------------+
| Network IP Availability          | network-ip-availability         |
+----------------------------------+---------------------------------+
| Network Segment                  | segment                         |
+----------------------------------+---------------------------------+
| Neutron external network         | external-net                    |
+----------------------------------+---------------------------------+
| Neutron Extra DHCP opts          | extra_dhcp_opt                  |
+----------------------------------+---------------------------------+
| Neutron Extra Route              | extraroute                      |
+----------------------------------+---------------------------------+
| Neutron L3 external gateway      | ext-gw-mode                     |
+----------------------------------+---------------------------------+
| Neutron L3 Router                | router                          |
+----------------------------------+---------------------------------+
| Network MTU                      | net-mtu                         |
+----------------------------------+---------------------------------+
| Packet Logging                   | logging                         |
+----------------------------------+---------------------------------+
| Port Binding                     | binding                         |
+----------------------------------+---------------------------------+
| Port Bindings Extended           | binding-extended                |
+----------------------------------+---------------------------------+
| Port Forwarding                  | port_forwarding                 |
+----------------------------------+---------------------------------+
| Port MAC address Regenerate      | port-mac-address-regenerate     |
+----------------------------------+---------------------------------+
| Port Security                    | port-security                   |
+----------------------------------+---------------------------------+
| Provider Network                 | provider                        |
+----------------------------------+---------------------------------+
| Quality of Service               | qos                             |
+----------------------------------+---------------------------------+
| Quota management support         | quotas                          |
+----------------------------------+---------------------------------+
| RBAC Policies                    | rbac-policies                   |
+----------------------------------+---------------------------------+
| Resource revision numbers        | standard-attr-revisions         |
+----------------------------------+---------------------------------+
| security-group                   | security-group                  |
+----------------------------------+---------------------------------+
| standard-attr-description        | standard-attr-description       |
+----------------------------------+---------------------------------+
| Subnet Allocation                | subnet_allocation               |
+----------------------------------+---------------------------------+
| Subnet service types             | subnet-service-types            |
+----------------------------------+---------------------------------+
| Tag support                      | standard-attr-tag               |
+----------------------------------+---------------------------------+
| Time Stamp Fields                | standard-attr-timestamp         |
+----------------------------------+---------------------------------+

2. How Neutron supports BFD

Neutron's proposal for BFD support is recorded here (2021-04-27): https://blueprints.launchpad.net/neutron/+spec/bfd-support-for-neutron

The original bug discussion: https://bugs.launchpad.net/neutron/+bug/1907089

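In summary:
Use BFD to detect link failures between routers. It mainly targets the following two scenarios:

  1. Adjust the corresponding route depending on whether its next hop is alive or has failed.
  2. Let a routing mechanism such as ECMP or BGP change its routing policy according to the link state reported by BFD.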

Main parts of the proposal:

  1. Add API endpoints for the bfd_monitors resource.
    POST /v2.0/bfd_monitors
    GET /v2.0/bfd_monitors
    GET /v2.0/bfd_monitors/{monitor_uuid}
    DELETE /v2.0/bfd_monitors/{monitor_uuid}
    PUT /v2.0/bfd_monitors/{monitor_uuid}

  2. Support retrieving the status of a bfd_monitor.
    GET /v2.0/bfd_monitors/{monitor_uuid}/monitor_status

  3. Extend the router API so that a bfd_monitor can be bound to a route.
    PUT /v2.0/routers/{router_uuid}/add_extraroutes OR PUT /v2.0/routers/{router_id}
    {"router" : {"routes" : [{ "destination" : "10.0.3.0/24", "nexthop" : "10.0.0.13" , "bfd": <bfd_monitor_id>}]}}

  4. Support viewing the status of a router's routes.
    GET /v2.0/routers/{router_id}/routes_status
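
As a rough illustration of item 3, the request for binding a bfd_monitor to an extra route could be assembled as below. The path and the "bfd" key are taken from the proposal text above and may not match any released Neutron API; the helper names, the router ID "r1", and the monitor ID "monitor-1" are placeholders. The sketch only builds the request and sends nothing.

```python
import json


def build_extraroute_body(destination, nexthop, bfd_monitor_id=None):
    """Build the router update body, optionally binding a bfd_monitor to the route."""
    route = {"destination": destination, "nexthop": nexthop}
    if bfd_monitor_id is not None:
        # "bfd" is the key used in the proposal text, not a confirmed API field.
        route["bfd"] = bfd_monitor_id
    return {"router": {"routes": [route]}}


def extraroute_request(router_id, body):
    """Return (method, path, JSON payload) for the proposed add_extraroutes call."""
    path = f"/v2.0/routers/{router_id}/add_extraroutes"
    return "PUT", path, json.dumps(body, sort_keys=True)


body = build_extraroute_body("10.0.3.0/24", "10.0.0.13", bfd_monitor_id="monitor-1")
method, path, payload = extraroute_request("r1", body)
print(method, path)
print(payload)
```

Keeping the BFD reference inside the route entry means a router can mix monitored and unmonitored routes in one update.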

BFD does more than report monitoring information: it also calls for a fast reaction to link state changes.
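
How fast a failure can be noticed follows from the BFD timers. In asynchronous mode, RFC 5880 defines the local detection time as the Detect Mult received from the remote system multiplied by the agreed transmit interval, which is the greater of the local Required Min RX Interval and the remote Desired Min TX Interval. The sketch below only reproduces that arithmetic; the function name and the timer values are illustrative assumptions, not defaults of OVS or OVN.

```python
def detection_time_us(remote_detect_mult, local_required_min_rx_us, remote_desired_min_tx_us):
    """Asynchronous-mode detection time in microseconds (RFC 5880 arithmetic)."""
    agreed_interval_us = max(local_required_min_rx_us, remote_desired_min_tx_us)
    return remote_detect_mult * agreed_interval_us


# Both sides use 300 ms timers and a multiplier of 3.
fast = detection_time_us(3, 300_000, 300_000)
# The local side accepts packets only once per second, so the slower timer wins.
slow = detection_time_us(3, 1_000_000, 300_000)
print(fast, slow)
```

The slower of the two negotiated intervals determines the detection time, so tightening the timer on only one side does not speed up failure detection.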

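In Neutron scenarios, the BFD monitoring information may be needed to remove a failed route from the routing table and to add it back once it recovers (the ECMP multi-nexthop design currently used in kube-ovn is essentially similar).
In other scenarios, a more mature solution can be built on switching or routing mechanisms.
A simple open-source backend implementation can rely on OVN, with OVS implementing the BFD monitor.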
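
The remove-on-failure, restore-on-recovery behaviour can be sketched as a filter over ECMP next hops. This is a minimal model, not Neutron, OVN, or kube-ovn code: active_next_hops is a hypothetical helper, the addresses are examples, and the sketch assumes every next hop is monitored, so a next hop without a reported "up" state is treated as down.

```python
def active_next_hops(routes, bfd_state):
    """Keep only the next hops whose BFD session is up.

    routes:    dict mapping a prefix to its list of ECMP next hops
    bfd_state: dict mapping a next hop to "up" or "down"
    A prefix with no live next hop is dropped from the result.
    """
    result = {}
    for prefix, hops in routes.items():
        live = [hop for hop in hops if bfd_state.get(hop) == "up"]
        if live:
            result[prefix] = live
    return result


routes = {"10.0.3.0/24": ["10.0.0.13", "10.0.0.14"]}

all_up = active_next_hops(routes, {"10.0.0.13": "up", "10.0.0.14": "up"})
one_down = active_next_hops(routes, {"10.0.0.13": "down", "10.0.0.14": "up"})
all_down = active_next_hops(routes, {"10.0.0.13": "down", "10.0.0.14": "down"})
print(all_up, one_down, all_down)
```

Because the configured routes are never modified, a recovered next hop reappears in the result as soon as its BFD state returns to "up".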

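On the keyword used above: bfd_monitor corresponds roughly to a BFD session as defined in the BFD RFC, and to the BFD table in the OVN NB DB. Since Neutron uses the OVN backend here, it presumably satisfies its BFD resource needs directly through OVN.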

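To sum up: Neutron wants a set of APIs for maintaining BFD session resources. The BFD resources behind these APIs can provide link detection for the route changes and routing decisions that L3 HA relies on.

With these APIs in place, external monitoring, automation, and maintenance tools can also be built on top of them.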

Neutron's implementation of BFD

https://opendev.org/openstack/neutron-specs/commit/e35a6606f093cd87f72becabbdfcba0729187d18

The concrete code is linked in the References section:
References
==========

.. [1] https://tools.ietf.org/html/rfc5880
.. [2] https://tools.ietf.org/html/rfc5880#section-6.1
.. [3] https://tools.ietf.org/html/rfc5880#section-6.8.1
.. [4] https://tools.ietf.org/html/draft-wang-bfd-one-arm-use-case-00
.. [5] https://tools.ietf.org/html/rfc5880#section-6.7
.. [6] https://review.opendev.org/c/openstack/neutron-lib/+/778859
.. [7] https://review.opendev.org/c/openstack/neutron-specs/+/767337/7/specs/wallaby/bfd_support.rst#377

Reference [7] contains a direct example of BFD support in OVS, which is relevant to LSPs:

OVS is able to handle BFD, see [3]_, it's possible to set BFD on an
interface, and check the status of it.

The manual process is something like this:

.. code-block:: bash

        $ sudo ovs-vsctl set interface ens5 bfd:bfd_src_ip=192.168.122.237 bfd:bfd_dst_ip=192.168.122.57 bfd:enable=true
        $ #.... Do the same on the remote end ....
        $ sudo ovs-vsctl get interface ens5 bfd_status
          {diagnostic="Control Detection Time Expired", flap_count="7", forwarding="true", remote_diagnostic="Neighbor Signaled Session Down", remote_state=up, state=up}
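
The bfd_status value printed above is an OVSDB map rendered as text. For scripting, it can be turned into a dictionary with a small parser. The sketch below is an assumption-laden helper, not part of OVS: parse_bfd_status and session_is_healthy are made-up names, and the regular expression only covers the key=value and key="value" forms seen in the sample output.

```python
import re

# Matches key="quoted value" or key=bareword inside the map text.
_PAIR = re.compile(r'(\w+)=(?:"([^"]*)"|([^,}\s]+))')


def parse_bfd_status(text):
    """Parse the text form of an OVS bfd_status map into a dict of strings."""
    return {
        m.group(1): m.group(2) if m.group(2) is not None else m.group(3)
        for m in _PAIR.finditer(text)
    }


def session_is_healthy(status):
    """Treat a session as healthy when it is up locally and forwarding."""
    return status.get("state") == "up" and status.get("forwarding") == "true"


sample = (
    '{diagnostic="Control Detection Time Expired", flap_count="7", '
    'forwarding="true", remote_diagnostic="Neighbor Signaled Session Down", '
    'remote_state=up, state=up}'
)
status = parse_bfd_status(sample)
print(status["state"], status["flap_count"], session_is_healthy(status))
```

In the sample, the session is up and forwarding even though flap_count shows it has flapped seven times, so a monitoring tool may want to track flap_count over time in addition to the current state.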


# This includes the most important part: how to use it correctly.


This document is not a concise, complete analysis; the analysis process is annotated in the BFD-related kube-ovn PRs.
