Open vSwitch* with DPDK Overview

1、Overview

  This article presents a high-level overview of Open vSwitch* with the Data Plane Development Kit (OvS-DPDK), a high-performance, open source virtual switch, and links to further technical articles that dive deeper into individual OvS-DPDK features. It is written for OvS users who want to know more about DPDK integration.
  Note: Users can download a zip file of the OvS master branch or the 2.6 branch; installation steps are available for both branches.


2、OvS-DPDK High-level Architecture

  Open vSwitch is a production-quality, multilayer virtual switch licensed under the open source Apache* 2.0 license. It supports SDN control semantics via the OpenFlow* protocol and its OVSDB management interface. It is available from openvswitch.org and GitHub, and is also packaged in Linux distributions.
  Native Open vSwitch generally forwards packets via the kernel space data path (see Figure 1). In the kernel data path, the switching "fastpath" consists of a simple flow table that holds the forwarding/action rules for received packets. Exception packets (the first packet in a flow) do not match any existing entry in the kernel fastpath table and are sent to the user space daemon for processing (the "slowpath"). After user space handles the first packet in the flow, the daemon updates the flow table in kernel space so that subsequent packets in the flow are processed in the fastpath and are not sent to user space. With this approach, native OvS avoids the costly context switch between kernel and user space for a large percentage of received packets. However, the achievable packet throughput is limited by the forwarding bandwidth of the Linux network stack, which is not suited to use cases that require a high rate of packet processing, such as Telco deployments.
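  The fastpath/slowpath split described above can be illustrated with a small model. This is only a sketch under stated assumptions, not OvS code: the class `KernelDatapath`, the function `userspace_daemon`, and the action strings are hypothetical names chosen for illustration, and the flow key is simplified to a 5-tuple.

```python
from typing import Callable, Dict, Tuple

# Simplified flow key (assumption): (src_ip, dst_ip, src_port, dst_port, proto)
FlowKey = Tuple[str, str, int, int, str]


class KernelDatapath:
    """Toy model of the kernel fastpath: a flow table filled in on demand."""

    def __init__(self, upcall: Callable[[FlowKey], str]) -> None:
        self.flow_table: Dict[FlowKey, str] = {}
        self.upcall = upcall      # stand-in for the user space daemon
        self.upcalls = 0          # number of packets that took the slowpath

    def receive(self, key: FlowKey) -> str:
        action = self.flow_table.get(key)
        if action is None:
            # Exception packet: no fastpath entry, so hand it to user space.
            self.upcalls += 1
            action = self.upcall(key)
            # The daemon installs the flow so later packets stay in the fastpath.
            self.flow_table[key] = action
        return action


def userspace_daemon(key: FlowKey) -> str:
    """Hypothetical slowpath policy: forward TCP to port 2, drop the rest."""
    return "output:2" if key[4] == "tcp" else "drop"


dp = KernelDatapath(userspace_daemon)
flow = ("10.0.0.1", "10.0.0.2", 1234, 80, "tcp")
actions = [dp.receive(flow) for _ in range(5)]
print(actions[0], "slowpath trips:", dp.upcalls)
```

  In this model only the first of the five packets reaches the daemon; the other four are served from the flow table, which is the saving the paragraph describes.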

Figure 1: Integration of Data Plane Development Kit data plane with native Open vSwitch*.

  Figure 2 below shows the high-level architecture of OvS-DPDK. OvS switching ports are represented by network devices (or netdevs). Netdev-dpdk is a network device that uses DPDK to accelerate switch I/O through three separate interfaces: one physical interface (handled by the librte_eth library within DPDK) and two virtual interfaces (librte_vhost and librte_ring). These interface with the physical and virtual devices connected to the virtual switch.
  Other OvS architectural layers provide further functionality and interface with, for example, the SDN controller. Dpif-netdev provides user space forwarding and ofproto is the OvS library that implements an OpenFlow switch. It talks to OpenFlow controllers over the network and to switch hardware or software through an ofproto provider. The ovsdb server maintains the up-to-date switching table information for this OvS instance and communicates this to the SDN controller. The following section provides details of the switching/forwarding tables, with further information on the OvS architecture available through the openvswitch.org website.

Figure 2: Open vSwitch* with Data Plane Development Kit high-level architecture.

3、OvS-DPDK Switching Table Hierarchy

  A packet entering OvS-DPDK from a physical or virtual interface receives an identifier, or hash, based on its header fields. This identifier is matched against the entries in three main switching tables: the exact match cache (EMC), the data path classifier (dpcls), and the ofproto classifier. The tables are consulted in that order until a match is found; the actions indicated by the matching rule are then executed, and the packet is forwarded out of the switch once all actions have completed. This scheme is illustrated in Figure 3.


Figure 3: Open vSwitch* with Data Plane Development Kit switching table hierarchy.

  The three tables differ in their characteristics and in the throughput and latency they deliver. The EMC offers the fastest processing, but for a limited number of table entries. For highest-speed processing, the packet's identifier must exactly match an entry in this table on all fields (the 5-tuple of source IP and port, destination IP and port, and protocol); otherwise it will "miss" on the EMC and pass through to the dpcls. The dpcls contains many more table entries (arranged in multiple subtables) and enables wildcard matching of the packet identifier (for example, destination IP and port are specified but any source is allowed). It delivers approximately half the throughput of the EMC while catering to a much larger number of table entries. Packet flows matched in the dpcls are installed in the EMC so that subsequent packets with the same identifier can be processed at the highest speed.
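  The wildcard matching that the dpcls performs across subtables can be sketched as follows. This is a simplified illustration, not the OvS implementation: it assumes each subtable groups the rules that share one mask (the set of header fields they match on), that lookups try the subtables in turn, and that the first hit wins with no rule priorities. The names `Subtable`, `WildcardClassifier`, and the field names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

FIELDS = ("src_ip", "src_port", "dst_ip", "dst_port", "proto")
Packet = Dict[str, object]


class Subtable:
    """Rules sharing one mask; fields outside the mask are wildcarded."""

    def __init__(self, mask: Tuple[str, ...]) -> None:
        self.mask = mask
        self.rules: Dict[tuple, str] = {}

    def _key(self, fields: Packet) -> tuple:
        # Keep only the masked fields, so the rest cannot affect the match.
        return tuple(fields[name] for name in self.mask)

    def add(self, match: Packet, action: str) -> None:
        self.rules[self._key(match)] = action

    def lookup(self, packet: Packet) -> Optional[str]:
        return self.rules.get(self._key(packet))


class WildcardClassifier:
    def __init__(self) -> None:
        self.subtables: List[Subtable] = []

    def add_rule(self, match: Packet, action: str) -> None:
        mask = tuple(name for name in FIELDS if name in match)
        for subtable in self.subtables:
            if subtable.mask == mask:
                break
        else:
            subtable = Subtable(mask)
            self.subtables.append(subtable)
        subtable.add(match, action)

    def lookup(self, packet: Packet) -> Optional[str]:
        # One hash lookup per subtable; the cost grows with the subtable count.
        for subtable in self.subtables:
            action = subtable.lookup(packet)
            if action is not None:
                return action
        return None


dpcls = WildcardClassifier()
dpcls.add_rule({"dst_ip": "10.0.0.2", "dst_port": 80}, "output:2")
dpcls.add_rule({"proto": "icmp"}, "drop")

web = {"src_ip": "192.168.1.7", "src_port": 40000,
       "dst_ip": "10.0.0.2", "dst_port": 80, "proto": "tcp"}
other = dict(web, dst_port=443)
web_action = dpcls.lookup(web)      # matches regardless of the source fields
other_action = dpcls.lookup(other)  # no rule covers destination port 443
print(web_action, other_action)
```

  The sketch also suggests why a wildcard lookup costs more than an exact match: a packet may need one hash lookup per subtable, whereas an exact-match cache needs a single lookup.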
  A miss on the dpcls results in the packet identifier being sent to the ofproto classifier so that the OpenFlow controller can decide on the action. This path is the least performant, more than 10x slower than the EMC. Matches in the ofproto classifier result in new entries being established in the faster switching tables so that subsequent packets in the same flow can be processed more quickly.
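  The way a miss in one table falls through to the next, and a hit in a slower table populates the faster ones, can be sketched as below. This is a toy model under stated assumptions: the dpcls is reduced to a single wildcard on destination IP and port, the ofproto classifier is a plain dictionary of rules, and a full EMC evicts its oldest entry (the real replacement policy is not described in this article). `TieredSwitch` and all other names are hypothetical.

```python
from typing import Dict, Tuple

# Assumed identifier layout: (src_ip, src_port, dst_ip, dst_port, proto)
FiveTuple = Tuple[str, int, str, int, str]
DstKey = Tuple[str, int]


class TieredSwitch:
    def __init__(self, openflow_rules: Dict[DstKey, str], emc_capacity: int = 4) -> None:
        self.emc: Dict[FiveTuple, str] = {}    # exact match on all five fields
        self.dpcls: Dict[DstKey, str] = {}     # wildcard: any source allowed
        self.openflow_rules = openflow_rules   # stand-in for the ofproto classifier
        self.emc_capacity = emc_capacity
        self.hits = {"emc": 0, "dpcls": 0, "ofproto": 0}

    def _install_emc(self, pkt: FiveTuple, action: str) -> None:
        if len(self.emc) >= self.emc_capacity:
            # Assumption: evict the oldest entry (dicts keep insertion order).
            self.emc.pop(next(iter(self.emc)))
        self.emc[pkt] = action

    def process(self, pkt: FiveTuple) -> str:
        if pkt in self.emc:                    # tier 1: fastest path
            self.hits["emc"] += 1
            return self.emc[pkt]
        dst = (pkt[2], pkt[3])
        if dst in self.dpcls:                  # tier 2: wildcard match
            self.hits["dpcls"] += 1
            action = self.dpcls[dst]
        else:                                  # tier 3: slowest path
            self.hits["ofproto"] += 1
            action = self.openflow_rules.get(dst, "drop")
            self.dpcls[dst] = action           # populate the faster table
        self._install_emc(pkt, action)         # promote the flow to the EMC
        return action


switch = TieredSwitch({("10.0.0.2", 80): "output:2"})
p1 = ("10.0.0.1", 1111, "10.0.0.2", 80, "tcp")
p2 = ("10.0.0.9", 2222, "10.0.0.2", 80, "tcp")
results = [switch.process(p) for p in (p1, p1, p2, p2)]
print(switch.hits)
```

  In this run the first packet of the first flow reaches the slowest tier, the first packet of the second flow is resolved by the wildcard entry that the first flow installed, and the repeat packets of both flows hit the EMC.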

4、OvS-DPDK Features and Performance

At the time of this writing, the following high-level OvS-DPDK features are available on the OvS master code branch:

  • Support for DPDK v16.07 (the supported version increments with each new DPDK release)
  • vHost user support
  • vHost reconnect
  • vHost multiqueue
  • Native tunneling support: VxLAN, GRE, Geneve
  • VLAN support
  • MPLS support
  • Ingress/egress QoS policing
  • Jumbo frame support
  • Connection tracking
  • Statistics: DPDK vHost and extended DPDK stats
  • Debug: DPDK pdump support
  • Link bonding
  • Link status
  • VFIO support
  • ODL/OpenStack detection of DPDK ports
  • vHost user NUMA awareness

  A recent performance comparison between native OvS and OvS-DPDK is highlighted in Figure 4. This shows the throughput in packets-per-second for the Phy-OvS-Phy use case, indicating a ~10x performance enhancement for OvS-DPDK over native OvS, increasing to ~12x with Intel® Hyper-Threading Technology (Intel® HT Technology) enabled (labelled 1C2T, or one physical core with two logical threads, in the figure legend). Similarly, the Phy-OvS-VM-OvS-Phy use case demonstrates a ~9x performance enhancement for OvS-DPDK over native OvS.
  The hardware and software configuration for this data, along with further use case results, can be found in the Intel® Open Network Platform (Intel® ONP) performance report.

Figure 4: Performance comparison - native Open vSwitch* (OvS) and OvS with Data Plane Development Kit.

5、OvS-DPDK Availability

  OvS-DPDK is available in the upstream openvswitch.org repository and is also available through Linux distributions as below. The latest milestone release is OvS 2.6 (September 2016), and releases are made on a six-month cadence.
  Code is available for download as follows: OvS master branch; OvS 2.6 release branch. Installation steps are available for both the master branch and the 2.6 release branch.

Packaged versions of OvS with DPDK are available from:


6、Additional Information

To learn more about OvS-DPDK, check out the following videos and articles on Intel® Developer Zone, 01.org, Intel® Network Builders and Intel® Network Builders University.

User guides:

Developer guides:

Articles and Videos:

OvS with DPDK milestone release webinars:

INB university:

White paper:

Have a question? Feel free to post it to the Open vSwitch discussion mailing list.
