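Overview

This article gives a brief introduction to the flag parameters used in dd tests.

The dd man page

First, let's look at which flag parameters dd offers. The relevant part of the dd man page reads: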

'if=FILE'
     Read from FILE instead of standard input.
'of=FILE'
     Write to FILE instead of standard output.  Unless 'conv=notrunc' is
     given, 'dd' truncates FILE to zero bytes (or the size specified
     with 'seek=').
'bs=BYTES'
     Set both input and output block sizes to BYTES.  This makes 'dd'
     read and write BYTES per block, overriding any 'ibs' and 'obs'
     settings.  In addition, if no data-transforming 'conv' option is
     specified, input is copied to the output as soon as it's read, even
     if it is smaller than the block size.
'count=N'
     Copy N 'ibs'-byte blocks from the input file, instead of everything
     until the end of the file.  If 'iflag=count_bytes' is specified, N
     is interpreted as a byte count rather than a block count.  Note if
     the input may return short reads as could be the case when reading
     from a pipe for example, 'iflag=fullblock' will ensure that
     'count=' corresponds to complete input blocks rather than the
     traditional POSIX specified behavior of counting input read
     operations.
'conv=CONVERSION[,CONVERSION]...'
     Convert the file as specified by the CONVERSION argument(s).  (No
     spaces around any comma(s).)

     Conversions:

     'ascii'
          Convert EBCDIC to ASCII, using the conversion table specified
          by POSIX.  This provides a 1:1 translation for all 256 bytes.

     'ebcdic'
          Convert ASCII to EBCDIC.  This is the inverse of the 'ascii'
          conversion.

     'ibm'
          Convert ASCII to alternate EBCDIC, using the alternate
          conversion table specified by POSIX.  This is not a 1:1
          translation, but reflects common historical practice for '~',
          '[', and ']'.

          The 'ascii', 'ebcdic', and 'ibm' conversions are mutually
          exclusive.

     'block'
          For each line in the input, output 'cbs' bytes, replacing the
          input newline with a space and padding with spaces as
          necessary.

     'unblock'
          Remove any trailing spaces in each 'cbs'-sized input block,
          and append a newline.

          The 'block' and 'unblock' conversions are mutually exclusive.

     'lcase'
          Change uppercase letters to lowercase.

     'ucase'
          Change lowercase letters to uppercase.

          The 'lcase' and 'ucase' conversions are mutually exclusive.

     'sparse'
          Try to seek rather than write NUL output blocks.  On a file
          system that supports sparse files, this will create sparse
          output when extending the output file.  Be careful when using
          this option in conjunction with 'conv=notrunc' or
          'oflag=append'.  With 'conv=notrunc', existing data in the
          output file corresponding to NUL blocks from the input, will
          be untouched.  With 'oflag=append' the seeks performed will be
          ineffective.  Similarly, when the output is a device rather
          than a file, NUL input blocks are not copied, and therefore
          this option is most useful with virtual or pre zeroed devices.

     'swab'
          Swap every pair of input bytes.  GNU 'dd', unlike others,
          works when an odd number of bytes are read--the last byte is
          simply copied (since there is nothing to swap it with).

     'sync'
          Pad every input block to size of 'ibs' with trailing zero
          bytes.  When used with 'block' or 'unblock', pad with spaces
          instead of zero bytes.

     The following "conversions" are really file flags and don't affect
     internal processing:

     'excl'
          Fail if the output file already exists; 'dd' must create the
          output file itself.

     'nocreat'
          Do not create the output file; the output file must already
          exist.

          The 'excl' and 'nocreat' conversions are mutually exclusive.

     'notrunc'
          Do not truncate the output file.

     'noerror'
          Continue after read errors.

     'fdatasync'
          Synchronize output data just before finishing.  This forces a
          physical write of output data.

     'fsync'
          Synchronize output data and metadata just before finishing.
          This forces a physical write of output data and metadata.
'iflag=FLAG[,FLAG]...'
     Access the input file using the flags specified by the FLAG
     argument(s).  (No spaces around any comma(s).)

'oflag=FLAG[,FLAG]...'
     Access the output file using the flags specified by the FLAG
     argument(s).  (No spaces around any comma(s).)

     Here are the flags.  Not every flag is supported on every operating
     system.

     'append'
          Write in append mode, so that even if some other process is
          writing to this file, every 'dd' write will append to the
          current contents of the file.  This flag makes sense only for
          output.  If you combine this flag with the 'of=FILE' operand,
          you should also specify 'conv=notrunc' unless you want the
          output file to be truncated before being appended to.

     'cio'
          Use concurrent I/O mode for data.  This mode performs direct
          I/O and drops the POSIX requirement to serialize all I/O to
          the same file.  A file cannot be opened in CIO mode and with a
          standard open at the same time.

     'direct'
          Use direct I/O for data, avoiding the buffer cache.  Note that
          the kernel may impose restrictions on read or write buffer
          sizes.  For example, with an ext4 destination file system and
          a linux-based kernel, using 'oflag=direct' will cause writes
          to fail with 'EINVAL' if the output buffer size is not a
          multiple of 512.

     'directory'
          Fail unless the file is a directory.  Most operating systems
          do not allow I/O to a directory, so this flag has limited
          utility.

     'dsync'
          Use synchronized I/O for data.  For the output file, this
          forces a physical write of output data on each write.  For the
          input file, this flag can matter when reading from a remote
          file that has been written to synchronously by some other
          process.  Metadata (e.g., last-access and last-modified time)
          is not necessarily synchronized.

     'sync'
          Use synchronized I/O for both data and metadata.
     'nocache'
          Discard the data cache for a file.  When count=0 all cache is
          discarded, otherwise the cache is dropped for the processed
          portion of the file.  Also when count=0 failure to discard the
          cache is diagnosed and reflected in the exit status.  Here are
          some usage examples:

               # Advise to drop cache for whole file
               dd if=ifile iflag=nocache count=0

               # Ensure drop cache for the whole file
               dd of=ofile oflag=nocache conv=notrunc,fdatasync count=0

               # Drop cache for part of file
               dd if=ifile iflag=nocache skip=10 count=10 of=/dev/null

               # Stream data using just the read-ahead cache
               dd if=ifile of=ofile iflag=nocache oflag=nocache

     'nonblock'
          Use non-blocking I/O.

     'noatime'
          Do not update the file's access time.  Some older file systems
          silently ignore this flag, so it is a good idea to test it on
          your files before relying on it.

     'noctty'
          Do not assign the file to be a controlling terminal for 'dd'.
          This has no effect when the file is not a terminal.  On many
          hosts (e.g., GNU/Linux hosts), this option has no effect at
          all.

     'nofollow'
          Do not follow symbolic links.

     'nolinks'
          Fail if the file has multiple hard links.

     'binary'
          Use binary I/O.  This option has an effect only on nonstandard
          platforms that distinguish binary from text I/O.

     'text'
          Use text I/O.  Like 'binary', this option has no effect on
          standard platforms.

     'fullblock'
          Accumulate full blocks from input.  The 'read' system call may
          return early if a full block is not available.  When that
          happens, continue calling 'read' to fill the remainder of the
          block.  This flag can be used only with 'iflag'.  This flag is
          useful with pipes for example as they may return short reads.
          In that case, this flag is needed to ensure that a 'count='
          argument is interpreted as a block count rather than a count
          of read operations.

     'count_bytes'
          Interpret the 'count=' operand as a byte count, rather than a
          block count, which allows specifying a length that is not a
          multiple of the I/O block size.  This flag can be used only
          with 'iflag'.

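We will focus on direct, dsync, and sync. Before that, it helps to understand the Linux I/O stack, shown below:

The Linux I/O stack

[Figure: the Linux I/O stack]

The diagram above is fairly complex; it can be simplified to the following:

[Figure: simplified view of the Linux I/O stack]

Linux disk I/O can be divided into the following layers: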

Virtual file system (VFS) layer

File system layer

Cache layer

Generic block layer

I/O scheduler layer

Driver layer

Physical device layer

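Virtual file system (VFS) layer

Applications generally do not talk to physical devices directly; almost all access goes through a file system. There are many kinds of file systems, such as the block-device-based ext family and xfs, or network file systems like nfs, and their interfaces and implementations all differ. This raises a question: must an application handle each file system specially? It does not, because of the virtual file system (VFS). The VFS layer sits above the file system layer, hides the differences between file systems, and offers the application layer one unified, virtual file system interface, so an application can operate on any file system through a single set of calls.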

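File system layer

Concrete file systems implement their functionality on top of the unified interface defined by the VFS. They fall into three categories:

1. Block-device-based file systems, such as ext2, ext3, ext4, and xfs;

2. Network file systems, such as nfs and cifs;

3. Special file systems, such as /proc and raw device files.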

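Cache layer

Compared with the CPU and memory, disk I/O is slow, so Linux adds a cache layer to speed it up. By default, I/O data is placed in the cache and control returns to the upper layer right away; the kernel writes the data to the device later, or the upper layer reads the data from the cache. For writes, the call returns as soon as the data is in the cache, so the upper layer considers the I/O finished although the data has not yet reached the disk. If the machine loses power at that moment, the data is lost. An application that needs to be sure the data has reached the physical device can call a flush interface (for example fsync), which writes the cached data out to the device. Linux also provides a way to bypass the cache layer: if a file is opened with the direct flag (O_DIRECT), data skips the cache layer and goes straight on to the layers below.

The free command shows how much data is currently cached; it is the buff/cache column in the figure below:

[Screenshot: output of free, showing the buff/cache column]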
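
As a rough illustration of the cache layer, the following shell sketch writes a small file with buffered I/O and prints the Dirty and Cached counters from /proc/meminfo before and after a sync. The helper name meminfo_kb and the scratch file are assumptions made for this example, and on a memory-backed file system such as tmpfs the Dirty counter may not move at all.

```shell
# Print one field (in kB) from /proc/meminfo, e.g. Cached or Dirty.
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' /proc/meminfo
}

scratch=$(mktemp)   # hypothetical scratch file for this demo

echo "Dirty before write: $(meminfo_kb Dirty) kB"
# Buffered write: dd returns once the data is in the page cache.
dd if=/dev/zero of="$scratch" bs=1M count=8 2>/dev/null
echo "Dirty after write:  $(meminfo_kb Dirty) kB"
# Ask the kernel to flush dirty pages to the underlying devices.
sync
echo "Dirty after sync:   $(meminfo_kb Dirty) kB"
echo "Cached:             $(meminfo_kb Cached) kB"

rm -f "$scratch"
```

The numbers depend on whatever else the system is doing, so they are best read as a trend, not as exact values.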

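Generic block layer

Devices come in many kinds with differing interfaces. The generic block layer was added to hide these differences: a file system only needs to talk to the unified generic block layer to communicate with a device and does not need to care how the actual device driver is implemented, which simplifies the file system implementation.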

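I/O scheduler layer

Disk I/O requests arrive at random, and the disk locations they target are random too. To reduce disk seeking and increase overall disk throughput, Linux adds an I/O scheduler layer. It uses scheduling algorithms to sort and merge I/O requests more sensibly; the classic example is the elevator algorithm.

Think of disk I/O requests as elevator rides, with requests for floors 3, 2, 6, and 4. Without a scheduling algorithm, the elevator goes from floor 1 to 3, from 3 to 2, from 2 to 6, and then from 6 to 4, wasting elevator capacity. With a scheduling algorithm that sorts the requests sensibly, the elevator stops at floors 2, 3, 4, and 6 in turn, and a single trip from floor 1 to floor 6 serves every request.

[Figure: illustration of the elevator algorithm]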
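
The reordering in the elevator analogy can be sketched in a few lines of shell. The function name elevator_order is hypothetical, and a plain ascending sort only models a single upward sweep; real I/O schedulers also merge adjacent requests and bound how long a request may wait.

```shell
# Order pending requests by position, as a single upward elevator sweep would.
# Arguments are floor (or block) numbers; output is one space-separated line.
elevator_order() {
    printf '%s\n' "$@" | sort -n | paste -s -d ' ' -
}

# Requests arrive in the order 3, 2, 6, 4.
elevator_order 3 2 6 4   # prints: 2 3 4 6
```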

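Driver layer

The driver layer holds the drivers for each kind of physical device and is how the kernel communicates with the hardware. The kernel provides generic driver interfaces; a device vendor implements a driver against those interfaces and registers it with the kernel, which lets the kernel communicate with the device.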

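Physical device layer

The physical disks themselves provide the actual storage. Slow devices include traditional mechanical hard disks (HDD); fast ones include solid-state drives (SSD) and NVMe. Physical disks also carry their own cache to improve I/O speed, and some drives include capacitors so that cached data can still be flushed to the media if power is lost.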

Comparison of common parameters

conv flags

 'fdatasync'
      Synchronize output data just before finishing.  This forces a
      physical write of output data.

 'fsync'
      Synchronize output data and metadata just before finishing.
      This forces a physical write of output data and metadata.

oflag parameters

 'direct'
      Use direct I/O for data, avoiding the buffer cache.  Note that
      the kernel may impose restrictions on read or write buffer
      sizes.  For example, with an ext4 destination file system and
      a linux-based kernel, using 'oflag=direct' will cause writes
      to fail with 'EINVAL' if the output buffer size is not a
      multiple of 512.
 'dsync'
      Use synchronized I/O for data.  For the output file, this
      forces a physical write of output data on each write.  For the
      input file, this flag can matter when reading from a remote
      file that has been written to synchronously by some other
      process.  Metadata (e.g., last-access and last-modified time)
      is not necessarily synchronized.

 'sync'
      Use synchronized I/O for both data and metadata.

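No oflag
Without oflag, dd opens the output file in the default way, which is buffered I/O. A write returns once the data is in the cache layer, so this is the fastest.
oflag=direct
With this flag the output file is opened for direct I/O. A write returns once the data has reached the disk's cache, so it is slower than the buffered I/O above.
oflag=sync
With this flag a write returns only after all the data has been written to the disk, so it is slower than writing only to the disk's cache.
oflag=dsync
Same as sync, except that sync also synchronizes metadata while dsync does not necessarily do so.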
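
To see these differences on a given machine, a minimal sketch along the following lines runs the same small write with each option and prints dd's summary line. The helper dd_write_test, the 4 MiB size, and the temporary target are assumptions for illustration; a real benchmark would use a much larger size on the file system of interest, and oflag=direct can fail with EINVAL on file systems that do not support direct I/O.

```shell
# Run one small dd write and print the last line dd reports
# (the throughput summary, or the error message if the write failed).
# $1: target file, $2: extra dd operand, empty for plain buffered I/O
dd_write_test() {
    dd if=/dev/zero of="$1" bs=1M count=4 $2 2>&1 | tail -n 1
}

target=$(mktemp)   # hypothetical target; point this at the disk under test
for mode in "" oflag=direct oflag=dsync oflag=sync conv=fdatasync conv=fsync; do
    printf '%-16s %s\n' "${mode:-buffered}" "$(dd_write_test "$target" "$mode")"
done
rm -f "$target"
```

With only 4 MiB written, the reported rates are noisy; the ordering between modes becomes clearer as count grows.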

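A real-world case

A customer had two identically configured machines running a database workload and ran dd performance tests on both. The customer initially reported a large gap between the two results, as follows:
Primary server, with the workload running

[Screenshot: dd test result on the primary server]

The customer also noted that writes to the /tmp file system on the primary server were fast

[Screenshot: dd test result on /tmp on the primary server]

Standby server, with no workload running.

[Screenshot: dd test result on the standby server]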

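Analysis

1. The fast /tmp write used bs=1M together with fsync. In principle this option requires a physical write, but the test target was the /tmp file system, which is somewhat special.
For example, the same parameters behave differently when run under /tmp and under /home:

[Screenshot: the same dd command run under /tmp and under /home]

2. The large gap between the two machines is suspected to come mainly from the running workload.
3. We therefore recommended testing a file system other than /tmp with no workload running, using oflag=direct (to exclude the effect of the system cache). The results are as follows:

[Screenshot: dd test results with oflag=direct]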
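
One way to check what makes /tmp special on a given host is to look at the file system type behind each directory. The sketch below uses GNU stat; the helper name fs_type and the two directories are assumptions taken from the example above. If /tmp turns out to be tmpfs, it is backed by memory, so fsync has no disk to wait for.

```shell
# Print the file system type backing a directory (GNU coreutils stat).
fs_type() {
    stat -f -c %T "$1"
}

for dir in /tmp /home; do
    if [ -d "$dir" ]; then
        echo "$dir: $(fs_type "$dir")"
    fi
done

# A direct-I/O write test on a disk-backed file system could then look like this
# (hypothetical target path; left commented out because it writes 1 GiB):
# dd if=/dev/zero of=/home/ddtest.img bs=1M count=1024 oflag=direct
```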
