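The previous section covered the overall internal structure of a WAL segment file and some of its components. This section continues with the XLOG Record data structure.
I. XLOG Record data
A WAL segment file is 16MB by default. Its internal structure is shown in the figure below:
Note: the previous version of this structure diagram did not mark the prev XLOG Record data and wrongly took XLogLongPageHeaderData to be 56 Bytes. This is corrected here.
XLOG Record data is the structure that stores the actual data. It consists of the following parts:
1. 0..N XLogRecordBlockHeader structures, each corresponding to one block data;
Note: if the BKPBLOCK_HAS_IMAGE flag is set, an XLogRecordBlockImageHeader struct follows the XLogRecordBlockHeader struct; if BKPIMAGE_HAS_HOLE and BKPIMAGE_IS_COMPRESSED are both set, an XLogRecordBlockCompressHeader struct follows the XLogRecordBlockImageHeader;
2. XLogRecordDataHeader[Short|Long]: the Short format is used if the main data is smaller than 256 Bytes, otherwise the Long format;
3. block data: full-page-write block data; if compression is enabled it is stored compressed, and the related metadata is kept in the XLogRecordBlockCompressHeader that accompanies the XLogRecordBlockHeader.
4. main data: log data such as (tuple) data or checkpoint information.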
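The Short/Long choice in item 2 can be sketched in a few lines of Python. This is only an illustration, not PostgreSQL code: the function name build_data_header is hypothetical, the id value 254 for the Long format and the 4-byte length field are taken from PostgreSQL's xlogrecord.h as I understand it, and little-endian byte order is assumed.

```python
import struct

XLR_BLOCK_ID_DATA_SHORT = 255  # id byte of the short main-data header
XLR_BLOCK_ID_DATA_LONG = 254   # id byte of the long main-data header (assumed)


def build_data_header(main_data_len: int) -> bytes:
    """Return the header bytes that would precede main data of this length.

    Hypothetical helper: lengths below 256 fit in one byte (Short format),
    anything larger needs the 4-byte length of the Long format.
    """
    if main_data_len < 0:
        raise ValueError("length must be non-negative")
    if main_data_len < 256:
        # id (1 byte) + length (1 byte)
        return struct.pack("<BB", XLR_BLOCK_ID_DATA_SHORT, main_data_len)
    # id (1 byte) + length (4 bytes, unpadded)
    return struct.pack("<BI", XLR_BLOCK_ID_DATA_LONG, main_data_len)


print(build_data_header(3).hex())    # short header, 2 bytes
print(build_data_header(300).hex())  # long header, 5 bytes
```

For the 3-byte main data examined later in this section, the sketch yields the same two bytes (ff 03) that hexdump shows at offset 100.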
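The internal structure of XLOG Record data for an insert is shown in the figure below:
The parts above are described one by one below, using the hexdump tool to inspect the relevant data.
1. XLogRecordBlockHeader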
uint8 id
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000042 -s 80 -n 1
00000050 00 |.|
00000051
The block reference ID is 0x00, i.e. Block 0.
uint8 fork_flags
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000042 -s 81 -n 1
00000051 20 | |
00000052
The value is 0x20. The high 4 bits hold flags, so this is BKPBLOCK_HAS_DATA.
uint16 data_length
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000042 -s 82 -n 2
00000052 1e 00 |..|
00000054
payload bytes = 0x001E, which is 30 in decimal.
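The three fields read so far (id, fork_flags, data_length) can be decoded together. The sketch below feeds in the four bytes seen at offsets 80 to 83 (00 20 1e 00) and assumes little-endian layout; the flag constants are the BKPBLOCK_* values from PostgreSQL's xlogrecord.h as I understand them, while parse_block_header and FLAG_NAMES are hypothetical names.

```python
import struct

BKPBLOCK_FORK_MASK = 0x0F  # low 4 bits: fork number
FLAG_NAMES = {             # high 4 bits: flags
    0x10: "BKPBLOCK_HAS_IMAGE",
    0x20: "BKPBLOCK_HAS_DATA",
    0x40: "BKPBLOCK_WILL_INIT",
    0x80: "BKPBLOCK_SAME_REL",
}


def parse_block_header(raw: bytes) -> dict:
    """Decode the fixed 4-byte part of an XLogRecordBlockHeader."""
    block_id, fork_flags, data_length = struct.unpack("<BBH", raw[:4])
    return {
        "id": block_id,
        "fork": fork_flags & BKPBLOCK_FORK_MASK,
        "flags": [name for bit, name in FLAG_NAMES.items() if fork_flags & bit],
        "data_length": data_length,
    }


# Bytes copied from the hexdump output at offsets 80..83.
header = parse_block_header(bytes.fromhex("00201e00"))
print(header)
```

With these bytes the sketch reports block id 0, fork 0, the single flag BKPBLOCK_HAS_DATA, and a data length of 30.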
RelFileNode comes next.
RelFileNode
tablespace/database/relation, all of type Oid (unsigned int)
1.tablespace
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000042 -s 84 -n 4
00000054 7f 06 00 00 |....|
00000058
The value is 0x0000067F, which is 1663 in decimal.
The tablespace is pg_default.
testdb=# select * from pg_tablespace where oid=1663;
spcname | spcowner | spcacl | spcoptions
------------+----------+--------+------------
pg_default | 10 | |
(1 row)
2.database
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000042 -s 88 -n 4
00000058 12 40 00 00 |.@..|
0000005c
The value is 0x00004012, which is 16402 in decimal; the database is testdb.
testdb=# select * from pg_database where oid=16402;
datname | datdba | encoding | datcollate | datctype | datistemplate | datallowconn | datconnlimit | datlastsysoid | datfrozenxid | datminmxid | dattablespace | datacl
---------+--------+----------+------------+----------+---------------+--------------+--------------+---------------+--------------+------------+---------------+--------
testdb | 10 | 6 | C | C | f | t | -1 | 13284 | 561 | 1 | 1663 |
(1 row)
3.relation
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000042 -s 92 -n 4
0000005c 56 42 00 00 |VB..|
00000060
The value is 0x00004256, which is 16982 in decimal.
testdb=# select oid,relfilenode,relname from pg_class where relfilenode = 16982;
oid | relfilenode | relname
-------+-------------+---------
16982 | 16982 | t_jfxx
(1 row)
The corresponding relation is t_jfxx.
BlockNumber
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000042 -s 96 -n 4
00000060 85 00 00 00 |....|
00000064
The value is 0x00000085, which is 133 in decimal. This is the corresponding data block number.
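The four 4-byte values at offsets 84 to 99 can be read in a single pass. The sketch below assumes little-endian byte order; the relative path and the byte offset it prints rest on further assumptions that the text above does not verify: the relation lives in pg_default (so its file sits under base/), the main fork is meant, the block size is the default 8192 bytes, and the block falls in the first 1GB file segment. The names locate_block and BLCKSZ usage here are illustrative only.

```python
import struct

BLCKSZ = 8192  # assumed default PostgreSQL block size


def locate_block(raw: bytes) -> dict:
    """Decode RelFileNode (3 Oids) plus BlockNumber from 16 raw bytes."""
    spc, db, rel, blkno = struct.unpack("<IIII", raw[:16])
    return {
        "tablespace": spc,
        "database": db,
        "relation": rel,
        "block": blkno,
        # Assumed layout for a relation in the default tablespace.
        "path": f"base/{db}/{rel}",
        "file_offset": blkno * BLCKSZ,
    }


# Bytes copied from the hexdump output at offsets 84..99.
raw = bytes.fromhex("7f060000" "12400000" "56420000" "85000000")
loc = locate_block(raw)
print(loc)
```

Under those assumptions, block 133 of t_jfxx would start 1089536 bytes into the file base/16402/16982.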
2. XLogRecordDataHeaderShort
Next comes XLogRecordDataHeaderShort/Long. Because the main data is smaller than 256B, the XLogRecordDataHeaderShort structure is used.
uint8 id
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000042 -s 100 -n 1
00000064 ff |.|
00000065
The value is 0xFF, i.e. XLR_BLOCK_ID_DATA_SHORT (255).
uint8 data_length
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000042 -s 101 -n 1
00000065 03 |.|
00000066
The value is 0x03, i.e. 3 bytes. This is the size of main data, and 3 bytes is the size of the xl_heap_insert struct.
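Reading the data header back is the mirror image of choosing its format. The sketch below decodes the two bytes seen at offsets 100 and 101 (ff 03) and also handles the Long variant; the id value 254 and the 4-byte length of the Long format are assumptions based on PostgreSQL's xlogrecord.h, and parse_data_header is a hypothetical name.

```python
import struct

XLR_BLOCK_ID_DATA_SHORT = 255
XLR_BLOCK_ID_DATA_LONG = 254  # assumed value for the long format


def parse_data_header(raw: bytes) -> tuple:
    """Return (main_data_length, header_size) for a main-data header."""
    header_id = raw[0]
    if header_id == XLR_BLOCK_ID_DATA_SHORT:
        # 1-byte id followed by a 1-byte length
        return raw[1], 2
    if header_id == XLR_BLOCK_ID_DATA_LONG:
        # 1-byte id followed by an unaligned 4-byte little-endian length
        (length,) = struct.unpack_from("<I", raw, 1)
        return length, 5
    raise ValueError(f"not a main-data header id: {header_id:#x}")


# Bytes copied from the hexdump output at offsets 100..101.
main_len, header_size = parse_data_header(bytes.fromhex("ff03"))
print(main_len, header_size)
```

For this record the result is a main data length of 3 carried in a 2-byte header.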
3. block data
block data follows XLogRecordDataHeaderShort and consists of two parts:
1.xl_heap_header
2.Tuple data
xl_heap_header
1.uint16 t_infomask2
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000042 -s 102 -n 2
00000066 03 00 |..|
00000068
t_infomask2 is 0x0003, binary 00000000 00000011.
2.uint16 t_infomask
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000042 -s 104 -n 2
00000068 02 08 |..|
0000006a
t_infomask is 0x0802, binary 00001000 00000010.
3.uint8 t_hoff
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000042 -s 106 -n 1
0000006a 18 |.|
0000006b
t_hoff (the offset) is 0x18, which is 24 in decimal.
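The raw bit patterns of t_infomask2 and t_infomask are easier to read once they are mapped to names. The sketch below decodes the five xl_heap_header bytes seen at offsets 102 to 106 (03 00 02 08 18), assuming little-endian layout. The attribute-count mask 0x07FF and the flag values come from PostgreSQL's htup_details.h as I understand it, and only a subset of the flags is listed; parse_heap_header and INFOMASK_FLAGS are hypothetical names.

```python
import struct

HEAP_NATTS_MASK = 0x07FF  # low 11 bits of t_infomask2: number of attributes
INFOMASK_FLAGS = {        # subset of t_infomask bits, for illustration
    0x0001: "HEAP_HASNULL",
    0x0002: "HEAP_HASVARWIDTH",
    0x0100: "HEAP_XMIN_COMMITTED",
    0x0200: "HEAP_XMIN_INVALID",
    0x0400: "HEAP_XMAX_COMMITTED",
    0x0800: "HEAP_XMAX_INVALID",
}


def parse_heap_header(raw: bytes) -> dict:
    """Decode the 5-byte xl_heap_header (t_infomask2, t_infomask, t_hoff)."""
    infomask2, infomask, hoff = struct.unpack("<HHB", raw[:5])
    return {
        "natts": infomask2 & HEAP_NATTS_MASK,
        "flags": [n for bit, n in INFOMASK_FLAGS.items() if infomask & bit],
        "t_hoff": hoff,
    }


# Bytes copied from the hexdump output at offsets 102..106.
heap_header = parse_heap_header(bytes.fromhex("0300020818"))
print(heap_header)
```

Read this way, the header describes a tuple with 3 attributes, at least one variable-width column (HEAP_HASVARWIDTH), an invalid xmax (HEAP_XMAX_INVALID), and user data starting 24 bytes into the tuple.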
Tuple data
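The XLOG Record is 0x4F, i.e. 79B, in size. Subtracting the headers and main data, XLogRecord(24B) + XLogRecordBlockHeader(20B, including RelFileNode and BlockNumber) + XLogRecordDataHeaderShort(2B) + xl_heap_header(5B) + main data(3B) = 54B, leaves 25B for the tuple data.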
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000042 -s 107 -n 25
0000006b 00 0d 32 30 39 31 39 0f 32 30 31 33 30 37 00 00 |..20919.201307..|
0000007b 00 00 00 00 00 00 03 b3 40 |........@|
00000084
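The size bookkeeping behind the -s 107 -n 25 arguments can be checked with a short calculation. The sizes are the ones stated in the text; the only additional input is that the block header starts at file offset 80, which is where the first hexdump command in this section reads. The variable names are illustrative.

```python
RECORD_TOTAL = 0x4F  # total record length, 79 bytes

# Sizes of every part of the record except the tuple data.
parts = {
    "XLogRecord": 24,
    "XLogRecordBlockHeader": 20,  # 4B header + 12B RelFileNode + 4B BlockNumber
    "XLogRecordDataHeaderShort": 2,
    "xl_heap_header": 5,
    "main data": 3,
}

tuple_len = RECORD_TOTAL - sum(parts.values())

# The block header starts at offset 80 (first hexdump command above).
block_header_start = 80
tuple_start = (block_header_start
               + parts["XLogRecordBlockHeader"]
               + parts["XLogRecordDataHeaderShort"]
               + parts["xl_heap_header"])
main_data_start = tuple_start + tuple_len

print(tuple_len, tuple_start, main_data_start)
```

The result (25 bytes of tuple data starting at offset 107, main data starting at offset 132) agrees with the offsets used in the hexdump commands.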
4. main data
This is the xl_heap_insert struct.
uint16 OffsetNumber
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000042 -s 132 -n 2
00000084 26 00 |&.|
00000086
The offset of the inserted tuple is 0x0026, which is 38 in decimal.
uint8 flags
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000042 -s 134 -n 1
00000086 00 |.|
00000087
The flags value is 0x00.
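The three main data bytes at offsets 132 to 134 (26 00 00) map directly onto the two fields of xl_heap_insert. The sketch below assumes little-endian layout and uses the hypothetical name parse_heap_insert; it also checks that the record ends where the 79-byte total says it should, given that main data starts at offset 132 and the XLogRecord header is 24 bytes long and immediately precedes the block header at offset 80.

```python
import struct


def parse_heap_insert(raw: bytes) -> dict:
    """Decode xl_heap_insert: uint16 offnum followed by uint8 flags."""
    offnum, flags = struct.unpack("<HB", raw[:3])
    return {"offnum": offnum, "flags": flags}


# Bytes copied from the hexdump output at offsets 132..134.
insert = parse_heap_insert(bytes.fromhex("260000"))
print(insert)

# Consistency check on the record boundaries.
record_start = 80 - 24   # XLogRecord header sits right before offset 80
record_end = 132 + 3     # main data is the last part of the record
print(record_end - record_start)
```

The decoded offnum is 38 with flags 0, and the distance between the derived record start (56) and end (135) is 79 bytes, matching the record length of 0x4F.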
II. References
WAL Internals Of PostgreSQL
PostgreSQL Source Code Reading (109) - WAL#5 (related data structures)
PostgreSQL DBA(16) - Internal structure of the WAL segment file
A summary of how much space structs occupy