Elasticsearch Rally TSDB Data Storage Analysis

This article analyzes how the tsdb dataset from Elasticsearch Rally is stored on disk. It reports the storage size of each field in the dataset and how well each field compresses.

Data Description

Elasticsearch rally tsdb data: https://github.com/elastic/rally-tracks/tree/master/tsdb

Test Description

Test cases

case1: configure time_series

case2: enable ali_codec docvalues compression

case3: enable ali_codec docvalues compression + do not store _source

ali_codec: a codec plugin developed by Alibaba for compressing Lucene data. In these test cases, it compresses the docvalues column data with zstd.

do not store _source: we modified the Elasticsearch server code so that it can skip storing _source, _id, and _seq_no in Lucene. In the case3 results below, _id and _seq_no are still present, so only _source was dropped in this test.
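For readers unfamiliar with case1, the following is a minimal sketch of what an index body in time_series mode can look like. The field names `pod_uid` and `cpu_usage` are hypothetical and chosen only for illustration; the real tsdb track defines its own mappings, and the settings used for ali_codec are not shown because the article does not list them.

```python
import json

# Hypothetical field names (pod_uid, cpu_usage), for illustration only.
# The real tsdb track ships its own, much larger mapping.
index_body = {
    "settings": {
        "index": {
            # Switches the index to time series mode.
            "mode": "time_series",
            # Dimension fields used to route documents to shards.
            "routing_path": ["pod_uid"],
        }
    },
    "mappings": {
        "properties": {
            "@timestamp": {"type": "date"},
            # A tag (dimension) field.
            "pod_uid": {"type": "keyword", "time_series_dimension": True},
            # A metric field.
            "cpu_usage": {"type": "double", "time_series_metric": "gauge"},
        }
    },
}

print(json.dumps(index_body, indent=2))
```

The printed JSON is the kind of body that would be sent when creating the index; the exact set of dimensions and metrics depends on the dataset.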

Test results

Time series data comprises the tag fields, the metric fields, and the @timestamp and _tsid fields.

Metadata includes fields such as _id, _source, and _seq_no.
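To make the split concrete, here is a small sketch that groups the percent column of the case1 report (see the test details) into time series data and metadata. The grouping rule is an assumption based on the two definitions above: "metrics and tags", @timestamp, and _tsid count as time series data, and every other row counts as metadata.

```python
# Percent column of the case1 report, copied from the test details.
CASE1_PERCENT = {
    "_id and _source": 61.41,
    "metrics and tags": 28.13,
    "_seq_no": 5.60,
    "@timestamp": 4.66,
    "_tsid": 0.20,
    "_field_names": 0.00,
    "_version": 0.00,
    "_primary_term": 0.00,
}

# Assumption: these three rows are "time series data"; all others are metadata.
TSDB_GROUPS = {"metrics and tags", "@timestamp", "_tsid"}


def split_percent(percent_by_group):
    """Return (time series share, metadata share) in percent."""
    tsdb = sum(p for g, p in percent_by_group.items() if g in TSDB_GROUPS)
    other = sum(p for g, p in percent_by_group.items() if g not in TSDB_GROUPS)
    return tsdb, other


tsdb_pct, other_pct = split_percent(CASE1_PERCENT)
print(f"time series data: {tsdb_pct:.2f}%, metadata: {other_pct:.2f}%")
```

The result is roughly 32.99% and 67.01%, which agrees with the 32.98% and 67.02% in the case1 segment line up to rounding of the per-row percentages.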

Field details:

image.png

Test details

case1: Configure time_series

read segment:_3tn, count=122613113
store summery: size=9.8gb, fieldCount=2, fileds=_source,_id
docvalue summery: size=3.4gb, fieldCount=176
points summery: size=2.6gb, fieldCount=96
inverted summery:  size=912mb, postings size=604.7mb fieldCount=82
segment totalSize=17.4gb, tsdbSize=5.7gb, percent=32.98%, otherSize=11.6gb, percent=67.02%, summery:
              fields     size  percent  desc
     _id and _source   10.6gb   61.41%  [docvalues:      0b] [points:      0b] [terms:   901mb] [posting:      0b] [summery:   9.8gb]
    metrics and tags    4.8gb   28.13%  [docvalues:   2.6gb] [points:   1.6gb] [terms:  10.9mb] [posting:   604mb] [summery:      0b]
             _seq_no  998.2mb    5.60%  [docvalues: 409.2mb] [points:   589mb] [terms:      0b] [posting:      0b] [summery:      0b]
          @timestamp  831.2mb    4.66%  [docvalues: 409.2mb] [points:   422mb] [terms:      0b] [posting:      0b] [summery:      0b]
               _tsid   34.8mb    0.20%  [docvalues:  34.8mb] [points:      0b] [terms:      0b] [posting:      0b] [summery:      0b]
        _field_names    764kb    0.00%  [docvalues:      0b] [points:      0b] [terms:     15b] [posting:   764kb] [summery:      0b]
            _version       0b    0.00%  [docvalues:      0b] [points:      0b] [terms:      0b] [posting:      0b] [summery:      0b]
       _primary_term       0b    0.00%  [docvalues:      0b] [points:      0b] [terms:      0b] [posting:      0b] [summery:      0b]
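The per-field rows above follow a regular layout, so they can be parsed for further analysis. The sketch below is one possible parser, not the tool that produced the report. It assumes that sizes use binary units (1kb = 1024b), which is an assumption about the report and not something the output states.

```python
import re

# Assumption: the report uses binary units (1kb = 1024 bytes).
UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3}
SIZE = r"[\d.]+(?:gb|mb|kb|b)"

# One row: field group, total size, percent, then bracketed per-structure sizes.
ROW = re.compile(
    rf"^\s*(?P<name>.+?)\s+(?P<size>{SIZE})\s+(?P<pct>[\d.]+)%\s+(?P<desc>.*)$"
)
PART = re.compile(rf"\[(\w+):\s*({SIZE})\]")


def to_bytes(text):
    """Convert a size string such as '409.2mb' to a number of bytes."""
    m = re.fullmatch(r"([\d.]+)(gb|mb|kb|b)", text)
    if m is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(m.group(1)) * UNITS[m.group(2)]


def parse_row(line):
    """Parse one per-field row of the report into a dict."""
    m = ROW.match(line)
    if m is None:
        raise ValueError(f"not a report row: {line!r}")
    parts = {key: to_bytes(val) for key, val in PART.findall(m.group("desc"))}
    return {
        "name": m.group("name"),
        "bytes": to_bytes(m.group("size")),
        "percent": float(m.group("pct")),
        "parts": parts,
    }


row = parse_row(
    "             _seq_no  998.2mb    5.60%  [docvalues: 409.2mb] "
    "[points:   589mb] [terms:      0b] [posting:      0b] [summery:      0b]"
)
print(row["name"], row["percent"], sorted(row["parts"]))
```

For the _seq_no row, the docvalues and points parts (409.2mb + 589mb) add up to the reported 998.2mb, which is a quick consistency check on the parsed values.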

case2: enable ali_codec docvalues compression

read segment:_3uh, count=122613113
store summery: size=7.5gb, fieldCount=2, fileds=_source,_id
docvalue summery: size=1.4gb, fieldCount=176
points summery: size=2.6gb, fieldCount=96
inverted summery:  size=911.4mb, postings size=604.7mb fieldCount=82
segment totalSize=13.1gb, tsdbSize=3.7gb, percent=28.31%, otherSize=9.4gb, percent=71.69%, summery:
              fields     size  percent  desc
     _id and _source    8.4gb   63.79%  [docvalues:      0b] [points:      0b] [terms: 900.4mb] [posting:      0b] [summery:   7.5gb]
    metrics and tags    3.2gb   24.30%  [docvalues: 936.4mb] [points:   1.6gb] [terms:  10.9mb] [posting:   604mb] [summery:      0b]
             _seq_no      1gb    7.89%  [docvalues: 475.2mb] [points:   589mb] [terms:      0b] [posting:      0b] [summery:      0b]
          @timestamp  538.9mb    3.99%  [docvalues: 116.9mb] [points:   422mb] [terms:      0b] [posting:      0b] [summery:      0b]
               _tsid    1.9mb    0.01%  [docvalues:   1.9mb] [points:      0b] [terms:      0b] [posting:      0b] [summery:      0b]
        _field_names    764kb    0.01%  [docvalues:      0b] [points:      0b] [terms:     15b] [posting:   764kb] [summery:      0b]
            _version       8b    0.00%  [docvalues:      8b] [points:      0b] [terms:      0b] [posting:      0b] [summery:      0b]
       _primary_term       8b    0.00%  [docvalues:      8b] [points:      0b] [terms:      0b] [posting:      0b] [summery:      0b]
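The effect of the docvalues compression is easiest to see per field group. The sketch below compares the docvalues sizes reported for case1 and case2. The numbers are copied from the two reports; the 2.6gb figure is converted as 2.6 * 1024 MB, so its result inherits the rounding of the report.

```python
# Docvalues sizes in MB as (case1, case2), copied from the two reports.
# 2.6gb is converted as 2.6 * 1024 MB and is therefore only approximate.
DOCVALUES_MB = {
    "metrics and tags": (2.6 * 1024, 936.4),
    "_seq_no": (409.2, 475.2),
    "@timestamp": (409.2, 116.9),
    "_tsid": (34.8, 1.9),
}


def percent_change(before, after):
    """Relative change in percent; negative means the data got smaller."""
    return (after - before) / before * 100.0


changes = {
    group: percent_change(before, after)
    for group, (before, after) in DOCVALUES_MB.items()
}
for group, change in changes.items():
    print(f"{group:>18}: {change:+.1f}%")
```

With these inputs, the docvalues of the metrics and tags, @timestamp, and _tsid groups shrink substantially, while the _seq_no docvalues grow slightly (409.2mb to 475.2mb).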

case3: enable ali_codec docvalues compression + do not store _source

read segment:_3ot, count=122613113
store summery: size=971.6mb, fieldCount=1, fileds=_id
docvalue summery: size=1.4gb, fieldCount=176
points summery: size=2.6gb, fieldCount=96
inverted summery:  size=916.5mb, postings size=604.7mb fieldCount=82
segment totalSize=6.6gb, tsdbSize=3.7gb, percent=56.49%, otherSize=2.8gb, percent=43.51%, summery:
              fields     size  percent  desc
    metrics and tags    3.2gb   48.49%  [docvalues: 936.4mb] [points:   1.6gb] [terms:  10.9mb] [posting:   604mb] [summery:      0b]
     _id and _source    1.8gb   27.76%  [docvalues:      0b] [points:      0b] [terms: 905.6mb] [posting:      0b] [summery: 971.6mb]
             _seq_no      1gb   15.73%  [docvalues: 474.8mb] [points:   589mb] [terms:      0b] [posting:      0b] [summery:      0b]
          @timestamp  538.9mb    7.97%  [docvalues: 116.9mb] [points:   422mb] [terms:      0b] [posting:      0b] [summery:      0b]
               _tsid    1.9mb    0.03%  [docvalues:   1.9mb] [points:      0b] [terms:      0b] [posting:      0b] [summery:      0b]
        _field_names    764kb    0.01%  [docvalues:      0b] [points:      0b] [terms:     15b] [posting:   764kb] [summery:      0b]
            _version       8b    0.00%  [docvalues:      8b] [points:      0b] [terms:      0b] [posting:      0b] [summery:      0b]
       _primary_term       8b    0.00%  [docvalues:      8b] [points:      0b] [terms:      0b] [posting:      0b] [summery:      0b]
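Putting the three cases side by side, the following sketch computes how much smaller each segment is than the case1 baseline. It uses only the totalSize values from the three segment lines above, so the percentages are limited by the one-decimal rounding of those values.

```python
# Segment totalSize in GB, copied from the three segment lines.
TOTAL_GB = {"case1": 17.4, "case2": 13.1, "case3": 6.6}


def reduction_vs_baseline(totals, baseline="case1"):
    """Percent reduction of each case relative to the baseline case."""
    base = totals[baseline]
    return {name: (base - size) / base * 100.0 for name, size in totals.items()}


reductions = reduction_vs_baseline(TOTAL_GB)
for name, pct in reductions.items():
    print(f"{name}: {TOTAL_GB[name]}gb, {pct:.1f}% smaller than case1")
```

By this calculation, case2 is about 24.7% smaller than case1 and case3 is about 62.1% smaller.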