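Cassandra query failures caused by tombstones exceeding the threshold
Today, while importing data with JanusGraph, I hit the exception below. It means a Cassandra read timed out: the consistency level required a response from at least one replica, but none responded.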
Caused by: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency QUORUM (1 responses were required but only 0 replica responded)
at com.datastax.driver.core.exceptions.ReadTimeoutException.copy(ReadTimeoutException.java:88)
at com.datastax.driver.core.exceptions.ReadTimeoutException.copy(ReadTimeoutException.java:25)
at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:68)
at org.janusgraph.diskstorage.cql.CQLKeyColumnValueStore.lambda$getKeys$73f59b6e$1(CQLKeyColumnValueStore.java:403)
at io.vavr.control.Try.of(Try.java:62)
at org.janusgraph.diskstorage.cql.CQLKeyColumnValueStore.getKeys(CQLKeyColumnValueStore.java:400)
... 37 common frames omitted
The Cassandra log then showed that the query failed because the number of tombstones scanned exceeded the threshold:
ERROR [ReadStage-2] 2019-08-01 11:10:00,143 StorageProxy.java:1896 - Scanned over 100001 tombstones during query 'SELECT * FROM janusgraph.edgestore WHERE column1 >= 02 AND column1 <= 03 LIMIT 100' (last scanned row partion key was ((2800000000057880), 02)); query aborted
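To find every query that was aborted this way, the log can be scanned for this message. The following is a minimal sketch: the function name find_tombstone_aborts is hypothetical, and the pattern is modeled only on the single log line quoted above, so other Cassandra versions may word the message differently.

```python
import re

# Pattern modeled on the one log line quoted above (assumption: other
# Cassandra versions may format this message differently).
TOMBSTONE_ABORT = re.compile(
    r"Scanned over (?P<count>\d+) tombstones during query "
    r"'(?P<query>.+?)' \(last scanned"
)


def find_tombstone_aborts(log_lines):
    """Return a (tombstone_count, query) tuple for each aborted-read line."""
    hits = []
    for line in log_lines:
        match = TOMBSTONE_ABORT.search(line)
        if match:
            hits.append((int(match.group("count")), match.group("query")))
    return hits
```

Passing the lines of the Cassandra log to find_tombstone_aborts yields the tombstone count and the CQL text of each aborted query, which shows which tables are affected.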
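I then read this blog post to understand how deletion works in Cassandra: https://www.jianshu.com/p/8590676a9b41
In short, to keep data consistent across the cluster, Cassandra implements a delete as a write: it inserts a marker called a tombstone. The other nodes use the tombstone to confirm that the data has been deleted, which prevents zombie data from reappearing. How long a tombstone is kept before it can be removed is controlled by gc_grace_seconds, which defaults to 864000 seconds (10 days). Because my import tests deleted data frequently, tombstones piled up faster than they were garbage-collected, and the query failed. Both the schema of a JanusGraph table and a query against system_schema.tables show this default: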
CREATE TABLE janusgraph.system_properties (
key blob,
column1 blob,
value blob,
PRIMARY KEY (key, column1)
) WITH CLUSTERING ORDER BY (column1 ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
cqlsh:janusgraph> SELECT table_name,gc_grace_seconds FROM system_schema.tables WHERE keyspace_name='janusgraph';
table_name | gc_grace_seconds
-------------------------+------------------
edgestore | 864000
edgestore_lock_ | 864000
graphindex | 864000
graphindex_lock_ | 864000
janusgraph_ids | 864000
system_properties | 864000
system_properties_lock_ | 864000
systemlog | 864000
txlog | 864000
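To make the role of gc_grace_seconds concrete, here is a toy in-memory model of the deletion mechanism summarized above. It is only a sketch, not Cassandra's implementation: ToyTable, TombstoneScanError, and all method names are hypothetical, and failure_threshold stands in for the limit that the log message refers to (assumed here to be 100000, consistent with the "Scanned over 100001 tombstones" message).

```python
class TombstoneScanError(Exception):
    """Raised when a read scans more tombstones than the failure threshold."""


class ToyTable:
    """Toy model of tombstones; not how Cassandra actually stores data."""

    def __init__(self, gc_grace_seconds=864000, failure_threshold=100000):
        self.gc_grace_seconds = gc_grace_seconds
        self.failure_threshold = failure_threshold
        # key -> ("live", value) or ("tombstone", deletion_time)
        self.cells = {}

    def insert(self, key, value):
        self.cells[key] = ("live", value)

    def delete(self, key, now):
        # A delete is itself a write: it stores a tombstone marker.
        self.cells[key] = ("tombstone", now)

    def compact(self, now):
        # Only tombstones older than gc_grace_seconds are purged.
        self.cells = {
            key: cell
            for key, cell in self.cells.items()
            if not (cell[0] == "tombstone"
                    and now - cell[1] > self.gc_grace_seconds)
        }

    def scan(self, limit):
        # Tombstones must be read and skipped before live rows are found.
        rows, tombstones = [], 0
        for key in sorted(self.cells):
            kind, payload = self.cells[key]
            if kind == "tombstone":
                tombstones += 1
                if tombstones > self.failure_threshold:
                    raise TombstoneScanError(
                        f"Scanned over {tombstones} tombstones; query aborted"
                    )
                continue
            rows.append((key, payload))
            if len(rows) == limit:
                break
        return rows
```

In this model, a table with many recently deleted keys fails a scan with even a small limit, because the tombstones are counted before any live row is reached. Compaction does not help while the tombstones are younger than gc_grace_seconds; once gc_grace_seconds is 0, the next compaction purges them and the same scan succeeds.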
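I am still in the testing phase, so I set gc_grace_seconds on edgestore to 0 so that tombstones could be garbage-collected, and the test succeeded when I reran it.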
alter table janusgraph.edgestore with gc_grace_seconds=0;
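If the change is meant to be temporary, as the testing context suggests, the statements for lowering the value and later restoring it can be generated from one small helper. This is a sketch under assumptions: alter_gc_grace is a hypothetical helper that only builds CQL text, and the restore value is the 864000 shown in the schema above.

```python
DEFAULT_GC_GRACE_SECONDS = 864000  # value shown in the schema above


def alter_gc_grace(keyspace, table, seconds):
    """Build a CQL statement that sets gc_grace_seconds on one table."""
    if seconds < 0:
        raise ValueError("gc_grace_seconds must be non-negative")
    return f"ALTER TABLE {keyspace}.{table} WITH gc_grace_seconds = {seconds};"


# Statement used for the test run, then the one that restores the default.
print(alter_gc_grace("janusgraph", "edgestore", 0))
print(alter_gc_grace("janusgraph", "edgestore", DEFAULT_GC_GRACE_SECONDS))
```

The printed statements can be pasted into cqlsh; the helper does not connect to Cassandra itself.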