Data要求:scale 分布式数据,
Consistency 预定酒店电影票,必须要一致性,不能我们俩同时订了一个位置
Fast, Complex Querying: Graph Search, store all node and entity and relation edge
大数据,而且经常被查询使用,数据结构会变化,需要压缩,
high availability, consistency, distribution! key things
You can only choose two of three 三种特性选择两种最重要的,然后选择适合的数据库
RDMBS:主要是一致性,也有一定的availability, partition,分布式数据库
MongoDB:一致性,
Cassandra:拥有怎么样的level of consistency将会讨论
比如: flights: availability能够快速响应,partition因为全球有好多好多flight数据
但是booking flights:必须是consistency,一个人在定时候,别人不能同时定他
partition tolerance: data has to be distributed,
user key get the value: faster, hashmap, store any object with a key cannot do partition.
big table clones: key-value extension, read and write partition data, Big Table Google
Document Database: do not know keys. city, state, age. the huge number of data. 写入的时候也是需要keys, 例如:存入person table, key: SSN, city state age all others column. Json object. 可以用state age Index, 查一查谁是20-25岁的,在IN的,不需要Key来查询,可以直接用index来得到
Graph Database: nearby restaurant. Huge number of edges and relationships.
以上是四种数据库
Find flight, book filght:
RDBMS all normalized data - > web application
Actual data is still in Database.要做几百个SQL才能得到数据, Heavy join, aggregation
加一个Caching Layer,有很多工具,但是也有很多challenge, syncronize all data
改进:
每天很多很多查询操作,不能很容易就down
Cassandra: DB application,analytical platform