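Design Goals and Assumptions
1. Hardware Failure
2. Streaming Data Access (suited to batch processing, not to interactive use)
3. Large Data Sets
4. Simple Coherency Model (files are write-once, read-many; written data is not modified in place)
5. Moving Computation is Cheaper than Moving Data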
Overall Architecture
Basic Components
1. NameNode: maintains the file system namespace and records every change to the namespace or its properties. An application can specify how many replicas of a file HDFS should maintain; this number is called the replication factor of that file.
2. DataNode: stores the actual data. Files are written once and read many times; written data is not modified in place.
3. Client: the client through which applications access the file system.
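The replication factor mentioned under NameNode can be inspected and changed per file from the command line. The following is a minimal sketch, assuming a running cluster and a hypothetical file /data/example.txt. The `run` helper is an illustrative wrapper, not part of Hadoop: it only prints each command unless DRY_RUN=0 is exported.

```shell
#!/usr/bin/env bash
# Illustrative wrapper, not part of Hadoop: prints the command by default,
# and executes it only when DRY_RUN=0 is exported.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

FILE=/data/example.txt   # hypothetical path

# Set the replication factor to 3 and wait (-w) until replication completes.
run hdfs dfs -setrep -w 3 "$FILE"

# Print the file's current replication factor (%r).
run hdfs dfs -stat %r "$FILE"
```

Run with DRY_RUN=0 on a machine that has the `hdfs` client configured to apply the change for real.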
Storage Policies
Related Shell Commands
HDFS Quotas Guide
Name Quotas
hdfs dfsadmin -setQuota <N> <directory>...<directory>
hdfs dfsadmin -clrQuota <directory>...<directory>
Space Quotas
hdfs dfsadmin -setSpaceQuota <N> <directory>...<directory>
hdfs dfsadmin -clrSpaceQuota <directory>...<directory>
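Space quotas are charged per replica, so a file consumes its size multiplied by its replication factor. As a rough illustration of that arithmetic (the function name `quota_charge` and the sample sizes are hypothetical), it can be sketched as:

```shell
#!/usr/bin/env bash
# Hypothetical helper: bytes charged against a space quota for one file,
# assuming every replica counts in full (size x replication factor).
quota_charge() {
  size_bytes=$1
  replication=$2
  echo $(( size_bytes * replication ))
}

ONE_GB=$(( 1024 * 1024 * 1024 ))

# A 1 GB file with replication factor 3 is charged 3 GB of quota.
quota_charge "$ONE_GB" 3   # prints 3221225472
```

When choosing N for a directory, it therefore helps to budget for the replicated size rather than the logical file size.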
Storage Type Quotas
hdfs dfsadmin -setSpaceQuota <N> -storageType <storagetype> <directory>...<directory>
hdfs dfsadmin -clrSpaceQuota -storageType <storagetype> <directory>...<directory>
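The three quota types can be combined on one directory. The sketch below walks through setting, inspecting, and clearing them, assuming a hypothetical directory /user/alice, illustrative limits, and a cluster that has SSD storage configured. The `run` helper is an illustrative wrapper, not part of Hadoop: it prints each command unless DRY_RUN=0 is exported.

```shell
#!/usr/bin/env bash
# Illustrative wrapper, not part of Hadoop: prints each command by default,
# and executes it only when DRY_RUN=0 is exported.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

DIR=/user/alice   # hypothetical directory

# Name quota: at most 10000 file and directory names under $DIR.
run hdfs dfsadmin -setQuota 10000 "$DIR"

# Space quota: at most 30 GB of raw space (replicas included) under $DIR.
run hdfs dfsadmin -setSpaceQuota 30g "$DIR"

# Storage type quota: at most 5 GB of SSD space may be used under $DIR.
run hdfs dfsadmin -setSpaceQuota 5g -storageType SSD "$DIR"

# Report the quotas and the remaining headroom in human-readable units.
run hdfs dfs -count -q -h "$DIR"

# Remove the quotas again.
run hdfs dfsadmin -clrQuota "$DIR"
run hdfs dfsadmin -clrSpaceQuota "$DIR"
run hdfs dfsadmin -clrSpaceQuota -storageType SSD "$DIR"
```

The dfsadmin subcommands require administrator privileges, so a real run (DRY_RUN=0) would normally be done as the HDFS superuser.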