Deploying a CDH Pseudo-Distributed Hadoop Cluster on CentOS 7 / Red Hat 7

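Cloudera provides a scalable, flexible, integrated platform that makes it easy to manage the rapidly growing volume and variety of data in an enterprise. Cloudera products and solutions let you deploy and manage Apache Hadoop and related projects, manipulate and analyze data, and keep that data secure and protected.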

Prerequisites
One CentOS 7.x host

Target
Deploy a CDH pseudo-distributed Hadoop cluster

Version after deployment

[root@localhost ~]# hadoop version
Hadoop 2.6.0-cdh5.13.1
Subversion http://github.com/cloudera/hadoop -r 0061e3eb8ab164e415630bca11d299a7c2ec74fd
Compiled by jenkins on 2017-11-09T16:34Z
Compiled with protoc 2.5.0
From source with checksum 16d5272b34af2d8a4b4b7ee8f7c4cbe
This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.13.1.jar

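I came across the pseudo-distributed installation tutorial on the official CDH site by chance.
These are my notes from following it.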


Deployment
(The examples below use CentOS 7.x.)

1.Java environment

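#Download JDK 8u161 from oracle.com
#(-O fixes the local file name; otherwise the query string ends up in it)
$ wget -O jdk-8u161-linux-x64.rpm 'http://download.oracle.com/otn-pub/java/jdk/8u161-b12/2f38c3b165be4555a1fa6e98c45e0808/jdk-8u161-linux-x64.rpm?AuthParam=1516458261_e7574995a6546eeecbe0e4e901bc61a8'

#The URL above may stop working once its session token (AuthParam) expires;
#if so, download the RPM again from the official site.
$ rpm -ivh jdk-8u161-linux-x64.rpm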

Set JAVA_HOME

$ vim ~/.bashrc
#Add the JAVA_HOME
export JAVA_HOME=/usr/java/jdk1.8.0_161
#Save and exit
$ source ~/.bashrc
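
A quick sanity check after sourcing ~/.bashrc can catch a mistyped path early. The sketch below is not part of the Cloudera guide: check_java_home is a hypothetical helper name, and it only verifies that the given directory contains an executable bin/java.

```shell
# Hypothetical helper (not shipped with CDH): succeed only if the given
# directory looks like a Java home, i.e. it contains an executable bin/java.
check_java_home() {
    if [ -z "$1" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$1/bin/java" ]; then
        echo "no executable java under $1/bin" >&2
        return 1
    fi
    echo "JAVA_HOME looks valid: $1"
}

# Usage, after `source ~/.bashrc`:
#   check_java_home "$JAVA_HOME" && "$JAVA_HOME/bin/java" -version
```

If the check fails, compare the exported path with the directory the JDK RPM actually created under /usr/java.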

2.Download the CDH 5 Package

$ wget http://archive.cloudera.com/cdh5/one-click-install/redhat/6/x86_64/cloudera-cdh-5-0.x86_64.rpm

$ yum --nogpgcheck localinstall cloudera-cdh-5-0.x86_64.rpm
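#Alternatively, add a CDH 5 yum repository or build your own; see Cloudera's CDH 5 installation guide for instructions.
#Note: the RPM above comes from the redhat/6 path, while the key imported in the next step comes from the redhat/7 path.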

3.Install CDH 5

#Add a repository key
$ rpm --import http://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/RPM-GPG-KEY-cloudera
#Install Hadoop with YARN in pseudo-distributed mode
$ yum install hadoop-conf-pseudo -y

4.Starting Hadoop

List where the package places its configuration files by default:

[root@localhost ~]# rpm -ql hadoop-conf-pseudo
/etc/hadoop/conf.pseudo
/etc/hadoop/conf.pseudo/README
/etc/hadoop/conf.pseudo/core-site.xml
/etc/hadoop/conf.pseudo/hadoop-env.sh
/etc/hadoop/conf.pseudo/hadoop-metrics.properties
/etc/hadoop/conf.pseudo/hdfs-site.xml
/etc/hadoop/conf.pseudo/log4j.properties
/etc/hadoop/conf.pseudo/mapred-site.xml
/etc/hadoop/conf.pseudo/yarn-site.xml

No changes are needed; proceed with the deployment.

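Step 1: Format the NameNode with hdfs namenode -format

Note: Cloudera's guide runs this command as the hdfs user (sudo -u hdfs hdfs namenode -format). The transcript below ran it as root, so the storage directory appears under cache/root.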

[root@localhost ~]# hdfs namenode -format
18/01/21 00:13:39 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   user = root
..........................................
18/01/21 00:13:41 INFO common.Storage: Storage directory /var/lib/hadoop-hdfs/cache/root/dfs/name has been successfully formatted.
...........................................
18/01/21 00:13:41 INFO util.ExitUtil: Exiting with status 0
18/01/21 00:13:41 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost/127.0.0.1
************************************************************/

Step 2: Start HDFS
for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do sudo service $x start ; done

[root@localhost ~]# for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do sudo service $x start ; done
starting datanode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-datanode-localhost.out
Started Hadoop datanode (hadoop-hdfs-datanode):            [  OK  ]
starting namenode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-namenode-localhost.out
Started Hadoop namenode:                                   [  OK  ]
starting secondarynamenode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-secondarynamenode-localhost.out
Started Hadoop secondarynamenode:                          [  OK  ]
#To confirm that the services have started, run jps or open the NameNode web UI at http://localhost:50070
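
The loop above parses the output of ls inside backticks. As an alternative sketch, the same list of init scripts can be produced with a plain glob, which also makes it easy to reuse for stopping the services. list_hdfs_services is a hypothetical helper name, and its optional directory argument is an assumption of this sketch (it defaults to /etc/init.d).

```shell
# Hypothetical helper: print the names of the HDFS init scripts found in a
# directory (default /etc/init.d), one per line, without the directory prefix.
list_hdfs_services() {
    svc_dir=${1:-/etc/init.d}
    for script in "$svc_dir"/hadoop-hdfs-*; do
        # If nothing matches, the glob stays literal; skip that entry.
        [ -e "$script" ] || continue
        basename "$script"
    done
}

# Usage:
#   for svc in $(list_hdfs_services); do sudo service "$svc" start; done
#   for svc in $(list_hdfs_services); do sudo service "$svc" stop; done
```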

Step 3: Create the directories needed for Hadoop processes.

Run the init-hdfs.sh script, which creates them in HDFS:
/usr/lib/hadoop/libexec/init-hdfs.sh

[root@localhost ~]#   /usr/lib/hadoop/libexec/init-hdfs.sh
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /tmp'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod -R 1777 /tmp'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /var'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /var/log'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod -R 1775 /var/log'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown yarn:mapred /var/log'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /tmp/hadoop-yarn'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown -R mapred:mapred /tmp/hadoop-yarn'
....................................
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /user/oozie/share/lib/sqoop'
+ ls '/usr/lib/hive/lib/*.jar'
+ ls /usr/lib/hadoop-mapreduce/hadoop-streaming-2.6.0-cdh5.13.1.jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -put /usr/lib/hadoop-mapreduce/hadoop-streaming*.jar /user/oozie/share/lib/mapreduce-streaming'
+ ls /usr/lib/hadoop-mapreduce/hadoop-distcp-2.6.0-cdh5.13.1.jar /usr/lib/hadoop-mapreduce/hadoop-distcp.jar
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -put /usr/lib/hadoop-mapreduce/hadoop-distcp*.jar /user/oozie/share/lib/distcp'
+ ls '/usr/lib/pig/lib/*.jar' '/usr/lib/pig/*.jar'
+ ls '/usr/lib/sqoop/lib/*.jar' '/usr/lib/sqoop/*.jar'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod -R 777 /user/oozie'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown -R oozie /user/oozie'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /var/lib/hadoop-hdfs/cache/mapred/mapred/staging'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod 1777 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown -R mapred /var/lib/hadoop-hdfs/cache/mapred'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /user/spark/applicationHistory'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown spark /user/spark/applicationHistory'

Step 4: Verify the HDFS File Structure:

Check the HDFS directory tree with hadoop fs -ls -R /

[root@localhost ~]#  sudo -u hdfs hadoop fs -ls -R /
drwxrwxrwx   - hdfs  supergroup          0 2018-01-20 16:42 /benchmarks
drwxr-xr-x   - hbase supergroup          0 2018-01-20 16:42 /hbase
drwxrwxrwt   - hdfs  supergroup          0 2018-01-20 16:41 /tmp
drwxrwxrwt   - mapred mapred              0 2018-01-20 16:42 /tmp/hadoop-yarn
drwxrwxrwt   - mapred mapred              0 2018-01-20 16:42 /tmp/hadoop-yarn/staging
drwxrwxrwt   - mapred mapred              0 2018-01-20 16:42 /tmp/hadoop-yarn/staging/history
drwxrwxrwt   - mapred mapred              0 2018-01-20 16:42 /tmp/hadoop-yarn/staging/history/done_intermediate
drwxr-xr-x   - hdfs   supergroup          0 2018-01-20 16:44 /user
drwxr-xr-x   - mapred  supergroup          0 2018-01-20 16:42 /user/history
drwxrwxrwx   - hive    supergroup          0 2018-01-20 16:42 /user/hive
drwxrwxrwx   - hue     supergroup          0 2018-01-20 16:43 /user/hue
drwxrwxrwx   - jenkins supergroup          0 2018-01-20 16:42 /user/jenkins
drwxrwxrwx   - oozie   supergroup          0 2018-01-20 16:43 /user/oozie
................

Step 5: Start YARN

Start the YARN daemons (ResourceManager, NodeManager, and the MapReduce JobHistory Server):

  • service hadoop-yarn-resourcemanager start
  • service hadoop-yarn-nodemanager start
  • service hadoop-mapreduce-historyserver start
[root@localhost ~]# service hadoop-yarn-resourcemanager start
starting resourcemanager, logging to /var/log/hadoop-yarn/yarn-yarn-resourcemanager-localhost.out
Started Hadoop resourcemanager:                            [  OK  ]
[root@localhost ~]# service hadoop-yarn-nodemanager start
starting nodemanager, logging to /var/log/hadoop-yarn/yarn-yarn-nodemanager-localhost.out
Started Hadoop nodemanager:                                [  OK  ]
[root@localhost ~]# service hadoop-mapreduce-historyserver start
starting historyserver, logging to /var/log/hadoop-mapreduce/mapred-mapred-historyserver-localhost.out
STARTUP_MSG:   java = 1.8.0_161
Started Hadoop historyserver:                              [  OK  ]

Use jps to check that the daemons are running.

[root@localhost ~]# jps
5232 ResourceManager
3425 SecondaryNameNode
5906 Jps
5827 JobHistoryServer
3286 NameNode
5574 NodeManager
3162 DataNode
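
Reading the jps list by eye works for a single host, but the check can also be scripted. The following is a minimal sketch: check_daemons is a hypothetical helper name, and the list of expected daemons is simply the six processes shown in the output above.

```shell
# Hypothetical helper: read `jps` output on stdin and report any of the
# daemons expected in this pseudo-distributed setup that are missing.
check_daemons() {
    jps_output=$(cat)
    missing=0
    for daemon in NameNode SecondaryNameNode DataNode ResourceManager NodeManager JobHistoryServer; do
        # Compare the whole second field, so that "NameNode" is not
        # satisfied by a line that says "SecondaryNameNode".
        if ! printf '%s\n' "$jps_output" | awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "missing: $daemon"
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   jps | check_daemons && echo "all daemons running"
```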

Step 6: Create a user directory

[root@localhost ~]# sudo -u hdfs hadoop fs -mkdir /taroballs/
[root@localhost ~]# hadoop fs -ls /
Found 6 items
drwxrwxrwx   - hdfs  supergroup          0 2018-01-20 16:42 /benchmarks
drwxr-xr-x   - hbase supergroup          0 2018-01-20 16:42 /hbase
drwxr-xr-x   - hdfs  supergroup          0 2018-01-20 16:48 /taroballs
drwxrwxrwt   - hdfs  supergroup          0 2018-01-20 16:41 /tmp
drwxr-xr-x   - hdfs  supergroup          0 2018-01-20 16:44 /user
drwxr-xr-x   - hdfs  supergroup          0 2018-01-20 16:44 /var
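
The directory created above sits at the HDFS root and stays owned by hdfs. Cloudera's guide instead creates a home directory under /user for each user and hands ownership to that user. The sketch below prints those two commands rather than running them. hdfs_home_commands is a hypothetical helper name, and the restriction to lowercase letters, digits, underscore and hyphen in user names is an assumption made here to keep the generated commands safe.

```shell
# Hypothetical helper: print the two commands that create an HDFS home
# directory for a user and give that user ownership. Nothing is executed;
# review the output, then pipe it to `sh` to run it.
hdfs_home_commands() {
    case $1 in
        ''|*[!a-z0-9_-]*)
            echo "invalid user name: '$1'" >&2
            return 1
            ;;
    esac
    echo "sudo -u hdfs hadoop fs -mkdir -p /user/$1"
    echo "sudo -u hdfs hadoop fs -chown $1 /user/$1"
}

# Usage:
#   hdfs_home_commands taroballs        # inspect the commands
#   hdfs_home_commands taroballs | sh   # run them
```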

Run a simple example on YARN

#First, as root, create an input directory (the relative path resolves under /user/root)
[root@localhost ~]# hadoop fs -mkdir input
[root@localhost ~]# hadoop fs -ls /user/root/
Found 1 items
drwxr-xr-x   - root supergroup          0 2018-01-20 17:51 /user/root/input
#Then put some files into it
[root@localhost ~]# hadoop fs -put /etc/hadoop/conf/*.xml input/
[root@localhost ~]# hadoop fs -ls input/
Found 4 items
-rw-r--r--   1 root supergroup       2133 2018-01-20 17:54 input/core-site.xml
-rw-r--r--   1 root supergroup       2324 2018-01-20 17:54 input/hdfs-site.xml
-rw-r--r--   1 root supergroup       1549 2018-01-20 17:54 input/mapred-site.xml
-rw-r--r--   1 root supergroup       2375 2018-01-20 17:54 input/yarn-site.xml

Set HADOOP_MAPRED_HOME

#Set HADOOP_MAPRED_HOME
[root@localhost ~]# vim ~/.bashrc
#Add the HADOOP_MAPRED_HOME
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce
#Save and exit
[root@localhost ~]# source ~/.bashrc

Run the Hadoop MapReduce example

hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar grep input outputroot23 'dfs[a-z.]+'

#Run the Hadoop example job
[root@localhost ~]# hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar grep input outputroot23 'dfs[a-z.]+' 
18/01/20 17:55:54 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
18/01/20 17:55:55 WARN mapreduce.JobResourceUploader: No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
18/01/20 17:55:55 INFO input.FileInputFormat: Total input paths to process : 4
18/01/20 17:56:44 INFO mapreduce.Job: Job job_1516438047064_0004 running in uber mode : false
18/01/20 17:56:44 INFO mapreduce.Job:  map 0% reduce 0%
18/01/20 17:56:51 INFO mapreduce.Job:  map 100% reduce 0%
18/01/20 17:56:59 INFO mapreduce.Job:  map 100% reduce 100%
18/01/20 17:56:59 INFO mapreduce.Job: Job job_1516438047064_0004 completed successfully
18/01/20 17:56:59 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=330
        FILE: Number of bytes written=287357
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=599
        HDFS: Number of bytes written=244
        HDFS: Number of read operations=7
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters 
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=4358
        Total time spent by all reduces in occupied slots (ms)=4738
        Total time spent by all map tasks (ms)=4358
        Total time spent by all reduce tasks (ms)=4738
        Total vcore-milliseconds taken by all map tasks=4358
        Total vcore-milliseconds taken by all reduce tasks=4738
        Total megabyte-milliseconds taken by all map tasks=4462592
        Total megabyte-milliseconds taken by all reduce tasks=4851712
    Map-Reduce Framework
        Map input records=10
        Map output records=10
        Map output bytes=304
        Map output materialized bytes=330
        Input split bytes=129
        Combine input records=0
        Combine output records=0
        Reduce input groups=1
        Reduce shuffle bytes=330
        Reduce input records=10
        Reduce output records=10
        Spilled Records=20
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=161
        CPU time spent (ms)=1320
        Physical memory (bytes) snapshot=328933376
        Virtual memory (bytes) snapshot=5055086592
        Total committed heap usage (bytes)=170004480
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters 
        Bytes Read=470
    File Output Format Counters 
        Bytes Written=244
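
MapReduce refuses to start a job whose output directory already exists, so running the same command a second time fails. A minimal wrapper, assuming it is acceptable to delete the previous output, could look like the sketch below. run_grep_example and the HADOOP_CMD variable are hypothetical names introduced for this sketch (HADOOP_CMD is not a Hadoop setting); setting HADOOP_CMD=echo gives a dry run that only prints the arguments.

```shell
# Hypothetical wrapper around the example job. HADOOP_CMD is an assumption of
# this sketch, not a Hadoop variable; set HADOOP_CMD=echo for a dry run.
run_grep_example() {
    in_dir=$1
    out_dir=$2
    hadoop_cmd=${HADOOP_CMD:-hadoop}
    # Remove the previous output directory, if any; -f suppresses the error
    # when the directory does not exist.
    $hadoop_cmd fs -rm -r -f "$out_dir" || return 1
    $hadoop_cmd jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
        grep "$in_dir" "$out_dir" 'dfs[a-z.]+'
}

# Usage:
#   HADOOP_CMD=echo run_grep_example input outputroot23   # dry run
#   run_grep_example input outputroot23                   # real run
```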

Result

[root@localhost ~]# hadoop fs -ls 
Found 2 items
drwxr-xr-x   - root supergroup          0 2018-01-20 17:54 input
drwxr-xr-x   - root supergroup          0 2018-01-20 17:56 outputroot23
[root@localhost ~]# hadoop fs -ls outputroot23
Found 2 items
-rw-r--r--   1 root supergroup          0 2018-01-20 17:56 outputroot23/_SUCCESS
-rw-r--r--   1 root supergroup        244 2018-01-20 17:56 outputroot23/part-r-00000
[root@localhost ~]# hadoop fs -cat outputroot23/part-r-00000
1   dfs.safemode.min.datanodes
1   dfs.safemode.extension
1   dfs.replication
1   dfs.namenode.name.dir
1   dfs.namenode.checkpoint.dir
1   dfs.domain.socket.path
1   dfs.datanode.hdfs
1   dfs.datanode.data.dir
1   dfs.client.read.shortcircuit
1   dfs.client.file
[root@localhost ~]# 
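
Each line of part-r-00000 holds a count followed by the matched string, so the ten lines above agree with the "Reduce output records=10" counter in the job log. A small sketch for summarizing such output follows; summarize_matches is a hypothetical helper name, and it assumes the count is the first whitespace-separated field on every line.

```shell
# Hypothetical helper: read "count<whitespace>match" lines on stdin and print
# how many distinct matches there are and how many occurrences in total.
summarize_matches() {
    awk '{ total += $1; keys++ }
         END { printf "%d distinct matches, %d occurrences\n", keys, total }'
}

# Usage:
#   hadoop fs -cat outputroot23/part-r-00000 | summarize_matches
```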

That's it: the CDH pseudo-distributed Hadoop cluster is up and running. If you spot any mistakes, corrections are welcome.
