Hadoop Pseudo-Distributed Installation and Deployment

Version: hadoop-2.6.0-cdh5.16.2.tar.gz (roughly equivalent to Apache Hadoop 2.9-2.10)

1. Hadoop HDFS installation

1.1 Create the user and directories

[root@hadoop001 ~]# useradd hadoop
[root@hadoop001 ~]# su - hadoop
[hadoop@hadoop001 ~]$ mkdir tmp sourcecode software shell log lib app data
[hadoop@hadoop001 ~]$ cd software/

// install packages uploaded to the server beforehand (e.g. with rz)
[hadoop@hadoop001 software]$ ll
total 1266604
-rw-r--r-- 1 root   root   434354462 Feb 24 14:01 hadoop-2.6.0-cdh5.16.2.tar.gz
-rw-r--r-- 1 hadoop hadoop 185646832 Feb 24 12:03 jdk-8u181-linux-x64.tar.gz

1.2 Install and deploy the JDK

jdk-8u181-linux-x64.tar.gz

[root@hadoop001 ~]# mkdir  /usr/java
[root@hadoop001 ~]# tar -xzvf /home/hadoop/software/jdk-8u181-linux-x64.tar.gz -C /usr/java/
[root@hadoop001 ~]# cd     /usr/java
[root@hadoop001 java]# chown -R root:root jdk1.8.0_181

[root@hadoop001 java]# vi /etc/profile
# java env
export JAVA_HOME=/usr/java/jdk1.8.0_181
export PATH=$JAVA_HOME/bin:$PATH

[root@hadoop001 ~]# source /etc/profile
[root@hadoop001 ~]# which java
/usr/java/jdk1.8.0_181/bin/java
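Before moving on, it is worth confirming that the directory really contains a full JDK (javac included) and not just a JRE. A tiny helper for that; `check_jdk` is our own name, not a standard tool:

```shell
# check_jdk DIR: succeed only if DIR looks like a full JDK
# (both the java launcher and the javac compiler are executable)
check_jdk() {
    [ -x "$1/bin/java" ] && [ -x "$1/bin/javac" ]
}

# on this box the call would be:
#   check_jdk /usr/java/jdk1.8.0_181 && echo "JDK ok"
```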

1.3 Extract Hadoop and create a symlink

[hadoop@hadoop001 software]$ tar -xzvf hadoop-2.6.0-cdh5.16.2.tar.gz -C ../app/
[hadoop@hadoop001 app]$ 
[hadoop@hadoop001 app]$ ll
total 4
drwxr-xr-x 14 hadoop hadoop 4096 Jun  3  2019 hadoop-2.6.0-cdh5.16.2
[hadoop@hadoop001 app]$ ln -s hadoop-2.6.0-cdh5.16.2 hadoop
[hadoop@hadoop001 app]$ ll
total 4
lrwxrwxrwx  1 hadoop hadoop   22 May  6 22:05 hadoop -> hadoop-2.6.0-cdh5.16.2
drwxr-xr-x 14 hadoop hadoop 4096 Jun  3  2019 hadoop-2.6.0-cdh5.16.2
[hadoop@hadoop001 app]$ 

1.4 Configure passwordless SSH to hadoop001

[hadoop@hadoop001 ~]$ rm -rf .ssh
[hadoop@hadoop001 ~]$ 
[hadoop@hadoop001 ~]$ 
[hadoop@hadoop001 ~]$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa): 
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:fhAts9iahMuFy0r/djKCcAO7m8vPm5lf2ExdkWUqIdw hadoop@ruozedata001
The key's randomart image is:
+---[RSA 2048]----+
|      .... .oo   |
|       ..E..+    |
|        +..o     |
|.    o o.=o      |
| o  o +.S.       |
|o oo ==+ .       |
| +.o=.o+. .      |
|oooo= = ..       |
|++oB+=.+         |
+----[SHA256]-----+
[hadoop@hadoop001 ~]$ cd .ssh
[hadoop@hadoop001 .ssh]$ 
[hadoop@hadoop001 .ssh]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@hadoop001 .ssh]$ chmod 0600 ~/.ssh/authorized_keys
[hadoop@hadoop001 .ssh]$ 
[hadoop@hadoop001 .ssh]$ ssh hadoop001 date
The authenticity of host 'hadoop001 (192.168.0.3)' can't be established.
ECDSA key fingerprint is SHA256:OLqoaMxlGFbCq4sC9pYgF+FdbcXHbEbtSrnMiGGFbVw.
ECDSA key fingerprint is MD5:d3:5b:4a:ef:8e:00:41:a0:5e:80:ef:75:76:8a:a3:49.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ruozedata001,192.168.0.3' (ECDSA) to the list of known hosts.
Wed May  6 22:26:57 CST 2020
[hadoop@hadoop001 .ssh]$ 
[hadoop@hadoop001 .ssh]$ 
[hadoop@hadoop001 .ssh]$ ssh hadoop001 date
Wed May  6 22:27:07 CST 2020
[hadoop@hadoop001 .ssh]$ 
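The append-and-chmod pair above is what actually makes the login work: sshd ignores an authorized_keys file that is group- or world-writable. A small helper bundling those two steps (the function name is ours):

```shell
# authorize_key PUBFILE [SSHDIR]: append a public key to authorized_keys
# and set the permissions sshd insists on (700 on the dir, 600 on the file)
authorize_key() {
    pub="$1"; sshdir="${2:-$HOME/.ssh}"
    mkdir -p "$sshdir" && chmod 700 "$sshdir"
    cat "$pub" >> "$sshdir/authorized_keys"
    chmod 600 "$sshdir/authorized_keys"
}

# usage: authorize_key ~/.ssh/id_rsa.pub
```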

1.5 Edit the config files so that all three HDFS daemons start under the hadoop001 hostname

Start the NameNode (nn) as hadoop001:
etc/hadoop/core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop001:9000</value>
    </property>

    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/tmp/</value>
    </property>

</configuration>

Start the SecondaryNameNode (snn) as hadoop001:
etc/hadoop/hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop001:9868</value>
    </property>

    <property>
        <name>dfs.namenode.secondary.https-address</name>
        <value>hadoop001:9869</value>
    </property>
</configuration>

Start the DataNode (dn) as hadoop001:
[hadoop@hadoop001 hadoop]$ pwd
/home/hadoop/app/hadoop/etc/hadoop
[hadoop@hadoop001 hadoop]$ vi slaves 
hadoop001
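To double-check what the daemons will actually read, `hdfs getconf -confKey fs.defaultFS` asks Hadoop itself. As an offline alternative, a naive awk helper can pull a value out of a *-site.xml; this is our own sketch, not a real XML parser, and it assumes the one-tag-per-line layout used in the files above:

```shell
# get_prop FILE KEY: print the <value> of the <property> whose <name> is KEY
get_prop() {
    awk -v k="$2" '
        /<name>/  { n = $0; sub(/.*<name>/,  "", n); sub(/<\/name>.*/,  "", n) }
        /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                    if (n == k) print v }
    ' "$1"
}

# usage: get_prop etc/hadoop/core-site.xml fs.defaultFS
```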

1.6 Add environment variables

Go to the hadoop user's home directory:

[hadoop@hadoop001 ~]$ vi .bashrc 
# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi

# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=

# User specific aliases and functions

export HADOOP_HOME=/home/hadoop/app/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

[hadoop@hadoop001 ~]$ source .bashrc 
[hadoop@hadoop001 ~]$

1.7 Format the NameNode (only needed the first time; it initializes HDFS's own on-disk storage format)

[hadoop@hadoop001 hadoop]$ pwd
/home/hadoop/app/hadoop
[hadoop@hadoop001 hadoop]$ bin/hdfs namenode -format

1.8 Start HDFS

[hadoop@hadoop001 hadoop]$ sbin/start-dfs.sh
20/05/06 22:43:08 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [hadoop001]
hadoop001: starting namenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-hadoop-namenode-hadoop001.out
hadoop001: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-hadoop-datanode-hadoop001.out
Starting secondary namenodes [hadoop001]
hadoop001: starting secondarynamenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-hadoop-secondarynamenode-hadoop001.out
20/05/06 22:43:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop001 hadoop]$ jps
21712 DataNode            # dn: stores the actual data blocks (the worker)
21585 NameNode            # nn: decides where blocks are stored (the master)
21871 SecondaryNameNode   # snn: the eternal runner-up; by default it checkpoints the NameNode's metadata once an hour
21999 Jps
[hadoop@hadoop001 hadoop]$ 
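For scripting (say, a cron health check), the eyeball-the-jps-output step can be automated. A sketch; `hdfs_daemons_ok` is our own helper and simply greps the jps listing for the three daemon names:

```shell
# hdfs_daemons_ok "JPS_OUTPUT": succeed only if NameNode, DataNode and
# SecondaryNameNode all appear as whole words in the given jps output
hdfs_daemons_ok() {
    for d in NameNode DataNode SecondaryNameNode; do
        echo "$1" | grep -qw "$d" || { echo "missing: $d"; return 1; }
    done
    echo "all HDFS daemons running"
}

# usage: hdfs_daemons_ok "$(jps)"
```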

1.9 Check the daemons in the web UI

http://192.168.131.128:50070/dfshealth.html#tab-overview

1.10 Create a directory

[hadoop@hadoop001 ~]$ hdfs dfs -mkdir /user
20/05/18 20:03:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop001 ~]$ hdfs dfs -ls /
20/05/18 20:03:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 3 items
drwx------   - hadoop supergroup          0 2020-05-10 15:52 /tmp
drwxr-xr-x   - hadoop supergroup          0 2020-05-18 20:03 /user
drwxr-xr-x   - hadoop supergroup          0 2020-05-10 15:52 /wordcount
[hadoop@hadoop001 ~]$ 

1.11 Upload and download files

[hadoop@hadoop001 ~]$ hdfs dfs -put  error.log /wordcount
20/05/18 20:07:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop001 ~]$ hdfs dfs -ls /wordcount
20/05/18 20:07:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 3 items
-rw-r--r--   1 hadoop supergroup       1763 2020-05-18 20:07 /wordcount/error.log
drwxr-xr-x   - hadoop supergroup          0 2020-05-10 15:16 /wordcount/input
drwxr-xr-x   - hadoop supergroup          0 2020-05-10 15:53 /wordcount/output
[hadoop@hadoop001 ~]$ 

Download:
[hadoop@hadoop001 ~]$ hdfs dfs -get /wordcount/output/part-r-00000
20/05/18 20:11:13 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop001 ~]$ ll
total 12
drwxrwxr-x. 3 hadoop hadoop   50 May  7 11:38 app
drwxrwxr-x. 2 hadoop hadoop   32 May 10 15:13 data
-rw-rw-r--. 1 hadoop hadoop 3039 May 10 14:43 error1.log
-rw-rw-r--. 1 hadoop hadoop 1763 May 10 14:40 error.log
drwxrwxr-x. 2 hadoop hadoop    6 May  7 11:21 lib
drwxrwxr-x. 2 hadoop hadoop    6 May  7 11:21 log
-rw-r--r--. 1 hadoop hadoop   64 May 18 20:11 part-r-00000
drwxrwxr-x. 2 hadoop hadoop    6 May  7 11:21 shell
drwxrwxr-x. 2 hadoop hadoop   77 May  7 11:25 software
drwxrwxr-x. 2 hadoop hadoop    6 May  7 11:21 sourcecode
drwxrwxr-x. 4 hadoop hadoop  222 May 18 20:00 tmp
[hadoop@hadoop001 ~]$ 
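A put followed by a get should hand back the exact same bytes. A sketch of a round-trip check (the hdfs paths mirror the ones used above; `roundtrip_ok` is our own helper):

```shell
# the round trip itself would be:
#   hdfs dfs -put error.log /wordcount/
#   hdfs dfs -get /wordcount/error.log error.log.copy

# roundtrip_ok ORIG COPY: succeed (and say so) only if both files are byte-identical
roundtrip_ok() {
    cmp -s "$1" "$2" && echo "files match"
}

# usage: roundtrip_ok error.log error.log.copy
```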

2. YARN installation and deployment

2.1 Edit the config files

etc/hadoop/mapred-site.xml:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

etc/hadoop/yarn-site.xml:
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>

    <!-- change the web UI port from the default 8088: ResourceManagers exposed
         on 8088 are routinely scanned and hijacked for cryptomining -->
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>hadoop001:18088</value>
    </property>
</configuration>

2.2 Start the YARN daemons

[hadoop@hadoop001 hadoop]$ start-yarn.sh 
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.16.2/logs/yarn-hadoop-resourcemanager-hadoop001.out
hadoop001: starting nodemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.16.2/logs/yarn-hadoop-nodemanager-hadoop001.out
[hadoop@hadoop001 hadoop]$ jps
9539 DataNode
12135 NodeManager
12360 Jps
9401 NameNode
12011 ResourceManager
9708 SecondaryNameNode

2.3 Check the daemons in the web UI

http://192.168.131.128:18088/cluster

2.4 Run the word-count example

[hadoop@hadoop001 hadoop]$ hadoop jar share/hadoop/mapreduce2/hadoop-mapreduce-examples-2.6.0-cdh5.16.2.jar wordcount /wordcount/error.log /user/output
20/05/18 20:17:26 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/05/18 20:17:27 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
20/05/18 20:17:28 INFO input.FileInputFormat: Total input paths to process : 1
20/05/18 20:17:28 INFO mapreduce.JobSubmitter: number of splits:1
20/05/18 20:17:28 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1589803231154_0001
20/05/18 20:17:28 INFO impl.YarnClientImpl: Submitted application application_1589803231154_0001
20/05/18 20:17:28 INFO mapreduce.Job: The url to track the job: http://hadoop001:18088/proxy/application_1589803231154_0001/
20/05/18 20:17:28 INFO mapreduce.Job: Running job: job_1589803231154_0001
20/05/18 20:17:39 INFO mapreduce.Job: Job job_1589803231154_0001 running in uber mode : false
20/05/18 20:17:39 INFO mapreduce.Job:  map 0% reduce 0%
20/05/18 20:17:46 INFO mapreduce.Job:  map 100% reduce 0%
20/05/18 20:17:53 INFO mapreduce.Job:  map 100% reduce 100%
20/05/18 20:17:54 INFO mapreduce.Job: Job job_1589803231154_0001 completed successfully
20/05/18 20:17:54 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=1554
        FILE: Number of bytes written=289077
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=1869
        HDFS: Number of bytes written=1180
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters 
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=4100
        Total time spent by all reduces in occupied slots (ms)=4615
        Total time spent by all map tasks (ms)=4100
        Total time spent by all reduce tasks (ms)=4615
        Total vcore-milliseconds taken by all map tasks=4100
        Total vcore-milliseconds taken by all reduce tasks=4615
        Total megabyte-milliseconds taken by all map tasks=4198400
        Total megabyte-milliseconds taken by all reduce tasks=4725760
    Map-Reduce Framework
        Map input records=11
        Map output records=139
        Map output bytes=2316
        Map output materialized bytes=1554
        Input split bytes=106
        Combine input records=139
        Combine output records=92
        Reduce input groups=92
        Reduce shuffle bytes=1554
        Reduce input records=92
        Reduce output records=92
        Spilled Records=184
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=129
        CPU time spent (ms)=1640
        Physical memory (bytes) snapshot=306192384
        Virtual memory (bytes) snapshot=5457453056
        Total committed heap usage (bytes)=165810176
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters 
        Bytes Read=1763
    File Output Format Counters 

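The job's result (hdfs dfs -cat /user/output/part-r-00000) can be sanity-checked against a local count done with coreutils. A sketch; note the output format differs slightly (uniq -c prints `count word`, while the MR job prints `word<TAB>count`), and it assumes words are whitespace-separated, like the example jar does:

```shell
# local_wordcount FILE: count occurrences of each whitespace-separated word
local_wordcount() {
    tr -s '[:space:]' '\n' < "$1" | grep -v '^$' | sort | uniq -c
}

# usage: local_wordcount error.log
```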
Done! The pseudo-distributed setup is complete. Go try it out.
