Pseudo-Distributed Operation
* Configuration
* Setup passphraseless ssh
* Execution
* YARN on a Single Node
Install the software
Ubuntu 18.04.2
sudo apt-get install ssh
sudo apt-get install rsync
tar -xzvf hadoop-3.1.2.tar.gz
sudo vim /etc/profile
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export HADOOP_HOME=/home/njupt4145438/Downloads/hadoop-3.1.2
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH:$HIVE_HOME/bin:$HADOOP_HOME/bin:$HBASE_HOME/bin
source /etc/profile
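As a quick check that the environment is in effect (assuming the paths above), both of the following should print version information:
$ java -version
$ hadoop version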
cd hadoop-3.1.2
mkdir logs
ls
Do not use root (run Hadoop as a regular user).
sudo vim etc/hadoop/hadoop-env.sh
# Set to the root of your Java installation
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
Change the IP address to your own machine's:
sudo vim etc/hadoop/core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://192.168.179.128:9000</value>
    </property>
</configuration>
sudo vim etc/hadoop/hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
Set up passphraseless SSH
$ ssh localhost
$ ssh 192.168.179.128
$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
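After these steps, ssh localhost should log in without prompting for a passphrase; exit again before continuing:
$ ssh localhost
$ exit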
Format the filesystem:
$ bin/hdfs namenode -format
Start the NameNode and DataNode daemons:
$ sbin/start-dfs.sh
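Check that the daemons came up: jps (bundled with the JDK) should list NameNode, DataNode and SecondaryNameNode, and the NameNode web UI is served at http://localhost:9870/ by default in Hadoop 3.x.
$ jps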
Make sure the current user has write permission on the directories involved; if not, relax the permissions:
chmod 777 <directory>
Make the HDFS directories required to execute MapReduce jobs:
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username>
Upload a test.txt to HDFS, then delete the local copy:
$ bin/hdfs dfs -put test.txt
$ rm test.txt
$ bin/hdfs dfs -ls /user/njupt4145438
$ bin/hdfs dfs -get test.txt
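To verify the round trip, the file can also be printed directly from HDFS:
$ bin/hdfs dfs -cat test.txt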
When you're done, stop the daemons with:
$ sbin/stop-dfs.sh
You can run a MapReduce job on YARN in pseudo-distributed mode by setting a few parameters and additionally running the ResourceManager daemon and the NodeManager daemon.
sudo vim etc/hadoop/mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
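A hedged note: with only mapreduce.framework.name set, example jobs on Hadoop 3.x sometimes fail because the MapReduce application master cannot locate its classes. The commonly documented remedy (not needed in this walkthrough, and the value below assumes the standard tarball layout) is to also declare the MapReduce classpath in the same file:
    <property>
        <name>mapreduce.application.classpath</name>
        <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
    </property>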
sudo vim etc/hadoop/yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
Start the ResourceManager and NodeManager daemons:
$ sbin/start-yarn.sh
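jps should now additionally show ResourceManager and NodeManager, and the ResourceManager web UI is served at http://localhost:8088/ by default. As a quick smoke test (assuming the examples jar sits at its usual path inside the 3.1.2 tarball), the bundled pi job can be run on YARN:
$ bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.2.jar pi 2 10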
When you're done, stop the daemons with:
$ sbin/stop-yarn.sh
To access HDFS from a Java client, add the Hadoop dependencies to the project's pom.xml:
<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>3.1.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>3.1.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>3.1.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-core</artifactId>
        <version>3.1.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-auth</artifactId>
        <version>3.1.2</version>
    </dependency>
</dependencies>
The following client reads test.txt from HDFS and copies it to a local file:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.URI;

public class Main {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        String uri = "hdfs://192.168.179.128:9000/user/njupt4145438/test.txt";
        // Connect to the NameNode, open the file on HDFS and copy it to the local
        // disk, echoing each chunk to the console; try-with-resources closes
        // the FileSystem and both streams when the copy finishes.
        try (FileSystem fs = FileSystem.get(URI.create(uri), conf);
             FSDataInputStream is = fs.open(new Path(uri));
             OutputStream os = new FileOutputStream("D:/a.txt")) {
            byte[] buff = new byte[1024];
            int length;
            while ((length = is.read(buff)) != -1) {
                System.out.println(new String(buff, 0, length));
                os.write(buff, 0, length);
            }
            os.flush();
        }
    }
}
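One way to build and run the client from the project directory is the Maven exec plugin (a sketch: Main is the class defined above, and exec:java is resolved by plugin prefix). If the local OS user differs from the owner of /user/njupt4145438 on HDFS, setting the HADOOP_USER_NAME environment variable to njupt4145438 is a simple way to avoid permission errors.
$ mvn compile exec:java -Dexec.mainClass=Main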