1. Creating the Maven project
Choose a working directory for the project in advance.
Then run the following command inside that directory:
mvn archetype:generate
When the prompt shown in the figure below appears, press Enter to accept the default.

1.png
The prompts that follow after pressing Enter are explained below.

2.png
groupId: maven
artifactId (the project name): WordCount
version: 1.0
package: org.example
Y: press Enter to confirm, and the project is created.
(A non-interactive equivalent of these prompts is sketched right after this list.)
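As an aside, the whole interactive dialog can be skipped by passing the same values on the command line. A minimal sketch, assuming the default selection above corresponds to the maven-archetype-quickstart archetype:

mvn archetype:generate -DarchetypeArtifactId=maven-archetype-quickstart \
    -DgroupId=maven -DartifactId=WordCount -Dversion=1.0 \
    -Dpackage=org.example -DinteractiveMode=false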
2. Editing the project files
2.1 Create WordCount.java locally
Add the following code to the file:
package org.example;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: split each line on spaces and emit (word, 1).
    // The nested classes must be static, otherwise Hadoop cannot instantiate them.
    public static class WordCountMap extends Mapper<LongWritable, Text, Text, IntWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] strs = value.toString().split(" ");
            for (String str : strs) {
                context.write(new Text(str), new IntWritable(1));
            }
        }
    }

    // Reducer: sum the counts emitted for each word.
    public static class WordCountReduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);
        job.setJarByClass(WordCount.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapperClass(WordCountMap.class);
        job.setReducerClass(WordCountReduce.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));    // input directory, args[0]
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output directory, args[1] (must not already exist)
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
Once the file is ready, upload it to the node with Xftp.
Xftp is recommended here because the Maven project tree has many nested directories, so it is easy to put the file in the wrong place by hand. (A command-line alternative is sketched after the figure below.)

3.png
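If a graphical tool like Xftp is not available, the file can also be copied with scp. A minimal sketch, assuming the node is reachable as hadoop@node1 (placeholder user and host), the project was generated in ~/WordCount on the node, and the file goes into the standard archetype layout src/main/java/org/example/:

scp WordCount.java hadoop@node1:~/WordCount/src/main/java/org/example/WordCount.java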
2.2 Edit the dependency file
Edit pom.xml

5.png
The figure below shows the initial content of the file.

4.png
Now modify it. Change
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
to:
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<maven.compiler.source>8</maven.compiler.source>
<maven.compiler.target>8</maven.compiler.target>
<java.version>1.8</java.version>
<hadoop.version>3.1.3</hadoop.version>
<log4j.version>1.2.14</log4j.version>
<junit.version>4.8.2</junit.version>
</properties>
Then delete the block
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
</dependencies>
and copy the following content into the file in its place:
<dependencies>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>${log4j.version}</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>${junit.version}</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<configuration>
<archive>
<manifest>
<mainClass>org.example.WordCount</mainClass>
</manifest>
</archive>
</configuration>
</plugin>
</plugins>
</build>
Important: all of the above must be placed before the closing </project> tag!
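Before packaging, it can be worth checking that the edited pom.xml still parses and that the new Hadoop dependencies can be downloaded. Two standard Maven goals can be used for this (run from the project root; neither compiles the code):

mvn validate
mvn dependency:resolve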
3. Packaging the project
Go into the project root directory, as shown below.

6.png
Run the command
mvn package

7.png
If the output looks like the figure above (Maven reports BUILD SUCCESS), the packaging succeeded.
(If any errors are reported, check and fix the source file and the dependency file according to the error messages.)
A target folder now appears inside the project directory, and our jar package is inside it.

8.png
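As a quick sanity check, the jar's contents can be listed to confirm that the compiled classes and the manifest made it in. A minimal sketch, using the jar name produced by the settings above:

jar tf target/WordCount-1.0.jar

The listing should include org/example/WordCount.class (plus the nested mapper and reducer classes) and META-INF/MANIFEST.MF.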
4. Running the test
This part works the same way as before.
Start Hadoop.
Start YARN.
On HDFS, create a test directory with an input subdirectory inside it (do not create the output directory in advance; the job creates it itself and fails if it already exists).
Put input1.txt and input2.txt into the input directory.
hadoop jar WordCount-1.0.jar /test/input /test/output
Note that this command must be run in the directory where the jar package is located!
The run output and the way to view the results are the same as before; a command sketch for the whole step follows.
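Put together, a minimal sketch of this step, assuming Hadoop's sbin directory is on the PATH and the two input files are in the current local directory (paths and file names follow the example above):

start-dfs.sh
start-yarn.sh
hdfs dfs -mkdir -p /test/input
hdfs dfs -put input1.txt input2.txt /test/input
cd target
hadoop jar WordCount-1.0.jar /test/input /test/output
hdfs dfs -cat /test/output/part-r-00000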