Pitfalls I hit with hadoop0.20.2 + eclipse3.5

Prerequisites

Hadoop version

hadoop0.20.2

I have no idea why I ended up on such an old version; take it as a cautionary tale.

Eclipse version

eclipse3.5

The hadoop0.20.2 plugin only runs on eclipse3.5.

Finding a copy of eclipse3.5 is not easy.

Java version

jdk1.7.0_80

In my setup eclipse3.5 only worked with jdk1.7; other versions produced errors.
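
Since the JDK version matters here, it can help to confirm which JVM actually runs your code inside Eclipse. The following is only a minimal sketch: the class name `JavaVersionCheck` and the helper `isJava7` are names I made up for illustration, and it assumes the `java.version` property uses the `1.7.0_80` style shown above.

```java
// Minimal sketch: report the running JVM version and warn if it is not 1.7.
// JavaVersionCheck and isJava7 are hypothetical names used for illustration.
class JavaVersionCheck {

    // Returns true when a java.version string such as "1.7.0_80" denotes Java 7.
    static boolean isJava7(String version) {
        return version != null
                && (version.equals("1.7") || version.startsWith("1.7."));
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        System.out.println("java.version = " + version);
        if (!isJava7(version)) {
            System.err.println("Warning: this setup was only tested with jdk1.7");
        }
    }
}
```

Run it as a plain Java application from inside Eclipse so that it reports the JRE the workspace is really using.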

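Notes on installing Hadoop

Create a dedicated user for Hadoop, and use the same username on every host node; otherwise you will run into errors that are hard to predict.
For configuration details, see these blog posts of mine:
- hadoop configuration
Remember to turn off the firewall on every Hadoop host node.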
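
A quick way to see whether a firewall is still in the way is to try a plain TCP connection from the Eclipse machine to the cluster. This is a rough sketch, not part of Hadoop: the hostname `master` and the ports `9000` and `9001` are assumptions on my part, so substitute whatever your own configuration files use.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Rough sketch: test whether a TCP port on a cluster node accepts connections.
// PortCheck is a hypothetical helper, not a Hadoop class.
class PortCheck {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unknown host: treat all as unreachable.
            return false;
        } finally {
            try {
                socket.close();
            } catch (IOException ignored) {
                // Nothing useful to do if close fails.
            }
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "master"; // assumed hostname
        int[] ports = {9000, 9001};                         // assumed ports
        for (int port : ports) {
            System.out.println(host + ":" + port + " reachable = "
                    + isReachable(host, port, 2000));
        }
    }
}
```

A `false` result does not prove the firewall is the cause (the daemon may simply not be running), but a `true` result rules the firewall out for that port.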

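Configuration in Eclipse

Plugin configuration
- Plugin name: hadoop-0.20.2-eclipse-plugin.jar (download link)
- eclipse3.5 (download link)
- Copy the plugin into the plugins directory of your Eclipse installation
- Restart Eclipse, go to Window -> Preferences -> Hadoop Map/Reduce, and select the directory where Hadoop is installed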
  
  image
Configure the Map/Reduce Location

image
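
The Location dialog asks for a host and port for both the Map/Reduce Master and the DFS Master. As far as I understand, these should match `mapred.job.tracker` and `fs.default.name` in the cluster's configuration files. The sketch below just splits such values into the host and port you would type in; `HostPort` is a made-up helper, and `hdfs://master:9000` and `master:9001` are example values, not ones taken from my cluster.

```java
import java.net.URI;

// Sketch: split cluster config values into the host/port pairs that the
// Location dialog asks for. HostPort is a hypothetical helper class.
class HostPort {
    final String host;
    final int port;

    HostPort(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Parses an fs.default.name style value such as "hdfs://master:9000".
    static HostPort fromFsDefaultName(String value) {
        URI uri = URI.create(value);
        return new HostPort(uri.getHost(), uri.getPort());
    }

    // Parses a mapred.job.tracker style value such as "master:9001".
    static HostPort fromJobTracker(String value) {
        int colon = value.lastIndexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("expected host:port, got " + value);
        }
        return new HostPort(value.substring(0, colon),
                Integer.parseInt(value.substring(colon + 1)));
    }

    public static void main(String[] args) {
        HostPort dfs = fromFsDefaultName("hdfs://master:9000"); // example value
        HostPort mr = fromJobTracker("master:9001");            // example value
        System.out.println("DFS Master:        " + dfs.host + " / " + dfs.port);
        System.out.println("Map/Reduce Master: " + mr.host + " / " + mr.port);
    }
}
```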
Create a new MapReduce project, as shown in the screenshot

image

In src, create WordCount.java and paste in the following code

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
 
public class WordCount {
  public static class TokenizerMapper extends
          Mapper<Object, Text, Text, IntWritable> {
      private final static IntWritable one = new IntWritable(1);
      private Text word = new Text();
 
      public void map(Object key, Text value, Context context)
              throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
              word.set(itr.nextToken());
              context.write(word, one);
          }
      }
  }
 
  public static class IntSumReducer extends
          Reducer<Text, IntWritable, Text, IntWritable> {
      private IntWritable result = new IntWritable();
 
      public void reduce(Text key, Iterable<IntWritable> values,
              Context context) throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
              sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
      }
  }
 
  public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      String[] otherArgs = new GenericOptionsParser(conf, args)
              .getRemainingArgs();
      if (otherArgs.length != 2) {
          System.err.println("Usage: wordcount <in> <out>");
          System.exit(2);
      }
      Job job = new Job(conf, "word count");
      job.setJarByClass(WordCount.class);
      job.setMapperClass(TokenizerMapper.class);
      job.setCombinerClass(IntSumReducer.class);
      job.setReducerClass(IntSumReducer.class);
      job.setOutputKeyClass(Text.class);
      job.setOutputValueClass(IntWritable.class);
      FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
      FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
      System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
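
Before running against the cluster, it can be handy to know what output to expect for a small input. The following plain-Java sketch mimics what the mapper and reducer above do together (tokenize on whitespace, then sum the counts per word) without touching Hadoop at all. `LocalWordCount` is a name I made up, and the tab-separated printing only approximates the job's text output.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Sketch: a local, Hadoop-free imitation of the WordCount job's logic.
// LocalWordCount is a hypothetical class, useful only for sanity checks.
class LocalWordCount {

    // Splits text on whitespace (like TokenizerMapper) and sums the
    // occurrences of each word (like IntSumReducer).
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        StringTokenizer itr = new StringTokenizer(text);
        while (itr.hasMoreTokens()) {
            String word = itr.nextToken();
            Integer old = counts.get(word);
            counts.put(word, old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        String sample = "hello hadoop hello eclipse";
        for (Map.Entry<String, Integer> entry : count(sample).entrySet()) {
            // One word per line, word and count separated by a tab.
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```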

Run As -> Run Configurations -> fill in the settings as shown in the screenshot

image
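
WordCount expects exactly two program arguments, `<in>` and `<out>`, and Hadoop refuses to start a job whose output directory already exists. One way around deleting the output directory before every run is to put a timestamp in its name. This is only a sketch of that idea: `RunArguments`, the URI `hdfs://master:9000`, and the `/user/hadoop/...` paths are all assumptions for illustration.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Sketch: build the two program arguments for WordCount, using a
// timestamped output directory so repeated runs do not collide.
// RunArguments and all paths below are hypothetical.
class RunArguments {

    // Returns { input path, output path } under the given file system URI.
    static String[] build(String fsUri, String inputDir, String outputBase, Date now) {
        String stamp = new SimpleDateFormat("yyyyMMdd-HHmmss").format(now);
        return new String[] {
            fsUri + inputDir,
            fsUri + outputBase + "-" + stamp
        };
    }

    public static void main(String[] args) {
        String[] programArgs = build("hdfs://master:9000",   // assumed URI
                "/user/hadoop/input", "/user/hadoop/output", new Date());
        // Paste these two values into the Arguments tab, separated by a space.
        System.out.println(programArgs[0] + " " + programArgs[1]);
    }
}
```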

Last edited: 2018.09.13 21:22:20
