Hadoop 2.2 Programming: MRUnit — Testing MaxTemperatureMapper

Inheritance hierarchy 1

1. java.lang.Object
  |__ org.apache.hadoop.mapreduce.JobContext
        |__ org.apache.hadoop.mapreduce.TaskAttemptContext
              |__ org.apache.hadoop.mapreduce.TaskInputOutputContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
                    |__ org.apache.hadoop.mapreduce.MapContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
                          |__ org.apache.hadoop.mapreduce.Mapper.Context
Description:
                        
public class Mapper.Context
    extends MapContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>

Constructor Summary:
Mapper.Context(Configuration conf, TaskAttemptID taskid, RecordReader<KEYIN,VALUEIN> reader, RecordWriter<KEYOUT,VALUEOUT> writer, OutputCommitter committer, StatusReporter reporter, InputSplit split)
Method Summary:
  Methods inherited from class org.apache.hadoop.mapreduce.MapContext
getCurrentKey, getCurrentValue, getInputSplit, nextKeyValue
  Methods inherited from class org.apache.hadoop.mapreduce.TaskInputOutputContext
getCounter, getCounter, getOutputCommitter, progress, setStatus, write
  Methods inherited from class org.apache.hadoop.mapreduce.TaskAttemptContext
getStatus, getTaskAttemptID
  Methods inherited from class org.apache.hadoop.mapreduce.JobContext
getCombinerClass, getConfiguration, getCredentials, getGroupingComparator, getInputFormatClass, getJar, getJobID, getJobName, getMapOutputKeyClass, getMapOutputValueClass, getMapperClass, getNumReduceTasks, getOutputFormatClass, getOutputKeyClass, getOutputValueClass, getPartitionerClass, getReducerClass, getSortComparator, getWorkingDirectory
  Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Inheritance hierarchy 2

2. java.lang.Object
  |__ org.apache.hadoop.mapreduce.JobContext
        |__ org.apache.hadoop.mapreduce.TaskAttemptContext
              |__ org.apache.hadoop.mapreduce.TaskInputOutputContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
                    |__ org.apache.hadoop.mapreduce.ReduceContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
                          |__ org.apache.hadoop.mapreduce.Reducer.Context
Description:
public class Reducer.Context
    extends ReduceContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
Constructor Summary:
Reducer.Context(Configuration conf, TaskAttemptID taskid, RawKeyValueIterator input, Counter inputKeyCounter, Counter inputValueCounter, RecordWriter<KEYOUT,VALUEOUT> output, OutputCommitter committer, StatusReporter reporter, RawComparator<KEYIN> comparator, Class<KEYIN> keyClass, Class<VALUEIN> valueClass)
Method Summary:
  Methods inherited from class org.apache.hadoop.mapreduce.ReduceContext
getCurrentKey, getCurrentValue, getValues, nextKey, nextKeyValue
  Methods inherited from class org.apache.hadoop.mapreduce.TaskInputOutputContext
getCounter, getCounter, getOutputCommitter, progress, setStatus, write
  Methods inherited from class org.apache.hadoop.mapreduce.TaskAttemptContext
getStatus, getTaskAttemptID
  Methods inherited from class org.apache.hadoop.mapreduce.JobContext
getCombinerClass, getConfiguration, getCredentials, getGroupingComparator, getInputFormatClass, getJar, getJobID, getJobName, getMapOutputKeyClass, getMapOutputValueClass, getMapperClass, getNumReduceTasks, getOutputFormatClass, getOutputKeyClass, getOutputValueClass, getPartitionerClass, getReducerClass, getSortComparator, getWorkingDirectory
  Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
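
The two listings above are essentially the Hadoop Javadoc for Mapper.Context and Reducer.Context. In user code you never construct a Context yourself; the framework hands one to setup(), map()/reduce(), and cleanup(), and you call the inherited methods on it. Below is a minimal sketch (not from the book) of the Context methods a mapper typically touches; the configuration key "temperature.missing" and the counter group/name are made up for illustration:

 import java.io.IOException;

 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.io.IntWritable;
 import org.apache.hadoop.io.LongWritable;
 import org.apache.hadoop.io.Text;
 import org.apache.hadoop.mapreduce.Mapper;

 public class ContextSketchMapper
   extends Mapper<LongWritable, Text, Text, IntWritable> {

     private int missing;

     @Override
     protected void setup(Context context) {
       // getConfiguration() is inherited from JobContext
       Configuration conf = context.getConfiguration();
       missing = conf.getInt("temperature.missing", 9999);  // hypothetical key
     }

     @Override
     protected void map(LongWritable key, Text value, Context context)
       throws IOException, InterruptedException {

       int temp = Integer.parseInt(value.toString().trim());
       if (temp == missing) {
         // getCounter(group, name) is inherited from TaskInputOutputContext
         context.getCounter("Quality", "Missing").increment(1);
         return;
       }
       // write() is inherited from TaskInputOutputContext
       context.write(new Text("all"), new IntWritable(temp));
     }
 }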


Code

1.MaxTemperatureMapper.java

 import java.io.IOException;

 import org.apache.hadoop.mapreduce.Mapper;
 import org.apache.hadoop.io.LongWritable;
 import org.apache.hadoop.io.IntWritable;
 import org.apache.hadoop.io.Text;

 public class MaxTemperatureMapper
   extends Mapper<LongWritable, Text, Text, IntWritable> {

     @Override
     public void map(LongWritable key, Text value, Context context)
       throws IOException, InterruptedException {

       String line = value.toString();
       String year = line.substring(15, 19);
       int airTemperature = Integer.parseInt(line.substring(87, 92));
       context.write(new Text(year), new IntWritable(airTemperature));
     }
 }
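
As a quick sanity check of the fixed-width offsets used above, here is what the two substring calls return when applied to the sample NCDC record from the test below (a throwaway snippet, not part of the book's code):

 public class OffsetCheck {
   public static void main(String[] args) {
     // Same record as in MaxTemperatureMapperTest below.
     String line = "0043011990999991950051518004+68750+023550FM-12+0382"
         + "99999V0203201N00261220001CN9999999N9-00111+99999999999";
     System.out.println(line.substring(15, 19));                    // prints: 1950
     System.out.println(Integer.parseInt(line.substring(87, 92)));  // prints: -11 (field "-0011")
   }
 }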

2.MaxTemperatureMapperTest.java

 import java.io.IOException;
 import org.apache.hadoop.io.LongWritable;
 import org.apache.hadoop.io.IntWritable;
 import org.apache.hadoop.io.Text;
 import org.junit.Test;
 import org.apache.hadoop.mrunit.mapreduce.MapDriver;

 public class MaxTemperatureMapperTest {

   @Test
   public void processesValidRecord() throws IOException {
     Text value = new Text("0043011990999991950051518004+68750+023550FM-12+0382" +
                                   // Year ^^^^
         "99999V0203201N00261220001CN9999999N9-00111+99999999999");
                               // Temperature ^^^^^
     new MapDriver<LongWritable, Text, Text, IntWritable>()
       .withMapper(new MaxTemperatureMapper())
       .withInput(new LongWritable(1), value)
       .withOutput(new Text("1950"), new IntWritable(-11))
       .runTest();
   }
 }

Note some deprecated classes and methods:

The deprecation of org.apache.hadoop.mrunit.MapDriver<K1,V1,K2,V2> is easy to understand: that class was written for the old MapReduce API (org.apache.hadoop.mapred). For example, one of its methods is
 MapDriver<K1,V1,K2,V2> withMapper(org.apache.hadoop.mapred.Mapper<K1,V1,K2,V2> m)

The new MapReduce API lives in org.apache.hadoop.mapreduce.*; the corresponding MRUnit MapDriver (and likewise ReduceDriver) is:

org.apache.hadoop.mrunit.mapreduce.MapDriver<K1,V1,K2,V2>

Its methods accordingly take the new-API types, for example:
MapDriver<K1,V1,K2,V2> withCounters(org.apache.hadoop.mapreduce.Counters ctrs)

T withInputValue(V1 val) in the MapDriverBase class is also deprecated, replaced by T withInput(K1 key, V1 val); there are more such changes, not listed here.
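
For completeness, the new-API ReduceDriver follows the same fluent pattern. The sketch below assumes a MaxTemperatureReducer (a Reducer<Text, IntWritable, Text, IntWritable> that emits the maximum value for each key), which is not shown in this post:

 import java.io.IOException;
 import java.util.Arrays;

 import org.apache.hadoop.io.IntWritable;
 import org.apache.hadoop.io.Text;
 import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
 import org.junit.Test;

 public class MaxTemperatureReducerTest {

   @Test
   public void returnsMaximumIntegerInValues() throws IOException {
     new ReduceDriver<Text, IntWritable, Text, IntWritable>()
       .withReducer(new MaxTemperatureReducer())   // hypothetical reducer, not defined here
       .withInput(new Text("1950"),
           Arrays.asList(new IntWritable(10), new IntWritable(5)))
       .withOutput(new Text("1950"), new IntWritable(10))
       .runTest();
   }
 }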

Steps to run:

Note: you need to download and build MRUnit, set an MRUnit_HOME variable in /home/user/.bashrc, then edit $HADOOP_HOME/libexec/hadoop-config.sh to add $MRUnit_HOME/lib/*.jar to the classpath. After sourcing $HADOOP_HOME/libexec/hadoop-config.sh, run the following:

javac  -d class/  MaxTemperatureMapper.java  MaxTemperatureMapperTest.java
jar -cvf test.jar -C class ./
java -cp test.jar:$CLASSPATH org.junit.runner.JUnitCore  MaxTemperatureMapperTest  # or
yarn -cp test.jar:$CLASSPATH org.junit.runner.JUnitCore  MaxTemperatureMapperTest