"Pass a Delete or a Put" error in HBase MapReduce
I am getting the following error while running a MapReduce job on HBase:
java.io.IOException: Pass a Delete or a Put
at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:125)
at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:84)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at HBaseImporter$InnerMap.map(HBaseImporter.java:61)
at HBaseImporter$InnerMap.map(HBaseImporter.java:1)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
12/11/27 16:16:50 INFO mapred.JobClient: map 0% reduce 0%
12/11/27 16:16:50 INFO mapred.JobClient: Job complete: job_local_0001
12/11/27 16:16:50 INFO mapred.JobClient: Counters: 0
Code:
public class HBaseImporter extends Configured implements Tool {

    public static class InnerMap extends TableMapper<Text, IntWritable> {
        IntWritable one = new IntWritable(1); // count each word once (no-arg constructor would default to 0)

        public void map(ImmutableBytesWritable row, Result value, Context context)
                throws IOException, InterruptedException {
            String val = new String(value.getValue(Bytes.toBytes("cf"), Bytes.toBytes("line")));
            String[] words = val.split(" ");
            for (String word : words) {
                context.write(new Text(word), one);
            }
        }
    }

    public static class MyTableReducer extends TableReducer<Text, IntWritable, ImmutableBytesWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int i = 0;
            for (IntWritable val : values) {
                i += val.get();
            }
            Put put = new Put(Bytes.toBytes(key.toString()));
            put.add(Bytes.toBytes("cf"), Bytes.toBytes("count"), Bytes.toBytes(i));
            context.write(null, put);
        }
    }

    public int run(String[] args) throws Exception {
        //Configuration conf = getConf();
        Configuration conf = HBaseConfiguration.create();
        conf.addResource(new Path("/home/trg/hadoop-1.0.4/conf/core-site.xml"));
        conf.addResource(new Path("/home/trg/hadoop-1.0.4/conf/hdfs-site.xml"));

        Job job = new Job(conf, "SM LogAnalyzer MR");
        job.setJarByClass(HBaseImporter.class);
        //FileInputFormat.setInputPaths(job, new Path(args[1]));
        //FileOutputFormat.setOutputPath(job, new Path("outyy"));
        //job.setOutputFormatClass(TextOutputFormat.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        //job.setMapperClass(InnerMap.class);

        Scan scan = new Scan();
        scan.setCaching(500);  // 1 is the default in Scan, which will be bad for MapReduce jobs
        scan.setCacheBlocks(false);

        TableMapReduceUtil.initTableMapperJob(
                "wc_in",            // input table
                scan,               // Scan instance to control CF and attribute selection
                InnerMap.class,     // mapper class
                Text.class,         // mapper output key
                IntWritable.class,  // mapper output value
                job);
        TableMapReduceUtil.initTableReducerJob(
                "word_count",         // output table
                MyTableReducer.class, // reducer class
                job);
        job.setNumReduceTasks(1);
        job.setNumReduceTasks(0);

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        //Configuration conf = new HBaseConfiguration();
        //Job job = configureJob(conf, args);
        //System.exit(job.waitForCompletion(true) ? 0 : 1);
        String[] inArgs = new String[4];
        inArgs[0] = "HBaseImporter";
        inArgs[1] = "/user/trg/wc_in";
        inArgs[2] = "AppLogMRImport";
        inArgs[3] = "MessageDB";
        int res = ToolRunner.run(new Configuration(), new HBaseImporter(), inArgs);
        //int res = ToolRunner.run(new Configuration(), new HBaseImporter(), args);
    }
}
I have set the map output value class to IntWritable.class, yet TableOutputFormat.write is still being called on the mapper's output, and it expects a Put object.
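The stack trace shows where the check happens: TableOutputFormat's record writer rejects any value that is not a Put or a Delete, regardless of what the map output value class was declared as. The following is a hypothetical, dependency-free simulation of that guard (the Put and Delete classes here are stand-ins, not the real org.apache.hadoop.hbase.client types), just to illustrate why an IntWritable-style value triggers the exception:

```java
import java.io.IOException;

public class PassCheckDemo {
    static class Put {}     // stand-in for org.apache.hadoop.hbase.client.Put
    static class Delete {}  // stand-in for org.apache.hadoop.hbase.client.Delete

    // Mirrors the guard behind the "Pass a Delete or a Put" IOException:
    // the writer checks the runtime type of every value it receives.
    static void write(Object value) throws IOException {
        if (!(value instanceof Put) && !(value instanceof Delete)) {
            throw new IOException("Pass a Delete or a Put");
        }
        // real writer would buffer/flush the mutation to the table here
    }

    public static void main(String[] args) {
        try {
            write(new Put());                  // accepted
            write(Integer.valueOf(1));         // rejected, like an IntWritable count
        } catch (IOException e) {
            System.out.println(e.getMessage()); // prints: Pass a Delete or a Put
        }
    }
}
```

So the declared map output value class is irrelevant once the mapper's records are routed straight into this writer; only the runtime type matters.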
Accepted Answer
Got the answer to my own question: I had mistakenly set the number of reduce tasks to 0.
job.setNumReduceTasks(0);
With zero reducers the job becomes map-only, so the mapper's output goes straight to TableOutputFormat, which expects a Put object to write directly into the HBase table. Commenting out the above line solved the issue.
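Assuming the goal is the reduce-side word count into the word_count table (as the rest of the code suggests), the relevant tail of run() simply drops the zero-reducer override; a sketch of the fix (not runnable on its own, it needs the HBase classpath and a running cluster):

    TableMapReduceUtil.initTableReducerJob(
            "word_count",         // output table
            MyTableReducer.class, // reducer class
            job);
    job.setNumReduceTasks(1);
    // job.setNumReduceTasks(0);  // removed: with 0 reducers the mapper's
    // (Text, IntWritable) pairs would be fed directly to TableOutputFormat,
    // which only accepts Put or Delete
    return job.waitForCompletion(true) ? 0 : 1;

The alternative design, if a map-only job is actually wanted, is to have the mapper itself build and emit Put objects instead of (Text, IntWritable) pairs; then setNumReduceTasks(0) is fine.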