How do I set an object as the value of the Map output in Hadoop MapReduce?

In Hadoop MapReduce, I want the value of the intermediate output (generated by map()) to be the following object.


MyObject{
  date:Date
  balance:Double
}

How would I do this? Should I create my own Writable class?

I am new to MapReduce.

Thanks.

Recommended answer

You can write a custom type and emit it as the mapper value. Whatever you emit as a value must implement the Writable interface (a type used as a key must additionally implement WritableComparable). You can do something like this:

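import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.WritableComparable;

// Writable is enough for a value; WritableComparable is only required for keys.
public class MyObj implements WritableComparable<MyObj> {

    private String date;
    private Double balance;

    public String getDate() { return date; }
    public Double getBalance() { return balance; }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Read the fields in the same order that write() emits them.
        date = in.readUTF();
        balance = in.readDouble();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        // Serialize each field; readFields() must mirror this order.
        out.writeUTF(date);
        out.writeDouble(balance);
    }

    // ... (rest of the class elided in the original answer;
    //      WritableComparable also requires compareTo(MyObj))
}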
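
As a fuller sketch of the custom-type approach, the example below uses the Date and Double fields from the question and shows one possible way to serialize them. Several things here are assumptions rather than part of the original answer: the date is stored as epoch milliseconds, the fields are assumed to be non-null, and RoundTrip is a hypothetical helper that exists only to exercise the two methods. So that the sketch compiles without Hadoop on the classpath, it declares a local stand-in for the Writable interface; in a real job you would delete it and import org.apache.hadoop.io.Writable instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Date;

// Local stand-in with the same two methods as org.apache.hadoop.io.Writable,
// so this sketch compiles on a plain JDK. In a Hadoop project, remove this
// interface and import the real one.
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

class MyObject implements Writable {
    private Date date = new Date(0L);
    private double balance;

    // Hadoop instantiates Writable types reflectively, so a no-arg
    // constructor must be available.
    MyObject() { }

    MyObject(Date date, double balance) {
        this.date = date;
        this.balance = balance;
    }

    Date getDate() { return date; }
    double getBalance() { return balance; }

    @Override
    public void write(DataOutput out) throws IOException {
        // Assumption: the date is encoded as epoch milliseconds.
        out.writeLong(date.getTime());
        out.writeDouble(balance);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Fields are read back in exactly the order write() emitted them.
        date = new Date(in.readLong());
        balance = in.readDouble();
    }
}

// Hypothetical helper: serializes an object to bytes and reads it back,
// roughly what happens to a value between the map and reduce phases.
final class RoundTrip {
    static MyObject copy(MyObject source) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            source.write(new DataOutputStream(bytes));
            MyObject target = new MyObject();
            target.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In the job driver, the value type is then registered with job.setMapOutputValueClass(MyObject.class), and the mapper declares MyObject as its output value type.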

Alternatively, you can use the Avro serialization framework.
