How do I know what is wrong with this Map Reduce Program?

  • I posted this question on SO earlier this morning but did not receive any reply, so I am hoping to get some insights here. I am new to Hadoop and trying to write MapReduce programs myself while learning: http://stackoverflow.com/questions/11761135/hadoop-job-fails-while-reducing-java-io-ioexception-type-mismatch-in-value-fro

    My code is:

        public class TopKRecords extends Configured implements Tool {
            public static class MapClass extends Mapper<LongWritable, Text, Text, LongWritable> {
                public void map(LongWritable key, Text value, Context context)
                        throws IOException, InterruptedException {
                    // your map code goes here
                    String[] fields = value.toString().split(",");
                    String year = fields[1];
                    String claims = fields[8];
                    if (claims.length() > 0 && (!claims.startsWith("\""))) {
                        context.write(new Text(year), new LongWritable(Long.parseLong(claims)));
                    }
                }
            }

            public static class Reduce extends Reducer<Text, LongWritable, Text, Text> {
                public void reduce(Text key, Iterable<LongWritable> values, Context context)
                        throws IOException, InterruptedException {
                    // your reduce function goes here
                    context.write(key, new Text("hello"));
                }
            }

            public int run(String args[]) throws Exception {
                Job job = new Job();
                job.setJarByClass(TopKRecords.class);
                job.setMapperClass(MapClass.class);
                job.setReducerClass(Reduce.class);
                FileInputFormat.setInputPaths(job, new Path(args[0]));
                FileOutputFormat.setOutputPath(job, new Path(args[1]));
                job.setJobName("TopKRecords");
                // job.setOutputKeyClass(Text.class);
                // job.setOutputValueClass(Text.class);
                // job.setInputFormatClass(TextInputFormat.class);
                // job.setOutputFormatClass(TextOutputFormat.class);
                // job.setNumReduceTasks(0);
                boolean success = job.waitForCompletion(true);
                return success ? 0 : 1;
            }

            public static void main(String args[]) throws Exception {
                int ret = ToolRunner.run(new TopKRecords(), args);
                System.exit(ret);
            }
        }

    The error I see is:

        12/08/01 16:09:29 INFO mapred.JobClient: Task Id : attempt_201207311050_0042_m_000001_0, Status : FAILED
        java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.LongWritable, recieved org.apache.hadoop.io.Text
            at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1014)
            at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691)
            at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
            at com.hadoop.programs.TopKRecords$MapClass.map(TopKRecords.java:35)
            at com.hadoop.programs.TopKRecords$MapClass.map(TopKRecords.java:26)
            at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
            at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
            at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
            at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:396)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
            at org.apache.hadoop.mapred.Child.main(Child.java:249)

  • Answer:

    Your map function is incompatible with the input data you are processing. As you can see in the exception string:

        java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.LongWritable, recieved org.apache.hadoop.io.Text

    To fix this issue, change the Map function (class) definition to read Text, e.g.:

        public static class MapClass extends Mapper<Text, Text, Text, LongWritable> {
            public void map(Text key, Text value, Context context)
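    A note that is not part of the original answer: with the default TextInputFormat, the mapper's input key really is a LongWritable byte offset, so another commonly used fix is to keep the mapper signature exactly as the question has it and instead declare the map output types on the Job. Without those calls, the framework assumes the map output types match the final output types, which is what triggers the mismatch. A minimal sketch of the run() method under that approach (class names taken from the question; this is a fragment of the TopKRecords driver, not a standalone program):

        // Replacement for run() inside TopKRecords: declare map and reduce
        // output types explicitly so they can differ from each other.
        public int run(String[] args) throws Exception {
            Job job = new Job();
            job.setJarByClass(TopKRecords.class);
            job.setJobName("TopKRecords");
            job.setMapperClass(MapClass.class);
            job.setReducerClass(Reduce.class);

            // MapClass emits (Text year, LongWritable claims); without these
            // calls the framework expects the job's final output types here.
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(LongWritable.class);

            // Reduce emits (Text, Text), so these are the job's output types.
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);

            FileInputFormat.setInputPaths(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            return job.waitForCompletion(true) ? 0 : 1;
        }

    With this configuration the reducer's declared value type (Text) and the map output value type (LongWritable) no longer have to agree, which also sidesteps the "Type mismatch in value" variant of this error mentioned in the SO title.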

Ravi Phulari at Quora
