How do you program using Java?

Can I run multiple mappers and reducers simultaneously in one MapReduce program by using Java threads? If yes, how? If no, why not?

  • Actually, I want to write a program in the Hadoop environment that needs to run two mappers and two reducers simultaneously. Is this possible using Java threads? If yes, how? If no, why not?

  • Answer:

    I think you should start by studying the core components of Hadoop: HDFS and MapReduce. In HDFS a file is broken into block-sized chunks; the block size is configurable, with a default of 64 MB in Hadoop 1.x (128 MB in later releases). The number of mappers spawned depends on the number of input splits, which by default follows the HDFS block size. So if your file is 1 GB = 1024 MB and HDFS is configured with a 64 MB block size, 16 mappers would be spawned, and the framework already runs them in parallel across the cluster nodes. By default a single reducer runs on the output of these 16 mappers. To increase the number of reducers, set it in the job configuration (for example with job.setNumReduceTasks); a Partitioner can then be implemented to control which keys go to which reducer, but this is very use-case dependent.

    If you want to spawn threaded mappers within a mapper, you can check the class MultithreadedMapper. However, it's not advisable, as the basic aim of the framework is to process data in parallel on cluster nodes. Check http://kickstarthadoop.blogspot.in/2012/02/enable-multiple-threads-in-mapper-aka.html for more details.

    A reducer processes data fed from the mapper output, so even if you wrote threaded reducers, how would you control the ingestion of data into the reducer? Hadoop doesn't provide any mechanism to override and customize that, and it seems unnecessary as well.
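To make the arithmetic above concrete, here is a minimal sketch in plain Java with no Hadoop dependency. The class and method names (`SplitSketch`, `estimateMappers`, `partitionFor`) are hypothetical and used only for illustration. It assumes one mapper per block-sized split, which is an approximation of what the framework does (the real split calculation lets the last split run slightly larger), and it mirrors the hash-modulo scheme that Hadoop's default HashPartitioner uses to assign keys to reducers.

```java
public class SplitSketch {

    // Rough estimate of map tasks: one per input split,
    // ASSUMING the split size equals the HDFS block size.
    static long estimateMappers(long fileSizeBytes, long blockSizeBytes) {
        if (fileSizeBytes <= 0) {
            return 0;
        }
        // Ceiling division: a trailing partial block still needs a mapper.
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    // Same idea as the default hash partitioning: mask off the sign bit,
    // then take the remainder by the number of reduce tasks.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;

        // 1 GB file with 64 MB blocks, as in the answer above.
        System.out.println(estimateMappers(1024 * mb, 64 * mb)); // prints 16

        // With a single reducer (the default), every key lands in partition 0.
        System.out.println(partitionFor("hadoop", 1)); // prints 0
    }
}
```

With more than one reduce task configured, `partitionFor` spreads keys over the partitions 0 to numReduceTasks - 1, which is why the number of reducers and the partitioning logic are two separate settings.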

Mita Baxi at Quora
