I have a reducer implemented in Perl that takes massively longer to run on the cluster than it does locally. Why might this be?

  • Specifically, one reduce task has been running for over 24 hours, while the others completed in under 10 minutes.  This one task received a key with a relatively large amount of data, about 500K records, and performs a clustering step that should run in roughly O(n^2) time.  When I pull the input data for this key and the adjacent keys to my local machine and do a test run like "cat data.txt | perl map.pl | sort -k1 | perl reduce.pl", the computation finishes in less than a minute.  I know you can't debug my code without details, but I'm just wondering if anyone has seen anything similar before.

  • Answer:

    It is hard to say from the information given, but there is a good chance there is much less free memory on the Hadoop node and your task is swapping like mad. When things suddenly get 100,000 times slower, this is usually the cause. You can diagnose it with iostat, or better yet install Ganglia and get a graphical view of your whole cluster.
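To check the swap hypothesis on the suspect node, the counters in /proc tell the same story as iostat or vmstat. A minimal sketch (Linux only; these are standard procfs paths and field names, nothing Hadoop-specific):

```shell
# Snapshot free memory and swap state on the worker node.
grep -E 'MemFree|SwapTotal|SwapFree' /proc/meminfo

# pswpin/pswpout count pages swapped in/out since boot; if these climb
# while the reduce task runs, the task is thrashing, not computing.
grep -E '^pswp(in|out) ' /proc/vmstat
```

Sampling these a few seconds apart while the straggler runs (or just watching `vmstat 5`) makes thrashing obvious; Ganglia automates the same view cluster-wide.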

Jay Kreps at Quora

Other answers

Sometimes this is a symptom of faulty infrastructure: CPU, disk, etc.  Do you have other reduce tasks that finished quickly on this same machine? In general, this is exactly the type of problem that speculative execution (enabled by default; check mapred.{map,reduce}.tasks.speculative.execution) is supposed to solve.  If a task is taking much longer than the rest, Hadoop should automatically spawn another instance of it, and occasionally the second instance finishes before the first.  Do you have speculative execution turned off? What do the logs for this particular task show?  In particular, there are R/W log lines that show read/write progress through the reducer.  Do these logs indicate (a) a stuck reducer, or (b) slow-but-steady progress?
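For a streaming job, those flags can be passed on the command line. A sketch, assuming the pre-YARN streaming jar; the jar path, script names, and HDFS paths here are illustrative, not from the question:

```shell
# Hypothetical streaming invocation; the two -D flags confirm/force
# speculative execution for map and reduce tasks (pre-YARN property names).
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming.jar \
  -D mapred.map.tasks.speculative.execution=true \
  -D mapred.reduce.tasks.speculative.execution=true \
  -mapper map.pl -reducer reduce.pl \
  -file map.pl -file reduce.pl \
  -input /data/in -output /data/out
```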

Norbert Burger

There may be multiple reasons for stragglers in a MapReduce cluster. The following OSDI '10 paper covers several, as seen in production clusters: http://research.microsoft.com/en-us/um/people/srikanth/data/mantri_osdi10.pdf It may be helpful to check whether any of them fits your case.

Bikash Sharma

I realized much later that the problem was due to memory issues on the cluster. I didn't realize that the memory allocated to map and reduce tasks applies to the parent Java process that spawns the streaming script, and that my Perl script had to make do with what was left over. I had scaled up the task memory allocation thinking my Perl script would share the budget, so Perl was furiously writing to swap because it had only a few megabytes of memory to work with. No error was generated, so I didn't notice until larger runs caused Perl to return out-of-memory errors.
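Concretely, in a streaming job the JVM heap and the overall slot budget are separate knobs. A hedged sketch using the old mapred property names (values are illustrative; mapred.child.ulimit is in KB, and the jar and paths are placeholders):

```shell
# Property names follow the pre-YARN mapred API; values are illustrative.
#   mapred.child.java.opts      -> heap for the parent JVM wrapper only
#   mapred.job.reduce.memory.mb -> total slot budget (JVM + streaming child)
#   mapred.child.ulimit         -> hard cap on the whole task tree, in KB
# Keeping -Xmx small leaves most of the slot for the Perl process.
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming.jar \
  -D mapred.child.java.opts=-Xmx200m \
  -D mapred.job.reduce.memory.mb=2048 \
  -D mapred.child.ulimit=2097152 \
  -mapper map.pl -reducer reduce.pl \
  -input /data/in -output /data/out
```

The trap described above is that raising -Xmx alone makes things worse: the JVM claims more of the slot and the Perl child gets less.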

Alex Hasha
