How can I learn Hadoop through projects?

If I were to learn Data Mining, Machine Learning, and Information Retrieval algorithms using the Hadoop framework, should I use Java or layers like Hadoop Streaming, Pydoop, Scoobi, Scalding, etc.?

  • Writing Java-based Map Reduce programs seems cumbersome, and frameworks like Scoobi and Scalding have become popular in the Hadoop community.

  • Answer:

    I don't think starting with ready-to-use frameworks like Scoobi or Scalding is a good idea. Hadoop is vast and requires you to think in a totally different paradigm, namely Map Reduce. Starting with a framework is like using a WYSIWYG editor to generate web pages instead of writing the HTML and CSS yourself: things become easy, but you can't tweak them at will, and you have to rely on whatever the tool produces. I'm also starting to use Hadoop for much the same purpose as you, and I believe we should first implement simple algorithms in hand-written Map Reduce. The results may not be very good, but only once you have seen that can you really appreciate the beauty of the frameworks you mentioned. P.S. What do you think of Mahout? I'm taking it up next...
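    To give a rough idea of what thinking in Map Reduce looks like before any framework gets involved, here is a minimal sketch in plain Python. It is not the Hadoop API and it runs on a single machine; it only imitates the map, shuffle, and reduce phases for a word count. The names mapper, shuffle, reducer, and word_count are made up for this illustration.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group all values by key.

    In a real cluster the framework does this between map and reduce;
    here it is simulated with an in-memory dictionary.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Reduce phase: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    mapped = (pair for line in lines for pair in mapper(line))
    grouped = shuffle(mapped)
    return dict(reducer(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    sample = ["Hadoop is vast", "Hadoop is Map Reduce"]
    print(word_count(sample))
```

    Once this shape feels natural, the same mapper and reducer logic can be carried over to Java or to a layer such as Hadoop Streaming, where the framework takes care of the shuffle step.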

Aditya Joshi at Quora