Why does Hive's Sort Merge Bucket Map Join do the join in just one mapper?

  • I have a Hadoop cluster and use Hive for querying. I have two large tables, both bucketed and sorted on the join key, so I think a "sort merge bucket map join" would be helpful. I set the required flags and ran the query. It starts a job with 1 mapper and no reducers. Question: why does Hive do the join in one mapper? Did I miss anything? Because the tables are bucketed on the join key, it seems the join could be done by multiple mappers in parallel. Why doesn't Hive do that?

  • Answer:

    It's tough to say without looking at the explain plan and/or the MapReduce job's job.xml, because the answer may depend on several things: your Hive configuration, the format of your data set, and the metadata associated with it, among others. Two things come to mind off the top of my head:

    1. InputFormat: set by the hive.input.format property, it may be a format that combines the buckets when sending them to the mapper.
    2. Size of your data set and buckets: if the buckets are small, they may be bundled together before being sent to mappers.
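
Both points come down to how input files are grouped into splits, since each split is handled by one map task. A toy Python sketch of that effect follows. It is a simplified model and not Hive's actual split logic; the function names, the bucket sizes, and the 256 MB cap are all hypothetical values chosen for illustration.

```python
from typing import List


def pack_into_splits(file_sizes_mb: List[int], max_split_mb: int) -> List[List[int]]:
    """Greedily group files into splits of at most max_split_mb.

    Simplified stand-in for a combining input format (assumption:
    files are packed in order until the size cap would be exceeded).
    """
    splits: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for size in file_sizes_mb:
        # Close the current split if this file would push it past the cap.
        if current and current_size + size > max_split_mb:
            splits.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        splits.append(current)
    return splits


def one_split_per_file(file_sizes_mb: List[int]) -> List[List[int]]:
    """Model of a non-combining format: every bucket file is its own split."""
    return [[size] for size in file_sizes_mb]


# Hypothetical table: 16 bucket files of 8 MB each.
bucket_sizes_mb = [8] * 16

combined = pack_into_splits(bucket_sizes_mb, max_split_mb=256)
per_bucket = one_split_per_file(bucket_sizes_mb)

print(len(combined))    # 1  -> one split, so one mapper
print(len(per_bucket))  # 16 -> one mapper per bucket
```

In this model, 16 small buckets total 128 MB and fit under the cap, so they collapse into a single split. If something similar is happening on the cluster, the explain plan and the value of hive.input.format are the places to confirm it.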

Mark Grover at Quora
