Is there an efficient way to scan and filter by column qualifier in HBase?
Let's say I have the following Metrics table:

row key: <metric_id>
Families:
  stats_monthly => columns: <YYYYMM>_stat1, <YYYYMM>_stat2, ...
  stats_weekly => columns: <week#>_stat1, ...

Is there an efficient way to query for all the stats columns of a certain metric for a given month? (Something like all values for "metric1:stats_monthly:201107_*".)

Or maybe a better (or the only?) solution is to split the Metrics table into Metrics_Monthly and Metrics_Daily and move the date part from the column qualifier to the row key? The row key would be <YYYYMM>_<metric_id>, and when querying for a specific month I could scan for all row keys that start with <YYYYMM>. Since all write calls will be for the current month, won't all the writes go to the same server with this method?

Update: is using a ColumnPrefixFilter the right solution? Looking at the code, it seems that it enumerates over all the available KeyValues, which isn't as performant as jumping to the first match and scanning from there, as I would expect...
Answer:
You can use a ColumnPrefixFilter to filter keys by their column prefix, or you can even implement your own column filter. Read more about filters here:
http://ofps.oreilly.com/titles/9781449396107/clientapisadv.html

And here is some info about the ColumnPrefixFilter:
https://issues.apache.org/jira/browse/HBASE-3684

ColumnPrefixFilter reduces the amount of data transferred from HBase to the client, but the server still has to scan all the columns. In other words, HBase does not have any index on column names (in contrast to Cassandra, which stores columns in treemaps, guaranteeing sorted data and fast submap() functionality).

So the solution is to implement your own index on top of HBase, which results in additional maintenance and additional round trips to the HBase cluster. Luckily there is already a good implementation called IHBase, which optimizes column access:
https://issues.apache.org/jira/browse/HBASE-2037
https://github.com/ykulbak/ihbase/wiki/Getting-Started
Matthew Tovbin at Quora
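
Editor's illustration (not part of the original answer): the "jump to the first match and scan from there" behaviour that the question expects, and the sorted-map submap() idea the answer mentions, can be sketched with a plain in-memory java.util.TreeMap. This is only a minimal sketch of a prefix range lookup over sorted qualifiers. It does not use the HBase or Cassandra client APIs, it says nothing about how either system is implemented internally, and the class name QualifierPrefixScan and the method withPrefix are hypothetical names chosen for the example.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical helper: models one row's columns as a sorted map of
// qualifier -> value and returns only the qualifiers sharing a prefix.
final class QualifierPrefixScan {

    private QualifierPrefixScan() {}

    // Returns a view of all entries whose key starts with the given prefix.
    // The sorted map seeks directly to the first candidate key instead of
    // visiting every entry.
    static SortedMap<String, String> withPrefix(TreeMap<String, String> columns,
                                                String prefix) {
        // Find the last character that can still be incremented.
        int i = prefix.length() - 1;
        while (i >= 0 && prefix.charAt(i) == Character.MAX_VALUE) {
            i--;
        }
        if (i < 0) {
            // Empty prefix (or only max-valued chars): no finite upper bound.
            return columns.tailMap(prefix);
        }
        // Exclusive upper bound: the prefix with that character incremented,
        // e.g. "201107_" -> "201107`" ('_' + 1 == '`').
        String upper = prefix.substring(0, i) + (char) (prefix.charAt(i) + 1);
        return columns.subMap(prefix, upper);
    }

    public static void main(String[] args) {
        TreeMap<String, String> statsMonthly = new TreeMap<>();
        statsMonthly.put("201106_stat1", "10");
        statsMonthly.put("201107_stat1", "11");
        statsMonthly.put("201107_stat2", "12");
        statsMonthly.put("201108_stat1", "13");

        // Prints only the two 201107_* entries.
        withPrefix(statsMonthly, "201107_")
                .forEach((qualifier, value) ->
                        System.out.println(qualifier + " = " + value));
    }
}
```

The half-open range [prefix, upper) is what lets the lookup skip everything outside the requested month; whether a given store can do the same depends on whether it keeps qualifiers sorted and can seek within a row.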