Stable bloom filter implementation for streaming data?

  • A Bloom filter (http://en.wikipedia.org/wiki/Bloom_filter#Stable_Bloom_filters) is a probabilistic data structure for testing set membership, which makes it useful for finding duplicate elements in a set. A stable Bloom filter is a variant that can be used to find duplicates in a data stream. Since there is no way to store the whole history of a data stream, a stable Bloom filter continuously evicts stale information and checks for duplicates against the more recent elements. I read about stable Bloom filters in the paper by Fan Deng and Davood Rafiei, "Approximately Detecting Duplicates for Streaming Data using Stable Bloom Filters" (http://webdocs.cs.ualberta.ca/~drafiei/papers/DupDet06Sigmod.pdf). I searched for an implementation of a stable Bloom filter but was not able to find one. Does anyone know of open source code for a stable Bloom filter, or has anyone worked with one?

  • Answer:

    You may be interested in stream-lib (https://github.com/clearspring/stream-lib). "A Java library for summarizing data in streams for which it is infeasible to store all events. More specifically, there are classes for estimating: cardinality (i.e. counting things); set membership; and the top-k elements. One particularly useful feature is that cardinality estimators with compatible configurations may be safely merged." Matt Abrams of Clearspring explains it further here: http://www.addthis.com/blog/2012/03/26/probabilistic-counting/#.T4QO8Dd5mc0
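If no ready-made implementation turns up, the structure is small enough to sketch by hand. What follows is a minimal, unoptimized sketch of the stable Bloom filter as the paper describes it: for each arriving element, probe K cells, report a duplicate if none of them is zero, decrement P cells, then set the K probed cells to the maximum value. It is not stream-lib code. The class name, method name, parameter defaults, and the SHA-256 double-hashing scheme are all assumptions chosen for illustration, not values from the paper or from any library.

```python
import hashlib
import random


class StableBloomFilter:
    """Illustrative stable Bloom filter; names and defaults are assumptions."""

    def __init__(self, num_cells=1 << 16, num_hashes=3, decrements=10,
                 max_value=3, seed=None):
        if not 1 <= max_value <= 255:
            raise ValueError("max_value must fit in one byte")
        self.m = num_cells            # number of cells
        self.k = num_hashes           # K: cells probed per element
        self.p = decrements           # P: cells decremented per element
        self.max_value = max_value    # Max: value written on insertion
        self.cells = bytearray(num_cells)   # all cells start at 0
        self.rng = random.Random(seed)

    def _indexes(self, item):
        # Derive K cell indexes from one SHA-256 digest via double hashing
        # (an assumed scheme; any K reasonably independent hashes would do).
        data = item if isinstance(item, bytes) else str(item).encode("utf-8")
        digest = hashlib.sha256(data).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def seen_then_add(self, item):
        """Return True if item looks like a duplicate, then record it."""
        idx = self._indexes(item)
        # 1. Probe: duplicate only if every probed cell is nonzero.
        duplicate = all(self.cells[i] > 0 for i in idx)
        # 2. Decay: decrement P consecutive cells from a random start,
        #    so that old elements are gradually forgotten.
        start = self.rng.randrange(self.m)
        for j in range(self.p):
            c = (start + j) % self.m
            if self.cells[c] > 0:
                self.cells[c] -= 1
        # 3. Insert: set the K probed cells to Max.
        for i in idx:
            self.cells[i] = self.max_value
        return duplicate


if __name__ == "__main__":
    sbf = StableBloomFilter(num_cells=1024, seed=1)
    for event in ["a", "b", "a", "c", "b"]:
        print(event, sbf.seen_then_add(event))
```

Unlike a plain Bloom filter, this one can return false negatives as well as false positives, because the decay step may erase an element that really was seen. The choice of cell count, K, P, and Max controls that trade-off, so the defaults above should be treated as placeholders to tune against your own stream.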

Mo Patel at Quora
