What is the fastest way to compute the total number of occurrences of a bit pattern in C?
-
So I am currently reading a file byte by byte and comparing each byte against the pattern I need to match. A pattern is an unsigned char, and I want to count the occurrences of that pattern, not including occurrences that span byte boundaries. What is the fastest way to go about doing this in C? For example, the pattern "011" occurs twice in a byte containing the ASCII character 'f' = 0x66, since the bit representation of that byte is 01100110, which contains two occurrences of the input bit pattern 011.
-
Answer:
I would make a lookup table that maps each of the 256 possible byte values to the number of times the bit pattern occurs in it. In your example, the lookup table would contain 2 for 'f' (102). How you count overlapping instances of the bit pattern is up to you (say the bit pattern is '000' and the byte value is 0). However you decide to count within each byte shouldn't matter, so long as you're consistent. You can then simply read in each byte, look up its count, and keep adding that to an accumulator.

A little complexity arises when you need to consider occurrences of the bit pattern across byte boundaries. Again using '011' as the pattern, both 0x00 and 0xff would count as 0; however, when the two occur successively you have one occurrence that straddles the two bytes. On most modern systems it's possible to construct a 64KB array holding a lookup table for all possible 16-bit word values, indicating whether any two consecutive bytes contain an occurrence of the bit pattern. You could instead use a bitmap of 65,536 bits (8KB) and just use 0 and 1 to indicate a match or not. However, with a bitmap you'd be doing bitwise manipulation for every byte anyway, so it would probably cost the same to simply mask and shift the original pair to see if the bit pattern occurs. That is, take the pair as the 16-bit value 0x00ff (first byte in the high-order position), shift it left by 4 to get 0x0ff0, discard the low byte to get 0x0f, and use a second 256-byte lookup table to indicate whether an occurrence crosses the boundary.

In any case, you'll need to remember the last byte read and, pairing it with the current byte, check whether the desired bit pattern occurs across the byte boundary, counting that along with the occurrences within each byte.

All of the above assumes that the desired bit pattern is less than 8 bits long. If you need to search for longer bit patterns, then larger and larger lookup tables might not be possible, or might perform worse than bit shifting and masking every set of n bits.
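The table approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the answerer's own code: the names `count_in_bits`, `build_tables`, `count_in_buffer`, `within` and `across` are hypothetical, the pattern is taken to be the low `len` bits of an unsigned char with 1 <= len <= 8, and overlapping matches are counted.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count (possibly overlapping) occurrences of the
   low `len` bits of `pattern` in the low `width` bits of `value`.
   Assumes 1 <= len <= 8 and width <= 16. */
static unsigned count_in_bits(uint32_t value, unsigned width,
                              uint8_t pattern, unsigned len)
{
    uint32_t mask = (1u << len) - 1u;
    unsigned count = 0;
    for (unsigned shift = 0; shift + len <= width; shift++) {
        if (((value >> shift) & mask) == (pattern & mask))
            count++;
    }
    return count;
}

static uint8_t within[256];    /* matches lying entirely inside one byte */
static uint8_t across[65536];  /* matches straddling the boundary of a byte pair */

/* Fill both tables once for a given pattern (assumed names and layout). */
static void build_tables(uint8_t pattern, unsigned len)
{
    for (unsigned b = 0; b < 256; b++)
        within[b] = (uint8_t)count_in_bits(b, 8, pattern, len);

    for (uint32_t w = 0; w < 65536; w++) {
        /* All matches in the 16-bit pair, minus those wholly inside
           the high byte or wholly inside the low byte. */
        across[w] = (uint8_t)(count_in_bits(w, 16, pattern, len)
                              - within[w >> 8] - within[w & 0xffu]);
    }
}

/* Sum the per-byte counts; optionally add matches that span two bytes. */
static unsigned long count_in_buffer(const uint8_t *buf, size_t n,
                                     int include_straddling)
{
    unsigned long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += within[buf[i]];
        if (include_straddling && i > 0)
            total += across[((unsigned)buf[i - 1] << 8) | buf[i]];
    }
    return total;
}
```

After calling `build_tables(0x3, 3)` for the pattern 011, `within['f']` holds 2, and passing 0 for `include_straddling` gives the count the question asks for (no matches across byte boundaries). When reading from a file, filling a large buffer with `fread` and passing it to `count_in_buffer` is likely to be faster than fetching one byte at a time.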
Alistair Israel at Quora
Other answers
All in O(1): If you are concerned only about speed and not about memory, just use a hash table that holds the pattern:number_of_bits mapping. If you are concerned about memory but not about code size, you can write a switch-case that does the same mapping. I think you will already know the methods that are slower than these in terms of algorithmic complexity, such as shifting and counting the number of bits.
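For comparison, the shifting method mentioned above might look like the following minimal sketch. The function name `count_by_shifting` is hypothetical, and the code assumes the pattern is the low `len` bits of an unsigned char with 1 <= len <= 8, counting overlapping matches within a single byte only.

```c
#include <stdint.h>

/* Hypothetical helper: slide a len-bit window across one byte and
   count the positions where it equals the pattern. No table is used. */
static unsigned count_by_shifting(uint8_t byte, uint8_t pattern, unsigned len)
{
    unsigned mask = (1u << len) - 1u;
    unsigned count = 0;
    for (unsigned shift = 0; shift + len <= 8; shift++) {
        if (((byte >> shift) & mask) == (pattern & mask))
            count++;
    }
    return count;
}
```

This does up to eight comparisons per byte instead of a single table lookup, but it needs no memory beyond a few registers.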
Raghavan Santhanam