
What are the existing searchable compression file formats?

  • My application is like this:

    1. A writer writes data to a compressed file (and rotates to the next file when the current one reaches 1GB).
    2. A reader can start at any time to read the file. I'd like the reader to be able to seek to an arbitrary byte position in the file, so that it doesn't need to process up to 1GB of data.
    3. Coarse-granularity seeking is good enough. It's OK to process 1MB of data before the actual read, but not 1GB.

    Is there any existing file format for this?

  • Answer:

    Do you have a small and/or predictable input set? If so, you could use something like DEFLATE with a predefined, static dictionary. zlib supports preset dictionaries (deflateSetDictionary), so you might be able to do this out of the box. You could also compress in output chunks of a fixed size, such as 1MB, and concatenate them into your larger file. To seek to position X, floor X to the nearest chunk boundary and start decompressing from there. (If you use DEFLATE, you'll have to reset the dictionary at the start of every 1MB chunk, so that each chunk can be decompressed without the data before it.)
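
    The chunking idea can be sketched roughly as follows. This is only one possible layout, not an existing file format: it assumes fixed-size chunks of uncompressed data (rather than fixed-size output chunks), each compressed independently with Python's zlib module and stored behind a 4-byte length prefix. The names write_chunked, scan_offsets and read_at, and the length-prefix layout itself, are made up for this illustration.

```python
import io
import struct
import zlib

CHUNK_SIZE = 1 << 20  # assumed: 1 MiB of uncompressed data per chunk


def write_chunked(raw, out, chunk_size=CHUNK_SIZE):
    """Compress `raw` in independent chunks; return each chunk's file offset."""
    offsets = []
    for start in range(0, len(raw), chunk_size):
        offsets.append(out.tell())
        # zlib.compress starts from a fresh state on every call, so each
        # chunk can later be decompressed without the chunks before it.
        blob = zlib.compress(raw[start:start + chunk_size])
        out.write(struct.pack("<I", len(blob)))  # hypothetical length prefix
        out.write(blob)
    return offsets


def scan_offsets(inp):
    """Rebuild the chunk index by hopping over length prefixes (no inflating)."""
    offsets = []
    inp.seek(0)
    while True:
        here = inp.tell()
        header = inp.read(4)
        if len(header) < 4:
            break  # end of file
        (size,) = struct.unpack("<I", header)
        offsets.append(here)
        inp.seek(size, io.SEEK_CUR)
    return offsets


def read_at(inp, offsets, pos, length, chunk_size=CHUNK_SIZE):
    """Return up to `length` uncompressed bytes starting at uncompressed `pos`."""
    result = bytearray()
    index = pos // chunk_size  # floor() to the chunk boundary
    skip = pos % chunk_size    # bytes to discard inside the first chunk
    while len(result) < length and index < len(offsets):
        inp.seek(offsets[index])
        (size,) = struct.unpack("<I", inp.read(4))
        chunk = zlib.decompress(inp.read(size))
        result += chunk[skip:skip + length - len(result)]
        skip = 0
        index += 1
    return bytes(result)
```

    With this layout a reader that starts late can call scan_offsets to find the chunk boundaries and then inflate at most one chunk (about 1MB) before reaching the requested position. A reader following a file that is still being written would also need to cope with a partially written last chunk, which this sketch does not handle.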

Pat Roberts at Quora


Other answers

Compressed pattern matching (http://en.wikipedia.org/wiki/Compressed_pattern_matching) is the research field relevant to this question, but I don't know of any existing software or format that solves the problem.

Mitsuru Oka
