What's the use of Deterministic Regular Expressions?

What's the best way to use Regular Expressions on a text file?

  • Since Regular Expressions seem to accept only strings, the most common way to search a file is either to read it into a single string or to read each line in turn. The problem is that with big files you either hold a huge string in memory or kill the GC with a very large number of strings that are created, used and left for garbage collection. Does anyone know of a better way to approach this problem?

  • Answer:

    The 'best' way depends on your needs, and you won't 'kill' the GC with a large number of strings that quickly. Test and verify your assumptions. When your problem is line-based, read the lines one by one. Prefer File.ReadLines() over File.ReadAllLines(): ReadLines() enumerates the lines lazily, while ReadAllLines() loads the whole file into an array first. If your problem spans the whole file (RegexOptions.Multiline, maybe including line breaks in the patterns), you will have to read it into one string. Use smaller files if that becomes a problem. In all cases, use common sense and/or a profiler.

BigDZ at Stack Overflow
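
The line-by-line approach from the answer above could look roughly like the following. This is only a minimal sketch, not code from the original answer: the class name LineSearch, the method FindMatches and the tuple shape are hypothetical names chosen for illustration, and it assumes that no match needs to span a line break.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
static class LineSearch
{
    // Lazily yields (line number, matched text) for every match in the file.
    public static IEnumerable<(int LineNumber, string Value)> FindMatches(string path, string pattern)
    {
        // Build the Regex once and reuse it for every line.
        var regex = new Regex(pattern, RegexOptions.Compiled);
        int lineNumber = 0;

        // File.ReadLines enumerates the file lazily, so the whole file
        // is never held in memory as one string or one array.
        foreach (string line in File.ReadLines(path))
        {
            lineNumber++;
            foreach (Match m in regex.Matches(line))
            {
                yield return (lineNumber, m.Value);
            }
        }
    }
}
```

A caller would iterate the result, for example `foreach (var hit in LineSearch.FindMatches("app.log", @"ERROR \d+")) Console.WriteLine($"{hit.LineNumber}: {hit.Value}");`, where "app.log" and the pattern are placeholders. Each line string still becomes garbage after its iteration, so a profiler remains the way to check whether that matters for a given file.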


Other answers

If you need to process a full file which doesn't fit the memory comfortably (as in a few KB of data), you probably should look into "proper" parsing. There are many good parser generators around (see http://en.wikipedia.org/wiki/Comparison_of_parser_generators); my favorite is the one by Devin Cook (http://www.devincook.com), but others such as ANTLR (http://www.antlr.org/) are very good as well. The regex engine is not a plain DFA; it does backtrack in some cases. I assume that this is the reason why the regex cannot be applied to a sequence of characters: since efficient random access is needed, holding the data in memory seems to be the obvious solution.

Lucero

Either load line by line, or load portions of the file. If you want your Regex to span line breaks, use the Multiline option.

Akram Mellice
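
When a pattern has to cross line breaks, it helps to check what each .NET option actually changes: RegexOptions.Multiline makes ^ and $ match at every line start and end, while RegexOptions.Singleline is the option that lets . match a newline. The sketch below illustrates the difference; the class and method names (OptionsDemo, CountLineStarts, SpansLines) and the sample patterns are hypothetical, chosen only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, named for illustration only.
static class OptionsDemo
{
    // With Multiline, ^ matches at the start of every line, not just the input.
    public static int CountLineStarts(string text) =>
        Regex.Matches(text, @"^item", RegexOptions.Multiline).Count;

    // Whether "begin ... end" is found depends on whether '.' may match '\n',
    // which is controlled by Singleline rather than Multiline.
    public static bool SpansLines(string text, RegexOptions options) =>
        Regex.IsMatch(text, @"begin.*end", options);
}
```

For the input "begin\nend", SpansLines returns false with RegexOptions.None or RegexOptions.Multiline and true with RegexOptions.Singleline. The two options can be combined when both behaviours are needed.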
