What are the most interesting (crazy) use cases of regexes (regular expressions)?
I recently read this article http://cory.li/battlecode-intro/ and was really impressed with the way regexes were used to encode Dijkstra's algorithm. That got me wondering about people who have used regexes in interesting places.
Answer:
I don't know if this is crazy per se, but many people probably don't realize that Emacs's syntax highlighting and indentation are implemented almost exclusively with regexes. And yet they're surprisingly good!

Syntax highlighting with regexes is, for the most part, not much of a surprise. Most languages have a relatively sane lexical syntax to simplify lexing, so they're almost perfectly regular already. And since regexes in practice are actually more powerful than academic "regular expressions" (i.e. they can match more than just regular languages), it's easy to cover most of the parts that are not regular too.

Indentation, on the other hand, is quite a difficult problem, with or without regexes. Steve Yegge, who wrote JS2-mode, which does not use regexes, sums it up well:

"See, I thought that since I had gone to all the thousands of lines of effort to produce a strongly-typed AST (abstract syntax tree) for JavaScript, it should therefore be really easy to do indentation. The AST tells me exactly what the syntax is at any given point in the buffer, so how hard could it be? It turns out to be, oh, about fifty times harder than incremental parsing. Surprise!" (From http://steve-yegge.blogspot.com/2008/03/js2-mode-new-javascript-mode-for-emacs.html)

So the fact that Emacs does a really good job without a full AST, just with regexes, is pretty impressive!

There's also some very interesting and, serendipitously, very useful emergent behavior: indentation in Emacs is local. It is based only on the line immediately above. Now, sometimes, this is actually a problem, but in my experience it works well enough something like 99.99% of the time (i.e. I have an indentation problem at most every ~10k lines).

More importantly, though, it means Emacs indentation adapts to my code! In particular, there are often weird language constructs and patterns that I indent differently from the rest of the code. With Emacs, I just need to do the first line of a block like this manually, and everything else lines up automatically: it just follows the lead of that first line. On the other hand, a global system like Eclipse insists on reindenting everything according to its built-in style guide; this means I have to fiddle with the tool to add an exception for whatever weird one-off pattern I want to indent properly! (Or just give up and settle for the default style even if it ends up harder to read.) I take advantage of this far more often than I run into limitations of Emacs's indentation, so I think it's a net benefit.

It also means that Emacs gracefully handles files that are not completely well-formatted: it can indent properly as soon as you have a single valid line, ignoring everything that comes before (or after!) it.

Both of these features are particularly useful when working with embedded DSLs in languages with flexible syntaxes like Scheme or Haskell. Often, while the DSL is actually made up of normal Scheme/Haskell code, its meaning and conventions are different; Emacs ends up adapting to these new conventions by itself just thanks to this locality property.
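To make the locality idea concrete, here is a toy sketch in Python. It is not how any Emacs mode is actually written (those are in Emacs Lisp and use many more rules); the function name `indent_for_next_line`, the `step` parameter, and the "line ends with an opening bracket" rule are all assumptions made up for illustration. The only point is that the indentation of a new line is computed from the previous line alone, using regexes.

```python
import re

# Hypothetical rule: a line ending in an opening bracket starts a nested block.
OPENER = re.compile(r"[\[({]\s*$")
LEADING_WS = re.compile(r"[ \t]*")


def indent_for_next_line(previous_line: str, step: int = 4) -> str:
    """Return the whitespace prefix for a new line, looking only at the
    line immediately above it (no parse tree, no global style guide)."""
    base = LEADING_WS.match(previous_line).group(0)
    if OPENER.search(previous_line):
        return base + " " * step
    # Otherwise follow the lead of the previous line, whatever it was.
    return base


if __name__ == "__main__":
    # A manually indented, unconventional first line is simply copied.
    print(repr(indent_for_next_line("      | Leaf x -> x")))
    print(repr(indent_for_next_line("    foo(")))
```

Because only the previous line is consulted, a hand-indented first line of an unusual block sets the indentation for the lines that follow it, which is the adaptive behavior described above.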
Tikhon Jelvis at Quora
Other answers
I was quite impressed the first time I looked at a Java library for recognising gestures (http://www.silentlycrashing.net/ezgestures/) and discovered the guy using regexes to decode them. Basically, he had an object turn mouse or Wii movements into a series of up / down / left / right strokes, represented in a string such as "UDLR". From that, you could then use regexes to match any combination of strokes and dispatch to whatever you wanted to do. Take-home message: if you can translate a sequence of interesting data into a simple alphabetic encoding, you can then use regexes to match and extract patterns from it. They're not just for things that we traditionally think of as "text".
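A minimal Python sketch of that idea follows. It is not the ezgestures API: the helper names `encode_strokes` and `recognise`, the gesture table, and the "dominant axis per step" encoding are assumptions chosen for illustration. Pointer positions are reduced to a string over the alphabet U/D/L/R, and ordinary regexes then decide which gesture was drawn.

```python
import re


def encode_strokes(points):
    """Encode a path of (x, y) points as a string of U/D/L/R strokes.

    Each step is classified by its dominant axis; consecutive identical
    strokes are collapsed. Screen coordinates are assumed (y grows down).
    """
    strokes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # the pointer did not move
        if abs(dx) >= abs(dy):
            stroke = "R" if dx > 0 else "L"
        else:
            stroke = "D" if dy > 0 else "U"
        if not strokes or strokes[-1] != stroke:
            strokes.append(stroke)
    return "".join(strokes)


# Hypothetical gesture table: name -> regex over the stroke alphabet.
GESTURES = [
    ("square", re.compile(r"RDLU")),
    ("zigzag", re.compile(r"(RD){2,}R?")),
    ("swipe right", re.compile(r"R")),
]


def recognise(points):
    """Return the name of the first gesture whose regex matches the whole path."""
    encoded = encode_strokes(points)
    for name, pattern in GESTURES:
        if pattern.fullmatch(encoded):
            return name
    return None


if __name__ == "__main__":
    print(recognise([(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]))
```

The encoding step does the real work here. Once the data is a string, adding a new gesture is just adding one more pattern to the table.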
Phil Jones
A regex that determines whether a number is prime or composite:

    perl -wle 'print "Prime" if (1 x shift) !~ /^1?$|^(11+?)\1+$/' [number]

An explanation of this regex can be read here: http://montreal.pm.org/tech/neil_kandalgaonkar.shtml
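For readers who don't use Perl, the same pattern can be tried in Python; the wrapper name `is_prime` is just an assumption for this sketch. The number is written in unary as a string of n ones. The first alternative, `^1?$`, matches 0 and 1. The second, `^(11+?)\1+$`, matches when some group of two or more ones repeats at least twice to fill the whole string, i.e. when n has a nontrivial divisor. A number is prime exactly when neither alternative matches.

```python
import re

# Matches the unary encoding of every NON-prime: 0, 1, and all composites.
NOT_PRIME = re.compile(r"^1?$|^(11+?)\1+$")


def is_prime(n: int) -> bool:
    """Primality test via backtracking regex on the unary form of n."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return NOT_PRIME.match("1" * n) is None


if __name__ == "__main__":
    print([n for n in range(30) if is_prime(n)])
```

The backreference `\1` is what takes this beyond a regular language, and the backtracking makes it far too slow for anything but small numbers.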
Mostafa Hany Gomaa
The core ideas of regex, our beloved string matching and extraction language, can be easily generalized to other kinds of sequences, such as byte streams, matrices, and even code itself! This talk by Erik Osheim shows the generalization in action for byte streams and matrices: http://2014.flatmap.no/speakers/osheim.html. The generalization is called Kleene algebra. seqex, a library for Clojure, attempts to do this specifically for sequences. Since sequences are the data structure that primarily makes up Clojure code, this can be put to good use when dealing with macros, and so this library provides a module that does just that! You can find out more in the Clojure Conj talk on it.
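As a rough illustration of regex-style matching over non-text sequences, here is a small combinator sketch in Python. It is not the API of seqex or of the library in Osheim's talk; the names `item`, `seq`, `alt`, `star`, and `fullmatch` are invented for this example. Each matcher takes a sequence and a start index and yields every index at which a match could end, which gives concatenation, alternation, and Kleene star over arbitrary items.

```python
from typing import Any, Callable, Iterator, Sequence

Matcher = Callable[[Sequence[Any], int], Iterator[int]]


def item(pred: Callable[[Any], bool]) -> Matcher:
    """Match one element satisfying pred (the analogue of a character class)."""
    def m(xs, i):
        if i < len(xs) and pred(xs[i]):
            yield i + 1
    return m


def seq(*ms: Matcher) -> Matcher:
    """Concatenation: match each sub-pattern in order."""
    def m(xs, i):
        if not ms:
            yield i
            return
        rest = seq(*ms[1:])
        for j in ms[0](xs, i):
            yield from rest(xs, j)
    return m


def alt(*ms: Matcher) -> Matcher:
    """Alternation: match any one of the sub-patterns."""
    def m(xs, i):
        for sub in ms:
            yield from sub(xs, i)
    return m


def star(sub: Matcher) -> Matcher:
    """Kleene star: zero or more repetitions of sub."""
    def m(xs, i):
        yield i  # zero repetitions
        for j in sub(xs, i):
            if j > i:  # skip empty matches so the recursion terminates
                yield from m(xs, j)
    return m


def fullmatch(m: Matcher, xs: Sequence[Any]) -> bool:
    """True if the pattern can consume the entire sequence."""
    return any(end == len(xs) for end in m(xs, 0))


if __name__ == "__main__":
    # "Any number of even integers, then one negative integer."
    pattern = seq(star(item(lambda x: x % 2 == 0)), item(lambda x: x < 0))
    print(fullmatch(pattern, [2, 4, 6, -1]))
```

The same three operations work unchanged on lists of tokens, bytes, or nested data, since the only thing a pattern ever asks of an element is whether a predicate holds.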
Rahul Goma Phulore
Ever since that fateful day in college when I learned about the pumping lemma, I've been waiting patiently to use my regex skills in an unusual and interesting way that makes people go, "Wait! Did you just order pizza with our favourite toppings using regular expressions on our Facebook timeline?" It never happens. Source: https://xkcd.com/208/
Mathi S Manian