What is the proper use of / \ ? and + in Regular Expressions?
In Google Analytics, regular expressions enable goal and funnel tracking where the URL varies. To do that, all the steps need to use regular expressions.
Is this accurate for the homepage: \.com?\/$ or \.com+\/$ (that is, ? or +)? I'm also curious whether the leading \ is necessary.
A step in the funnel looks like this: orders\/new\?group_id=[0-9a-z]+$ or orders\/new+?group_id=[0-9a-z]+$. Should it be \ or + before the ? after new?
Answer:
Here's the documentation for Google Analytics regular expressions. It's what you'd expect, but I needed to check to be sure: http://www.google.com/support/analytics/bin/answer.py?answer=55582
When talking about regular expressions, it's almost always best to list example texts. I'm already a little confused by your first question. Does your home page URL include characters after http://dot.com? An example of something that would come there is a tracking code, like http://dot.com/?utm_campaign=foo.
I'm just going to use my home page as an example. Visitors to my blog come in through these four cases:
http://stubbleblog.com
http://stubbleblog.com?blahblahblah
http://stubbleblog.com/
http://stubbleblog.com/?blahblahblah
So the regular expression to match all four is (caveat: I didn't test this):
\.com\/?(\?.*)?$
The core characteristics, and the differences from your example, are:
- The optional quantifier (?) needs to come after the thing that's optional. In the case above, the trailing / after the domain is optional and the "?blahblah" query string is optional.
- I'm using the anchor ($) like you did, which means the regular expression has to match all the way to the end of the string. That way you don't get hit by a deeper page like http://dot.com/foo. (A URL that itself ends in .com, like http://dot.com/foo.com, would still match, but that's probably a really minor edge case.)
- (\?.*) describes the query string. I think this removes the reason you might have been tempted to bring out the + quantifier.
For your second question, what you want is the first option, the one with the \. The backslash escapes the ? at the start of the query string so that it isn't treated as a regular expression operator. Don't use the + in your second example. That plus modifies the preceding w and would match something like newwwwww.
Tony Stubblebine at Quora
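The patterns discussed in the answer above can be sanity-checked outside Google Analytics. The following is a minimal sketch, not part of the original answer, that uses Python's re module as a stand-in; Google Analytics' own regex engine may differ in details, and the funnel URL and the /about page below are made-up test values.

```python
import re

# Homepage pattern from the answer: optional trailing slash, optional query string.
HOMEPAGE = re.compile(r"\.com\/?(\?.*)?$")

# Funnel step from the question, with the literal "?" escaped.
FUNNEL_ESCAPED = re.compile(r"orders\/new\?group_id=[0-9a-z]+$")
# The "+?" variant: here "+?" is a quantifier on "w", so no literal "?" is matched.
FUNNEL_PLUS = re.compile(r"orders\/new+?group_id=[0-9a-z]+$")


def matches(pattern, url):
    """Return True if the pattern is found anywhere in the URL."""
    return pattern.search(url) is not None


homepage_urls = [
    "http://stubbleblog.com",
    "http://stubbleblog.com?blahblahblah",
    "http://stubbleblog.com/",
    "http://stubbleblog.com/?blahblahblah",
]
homepage_results = [matches(HOMEPAGE, u) for u in homepage_urls]

# A deeper page (hypothetical) should be rejected thanks to the "$" anchor.
deeper_page = matches(HOMEPAGE, "http://stubbleblog.com/about")

# Hypothetical funnel URL with a real query string.
step_url = "/orders/new?group_id=abc123"
escaped_ok = matches(FUNNEL_ESCAPED, step_url)
plus_ok = matches(FUNNEL_PLUS, step_url)

print(homepage_results, deeper_page, escaped_ok, plus_ok)
```

Under these assumptions, all four homepage URLs match, the deeper page does not, and only the escaped version of the funnel pattern matches the query-string URL.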
Other answers
Hi there,
Regular expressions are a great way to extend your Google Analytics implementation when setting up goals and filters. When setting up regular expressions to identify pages, I recommend the following:
- Open your Google Analytics reporting interface in a separate tab.
- Navigate to the 'Top Content' report. This is labeled "Content > Pages" in the new version of Google Analytics.
- Search for the page or pages you are identifying in your goal set-up using your regular expression.
- If the filter finds your page(s), you can feel confident that your regular expression is correct.
About the characters...
+ - This character matches one or more of the previous item. For example, "regu+lar" matches "regular", "reguular" and "reguuuuular".
? - This character matches zero or one of the previous item. For example, "regu?lar" matches "reglar" and "regular".
. - This character is a wildcard. It will match any single character, including numbers, letters and symbols. For example, "..." matches "123", "1#2", "ab?", etc.
\ - This character turns a special regular expression character into a regular (literal) character. Put it directly before the character you want to turn into a regular character. For example, "\." turns the period into a regular character rather than a wildcard.
For your homepage...
Unless you have a filter applied to show the full subdomain, you can identify your homepage using "^/$". The "^" requires that the following character be at the beginning of the field, while the "$" requires that the preceding character be at the end of the field.
If you do have a filter applied, so that your homepage shows up in the top content report with the domain included (for example, as "homepage.com/"), the regular expression has to match what the report shows: "^homepage\.com/$". The backslash, "\", is required because a period, ".", is a wildcard character.
Other resources:
http://www.regular-expressions.info/reference.html
http://www.zytrax.com/tech/web/regex.htm
Morgan Vawter
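The behavior of each character described in the answer above can be tried out directly. This is a small illustrative sketch using Python's re module (an assumption: Google Analytics uses its own engine, though these basic operators behave the same way in common regex flavors). The helper name and the test strings are made up for the example.

```python
import re


def full_matches(pattern, candidates):
    """Return the candidates that the pattern matches from start to end."""
    return [c for c in candidates if re.fullmatch(pattern, c)]


words = ["reglar", "regular", "reguular"]

# "+" requires at least one "u".
plus_hits = full_matches(r"regu+lar", words)
# "?" allows zero or one "u".
optional_hits = full_matches(r"regu?lar", words)
# "." matches any single character, so "..." needs exactly three characters.
dot_hits = full_matches(r"...", ["123", "1#2", "ab?", "ab"])

# An unescaped "." also matches other characters; "\." matches only a period.
loose = re.search(r"homepage.com", "homepageXcom") is not None
strict = re.search(r"homepage\.com", "homepageXcom") is not None

# "^/$" matches a request URI that is exactly "/" and nothing else.
home_only = [uri for uri in ["/", "/about", "/?x=1"] if re.search(r"^/$", uri)]

print(plus_hits, optional_hits, dot_hits, loose, strict, home_only)
```

Note that "^/$" rejects "/?x=1", so a homepage reached with a query string would need a looser pattern.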
PPC Hero just recently posted about regular expressions. It's a quick overview of how the characters work and might be helpful. http://www.ppchero.com/analytics-regular-expression-characters/
Bethany Bey