Perl script to mark lines of tab-delimited file according to content
I have a series of tab-delimited text files. Each file consists of records on separate lines. Every record contains text for each of 50 columns (the columns, or fields, are separated by tabs; there are no other tabs in any file).

I want to mark each record that contains at least two directional indicators in the 10th column. There are 16 directional indicators: n, nne, ne, ene, e, ese, se, sse, s, ssw, sw, wsw, w, wnw, nw, nnw. These could be present in any combination of upper or lower case letters. I want to count as valid indicators only those that are preceded and followed by a single space, comma, semicolon, or period (in any combination; 3 examples in brackets: < n.>, <.eSE >, <;w.>). I want to mark only those records having two of these valid indicators in the 10th column, ignoring any matches in other columns.

For output, I want a file created that lists all records from the input file. Each record should have identical content and structure to the input file (tab-delimited), with the addition that all matching records have appended one tab and the word "matched". Therefore, the input and output files will be identical for all columns, except that the output file will have one additional column containing "matched" for all records containing two valid directional indicators in the 10th field.

I have tried variations of the following:

    #!/usr/bin/perl -w
    while (<>) {
        if (/NEED_A_WORKING_REGEX_HERE/gi) {
            print "$`$&$'\tmarked\n";
        } else {
            print "$`$&$'\n";
        }
    }

In my first attempts with the above, I was using $& to allow manipulation of the printout of a matching string segment. Now, I will be satisfied to get (e.g., by redirection: script.pl in.txt > out.txt) an output file as described in the preceding paragraph. I would like the answer to work with Perl 5.8 and higher.
Answer:
Hello gerry1234-ga,

Thank you for your question. After studying it, I believe I have come up with a solution that works, and I have included it below. If it is not correct, please ask for clarification and include an example input file so that I can work on it directly.

As you indicated in your question, the most important part of the script is the regular expression, so I will try to explain my approach to it. This is the regular expression I came up with from your description:

    ([\s,;.](n|nne|ne|ene|e|ese|se|sse|s|ssw|sw|wsw|w|wnw|nw|nnw)[\s,;.].*){2,}

Within the square brackets are the characters that can go before or after the directional indicator: whitespace (\s, which includes the single space), comma, semicolon, and period. Between the two sets of square brackets are the possible directional indicators; within a regular expression the alternatives are separated by a | character and enclosed in a set of () brackets so that they are grouped. After the second set of square brackets, .* is a greedy match that allows any text to follow before the next indicator. The final part of the regular expression is {2,}, which means that whatever is in the () brackets immediately preceding it must match at least twice. The i modifier on the match makes it case-insensitive.

    #!/usr/bin/perl
    use strict;
    use warnings;

    # the input file name is hard-coded here
    open(my $txtfile, '<', 'TEST.TXT') or die "Cannot open TEST.TXT: $!";
    while (my $origLine = <$txtfile>) {
        # split $origLine into fields on the tab character
        my @splitLine = split(/\t/, $origLine);
        # check whether the 10th field of $origLine matches the pattern
        if (defined $splitLine[9] && $splitLine[9] =~ m/([\s,;.](n|nne|ne|ene|e|ese|se|sse|s|ssw|sw|wsw|w|wnw|nw|nnw)[\s,;.].*){2,}/i) {
            chomp($origLine);
            print $origLine . "\tmatched\n";
        }
        # if it does not match the pattern, print the line unchanged
        else {
            print $origLine;
        }
    }
    close($txtfile);
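One case the regular expression above may miss is two indicators that share a delimiter, such as " n e ": the space after "n" is consumed by the first repetition, so no delimiter is left in front of "e". The following is a minimal sketch of an alternative, not part of the original answer. It counts indicators with lookaround assertions, which test the neighbouring character without consuming it, and it reads from the files named on the command line so that it can be used with redirection. It assumes that a shared delimiter should count for both neighbours and that only a literal space (not other whitespace) is a valid delimiter. The helper names count_indicators and mark_line are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The 16 directional indicators, matched case-insensitively.
my $dir = qr/nne|ene|ese|sse|ssw|wsw|wnw|nnw|ne|se|sw|nw|n|e|s|w/i;

# Count indicators that have a space, comma, semicolon, or period on both
# sides. The lookbehind and lookahead check those characters without
# consuming them, so one delimiter can serve two neighbouring indicators.
sub count_indicators {
    my ($field) = @_;
    my $count = () = $field =~ /(?<=[ ,;.])$dir(?=[ ,;.])/g;
    return $count;
}

# Return the record with "\tmatched" appended when the 10th field holds
# at least two valid indicators; otherwise return the record as it was.
sub mark_line {
    my ($line) = @_;
    chomp(my $record = $line);
    # A limit of -1 keeps trailing empty fields; index 9 is the 10th column.
    my @fields = split /\t/, $record, -1;
    my $hits = defined $fields[9] ? count_indicators($fields[9]) : 0;
    return $hits >= 2 ? "$record\tmatched\n" : "$record\n";
}

# Process files named on the command line, or piped standard input.
if (@ARGV || !-t STDIN) {
    print mark_line($_) while <>;
}
```

Saved as script.pl, this could be run as: perl script.pl in.txt > out.txt. Note that an indicator at the very start or end of the 10th field is not counted, because it lacks a delimiter on one side; whether that is the desired behaviour depends on the data.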
gerry1234-ga at Google Answers