
Shell Scripting: How can I get a list of Display URLs from an HTML file consisting of Google search results?

  • I have a huge text file containing the HTML code for 100+ Google search results (about 200+ pages in MS Word). I need to filter this text to get just the list of display URLs shown in the search results. The display URLs are enclosed in <cite> </cite> tags, so I just need to extract the text enclosed in these tags to get the result I need. Can someone help me do this? (I know some shell scripting, if that can be used.) If there is a faster way to do this directly from the web, that would be even more helpful.

  • Answer:

    You will need to share a sample of the actual text file to get an exact answer, but let's assume that each URL is on its own line and the only content on that line is <cite>url</cite>. Then you can do something like:

    grep '<cite>' file.txt | cut -d '>' -f 2- | rev | cut -b 8- | rev

    The first cut drops the opening <cite> tag. The rev | cut -b 8- | rev sequence reverses the line, drops the 7 characters of the reversed </cite> tag, and reverses the line back.

    Read up on grep, cut, sed and awk in general for this kind of string manipulation. Beyond bash, Python, Perl and PHP are my tools of choice for more complicated string operations.

    PS: I am assuming you are on a *nix system (including Mac). This won't work on Windows.
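
If the file does not have one tidy <cite>url</cite> per line, a slightly more tolerant variant may help. The sketch below is only an illustration under some assumptions: the function name extract_cite_urls is made up for this example, grep supports the -o option (GNU and BSD grep do), each <cite> element sits on a single line, and the only markup nested inside it is <b> for highlighted keywords.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answer): print the text inside
# every <cite>...</cite> pair in the file given as $1, one result per line.
extract_cite_urls() {
    # Step 1: drop <b> and </b>, which are assumed to be the only tags
    #         nested inside a <cite> element.
    # Step 2: grep -o prints each match on its own line, so several
    #         <cite> elements on the same input line are all picked up.
    # Step 3: strip the opening tag (with any attributes) and the closing tag.
    sed -e 's/<b>//g' -e 's/<\/b>//g' "$1" |
        grep -o '<cite[^>]*>[^<]*</cite>' |
        sed -e 's/<cite[^>]*>//' -e 's/<\/cite>//'
}
```

With the function loaded into the shell, something like extract_cite_urls file.txt | sort -u > urls.txt would write a de-duplicated list. For HTML that is messier than assumed here, a real HTML parser in Python or Perl is likely the safer choice.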

Anshu Prateek at Quora
