What exactly is URL scraping?

Web Scraping for dummies?

  • A project at work has come up, and I would save a lot of time and hassle if I could somehow get my hands on a free (cheap is acceptable, as long as I can try it first), easy to use web-scraping program. The URL from which I will be scraping is static, unencrypted, and otherwise extremely vanilla. Suggestions?

  • Answer:

    curl or wget. Or, what platform?

Kwantsar at Ask.Metafilter.Com

Other answers

Pretty much every scripting language has a capacity to do scraping. I'd say pick something you're comfortable with and try out some examples.

mathowie

Install perl, and then download and install the simple web libraries. There's a sample script that comes along with them for straightforward scraping.

thanotopsis

Platform is XP, code-writing skills are minimal.

Kwantsar

If you can't find a script or something that you can modify easily, it'd be relatively trivial to write a perl script for it. I could whip something up fairly fast if you don't find something out of the box you like.

devilsbrigade

wget (http://www.gnu.org/software/wget/wget.html) will get webpages for you. "Scraping" is usually defined as a combination of both getting the webpage and parsing it for whatever data you want. There's no magic bullet for the parsing part, since, uh, webpages are different. You're going to have to write some code of some sort, somehow.

jellicle
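
To make the "get the page, then parse it" split above concrete, here is a minimal sketch in Python using only the standard library. It is an illustration, not anyone's recommended tool from this thread: the names fetch, extract_links and LinkCollector are made up for the example, the URL is a placeholder, and pulling out link targets is just a stand-in for whatever data the real page holds.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Step 1: download the page and return it as text."""
    with urlopen(url) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_links(html):
    """Step 2: parse the HTML and pull out the data of interest."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    # Placeholder URL (an assumption); substitute the page you need.
    page = fetch("http://example.com/")
    for link in extract_links(page):
        print(link)
```

The fetching step is generic and could just as well be done with wget or curl. The parsing step is the part that has to be rewritten for each page, which is why there is no one-size-fits-all tool.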

http://www.httrack.com/ for Windows et al; http://www.sitesucker.us/ for OS X (for future AxeMe reference).

pedantic

I use and enjoy Python and mechanize. You could provide the page and say what you want to scrape from it.

grouse

http://www.urltoys.com/ will do the j-o-b and is not too too difficult to use.

Capn

Perl's WWW::Mechanize (http://search.cpan.org/~petdance/WWW-Mechanize-1.06/) is a good toolkit for the job. You will need some programming skills to do anything with it, though.

sad_otter
