What exactly is URL scraping?

Below are questions and answers about URL scraping. The solution rated most accurate or helpful comes from Google Answers.

There are ten answers to this question.

Best solution

Scraping ascii file with redirected URL using Perl

I am trying to scrape a page that is generated on the fly by a webserver. Consistent data is submitted (by me) to the CGI on the target machine, then a report is generated on the fly and a redirect is issued. I need the script to obtain and follow this...

Answer:

Hello grabby-ga, there are few exact details in your question so I will give a generic Perl script that...
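
The Perl script itself is cut off above. As a rough illustration of the same task in Python, the sketch below posts form data to a CGI endpoint and lets the standard library follow the redirect to the generated report. The function name `fetch_report`, its parameters, and the URL in the usage comment are hypothetical; `urllib` follows a 302 or 303 response to a POST by reissuing the request as a GET.

```python
import urllib.parse
import urllib.request

def fetch_report(cgi_url, form_fields, opener=None, timeout=30):
    """POST form fields to a CGI URL, follow any redirect, return (final_url, text)."""
    # The default opener already includes a redirect handler.
    opener = opener or urllib.request.build_opener()
    data = urllib.parse.urlencode(form_fields).encode("ascii")
    request = urllib.request.Request(cgi_url, data=data)  # data present => POST
    with opener.open(request, timeout=timeout) as response:
        final_url = response.geturl()  # URL after redirects were followed
        text = response.read().decode("ascii", errors="replace")
    return final_url, text

# Hypothetical usage (the URL and field names are placeholders):
# final_url, report = fetch_report("http://host.example/cgi-bin/report", {"id": "42"})
```

The returned `final_url` shows where the redirect led, which helps when the report lives at a different address on every run.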

grabby-ga at Google Answers

Other solutions

Is it possible to build a stealth search engine (web crawling not web scraping) to target just one website online, without them knowing, and what coding skills or anonymity would be required?

The website I am keeping tabs on has a new web page for each new product promotion. So I wonder if it is at all possible to build a search engine / web crawler to keep up to date with it. In other words, I want to collect the subdomain URLs on a given...

Answer:

What you are asking is called web scraping. You would use some kind of script that is scheduled to visit...
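
The core of such a script could look like the sketch below: it pulls the links out of a fetched page and keeps only those on the target site or its subdomains, so a later run can be compared against an earlier one to spot new promotion pages. This is an assumption-laden illustration, not the answerer's code; `LinkCollector`, `collect_links`, and the example hosts are made up, and fetching and scheduling (for example via cron) are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkCollector(HTMLParser):
    """Collect absolute link URLs that stay on one site or its subdomains."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.site = urlparse(base_url).hostname
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        absolute = urljoin(self.base_url, href)  # resolve relative links
        host = urlparse(absolute).hostname or ""
        if host == self.site or host.endswith("." + self.site):
            self.links.add(absolute)

def collect_links(html, base_url):
    """Return the sorted on-site links found in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return sorted(parser.links)

# New pages since the last visit: set(collect_links(...)) - previously_seen
```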

Dwayne Charrington at Quora

Screen-scraping laws in India

Screen-scraping is a process by which content can be pulled off a website/ecommerce engine on the internet using robot/crawler scripts and can then be used on one's portal for end consumers to search or browse. This technique is useful in aggregating required...

Answer:

bajubakait-ga, Thanks for a most interesting question. As far as I know, there is little or no law...

bajubakait-ga at Google Answers

Help me populate a spreadsheet by scraping an RSS feed.

I would like to scrape information from an RSS feed into an Excel-readable text file for a completely legal non-copyright violating use. In a better world, I'd have access to the database that generates the feed, but since this ain't a perfect world...

Answer:

Is Excel XML mapping no good?
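
If XML mapping in Excel does not fit, a small script is another route. The sketch below is an alternative to the suggestion above, not part of it: it reads RSS 2.0 text and writes the chosen item fields as CSV, which Excel can open. The function name `rss_to_csv` and the default field list are assumptions; a real feed may use other element names or namespaces.

```python
import csv
import io
import xml.etree.ElementTree as ET

def rss_to_csv(rss_xml, fields=("title", "link", "pubDate")):
    """Turn the <item> elements of an RSS 2.0 document into CSV text."""
    root = ET.fromstring(rss_xml)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(fields)  # header row
    for item in root.iter("item"):
        # Missing child elements become empty cells.
        writer.writerow([(item.findtext(name) or "").strip() for name in fields])
    return out.getvalue()
```

Saving the returned text to a `.csv` file gives one row per feed entry, with commas inside titles quoted automatically by the `csv` module.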

croutonsupafreak at Ask.Metafilter.Com

Web Scraping for dummies?

A project at work has come up, and I would save a lot of time and hassle if I could somehow get my hands on a free (cheap is acceptable, as long as I can try it first), easy to use web-scraping program. The URL from which I will be scraping is static...

Answer:

curl or wget. Or, what platform?

Kwantsar at Ask.Metafilter.Com

Screen scraping etiquette

Looking to start a 20k+ request screen scraping project. What sort of guidelines (in addition to those I plan to implement) do I need to follow to avoid having the hounds sent out after me? A corporate site has a collection of about 20k freely available...

Answer:

A couple of extra tips, from someone who's both scraped lots of data and defended a website from scrapers...
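
The tips themselves are cut off above, so the following is not a reconstruction of them. It only sketches two commonly cited courtesies for a large scraping job: skipping URLs that the site's robots.txt disallows, and pausing between requests. The function names, the `example-scraper` user agent, and the two-second default delay are all assumptions.

```python
import time
import urllib.robotparser

def allowed_urls(robots_txt, urls, user_agent="example-scraper"):
    """Return the URLs that the given robots.txt text permits for user_agent."""
    rules = urllib.robotparser.RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return [url for url in urls if rules.can_fetch(user_agent, url)]

def polite_iter(urls, delay_seconds=2.0, sleep=time.sleep):
    """Yield URLs one at a time, pausing between consecutive ones."""
    for index, url in enumerate(urls):
        if index:  # no pause before the first URL
            sleep(delay_seconds)
        yield url
```

A fetch loop would then iterate over `polite_iter(allowed_urls(robots_text, urls))`, with `robots_text` downloaded once from the site's `/robots.txt`.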

NormandyJack at Ask.Metafilter.Com

Why is Facebook scraping the links in my private messages?

ParanoiaFilter: Should I be worried that Facebook is scraping my links, in my own private messages? Assuming you are on Facebook, you probably already know that by putting a link in the private messages' textfield, the system will automatically try to...

Answer:

Yes, Facebook records and analyses every single action you take on the site. Yes, you should be worried...

querty at Ask.Metafilter.Com

Is there a good data scraping tool or software to get product data from an online store?

Hi all, is there any good data scraping tool or software to get product data from an online store? I mean, for example, I need to get all product data (title, description, image URL, SKU, etc.) from Amazon (or any online store) in a specific category (for example, laptops)...

Answer:

I found some tutorials on YouTube about scraping websites. I think they can also be used to scrape onlin...

Hanasaki Tsiyuki at Quora

If I have a URL of a news article, how can I grab its RSS Feed Entry?

My application needs to fetch the exact contents of the RSS Feed entry corresponding to that news article URL. Scraping/Parsing the page is not an option.

Answer:

Without knowing your specific application development environment, it's hard to be precise. The key...

Brad Balfour at Quora

What model is the best for URL Similarity/URL Pattern Recognition?

I am trying to come up with a model that will go over a number of URLs in a training set and identify new patterns in the test set. Let's say I have two sets of URLs that come to my website. The first set includes the following: http://www.google.com/url?somert...

Answer:

I would frame it as a classification problem and use character n-grams.
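
To picture the character n-gram idea: each URL is broken into overlapping substrings of length n, and the counts of those substrings become the features a classifier is trained on. A minimal sketch, in which the function name `char_ngrams`, the default n of 3, and the example URL are illustrative assumptions:

```python
from collections import Counter

def char_ngrams(url, n=3):
    """Count the overlapping character n-grams in a URL string."""
    text = url.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

# Example: inspect the most frequent trigrams of one URL.
features = char_ngrams("http://www.example.com/url?q=1")
print(features.most_common(3))
```

URLs that share a pattern, such as a common path prefix or query parameter, end up sharing many n-grams, which is what lets a classifier group them.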

Alvin Grissom II at Quora
