What are some libraries in Python that can help me make a web crawler?
The libraries can be from the standard library or third-party. The crawler only needs to scrape some content from the pages.
Answer:
Modules: urllib for fetching pages and Beautiful Soup for HTML parsing. Framework: Scrapy.
Sachit Adhikari at Quora
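As a rough illustration of the fetch-then-parse split this answer describes, here is a minimal sketch that uses only the standard library. It assumes Python 3, where urllib lives in `urllib.request`, and it substitutes the built-in `html.parser` for Beautiful Soup so that nothing needs to be installed. The names `LinkExtractor`, `extract_links`, and `fetch`, and the User-Agent string, are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from <a href="..."> tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page they came from.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Parse an HTML string and return the links found in it."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


def fetch(url, timeout=10):
    """Download a page and decode it to text (assumes UTF-8 if unspecified)."""
    request = Request(url, headers={"User-Agent": "example-crawler/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

A crawler would call `fetch(url)` and pass the result to `extract_links(html, url)` to find the next pages to visit. A framework such as Scrapy handles that loop, along with scheduling and retries, for you.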
Other answers
lxml (http://lxml.de/) is an excellent library. It's fast and powerful, but it takes some time (a few hours) to learn properly.
Beautiful Soup (http://www.crummy.com/software/BeautifulSoup/bs4/doc/) is another excellent library. It's a joy to work with, and you can get up and running quickly.
If speed isn't important, use Beautiful Soup. If speed is a factor, go with lxml: on some problems, the difference in running time can be a matter of hours, even days.
I'd also suggest using the requests library (http://docs.python-requests.org/en/latest/) to fetch each web page rather than urllib or urllib2. It makes working with HTTP requests a pleasure.
Michael Kolodny
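To make the combination suggested in this answer concrete, here is a small sketch that fetches a page with requests and parses it with Beautiful Soup. It assumes both packages are installed (`requests` and `beautifulsoup4`). The function names `parse_page` and `scrape`, and the choice to extract the title and paragraph text, are illustrative only.

```python
import requests
from bs4 import BeautifulSoup


def parse_page(html, parser="html.parser"):
    """Return the page title and the text of every <p> element."""
    soup = BeautifulSoup(html, parser)
    title = soup.title.get_text(strip=True) if soup.title else None
    paragraphs = [p.get_text(strip=True) for p in soup.find_all("p")]
    return title, paragraphs


def scrape(url, parser="html.parser", timeout=10):
    """Fetch a URL with requests and parse the response body."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # Stop early on 4xx/5xx responses.
    return parse_page(response.text, parser)
```

Beautiful Soup can use lxml as its parsing backend: if lxml is installed, passing `parser="lxml"` keeps the same interface while using the faster parser. How much time that saves depends on the pages being scraped.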
I would add mechanize (http://wwwsearch.sourceforge.net/mechanize/).
Ca Isah