How do I crawl various websites with one crawler?
-
I want to build a web crawler in Java. Could someone give me some tips, or some sample source code, to help me get started? Thanks in advance.
-
Answer:
You can try Scrapy + Selenium. I did something similar for GPlay; check the project on GitHub: https://github.com/Stravanni/Gplay-Scraper
Giovanni Simonini at Quora
Other answers
There cannot be one common method, because websites are structured differently. You can, however, define a separate Scrapy config for each site and crawl them that way. Alongside that, you can use PhantomJS, a headless browser.
Shobhit Jain
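The per-site configuration idea in Shobhit Jain's answer could be sketched roughly as follows. This is not Scrapy itself, only a minimal illustration using the Python standard library: one generic extractor, plus a small table of rules keyed by hostname. The hostnames, the `SITE_CONFIGS` table, the `tag`/`css_class` keys and the `extract_items` helper are all hypothetical names made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical per-site rules: which tag/class holds the item text on each site.
SITE_CONFIGS = {
    "shop.example.com": {"tag": "h2", "css_class": "product-name"},
    "news.example.org": {"tag": "a", "css_class": "headline"},
}


class TextByClassParser(HTMLParser):
    """Collects the text of every <tag> element whose class list contains css_class."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.depth = 0      # > 0 while inside a matching element
        self.buffer = []
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        if self.depth:
            self.depth += 1  # nested element with the same tag name
        elif self.css_class in (dict(attrs).get("class") or "").split():
            self.depth = 1
            self.buffer = []

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.results.append("".join(self.buffer).strip())

    def handle_data(self, data):
        if self.depth:
            self.buffer.append(data)


def extract_items(url, html):
    """Pick the rules for the URL's host and apply them to the page's HTML."""
    host = urlparse(url).hostname
    config = SITE_CONFIGS.get(host)
    if config is None:
        raise KeyError(f"no crawl config for {host}")
    parser = TextByClassParser(config["tag"], config["css_class"])
    parser.feed(html)
    parser.close()
    return parser.results
```

Supporting another site then means adding one entry to the table rather than writing a new crawler. Real sites usually need richer rules (several fields, pagination, login), which is where a framework such as Scrapy becomes worthwhile.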
Try to find the HTTP endpoints that their website or mobile app calls, and crawl via those. It is much easier, since most of them return plain JSON. Do write test cases for them, though :)
Anonymous
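The endpoint approach in the answer above might look something like this minimal sketch. It assumes, purely for illustration, an endpoint that accepts a `page` query parameter and returns a JSON object with an `items` list; real endpoints will differ, so the URL shape, the `items` key and the function names here are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def fetch_json(url, timeout=10):
    """Default fetcher: GET a URL and decode the JSON body."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def crawl_endpoint(base_url, fetch=fetch_json, max_pages=50):
    """Yield items page by page until an empty page or max_pages is reached.

    Assumes (hypothetically) that the endpoint takes ?page=N and
    returns an object shaped like {"items": [...]}.
    """
    for page in range(1, max_pages + 1):
        payload = fetch(f"{base_url}?{urlencode({'page': page})}")
        items = payload.get("items", [])
        if not items:
            break
        yield from items
```

Passing the fetcher in as a parameter keeps the crawl logic testable without network access, which fits the advice to write test cases: a test can supply a fake `fetch` that returns canned pages.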