How do you make an application that picks URLs from a database, crawls the web, extracts information about each page, and stores that information in the database?
Answer:
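Use caution when writing your own web crawler. There are already quite a few crawlers out there, and adding more may cause traffic problems for the sites they visit. However, if you must do it, here is a possible procedure that you could use:

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.HashSet;
    import java.util.Set;

    public abstract class Crawler {
        private final Set<String> seen = new HashSet<>();         // every URL discovered so far
        private final Deque<String> pending = new ArrayDeque<>(); // URLs still waiting to be fetched

        // Pseudo-code placeholders: a concrete subclass has to supply these.
        protected abstract void addSomeBeginningURLS(Set<String> urls);
        protected abstract String getTextFromWebPage(String url);
        protected abstract String getTheURLFromTheHREF(String text, int linkIndex);

        public void crawl() {
            addSomeBeginningURLS(seen);
            pending.addAll(seen);
            // Iterative loop over a work queue; no recursion.
            while (!pending.isEmpty()) {
                String text = getTextFromWebPage(pending.poll());
                for (int linkIndex = text.indexOf("href"); linkIndex != -1;
                         linkIndex = text.indexOf("href", linkIndex + 1)) {
                    String url = getTheURLFromTheHREF(text, linkIndex);
                    if (seen.add(url)) { // add() returns true only for a URL not seen before
                        pending.add(url);
                    }
                }
            }
        }
    }

Some of these methods are just pseudo-code (declared abstract here); however, they should not be too hard to implement. Newly found links go onto a queue instead of being added to the set while it is being iterated, because Java's HashSet iterator throws ConcurrentModificationException when the set changes underneath it.

Also, any real web crawler would need a random-access file (also called a direct-access file) or a similar on-disk store, such as the database mentioned in the question, because the above model keeps every URL in RAM. However, this is just meant to outline the basic format of a web crawler.

NOTE: While it seems very appealing to use recursion in this situation, it is not appropriate. Tracking the recursive calls would exhaust the call stack, and a recursive crawler that does not remember visited pages would get caught in a loop of links very easily.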
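The link-extraction step that the answer leaves as pseudo-code (getTheURLFromTheHREF) could be filled in along the following lines. This is only a minimal sketch: the class name LinkExtractor and the method extractLinks are made up for illustration, and it assumes hrefs are quoted and that a regular expression is good enough. A real HTML parser would cope with malformed pages much better.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class LinkExtractor {
    // Matches href="..." or href='...' and captures the quoted value in group 2.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*([\"'])(.*?)\\1", Pattern.CASE_INSENSITIVE);

    /** Returns the http(s) links found in html, resolved against pageUrl, fragments removed. */
    static List<String> extractLinks(String pageUrl, String html) {
        URI base = URI.create(pageUrl);
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String raw = m.group(2).trim();
            try {
                // Turn relative links such as "page2.html" or "/root" into absolute URLs.
                URI resolved = base.resolve(raw);
                String scheme = resolved.getScheme();
                if ("http".equals(scheme) || "https".equals(scheme)) {
                    String link = resolved.toString();
                    int hash = link.indexOf('#');
                    // Drop "#fragment" so the same page is not queued twice.
                    links.add(hash >= 0 ? link.substring(0, hash) : link);
                }
            } catch (IllegalArgumentException e) {
                // The href is not a valid URI; skip it.
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"page2.html\">next</a> <a href='mailto:me@example.com'>mail</a>";
        System.out.println(extractLinks("http://example.com/a/index.html", html));
    }
}
```

Returning all links of a page in one call, rather than one link per "href" index, keeps the crawl loop simple: each returned URL can be added to the set of known URLs and, if it is new, queued for fetching.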
Source: wiki.answers.com