How to extract text from a web page?

Is there an API which can extract article text from any blog page?

  • I'd like to read clean article text into my own RSS reader. Is there an API I can use to extract only the article text from any blog web page? Thanks.

  • Answer:

    Yes, there are several ways. In every case, you first fetch the page's full HTML source, then extract the subset you need from it and use it as you wish. You can use the set of MS COM objects for working with HTML, the MS DOM, a Tidy tree, etc., but none of these makes it a trivial task.
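
As a rough illustration of the two steps described above (get the full HTML, then keep only the subset you need), here is a minimal sketch in Python using only the standard library rather than the MS COM or Tidy tools mentioned in the answer. The names `ParagraphExtractor`, `extract_text`, `fetch_html` and `SKIP_TAGS` are made up for this example, and the heuristic (keep the text of `<p>` elements, ignore scripts, styles and navigation) is an assumption that will not suit every blog layout.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

# Elements assumed never to hold article text (a guess; adjust per site).
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside", "form"}


class ParagraphExtractor(HTMLParser):
    """Collects the text of <p> elements, ignoring anything inside SKIP_TAGS."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0        # > 0 while inside a skipped element
        self.in_paragraph = False
        self.current = []          # text fragments of the paragraph being read
        self.paragraphs = []       # finished paragraphs

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag == "p" and self.skip_depth == 0:
            self._flush()          # an unclosed <p> ends when the next one starts
            self.in_paragraph = True

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag == "p":
            self._flush()

    def handle_data(self, data):
        if self.in_paragraph and self.skip_depth == 0:
            self.current.append(data)

    def _flush(self):
        # Join the fragments and collapse runs of whitespace.
        text = " ".join("".join(self.current).split())
        if text:
            self.paragraphs.append(text)
        self.current = []
        self.in_paragraph = False


def extract_text(html):
    """Step 2: reduce an HTML document to its paragraph text."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()                # keep a trailing paragraph that was never closed
    return "\n\n".join(parser.paragraphs)


def fetch_html(url):
    """Step 1: download the full HTML source of a page."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling `extract_text(fetch_html(url))` with the address of a blog post would return its paragraphs separated by blank lines. Pages that build their content with JavaScript, or that put article text outside `<p>` elements, would need a different approach.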

bababyt at Yahoo! Answers
