How to convert XML files to PDF files?

How can I convert a wiki xml dump file to individual html/text files?

  • I am developing a web system and need a large number of individual HTML/text files. Are there any tools or commands that can be used to convert a wiki XML dump file into individual HTML files on Windows?

  • Answer:

    Write an XSL/XSLT stylesheet and apply it to your XML. If the XML file references the stylesheet through an xml-stylesheet processing instruction, a browser will transform it into HTML when you open the file. A tutorial with sample stylesheets is available at http://www.w3schools.com/xsl/

Anon User at Quora

Other answers

You can find a list of Wikipedia parsers at http://www.mediawiki.org/wiki/Alternative_parsers. You can also use Wikiprep (http://www.cs.technion.ac.il/~gabr/resources/code/wikiprep/). Wikiprep processes a Wikipedia dump and writes its output to a single XML file; you can modify Wikiprep to write its output to multiple XML files instead. Alternatively, you can read the MediaWiki text table (http://www.mediawiki.org/wiki/Manual:Text_table) and page table (http://www.mediawiki.org/wiki/Manual:Page_table) and generate XML files from those tables.

Vineet Yadav

I'd agree with the XSL/XSLT transformation, or you could just script it. I'd probably start with XSL/XSLT, and if that proved too difficult I'd write a Ruby script, but your scripting language of choice will do the trick. This is exactly why my wiki system uses Markdown files on Dropbox - no XML dump files, just your own plaintext data in your Dropbox folder :)

Mark Beattie
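
For anyone who wants to try the "just script it" route mentioned above, here is a minimal sketch in Python (rather than Ruby) using only the standard library. It assumes the usual MediaWiki export layout, where each <page> element contains a <title> and a <revision> with a <text> child. The function names (split_dump, safe_filename, local_name) are made up for this example. Note that it writes the raw wikitext of each page to a .txt file; turning that wikitext into HTML would still need one of the parsers listed in the other answers.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path


def local_name(tag):
    """Strip the '{namespace}' prefix that MediaWiki export files put on tags."""
    return tag.rsplit("}", 1)[-1]


def safe_filename(title):
    """Replace characters that Windows does not allow in file names."""
    return re.sub(r'[<>:"/\\|?*]', "_", title).strip() or "untitled"


def split_dump(dump_path, out_dir):
    """Write the wikitext of each <page> to its own .txt file; return the count."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    # iterparse streams the file, so the whole dump is never parsed in one go.
    for _event, elem in ET.iterparse(dump_path, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = "", ""
        # Assumption: one revision per page; with several, the last <text> wins.
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text or ""
            elif name == "text":
                text = child.text or ""
        (out / (safe_filename(title) + ".txt")).write_text(text, encoding="utf-8")
        count += 1
        elem.clear()  # drop the page's children once the file is written
    return count
```

Calling split_dump("dump.xml", "pages") would create one file per page in a pages folder. Titles that differ only in replaced characters or in letter case can collide on Windows, so a real script would probably want to add the page id to the file name.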

I found a Java-based solution to this question: the gwtwiki Java Wikipedia API (Bliki engine) can do the conversion. See http://code.google.com/p/gwtwiki/wiki/MediaWikiDumpSupport#Example_how_to_convert_all_wiki_pages_from_a_dump_into_static_HT

Changqi Cai

I would say it is a rather dumb idea to convert the Wikipedia database dump into individual text/HTML files. You would be better off getting an SQL dump, loading it into a database, and accessing it programmatically. One way of accessing the data in such a database programmatically is JWPL: http://www.ukp.tu-darmstadt.de/software/jwpl/ Hope this helps.

Anonymous
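
To give a rough idea of what "load it into a database and access it programmatically" can look like, here is a small Python sketch. It is not JWPL and it does not use the real MediaWiki schema: it swaps in SQLite (from the standard library) and a made-up one-table layout called page, with hypothetical helper names build_index and fetch_page, just to illustrate looking pages up by title instead of keeping one file per page.

```python
import sqlite3


def build_index(pages, db_path=":memory:"):
    """Store (title, wikitext) pairs in a simplified, hypothetical 'page' table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page (title TEXT PRIMARY KEY, body TEXT)"
    )
    # INSERT OR REPLACE keeps the latest body if a title appears twice.
    conn.executemany(
        "INSERT OR REPLACE INTO page (title, body) VALUES (?, ?)", pages
    )
    conn.commit()
    return conn


def fetch_page(conn, title):
    """Return the stored wikitext for a title, or None if it is absent."""
    row = conn.execute(
        "SELECT body FROM page WHERE title = ?", (title,)
    ).fetchone()
    return row[0] if row else None
```

The pages argument can be any iterable of (title, text) tuples, for example produced while streaming the XML dump. Passing a file path as db_path instead of ":memory:" keeps the database on disk between runs.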
