How do I get a random sample of Wikipedia articles?
-
I would like a sample of approximately 10,000 articles (just the URLs). Is there a tool that can extract a random sample of Wikipedia articles? Or would I have to use a web crawler or similar automated script to visit http://en.wikipedia.org/wiki/Wikipedia:Random 10,000 times and record the URL each time?

Sub-question: Is 'Special:Random' random enough for a sample of 10,000 articles? This FAQ says that the feature is not really random, but at what sample size could one say that it's 'good enough' or 'as good as it gets'?

I would also like to get a random sample of all the 'Featured articles' and of all the 'Good articles'. Should I use http://meta.wikimedia.org/wiki/User:Dapete/Toolserver#Random_pages instead of Special:Random (in conjunction with the approach described above)?
-
Answer:
The MediaWiki API helps here: https://en.wikipedia.org/w/api.php?action=query&list=random&rnnamespace=0&rnlimit=20 will give you 20 random* main-namespace articles (as page titles, from which the URLs follow directly). If you want different settings, the documentation at https://en.wikipedia.org/w/api.php?action=help&modules=query%2Brandom should help. Requesting 10,000 items at once is not allowed (the limit is 20 for anonymous requests, or 1000 for authorised bot and sysop accounts), but you can simply chain requests (remember not to make too many requests in quick succession, or you may be blocked).

* "Random" is a complex term, and technically these aren't truly random, but the selection is almost certainly sufficiently unskewed to suit your needs.
James Forrester at Quora
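The chained-request approach from the answer above could be sketched in Python roughly as follows. This is only an illustration, not part of the original answer: the function names (`fetch_batch`, `title_to_url`, `sample_urls`), the User-Agent string, and the one-second pause between requests are all assumptions, and the batch size of 20 simply mirrors the example URL.

```python
import json
import time
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def fetch_batch(limit=20):
    """Ask the API for one batch of random main-namespace page titles."""
    params = urllib.parse.urlencode({
        "action": "query",
        "list": "random",
        "rnnamespace": 0,
        "rnlimit": limit,
        "format": "json",
    })
    request = urllib.request.Request(
        f"{API_URL}?{params}",
        # Placeholder User-Agent (assumption); replace with your own contact details.
        headers={"User-Agent": "random-sample-sketch/0.1 (example contact)"},
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    return [page["title"] for page in data["query"]["random"]]


def title_to_url(title):
    """Turn a page title into an article URL (spaces become underscores)."""
    return "https://en.wikipedia.org/wiki/" + urllib.parse.quote(title.replace(" ", "_"))


def sample_urls(target, fetch=fetch_batch, pause=1.0):
    """Chain batches until `target` distinct URLs have been collected."""
    seen = {}  # dict keeps insertion order and drops duplicate URLs
    while len(seen) < target:
        for title in fetch():
            seen.setdefault(title_to_url(title), None)
        if len(seen) < target:
            time.sleep(pause)  # assumed pause, to avoid closely spaced requests
    return list(seen)[:target]
```

Calling `sample_urls(10000)` would need at least 500 requests at 20 titles each, so with the assumed one-second pause it would run for several minutes. The `fetch` parameter is there so the loop can be tried against a stand-in function before touching the live API.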
Other answers
James Forrester gives a good general answer (I didn't know the API provided this). For Featured and Good articles, don't forget that all such articles are listed on their own pages: http://enwp.org/WP:FA and http://enwp.org/WP:GA. Could you just copy those lists into a spreadsheet, generate a list of random numbers within the proper range, and then select the articles manually?
Pete Forsyth
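The spreadsheet idea in the answer above can also be done in a few lines of Python once the titles have been copied out of the Featured or Good article list. This is a minimal sketch under assumptions: `sample_titles` is a hypothetical helper name, the titles are assumed to already be in a Python list (one string per article), and the optional `seed` is only there to make a draw repeatable.

```python
import random


def sample_titles(titles, k, seed=None):
    """Draw up to k distinct titles uniformly at random from a list of titles."""
    rng = random.Random(seed)
    # Strip whitespace, skip blank lines, and drop duplicates while keeping order.
    unique = list(dict.fromkeys(t.strip() for t in titles if t.strip()))
    # random.sample picks without replacement; cap k at the list length.
    return rng.sample(unique, min(k, len(unique)))
```

For example, `sample_titles(featured_titles, 500)` would return 500 distinct titles, where `featured_titles` stands for whatever list was copied from the Featured articles page.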