help me find word lists to build better bots!
I need an easy way to find lists of words related to specific topics, preferably ones of particular parts of speech, for my bots. The bots that I make need big lists of words so that the content that they spit out has a lot of different variations. I like to hand-curate these lists, so I'm not interested in using something like the Wordnik API, but I'd really like it if there were Big Lists Of Words related to particular topics.

It seems like people would have done this already, but I can't find anything like it. I can find individual lists of 20 or so words on certain themes on language learning websites-- lists of violent verbs, or color nouns, stuff like that-- and some of the Wordnik lists made by the community have been useful. It seems like something like this probably exists somewhere (some big English as a second language site, or keywords from word searches, or someplace that grabs data from Wiktionary or SOMETHING) but I have no idea where to start.

Here's the kind of lists I've looked for in the past. I already have these covered and do not need suggestions for them; I am just trying to give examples.

- Positive adjectives ("great", "wonderful", etc.)
- Violent (nouns/adjectives/verbs)
- Color names that are just colors and not words for something else (as in, yellow, red and purple, but not lilac, eggplant or amethyst)

Bonus points if sorting tools are available. Any suggestions will be appreciated-- it's hard to tell exactly what will be useful in advance, since looking at these lists often ends up being the inspiration for a new bot idea.
Answer:
Are there any easy ways to just grab the words from one column of a Wikipedia table and get them in text form? The approach that has worked best for me is to select and copy the whole table, paste that into a spreadsheet, and then copy just the column I actually want from there into my text editor. It's a little hacky but it works and is reasonably fast.
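If the spreadsheet step gets tedious, the same copy-and-paste trick can be done in a few lines of Python. This is only a rough sketch: it assumes the browser puts a tab between cells when you copy a table as plain text (most do, but not all), and the function name and sample text are made up for illustration.

```python
import csv
import io


def column_from_pasted_table(pasted_text, column_index, skip_header=True):
    """Return the non-empty cells of one column from tab-separated table text."""
    # QUOTE_NONE keeps stray quotation marks in cells from confusing the reader.
    reader = csv.reader(io.StringIO(pasted_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = [row for row in reader if row]
    if skip_header:
        rows = rows[1:]
    words = []
    for row in rows:
        # Short rows (merged or missing cells) are skipped rather than raising.
        if column_index < len(row):
            cell = row[column_index].strip()
            if cell:
                words.append(cell)
    return words


# Hypothetical sample standing in for text pasted from a copied table.
SAMPLE = "Animal\tYoung\tGroup\nbadger\tcub\tcete\ncat\tkitten\tclowder\nfox\tkit\tskulk\n"

words = column_from_pasted_table(SAMPLE, 0)
print("\n".join(words))
```

Paste the copied table into a text file, read it in place of SAMPLE, and the printed output is a one-word-per-line list ready for hand-curating.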
NoraReed at Ask.Metafilter.Com
Other answers
Darius Kazemi (aka http://tinysubversions.com/) has a project called Corpora (https://github.com/dariusk/corpora) that's essentially a collection of word lists for bot makers, which may be handy for some of these ideas.
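For anyone who wants to pull a list out of a JSON file like the ones in that repository, here is a small sketch. It assumes each file is a JSON object with a description plus one or more keys holding a plain list of strings; that layout is an assumption, so check the actual file first. The sample data and the function name are invented for the example.

```python
import json


def word_lists_from_json(text):
    """Return {key: sorted unique words} for each top-level list of strings."""
    data = json.loads(text)
    return {
        key: sorted(set(value))
        for key, value in data.items()
        if isinstance(value, list) and all(isinstance(item, str) for item in value)
    }


# Hypothetical file contents, not copied from the real repository.
SAMPLE_JSON = """
{
  "description": "A made-up list of moods.",
  "moods": ["wistful", "giddy", "wistful", "sullen"]
}
"""

lists = word_lists_from_json(SAMPLE_JSON)
for name, words in lists.items():
    print(name, len(words))
    print("\n".join(words))
```

Deduplicating and sorting up front makes the list easier to skim when deleting the words you don't want.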
cortex
There are a lot of word lists at https://myvocabulary.com/word-list/ (if I've understood what you're looking for).
billiebee
For something like that I'd usually get into web scraping with Python and Beautiful Soup (http://www.crummy.com/software/BeautifulSoup/) or similar (I've used it for a couple of minor projects). However, in this case it looks like you can get away with using https://import.io/, which you can point at a webpage to see what it can extract (and which then lets you download the result as a CSV file you can open in Excel for further manipulation). It works quite well on the List of Animal Names, for example.
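To give a feel for the scraping route, here is a minimal sketch of pulling one column out of an HTML table. Beautiful Soup would be the usual choice; this version swaps in the standard library's html.parser so it runs with nothing installed. The class name, function name, and sample HTML are all made up for the example, and it ignores nested tables and cells that span several rows or columns.

```python
from html.parser import HTMLParser


class TableColumnParser(HTMLParser):
    """Collect the text of one column from every table row that is fed in."""

    def __init__(self, column_index):
        super().__init__()
        self.column_index = column_index
        self.cells = []            # text found in the wanted column
        self._cell_position = -1   # index of the current cell within its row
        self._in_cell = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cell_position = -1
        elif tag in ("td", "th"):
            self._cell_position += 1
            self._in_cell = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._in_cell:
            self._in_cell = False
            if self._cell_position == self.column_index:
                # Join text fragments (e.g. link text) and collapse whitespace.
                text = " ".join("".join(self._buffer).split())
                if text:
                    self.cells.append(text)

    def handle_data(self, data):
        if self._in_cell:
            self._buffer.append(data)


def extract_column(html, column_index):
    parser = TableColumnParser(column_index)
    parser.feed(html)
    parser.close()
    return parser.cells


# Hypothetical page fragment; a real page would be downloaded first.
SAMPLE_HTML = """
<table>
  <tr><th>Animal</th><th>Young</th></tr>
  <tr><td><a href="/wiki/Badger">Badger</a></td><td>cub</td></tr>
  <tr><td>Cat</td><td>kitten</td></tr>
</table>
"""

print("\n".join(extract_column(SAMPLE_HTML, 0)))
```

The header cell comes back as the first item, so drop it with a slice if you only want the words. Downloading the page itself is left out here; saving it from the browser and reading the file works fine for a one-off list.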
CrystalDave
http://www.thesaurus.com/ has a search feature, and 42 words for "wonderful" (http://www.thesaurus.com/browse/wonderful?s=t). Wikipedia has lists of English words (https://en.wikipedia.org/wiki/Lists_of_English_words), including a list of Newspeak words (https://en.wikipedia.org/wiki/List_of_Newspeak_words).
Little Dawn
Also, this is pretty spotty and involves some manual collection, but for nouns in particular there are a whole lot of "types of x" lists and indexes on Wikipedia, e.g. (but certainly not limited to) stuff like https://en.wikipedia.org/wiki/Lists_of_English_words for specific subsets of vocab. That's one of the main resources I use for collecting stuff like country names, animal species, international common given names, etc.
cortex
http://www.random-generator.com/index.php?title=Main_Page has a ton of word lists (and example generators)
CrystalDave
Are there any easy ways to just grab the words from one column of a Wikipedia table and get them in text form? I'd love to, say, copy and paste the list of animal names (https://en.wikipedia.org/wiki/List_of_animal_names), linked from the Lists Of English Words page that cortex and billiebee link to, into Notepad and then go through and delete all the ones I don't want. Right now when I grab stuff from lists like that I just open it in one window and type all the ones I want into another one, but there's got to be an easier way than that.
NoraReed
There is a list of palindromes (http://www.brendenisteaching.com/gen/wordlist/search.php?searchtype=palindrome&term=&maxln=99) and other words on this site (http://www.brendenisteaching.com/gen/wordlist/) that might be easier to work with - the website says it has 30 pages of filtered wordlists.
Little Dawn
How advanced do the words have to be? http://www.enchantedlearning.com/wordlist/ has lists of vocabulary words for the K-12 set organized by topic. But "topic" is a broad term on that page, so one topic might be "Ways to say Big," another might be "Positive words," another "Irregular verbs."
mittens