Latent Dirichlet Allocation (LDA): What does it mean to generate a word from a multinomial distribution conditioned on the topic?
-
In the topic-model literature, I often see statements like "generating a word from a multinomial distribution conditioned on the topic" or "generating a topic from a multinomial distribution conditioned on the document". What does it mean to generate a single variable from a multinomial distribution? To my understanding, you can only generate a vector of variables from a multinomial distribution. If you have only a single variable, the multinomial distribution degenerates into the Bernoulli distribution.
-
Answer:
A categorical distribution defines the probability of each of a set of discrete outcomes. For example, if we have four possible words {w1, w2, w3, w4}, then a categorical distribution tells you the probability of occurrence of each word: p(w1) ≡ Pr{ W = w1 }, where W is a random word. The multinomial distribution corresponds to repeated draws from a categorical distribution.

"To my understanding, you can only generate a vector of variables from a multinomial distribution." You are right, but in this case they are thinking about the degenerate case where you draw a single trial from a categorical distribution.

"If you have only a single variable, the multinomial distribution degenerates into the Bernoulli distribution." You may be confusing things. Let me clarify:
- two outcomes, single trial => Bernoulli
- two outcomes, multiple trials => Binomial (counts of outcomes)
- multiple outcomes, single trial => Categorical
- multiple outcomes, multiple trials => Multinomial
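The degenerate case can be demonstrated directly with NumPy, whose multinomial sampler covers both "single trial" and "multiple trials"; the probability values here are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
p = [0.1, 0.2, 0.3, 0.4]  # categorical probabilities over four words

# Multiple outcomes, single trial => categorical:
# the draw is a one-hot vector selecting exactly one word.
one_hot = rng.multinomial(n=1, pvals=p)
word_index = one_hot.argmax()  # index of the single word that was generated

# Multiple outcomes, multiple trials => multinomial:
# the draw is a vector of counts over the four words.
counts = rng.multinomial(n=100, pvals=p)
```

A single-trial multinomial draw is thus exactly a categorical draw, just encoded as a one-hot vector rather than an index.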
Ivan Savov at Quora
Other answers
I have a detailed blog post: http://saravananthirumuruganathan.wordpress.com/2012/01/10/detecting-mixtures-of-genres-in-movie-dialogues/ describing my experiments with LDA and my understanding of it. In the case of http://en.wikipedia.org/wiki/Maximum_likelihood estimation, given n data points, we assume the underlying distribution that generated the data is a Gaussian and fit the best http://en.wikipedia.org/wiki/Gaussian_function possible to the data. LDA makes a similar assumption: there is a hidden structure to the data, and that hidden structure is a multinomial whose parameter [math]\theta [/math] comes from a http://en.wikipedia.org/wiki/Dirichlet_distribution prior.

Let us say I want to generate a random document; I don't care whether it is meaningful or not. I first fix the number of words I want to generate in that document; alternatively, I can draw that number from, say, a Poisson distribution. Once I have the number of words (N) to be generated, I go ahead and generate that many words from the corpus vocabulary. First, draw a [math]\theta [/math] from a http://en.wikipedia.org/wiki/Dirichlet_distribution. (The Dirichlet is a distribution over the simplex.) Think of [math]\alpha [/math] as the parameter that decides the shape of the Dirichlet, much as the mean and variance decide the shape of the Gaussian bell curve. In a 3-D space, for some choice of [math]\alpha [/math], the probability mass concentrates near (1,0,0), (0,1,0) and (0,0,1); for some other choice of [math]\alpha [/math], all points in the 3-D simplex might get the same probability! This represents what kind of topic mixtures I can generally expect. (If my initial guess is that each document has only one topic, I will mostly choose an [math]\alpha [/math] that puts more probability near points like (1,0,0). This is just a prior, which could be wrong, and in this way it is not strictly analogous to maximum likelihood.)
So I have an [math]\alpha [/math] now, and I draw a sample from the Dirichlet. What I actually get is a vector that sums to 1. I call this [math]\theta [/math]. Remember that I'm trying to generate a random document and I haven't generated a single word yet! The [math]\theta [/math] I have is my guess at the topic vector; I obtained it by sampling from a k-dimensional Dirichlet (k=3 in the above example). Because any draw is guaranteed to lie on the simplex, [math]\theta [/math] can be re-imagined as a probability distribution, so I can use this drawn vector as the weights of a loaded k-faced die. And I throw this die! Let's say it shows 5 (a number between 1 and k). I will now say that the word I'm going to generate belongs to Topic-5.

I have not yet generated a word! To generate a word that belongs to a topic, I need a |V|-faced die, where |V| is the size of the vocabulary of the corpus. How do I get such a huge die?! In a similar way as for the topic vector: I sample again from a Dirichlet, but a different one, over a v-dimensional simplex. Any draw from this Dirichlet gives a v-faced die. Call this the Dirichlet [math]\beta [/math]. For each of the k topics you need a different v-faced die, so I end up drawing k such v-faced dice. For Topic-5, I throw the 5th v-faced die. Let us say it shows 42; I then go to the vocabulary and pick the 42nd word! I repeat this whole process N times (N was the number of words to be generated for the random document).

The crux of this discussion is this: for every document, the dice (i.e., the samples from Dirichlet([math]\alpha [/math]) and Dirichlet([math]\beta [/math])) are generated only once. It is just that to generate each word, the dice are rolled twice: once to get a topic and once to get a word given that topic!
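The generative story above can be sketched in a few lines of NumPy. The topic count k, vocabulary size v, and hyperparameter values below are illustrative assumptions, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative sizes: k topics, a vocabulary of v words.
k, v = 3, 10
vocab = [f"word{i}" for i in range(v)]

alpha = np.full(k, 0.5)  # Dirichlet prior over topic mixtures
beta = np.full(v, 0.1)   # Dirichlet prior over per-topic word distributions

# Make the "dice" once per document:
theta = rng.dirichlet(alpha)       # k-faced die: this document's topic mixture
phi = rng.dirichlet(beta, size=k)  # k different v-faced dice: one per topic

N = rng.poisson(8)  # number of words in the document
doc = []
for _ in range(N):
    z = rng.multinomial(1, theta).argmax()   # roll the k-faced die: pick a topic
    w = rng.multinomial(1, phi[z]).argmax()  # roll topic z's v-faced die: pick a word
    doc.append(vocab[w])
```

Note that `theta` and the rows of `phi` are drawn once, outside the loop, while the single-trial multinomial "rolls" happen twice per word, matching the crux of the explanation above.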
Kripa Chettiar