Bayesian Inferencing: How do iterative parameter updates work?
-
I have been struggling with this for a while. A typical optimisation problem can be viewed as minimising a cost function that combines a data term and a penalty term encouraging certain solutions, with a weighting term between the two. In the Bayesian setting this corresponds to the usual likelihood and prior. In the problem I am trying to understand, I model the prior as a multivariate normal with zero mean and precision matrix $\lambda \Lambda$, where $\lambda$ plays the role of the regularisation weight and $\Lambda$ is some appropriate precision-matrix structure that encodes which solutions are plausible.

A typical approach I have seen is an iterative scheme: start with an approximation to $\lambda$, compute the distribution over the other parameters of interest using an approximate scheme such as variational Bayes or expectation propagation, and then use this approximation to update the estimate of $\lambda$ (assuming the prior over $\lambda$ is of conjugate form, usually a Gamma distribution, which also keeps it positive).

My question is this: if I start with a very low value of $\lambda$ as my approximation, the prior term will have hardly any effect. Would this not push the estimated distribution towards less plausible solutions, i.e. assign high probability to essentially unregularised solutions? I am having a lot of trouble understanding how this update scheme can actually find good values of $\lambda$, i.e. the value of $\lambda$ that is optimal with respect to the observed data. What is stopping the inference from driving $\lambda$ down to zero, or close to zero, to prefer the unregularised maximum-likelihood estimate?
-
Answer:
If you fix $\lambda$ at a small value in the model, this corresponds to a non-informative prior, and posterior means under such a prior are close to the MLE. The iterative procedure won't change $\lambda$ in that case, because a fixed hyperparameter is never updated. If you instead specify another level of distribution for $\lambda$ (a hyperprior, typically the conjugate Gamma), the procedure will find a posterior distribution for $\lambda$. That posterior is driven by the data: its rate parameter involves the expected squared magnitude of the estimated parameters, which is finite, so the posterior mean of $\lambda$ settles at a finite, non-zero value rather than collapsing towards zero.
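Here is a minimal sketch of the iterative scheme in the simplest concrete setting: Bayesian linear regression with prior precision $\lambda \Lambda$ (taking $\Lambda = I$ and a known noise precision for simplicity; all data and hyperparameter values are illustrative assumptions, not from the question). Note that even when $\lambda$ starts tiny, the Gamma update $b = b_0 + \tfrac{1}{2}\mathbb{E}[w^\top \Lambda w]$ is bounded because the data only support weights of bounded magnitude, so $\lambda = a/b$ cannot be driven to zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy data: y = X w_true + noise
n, d = 50, 5
X = rng.normal(size=(n, d))
w_true = np.array([2.0, -1.0, 0.0, 0.0, 0.5])
beta = 25.0                          # noise precision, assumed known here
y = X @ w_true + rng.normal(scale=beta ** -0.5, size=n)

# Prior: w ~ N(0, (lam * Lam)^-1); Lam = I for this sketch
Lam = np.eye(d)
a0, b0 = 1e-2, 1e-2                  # broad Gamma(a0, b0) hyperprior on lam

lam = 1e-6                           # deliberately tiny starting value
for _ in range(50):
    # Gaussian posterior over w given the current estimate of lam
    S = np.linalg.inv(lam * Lam + beta * X.T @ X)   # posterior covariance
    m = beta * S @ X.T @ y                          # posterior mean
    # Gamma update for lam uses E[w^T Lam w] = m^T Lam m + tr(Lam S)
    e_quad = m @ Lam @ m + np.trace(Lam @ S)
    a = a0 + 0.5 * d
    b = b0 + 0.5 * e_quad
    lam = a / b                                     # posterior mean of lam

print(lam)  # finite positive value: the update is data-driven, not collapsing
```

The key point is in the `e_quad` term: because the expectation is taken under the posterior over $w$ (mean term plus trace term), the update balances how large the weights actually need to be against the evidence, which is exactly what keeps $\lambda$ away from zero.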
Wei Zou at Quora