
Bayesian Inference: How do iterative parameter updates work?

  • I have been struggling with this for a while. A typical optimisation problem can be viewed as minimising a cost function that combines a data term with a penalty term encouraging certain solutions, usually with a weighting term between the two. In the Bayesian setting this has the usual interpretation in terms of a likelihood and a prior. In the problem I am trying to understand, I model the prior as a multivariate normal with zero mean and precision matrix $\lambda \Lambda$, where $\lambda$ plays the role of the regularisation weight and $\Lambda$ is some appropriate precision structure that encodes the plausible solutions. A typical approach I have seen is an iterative scheme: start with an approximation to $\lambda$, compute the distribution over the other parameters of interest using an approximate scheme such as variational Bayes or Expectation Propagation, and then use that approximation to update the estimate of $\lambda$ (assuming the prior over $\lambda$ is of conjugate form, usually a Gamma distribution, which also keeps it positive). My question is this: if I start with a very low value for $\lambda$, the prior term will hardly have any effect. Would this not push the estimated distribution towards less plausible solutions, i.e. assign high probability to essentially unregularised solutions? I am having trouble understanding how this update scheme can actually find a good value of $\lambda$, i.e. the value that is optimal with respect to the observed data. What is stopping the inference from driving $\lambda$ down to zero, or close to zero, so as to prefer the unregularised maximum-likelihood estimate?

  • Answer:

    If you specify a small fixed value for lambda in the model, this corresponds to a nearly non-informative prior, and the posterior mean under such a prior is close to the MLE. The iterative procedure will not change the value of lambda in that case, because it is treated as a constant. If you instead specify another level of distribution (a hyperprior) for lambda, the procedure will find a posterior distribution for lambda itself (a short sketch of such a scheme follows below).

Wei Zou at Quora
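
As a concrete illustration of the answer above, here is a minimal sketch of a variational-Bayes-style loop for Bayesian linear regression with prior precision $\lambda \Lambda$ on the weights and a Gamma hyperprior on $\lambda$. This is not the poster's exact model: the synthetic data, the known noise precision beta, the identity choice for Lam, and the hyperparameters a0, b0 are all illustrative assumptions made here to keep the example short and self-contained.

    # A minimal sketch: VB-style updates for Bayesian linear regression
    # y = X w + noise, with prior w ~ N(0, (lam * Lam)^-1) and a Gamma(a0, b0)
    # hyperprior on lam. The noise precision beta is treated as known here
    # purely to keep the example short.
    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic data (illustrative assumption)
    n, d = 50, 10
    X = rng.normal(size=(n, d))
    w_true = rng.normal(scale=0.3, size=d)
    beta = 25.0                     # known noise precision (noise std = 0.2)
    y = X @ w_true + rng.normal(scale=1.0 / np.sqrt(beta), size=n)

    Lam = np.eye(d)                 # structure matrix Lambda (identity for simplicity)
    a0, b0 = 1e-3, 1e-3             # broad Gamma hyperprior on lam
    E_lam = 1e-6                    # deliberately tiny initial estimate of lam

    for it in range(50):
        # q(w) given the current E[lam]: Gaussian with covariance S and mean m
        S = np.linalg.inv(beta * X.T @ X + E_lam * Lam)
        m = beta * S @ X.T @ y

        # q(lam) given q(w): Gamma(aN, bN) by conjugacy, using
        # E[w^T Lam w] = m^T Lam m + tr(Lam S)
        aN = a0 + 0.5 * d
        bN = b0 + 0.5 * (m @ Lam @ m + np.trace(Lam @ S))
        E_lam = aN / bN             # feed the posterior mean back into the w-update

    print("E[lambda] after the loop:", E_lam)
    print("distance of posterior mean from w_true:", np.linalg.norm(m - w_true))

Even though E_lam starts near zero, the first w-update simply yields an essentially unregularised posterior, and the lambda-update then responds to how large those weights actually are: bN grows with E[w^T Lam w] while aN grows with the dimension, so lambda settles at a value that balances data fit against the size of the weights rather than collapsing towards zero. In the non-approximate picture this is the marginal likelihood (or the variational free energy) trading off fit against complexity, which is what prevents the inference from simply preferring the unregularised maximum-likelihood solution.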
