How to calculate error for squared data?

In probability and statistics, is there an analytic method to find the error associated with the calculated probability of the hypothesis that one measured ratio is less than another?

  • The context is this: we want to build a model that predicts outcomes for subjects who receive training. Historically, one group of subjects received training and another did not (T, t). The training was meant to help subjects pass a later test, but it has sometimes been found to have an insubstantial effect, or even a negative effect that makes the subject perform worse on the test. We have data on who was and was not trained, and on who failed or passed (p, P). Our approach is this: we calculate the two probabilities of passing the test, given training and given no training: p(P|T) and p(P|t). Treating each subject's outcome as a Bernoulli random variable, we use that distribution to approximate a standard error for each probability, s(P|T) and s(P|t). Using the means and errors, we then find the probability that p(P|T) is less than p(P|t); call this p[P|T<P|t]. If p[P|T<P|t] is close to 0, we have confidence that the training has a positive effect; if it is close to 1, there is confidence that the training has a negative effect. If p[P|T<P|t] is close to 0.5, we have no confidence of any effect: either the training's effect was insubstantial or we do not have enough data. We can calculate p[P|T<P|t], but is there an analytic way to estimate or formulate the error of p[P|T<P|t], respecting the distributions and methods already chosen?
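The calculation described in the question can be sketched roughly as follows. The question does not say how the means and errors are combined, so this sketch assumes a normal approximation for the difference of the two estimated proportions and assumes the two groups are independent. The function names and the counts in the usage lines are hypothetical, for illustration only.

```python
import math


def bernoulli_se(passed, n):
    """Return the estimated pass probability and its Bernoulli standard error."""
    p = passed / n
    return p, math.sqrt(p * (1.0 - p) / n)


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_trained_lower(pass_T, n_T, pass_t, n_t):
    """Approximate p[P|T < P|t] under a normal approximation (an assumption).

    pass_T, n_T: passes and group size for trained subjects (T).
    pass_t, n_t: passes and group size for untrained subjects (t).
    """
    p_T, s_T = bernoulli_se(pass_T, n_T)
    p_t, s_t = bernoulli_se(pass_t, n_t)
    # Standard error of the difference, assuming independent groups.
    se_diff = math.sqrt(s_T ** 2 + s_t ** 2)
    if se_diff == 0.0:
        raise ValueError("both standard errors are zero; approximation undefined")
    # P(D < 0) where D = p(P|T) - p(P|t) is treated as normally distributed.
    return normal_cdf((p_t - p_T) / se_diff)


# Hypothetical counts: 80 of 100 trained subjects pass, 50 of 100 untrained pass.
print(prob_trained_lower(80, 100, 50, 100))
# Equal pass rates leave the result at 0.5, i.e. no evidence either way.
print(prob_trained_lower(50, 100, 50, 100))
```

A value near 0 corresponds to the "training helped" case in the question, a value near 1 to the "training hurt" case, and a value near 0.5 to the inconclusive case.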

  • Answer:

    It seems the problem is to compare two proportions (the probabilities of two events) and decide whether the difference between them is significant. Depending on the sample sizes, the candidate tests are the two-sample z-test for proportions (n > 30 for both P|T and P|t) and the two-sample t-test for proportions (n < 30). You also need to decide whether to use the pooled or unpooled version (basically, they differ in whether the two samples are assumed to have equal or unequal variance). You can refer to http://en.wikipedia.org/wiki/Statistical_hypothesis_testing for a detailed description; the table there is a good reference.
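As a minimal sketch of the two-sample z-test for proportions mentioned in the answer, the following computes the z statistic and a two-sided p-value in both the pooled and unpooled forms. The function name, the `pooled` flag, and the counts in the usage lines are hypothetical; the formulas are the usual textbook ones and are not taken from the answer itself.

```python
import math


def two_proportion_z(pass_T, n_T, pass_t, n_t, pooled=True):
    """Two-sample z-test for the difference of two proportions.

    Returns (z, two_sided_p_value). Counts are passes and group sizes for
    the trained (T) and untrained (t) groups.
    """
    p_T = pass_T / n_T
    p_t = pass_t / n_t
    if pooled:
        # Pooled: one common proportion is assumed under the null hypothesis.
        p = (pass_T + pass_t) / (n_T + n_t)
        se = math.sqrt(p * (1.0 - p) * (1.0 / n_T + 1.0 / n_t))
    else:
        # Unpooled: each group keeps its own variance estimate.
        se = math.sqrt(p_T * (1.0 - p_T) / n_T + p_t * (1.0 - p_t) / n_t)
    if se == 0.0:
        raise ValueError("standard error is zero; test statistic undefined")
    z = (p_T - p_t) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical counts: 80 of 100 trained subjects pass, 50 of 100 untrained pass.
print(two_proportion_z(80, 100, 50, 100, pooled=True))
print(two_proportion_z(80, 100, 50, 100, pooled=False))
```

The two versions agree when both groups have the same pass rate and drift apart as the observed proportions diverge, which is why the choice between them matters mostly for borderline results.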

Wenwen Tao (陶雯雯) at Quora
