How to calculate error for squared data?

What can we say about the variance of the error term in a standard linear regression?

  • Hi. Let's say you have a data set where the observations are scattered in a way that makes it very visible they adhere to two different regimes. That is, the first X observations can be fitted very well with one linear regression, and the remaining observations very well with another. The variance of the error term for each would be very small, but if you regressed the entire sample you would get one line that doesn't really fit the observations, and the variance of the error term would be very large. Now, suppose you calculate a statistic where the numerator is the product of the error-term variances from the separate regressions and the denominator is the error-term variance from the single regression. (I've seen this done.) Then, as it becomes clearer that there should be two separate lines, the variance of the error term in the single regression would become greater. That's all fine and dandy, but... I saw a paper where it was argued that this would lead the statistic value to become *larger*. The only way that makes sense is if the variances are fractions, but if I remember basic econometrics correctly, there's no reason they should be. Help?

  • Answer:

    Piecewise linear regression. Look it up.

Antony Morton at Quora
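The two-regime setup described in the question can be sketched numerically. This is a minimal illustration, not code from either post: the breakpoint, slopes, sample sizes, and noise level are all invented. It fits each regime separately, then fits the pooled sample, and compares residual variances.

```python
import numpy as np

rng = np.random.default_rng(0)

def resid_var(x, y):
    """Residual variance from an OLS fit of y on [1, x]."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r / len(y)

# Regime 1: slope +2; regime 2: slope -2 (two clearly different lines)
x1 = np.linspace(0, 5, 50)
x2 = np.linspace(5, 10, 50)
y1 = 1 + 2 * x1 + rng.normal(0, 0.1, 50)
y2 = 21 - 2 * x2 + rng.normal(0, 0.1, 50)

v1 = resid_var(x1, y1)   # small: one line fits regime 1 well
v2 = resid_var(x2, y2)   # small: one line fits regime 2 well
v_pooled = resid_var(np.concatenate([x1, x2]),
                     np.concatenate([y1, y2]))  # large: one line misfits both

print(v1, v2, v_pooled)
```

With a clear break like this, the pooled residual variance dwarfs the per-regime ones, which is exactly the situation the question describes.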


Other answers

At its most basic, we can consider the case with two regression lines as a special case of the single regression that contains extra additive or multiplicative terms. It's like putting in a dummy variable that changes the slope and/or the intercept. Now that they're on the same playing field, we're just asking whether adding an additional variable like that would increase or decrease the regression variance. The answer is pretty straightforward: adding a new variable will NEVER increase the regression variance. It is ALWAYS monotone non-increasing, because a new variable cannot make the model explain less than what was already explained without it. So the regression variance stays the same or falls.

I can understand the motivation for the statistic you mention, but it seems pretty redundant to me. Such a statistic would be monotone decreasing, as you have reasoned, but what new information does it contain that you can't get in a simpler, more intuitive way?

There is a small flaw in your logic, though. The single-regression variance in the denominator would be constant; it's the numerator terms that change. And strictly speaking, it would be better to use the standard deviations in the numerator, since then, if the data did not support two regression lines, the statistic would approach 1 from below.

Do you have a link to the paper? I'd like to see why the author would put forward the notion you mention. Maybe it's logically consistent but worded in a confusing way.

Nigel Clay
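The nesting argument in this answer can be checked directly. The sketch below is illustrative only (made-up data and breakpoint): it fits a single line, then the same regression augmented with a regime dummy and its interaction with x (equivalent to two separate lines), and compares residual sums of squares. The larger model's SSR can never exceed the nested model's.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x = np.linspace(0, 10, n)
# Two regimes with a break at x = 5, plus noise (invented for illustration)
y = np.where(x < 5, 1 + 2 * x, 21 - 2 * x) + rng.normal(0, 0.5, n)

def ssr(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

d = (x >= 5).astype(float)                           # regime dummy
X_small = np.column_stack([np.ones(n), x])           # single line
X_big = np.column_stack([np.ones(n), x, d, d * x])   # dummy shifts intercept & slope

ssr_small = ssr(X_small, y)
ssr_big = ssr(X_big, y)
print(ssr_small, ssr_big)
```

Because the small model's column space is contained in the big model's, the fitted residuals can only shrink (or stay the same), which is the monotonicity claim above.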
