LinearModelFit regression estimated variance error
Parita
2009-08-05 09:40:59 UTC
Hi

I want to run a linear regression in Mathematica and am using
LinearModelFit for it. The code I am using is below. data1 contains
10 columns: the first 9 columns contain the predictor data for the
regression, and the last column contains the response value.

model = LinearModelFit[data1,
   {a, b, c, d, e, f, g, h, i},
   {a, b, c, d, e, f, g, h, i}];
Print[model["BestFit"]];

However, I am getting the following messages:
FittedModel::varzero: The estimated variance is zero. Properties
requiring division by the variance or standard error will not be
computed.

FittedModel::varnum: The estimated variance -8.76512*10^-32 is not a
positive number. Properties requiring division by the variance or
standard error will not be computed.
FittedModel::badfit: -- Message text not found --

I am getting the coefficients for the regression model, but the standard
errors, p-values and R-squared are indeterminate. Any ideas what these
messages mean and how I can get around them?

Thanks in advance for your help
Darren Glosemeyer
2009-08-06 10:32:36 UTC
The messages indicate that the fitted model goes through all of the data
points. This will give a 0 variance estimate because the sum of squared
errors will be 0, and any quantity that involves division by the
variance or standard deviation will be indeterminate or infinite. In
this case, the reported estimate (-8.76512*10^-32) is zero up to a small
amount of numerical noise. This is possibly an indication that the model
is over-fitting the data or that there are too few data points.

If the model is over-fitting, it may be that fewer basis functions could
be used to get a good fit that is not an (effectively) exact fit. If the
number of data points is equal to or less than the number of basis
functions, more data (or fewer basis functions) would be needed.
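
As a rough illustration of the "too few data points" case, the
following sketch fits 9 predictors plus the default constant term to
only 10 rows. The random matrix and the names data and vars are made up
for the example; they are not the poster's data1.

```mathematica
(* Hypothetical data: 10 rows, 9 predictor columns plus 1 response column *)
SeedRandom[1];
data = RandomReal[{0, 1}, {10, 10}];
vars = {a, b, c, d, e, f, g, h, i};

(* 9 predictors plus the default constant term: 10 parameters for 10 points *)
model = LinearModelFit[data, vars, vars];

(* Residual degrees of freedom: number of points minus number of parameters *)
Length[data] - Length[model["BestFitParameters"]]

(* Largest absolute residual; expected to be at rounding-error level here *)
Max[Abs[model["FitResiduals"]]]
```

With as many parameters as data points, no degrees of freedom are left
to estimate the error variance, so properties that divide by it (standard
errors, p-values) cannot be computed.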

Darren Glosemeyer
Wolfram Research
Bill Rowe
2009-08-06 10:33:43 UTC
Post by Parita
I want to run linear regression in Mathematica and am using
LinearModelFit for the same. Following is the code that I am using.
data1 contains the 10 columns. The first 9 columns contains the data
through which I want to run regression and the last column contains
the response value.
model = LinearModelFit[data1, {a, b, c, d, e, f, g, h, i}, {a, b,
c, d, e, f, g, h, i}];
Print[model["BestFit"]];
<error messages snipped>
Post by Parita
I am getting the coefficients for the regression model but the
standard error, p-values and R-squared are indeterminate. Any ideas
what this error means and how can I go around it?
You have not provided enough information for me to be certain
what is causing the error messages. I would guess the source of
the problem is collinearity between some subsets of the
independent variables and/or a lack of variation in the
dependent variable. I suggest looking for relationships between
the independent variables.

I find that a useful way to quickly look for dependencies among what
are supposed to be independent variables is PairwiseScatterPlot,
which is found in the package StatisticalPlots. Given a data
matrix with n columns, this function produces an n x n array of
plots showing each column plotted against every other column in
the data matrix. Any plot of two of your independent variables
that doesn't look like a scatter diagram is a potential issue
when doing linear regression analysis. If one or more of these
looks like a line, that is telling you those particular columns
are essentially equal predictors for your dependent variable, and
one of them should be omitted before using LinearModelFit.
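
A minimal sketch of that check, assuming data1 is the 10-column numeric
matrix from the question with the response in the last column. The names
predictors and design are made up for the example. The correlation matrix
and rank test are an additional numerical check alongside the plots, not
a replacement for them.

```mathematica
(* Assumes data1 is already defined as in the question *)
predictors = data1[[All, 1 ;; -2]];

(* Scatter plot of every predictor column against every other one *)
Needs["StatisticalPlots`"]
PairwiseScatterPlot[predictors]

(* Pairwise correlations; off-diagonal entries near +1 or -1 flag
   pairs of columns that are close to collinear *)
Correlation[predictors] // MatrixForm

(* Rank of the design matrix, with a column of ones for the constant term.
   A rank below the number of columns means some columns are linear
   combinations of others, which pairwise plots alone can miss. *)
design = Join[ConstantArray[1., {Length[predictors], 1}], predictors, 2];
{MatrixRank[design], Last[Dimensions[design]]}
```

If the two numbers in the last result differ, dropping redundant columns
before calling LinearModelFit is one thing to try.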
pfalloon
2009-08-06 10:34:05 UTC
It's hard to know without seeing the data. Is it possible to reproduce
the problem on a reduced dataset that you could include in the post?

The error message seems to suggest one of the estimated variances is
negative (possibly because it's zero and becomes negative due to
rounding errors?). Have you tried using the function LeastSquares to
see if the problem occurs there?
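
Something along these lines might do as a quick test, assuming data1 is
the matrix from the original post (9 predictor columns followed by the
response). The names x, y, design, coeffs and residuals are just for
illustration.

```mathematica
(* Assumes data1 is already defined as in the question *)
x = data1[[All, 1 ;; -2]];
y = data1[[All, -1]];

(* Prepend a column of ones so the fit includes a constant term,
   as LinearModelFit does by default *)
design = Join[ConstantArray[1., {Length[x], 1}], x, 2];

(* Coefficients: constant first, then one per predictor column *)
coeffs = LeastSquares[design, y];

(* Sum of squared residuals; a value near 0 means an effectively exact fit *)
residuals = y - design . coeffs;
Total[residuals^2]
```

A sum of squares at the level of rounding error would be consistent with
the zero or slightly negative variance estimate in the messages.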

Cheers,
Peter.
