How a dependent's variable non-randomness affects taper equation fitting: research note
Abstract: Applying the least squares method in regression analysis requires that the values of the dependent variable Y be random. As an example, linear and nonlinear taper equations, which estimate the tree diameter dhi at any height hi, were compared. For each tree, the diameter at breast height of 1.3 m (D), the total tree height (H), and the diameters at heights of 0.3 m, 0.8 m and at 2 m intervals above breast height (that is, at 3.3, 5.3, 7.3 ... m) were measured. Two methods were used to fit the equations to the data. In the first method, all diameter measurements were used, so the values of the dependent variable were not random, because obvious autocorrelation exists between diameters measured on the same tree. In the second method, only the last (highest) diameter of each tree was used, making the values of the dependent variable random. The regression results of the two methods were compared using confidence interval estimates for the regression coefficients, multicollinearity tests and Fit Index (FI) values as criteria. The comparison showed that randomness of the dependent variable (second method) did not improve the estimates in any of the regression equations.
Key Words: Regression analysis; Basic concepts; Random variables
Southern African Forestry Journal Issue 202 2004: 67-76
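The two fitting methods described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the data are synthetic, the taper equation is a hypothetical simple linear form d = b0 + b1 * D * (H - h) / (H - 1.3), the helper names are invented, and the Fit Index is taken to be the common definition FI = 1 - SS_residual / SS_total. The sketch covers only the FI criterion, not the confidence interval or multicollinearity comparisons.

```python
import numpy as np


def measurement_heights(H):
    """Heights (m) as in the abstract: 0.3, 0.8, breast height 1.3,
    then every 2 m from 3.3 m upward while still below total height H."""
    return np.concatenate(([0.3, 0.8, 1.3], np.arange(3.3, H, 2.0)))


def simulate_trees(n_trees, rng):
    """Synthetic data (assumption). Columns: tree id, D, H, h, d.
    A shared per-tree error makes diameters on one tree correlated."""
    rows = []
    for tree in range(n_trees):
        D = rng.uniform(15.0, 40.0)   # diameter at breast height (cm)
        H = rng.uniform(12.0, 25.0)   # total tree height (m)
        tree_effect = rng.normal(0.0, 0.5)
        for h in measurement_heights(H):
            d = D * (H - h) / (H - 1.3) + tree_effect + rng.normal(0.0, 0.3)
            rows.append((tree, D, H, h, d))
    return np.array(rows)


def design_matrix(data):
    """Intercept column plus the hypothetical taper predictor."""
    D, H, h = data[:, 1], data[:, 2], data[:, 3]
    x = D * (H - h) / (H - 1.3)
    return np.column_stack((np.ones_like(x), x))


def fit_taper(data):
    """Ordinary least squares estimate of (b0, b1)."""
    coef, *_ = np.linalg.lstsq(design_matrix(data), data[:, 4], rcond=None)
    return coef


def fit_index(data, coef):
    """FI = 1 - SS_residual / SS_total (assumed definition)."""
    d = data[:, 4]
    residual = d - design_matrix(data) @ coef
    return 1.0 - np.sum(residual**2) / np.sum((d - d.mean()) ** 2)


def last_per_tree(data):
    """Keep only the highest measurement of each tree.
    Relies on rows being ordered by tree, then by increasing height."""
    tree_ids = data[:, 0]
    is_last = np.append(tree_ids[1:] != tree_ids[:-1], True)
    return data[is_last]


rng = np.random.default_rng(0)
data_all = simulate_trees(50, rng)      # method 1: all measurements
data_last = last_per_tree(data_all)     # method 2: one measurement per tree

coef_all = fit_taper(data_all)
coef_last = fit_taper(data_last)
fi_all = fit_index(data_all, coef_all)
fi_last = fit_index(data_last, coef_last)

print("method 1 (all diameters):", coef_all, fi_all)
print("method 2 (last diameter):", coef_last, fi_last)
```

The second method discards most observations and restricts the range of the predictor, which is one plausible reason why independence alone need not yield better estimates; the sketch does not attempt to reproduce the paper's results.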