# Deficiency of Ridge Regression in Double Sampling for Regression using Two Auxiliary Variables

## DOI:

https://doi.org/10.46881/ajsn.v2i1.30

## Keywords:

Auxiliary Variable, Double Sampling, Multicollinearity, Ordinary Least Squares (OLS) Estimation, Outliers, Ridge Regression

## Abstract

Double sampling for regression is an advanced sampling and estimation method in statistics. The chain regression estimator of Sahoo et al. (1993) is a case of double sampling for regression in which two auxiliary variables are used. Earlier authors have established that ridge regression performs better than Ordinary Least Squares (OLS) estimation when collinearity is present in a linear regression with more than one independent variable. Because double sampling for regression with more than one auxiliary variable can also face collinearity, the validity of ridge regression for estimating the regression coefficients used in double sampling for regression needs to be verified. This research applied ridge regression estimation to the Sahoo double sampling chain regression estimator when the two auxiliary variables are highly collinear. The empirical results show that using two auxiliary variables is more efficient than using one auxiliary variable in double sampling for regression. They further show that, under multicollinearity, ridge regression not only performed worse than OLS estimation but may also cause a Heywood case (a negative variance, which is an abnormality) in double sampling for regression with two auxiliary variables. Further investigation shows that removing outliers from the data may resolve the collinearity problem, so OLS estimation can still be used instead of ridge regression in double sampling for regression with two auxiliary variables.
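To make the comparison in the abstract concrete, the following is a minimal sketch of how OLS and ridge coefficient estimates can differ when two auxiliary variables are nearly collinear. It is not the estimator or the data used in the study: the simulated values, the variable names `x1`, `x2`, `y`, the helper `ridge_two_predictors`, and the ridge constant `k = 1.0` are all assumptions chosen for illustration, and the predictors are only centred, not standardized.

```python
import random


def center(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def correlation(u, v):
    """Pearson correlation between two equal-length lists."""
    uc, vc = center(u), center(v)
    num = sum(a * b for a, b in zip(uc, vc))
    den = (sum(a * a for a in uc) * sum(b * b for b in vc)) ** 0.5
    return num / den


def ridge_two_predictors(x1, x2, y, k=0.0):
    """Solve (X'X + kI) b = X'y for two centred predictors.

    k = 0 reproduces the OLS slopes; k > 0 is the ridge constant.
    """
    x1c, x2c, yc = center(x1), center(x2), center(y)
    s11 = sum(a * a for a in x1c) + k
    s22 = sum(a * a for a in x2c) + k
    s12 = sum(a * b for a, b in zip(x1c, x2c))
    s1y = sum(a * b for a, b in zip(x1c, yc))
    s2y = sum(a * b for a, b in zip(x2c, yc))
    det = s11 * s22 - s12 * s12  # determinant of the 2x2 system
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


# Hypothetical data: x2 is x1 plus small noise, so the two are highly collinear.
random.seed(0)
n = 50
x1 = [random.gauss(10, 2) for _ in range(n)]
x2 = [a + random.gauss(0, 0.1) for a in x1]
y = [1.0 + 0.5 * a + 0.5 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

ols = ridge_two_predictors(x1, x2, y, k=0.0)
ridge = ridge_two_predictors(x1, x2, y, k=1.0)  # k = 1.0 is an arbitrary choice

print("correlation(x1, x2):", round(correlation(x1, x2), 4))
print("OLS slopes:  ", ols)
print("Ridge slopes:", ridge)
```

With `k > 0` the ridge slopes are shrunk toward zero relative to OLS, which stabilizes them under collinearity at the cost of bias. Whether that trade-off helps or hurts a double sampling regression estimator is the empirical question the abstract addresses.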

## References

Agunbiade, D. A. & Ogunyinka, Peter I. (2013). Effect of Correlation Level on the use of Auxiliary Variable in Double Sampling for Regression Estimation. Open Journal of Statistics, 3, 312-318. http://dx.doi.org/10.4236/ojs.2013.35037.

Chand, L. (1975). Some Ratio-type Estimations Based on Two or more Auxiliary Variables. Unpublished Ph.D. dissertation, Iowa State University, Ames, Iowa.

Cochran, W. G. (1977). Sampling Techniques (3rd Edition). John Wiley and Sons Inc., New York. 428.

Birkes, D. & Dodge, Y. (1993). Alternative Methods of Regression. John Wiley and Sons, Inc., 228.

Dorugade, A. V. & Kashid, D.N. (2010). Alternative Methods for Choosing Ridge Parameter for Regression. Applied Mathematical Sciences. 4(9), 447-456.

El-Dereny, M. & Rashwan, N. J. (2011). Solving Multicollinearity Problem Using Ridge Regression Model. Int. J. Contemp. Math. Sciences, 6(12), 585-600.

Hoerl, A. E. (1962). Application of Ridge Analysis to Regression Problems. Chemical Engineering Progress, 58, 54-59.

Hoerl, A. E. & Kennard, R. W. (1968). On Regression Analysis and Biased Estimation. Technometrics, 10, 422-423.

Hoerl, A. E. & Kennard, R. W. (2000). Ridge Regression: Biased Estimation for Non-orthogonal Problems. Technometrics, 42(1), 80-86.

Khalaf, G. & Shukur, G. (2005). Choosing Ridge Parameter for Regression Problem. Communications in Statistics-Theory and Methods, 34, 1177-1182.

Kiregyera, B. (1980). A Chain Ratio-Type Estimator in Finite Population Double Sampling using Two Auxiliary Variables. Metrika, 27, 217-223.

Kiregyera, B. (1984). Regression Type Estimator using Two Auxiliary Variables and Model of Double Sampling from Finite Populations. Metrika, 31, 215-226.

Lawless, J. F. & Wang, P. (1976). A Simulation Study of Ridge Regression and other Regression Estimators. Communications in Statistics-Theory and Methods, 14, 1589-1604.

Masuo, N. (1988). On the almost Unbiased Ridge Regression Estimation. Communications in Statistics-Simulation, 17, 729-743.

Montgomery, D. C. & Peck, E. A. (1992). Introduction to Linear Regression Analysis. John Wiley and Sons, New York.

Newhouse, W. & Oman (1971). A Comparison of Ridge Estimators. Technometrics, 20, 301-311.

Neyman, J. (1938). Contribution of the Theory of Sampling Human Populations. Journal of the American Statistical Association. 33, 101-116.

Ogunyinka, Peter I. & Sodipo, A. A. (2013). Efficiency of Ratio and Regression Estimators using Double Sampling. Journal of Natural Sciences Research. ISSN 2224-3186 (Paper). 3(7), 201-207.

Okafor, F. C. (2002). Sample Survey Theory with Applications. Afro-Orbis Publication Ltd.

Pagel, M. D. & Lunneborg, C. E. (1985). Empirical Evaluation of Ridge Regression. Psychological Bulletin, 97, 342-355.

Sahoo, J, Sahoo, L. N. & Mohanty, S. (1993). A Regression Approach to Estimation in Two-phase Sampling using Two Auxiliary Variables. Current Science, 65(1), 73-75.

Sana, M. & Eyup, C. (2008). Efficient Choice of Biasing Constant for Ridge Regression. Int. J. Contemp. Math. Sciences, 3, 527-536.

Senapati, S. C. & Sahoo, L. N. (2006). An Alternative Class of Estimation in Double Sampling. Bull. Malays. Math. Sci. Soc. (2), 29(1), 89-94.

Stein, J. (1960). A Critical view of ridge regression. The Statistician, 22, 181-187.

Vago, E. & Kemeny, S. (2006). Logistic Ridge Regression for Clinical Data Analysis (A Case Study). Applied Ecology and Environmental Research, ISSN 1589 1623, 4(2), 171-179.

Vinod, H. D. & Ullah, A. (1981). Recent Advances in Regression Models. Marcel Dekker, New York.

Wethrill (1986). Evaluation of Ordinary Ridge Regression. Bulletin of Mathematical Sciences, 18, 1-35.

Dillon, W. R., Mulani, N. & Kumar, A. (1987). Offending Estimates in Covariance Structure Analysis: Comments on the Causes of and Solutions to Heywood Cases. Psychological Bulletin, 101(1), 126-135.

www.download.com (Accessed between Tuesday, October 15, 2013 and Tuesday, October 22, 2013).