Research Reports from the Department of Operations
Document Type
Dissertation
Publication Date
5-21-1986
Abstract
In multiple regression analysis, one occasionally finds that a small subset of the data has a disproportionate influence on the estimated regression coefficients. Techniques for detecting such influential data points combine numerical and graphical methods. Detecting influential data subsets is difficult and time-consuming, especially in large data sets. This thesis gives simple and computationally efficient techniques for detecting influential data points. Its main contributions are: 1) new formulas for detecting influential data that are masked by one or more other data points; 2) efficient computational procedures for BCOVRATIO and BRESRATIO, and reasonable cutoff values for BCOVRATIO; 3) a new method for computing RESRATIO using dummy (indicator) variables; 4) a stopping rule for determining m, the size of a potentially influential data subset.
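For readers unfamiliar with this kind of diagnostic, the following is a minimal sketch of the standard single-case deletion quantities (leverage, externally studentized residual, and the textbook COVRATIO) that the block statistics named in the abstract generalize. It does not reproduce the thesis's BCOVRATIO, BRESRATIO, or RESRATIO formulas, which the abstract does not state. The function names, the simulated data, and the planted unusual case are all assumptions made for illustration. The second function illustrates the well-known mean-shift identity: adding an indicator variable for case i to the regression yields a coefficient whose t-statistic equals the externally studentized residual of that case.

```python
import numpy as np


def influence_diagnostics(X, y):
    """Single-case deletion diagnostics for OLS.

    X must already contain the intercept column. Returns leverages h,
    externally studentized residuals t, and the textbook COVRATIO.
    """
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    # Leverage h_i = x_i' (X'X)^{-1} x_i, the diagonal of the hat matrix.
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta
    s2 = e @ e / (n - p)
    # Residual variance with case i deleted, without refitting.
    s2_del = ((n - p) * s2 - e**2 / (1 - h)) / (n - p - 1)
    t = e / np.sqrt(s2_del * (1 - h))
    # Ratio of determinants of the coefficient covariance matrices
    # with and without case i.
    covratio = (s2_del / s2) ** p / (1 - h)
    return h, t, covratio


def dummy_t_stat(X, y, i):
    """t-statistic of an indicator variable for case i added to the model."""
    n, p = X.shape
    d = np.zeros(n)
    d[i] = 1.0
    Xa = np.column_stack([X, d])
    XtX_inv = np.linalg.inv(Xa.T @ Xa)
    beta = XtX_inv @ Xa.T @ y
    e = y - Xa @ beta
    s2 = e @ e / (n - p - 1)
    return beta[-1] / np.sqrt(s2 * XtX_inv[-1, -1])


# Simulated data (hypothetical): 30 cases, two predictors, one planted
# high-leverage case with a shifted response at index 0.
rng = np.random.default_rng(0)
n = 30
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(scale=0.5, size=n)
x1[0] = 6.0
y[0] += 8.0
X = np.column_stack([np.ones(n), x1, x2])

h, t, covratio = influence_diagnostics(X, y)
worst = int(np.argmax(np.abs(t)))
print("largest |t| at case", worst)
print("indicator-variable t for that case:", dummy_t_stat(X, y, worst))
```

Single-case statistics like these can miss a point whose influence is hidden by a neighbouring point, which is the masking problem that motivates diagnostics for subsets of size m.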
Keywords
Operations research, Regression analysis, Linear models (Statistics), Statistical methods
Publication Title
Dissertation/Technical Memorandums from the Department of Operations, School of Management, Case Western Reserve University
Issue
Technical memorandum no. 572 ; Submitted in partial fulfillment of the requirements for the Degree of Doctor of Philosophy.
Rights
This work is in the public domain and may be freely downloaded for personal or academic use.
Recommended Citation
Bao, Chiao-Pin, "New Tools for Detecting Influential Data in Multiple Linear Regression" (1986). Research Reports from the Department of Operations. 336.
https://commons.case.edu/wsom-ops-reports/336