In this paper we develop multivariate outlier tests based on the high-breakdown Minimum Covariance Determinant estimator. The rules that we propose have good performance under the null hypothesis of no outliers in the data and also appreciable power properties for the purpose of individual outlier detection. This achievement is made possible by two orders of improvement over the currently available methodology. First, we suggest an approximation to the exact distribution of robust distances from which cut-off values can be obtained even in small samples. Our thresholds are accurate, simple to implement and result in more powerful outlier identification rules than those obtained by calibrating the asymptotic distribution of distances. The second power improvement comes from the addition of a new iteration step after one-step reweighting of the estimator. The proposed methodology is motivated by asymptotic distributional results. Its finite sample performance is evaluated through simulations and compared to that of available multivariate outlier tests.
Multivariate Outlier Detection With High-Breakdown Estimators / Cerioli, Andrea. - In: JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION. - ISSN 0162-1459. - 105:(2010), pp. 147-156. [10.1198/jasa.2009.tm09147]
Multivariate Outlier Detection With High-Breakdown Estimators
CERIOLI, Andrea
2010-01-01
Abstract
In this paper we develop multivariate outlier tests based on the high-breakdown Minimum Covariance Determinant estimator. The rules that we propose have good performance under the null hypothesis of no outliers in the data and also appreciable power properties for the purpose of individual outlier detection. This achievement is made possible by two orders of improvement over the currently available methodology. First, we suggest an approximation to the exact distribution of robust distances from which cut-off values can be obtained even in small samples. Our thresholds are accurate, simple to implement and result in more powerful outlier identification rules than those obtained by calibrating the asymptotic distribution of distances. The second power improvement comes from the addition of a new iteration step after one-step reweighting of the estimator. The proposed methodology is motivated by asymptotic distributional results. Its finite sample performance is evaluated through simulations and compared to that of available multivariate outlier tests.File | Dimensione | Formato | |
---|---|---|---|
jasa.2009.pdf
non disponibili
Tipologia:
Documento in Post-print
Licenza:
NON PUBBLICO - Accesso privato/ristretto
Dimensione
453.01 kB
Formato
Adobe PDF
|
453.01 kB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.