Trimming principles play an important role in robust statistics. However, their use for clustering typically requires some preliminary information about the contamination rate and the number of groups. We suggest a fresh approach to trimming that does not rely on this knowledge and that proves to be particularly suited for solving problems in robust cluster analysis. Our approach replaces the original K-population (robust) estimation problem with K distinct one-population steps, which take advantage of the good breakdown properties of trimmed estimators when the trimming level exceeds the usual bound of 0.5. In this setting, we prove that exact affine equivariance is lost on one hand but, on the other hand, an arbitrarily high breakdown point can be achieved by “anchoring” the robust estimator. We also support the use of adaptive trimming schemes, in order to infer the contamination rate from the data. A further bonus of our methodology is its ability to provide a reliable choice of the usually unknown number of groups.

Wild adaptive trimming for robust estimation and cluster analysis / Cerioli, Andrea; Farcomeni, Alessio; Riani, Marco. - In: SCANDINAVIAN JOURNAL OF STATISTICS. - ISSN 1467-9469. - 46:(2019), pp. 235-256. [10.1111/sjos.12349]

Wild adaptive trimming for robust estimation and cluster analysis

Cerioli, Andrea; Riani, Marco
2019

Files in this record:
WildTrimming_SJS_V2.pdf (open access)
Type: Post-print document
License: Creative Commons
Size: 840.39 kB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11381/2849287