A meta analysis study of outlier detection methods in classification
Acuna, E.; Rodriguez, C. (2004). A meta analysis study of outlier detection methods in classification, in: Proceedings of the International IPSI 2004 Conference: Symposium on Challenges in Internet and Interdisciplinary Research, Venice, Italy, November 10-15, 2004. pp. 1-25
In: (2004). Proceedings of the International IPSI 2004 Conference: Symposium on Challenges in Internet and Interdisciplinary Research, Venice, Italy, November 10-15, 2004. [S.n.]: [s.l.].
Abstract
An outlier is an observation that deviates so much from other observations as to arouse suspicion that it was generated by a different mechanism (Hawkins, 1980). Outlier detection has many applications, such as data cleaning, fraud detection and network intrusion detection. The existence of outliers can indicate individuals or groups whose behavior is very different from that of most individuals in the dataset. Outliers are frequently removed to improve the accuracy of estimators, but sometimes the presence of an outlier has a meaning whose explanation is lost if the outlier is deleted. In this work we compare outlier detection techniques based on statistical measures, clustering methods and data mining methods. In particular, we compare detection of outliers using robust estimators of the center and the covariance matrix in the Mahalanobis distance, detection of outliers using partitioning around medoids (PAM), and two data mining techniques: Bay's algorithm for distance-based outliers (Bay, 2003) and LOF, a density-based local outlier algorithm (Breunig et al., 2000). Doubtful outliers are decided on by inspecting two visualization techniques for high-dimensional data: the parallel coordinate plot and the survey plot. The comparison is carried out on 15 datasets.