Title
A tourist walk approach for internal and external outlier detection
Abstract
Outlier detection is a fundamental task for knowledge discovery in data mining, especially in the Big Data era. It aims to detect data items that deviate from the general pattern of a given data set. In this paper, we present a new outlier detection technique that uses tourist walks starting from each data sample with varying memory sizes. Specifically, a data sample receives a high outlier score if it participates in few tourist walk attractors and a low score if it participates in a large number of attractors. Experimental results on artificial and real data sets show good performance of the proposed method. In comparison to classical outlier detection methods, the proposed one shows the following salient features: (1) It finds outliers by identifying the structure of the input data set instead of considering only physical features, such as distance, similarity or density. (2) It can detect not only external outliers, as classical methods do, but also internal outliers lying among various normal data groups. (3) By varying the memory size, the tourist walks can characterize both local and global structures of the data set. (4) A parallel implementation is convenient because the algorithm consists of a large number of independent walks. (5) The proposed method is deterministic, so a single run is sufficient, in contrast to stochastic techniques, which require many runs. Moreover, in this work, we find, for the first time, that tourist walks can generate complex attractors in various crossing shapes. Such complex attractors reveal data structures in more detail and can consequently improve outlier detection performance.
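The scoring idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the walk rule (move to the nearest sample that is not among the last `mu` visited samples, the current one included), the Euclidean distance, the normalization of the score, and the names `neighbor_order`, `walk_attractor`, and `outlier_scores` are all assumptions made for illustration. The abstract only states that a sample participating in few attractors should score high.

```python
import numpy as np


def neighbor_order(points):
    """For each sample, indices of all samples sorted by Euclidean distance."""
    pts = np.asarray(points, dtype=float)  # expected shape: (n_samples, n_features)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return np.argsort(dist, axis=1, kind="stable")


def walk_attractor(order, start, mu):
    """Run one deterministic walk and return the set of samples in its attractor.

    Assumed rule: from the current sample, move to the nearest sample that is
    not among the last `mu` visited samples. The next move depends only on that
    window, so the walk has entered its cycle once a window repeats.
    """
    path = [start]
    first_seen = {}
    while True:
        state = tuple(path[-mu:])
        if state in first_seen:
            # Samples visited between the two occurrences form the attractor.
            return set(path[first_seen[state]:-1])
        first_seen[state] = len(path) - 1
        forbidden = set(state)
        nxt = next(int(j) for j in order[path[-1]] if int(j) not in forbidden)
        path.append(nxt)


def outlier_scores(points, memories):
    """Score = 1 - (fraction of walks whose attractor contains the sample).

    This normalization is an assumption; the paper's exact score may differ.
    """
    order = neighbor_order(points)
    n = len(order)
    counts = np.zeros(n)
    for mu in memories:
        if not 1 <= mu < n:
            raise ValueError("each memory size must satisfy 1 <= mu < n_samples")
        for start in range(n):  # walks are independent, hence easy to parallelize
            for site in walk_attractor(order, start, mu):
                counts[site] += 1
    return 1.0 - counts / (n * len(memories))


if __name__ == "__main__":
    # Toy 1-D data: four nearby samples and one distant sample (index 4).
    data = [[0.0], [1.0], [2.1], [3.3], [20.0]]
    print(outlier_scores(data, memories=(1, 2, 3)))
```

In this toy run the distant sample never enters any attractor, so it receives the highest score. Because each walk depends only on its start sample and memory size, the two inner loops could be distributed across workers without coordination.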
Year: 2020
DOI: 10.1016/j.neucom.2018.10.113
Venue: Neurocomputing
Keywords: Outlier, Internal outlier, Tourist walk, Memory size, Critical memory size, Attractor, Crossing-attractor
DocType: Journal
Volume: 393
ISSN: 0925-2312
Citations: 0
PageRank: 0.34
References: 0
Authors: 4
Name                  Order  Citations  PageRank
Rafael D. Rodrigues   1      0          0.34
Liang Zhao            2      230        30.46
Qiusheng Zheng        3      3          1.07
Junbao Zhang          4      5          1.76