Prepare a finite pool of models M; each model can be seen as a {detector, hyperparameters} pair.
This table provides a comprehensive description of all included models.
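To make the pool construction concrete, a minimal sketch follows, assuming PyOD-style detector classes; the detectors and hyperparameter grids shown are illustrative, not the exact pool from the table:

```python
from itertools import product

from pyod.models.iforest import IForest
from pyod.models.lof import LOF

# Illustrative grids only; the actual pool is described in the table above.
GRIDS = {
    IForest: {"n_estimators": [50, 100, 200], "max_features": [0.5, 1.0]},
    LOF: {"n_neighbors": [5, 10, 20, 50]},
}

def build_model_pool(grids):
    """Enumerate every {detector, hyperparameters} pair in the pool M."""
    pool = []
    for detector_cls, grid in grids.items():
        names = list(grid)
        for values in product(*(grid[n] for n in names)):
            pool.append((detector_cls, dict(zip(names, values))))
    return pool

M = build_model_pool(GRIDS)
# e.g., (IForest, {'n_estimators': 50, 'max_features': 0.5}) is one model.
```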
It is easy to recognize the connection between UOMS and collaborative filtering (CF) under cold-start problems. Simply put, meta-train datasets are akin to existing users in CF that have prior evaluations on a set of models, which in turn are akin to the item set in CF. The test task is akin to a newly arriving user with no prior evaluations (and, in our case, no possible future evaluations either), who nevertheless exhibits some pre-defined features.
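In matrix terms, the analogy can be sketched as follows (all sizes and names below are hypothetical, purely to fix the picture):

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_models, n_feats = 100, 300, 200   # hypothetical sizes

# "Ratings": performance of each model on each meta-train dataset.
P = rng.random((n_train, n_models))
# "User features": meta-features of the meta-train datasets.
F_train = rng.random((n_train, n_feats))

# Cold-start "user": the test dataset has meta-features but no row in P,
# and (unlike standard CF) will never receive one.
f_test = rng.random(n_feats)
```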
Captures statistical properties of the underlying data distributions, e.g., the min, max, variance, skewness, and covariance of the features.
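As a sketch of this family of meta-features (the exact feature list is in the attached code; the selection below is illustrative):

```python
import numpy as np
from scipy import stats

def statistical_meta_features(X):
    """A few distribution-level summaries of X (n_samples x n_features)."""
    feats = [X.min(), X.max(), X.var()]                # global extremes and spread
    feats += [stats.skew(X, axis=0).mean(),            # average per-feature skewness
              stats.kurtosis(X, axis=0).mean()]        # average per-feature kurtosis
    cov = np.cov(X, rowvar=False)                      # feature covariance matrix
    feats += [cov.mean(), cov.std()]                   # summarize covariance structure
    return np.asarray(feats)
```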
In addition to statistical meta-features, we use four OD-specific landmarker algorithms, iForest [31], HBOS [17], LODA [34], and PCA [20] (with reconstruction error as the outlier score), to compute OD-specific landmarker meta-features that capture the outlying characteristics of a dataset. In what follows, we first provide a quick overview of each algorithm and then discuss how we use them to build meta-features. All algorithms are executed with their default parameters; refer to the attached code for the details of meta-feature construction.
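The attached code is authoritative for this step; the following sketch only illustrates how the four landmarkers might be run with their defaults, e.g., via PyOD:

```python
from pyod.models.hbos import HBOS
from pyod.models.iforest import IForest
from pyod.models.loda import LODA
from pyod.models.pca import PCA

def landmarker_scores(X):
    """Outlier scores from the four landmarkers, each with default parameters."""
    landmarkers = {"iforest": IForest(), "hbos": HBOS(),
                   "loda": LODA(), "pca": PCA()}
    scores = {}
    for name, model in landmarkers.items():
        model.fit(X)                            # unsupervised fit on the dataset
        scores[name] = model.decision_scores_   # one outlier score per sample
    return scores
```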
Consider iForest as an example. It creates a set of so-called extremely randomized trees that define the model structure, from which we extract structural features such as the average horizontal and vertical tree imbalance. As another example, LODA builds on random-projection histograms, from which we extract features such as entropy. In addition, based on the list of outlier scores from these models, we compute features such as dispersion, the maximum consecutive gap in the sorted order, etc.
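The score-based landmarker features mentioned above could be computed along these lines (a sketch; the feature names and the exact dispersion measure are illustrative assumptions):

```python
import numpy as np

def score_based_features(scores):
    """Meta-features derived from one landmarker's outlier-score list."""
    s = np.sort(np.asarray(scores, dtype=float))
    gaps = np.diff(s)                                     # consecutive gaps, sorted order
    return {
        "dispersion": s.std() / (abs(s.mean()) + 1e-12),  # coefficient of variation
        "max_consecutive_gap": gaps.max(),
        "mean_consecutive_gap": gaps.mean(),
    }
```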
In contrast, our goal is to rank the models for each dataset row-wise, as model selection is concerned with picking the best possible model to employ. Therefore, we use a rank-based criterion called discounted cumulative gain (DCG) from the information retrieval literature [21]. For a given ranking, DCG is given as
$$\mathrm{DCG}_i = \sum_{j=1}^{m} \frac{P_{i,j}}{\log_2(\hat{r}_{i,j} + 1)},$$
where $P_{i,j}$ is the true performance of model $j$ on dataset $i$ and $\hat{r}_{i,j}$ is the rank of model $j$ among the $m$ models when sorted by predicted performance. To make the rank function differentiable, we replace the hard rank with a sigmoid-based soft approximation:
$$\hat{r}_{i,j} \approx 1 + \sum_{l \neq j} \sigma\big(\hat{P}_{i,l} - \hat{P}_{i,j}\big).$$
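A minimal sketch of this soft-DCG objective follows, written in NumPy for readability; actual training would use an autodiff framework, and the paper's exact smoothing (e.g., any temperature inside the sigmoid) may differ:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def soft_dcg(P_true, P_pred):
    """Differentiable DCG surrogate for one dataset (one row of the matrix).

    P_true: true performances of the m models; P_pred: predicted performances.
    """
    # Soft rank of model j: 1 + sum over l != j of sigmoid(P_pred[l] - P_pred[j]).
    diff = P_pred[None, :] - P_pred[:, None]           # diff[j, l] = P_pred[l] - P_pred[j]
    soft_rank = 1.0 + sigmoid(diff).sum(axis=1) - 0.5  # drop self term, sigmoid(0) = 0.5
    return np.sum(P_true / np.log2(soft_rank + 1.0))
```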
As such, METAOD relies on the assumption that a newly arriving test dataset shares similarity with some meta-train datasets.
We create two testbeds with different degrees of train/test dataset similarity to systematically study the effect of task similarity.
We observe that METAOD is superior to all baseline methods w.r.t. the average rank and mean average precision (MAP), and performs comparably to the Empirical Upper Bound (EUB).
For the ST testbed, METAOD still outperforms all baseline methods w.r.t. average rank and MAP, in spite of the lack of similarity between the meta-train and meta-test datasets.