Prepare a finite pool of models M; each model can be seen as a {detector, hyperparameters} pair.
This table provides a comprehensive description of all included models.
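To make the pool construction concrete, a minimal sketch follows, assuming PyOD-style detector classes; the detectors and hyperparameter grids shown are illustrative, not the exact pool from the table:

```python
from itertools import product

from pyod.models.iforest import IForest
from pyod.models.lof import LOF

# Illustrative grids only; the actual pool is described in the table above.
GRIDS = {
    IForest: {"n_estimators": [50, 100, 200], "max_features": [0.5, 1.0]},
    LOF: {"n_neighbors": [5, 10, 20, 50]},
}

def build_model_pool(grids):
    """Enumerate every {detector, hyperparameters} pair in the pool M."""
    pool = []
    for detector_cls, grid in grids.items():
        names = list(grid)
        for values in product(*(grid[n] for n in names)):
            pool.append((detector_cls, dict(zip(names, values))))
    return pool

M = build_model_pool(GRIDS)
# e.g., (IForest, {'n_estimators': 50, 'max_features': 0.5}) is one model.
```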
It is easy to recognize the connection between UOMS and collaborative filtering (CF) under cold-start problems. Simply put, meta-train datasets are akin to existing users in CF that have prior evaluations on a set of models, which in turn are akin to the item set in CF. The test task is akin to a newly arriving user with no prior evaluations (and, in our case, no possible future evaluations either), who nevertheless exhibits some pre-defined features.
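In matrix terms, the analogy can be sketched as follows (all sizes and names below are hypothetical, purely to fix the picture):

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_models, n_feats = 100, 300, 200   # hypothetical sizes

# "Ratings": performance of each model on each meta-train dataset.
P = rng.random((n_train, n_models))
# "User features": meta-features of the meta-train datasets.
F_train = rng.random((n_train, n_feats))

# Cold-start "user": the test dataset has meta-features but no row in P,
# and (unlike standard CF) will never receive one.
f_test = rng.random(n_feats)
```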
Captures statistical properties of the underlying data distributions, e.g., the min, max, variance, skewness, and covariance of the features.
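As a sketch of this family of meta-features (the exact feature list is in the attached code; the selection below is illustrative):

```python
import numpy as np
from scipy import stats

def statistical_meta_features(X):
    """A few distribution-level summaries of X (n_samples x n_features)."""
    feats = [X.min(), X.max(), X.var()]                # global extremes and spread
    feats += [stats.skew(X, axis=0).mean(),            # average per-feature skewness
              stats.kurtosis(X, axis=0).mean()]        # average per-feature kurtosis
    cov = np.cov(X, rowvar=False)                      # feature covariance matrix
    feats += [cov.mean(), cov.std()]                   # summarize covariance structure
    return np.asarray(feats)
```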
In addition to statistical meta-features, we use four OD-specific landmarker algorithms, iForest [31], HBOS [17], LODA [34], and PCA [20] (with reconstruction error as the outlier score), to compute OD-specific landmarker meta-features that capture the outlying characteristics of a dataset. In what follows, we first provide a quick overview of each algorithm and then discuss how we use them to build meta-features. All algorithms are executed with their default parameters; refer to the attached code for the details of meta-feature construction.
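The attached code is authoritative for this step; the following sketch only illustrates how the four landmarkers might be run with their defaults, e.g., via PyOD:

```python
from pyod.models.hbos import HBOS
from pyod.models.iforest import IForest
from pyod.models.loda import LODA
from pyod.models.pca import PCA

def landmarker_scores(X):
    """Outlier scores from the four landmarkers, each with default parameters."""
    landmarkers = {"iforest": IForest(), "hbos": HBOS(),
                   "loda": LODA(), "pca": PCA()}
    scores = {}
    for name, model in landmarkers.items():
        model.fit(X)                            # unsupervised fit on the dataset
        scores[name] = model.decision_scores_   # one outlier score per sample
    return scores
```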
Consider iForest as an example. It creates a set of so-called extremely randomized trees that define the model structure, from which we extract structural features such as the average horizontal and vertical tree imbalance. As another example, LODA builds on random-projection histograms, from which we extract features such as entropy. In addition, based on the list of outlier scores from these models, we compute features such as dispersion, the maximum consecutive gap in the sorted order, etc.
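The score-based landmarker features mentioned above could be computed along these lines (a sketch; the feature names and the exact dispersion measure are illustrative assumptions):

```python
import numpy as np

def score_based_features(scores):
    """Meta-features derived from one landmarker's outlier-score list."""
    s = np.sort(np.asarray(scores, dtype=float))
    gaps = np.diff(s)                                     # consecutive gaps, sorted order
    return {
        "dispersion": s.std() / (abs(s.mean()) + 1e-12),  # coefficient of variation
        "max_consecutive_gap": gaps.max(),
        "mean_consecutive_gap": gaps.mean(),
    }
```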
In contrast, our goal is to rank the models for each dataset row-wise, as model selection is concerned with picking the best possible model to employ. Therefore, we use a rank-based criterion called discounted cumulative gain (DCG) from the information retrieval literature [21]. For a given ranking, DCG is given as
$$\mathrm{DCG}_i = \sum_{j=1}^{m} \frac{P_{i,j}}{\log_2(\hat{r}_{i,j} + 1)},$$
where $P_{i,j}$ is the true performance of model $j$ on dataset $i$ and $\hat{r}_{i,j}$ is the rank of model $j$ among the $m$ models when sorted by predicted performance. To make the rank function differentiable, we replace the hard rank with a sigmoid-based soft approximation:
$$\hat{r}_{i,j} \approx 1 + \sum_{l \neq j} \sigma\big(\hat{P}_{i,l} - \hat{P}_{i,j}\big).$$
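A minimal sketch of this soft-DCG objective follows, written in NumPy for readability; actual training would use an autodiff framework, and the paper's exact smoothing (e.g., any temperature inside the sigmoid) may differ:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def soft_dcg(P_true, P_pred):
    """Differentiable DCG surrogate for one dataset (one row of the matrix).

    P_true: true performances of the m models; P_pred: predicted performances.
    """
    # Soft rank of model j: 1 + sum over l != j of sigmoid(P_pred[l] - P_pred[j]).
    diff = P_pred[None, :] - P_pred[:, None]           # diff[j, l] = P_pred[l] - P_pred[j]
    soft_rank = 1.0 + sigmoid(diff).sum(axis=1) - 0.5  # drop self term, sigmoid(0) = 0.5
    return np.sum(P_true / np.log2(soft_rank + 1.0))
```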
As such, METAOD relies on the assumption that a newly arriving test dataset shares similarity with some meta-train datasets.
We create two testbeds with different degrees of train/test dataset similarity to systematically study the effect of task similarity.
We observe that METAOD is superior to all baseline methods w.r.t. the average rank and mean average precision (MAP), and performs comparably to the Empirical Upper Bound (EUB).
For the ST testbed, METAOD still outperforms all baseline methods w.r.t. average rank and MAP, in spite of the lack of similarity between the meta-train and meta-test datasets.