Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?
http://www.jmlr.org/papers/volume15/delgado14a/delgado14a.pdf
AutoML (Automated Machine Learning)
https://github.com/automl
https://github.com/rhiever/tpot
Data Science Machine
Reference papers:
auto-sklearn: "Efficient and Robust Automated Machine Learning", mainly a framework introduction (a minimal usage sketch follows this list). GitHub: https://github.com/automl/auto-sklearn
Auto-WEKA: "Auto-WEKA: Combined Selection and Hyperparameter Optimization of Classification Algorithms", mainly an introduction to the algorithms: SMAC, TPE, SMBO, etc.
BOA: "The Bayesian Optimization Algorithm"
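As a concrete reference point, here is a minimal, hypothetical usage sketch of the auto-sklearn API; the dataset and the time budgets are illustrative, not recommendations.

    # Minimal auto-sklearn sketch: fit an automated classifier on a toy dataset.
    # Assumes auto-sklearn is installed; time budgets are illustrative only.
    from sklearn.datasets import load_digits
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split
    import autosklearn.classification

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

    automl = autosklearn.classification.AutoSklearnClassifier(
        time_left_for_this_task=300,  # total optimization budget (seconds)
        per_run_time_limit=30,        # cap on any single model evaluation
    )
    automl.fit(X_train, y_train)      # meta-learning warmstart + SMAC + ensembling
    print(accuracy_score(y_test, automl.predict(X_test)))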
It accomplishes two main things, which together form a CASH problem (Combined Algorithm Selection and Hyperparameter Optimization; formalized after the list below):
Model/Algorithm Selection: choosing the model/algorithm. It is well known that ensembles often outperform individual models.
Hyperparameter Optimization: optimizing the model's hyperparameters.
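Following the formulation in the auto-sklearn paper (notation paraphrased), given algorithms $\mathcal{A} = \{A^{(1)}, \ldots, A^{(R)}\}$ with hyperparameter spaces $\Lambda^{(1)}, \ldots, \Lambda^{(R)}$ and $K$ cross-validation folds, CASH jointly picks the algorithm and its hyperparameters that minimize average validation loss:

$$A^{*}_{\lambda^{*}} \in \operatorname*{argmin}_{A^{(j)} \in \mathcal{A},\, \lambda \in \Lambda^{(j)}} \frac{1}{K} \sum_{i=1}^{K} \mathcal{L}\bigl(A^{(j)}_{\lambda},\, D^{(i)}_{\mathrm{train}},\, D^{(i)}_{\mathrm{valid}}\bigr)$$

where $\mathcal{L}$ is the loss of algorithm $A^{(j)}$ with hyperparameters $\lambda$, trained on the fold's training split and evaluated on its validation split.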
Overall process:
Meta-learning warmstart: meta-learning for finding good instantiations of machine learning frameworks. The warmstart is followed by SMBO-style hyperparameter optimization and model ensembling, both described below.
Algorithms used for hyperparameter optimization (from "Algorithms for Hyper-Parameter Optimization"):
TPE: Tree-structured Parzen Estimator.
SMAC: random-forest-based; see "Sequential Model-Based Optimization for General Algorithm Configuration".
SMBO: Sequential Model-Based Optimization, the general Bayesian-optimization framework of which TPE and SMAC are instances.
A tuner is used to select the parameter values: its inputs are the different hyperparameters, and the loss function is accuracy (or a similar metric). Starting from a handful of randomly sampled configurations, the tuner greedily searches for better ones by fitting a surrogate model to the observed results; see the sketch below.
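A minimal, hypothetical SMBO loop in this spirit, using a random-forest surrogate as in SMAC; the toy objective and the simple lower-confidence-bound acquisition stand in for a real model-training loss and the acquisition functions used in practice.

    # Hypothetical SMBO skeleton: fit a surrogate to (config, loss) history,
    # then greedily pick the next configuration the surrogate favors.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)

    def objective(cfg):
        # Toy loss with a minimum near (0.3, 0.7); stands in for "train a
        # model with these hyperparameters and return validation error".
        return (cfg[0] - 0.3) ** 2 + (cfg[1] - 0.7) ** 2

    # 1. Warm start with a few random configurations.
    X = rng.uniform(0, 1, size=(5, 2))
    y = np.array([objective(c) for c in X])

    for _ in range(20):
        # 2. Fit the random-forest surrogate to all observations so far.
        surrogate = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
        # 3. Score many random candidates; per-tree spread gives uncertainty.
        cand = rng.uniform(0, 1, size=(500, 2))
        preds = np.stack([t.predict(cand) for t in surrogate.estimators_])
        mean, std = preds.mean(axis=0), preds.std(axis=0)
        # 4. Greedy acquisition: lowest predicted loss minus exploration bonus.
        best = cand[np.argmin(mean - std)]
        # 5. Evaluate the chosen configuration and grow the history.
        X = np.vstack([X, best])
        y = np.append(y, objective(best))

    print("best config:", X[np.argmin(y)], "loss:", y.min())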
Meta-Learning
Basic idea: domain experts derive knowledge from previous tasks; they learn about the performance of machine learning algorithms. The area of meta-learning mimics this strategy by reasoning about the performance of learning algorithms across datasets. Here, we apply meta-learning to select instantiations of our given machine learning framework that are likely to perform well on a new dataset. More specifically, for a large number of datasets, we collect both performance data and a set of meta-features, i.e., characteristics of the dataset that can be computed efficiently and that help to determine which algorithm to use on a new dataset.
Meta-features (cheap, externally observable characteristics of a dataset) are used to measure similarity between datasets; similar datasets can adopt similar hyperparameters. See the sketch below.
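A hypothetical sketch of this warmstart idea: describe each dataset by a few cheap meta-features, then reuse the best-known configurations of the most similar previously seen datasets. The function names and the tiny feature set are illustrative; auto-sklearn uses a much larger, normalized meta-feature set.

    # Hypothetical meta-learning warmstart: nearest datasets by meta-features.
    import numpy as np

    def meta_features(X, y):
        # A few simple dataset characteristics computable in one pass.
        return np.array([
            np.log(X.shape[0]),        # log number of instances
            np.log(X.shape[1]),        # log number of features
            len(np.unique(y)),         # number of classes
            X.shape[1] / X.shape[0],   # dimensionality ratio
        ])

    def warmstart_configs(new_mf, history, k=3):
        # history: list of (meta_feature_vector, best_known_config) pairs.
        # Return the configs of the k nearest datasets under L1 distance,
        # to seed (warmstart) the hyperparameter optimizer.
        dists = [np.abs(mf - new_mf).sum() for mf, _ in history]
        return [history[i][1] for i in np.argsort(dists)[:k]]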
Model Ensemble
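In auto-sklearn, the models evaluated during optimization are not discarded: they are post-processed with greedy ensemble selection (Caruana et al., 2004), which repeatedly adds, with replacement, the model whose inclusion most improves the ensemble's validation loss. A hypothetical minimal sketch:

    # Hypothetical greedy ensemble selection (Caruana et al., 2004).
    import numpy as np

    def ensemble_selection(preds, y_true, loss, n_rounds=50):
        # preds: one validation-prediction array per candidate model.
        # Returns the multiset of chosen model indices; the final ensemble
        # averages the predictions of the chosen models.
        chosen, current = [], np.zeros_like(preds[0], dtype=float)
        for _ in range(n_rounds):
            scores = [loss((current + p) / (len(chosen) + 1), y_true)
                      for p in preds]
            best = int(np.argmin(scores))
            chosen.append(best)
            current += preds[best]
        return chosen

Here loss could be, for instance, mean squared error on predicted class probabilities; because selection is with replacement, models chosen more often receive proportionally higher weight in the averaged ensemble.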