Automated Machine Learning: An Introduction to 5 Common AutoML Frameworks (Part 2)

from hpsklearn import any_preprocessing
from hyperopt import tpe
# load dataset
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data.astype(np.float64),
    iris.target.astype(np.float64), train_size=0.75, test_size=0.25, random_state=42)
model = HyperoptEstimator(regressor=any_regressor('reg'), preprocessing=any_preprocessing('pre'), loss_fn=mean_absolute_error, algo=tpe.suggest, max_evals=50, trial_timeout=30)
model.fit(X_train, y_train)
# summarize performance
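# model.score() may report the regressor's default score (R^2) rather than MAE,
# so the MAE is computed explicitly from the predictions
mae = mean_absolute_error(y_test, model.predict(X_test))
print("MAE: %.3f" % mae)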
# summarize the best model
print(model.best_model())
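The search above is driven by mean absolute error (MAE) as its loss function. As a reminder of what that number means, here is a minimal pure-Python sketch of the metric. The function name and the toy values are made up for illustration and are not part of hpsklearn or scikit-learn.

```python
def mean_absolute_error_sketch(y_true, y_pred):
    """Average of the absolute differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy values, made up for illustration only.
example_mae = mean_absolute_error_sketch([0.0, 1.0, 2.0], [0.0, 2.0, 4.0])
print("MAE: %.3f" % example_mae)
```

A lower MAE means the predictions are, on average, closer to the true targets.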
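4. AutoKeras
AutoKeras is a Keras-based AutoML system that brings neural architecture search (NAS) within reach in just a few lines of code. It was developed by the DATA Lab at Texas A&M University and is built on Keras and TensorFlow's tf.keras API.
AutoKeras supports a range of tasks, such as image classification and structured-data classification or regression.
Installation:
pip install autokeras
Sample code: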
import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import mnist
import autokeras as ak
# Load dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape)  # (60000, 28, 28)
print(y_train.shape)  # (60000,)
print(y_train[:3])  # array([5, 0, 4], dtype=uint8)

# Initialize the image classifier.
clf = ak.ImageClassifier(overwrite=True, max_trials=1)
# Feed the image classifier with training data.
clf.fit(x_train, y_train, epochs=10)

# Predict with the best model.
predicted_y = clf.predict(x_test)
print(predicted_y)
# Evaluate the best model with testing data.
print(clf.evaluate(x_test, y_test))
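5. H2O AutoML
H2O's AutoML can be used to automatically train and tune many models within a user-specified time limit.
H2O provides a number of explainability methods that apply to AutoML objects (groups of models) as well as to individual models. Explanations can be generated automatically, and a simple interface is provided for exploring and explaining AutoML models.
Installation:
pip install h2o
More precisely, H2O is a distributed machine learning platform, so an H2O cluster has to be started first. That part is written in Java, so a JDK must be installed.
Once Java is installed and its path has been added to the environment variables, run the following command in cmd:
java -jar path_to/h2o.jar
This starts the H2O cluster, which can then be operated through the web interface. To work from Python code instead, use the following example: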
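To make the idea of "many models within a user-specified time limit" concrete, here is a minimal pure-Python sketch of a budgeted random search that returns a leaderboard. It is not H2O's actual algorithm. The parameter names max_runtime_secs and max_models are borrowed from H2OAutoML, while the function names and the toy objective are assumptions made up for illustration.

```python
import random
import time


def timed_random_search(evaluate, sample_config, max_runtime_secs, max_models, seed=10):
    """Try random configurations until the time or model budget runs out.

    Returns a leaderboard: a list of (score, config) pairs, lowest score first.
    """
    rng = random.Random(seed)
    deadline = time.monotonic() + max_runtime_secs
    leaderboard = []
    while len(leaderboard) < max_models and time.monotonic() < deadline:
        config = sample_config(rng)
        leaderboard.append((evaluate(config), config))
    leaderboard.sort(key=lambda entry: entry[0])
    return leaderboard


# Hypothetical toy problem: the "model" is a single number x and its
# validation error is (x - 3)^2, so the ideal configuration is x = 3.
def toy_sample_config(rng):
    return {"x": rng.uniform(-10.0, 10.0)}


def toy_evaluate(config):
    return (config["x"] - 3.0) ** 2


toy_leaderboard = timed_random_search(
    toy_evaluate, toy_sample_config, max_runtime_secs=1.0, max_models=20
)
best_score, best_config = toy_leaderboard[0]
print(len(toy_leaderboard), best_score, best_config)
```

Whichever budget is exhausted first ends the search, and the first leaderboard entry plays the role of the "leader" model.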
import h2o
h2o.init()
from h2o.automl import H2OAutoML
churn_df = h2o.import_file('https://raw.githubusercontent.com/srivatsan88/YouTubeLI/master/dataset/WA_Fn-UseC_-Telco-Customer-Churn.csv')
churn_df.types
churn_df.describe()
churn_train, churn_test, churn_valid = churn_df.split_frame(ratios=[.7, .15])
churn_train
y = "Churn"
x = churn_df.columns
x.remove(y)
x.remove("customerID")
aml = H2OAutoML(max_models=10, seed=10, exclude_algos=["StackedEnsemble", "DeepLearning"], verbosity="info", nfolds=0)
# !nvidia-smi  # notebook-only shell command (checks for a GPU); uncomment when running in Jupyter
aml.train(x=x, y=y, training_frame=churn_train, validation_frame=churn_valid)

lb = aml.leaderboard
lb.head()
churn_pred=aml.leader.predict(churn_test)
churn_pred.head()
aml.leader.model_performance(churn_test)
model_ids = list(aml.leaderboard['model_id'].as_data_frame().iloc[:, 0])
#se = h2o.get_model([mid for mid in model_ids if "StackedEnsemble_AllModels" in mid][0])
#metalearner = h2o.get_model(se.metalearner()['name'])
model_ids
h2o.get_model([mid for mid in model_ids if "XGBoost" in mid][0])
out = h2o.get_model([mid for mid in model_ids if "XGBoost" in mid][0])
out.params
out.convert_H2OXGBoostParams_2_XGBoostParams()
out
out_gbm = h2o.get_model([mid for mid in model_ids if "GBM" in mid][0])
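The split_frame call in the example passes two ratios (.7 and .15) yet unpacks three frames: the remaining rows (about 15%) go to the last part. The pure-Python sketch below illustrates that idea with a per-row random assignment, which also shows why such a split is approximate rather than exact. It is an illustration under that assumption, not H2O's implementation, and split_rows is a hypothetical helper name.

```python
import random


def split_rows(rows, ratios, seed=None):
    """Randomly assign each row to one of len(ratios) + 1 parts."""
    if any(r <= 0 for r in ratios) or sum(ratios) >= 1:
        raise ValueError("ratios must be positive and sum to less than 1")
    rng = random.Random(seed)
    # Cumulative cut points, e.g. [0.7, 0.15] -> [0.7, 0.85].
    cuts = []
    total = 0.0
    for r in ratios:
        total += r
        cuts.append(total)
    parts = [[] for _ in range(len(ratios) + 1)]
    for row in rows:
        u = rng.random()
        # Count the cut points at or below u; the last part takes the rest.
        index = sum(1 for c in cuts if u >= c)
        parts[index].append(row)
    return parts


train_part, test_part, valid_part = split_rows(list(range(1000)), [0.7, 0.15], seed=10)
print(len(train_part), len(test_part), len(valid_part))
```

The part sizes come out close to 70/15/15 but usually not exactly, because each row is assigned independently.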