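After training, the model can be converted to a PMML file with the sklearn2pmml library and then deployed.
For example, the following code converts a random forest model to a PMML file:
# Install the sklearn2pmml library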
!pip install sklearn2pmml
# Import the required libraries
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn2pmml import sklearn2pmml, PMMLPipeline
from sklearn2pmml.decoration import ContinuousDomain
from sklearn.feature_selection import SelectKBest
from sklearn_pandas import DataFrameMapper
from sklearn.pipeline import make_pipeline
# Missing-value handling
from sklearn.impute import SimpleImputer
# Numeric feature preprocessing
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import Normalizer
# Text-valued categorical feature encoding
from sklearn.preprocessing import LabelBinarizer
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import OneHotEncoder
# Feature engineering
from sklearn.decomposition import PCA
from sklearn.preprocessing import PolynomialFeatures
# Data loading omitted
# Train/test split omitted
# Build the processing pipeline
RF_pipeline = PMMLPipeline([
    ("pca", PCA(n_components=5)),  # dimensionality reduction
    ("RF_classifier", RandomForestClassifier(max_depth=5)),
])
# Train the model
RF_pipeline.fit(X_train,y_train)
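# Test-set evaluation omitted
# Export the trained model to PMML format and save it
# Running the call below raises an error if Java is missing; the fix is described afterwards
# General form: sklearn2pmml(pipeline, pmml_destination_path)
sklearn2pmml(RF_pipeline, "model_name.pmml", with_repr=True)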
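The listing omits data loading, the train/test split, and test-set evaluation. A minimal sketch of those steps follows, assuming synthetic data from make_classification and a plain scikit-learn Pipeline as a stand-in for PMMLPipeline (which is built from the same list of steps). The names check_pipeline and accuracy, the sample counts, and the random seeds are illustrative assumptions, not part of the original workflow.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the omitted data loading step (assumption).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Omitted train/test split: hold out 25% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Same steps as RF_pipeline; a plain Pipeline keeps this check scikit-learn only.
check_pipeline = Pipeline([
    ("pca", PCA(n_components=5)),
    ("RF_classifier", RandomForestClassifier(max_depth=5, random_state=0)),
])
check_pipeline.fit(X_train, y_train)

# Omitted evaluation step: mean accuracy on the held-out rows.
accuracy = check_pipeline.score(X_test, y_test)
print(f"test accuracy: {accuracy:.3f}")
```

PCA(n_components=5) needs at least five input features, which is why the synthetic data uses eight.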
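Error raised by the sklearn2pmml call: RuntimeError: Java is not installed, or the Java executable is not on system path
Fix: install Java and configure it by following the guide at the link below, then restart.
Note: the environment must be configured so that the Java executable is on the system path.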
https://blog.csdn.net/weixin_37601546/article/details/88623530
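After installing Java, it may help to confirm that the executable is actually visible to Python before retrying the export. The following is a small sketch using only the standard library; find_java, java_version, and the search_path parameter are hypothetical helper names, not part of sklearn2pmml.

```python
import shutil
import subprocess


def find_java(search_path=None):
    """Return the full path of the `java` executable, or None if not found.

    `search_path` is an optional override; when None, the PATH
    environment variable is searched.
    """
    return shutil.which("java", path=search_path)


def java_version(java_path):
    """Return the text printed by `java -version` (it is written to stderr)."""
    result = subprocess.run(
        [java_path, "-version"], capture_output=True, text=True, check=False
    )
    return (result.stderr or result.stdout).strip()


java_path = find_java()
if java_path is None:
    print("java not found on PATH; the sklearn2pmml conversion will likely fail")
else:
    print("java found at", java_path)
    print(java_version(java_path))
```

If this reports that java is not found in a notebook that was already running during installation, restarting the notebook kernel so it picks up the new PATH is usually the next thing to try.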
About the sklearn2pmml function:
sklearn2pmml(
pipeline,
pmml,
user_classpath=[],
with_repr=False,
debug=False,
)
Converts a fitted PMML pipeline object to a PMML file.
Parameters:
pipeline: PMMLPipeline
    The input PMML pipeline object.
pmml: string
    The output PMML file.
user_classpath: list of strings, optional
    The paths to JAR files that provide custom Transformer, Selector and/or Estimator converter classes.
    The JPMML-SkLearn classpath is constructed by appending user JAR files to package JAR files.
with_repr: boolean, optional
    If true, insert the string representation of pipeline into the PMML document.
debug: boolean, optional
    If true, print information about the conversion process.
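Since the output is an XML document, a quick sanity check on the exported file can be done with the standard library. The sketch below assumes only that the file has a PMML root element with a version attribute and DataField entries, as PMML documents do; summarize_pmml is a hypothetical helper, not part of sklearn2pmml.

```python
import xml.etree.ElementTree as ET


def summarize_pmml(path):
    """Return the PMML version and the data field names declared in the file."""
    root = ET.parse(path).getroot()

    # Tags are namespace-qualified, e.g. "{http://www.dmg.org/PMML-4_4}PMML",
    # so compare only the local part after the closing brace.
    def local(tag):
        return tag.rsplit("}", 1)[-1]

    if local(root.tag) != "PMML":
        raise ValueError(f"{path} is not a PMML document (root is {root.tag})")

    fields = [
        elem.get("name")
        for elem in root.iter()
        if local(elem.tag) == "DataField"
    ]
    return {"version": root.get("version"), "data_fields": fields}
```

For example, summarize_pmml("model_name.pmml") could be called after the export succeeds to confirm that the expected input and target fields were written.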