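Most of the ready-made machine learning models in sklearn come with many parameters to set. With KNN, for example, you need to choose the value of k, the distance metric, and whether to weight neighbors differently depending on how far away they are. Often these can only be picked from experience. In that case you can write a loop that tries every combination and keeps the best-scoring parameter setting.
Taking KNN as an example, the code is as follows: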
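from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# x and y are the feature matrix and labels, assumed to be loaded already
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42, stratify=y)

# Baseline for comparison: a KNN classifier, here assumed to use the default parameters
knn = KNeighborsClassifier()
knn.fit(x_train, y_train)
knn.score(x_test, y_test)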
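
# Track the best score seen so far and the parameters that produced it
best_p = -1
best_score = 0.0
best_k = -1
best_method = ''

for k in range(1, 11):                      # number of neighbors: 1 to 10
    for method in ['uniform', 'distance']:  # equal weights vs. distance-based weights
        for p in range(1, 6):               # Minkowski exponent: 1 = Manhattan, 2 = Euclidean
            knn_clf = KNeighborsClassifier(n_neighbors=k, weights=method, p=p)
            knn_clf.fit(x_train, y_train)
            score = knn_clf.score(x_test, y_test)
            # Keep this combination only if it beats the best score so far
            if score > best_score:
                best_k = k
                best_score = score
                best_p = p
                best_method = method

print(best_p)
print(best_score)
print(best_k)
print(best_method)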
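
One caveat with the loop above is that it picks the parameters by their score on the test set, so the reported best score may be somewhat optimistic. sklearn also ships a built-in helper, GridSearchCV, which runs the same kind of exhaustive search but scores each combination by cross-validation on the training data only. The following is a minimal sketch rather than a definitive recipe. It assumes the iris dataset as a stand-in for x and y (the data used above is not shown), and 5-fold cross-validation is an arbitrary choice.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data (an assumption): iris replaces the x and y used above
x, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=42, stratify=y)

# Same search space as the hand-written loop: 10 * 2 * 5 = 100 combinations
param_grid = {
    'n_neighbors': list(range(1, 11)),
    'weights': ['uniform', 'distance'],
    'p': list(range(1, 6)),
}

# Each combination is scored by 5-fold cross-validation on the training set
grid_search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
grid_search.fit(x_train, y_train)

best_params = grid_search.best_params_          # best parameter combination
cv_score = grid_search.best_score_              # its mean cross-validation accuracy
test_score = grid_search.score(x_test, y_test)  # accuracy on the held-out test set

print(best_params)
print(cv_score)
print(test_score)
```

Because the test set plays no part in the search here, test_score is a fairer estimate of how the chosen parameters perform on unseen data.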