Machine Learning Code Notes

Computing the per-class F1 score for multi-class classification

  • API
sklearn.metrics.classification_report(y_true, y_pred, labels=None, target_names=None, sample_weight=None, digits=2, output_dict=False)

Example:

from sklearn.metrics import classification_report
y_true = [0, 0, 1, 2, 2, 2, 0]
y_pred = [0, 1, 0, 2, 2, 1, 0]
target_names = ['dog', 'pig', 'cat']
result = classification_report(y_true, y_pred, target_names=target_names, output_dict=True)
print(result)
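With output_dict=True the report is a nested dict, so the per-class F1 can be read out directly. The following is a minimal sketch on the same toy labels; the variable names per_class_f1 and f1_array are illustrative, and f1_score with average=None is shown as an alternative that returns one score per class.

```python
from sklearn.metrics import classification_report, f1_score

y_true = [0, 0, 1, 2, 2, 2, 0]
y_pred = [0, 1, 0, 2, 2, 1, 0]
target_names = ['dog', 'pig', 'cat']

# The report is keyed by class name when target_names is given
report = classification_report(
    y_true, y_pred, target_names=target_names, output_dict=True)

# Pull the F1 score of each class out of the nested dict
per_class_f1 = {name: report[name]['f1-score'] for name in target_names}
print(per_class_f1)

# Alternative: average=None returns an array with one F1 per class,
# ordered by sorted label value (0, 1, 2)
f1_array = f1_score(y_true, y_pred, average=None)
print(f1_array)
```

Here dog has 2 true positives, 1 false positive and 1 false negative, giving F1 = 2/3; pig has no true positives, giving 0; cat has precision 1 and recall 2/3, giving 0.8.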

K-fold cross-validation in PyTorch

Core code

from sklearn.model_selection import KFold
import torch

# k_folds (number of folds) and dataset (a torch Dataset) are defined elsewhere

# Define the K-fold cross-validator
kfold = KFold(n_splits=k_folds, shuffle=True)

# K-fold cross-validation model evaluation
for fold, (train_ids, test_ids) in enumerate(kfold.split(dataset)):

    # Sample elements randomly from a given list of ids, without replacement
    train_subsampler = torch.utils.data.SubsetRandomSampler(train_ids)
    test_subsampler = torch.utils.data.SubsetRandomSampler(test_ids)

    # Define data loaders for the training and test data in this fold
    trainloader = torch.utils.data.DataLoader(
        dataset, batch_size=10, sampler=train_subsampler)
    testloader = torch.utils.data.DataLoader(
        dataset, batch_size=10, sampler=test_subsampler)
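To see how the loaders fit into a full loop, here is a minimal end-to-end sketch. Everything beyond the core code is an assumption made for illustration: the random toy dataset, the single nn.Linear model, the SGD settings, and the fold_accuracy dict. It splits on an index array rather than on the dataset object, which yields the same kind of index arrays, and it creates a fresh model in every fold so that no weights leak between folds.

```python
import numpy as np
import torch
from torch import nn
from torch.utils.data import DataLoader, SubsetRandomSampler, TensorDataset
from sklearn.model_selection import KFold

torch.manual_seed(0)

# Hypothetical toy data: 100 samples, 4 features, 3 classes
features = torch.randn(100, 4)
labels = torch.randint(0, 3, (100,))
dataset = TensorDataset(features, labels)

k_folds = 5
kfold = KFold(n_splits=k_folds, shuffle=True, random_state=0)
fold_accuracy = {}

for fold, (train_ids, test_ids) in enumerate(kfold.split(np.arange(len(dataset)))):
    # Each sampler draws only from the indices of its own split
    trainloader = DataLoader(dataset, batch_size=10,
                             sampler=SubsetRandomSampler(train_ids.tolist()))
    testloader = DataLoader(dataset, batch_size=10,
                            sampler=SubsetRandomSampler(test_ids.tolist()))

    # Fresh model and optimizer per fold (illustrative choices)
    model = nn.Linear(4, 3)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    criterion = nn.CrossEntropyLoss()

    model.train()
    for epoch in range(2):
        for x, y in trainloader:
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()

    # Evaluate on the held-out fold
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for x, y in testloader:
            pred = model(x).argmax(dim=1)
            correct += (pred == y).sum().item()
            total += y.size(0)
    fold_accuracy[fold] = correct / total

mean_accuracy = sum(fold_accuracy.values()) / k_folds
print(fold_accuracy, mean_accuracy)
```

Since the labels are random, the accuracy should hover around chance; the point is only the structure of the loop.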

Using the weight argument of PyTorch's nn.CrossEntropyLoss()

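  • Most people use 1 / (number of instances of the class); some suggest (count of the most frequent class) / (count of the class itself) instead.
    Core code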
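import torch

weights = [1/1016, 1/12852, 1/12888, 1/3380, 1/296]  # 1 / number of instances of each class
# The weight tensor must be on the same device as the logits; fall back to CPU without a GPU
device = "cuda" if torch.cuda.is_available() else "cpu"
class_weights = torch.tensor(weights, dtype=torch.float32, device=device)

criterion = torch.nn.CrossEntropyLoss(weight=class_weights)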
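The two weighting schemes differ only by a constant factor (the largest class count). With the default reduction='mean', CrossEntropyLoss divides the weighted sum by the sum of the weights of the targets in the batch, so that constant cancels and both schemes give the same loss. The sketch below checks this on random logits; the batch size, the logits, and the targets are made up for illustration.

```python
import torch

torch.manual_seed(0)

# Class counts taken from the weights in the snippet above
class_counts = torch.tensor([1016, 12852, 12888, 3380, 296], dtype=torch.float32)

# Scheme 1: 1 / count
inverse_freq = 1.0 / class_counts
# Scheme 2: count of the most frequent class / count
max_ratio = class_counts.max() / class_counts

# Hypothetical batch: 8 samples, 5 classes
logits = torch.randn(8, 5)
targets = torch.randint(0, 5, (8,))

loss_inverse = torch.nn.CrossEntropyLoss(weight=inverse_freq)(logits, targets)
loss_ratio = torch.nn.CrossEntropyLoss(weight=max_ratio)(logits, targets)

# With reduction='mean' the constant scale factor cancels out
print(loss_inverse.item(), loss_ratio.item())
```

The choice would matter with reduction='sum' or reduction='none', where the loss scales with the weights and so changes the effective learning rate.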

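For BERT models, cased checkpoints are case-sensitive, so do not call lower() on the input text. Uncased checkpoints are case-insensitive: their vocabulary contains only lowercase tokens, so the text must be lowercased with lower().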
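As a small sketch of this rule, the helper below lowercases text only for uncased checkpoints. The function name prepare_text_for_bert is hypothetical, and deciding by whether the checkpoint name contains "uncased" is an assumption about the naming convention (for example bert-base-uncased versus bert-base-cased), not part of any library API.

```python
def prepare_text_for_bert(text: str, model_name: str) -> str:
    """Lowercase the text only when the checkpoint is uncased.

    Hypothetical helper: assumes the checkpoint name contains "uncased"
    for lowercase-only vocabularies. Note that "uncased" itself contains
    "cased", so the check must look for "uncased" specifically.
    """
    if "uncased" in model_name.lower():
        return text.lower()
    return text


print(prepare_text_for_bert("Hello World", "bert-base-uncased"))
print(prepare_text_for_bert("Hello World", "bert-base-cased"))
```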

Computing the Mahalanobis distance

import numpy as np
from scipy.spatial.distance import mahalanobis

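def mahalanobis_distance(p, distr):

    # p: a point, shape (n_features,)
    # distr: samples of a distribution, shape (n_samples, n_features)

    # covariance matrix of the samples (columns are variables)
    cov = np.cov(distr, rowvar=False)
    # scipy's mahalanobis expects the INVERSE of the covariance matrix
    inv_cov = np.linalg.inv(cov)

    # mean of the points in distr
    avg_distri = np.average(distr, axis=0)

    dis = mahalanobis(p, avg_distri, inv_cov)

    return dis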
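
One detail worth checking is the third argument of scipy's mahalanobis: it is the inverse of the covariance matrix, not the covariance matrix itself. The sketch below compares the scipy result with the formula sqrt((p - mean)^T S^-1 (p - mean)) computed by hand; the random sample and the query point are made up for illustration.

```python
import numpy as np
from scipy.spatial.distance import mahalanobis

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples with 3 features, and one query point
distr = rng.normal(size=(200, 3))
p = np.array([1.0, 2.0, 3.0])

mean = distr.mean(axis=0)
cov = np.cov(distr, rowvar=False)
inv_cov = np.linalg.inv(cov)

# scipy takes the inverse covariance matrix as its third argument
d_scipy = mahalanobis(p, mean, inv_cov)

# Same quantity computed directly from the definition
diff = p - mean
d_manual = np.sqrt(diff @ inv_cov @ diff)

# With an identity matrix the distance reduces to the Euclidean distance
d_identity = mahalanobis(p, mean, np.eye(3))

print(d_scipy, d_manual, d_identity)
```

If the covariance matrix is singular or badly conditioned (for example with fewer samples than features), np.linalg.pinv may be a safer choice than np.linalg.inv.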