Hsuan-Tien Lin's Machine Learning Foundations course - implementing PLA in Python

Problem 1:

Q1. Implement a version of PLA by visiting examples in the naive cycle using the order of examples in the data set.
Run the algorithm on the data set.
What is the number of updates before the algorithm halts?

    def pla_1(self, X, Y):
        """
        Count the number of updates PLA makes before it halts.
        :param X: feature matrix
        :param Y: label vector
        :return: number of updates before halting
        """
        # Initialize the weight vector W to zeros, with the same dimension as one row of X.
        W = np.zeros(X.shape[1])

        # PLA with the naive cycle: keep visiting the examples in their given order
        # until a full pass makes no mistake, i.e. the algorithm halts.
        halt = 0  # number of updates before halting
        mistake_found = True
        while mistake_found:
            mistake_found = False
            for i in range(X.shape[0]):  # visit every example in order
                score = np.dot(X[i, :], W)  # inner product of X[i] and W
                if score * Y[i] <= 0:  # misclassified: sign(W·x) disagrees with y
                    W = W + Y[i] * X[i, :]  # update W with the mistaken example
                    halt = halt + 1  # count one update
                    mistake_found = True

        return halt
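
To try this end to end, a minimal driver might look like the sketch below. It assumes the methods above live on a class named `PLA`, that the training file is a whitespace-separated text file whose rows are features followed by a ±1 label (the file name `hw1_15_train.dat` is only illustrative), and that a constant bias feature x0 = 1 is prepended so W can learn a threshold:

    import numpy as np

    def load_data(path):
        data = np.loadtxt(path)               # each row: features followed by a +/-1 label
        X = data[:, :-1]                      # feature columns
        X = np.c_[np.ones(X.shape[0]), X]     # prepend the bias term x0 = 1
        Y = data[:, -1]                       # label column
        return X, Y

    X, Y = load_data('hw1_15_train.dat')      # illustrative file name
    model = PLA()                             # hypothetical class holding pla_1 / pla_2 / pla_3
    print(model.pla_1(X, Y))                  # number of updates before PLA halts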

Problem 2:

Q2. Implement a version of PLA by visiting examples in fixed, pre-determined random cycles throughout the algorithm.
Run the algorithm on the data set. Please repeat your experiment for 2000 times, each with a different random seed.
What is the average number of updates before the algorithm halts?
Plot a histogram ( https://en.wikipedia.org/wiki/Histogram ) to show the number of updates versus frequency.

    def pla_2(self, X, Y):
        """
        Same update rule as pla_1, but the examples are visited in a fixed,
        pre-determined random cycle; the experiment is repeated 2000 times
        and the average number of updates is reported.
        :param X: feature matrix
        :param Y: label vector
        :return: average number of updates and average accuracy
        """
        Iteration = 2000  # number of repeated experiments
        Halts = []        # number of updates in each experiment
        Accuracies = []   # accuracy of each experiment

        for iter in range(Iteration):
            np.random.seed(iter)  # a different random seed for each experiment

            # shuffle the examples into a fixed, pre-determined random cycle
            permutation = np.random.permutation(X.shape[0])  # random order of indices
            X = X[permutation]  # reorder X
            Y = Y[permutation]  # reorder Y the same way

            # same as pla_1: cycle through the shuffled examples, updating W and
            # counting updates, until a full pass makes no mistake
            W = np.zeros(X.shape[1])  # weight initialization
            halt = 0  # number of updates before halting
            mistake_found = True
            while mistake_found:
                mistake_found = False
                for i in range(X.shape[0]):
                    score = np.dot(X[i, :], W)  # inner product of X[i] and W
                    if score * Y[i] <= 0:  # classification error
                        W = W + Y[i] * X[i, :]
                        halt = halt + 1
                        mistake_found = True

            # predicted labels: +1 if the score is positive, -1 otherwise
            Y_pred = np.where(np.dot(X, W) > 0, 1, -1)
            accuracy = np.mean(Y_pred == Y)

            # store the per-experiment statistics
            Halts.append(halt)
            Accuracies.append(accuracy)

        # averages over the 2000 experiments
        halt_mean = np.mean(Halts)
        accuracy_mean = np.mean(Accuracies)

        return halt_mean, accuracy_mean
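
Q2 also asks for a histogram of the number of updates versus frequency. pla_2 as written returns only the two averages, so the sketch below assumes you also keep the per-experiment list `Halts` (for example by returning it as a third value); matplotlib is used here only as one possible plotting choice:

    import matplotlib.pyplot as plt

    # Halts is assumed to be the list of per-experiment update counts collected in pla_2
    plt.hist(Halts, bins=30)          # number of updates vs. frequency over 2000 experiments
    plt.xlabel('number of updates before halting')
    plt.ylabel('frequency')
    plt.title('PLA with pre-determined random cycles (eta = 1)')
    plt.show()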

Problem 3:

Q3. Implement a version of PLA by visiting examples in fixed, pre-determined random cycles throughout the algorithm, while changing the update rule to be:
w_{t+1} ← w_t + η·y_{n(t)}·x_{n(t)}, with η = 0.5. Note that your PLA in the previous problem corresponds to η = 1.
Please repeat your experiment for 2000 times, each with a different random seed. What is the average number of updates before the algorithm halts?
Plot a histogram to show the number of updates versus frequency. Compare your result to the previous problem and briefly discuss your findings.


    def pla_3(self, X, Y):
        """
        Same as pla_2, but with the update rule W <- W + 0.5 * y * x (eta = 0.5).
        :param X: feature matrix
        :param Y: label vector
        :return: average number of updates and average accuracy
        """
        Iteration = 2000  # number of repeated experiments
        Halts = []        # number of updates in each experiment
        Accuracies = []   # accuracy of each experiment

        for iter in range(Iteration):
            np.random.seed(iter)  # a different random seed for each experiment
            permutation = np.random.permutation(X.shape[0])  # random order of indices
            X = X[permutation]  # reorder X
            Y = Y[permutation]  # reorder Y the same way

            # cycle through the shuffled examples until a full pass makes no mistake
            W = np.zeros(X.shape[1])  # weight initialization
            halt = 0  # number of updates before halting
            mistake_found = True
            while mistake_found:
                mistake_found = False
                for i in range(X.shape[0]):
                    score = np.dot(X[i, :], W)  # inner product of X[i] and W
                    if score * Y[i] <= 0:  # classification error
                        W = W + 0.5 * Y[i] * X[i, :]  # update with eta = 0.5
                        halt = halt + 1
                        mistake_found = True

            # predicted labels: +1 if the score is positive, -1 otherwise
            Y_pred = np.where(np.dot(X, W) > 0, 1, -1)
            accuracy = np.mean(Y_pred == Y)

            # store the per-experiment statistics
            Halts.append(halt)
            Accuracies.append(accuracy)

        # averages over the 2000 experiments
        halt_mean = np.mean(Halts)
        accuracy_mean = np.mean(Accuracies)
        return halt_mean, accuracy_mean
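
For the comparison Q3 asks about, one property is worth noting: since W starts at the zero vector, every intermediate W produced with η = 0.5 is exactly half of the corresponding W produced with η = 1, and halving W never changes the sign of the score X·W. The sequence of mistakes, and therefore the number of updates, is identical, so the average from pla_3 should match pla_2; only the final length of W differs. A quick check, reusing the hypothetical PLA class and data loading from the first sketch:

    # With W initialized to zero, scaling each update by eta = 0.5 scales every
    # intermediate W by 0.5, which never flips sign(X·W), so the counts match.
    mean_updates_eta1, acc_eta1 = model.pla_2(X, Y)
    mean_updates_eta05, acc_eta05 = model.pla_3(X, Y)
    print(mean_updates_eta1, mean_updates_eta05)  # expected to be equal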