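Sample Excel data
Requirements:
1. Using the Name column as the key, remove duplicate rows.
2. Using the Name column as the key, output the duplicate rows.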
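import pandas as pd

# Load the workbook (the path is specific to the author's machine)
employee = pd.read_excel("D:\\python_pandas\\sample\\demo13\\Students_Duplicates.xlsx")

# Requirement 1: keep the first row for each Name and renumber the index from 0
table = employee.drop_duplicates(subset="Name", ignore_index=True)
print(table.head(30))
print("*" * 40)

# Requirement 2: flag the second and later occurrences of each Name, then print those rows
condition = employee.duplicated(subset="Name")
print(employee[condition])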
Output:
    ID         Name  Test_1  Test_2  Test_3
0    1  Student_001      62      86      83
1    2  Student_002      77      97      78
2    3  Student_003      57      96      46
3    4  Student_004      57      87      80
4    5  Student_005      95      59      87
5    6  Student_006      56      97      61
6    7  Student_007      64      91      67
7    8  Student_008      96      70      48
8    9  Student_009      77      73      48
9   10  Student_010      90      94      67
10  11  Student_011      62      55      63
11  12  Student_012      83      76      81
12  13  Student_013      68      60      90
13  14  Student_014      82      68      98
14  15  Student_015      61      67      91
15  16  Student_016      59      63      46
16  17  Student_017      62      83      93
17  18  Student_018      90      75      80
18  19  Student_019     100      95      55
19  20  Student_020      61      87     100
****************************************
    ID         Name  Test_1  Test_2  Test_3
20  21  Student_001      62      86      83
21  22  Student_002      77      97      78
22  23  Student_003      57      96      46
23  24  Student_004      57      87      80
24  25  Student_005      95      59      87
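
Note that the second block above lists only the later copies (rows 20 to 24), because duplicated() defaults to keep="first" and leaves the first occurrence unflagged. The sketch below is a minimal illustration of that behavior, not the original script: it uses a small made-up DataFrame in place of Students_Duplicates.xlsx (the five rows and the single Test_1 column are assumptions for the example) and adds keep=False, which flags every row whose Name appears more than once.

```python
import pandas as pd

# Hypothetical in-memory data standing in for Students_Duplicates.xlsx
employee = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Name": ["Student_001", "Student_002", "Student_003",
             "Student_001", "Student_002"],
    "Test_1": [62, 77, 57, 62, 77],
})

# Requirement 1: keep the first row for each Name, renumber the index
deduped = employee.drop_duplicates(subset="Name", ignore_index=True)

# Requirement 2 (default keep="first"): only the later copies are flagged
repeats = employee[employee.duplicated(subset="Name")]

# keep=False flags every row whose Name occurs more than once,
# so the first occurrence is listed next to its copies
all_copies = employee[employee.duplicated(subset="Name", keep=False)]

print(deduped)
print("*" * 40)
print(repeats)
print("*" * 40)
print(all_copies.sort_values("Name"))
```

Sorting all_copies by Name places each original next to its duplicate, which makes it easier to check whether the other columns also match before dropping anything.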