1. How do you filter data in a list, dict, or set by a condition?
Lists
Method 1: the filter function
In [1]: from random import randint
In [2]: data = [randint(-10,10) for _ in xrange(10)]
In [3]: data
Out[3]: [5, 3, -5, -10, -3, 0, 2, 1, 8, 8]
In [4]: filter( lambda x:x>0,data)
Out[4]: [5, 3, 2, 1, 8, 8]
Method 2: list comprehension
In [5]: [x for x in data if x>0]
Out[5]: [5, 3, 2, 1, 8, 8]
In [6]: timeit filter(lambda x:x>0,data) # compare the running time of the two methods
The slowest run took 11.86 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.62 µs per loop
In [7]: timeit [x for x in data if x>0]
1000000 loops, best of 3: 747 ns per loop
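Tip: the list comprehension is faster in this run, taking about half the time of filter (747 ns versus 1.62 µs). Both are also typically faster than building the list with an explicit for loop, although that case is not timed here.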
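To reproduce the comparison outside IPython, the standard timeit module can be used. The following is a minimal sketch, assuming Python 3 (the session above is Python 2, so range replaces xrange and filter needs list() around it). The function names with_filter, with_comprehension and with_loop are hypothetical, and the timings depend on the machine and interpreter.

```python
import timeit
from random import randint

# Sample data generated the same way as in the session above.
data = [randint(-10, 10) for _ in range(10)]

def with_filter(values):
    # list() is needed on Python 3, where filter returns a lazy iterator.
    return list(filter(lambda x: x > 0, values))

def with_comprehension(values):
    return [x for x in values if x > 0]

def with_loop(values):
    # The explicit for loop mentioned in the tip above.
    result = []
    for x in values:
        if x > 0:
            result.append(x)
    return result

# All three approaches must select the same elements in the same order.
assert with_filter(data) == with_comprehension(data) == with_loop(data)

# Time each approach; absolute numbers vary from run to run.
for func in (with_filter, with_comprehension, with_loop):
    seconds = timeit.timeit(lambda: func(data), number=20000)
    print(func.__name__, seconds)
```

Comparing the three numbers across a few runs gives a more reliable picture than a single measurement.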
Dicts
Method: dict comprehension
In [8]: d = {x:randint(60,100) for x in xrange(1,21)} # build a dict with 20 random entries
In [9]: {k:v for k,v in d.iteritems() if v>90}
Out[9]: {7: 92, 12: 97}
Sets
Method: set comprehension
In [13]: s = set([randint(-10,10) for _ in xrange(10)])
In [20]: s
Out[20]: {-8, -7, -6, -1, 1, 2, 3, 8, 10}
In [21]: {x for x in s if x%3==0}
Out[21]: {-6, 3}
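Because the sessions above use random data, their output changes on every run. Here is a small sketch with fixed, made-up data (the names scores and numbers are illustrative only), assuming Python 3, where items() replaces iteritems():

```python
# Hypothetical fixed data so the results are reproducible.
scores = {"alice": 95, "bob": 72, "carol": 91, "dave": 60}
numbers = {-9, -6, -1, 0, 2, 3, 8}

# Dict comprehension: keep only the entries whose value exceeds 90.
high_scores = {name: score for name, score in scores.items() if score > 90}

# Set comprehension: keep only the multiples of 3 (0 counts as one).
multiples_of_three = {n for n in numbers if n % 3 == 0}

print(high_scores)
print(sorted(multiples_of_three))
```

Sorting the set before printing is only for a stable display; sets themselves have no defined order.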
2. How do you name each element of a tuple to make a program more readable?
Method: use namedtuple
In [28]: from collections import namedtuple
In [30]: Person = namedtuple('Person',['name','age','sex','email'])
In [31]: person = Person('Saders','18','male','123456@mail.com')
In [32]: person.name,person.age,person.sex,person.email
Out[32]: ('Saders', '18', 'male', '123456@mail.com')
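A namedtuple remains an ordinary tuple underneath, which the short sketch below tries to illustrate. It reuses the Person definition from the session; the variable names older and record are made up for the example.

```python
from collections import namedtuple

Person = namedtuple("Person", ["name", "age", "sex", "email"])
person = Person("Saders", "18", "male", "123456@mail.com")

# A namedtuple is still a tuple: indexing and unpacking keep working.
assert person[0] == person.name
name, age, sex, email = person

# Instances are immutable; _replace returns a modified copy.
older = person._replace(age="19")

# _asdict converts the record to a mapping of field name to value.
record = dict(person._asdict())

print(older.age, record["email"])
```

Field access by name replaces magic indexes such as person[1], which is where the readability gain comes from.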
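3. How do you count how often each element occurs in a sequence?
Method 1: a for loop
data = [randint(0,20) for _ in xrange(30)] # generate 30 random numbers
c = dict.fromkeys(data,0) # create a dict with every count initialized to 0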
In [37]: c
Out[37]: {0: 0, 1: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 10: 0, 11: 0, 12: 0, 15: 0, 16: 0, 17: 0, 19: 0, 20: 0}
In [38]: for x in data: # count with a for loop
    ...:     c[x] += 1
In [39]: c
Out[39]: {0: 3, 1: 2, 3: 2, 4: 2, 5: 1, 6: 4, 7: 1, 10: 2, 11: 4, 12: 2, 15: 2, 16: 1, 17: 1, 19: 2, 20: 1}
Method 2: use Counter
In [40]: from collections import Counter
In [41]: c = Counter(data)
In [42]: c
Out[42]: Counter({0: 3, 1: 2, 3: 2, 4: 2, 5: 1, 6: 4, 7: 1, 10: 2, 11: 4, 12: 2, 15: 2, 16: 1, 17: 1, 19: 2, 20: 1})
In [43]: c.most_common(3) # the 3 most frequent elements
Out[43]: [(6, 4), (11, 4), (0, 3)]
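Counter works on any iterable of hashable items, not only numbers. As a rough illustration, the sketch below counts words in a made-up sentence; the text and the variable names are assumptions chosen so that the top entries have no ties.

```python
from collections import Counter

# Hypothetical text; the counts are chosen so the top two entries do not tie.
text = "the cat and the dog and the bird"
word_counts = Counter(text.split())

# The two most frequent words, as (word, count) pairs.
top_two = word_counts.most_common(2)
print(top_two)

# A missing key reads as 0 and is not added to the counter.
print(word_counts["fish"])

# Counters can be updated incrementally with more data.
word_counts.update(["cat", "cat"])
print(word_counts["cat"])
```

Unlike the dict.fromkeys approach, there is no need to initialize the counts first.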
4. How do you sort the items of a dict by their values?
Method 1: combine the zip and sorted functions
In [49]: data={x:randint(60,100) for x in 'abcdef'}
In [51]: sorted(zip(data.values(),data.keys())) # sorted in ascending order by value
Out[51]: [(71, 'e'), (73, 'b'), (75, 'a'), (79, 'c'), (79, 'f'), (90, 'd')]
Method 2: pass the key argument to sorted
In [52]: sorted(data.items(), key=lambda x:x[1])
Out[52]: [('e', 71), ('b', 73), ('a', 75), ('c', 79), ('f', 79), ('d', 90)]
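The two methods differ in how ties are handled: sorting (value, key) tuples breaks ties by key, while the key= form is stable and leaves tied items in the dict's iteration order. The sketch below, using made-up scores, also shows a highest-first ranking with reverse=True; operator.itemgetter(1) is used here as an equivalent of lambda x: x[1].

```python
from operator import itemgetter

# Hypothetical scores; 'c' and 'f' tie on purpose.
scores = {"a": 75, "b": 73, "c": 79, "d": 90, "e": 71, "f": 79}

# itemgetter(1) picks the value out of each (key, value) pair.
ascending = sorted(scores.items(), key=itemgetter(1))

# reverse=True gives a highest-first ranking.
descending = sorted(scores.items(), key=itemgetter(1), reverse=True)

print(ascending)
print(descending)
```

Keeping (key, value) pairs rather than (value, key) pairs also means the result can be passed straight back to dict() or OrderedDict().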
5. How do you quickly find the keys shared by several dicts?
Method 1: a for loop
In [54]: from random import randint,sample
In [55]: r1 = {x:randint(1,4) for x in sample('qwertyu',randint(3,6))} # generate sample data
In [56]: r2 = {x:randint(1,4) for x in sample('qwertyu',randint(3,6))}
In [57]: r3 = {x:randint(1,4) for x in sample('qwertyu',randint(3,6))}
In [58]: r1
Out[58]: {'e': 3, 'q': 1, 'r': 2, 't': 4, 'u': 3, 'y': 2}
In [59]: r2
Out[59]: {'q': 3, 'r': 4, 't': 4, 'y': 4}
In [60]: r3
Out[60]: {'e': 4, 'q': 1, 'r': 1, 'u': 2, 'w': 2, 'y': 2}
In [61]: res = []
In [62]: for k in r1: # collect the keys that appear in all three rounds
    ...:     if k in r2 and k in r3:
    ...:         res.append(k)
In [63]: res
Out[63]: ['q', 'r', 'y']
Method 2: set intersection
In [65]: r1.viewkeys() & r2.viewkeys() & r3.viewkeys()
Out[65]: {'q', 'r', 'y'}
Method 3: the map and reduce functions
In [67]: reduce(lambda a,b:a&b,map(dict.viewkeys,[r1,r2,r3]))
Out[67]: {'q', 'r', 'y'}
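The reduce approach generalizes to any number of dicts. Below is a minimal sketch that runs on Python 3, where viewkeys no longer exists and reduce lives in functools; it builds plain sets from the keys instead. The helper name common_keys is hypothetical.

```python
from functools import reduce

def common_keys(*dicts):
    """Return the set of keys present in every given dict."""
    if not dicts:
        return set()
    # set(d) builds a set of d's keys; & intersects two sets.
    return reduce(lambda a, b: a & b, (set(d) for d in dicts))

# The three dicts shown in the session above.
r1 = {'e': 3, 'q': 1, 'r': 2, 't': 4, 'u': 3, 'y': 2}
r2 = {'q': 3, 'r': 4, 't': 4, 'y': 4}
r3 = {'e': 4, 'q': 1, 'r': 1, 'u': 2, 'w': 2, 'y': 2}

print(sorted(common_keys(r1, r2, r3)))
```

On Python 3, r1.keys() & r2.keys() & r3.keys() works directly as well, since keys() returns a set-like view there.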
6. How do you keep a dict ordered?
Method: use OrderedDict
In [68]: from collections import OrderedDict
In [69]: d = OrderedDict()
In [70]: d['jim'] = (1,20)
In [71]: d['bob'] = (2,30)
In [72]: d['john'] = (3,40)
In [73]: d
Out[73]: OrderedDict([('jim', (1, 20)), ('bob', (2, 30)), ('john', (3, 40))])
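As a possible use of that ordering, the sketch below treats the insertion order as a ranking. The data mirrors the session; the interpretation of each tuple as (rank, time) is an assumption made for illustration.

```python
from collections import OrderedDict

# Hypothetical race results, assumed to be name -> (rank, time),
# inserted in finishing order.
results = OrderedDict()
results['jim'] = (1, 20)
results['bob'] = (2, 30)
results['john'] = (3, 40)

# Iteration follows insertion order, so enumerate reproduces the ranking.
for position, name in enumerate(results, 1):
    print(position, name, results[name])

# popitem(last=False) removes and returns the oldest entry.
first_name, first_value = results.popitem(last=False)
print(first_name, first_value)
```

Since Python 3.7 the built-in dict also preserves insertion order, but OrderedDict still provides extras such as popitem(last=False) and order-sensitive equality.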
7. How do you implement a history feature for a user?
Method: use deque
In [74]: from collections import deque
In [75]: q = deque([],5)
In [77]: q.append(1)
In [78]: q.append(2)
In [79]: q.append(3)
In [78]: q.append(4)
In [79]: q.append(5)
In [83]: q
Out[83]: deque([1, 2, 3, 4, 5])
In [82]: q.append(6)
In [83]: q
Out[83]: deque([2, 3, 4, 5, 6])
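Applied to a history feature, the bounded deque keeps only the most recent entries. A minimal sketch, assuming the history holds a user's last five guesses in a number-guessing game; the function name record and the guess values are made up:

```python
from collections import deque

# Keep only the 5 most recent entries; older ones are dropped automatically.
history = deque(maxlen=5)

def record(entry):
    """Append a hypothetical user action to the bounded history."""
    history.append(entry)

# Seven guesses go in, but only the last five survive.
for guess in [50, 25, 37, 43, 40, 41, 42]:
    record(guess)

print(list(history))
```

Passing maxlen as a keyword is equivalent to the positional form deque([], 5) used in the session.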
8. How do you slice an iterator?
Method: use islice
In [86]: l = range(15) # generate the integers 0 to 14
In [87]: l
Out[87]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
In [88]: i = iter(l) # create an iterator
In [89]: i
Out[89]: <listiterator at 0x426b7b0>
In [91]: from itertools import islice
In [92]: for j in islice(i, 5, 10): # slice the iterator
    ...:     print j
5
6
7
8
9
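Tip: note that islice consumes the iterator i. The call above advanced it past its first 10 elements, so iterating over i afterwards yields only what is left. Create a fresh iterator each time you need a new slice.
In [93]: for k in i:
    ...:     print k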
10
11
12
13
14
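One way to avoid the consumption problem is to build a new iterator for every slice. The sketch below assumes Python 3 and uses a hypothetical helper named take_slice; it also repeats the shared-iterator case for comparison.

```python
from itertools import islice

def take_slice(iterable, start, stop):
    """Slice a fresh iterator over iterable (hypothetical helper)."""
    return list(islice(iter(iterable), start, stop))

numbers = list(range(15))

# Each call builds its own iterator, so the calls do not affect each other.
first = take_slice(numbers, 5, 10)
second = take_slice(numbers, 5, 10)

# Sharing one iterator shows the consumption described in the tip.
shared = iter(numbers)
middle = list(islice(shared, 5, 10))   # consumes items 0 through 9
rest = list(shared)                    # only what is left

print(first, second, middle, rest)
```

The helper only works when the underlying object can be iterated again, as a list can; a generator or file object is used up after one pass regardless.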
9. How do you split a string that contains several different delimiters?
Method 1: use str.split
In [102]: def msplit(s,ds):
     ...:     res = [s]
     ...:     for d in ds: # split every current piece on each delimiter in turn
     ...:         t = []
     ...:         for x in res:
     ...:             t.extend(x.split(d))
     ...:         res = t
     ...:     return res
In [103]: s = 'gfd;sdfew|asdfs,ewfsc\tsdf;asd? asd a'
In [107]: print msplit(s, ',;|\t?')
['gfd', 'sdfew', 'asdfs', 'ewfsc', 'sdf', 'asd', ' asd a']
Method 2: use re.split
In [108]: import re
In [110]: re.split(r'[,;?|\t]+',s)
Out[110]: ['gfd', 'sdfew', 'asdfs', 'ewfsc', 'sdf', 'asd', ' asd a']
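In the result above the last field still contains spaces, because a space is not among the delimiters. If whitespace should also separate fields, one option is to add \s to the character class. This is a variation on the session's pattern rather than part of it, and the variable name fields is made up.

```python
import re

s = 'gfd;sdfew|asdfs,ewfsc\tsdf;asd? asd a'

# Include whitespace (\s) in the character class so spaces also separate
# fields; the + merges runs of delimiters such as '? ' into one split point.
fields = re.split(r'[,;?|\s]+', s)

# A leading or trailing delimiter would produce empty strings, so drop them.
fields = [f for f in fields if f]

print(fields)
```

For a single fixed delimiter, str.split is usually simpler; re.split earns its place once several delimiters or runs of them are involved.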