我有一个元组列表,我只想返回第二列数据,只返回唯一值
mytuple = [('Andrew','Andrew@gmail.com','20'),('Jim',"Jim@gmail.com",'12'),("Sarah","Sarah@gmail.com",'43'),("Jim","Jim@gmail.com",'15'),("Andrew","Andrew@gmail.com",'56')]
期望的输出:
['Andrew@gmail.com','Jim@gmail.com','Sarah@gmail.com']
我的想法是遍历列表并将第二列中的项添加到新列表中,然后使用以下代码.在我沿着这条路走得太远之前,我知道有更好的方法可以做到这一点.
from collections import Counter
cnt = Counter(mytuple_new)
unique_mytuple_new = [k for k, v in cnt.iteritems() if v > 1]
解决方法:
你可以使用zip功能:
>>> set(zip(*mytuple)[1])
set(['Sarah@gmail.com', 'Jim@gmail.com', 'Andrew@gmail.com'])
或者作为一种性能较低的方法,您可以使用map和operator.itemgetter并使用set来获取唯一的元组:
>>> from operator import itemgetter
>>> tuple(set(map(lambda x:itemgetter(1)(x),mytuple)))
('Sarah@gmail.com', 'Jim@gmail.com', 'Andrew@gmail.com')
一些答案的基准:
我的答案 :
s = """\
mytuple = [('Andrew','Andrew@gmail.com','20'),('Jim',"Jim@gmail.com",'12'),("Sarah","Sarah@gmail.com",'43'),("Jim","Jim@gmail.com",'15'),("Andrew","Andrew@gmail.com",'56')]
set(zip(*mytuple)[1])
"""
print timeit.timeit(stmt=s, number=100000)
0.0740020275116
icodez答案:
s = """\
mytuple = [('Andrew','Andrew@gmail.com','20'),('Jim',"Jim@gmail.com",'12'),("Sarah","Sarah@gmail.com",'43'),("Jim","Jim@gmail.com",'15'),("Andrew","Andrew@gmail.com",'56')]
seen = set()
[x[1] for x in mytuple if x[1] not in seen and not seen.add(x[1])]
"""
print timeit.timeit(stmt=s, number=100000)
0.0938332080841
哈桑的回答是:
s = """\
mytuple = [('Andrew','Andrew@gmail.com','20'),('Jim',"Jim@gmail.com",'12'),("Sarah","Sarah@gmail.com",'43'),("Jim","Jim@gmail.com",'15'),("Andrew","Andrew@gmail.com",'56')]
set([k[1] for k in mytuple])
"""
print timeit.timeit(stmt=s, number=100000)
0.0699651241302
Adem的回答:
s = """
from itertools import izip
mytuple = [('Andrew','Andrew@gmail.com','20'),('Jim',"Jim@gmail.com",'12'),("Sarah","Sarah@gmail.com",'43'),("Jim","Jim@gmail.com",'15'),("Andrew","Andrew@gmail.com",'56')]
set(map(lambda x: x[1], mytuple))
"""
print timeit.timeit(stmt=s, number=100000)
0.237300872803 !!!