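The question is how to completely remove the elements that occur more than once in an array. Below is an approach that works but is very slow for large arrays, because it compares every element with every other element.
Any ideas for doing this faster? Thanks in advance.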
import numpy as np

count = 0
result = []
input = np.array([[1,1], [1,1], [2,3], [4,5], [1,1]]) # array with points [x, y]

# count appearance of elements with same x and y coordinate
# append to result if element appears just once
for i in input:
    for j in input:
        if (j[0] == i[0]) and (j[1] == i[1]):
            count += 1
    if count == 1:
        result.append(i)
    count = 0

print(np.array(result))
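Update, because I oversimplified:
To be clear once more: how do I remove from an array/list the elements that occur more than once with respect to a certain attribute? Here: a list whose elements have length 6. If the first and second entries of an element appear more than once in the list, all of the elements concerned should be removed from the list. I hope I'm not being confusing. Eumiro helped me a lot with this, but I didn't manage to flatten the output list the way it should be :(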
import numpy as np
import collections

input = [[1,1,3,5,6,6],[1,1,4,4,5,6],[1,3,4,5,6,7],[3,4,6,7,7,6],[1,1,4,6,88,7],[3,3,3,3,3,3],[456,6,5,343,435,5]]
# here, input[0], input[1] and input[4] should be removed from input because
# their first and second entries appear more than once in the list, got it? :)
d = {}
for a in input:
    d.setdefault(tuple(a[:2]), []).append(a[2:])

outputDict = [list(k)+list(v) for k,v in d.items() if len(v) == 1 ]
result = []

def flatten(x):
    if isinstance(x, collections.Iterable):
        return [a for i in x for a in flatten(i)]
    else:
        return [x]

# I took flatten(x) from https://*.com/a/2158522/1132378
# And I need it, because output is a nested list :(
for i in outputDict:
    result.append(flatten(i))
print(np.array(result))
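So this works, but it is not feasible for large lists.
First I got
RuntimeError: maximum recursion depth exceeded in cmp
and after applying
sys.setrecursionlimit(10000)
I got
Segmentation fault
How can I implement Eumiro's solution for large lists with more than 100000 elements?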
Solution:
np.array(list(set(map(tuple, input))))
returns (row order may vary, since sets are unordered):
array([[4, 5],
       [2, 3],
       [1, 1]])
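If NumPy 1.13 or later is available, the same de-duplication can probably be done without leaving NumPy, via the `axis` argument of `np.unique`. This is only a minimal sketch: the name `points` is introduced here for illustration, and the rows come back sorted rather than in set order.

```python
import numpy as np

# Illustrative data: the same points as in the question.
points = np.array([[1, 1], [1, 1], [2, 3], [4, 5], [1, 1]])

# axis=0 treats every row as one item, so duplicate rows collapse to one.
unique_rows = np.unique(points, axis=0)
print(unique_rows)
```

Like the `set` version, this keeps one copy of `[1, 1]`; it does not drop repeated rows entirely.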
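Update 1: If you also want to remove [1, 1] (because it appears more than once), you can do:
from collections import Counter
np.array([k for k, v in Counter(map(tuple, input)).items() if v == 1])
returns (row order depends on the dictionary's iteration order):
array([[4, 5],
       [2, 3]])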
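A NumPy-only variant of the same idea might use `return_counts=True`, which makes `np.unique` report how often each distinct row occurs. A sketch assuming NumPy 1.13 or later; `points`, `rows`, `counts` and `singles` are illustrative names, not part of the original answer.

```python
import numpy as np

# Illustrative data: the same points as in the question.
points = np.array([[1, 1], [1, 1], [2, 3], [4, 5], [1, 1]])

# return_counts=True also reports how often each distinct row occurs.
rows, counts = np.unique(points, axis=0, return_counts=True)

# Keep only the rows that occur exactly once.
singles = rows[counts == 1]
print(singles)
```

The rows in `singles` are in sorted order, so this fits best when the original ordering does not matter.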
Update 2: With the input [[1,1,2], [1,1,3], [2,3,4], [4,5,5], [1,1,7]]:
input=[[1,1,2], [1,1,3], [2,3,4], [4,5,5], [1,1,7]]
d = {}
for a in input:
    d.setdefault(tuple(a[:2]), []).append(a[2])
d is now:
{(1, 1): [2, 3, 7],
 (2, 3): [4],
 (4, 5): [5]}
So we want to take all key-value pairs that have a single value and rebuild the array from them:
np.array([k+tuple(v) for k,v in d.items() if len(v) == 1])
returns (row order depends on the dictionary's iteration order):
array([[4, 5, 5],
       [2, 3, 4]])
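For large inputs, the grouping by the first two columns could also be pushed into NumPy. The sketch below is one possible approach, not the method from the answer: it assumes NumPy 1.13 or later, and the names `data`, `mask` and `filtered` are made up for illustration.

```python
import numpy as np

# Illustrative data: the same rows as in Update 2.
data = np.array([[1, 1, 2], [1, 1, 3], [2, 3, 4], [4, 5, 5], [1, 1, 7]])

# Group rows by their first two columns only.
_, inverse, counts = np.unique(data[:, :2], axis=0,
                               return_inverse=True, return_counts=True)

# counts[inverse] gives, for each row, how many rows share its (x, y) key.
# np.ravel guards against NumPy versions that return a 2-D inverse here.
mask = counts[np.ravel(inverse)] == 1
filtered = data[mask]
print(filtered)
```

Because the filtering is done with a boolean mask on `data`, the surviving rows keep their original order.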
Update 3: For larger arrays, you can adapt my previous solution like this:
import numpy as np
input = [[1,1,3,5,6,6],[1,1,4,4,5,6],[1,3,4,5,6,7],[3,4,6,7,7,6],[1,1,4,6,88,7],[3,3,3,3,3,3],[456,6,5,343,435,5]]
d = {}
for a in input:
    d.setdefault(tuple(a[:2]), []).append(a)
np.array([v for v in d.values() if len(v) == 1])
returns (row order depends on the dictionary's iteration order):
array([[[456,   6,   5, 343, 435,   5]],
       [[  1,   3,   4,   5,   6,   7]],
       [[  3,   3,   3,   3,   3,   3]],
       [[  3,   4,   6,   7,   7,   6]]])
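The result above is nested one level deeper than the input, because each kept value is a one-element list. One way to get flat rows without any recursive flattening might be to count the keys first and then filter the original rows. This is a sketch only; `drop_repeated_keys` and `key_len` are hypothetical names, not part of NumPy or of the original answer.

```python
from collections import Counter

import numpy as np


def drop_repeated_keys(rows, key_len=2):
    """Return the rows whose first `key_len` entries occur only once.

    Hypothetical helper, introduced here for illustration.
    """
    # First pass: count how often each key (the leading entries) appears.
    key_counts = Counter(tuple(row[:key_len]) for row in rows)
    # Second pass: keep rows with a unique key, in their original order.
    return [row for row in rows if key_counts[tuple(row[:key_len])] == 1]


rows = [[1, 1, 3, 5, 6, 6], [1, 1, 4, 4, 5, 6], [1, 3, 4, 5, 6, 7],
        [3, 4, 6, 7, 7, 6], [1, 1, 4, 6, 88, 7], [3, 3, 3, 3, 3, 3],
        [456, 6, 5, 343, 435, 5]]

flat = np.array(drop_repeated_keys(rows))
print(flat.shape)  # (4, 6)
```

Both passes are linear in the number of rows and use no recursion, so the recursion limit should not come into play even for lists with more than 100000 elements.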