python - Find unique rows in numpy.array -



python - Find unique rows in numpy.array -

i need find unique rows in numpy.array.

for example:

>>> # have array([[1, 1, 1, 0, 0, 0], [0, 1, 1, 1, 0, 0], [0, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 0], [1, 1, 1, 1, 1, 0]]) >>> new_a # want array([[1, 1, 1, 0, 0, 0], [0, 1, 1, 1, 0, 0], [1, 1, 1, 1, 1, 0]])

i know can create set , loop on array, looking efficient pure numpy solution. believe there way set info type void , utilize numpy.unique, couldn't figure out how create work.

another alternative utilize of structured arrays using view of void type joins whole row single item:

a = np.array([[1, 1, 1, 0, 0, 0], [0, 1, 1, 1, 0, 0], [0, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 0], [1, 1, 1, 1, 1, 0]]) b = np.ascontiguousarray(a).view(np.dtype((np.void, a.dtype.itemsize * a.shape[1]))) _, idx = np.unique(b, return_index=true) unique_a = a[idx] >>> unique_a array([[0, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 0], [1, 1, 1, 1, 1, 0]])

edit added np.ascontiguousarray next @seberg's recommendation. slow method downwards if array not contiguous.

edit above can sped up, perhaps @ cost of clarity, doing:

unique_a = np.unique(b).view(a.dtype).reshape(-1, a.shape[1])

also, @ to the lowest degree on system, performance wise on par, or better, lexsort method:

a = np.random.randint(2, size=(10000, 6)) %timeit np.unique(a.view(np.dtype((np.void, a.dtype.itemsize*a.shape[1])))).view(a.dtype).reshape(-1, a.shape[1]) 100 loops, best of 3: 3.17 ms per loop %timeit ind = np.lexsort(a.t); a[np.concatenate(([true],np.any(a[ind[1:]]!=a[ind[:-1]],axis=1)))] 100 loops, best of 3: 5.93 ms per loop = np.random.randint(2, size=(10000, 100)) %timeit np.unique(a.view(np.dtype((np.void, a.dtype.itemsize*a.shape[1])))).view(a.dtype).reshape(-1, a.shape[1]) 10 loops, best of 3: 29.9 ms per loop %timeit ind = np.lexsort(a.t); a[np.concatenate(([true],np.any(a[ind[1:]]!=a[ind[:-1]],axis=1)))] 10 loops, best of 3: 116 ms per loop

python arrays numpy unique

Comments

Popular posts from this blog

formatting - SAS SQL Datepart function returning odd values -

c++ - Apple Mach-O Linker Error(Duplicate Symbols For Architecture armv7) -

php - Yii 2: Unable to find a class into the extension 'yii2-admin' -