Numpy sum elements in array based on its value

Ziroy
Posts: 34
Joined: Tue Oct 24, 2017 12:32 pm

Post by Ziroy » Tue Feb 13, 2018 5:23 pm

I have an unsorted array of indices:

Code: Select all

 import numpy as np
 i = np.array([1, 5, 2, 6, 4, 3, 6, 7, 4, 3, 2])

I also have an array of values of the same length:

Code: Select all

 v = np.array([2, 5, 2, 3, 4, 1, 2, 1, 6, 4, 2])

I have an array of zeros for the desired values:

Code: Select all

 d = np.zeros(10)

Now I want to add the values of v to the elements of d, based on the corresponding index in i. In plain Python I would do it like this:

Code: Select all

 for index, value in enumerate(v):
     idx = i[index]
     d[idx] += value  # accumulate, since an index can occur more than once
It is ugly and inefficient. How can I change it?

BubbleJ
Posts: 49
Joined: Fri Sep 08, 2017 3:11 pm

Re: Numpy sum elements in array based on its value

Post by BubbleJ » Thu Mar 01, 2018 8:09 pm

np.bincount is apparently quite efficient for this kind of accumulated weighted counting, so here's an approach using it:

Code: Select all

 counts = np.bincount(i, v)
 d[:counts.size] = counts
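
To make the snippet above concrete, here is a minimal sketch that runs it on the arrays from the question. The second argument of np.bincount is its weights parameter, so each bin accumulates the values of v that share an index in i. The minlength variant is an addition that is not part of the original answer, and it assumes every index in i is smaller than d.size.

```python
import numpy as np

i = np.array([1, 5, 2, 6, 4, 3, 6, 7, 4, 3, 2])
v = np.array([2, 5, 2, 3, 4, 1, 2, 1, 6, 4, 2])
d = np.zeros(10)

# Sum the values of v that share the same index in i.
counts = np.bincount(i, weights=v)

# counts only has i.max() + 1 entries, so write it into the front of d.
d[:counts.size] = counts

# Alternative (assumes i.max() < d.size): pad the result to the length of d.
d_alt = np.bincount(i, weights=v, minlength=d.size)

print(d)      # index 4 holds 4 + 6 = 10, index 6 holds 3 + 2 = 5, and so on
print(d_alt)
```

Note that the slice assignment overwrites the front of d rather than adding to it, which is fine here because d starts out as zeros.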
Runtime tests

This section compares the np.add.at based approach listed in the other post with the np.bincount based one listed earlier in this post.
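
Since the other post is not quoted in this thread, here is a rough sketch of what an np.add.at based approach looks like on the data from the question, and why it is needed instead of a plain d[i] += v. The names d_buffered and d_add_at are only for illustration.

```python
import numpy as np

i = np.array([1, 5, 2, 6, 4, 3, 6, 7, 4, 3, 2])
v = np.array([2, 5, 2, 3, 4, 1, 2, 1, 6, 4, 2])

# Buffered fancy-index addition: for an index that occurs several times,
# only one of the additions survives, so this does NOT match the loop.
d_buffered = np.zeros(10)
d_buffered[i] += v

# Unbuffered in-place addition: every occurrence of an index contributes.
d_add_at = np.zeros(10)
np.add.at(d_add_at, i, v)

print(d_buffered.sum(), d_add_at.sum(), v.sum())
```

With np.add.at the total of d_add_at equals the total of v, while the buffered version loses the contributions of repeated indices.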

Code: Select all

 In [61]: def bincount_based(d, i, v):
     ...:     counts = np.bincount(i, v)
     ...:     d[:counts.size] = counts
     ...:
     ...: def add_at_based(d, i, v):
     ...:     np.add.at(d, i, v)
     ...:

 In [62]: # Inputs (random numbers)
     ...: N = 10000
     ...: i = np.random.randint(0, 1000, (N))
     ...: v = np.random.randint(0, 1000, (N))
     ...:
     ...: # Setup output arrays for two approaches
     ...: M = 12000
     ...: d1 = np.zeros(M)
     ...: d2 = np.zeros(M)
     ...:

 In [63]: bincount_based(d1, i, v)  # Run approaches
     ...: add_at_based(d2, i, v)
     ...:

 In [64]: np.allclose(d1, d2)  # Verify outputs
 Out[64]: True

 In [67]: # Setup output arrays for two approaches again for timing
     ...: M = 12000
     ...: d1 = np.zeros(M)
     ...: d2 = np.zeros(M)
     ...:

 In [68]: %timeit add_at_based(d2, i, v)
 1000 loops, best of 3: 1.83 ms per loop

 In [69]: %timeit bincount_based(d1, i, v)
 10000 loops, best of 3: 52.7 µs per loop
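
For anyone who wants to repeat the comparison outside IPython, the following is a minimal sketch using the standard timeit module. The fixed seed, the use of np.random.default_rng and the repeat count of 200 are my own assumptions, not part of the session above, and the absolute numbers will depend on the machine and the NumPy version.

```python
import timeit

import numpy as np


def bincount_based(d, i, v):
    counts = np.bincount(i, v)
    d[:counts.size] = counts


def add_at_based(d, i, v):
    np.add.at(d, i, v)


rng = np.random.default_rng(0)  # fixed seed, chosen only for repeatability
N, M = 10000, 12000
i = rng.integers(0, 1000, N)
v = rng.integers(0, 1000, N)

# Check that both approaches agree on fresh zero arrays.
d1, d2 = np.zeros(M), np.zeros(M)
bincount_based(d1, i, v)
add_at_based(d2, i, v)
agree = np.allclose(d1, d2)
print("outputs agree:", agree)

# Time each approach. add_at_based keeps accumulating into d2 across runs,
# which changes the values in d2 but not the amount of work per call.
runs = 200  # arbitrary repeat count
t_bincount = timeit.timeit(lambda: bincount_based(d1, i, v), number=runs) / runs
t_add_at = timeit.timeit(lambda: add_at_based(d2, i, v), number=runs) / runs
print(f"bincount: {t_bincount * 1e6:.1f} us per call")
print(f"add.at:   {t_add_at * 1e6:.1f} us per call")
```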
