Python Counter Performance
Wed 16 April 2014
Python's Counter is a dictionary subclass used for tallying how many times elements occur in an iterable data structure like a list. I'm writing an application that makes heavy use of Counters - like creating 10,000,000 counters in one run - and I found that they were a performance chokepoint.
Since each of my counters is very similar to the previous one generated, I hoped to improve performance by continually mutating an existing counter rather than creating each one from scratch: create a Counter from a short list of changes and add it to the existing Counter.
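The pattern I had in mind looks something like this - a minimal sketch with made-up data, not my actual application code:
from collections import Counter

# a large existing tally, built once (hypothetical stand-in data)
tally = Counter(['a', 'b', 'b', 'c'])

# a short Counter holding only the changes since the last tally
changes = Counter(['b', 'd'])

# apply the delta by addition, instead of rebuilding the tally from scratch
tally = tally + changes
print(tally)  # Counter({'b': 3, 'a': 1, 'c': 1, 'd': 1}) - tie order may vary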
tl;dr
Adding Python Counters takes the same amount of time as creating each of the addend Counters. There's no performance gain from applying changes to an existing Counter by adding a second one to it vs. generating a new Counter from scratch.
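You can check this claim directly with timeit; if it holds, the two timings below should come out roughly equal (the sizes are arbitrary):
import timeit

setup = "from collections import Counter; big = Counter(range(1500))"

# adding a small Counter to a large one...
print(timeit.timeit("big + Counter(range(25))", setup=setup, number=1000))

# ...vs. building the combined Counter from scratch
print(timeit.timeit("Counter(range(1525))", setup=setup, number=1000))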
Counter performance vs. dictionary performance
Python advertises Counters as 'fast'. For small n (here, n = 10), creating one is about four times slower than creating a regular dictionary:
>>> import timeit
>>> from collections import Counter
>>> # Timing dictionary creation
>>> timeit.timeit("x = { num : 'foo' for num in range(0, 10)}", number=1000)
0.003952980041503906
>>> # Timing Counter creation
>>> timeit.timeit("from collections import Counter; x = Counter({ num : 'foo' for num in range(0, 10)})", number=1000)
0.016061067581176758
For larger n, the performance disparity isn't nearly as extreme. Here, n = 1,000,000 and Counter creation is about 35% slower than dictionary creation:
>>> # Timing dictionary creation
>>> timeit.timeit("x = { num : 'foo' for num in range(0, 1000000)}", number=1000)
123.48868298530579
>>> # Timing Counter creation
>>> timeit.timeit("from collections import Counter; x = Counter({ num : 'foo' for num in range(0, 1000000)})", number=1000)
167.64849400520325
Creating counters
When creating a Counter, you can either create an empty Counter or pass in an iterable whose items you want to count. If you pass an iterable, most of the work of creating the Counter takes place in its update method - Counter.__init__() just creates an empty Counter, then updates it with values from the iterable. For example, here are the interesting parts of the cProfile output on code to create a Counter of 1,500 elements 1,000 times:
ncalls | tottime | percall | cumtime | percall | filename:lineno(function)
---|---|---|---|---|---
1000 | 0.957 | 0.001 | 1.201 | 0.001 | collections.py:501(update)
1575000 | 0.219 | 0.000 | 0.219 | 0.000 | {method 'get' of 'dict' objects}
36000 | 0.020 | 0.000 | 0.020 | 0.000 | counter-perf-test.py:18(<genexpr>)
1000 | 0.003 | 0.000 | 0.004 | 0.000 | abc.py:128(__instancecheck__)
1000 | 0.002 | 0.000 | 1.203 | 0.001 | collections.py:438(__init__)
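The test script itself isn't shown here, but a minimal cProfile harness along these lines (names and data are stand-ins, not the actual counter-perf-test.py) produces a profile of this shape:
import cProfile
import random
from collections import Counter

def make_counter(size=1500):
    # build a Counter from `size` random elements (stand-in data)
    return Counter(random.randint(0, 10000) for _ in range(size))

# profile 1,000 Counter creations, sorted by total time per function
cProfile.run("for _ in range(1000): make_counter()", sort="tottime")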
Counter.update() takes by far the most time and, on top of that, it's what calls most of the functions below it. What does it look like?
def update(self, iterable=None, **kwds):
    if iterable is not None:
        if isinstance(iterable, Mapping):
            if self:
                self_get = self.get
                for elem, count in iterable.iteritems():
                    self[elem] = self_get(elem, 0) + count
            else:
                super(Counter, self).update(iterable) # fast path when counter is empty
        else:
            self_get = self.get
            for elem in iterable:
                self[elem] = self_get(elem, 0) + 1
    if kwds:
        self.update(kwds)
First, update checks whether the input iterable is a Mapping - a dictionary or a dictionary subclass (like a Counter). This is so it can hand the work to the update method of Counter's parent class, dict, when the Counter is empty - a performance enhancement. If the input iterable is not a Mapping, or if the Counter already has items in it, update loops through the iterable. Each element of the iterable becomes a key in the Counter, and as update loops it adds to the old value stored in self[elem] - by that element's count for a Mapping, or by 1 otherwise.
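Both Mapping branches are easy to see in action - updating an empty Counter with a dict takes the dict.update fast path, while updating a non-empty one loops and adds counts:
from collections import Counter

changes = {'a': 2, 'b': 1}

empty = Counter()
empty.update(changes)     # empty Counter + Mapping: dict.update fast path
print(empty)              # Counter({'a': 2, 'b': 1})

existing = Counter(a=5, c=3)
existing.update(changes)  # non-empty Counter + Mapping: loops, adding counts
print(existing)           # Counter({'a': 7, 'c': 3, 'b': 1})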
Finally, update applies any values passed as keyword arguments, like those shown below:
>>> x = Counter(key1='foo', key2='var')
>>> print x
Counter({'key2': 'var', 'key1': 'foo'})
The performance hits in update are in the time it takes to (1) loop through the iterable and (2) get the current Counter value of each element. So performance is tied almost entirely to the size of the iterable we're instantiating the Counter with.
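A quick experiment bears this out - if update's cost is dominated by the loop over the input, creation time should scale roughly linearly with input size:
import timeit

setup = "from collections import Counter"

# ten times the elements should take roughly ten times as long
print(timeit.timeit("Counter(range(1500))", setup=setup, number=1000))
print(timeit.timeit("Counter(range(15000))", setup=setup, number=1000))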
Adding Counters
Given that, adding a short Counter to a large Counter should barely cost anything. Here I profile adding a 25-element Counter to the 1,500-element one.
ncalls | tottime | percall | cumtime | percall | filename:lineno(function)
---|---|---|---|---|---
2000 | 1.399 | 0.001 | 1.729 | 0.001 | collections.py:501(update)
1600000 | 0.275 | 0.000 | 0.275 | 0.000 | {method 'get' of 'dict' objects}
1 | 0.234 | 0.234 | 2.063 | 2.063 | counter-perf-test.py:6(main)
1035 | 0.046 | 0.000 | 0.060 | 0.000 | random.py:291(sample)
36000 | 0.040 | 0.000 | 0.040 | 0.000 | counter-perf-test.py:18(<genexpr>)
36000 | 0.032 | 0.000 | 0.032 | 0.000 | counter-perf-test.py:22(<genexpr>)
2000 | 0.007 | 0.000 | 0.013 | 0.000 | abc.py:128(__instancecheck__)
2000 | 0.007 | 0.000 | 1.736 | 0.001 | collections.py:438(__init__)
25358 | 0.006 | 0.000 | 0.006 | 0.000 | {method 'add' of 'set' objects}
Why does this lengthen the code's running time by 60%? I was surprised, because the new Counter is not very big, and I'm not making any copies - I thought that adding Counters would be quick. Let's look at Counter.__add__():
def __add__(self, other):
    if not isinstance(other, Counter):
        return NotImplemented
    result = Counter()
    for elem, count in self.items():
        newcount = count + other[elem]
        if newcount > 0:
            result[elem] = newcount
    for elem, count in other.items():
        if elem not in self and count > 0:
            result[elem] = count
    return result
Here's why it takes so long: __add__ creates an entirely new Counter (result) by iterating through the items in both of the input Counters (self.items() and other.items()). Adding two Counters therefore takes about as much time as it took to instantiate those Counters in the first place.
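A quick check confirms that + hands back a brand-new Counter rather than mutating either operand:
from collections import Counter

big = Counter(range(1500))
delta = Counter({0: 3, 1: -1})

result = big + delta
print(result is big)  # False - __add__ built a new Counter from scratch
print(len(big))       # 1500 - the original operand is untouched
print(len(result))    # 1499 - element 1 summed to 0 and was dropped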
Gah. Looks like I'll have to come up with another idea to improve my program's running time.