Merge and sum of two dictionaries
Score: 36
I have the dictionary below, and I want to add it to another dictionary whose keys are not necessarily distinct from its own, merging the results. Is there a built-in function for this, or will I need to write my own?
{
'6d6e7bf221ae24e07ab90bba4452267b05db7824cd3fd1ea94b2c9a8': 6,
'7c4a462a6ed4a3070b6d78d97c90ac230330603d24a58cafa79caf42': 7,
'9c37bdc9f4750dd7ee2b558d6c06400c921f4d74aabd02ed5b4ddb38': 9,
'd3abb28d5776aef6b728920b5d7ff86fa3a71521a06538d2ad59375a': 15,
'2ca9e1f9cbcd76a5ce1772f9b59995fd32cbcffa8a3b01b5c9c8afc2': 11
}
The number of elements in the dictionary is also unknown.
Where the merge encounters two identical keys, their values should be summed instead of overwritten.
python dictionary
11
Please get your terminology straight; that's a dict, not a list. Also, what kind of result do you expect, and what have you tried?
– Fred Foo
May 5 '12 at 11:47
1
You might want to edit your question and provide better (and correct) information, or this question will likely be closed.
– Rik Poggi
May 5 '12 at 12:05
edited Jun 5 at 11:57 by Clemens Tolboom
asked May 5 '12 at 11:45 by badc0re
9 Answers
Score: 105 (accepted)
You didn't say how exactly you want to merge, so take your pick:
x = {'both1': 1, 'both2': 2, 'only_x': 100}
y = {'both1': 10, 'both2': 20, 'only_y': 200}
# Keys of x only
print({k: x.get(k, 0) + y.get(k, 0) for k in set(x)})
# Keys present in both x and y (intersection)
print({k: x.get(k, 0) + y.get(k, 0) for k in set(x) & set(y)})
# Keys present in either x or y (union)
print({k: x.get(k, 0) + y.get(k, 0) for k in set(x) | set(y)})
Results:
{'both2': 22, 'only_x': 100, 'both1': 11}
{'both2': 22, 'both1': 11}
{'only_y': 200, 'both2': 22, 'both1': 11, 'only_x': 100}
How do we implement this if we have n dictionaries?
– Tony Mathew
Sep 23 at 18:57
I liked this approach. However, in my case, for the same dictionary values as above, I am trying to take the difference, i.e. x - y:
diff = {k: x.get(k, 0) - y.get(k, 0) for k in set(x) | set(y)}
print(diff)
This gives me: {'only_y': -200, 'both2': -18, 'only_x': 100, 'both1': -9}
I am concerned about the only_y value above, as it changed to -200 instead of remaining 200. Even though you already answered the actual question, could you please suggest a better way of handling the negative values for keys that are unique to one dictionary?
– Panchu
Sep 29 at 22:29
@Panchu: how about sub = lambda a, b: a if b is None else b if a is None else a - b and then {k: sub(x.get(k), y.get(k)) for ... etc
– georg
Sep 30 at 0:51
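georg's suggestion in the comment above can be spelled out as follows. This is only a minimal sketch: the helper name sub comes from the comment, and the sample data is reused from the answer. A key that appears in only one dictionary keeps its value unchanged instead of being negated.

```python
x = {'both1': 1, 'both2': 2, 'only_x': 100}
y = {'both1': 10, 'both2': 20, 'only_y': 200}

def sub(a, b):
    """Return a - b, or whichever value exists when the other is missing (None)."""
    if b is None:
        return a
    if a is None:
        return b
    return a - b

# dict.get() without a default returns None for missing keys,
# which is how sub() detects keys unique to one dictionary.
diff = {k: sub(x.get(k), y.get(k)) for k in set(x) | set(y)}
print(diff)
```

Note that this relies on None never being a legitimate value in either dictionary.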
Score: 24
You can perform +, -, &, and | (addition, subtraction, intersection, and union) on collections.Counter() objects.
So we can do the following (Note: only positive count values will remain in the dictionary):
from collections import Counter
x = {'both1':1, 'both2':2, 'only_x': 100 }
y = {'both1':10, 'both2': 20, 'only_y':200 }
z = dict(Counter(x)+Counter(y))
print(z) # {'both2': 22, 'only_x': 100, 'both1': 11, 'only_y': 200}
To handle cases where the result may be zero or negative, use Counter.update() for addition and Counter.subtract() for subtraction:
x = {'both1':0, 'both2':2, 'only_x': 100 }
y = {'both1':0, 'both2': -20, 'only_y':200 }
xx = Counter(x)
yy = Counter(y)
xx.update(yy)
dict(xx) # {'both2': -18, 'only_x': 100, 'both1': 0, 'only_y': 200}
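The difference between the two forms is easy to check side by side. A short sketch reusing the data above (the names added, updated and subtracted are only illustrative): binary + drops every key whose total is zero or negative, while update() and subtract() keep them.

```python
from collections import Counter

x = {'both1': 0, 'both2': 2, 'only_x': 100}
y = {'both1': 0, 'both2': -20, 'only_y': 200}

# Binary + keeps only keys whose summed count is strictly positive.
added = dict(Counter(x) + Counter(y))

# update() adds counts in place and keeps zero and negative totals.
updated = Counter(x)
updated.update(y)

# subtract() is the in-place counterpart for subtraction.
subtracted = Counter(x)
subtracted.subtract(y)

print(added)
print(dict(updated))
print(dict(subtracted))
```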
What if 'both1': 0 in x and y and I want to have 'both1': 0 in z? With this solution there would be no 'both1' key in z.
– sergej
Jan 5 '17 at 9:16
@sergej That's interesting. Looking at the collections.Counter() link it appears that '+' only keeps positive value counts (> 0). However x.update(y) (where x,y are of type Counter) adds both objects to include 0 and negative value counts. I'll add this to the answer.
– Scott
Jan 5 '17 at 17:48
This is the most pythonic answer.
– BenP
Oct 16 at 6:59
Score: 17
You could use defaultdict for this:
from collections import defaultdict

def dsum(*dicts):
    # Missing keys start at int() == 0, so += works for every key
    ret = defaultdict(int)
    for d in dicts:
        for k, v in d.items():
            ret[k] += v
    return dict(ret)

x = {'both1': 1, 'both2': 2, 'only_x': 100}
y = {'both1': 10, 'both2': 20, 'only_y': 200}
print(dsum(x, y))
This produces
{'both1': 11, 'both2': 22, 'only_x': 100, 'only_y': 200}
Score: 9
Additional notes based on the answers of georg, NPE and Scott.
I was trying to perform this action on collections of 2 or more dictionaries and was interested in seeing the time it took for each. Because I wanted to do this on any number of dictionaries, I had to change some of the answers a bit. If anyone has better suggestions for them, feel free to edit.
Here's my test method, recently updated to include tests with much larger dictionaries.
First, I used the following data:
import random

x = {'xy1': 1, 'xy2': 2, 'xyz': 3, 'only_x': 100}
y = {'xy1': 10, 'xy2': 20, 'xyz': 30, 'only_y': 200}
z = {'xyz': 300, 'only_z': 300}
small_tests = [x, y, z]

# 200,000 random 8 letter keys
keys = [''.join(random.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(8)) for _ in range(200000)]
a, b, c = {}, {}, {}
# 50/50 chance of a value being assigned to each dictionary, some keys will be missed but meh
for key in keys:
    if random.getrandbits(1):
        a[key] = random.randint(0, 1000)
    if random.getrandbits(1):
        b[key] = random.randint(0, 1000)
    if random.getrandbits(1):
        c[key] = random.randint(0, 1000)
large_tests = [a, b, c]
print("a:", len(a), "b:", len(b), "c:", len(c))
#: a: 100069 b: 100385 c: 99989
Now each of the methods:
from collections import defaultdict, Counter

def georg_method(tests):
    return {k: sum(t.get(k, 0) for t in tests) for k in set.union(*[set(t) for t in tests])}

def georg_method_nosum(tests):
    # If you know you will have exactly 3 dicts
    return {k: tests[0].get(k, 0) + tests[1].get(k, 0) + tests[2].get(k, 0) for k in set.union(*[set(t) for t in tests])}

def npe_method(tests):
    ret = defaultdict(int)
    for d in tests:
        for k, v in d.items():
            ret[k] += v
    return dict(ret)

# Note: There is a bug with scott's method. See below for details.
def scott_method(tests):
    return dict(sum((Counter(t) for t in tests), Counter()))

def scott_method_nosum(tests):
    # If you know you will have exactly 3 dicts
    return dict(Counter(tests[0]) + Counter(tests[1]) + Counter(tests[2]))

methods = {"georg_method": georg_method, "georg_method_nosum": georg_method_nosum,
           "npe_method": npe_method,
           "scott_method": scott_method, "scott_method_nosum": scott_method_nosum}
I also wrote a quick function to find whatever differences there were between the results. Unfortunately, that's when I found the problem in Scott's method: if the values for a key total 0, that key won't be included at all because of how Counter() behaves when adding.
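The comparison function itself is not shown in this answer. A hypothetical version (the name find_differences and its return format are assumptions, not the author's code) might look like the sketch below, and it exposes the zero-total problem directly.

```python
from collections import Counter, defaultdict

def find_differences(expected, actual):
    """Return {key: (expected_value, actual_value)} for every mismatching key."""
    diffs = {}
    for key in set(expected) | set(actual):
        e, a = expected.get(key), actual.get(key)  # None marks a missing key
        if e != a:
            diffs[key] = (e, a)
    return diffs

# Made-up data in which the values for 'zero' cancel out
tests = [{'a': 1, 'zero': 5}, {'a': 2, 'zero': -5}]

# The defaultdict-based sum keeps the key whose values total 0
by_defaultdict = defaultdict(int)
for d in tests:
    for k, v in d.items():
        by_defaultdict[k] += v

# Counter addition silently drops it
by_counter = dict(sum((Counter(t) for t in tests), Counter()))

print(find_differences(dict(by_defaultdict), by_counter))
```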
Finally, the results:
Results: Small Tests
for name, method in methods.items():
    print("Method:", name)
    %timeit -n10000 method(small_tests)
#: Method: npe_method
#: 10000 loops, best of 3: 5.16 µs per loop
#: Method: georg_method_nosum
#: 10000 loops, best of 3: 8.11 µs per loop
#: Method: georg_method
#: 10000 loops, best of 3: 11.8 µs per loop
#: Method: scott_method_nosum
#: 10000 loops, best of 3: 42.4 µs per loop
#: Method: scott_method
#: 10000 loops, best of 3: 65.3 µs per loop
Results: Large Tests
Naturally, I couldn't run anywhere near as many loops:
for name, method in methods.items():
    print("Method:", name)
    %timeit -n10 method(large_tests)
#: Method: npe_method
#: 10 loops, best of 3: 227 ms per loop
#: Method: georg_method_nosum
#: 10 loops, best of 3: 327 ms per loop
#: Method: georg_method
#: 10 loops, best of 3: 455 ms per loop
#: Method: scott_method_nosum
#: 10 loops, best of 3: 510 ms per loop
#: Method: scott_method
#: 10 loops, best of 3: 600 ms per loop
Conclusion
╔═══════════════════════════╦═══════╦═════════════════════════════╗
║                           ║       ║   Best of 3 Time Per Loop   ║
║         Algorithm         ║  By   ╠══════════════╦══════════════╣
║                           ║       ║ small_tests  ║ large_tests  ║
╠═══════════════════════════╬═══════╬══════════════╬══════════════╣
║ defaultdict sum           ║ NPE   ║      5.16 µs ║   227,000 µs ║
║ set unions without sum()  ║ georg ║      8.11 µs ║   327,000 µs ║
║ set unions with sum()     ║       ║      11.8 µs ║   455,000 µs ║
║ Counter() without sum()   ║ Scott ║      42.4 µs ║   510,000 µs ║
║ Counter() with sum()      ║       ║      65.3 µs ║   600,000 µs ║
╚═══════════════════════════╩═══════╩══════════════╩══════════════╝
Important. YMMV.
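%timeit is an IPython magic, so the timing loops above only run inside IPython or Jupyter. In a plain Python script, the standard timeit module gives comparable measurements. A rough sketch, assuming a methods mapping like the one defined earlier (only two entries are repeated here so the block is self-contained); actual timings depend on the machine and Python version.

```python
import timeit
from collections import defaultdict

def npe_method(tests):
    ret = defaultdict(int)
    for d in tests:
        for k, v in d.items():
            ret[k] += v
    return dict(ret)

def georg_method(tests):
    return {k: sum(t.get(k, 0) for t in tests)
            for k in set.union(*[set(t) for t in tests])}

small_tests = [
    {'xy1': 1, 'xy2': 2, 'xyz': 3, 'only_x': 100},
    {'xy1': 10, 'xy2': 20, 'xyz': 30, 'only_y': 200},
    {'xyz': 300, 'only_z': 300},
]
methods = {"npe_method": npe_method, "georg_method": georg_method}

timings = {}
for name, method in methods.items():
    # repeat=3, number=10000 mirrors "10000 loops, best of 3"
    runs = timeit.repeat(lambda: method(small_tests), repeat=3, number=10000)
    timings[name] = min(runs) / 10000 * 1e6  # best run, in microseconds per call
    print("Method:", name, "-> %.2f us per loop" % timings[name])
```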
Score: 2
Another option uses reduce. This lets you sum-merge an arbitrary collection of dictionaries:
from functools import reduce

collection = [
    {'a': 1, 'b': 1},
    {'a': 2, 'b': 2},
    {'a': 3, 'b': 3},
    {'a': 4, 'b': 4, 'c': 1},
    {'a': 5, 'b': 5, 'c': 1},
    {'a': 6, 'b': 6, 'c': 1},
    {'a': 7, 'b': 7},
    {'a': 8, 'b': 8},
    {'a': 9, 'b': 9},
]

def reducer(accumulator, element):
    # Add each value of element onto the running totals
    for key, value in element.items():
        accumulator[key] = accumulator.get(key, 0) + value
    return accumulator

total = reduce(reducer, collection, {})
assert total['a'] == sum(d.get('a', 0) for d in collection)
assert total['b'] == sum(d.get('b', 0) for d in collection)
assert total['c'] == sum(d.get('c', 0) for d in collection)
print(total)
Execution:
{'a': 45, 'b': 45, 'c': 3}
Advantages:
- Simple, clear, Pythonic.
- Schema-less, as long as all values are summable.
- O(n) time complexity in the total number of items; beyond the result dictionary itself, only O(1) extra memory is needed.
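One detail worth noting about the reducer above: it mutates and returns its accumulator, so the initial {} passed to reduce matters. A small sketch of what happens without it (the names safe and unsafe, and the shortened sample data, are illustrative only):

```python
from functools import reduce

def reducer(accumulator, element):
    for key, value in element.items():
        accumulator[key] = accumulator.get(key, 0) + value
    return accumulator

collection = [{'a': 1, 'b': 1}, {'a': 2, 'c': 5}]

# With an explicit initial value, the input dictionaries are left untouched.
safe = reduce(reducer, collection, {})
first_after_safe = dict(collection[0])
print(safe, first_after_safe)

# Without it, reduce() uses collection[0] as the accumulator and mutates it.
unsafe = reduce(reducer, collection)
print(unsafe is collection[0])
```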
Score: 1
I suspect you're looking for dict's update method:
>>> d1 = {1:2,3:4}
>>> d2 = {5:6,7:8}
>>> d1.update(d2)
>>> d1
{1: 2, 3: 4, 5: 6, 7: 8}
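Keep in mind that update overwrites the value of any key present in both dictionaries rather than summing it, which matters for the question as asked. A quick illustration with overlapping keys (the sample data is made up):

```python
d1 = {'a': 1, 'b': 2}
d2 = {'b': 10, 'c': 20}

overwritten = dict(d1)
overwritten.update(d2)   # 'b' becomes 10; the old value 2 is lost

summed = dict(d1)
for key, value in d2.items():
    summed[key] = summed.get(key, 0) + value   # 'b' becomes 12

print(overwritten)
print(summed)
```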
I don't see how you can suspect that when the question does not say anything about merge behavior. update on a dictionary will overwrite values when keys are identical; maybe he's summing unique occurrences of a hash in which case using update is destructive.
– JosefAssad
May 5 '12 at 11:55
Well, I have already tried that, but the results don't sum.
– badc0re
May 5 '12 at 11:57
@JosefAssad You are right.
– badc0re
May 5 '12 at 12:02
I took "merge" in the question to mean the same as update. "sum", which I assume means one ends up with duplicate keys, is something you can't do with a dict. A list of tuples, e.g. [(1,2),(3,4)], would be a start for this. @DameJovanoski: you need to edit your question to explain what you really want to accomplish. My bad for guessing.
– zigg
May 5 '12 at 12:03
I am sorry for the mess-up, I had a bad night yesterday :D
– badc0re
May 5 '12 at 12:13
Score: 1
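from functools import reduce  # a builtin on Python 2, must be imported on Python 3

d1 = {'apples': 2, 'banana': 1}
d2 = {'apples': 3, 'banana': 2}
# Fold d2's items into a copy of d1; update() returns None, so "or d" yields the dict
merged = reduce(
    lambda d, i: (
        d.update(((i[0], d.get(i[0], 0) + i[1]),)) or d
    ),
    d2.items(),  # iteritems() on Python 2
    d1.copy(),
)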
There is also a pretty simple replacement for dict.update(), although it overwrites values for shared keys rather than summing them:
merged = dict(d1, **d2)
I liked this tip: merged = dict(d1, **d2)
– arannasousa
Jan 13 '17 at 23:34
Score: 0
If you want to create a new dict, as with | (union), use:
>>> dict({'a': 1,'c': 2}, **{'c': 1})
{'a': 1, 'c': 1}
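Two caveats apply to this idiom: it overwrites rather than sums, and because the second dictionary is passed as keyword arguments, its keys must be strings on Python 3. A short sketch with illustrative variable names; the {**a, **b} form requires Python 3.5 or later.

```python
a = {'a': 1, 'c': 2}
b = {'c': 1}

merged_kwargs = dict(a, **b)   # 'c' is overwritten with b's value, not summed
merged_unpack = {**a, **b}     # same result; also accepts non-string keys
print(merged_kwargs, merged_unpack)

try:
    dict({1: 2}, **{3: 4})
    non_string_keys_ok = True
except TypeError:
    # On Python 3, keyword argument names must be strings
    non_string_keys_ok = False
print(non_string_keys_ok)
```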
Score: 0
class dict_merge(dict):
    def __add__(self, other):
        result = dict_merge({})
        for key in self.keys():
            if key in other.keys():
                result[key] = self[key] + other[key]
            else:
                result[key] = self[key]
        for key in other.keys():
            if key in self.keys():
                pass
            else:
                result[key] = other[key]
        return result

a = dict_merge({"a":2, "b":3, "d":4})
b = dict_merge({"a":1, "b":2})
c = dict_merge({"a":5, "b":6, "c":5})
d = dict_merge({"a":8, "b":6, "e":5})
print((a + b + c + d))
# Output: {'a': 16, 'b': 17, 'd': 4, 'c': 5, 'e': 5}
That is operator overloading. Using __add__, we have defined how to use the operator + for our dict_merge, which inherits from the built-in Python dict. You can make it more flexible by defining other operators in the same class in a similar way, e.g. * with __mul__ for multiplying, / with __truediv__ (__div__ on Python 2) for dividing, or even % with __mod__ for modulo, replacing the + in self[key] + other[key] with the corresponding operator, if you ever find yourself needing such merging.
I have only tested this as it is without other operators but I don't foresee a problem with other operators. Just learn by trying.
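The generalisation described above can be sketched with the standard operator module, so that the merge logic is written only once. The class name DictMerge and the helper _combine are made up for this illustration and are not part of the original answer; on Python 3, division is hooked through __truediv__.

```python
import operator

class DictMerge(dict):
    """dict whose arithmetic operators combine values key by key."""

    def _combine(self, other, op):
        result = DictMerge(self)          # start with a copy of self
        for key, value in other.items():
            # Apply op where both sides have the key, otherwise take other's value
            result[key] = op(self[key], value) if key in self else value
        return result

    def __add__(self, other):
        return self._combine(other, operator.add)

    def __mul__(self, other):
        return self._combine(other, operator.mul)

    def __truediv__(self, other):
        return self._combine(other, operator.truediv)

a = DictMerge({"a": 2, "b": 3, "d": 4})
b = DictMerge({"a": 1, "b": 2})
c = DictMerge({"a": 5, "b": 6, "c": 5})

print(a + b + c)
print(a * b)
```

Keys present on only one side are copied through unchanged, which matches the behaviour of the __add__ shown in the answer but may not be what you want for every operator.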
Accepted answer: answered May 5 '12 at 12:38 by georg
Counter answer: answered Jun 20 '15 at 4:08 by Scott (edited Jan 5 '17 at 18:01)
defaultdict answer: answered May 5 '12 at 12:43 by NPE
add a comment |
add a comment |
up vote
9
down vote
Additional notes based on the answers of georg, NPE and Scott.
I was trying to perform this action on collections of 2 or more dictionaries and was interested in seeing the time it took for each. Because I wanted to do this on any number of dictionaries, I had to change some of the answers a bit. If anyone has better suggestions for them, feel free to edit.
Here's my test method. I've updated it recently to include tests with MUCH larger dictionaries:
Firstly I used the following data:
import random
x = {'xy1': 1, 'xy2': 2, 'xyz': 3, 'only_x': 100}
y = {'xy1': 10, 'xy2': 20, 'xyz': 30, 'only_y': 200}
z = {'xyz': 300, 'only_z': 300}
small_tests = [x, y, z]
# 200,000 random 8 letter keys
keys = [''.join(random.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(8)) for _ in range(200000)]
a, b, c = {}, {}, {}
# 50/50 chance of a value being assigned to each dictionary, some keys will be missed but meh
for key in keys:
if random.getrandbits(1):
a[key] = random.randint(0, 1000)
if random.getrandbits(1):
b[key] = random.randint(0, 1000)
if random.getrandbits(1):
c[key] = random.randint(0, 1000)
large_tests = [a, b, c]
print("a:", len(a), "b:", len(b), "c:", len(c))
#: a: 100069 b: 100385 c: 99989
Now each of the methods:
from collections import defaultdict, Counter
def georg_method(tests):
    return {k: sum(t.get(k, 0) for t in tests) for k in set.union(*[set(t) for t in tests])}
def georg_method_nosum(tests):
    # If you know you will have exactly 3 dicts
    return {k: tests[0].get(k, 0) + tests[1].get(k, 0) + tests[2].get(k, 0) for k in set.union(*[set(t) for t in tests])}
def npe_method(tests):
    ret = defaultdict(int)
    for d in tests:
        for k, v in d.items():
            ret[k] += v
    return dict(ret)
# Note: There is a bug with scott's method. See below for details.
def scott_method(tests):
    return dict(sum((Counter(t) for t in tests), Counter()))
def scott_method_nosum(tests):
    # If you know you will have exactly 3 dicts
    return dict(Counter(tests[0]) + Counter(tests[1]) + Counter(tests[2]))
methods = {"georg_method": georg_method, "georg_method_nosum": georg_method_nosum,
           "npe_method": npe_method,
           "scott_method": scott_method, "scott_method_nosum": scott_method_nosum}
I also wrote a quick function to find whatever differences there were between the results. Unfortunately, that's when I found the problem in Scott's method: if the values for a key total 0 (or less), that key won't be included in the result at all, because adding Counter() objects keeps only positive counts.
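To illustrate the Counter behaviour described above, here is a minimal sketch with made-up data. It also shows Counter.update() as one possible workaround, since update() adds counts in place without discarding zero or negative totals:

```python
from collections import Counter

# Hypothetical data: 'b' cancels out to 0 across the two dicts.
d1 = {'a': 1, 'b': 5}
d2 = {'a': 2, 'b': -5}

# Counter addition drops keys whose total is not positive.
added = dict(Counter(d1) + Counter(d2))
print(added)  # {'a': 3}

# Counter.update() adds counts in place and keeps zero/negative totals.
total = Counter()
for d in (d1, d2):
    total.update(d)
merged = dict(total)
print(merged)  # {'a': 3, 'b': 0}
```

Whether dropping such keys is a bug or a feature depends on the data; for plain "merge and sum" semantics the second form matches the other methods.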
Finally, the results:
Results: Small Tests
for name, method in methods.items():
    print("Method:", name)
    %timeit -n10000 method(small_tests)
#: Method: npe_method
#: 10000 loops, best of 3: 5.16 µs per loop
#: Method: georg_method_nosum
#: 10000 loops, best of 3: 8.11 µs per loop
#: Method: georg_method
#: 10000 loops, best of 3: 11.8 µs per loop
#: Method: scott_method_nosum
#: 10000 loops, best of 3: 42.4 µs per loop
#: Method: scott_method
#: 10000 loops, best of 3: 65.3 µs per loop
Results: Large Tests
Naturally, I couldn't run anywhere near as many loops:
for name, method in methods.items():
    print("Method:", name)
    %timeit -n10 method(large_tests)
#: Method: npe_method
#: 10 loops, best of 3: 227 ms per loop
#: Method: georg_method_nosum
#: 10 loops, best of 3: 327 ms per loop
#: Method: georg_method
#: 10 loops, best of 3: 455 ms per loop
#: Method: scott_method_nosum
#: 10 loops, best of 3: 510 ms per loop
#: Method: scott_method
#: 10 loops, best of 3: 600 ms per loop
Conclusion
╔═══════════════════════════╦═══════╦═════════════════════════════╗
║                           ║       ║   Best of 3 Time Per Loop   ║
║         Algorithm         ║  By   ╠══════════════╦══════════════╣
║                           ║       ║ small_tests  ║ large_tests  ║
╠═══════════════════════════╬═══════╬══════════════╬══════════════╣
║ defaultdict sum           ║ NPE   ║      5.16 µs ║   227,000 µs ║
║ set unions without sum()  ║ georg ║      8.11 µs ║   327,000 µs ║
║ set unions with sum()     ║       ║      11.8 µs ║   455,000 µs ║
║ Counter() without sum()   ║ Scott ║      42.4 µs ║   510,000 µs ║
║ Counter() with sum()      ║       ║      65.3 µs ║   600,000 µs ║
╚═══════════════════════════╩═══════╩══════════════╩══════════════╝
Important. YMMV.
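Note that %timeit is an IPython magic, so the timing loops above will not run in a plain Python script. A rough equivalent using the standard timeit module might look like the sketch below. The helper name time_methods and the reduced set of methods are only illustrative, and absolute numbers will differ from machine to machine:

```python
import timeit
from collections import defaultdict


def npe_method(tests):
    # Same defaultdict approach as above, repeated so the sketch is self-contained.
    ret = defaultdict(int)
    for d in tests:
        for k, v in d.items():
            ret[k] += v
    return dict(ret)


def georg_method(tests):
    # Union of all keys, then sum each key's values across the dicts.
    keys = set().union(*tests)
    return {k: sum(t.get(k, 0) for t in tests) for k in keys}


def time_methods(methods, tests, number=1000, repeat=3):
    """Return {name: best seconds per call} for each method (hypothetical helper)."""
    results = {}
    for name, method in methods.items():
        timings = timeit.repeat(lambda: method(tests), number=number, repeat=repeat)
        results[name] = min(timings) / number
    return results


small_tests = [
    {'xy1': 1, 'xy2': 2, 'xyz': 3, 'only_x': 100},
    {'xy1': 10, 'xy2': 20, 'xyz': 30, 'only_y': 200},
    {'xyz': 300, 'only_z': 300},
]
methods = {"npe_method": npe_method, "georg_method": georg_method}
results = time_methods(methods, small_tests)
for name, seconds in sorted(results.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds * 1e6:.2f} µs per call")
```

Taking the minimum of the repeats mirrors the "best of 3" figure that %timeit reported in the output above.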
edited May 23 '17 at 12:18
Community♦
answered Feb 28 '16 at 23:47
SCB
Another option is to use a reduce function. This lets you sum-merge an arbitrary collection of dictionaries:
from functools import reduce
collection = [
    {'a': 1, 'b': 1},
    {'a': 2, 'b': 2},
    {'a': 3, 'b': 3},
    {'a': 4, 'b': 4, 'c': 1},
    {'a': 5, 'b': 5, 'c': 1},
    {'a': 6, 'b': 6, 'c': 1},
    {'a': 7, 'b': 7},
    {'a': 8, 'b': 8},
    {'a': 9, 'b': 9},
]
def reducer(accumulator, element):
    for key, value in element.items():
        accumulator[key] = accumulator.get(key, 0) + value
    return accumulator
total = reduce(reducer, collection, {})
assert total['a'] == sum(d.get('a', 0) for d in collection)
assert total['b'] == sum(d.get('b', 0) for d in collection)
assert total['c'] == sum(d.get('c', 0) for d in collection)
print(total)
Execution:
{'a': 45, 'b': 45, 'c': 3}
Advantages:
- Simple, clear, Pythonic.
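- Schema-less, as long as all values are summable.
- O(n) time in the total number of key/value pairs; extra memory grows only with the number of distinct keys in the result, not with the number of dictionaries.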
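One thing to watch with the reducer above: it mutates and returns its accumulator, so whatever dictionary is passed as the initial value gets modified. A short sketch with made-up data (the names base and extra are only for illustration) shows the difference between passing a copy and passing an existing dictionary:

```python
from functools import reduce


def reducer(accumulator, element):
    # Same reducer as above: adds each value into the accumulator in place.
    for key, value in element.items():
        accumulator[key] = accumulator.get(key, 0) + value
    return accumulator


base = {'a': 1}
extra = [{'a': 2, 'b': 3}, {'b': 4}]

# Safe: a copy is used as the initial value, so base is left untouched.
total = reduce(reducer, extra, dict(base))
print(total)  # {'a': 3, 'b': 7}
print(base)   # {'a': 1}

# Risky: base itself is the accumulator and is modified in place.
same = reduce(reducer, extra, base)
print(same is base)  # True
print(base)          # {'a': 3, 'b': 7}
```

Passing a fresh {} as the initial value, as in the answer, avoids the problem entirely.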
answered Sep 9 '17 at 7:59
Havok
I suspect you're looking for dict's update method:
>>> d1 = {1:2,3:4}
>>> d2 = {5:6,7:8}
>>> d1.update(d2)
>>> d1
{1: 2, 3: 4, 5: 6, 7: 8}
answered May 5 '12 at 11:50
zigg
I don't see how you can suspect that when the question does not say anything about merge behavior. update on a dictionary will overwrite values when keys are identical; maybe he's summing unique occurrences of a hash in which case using update is destructive.
– JosefAssad
May 5 '12 at 11:55
Well, I have already tried it like that, but the results don't sum.
– badc0re
May 5 '12 at 11:57
@JosefAssad You are right.
– badc0re
May 5 '12 at 12:02
I took "merge" in the question to mean the same as update. "sum" (which I assume means one ends up with duplicate keys) is something you can't do with a dict. A list of tuples, e.g. [(1,2),(3,4)], would be a start for this. @DameJovanoski: you need to edit your question to explain what you really want to accomplish. My bad for guessing.
– zigg
May 5 '12 at 12:03
I am sorry for the mix-up, I had a bad night yesterday :D
– badc0re
May 5 '12 at 12:13
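The difference discussed in these comments can be seen in a small sketch (the sample data is made up): dict.update() overwrites the value of a key that exists in both dictionaries, whereas the question asks for the values to be added.

```python
d1 = {'x': 1, 'shared': 10}
d2 = {'y': 2, 'shared': 5}

# update() overwrites: 'shared' ends up as 5, the value from d2.
overwritten = dict(d1)
overwritten.update(d2)
print(overwritten)  # {'x': 1, 'shared': 5, 'y': 2}

# Summing instead: start from a copy of d1 and add each value from d2.
summed = dict(d1)
for key, value in d2.items():
    summed[key] = summed.get(key, 0) + value
print(summed)  # {'x': 1, 'shared': 15, 'y': 2}
```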
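from functools import reduce  # reduce is not a builtin in Python 3
d1 = {'apples': 2, 'banana': 1}
d2 = {'apples': 3, 'banana': 2}
# Fold each (key, value) pair of d2 into a copy of d1, summing values for shared keys.
merged = reduce(
    lambda d, i: (
        d.update(((i[0], d.get(i[0], 0) + i[1]),)) or d
    ),
    d2.items(),  # iteritems() in Python 2
    d1.copy(),
)
There is also a pretty simple replacement for dict.update():
merged = dict(d1, **d2)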
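Two caveats about dict(d1, **d2) are worth a quick sketch. It overwrites rather than sums, and because d2 is passed as keyword arguments, its keys must be strings in Python 3. The {**d1, **d2} form (Python 3.5+) has no such restriction on keys. The sample dictionaries below are made up:

```python
d1 = {'apples': 2, 'banana': 1}
d2 = {'apples': 3, 'cherry': 4}

# Overwrites 'apples' with the value from d2; nothing is summed.
merged = dict(d1, **d2)
print(merged)  # {'apples': 3, 'banana': 1, 'cherry': 4}

# Non-string keys cannot be passed as keyword arguments.
try:
    dict({1: 2}, **{3: 4})
    failed = False
except TypeError:
    failed = True
print(failed)  # True on Python 3

# Dict unpacking in a literal accepts any hashable keys.
unpacked = {**{1: 2}, **{3: 4}}
print(unpacked)  # {1: 2, 3: 4}
```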
answered Dec 2 '13 at 19:37
renskiy
I liked this tip: merged = dict(d1, **d2)
– arannasousa
Jan 13 '17 at 23:34
If you want to create a new dict, as | would, use:
>>> dict({'a': 1,'c': 2}, **{'c': 1})
{'a': 1, 'c': 1}
answered Jan 22 '16 at 20:33
Bartosz Foder
class dict_merge(dict):
    def __add__(self, other):
        result = dict_merge({})
        for key in self.keys():
            if key in other.keys():
                result[key] = self[key] + other[key]
            else:
                result[key] = self[key]
        for key in other.keys():
            if key in self.keys():
                pass
            else:
                result[key] = other[key]
        return result
a = dict_merge({"a":2, "b":3, "d":4})
b = dict_merge({"a":1, "b":2})
c = dict_merge({"a":5, "b":6, "c":5})
d = dict_merge({"a":8, "b":6, "e":5})
print(a + b + c + d)
# {'a': 16, 'b': 17, 'd': 4, 'c': 5, 'e': 5}
That is operator overloading. Using __add__, we have defined how the + operator works for our dict_merge class, which inherits from the built-in Python dict. You can make it more flexible by defining other operators in the same class in a similar way, e.g. * with __mul__ for multiplying, / with __truediv__ (__div__ in Python 2) for dividing, or even % with __mod__ for modulo, replacing the + in self[key] + other[key] with the corresponding operator, if you ever find yourself needing such merging.
I have only tested this as it is, without the other operators, but I don't foresee a problem with them. Just learn by trying.
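As a rough sketch of that idea, the per-key logic can live in one helper that takes the operator as a function, so + and * share the same code. The class name MergeDict and the helper _combine are hypothetical, and the choice to copy unmatched keys over unchanged simply mirrors the __add__ above:

```python
import operator


class MergeDict(dict):
    """dict subclass whose + and * combine values key by key (illustrative only)."""

    def _combine(self, other, op):
        result = MergeDict(self)
        for key, value in other.items():
            # Apply op when both sides have the key; otherwise copy the value over.
            result[key] = op(self[key], value) if key in self else value
        return result

    def __add__(self, other):
        return self._combine(other, operator.add)

    def __mul__(self, other):
        return self._combine(other, operator.mul)


a = MergeDict({"a": 2, "b": 3, "d": 4})
b = MergeDict({"a": 1, "b": 2})
c = MergeDict({"a": 5, "b": 6, "c": 5})

print(a + b + c)  # {'a': 8, 'b': 11, 'd': 4, 'c': 5}
print(a * b)      # {'a': 2, 'b': 6, 'd': 4}
```

Other operators can be added the same way by passing the matching function from the operator module, for example operator.truediv for /.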
answered Apr 25 '17 at 3:01
John Mutuma
12018
12018