Merge and sum of two dictionaries

up vote
36
down vote

favorite

I have the dictionary below, and I want to add it to another dictionary whose keys are not necessarily distinct, merging the results. Is there a built-in function for this, or will I need to write my own?

{

  '6d6e7bf221ae24e07ab90bba4452267b05db7824cd3fd1ea94b2c9a8': 6,

  '7c4a462a6ed4a3070b6d78d97c90ac230330603d24a58cafa79caf42': 7,

  '9c37bdc9f4750dd7ee2b558d6c06400c921f4d74aabd02ed5b4ddb38': 9,

  'd3abb28d5776aef6b728920b5d7ff86fa3a71521a06538d2ad59375a': 15,

  '2ca9e1f9cbcd76a5ce1772f9b59995fd32cbcffa8a3b01b5c9c8afc2': 11

}

The number of elements in the dictionary is also unknown.

Where the merge considers two identical keys, the values of these keys should be summed instead of overwritten.

edited Jun 5 at 11:57

Clemens Tolboom

766617

asked May 5 '12 at 11:45

badc0re

1,40342039

11

Please get your terminology straight; that's a dict, not a list. Also, what kind of result do you expect, and what have you tried?
– Fred Foo
May 5 '12 at 11:47

1

You might want to edit your question and provide better (and correct) information, or this question will likely be closed.
– Rik Poggi
May 5 '12 at 12:05


python dictionary


9 Answers

up vote
105
down vote

accepted

You didn't say how exactly you want to merge, so take your pick:

x = {'both1':1, 'both2':2, 'only_x': 100 }

y = {'both1':10, 'both2': 20, 'only_y':200 }



print({ k: x.get(k, 0) + y.get(k, 0) for k in set(x) })           # keys of x only

print({ k: x.get(k, 0) + y.get(k, 0) for k in set(x) & set(y) })  # keys present in both

print({ k: x.get(k, 0) + y.get(k, 0) for k in set(x) | set(y) })  # keys present in either

Results:

{'both2': 22, 'only_x': 100, 'both1': 11}

{'both2': 22, 'both1': 11}

{'only_y': 200, 'both2': 22, 'both1': 11, 'only_x': 100}

answered May 5 '12 at 12:38

georg

143k33193290

How do we implement this if we have n dictionaries?
– Tony Mathew
Sep 23 at 18:57

I liked this approach. However, in my case I am trying to take the difference, i.e. x - y, for the same dictionaries: diff = { k: x.get(k, 0) - y.get(k, 0) for k in set(x) | set(y) }; print(diff) gives me {'only_y': -200, 'both2': -18, 'only_x': 100, 'both1': -9}. I am concerned about the only_y value, which changed to -200 instead of staying 200. Even though you already answered the actual question, could you suggest a better way to handle the values of keys that appear in only one dictionary?
– Panchu
Sep 29 at 22:29

@Panchu: how about sub = lambda a, b: a if b is None else b if a is None else a - b, and then {k: sub(x.get(k), y.get(k)) for ...}, etc.
– georg
Sep 30 at 0:51
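The lambda suggested in the comment above can be spelled out as a small named function. This is only a minimal sketch of that idea, reusing the x and y from the answer; the name sub comes from the comment, everything else is illustrative.

```python
def sub(a, b):
    """Return a - b, or whichever value exists when the other one is missing (None)."""
    if b is None:
        return a
    if a is None:
        return b
    return a - b


x = {'both1': 1, 'both2': 2, 'only_x': 100}
y = {'both1': 10, 'both2': 20, 'only_y': 200}

# dict.get(k) returns None for a missing key, which sub() treats as "not present".
diff = {k: sub(x.get(k), y.get(k)) for k in set(x) | set(y)}
print(diff)
```

With this helper, keys found in only one dictionary keep their original value, while shared keys hold x[k] - y[k].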


up vote
24
down vote

You can perform +, -, &, and | (intersection and union) on collections.Counter().

So we can do the following (Note: only positive count values will remain in the dictionary):

from collections import Counter



x = {'both1':1, 'both2':2, 'only_x': 100 }

y = {'both1':10, 'both2': 20, 'only_y':200 }



z = dict(Counter(x)+Counter(y))



print(z) # {'both2': 22, 'only_x': 100, 'both1': 11, 'only_y': 200}

To handle cases where the result may be zero or negative, use Counter.update() for addition and Counter.subtract() for subtraction; both modify the Counter in place and keep every key:

x = {'both1':0, 'both2':2, 'only_x': 100 }

y = {'both1':0, 'both2': -20, 'only_y':200 }

xx = Counter(x)

yy = Counter(y)

xx.update(yy)

dict(xx) # {'both2': -18, 'only_x': 100, 'both1': 0, 'only_y': 200}
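Counter.subtract() is mentioned above but not shown. Here is a small sketch comparing the three operations side by side; the dictionaries are made up for illustration and are not the asker's data.

```python
from collections import Counter

x = {'both1': 0, 'both2': 2, 'only_x': 100}
y = {'both1': 0, 'both2': 20, 'only_y': 200}

# '+' builds a new Counter and drops every key whose total is zero or negative.
added = Counter(x) + Counter(y)

# update() adds the counts in place and keeps zero totals.
updated = Counter(x)
updated.update(y)

# subtract() subtracts in place and keeps zero and negative totals.
diff = Counter(x)
diff.subtract(y)

print(dict(added))
print(dict(updated))
print(dict(diff))
```

Note that 'both1' is missing from added but present (with value 0) in both updated and diff.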

edited Jan 5 '17 at 18:01

answered Jun 20 '15 at 4:08

Scott

2,90221735

1

What if 'both1': 0 in x and y and I want to have 'both1': 0 in z? With this solution there would be no 'both1' key in z.
– sergej
Jan 5 '17 at 9:16

@sergej That's interesting. Looking at the collections.Counter() link it appears that '+' only keeps positive value counts (> 0). However x.update(y) (where x,y are of type Counter) adds both objects to include 0 and negative value counts. I'll add this to the answer.
– Scott
Jan 5 '17 at 17:48

This is the most pythonic answer.
– BenP
Oct 16 at 6:59


up vote
17
down vote

You could use defaultdict for this:

from collections import defaultdict



def dsum(*dicts):

    ret = defaultdict(int)

    for d in dicts:

        for k, v in d.items():

            ret[k] += v

    return dict(ret)



x = {'both1':1, 'both2':2, 'only_x': 100 }

y = {'both1':10, 'both2': 20, 'only_y':200 }



print(dsum(x, y))

This produces

{'both1': 11, 'both2': 22, 'only_x': 100, 'only_y': 200}

answered May 5 '12 at 12:43

NPE

344k60734866


up vote
9
down vote

Additional notes based on the answers of georg, NPE and Scott.

I was trying to perform this action on collections of 2 or more dictionaries and was interested in seeing the time it took for each. Because I wanted to do this on any number of dictionaries, I had to change some of the answers a bit. If anyone has better suggestions for them, feel free to edit.

Here's my test method. I've updated it recently to include tests with MUCH larger dictionaries:

Firstly I used the following data:

import random



x = {'xy1': 1, 'xy2': 2, 'xyz': 3, 'only_x': 100}

y = {'xy1': 10, 'xy2': 20, 'xyz': 30, 'only_y': 200}

z = {'xyz': 300, 'only_z': 300}



small_tests = [x, y, z]



# 200,000 random 8 letter keys

keys = [''.join(random.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(8)) for _ in range(200000)]



a, b, c = {}, {}, {}



# 50/50 chance of a value being assigned to each dictionary, some keys will be missed but meh

for key in keys:

    if random.getrandbits(1):

        a[key] = random.randint(0, 1000)

    if random.getrandbits(1):

        b[key] = random.randint(0, 1000)

    if random.getrandbits(1):

        c[key] = random.randint(0, 1000)



large_tests = [a, b, c]



print("a:", len(a), "b:", len(b), "c:", len(c))

#: a: 100069 b: 100385 c: 99989

Now each of the methods:

from collections import defaultdict, Counter



def georg_method(tests):

    return {k: sum(t.get(k, 0) for t in tests) for k in set.union(*[set(t) for t in tests])}



def georg_method_nosum(tests):

    # If you know you will have exactly 3 dicts

    return {k: tests[0].get(k, 0) + tests[1].get(k, 0) + tests[2].get(k, 0) for k in set.union(*[set(t) for t in tests])}



def npe_method(tests):

    ret = defaultdict(int)

    for d in tests:

        for k, v in d.items():

            ret[k] += v

    return dict(ret)



# Note: There is a bug with scott's method. See below for details.

def scott_method(tests):

    return dict(sum((Counter(t) for t in tests), Counter()))



def scott_method_nosum(tests):

    # If you know you will have exactly 3 dicts

    return dict(Counter(tests[0]) + Counter(tests[1]) + Counter(tests[2]))



methods = {"georg_method": georg_method, "georg_method_nosum": georg_method_nosum,

           "npe_method": npe_method,

           "scott_method": scott_method, "scott_method_nosum": scott_method_nosum}

I also wrote a quick function to find whatever differences there were between the results. Unfortunately, that's when I found the problem in Scott's method: if the values for a key total 0 (or less), that key won't be included at all, because of how Counter() behaves when adding.
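The comparison function itself isn't shown in this answer, so the following is only a guess at what such a check could look like. counter_sum mirrors scott_method; counter_update and differing_keys are hypothetical helpers added for illustration.

```python
from collections import Counter


def counter_sum(dicts):
    # Same idea as scott_method: '+' discards totals that are zero or negative.
    return dict(sum((Counter(d) for d in dicts), Counter()))


def counter_update(dicts):
    # Hypothetical variant: update() keeps every key, whatever its total.
    total = Counter()
    for d in dicts:
        total.update(d)
    return dict(total)


def differing_keys(left, right):
    # Keys whose values differ, or that appear in only one of the two results.
    missing = object()
    return {k for k in set(left) | set(right)
            if left.get(k, missing) != right.get(k, missing)}


tests = [{'a': 1, 'b': -5}, {'a': -1, 'b': 2, 'c': 3}]
print(sorted(differing_keys(counter_sum(tests), counter_update(tests))))
```

In this example 'a' totals 0 and vanishes from the '+' version, and 'b' ends up wrong as well, because the negative intermediate value is dropped before the second dictionary is added.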

Finally, the results:

Results: Small Tests

for name, method in methods.items():

    print("Method:", name)

    %timeit -n10000 method(small_tests)

#: Method: npe_method

#: 10000 loops, best of 3: 5.16 µs per loop

#: Method: georg_method_nosum

#: 10000 loops, best of 3: 8.11 µs per loop

#: Method: georg_method

#: 10000 loops, best of 3: 11.8 µs per loop

#: Method: scott_method_nosum

#: 10000 loops, best of 3: 42.4 µs per loop

#: Method: scott_method

#: 10000 loops, best of 3: 65.3 µs per loop

Results: Large Tests

Naturally, couldn't run anywhere near as many loops

for name, method in methods.items():

    print("Method:", name)

    %timeit -n10 method(large_tests)

#: Method: npe_method

#: 10 loops, best of 3: 227 ms per loop

#: Method: georg_method_nosum

#: 10 loops, best of 3: 327 ms per loop

#: Method: georg_method

#: 10 loops, best of 3: 455 ms per loop

#: Method: scott_method_nosum

#: 10 loops, best of 3: 510 ms per loop

#: Method: scott_method

#: 10 loops, best of 3: 600 ms per loop

Conclusion

╔═══════════════════════════╦═══════╦═════════════════════════════╗
║                           ║       ║   Best of 3 Time Per Loop   ║
║         Algorithm         ║  By   ╠══════════════╦══════════════╣
║                           ║       ║  small_tests ║  large_tests ║
╠═══════════════════════════╬═══════╬══════════════╬══════════════╣
║ defaultdict sum           ║ NPE   ║      5.16 µs ║   227,000 µs ║
║ set unions without sum()  ║ georg ║      8.11 µs ║   327,000 µs ║
║ set unions with sum()     ║       ║      11.8 µs ║   455,000 µs ║
║ Counter() without sum()   ║ Scott ║      42.4 µs ║   510,000 µs ║
║ Counter() with sum()      ║       ║      65.3 µs ║   600,000 µs ║
╚═══════════════════════════╩═══════╩══════════════╩══════════════╝

Important. YMMV.
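%timeit is an IPython magic, so the timing loops above only run inside IPython or Jupyter. If you want to repeat the measurement from a plain script, something like the following sketch with the standard timeit module should be roughly equivalent. It times only the defaultdict method on the small data set, and repeat=3 is my assumption for what "best of 3" meant.

```python
import timeit
from collections import defaultdict


def npe_method(tests):
    ret = defaultdict(int)
    for d in tests:
        for k, v in d.items():
            ret[k] += v
    return dict(ret)


small_tests = [
    {'xy1': 1, 'xy2': 2, 'xyz': 3, 'only_x': 100},
    {'xy1': 10, 'xy2': 20, 'xyz': 30, 'only_y': 200},
    {'xyz': 300, 'only_z': 300},
]

loops = 10000
# Three runs of 10000 calls each; the fastest run is the least disturbed one.
times = timeit.repeat(lambda: npe_method(small_tests), repeat=3, number=loops)
best_us = min(times) / loops * 1e6
print("npe_method: %d loops, best of 3: %.2f µs per loop" % (loops, best_us))
```

The absolute numbers will differ from the table depending on machine and Python version; only the relative ordering of the methods is worth comparing.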

edited May 23 '17 at 12:18

Community♦

answered Feb 28 '16 at 23:47

SCB

3,62512033


up vote
2
down vote

Another option is to use reduce. This lets you sum-merge an arbitrary collection of dictionaries:

from functools import reduce



collection = [

    {'a': 1, 'b': 1},

    {'a': 2, 'b': 2},

    {'a': 3, 'b': 3},

    {'a': 4, 'b': 4, 'c': 1},

    {'a': 5, 'b': 5, 'c': 1},

    {'a': 6, 'b': 6, 'c': 1},

    {'a': 7, 'b': 7},

    {'a': 8, 'b': 8},

    {'a': 9, 'b': 9},

]





def reducer(accumulator, element):

    for key, value in element.items():

        accumulator[key] = accumulator.get(key, 0) + value

    return accumulator





total = reduce(reducer, collection, {})





assert total['a'] == sum(d.get('a', 0) for d in collection)

assert total['b'] == sum(d.get('b', 0) for d in collection)

assert total['c'] == sum(d.get('c', 0) for d in collection)



print(total)

Execution:

{'a': 45, 'b': 45, 'c': 3}

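Advantages:

Simple, clear, Pythonic.

Schema-less, as long as all values are "summable".

O(n) time in the total number of items. No intermediate dictionaries are created; the only extra memory is the accumulator, which grows with the number of distinct keys.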

answered Sep 9 '17 at 7:59

Havok

3,57912028


up vote
1
down vote

I suspect you're looking for dict's update method:

>>> d1 = {1:2,3:4}

>>> d2 = {5:6,7:8}

>>> d1.update(d2)

>>> d1

{1: 2, 3: 4, 5: 6, 7: 8}

answered May 5 '12 at 11:50

zigg

12.4k42749

I don't see how you can suspect that when the question does not say anything about merge behavior. update on a dictionary will overwrite values when keys are identical; maybe he's summing unique occurrences of a hash in which case using update is destructive.
– JosefAssad
May 5 '12 at 11:55
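To make the point in the comment above concrete, here is a short sketch (with made-up dictionaries) that contrasts update(), which overwrites, with a loop that sums values for shared keys.

```python
d1 = {'a': 1, 'b': 2}
d2 = {'b': 10, 'c': 20}

overwritten = dict(d1)   # work on a copy so d1 itself is left untouched
overwritten.update(d2)   # 'b' becomes 10; the old value 2 is lost

summed = dict(d1)
for key, value in d2.items():
    summed[key] = summed.get(key, 0) + value  # 'b' becomes 2 + 10

print(overwritten)
print(summed)
```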

1

Well, I have already tried that, but the results don't sum.
– badc0re
May 5 '12 at 11:57

@JosefAssad You are right.
– badc0re
May 5 '12 at 12:02

I took "merge" in the question to mean the same as update. "sum"—which I assume means one ends up with duplicate keys—is something you can't do with a dict. A list of tuples e.g. [(1,2),(3,4)] would be a start for this. @DameJovanoski: you need to edit your question to explain what you really want to accomplish. My bad for guessing.
– zigg
May 5 '12 at 12:03

I am sorry for the mess up, i had a bad night yesterday :D
– badc0re
May 5 '12 at 12:13


up vote
1
down vote

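from functools import reduce  # reduce is not a builtin in Python 3

d1 = {'apples': 2, 'banana': 1}
d2 = {'apples': 3, 'banana': 2}

# Fold each (key, value) pair of d2 into a copy of d1, summing on shared keys.
# d.update(...) returns None, so "or d" hands the accumulator back to reduce.
merged = reduce(
    lambda d, i: (
        d.update(((i[0], d.get(i[0], 0) + i[1]),)) or d
    ),
    d2.items(),  # originally d2.iteritems(), which exists only in Python 2
    d1.copy(),
)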

There is also a pretty simple replacement for dict.update(), although it overwrites values for shared keys instead of summing them:

merged = dict(d1, **d2)
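Two caveats about the dict(d1, **d2) trick, shown as a small sketch: it overwrites rather than sums, and because d2 is passed as keyword arguments, CPython 3 rejects non-string keys. merge_overwrite is a hypothetical helper name, not something from the answer.

```python
d1 = {'apples': 2, 'banana': 1}
d2 = {'apples': 3, 'banana': 2}

merged = dict(d1, **d2)
print(merged)  # values from d2 replace those of d1; nothing is summed

# Keyword arguments must be strings, so integer keys fail in Python 3.
try:
    dict({1: 2}, **{3: 4})
    keyword_error = None
except TypeError as exc:
    keyword_error = exc
print(repr(keyword_error))


def merge_overwrite(left, right):
    """Copy-and-update merge that works for any hashable keys."""
    result = dict(left)
    result.update(right)
    return result


print(merge_overwrite({1: 2}, {3: 4}))
```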

answered Dec 2 '13 at 19:37

renskiy

82898

I liked this tip: merged = dict(d1, **d2)
– arannasousa
Jan 13 '17 at 23:34


up vote
0
down vote

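If you want to create a new dict, in the way | does for sets, use the following (for shared keys the value from the second dict wins; values are not summed):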

>>> dict({'a': 1,'c': 2}, **{'c': 1})

{'a': 1, 'c': 1}

answered Jan 22 '16 at 20:33

Bartosz Foder


up vote
0
down vote

class dict_merge(dict):
    def __add__(self, other):
        result = dict_merge({})
        # Keys of self: sum with other's value when the key is shared.
        for key in self.keys():
            if key in other.keys():
                result[key] = self[key] + other[key]
            else:
                result[key] = self[key]
        # Keys that appear only in other are copied as they are.
        for key in other.keys():
            if key in self.keys():
                pass
            else:
                result[key] = other[key]
        return result


a = dict_merge({"a":2, "b":3, "d":4})
b = dict_merge({"a":1, "b":2})
c = dict_merge({"a":5, "b":6, "c":5})
d = dict_merge({"a":8, "b":6, "e":5})

print((a + b + c +d))
# {'a': 16, 'b': 17, 'd': 4, 'c': 5, 'e': 5}

That is operator overloading. By defining __add__, we specify how the + operator works for our dict_merge, which inherits from the built-in Python dict. You can make it more flexible by defining other operators in the same class in a similar way, e.g. * with __mul__ for multiplying, / with __truediv__ (__div__ in Python 2) for dividing, or even % with __mod__ for modulo, replacing the + in self[key] + other[key] with the corresponding operator, if you ever find yourself needing such merging.
I have only tested this as it is, without other operators, but I don't foresee a problem with them. Just learn by trying.
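If you would rather not write one dunder method per operator, the same idea can be sketched as a plain function that takes the operation as an argument. merge_with is a hypothetical name of mine; the functions in the standard operator module (add, mul and so on) are real.

```python
import operator


def merge_with(op, *dicts):
    """Combine dicts key by key, applying op to values that share a key."""
    result = {}
    for d in dicts:
        for key, value in d.items():
            # The first value seen for a key is stored; later ones are combined.
            result[key] = op(result[key], value) if key in result else value
    return result


a = {"a": 2, "b": 3, "d": 4}
b = {"a": 1, "b": 2}

print(merge_with(operator.add, a, b))
print(merge_with(operator.mul, a, b))
```

Keys that appear in only one dictionary are copied unchanged, just as in the class above.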

answered Apr 25 '17 at 3:01

John Mutuma

12018


Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f10461531%2fmerge-and-sum-of-two-dictionaries%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

9 Answers
9

active

oldest

votes

9 Answers
9

active

oldest

votes

up vote
105
down vote

accepted

You didn't say how exactly you want to merge, so take your pick:

x = {'both1':1, 'both2':2, 'only_x': 100 }

y = {'both1':10, 'both2': 20, 'only_y':200 }



print { k: x.get(k, 0) + y.get(k, 0) for k in set(x) }

print { k: x.get(k, 0) + y.get(k, 0) for k in set(x) & set(y) }

print { k: x.get(k, 0) + y.get(k, 0) for k in set(x) | set(y) }

Results:

{'both2': 22, 'only_x': 100, 'both1': 11}

{'both2': 22, 'both1': 11}

{'only_y': 200, 'both2': 22, 'both1': 11, 'only_x': 100}

answered May 5 '12 at 12:38

georg

143k33193290

how do we implement this if we have n number of dictionaries ?
– Tony Mathew
Sep 23 at 18:57

I liked this approach. However in my case, for the same above dictionary values, I am trying to take the difference. i.e x-y. diff= { k: x.get(k, 0) - y.get(k, 0) for k in set(x) | set(y) } print(diff) And this gives me : {'only_y': -200, 'both2': -18, 'only_x': 100, 'both1': -9} I am concerned about the only_y value above, as it changed to negative 200 instead of retaining 200. Even though you already answered the actual question, could you please suggest the better way of catching the negative values for the keys that are unique?
– Panchu
Sep 29 at 22:29

@Panchu: how about sub = lambda a, b: a if b is None else b if a is None else a -b and then {k: sub(x.get(k), y.get(k)) for ... etc
– georg
Sep 30 at 0:51

add a comment |

up vote
105
down vote

accepted

You didn't say how exactly you want to merge, so take your pick:

x = {'both1':1, 'both2':2, 'only_x': 100 }

y = {'both1':10, 'both2': 20, 'only_y':200 }



print { k: x.get(k, 0) + y.get(k, 0) for k in set(x) }

print { k: x.get(k, 0) + y.get(k, 0) for k in set(x) & set(y) }

print { k: x.get(k, 0) + y.get(k, 0) for k in set(x) | set(y) }

Results:

{'both2': 22, 'only_x': 100, 'both1': 11}

{'both2': 22, 'both1': 11}

{'only_y': 200, 'both2': 22, 'both1': 11, 'only_x': 100}

answered May 5 '12 at 12:38

georg

143k33193290

how do we implement this if we have n number of dictionaries ?
– Tony Mathew
Sep 23 at 18:57

I liked this approach. However in my case, for the same above dictionary values, I am trying to take the difference. i.e x-y. diff= { k: x.get(k, 0) - y.get(k, 0) for k in set(x) | set(y) } print(diff) And this gives me : {'only_y': -200, 'both2': -18, 'only_x': 100, 'both1': -9} I am concerned about the only_y value above, as it changed to negative 200 instead of retaining 200. Even though you already answered the actual question, could you please suggest the better way of catching the negative values for the keys that are unique?
– Panchu
Sep 29 at 22:29

@Panchu: how about sub = lambda a, b: a if b is None else b if a is None else a -b and then {k: sub(x.get(k), y.get(k)) for ... etc
– georg
Sep 30 at 0:51

add a comment |

up vote
105
down vote

accepted

You didn't say how exactly you want to merge, so take your pick:

x = {'both1':1, 'both2':2, 'only_x': 100 }

y = {'both1':10, 'both2': 20, 'only_y':200 }



print { k: x.get(k, 0) + y.get(k, 0) for k in set(x) }

print { k: x.get(k, 0) + y.get(k, 0) for k in set(x) & set(y) }

print { k: x.get(k, 0) + y.get(k, 0) for k in set(x) | set(y) }

Results:

{'both2': 22, 'only_x': 100, 'both1': 11}

{'both2': 22, 'both1': 11}

{'only_y': 200, 'both2': 22, 'both1': 11, 'only_x': 100}

answered May 5 '12 at 12:38

georg

143k33193290

You didn't say how exactly you want to merge, so take your pick:

x = {'both1':1, 'both2':2, 'only_x': 100 }

y = {'both1':10, 'both2': 20, 'only_y':200 }



print { k: x.get(k, 0) + y.get(k, 0) for k in set(x) }

print { k: x.get(k, 0) + y.get(k, 0) for k in set(x) & set(y) }

print { k: x.get(k, 0) + y.get(k, 0) for k in set(x) | set(y) }

Results:

{'both2': 22, 'only_x': 100, 'both1': 11}

{'both2': 22, 'both1': 11}

{'only_y': 200, 'both2': 22, 'both1': 11, 'only_x': 100}

answered May 5 '12 at 12:38

georg

143k33193290

answered May 5 '12 at 12:38

georg

143k33193290

answered May 5 '12 at 12:38

georg

143k33193290

answered May 5 '12 at 12:38

georg

143k33193290

how do we implement this if we have n number of dictionaries ?
– Tony Mathew
Sep 23 at 18:57

I liked this approach. However in my case, for the same above dictionary values, I am trying to take the difference. i.e x-y. diff= { k: x.get(k, 0) - y.get(k, 0) for k in set(x) | set(y) } print(diff) And this gives me : {'only_y': -200, 'both2': -18, 'only_x': 100, 'both1': -9} I am concerned about the only_y value above, as it changed to negative 200 instead of retaining 200. Even though you already answered the actual question, could you please suggest the better way of catching the negative values for the keys that are unique?
– Panchu
Sep 29 at 22:29

@Panchu: how about sub = lambda a, b: a if b is None else b if a is None else a -b and then {k: sub(x.get(k), y.get(k)) for ... etc
– georg
Sep 30 at 0:51

add a comment |

how do we implement this if we have n number of dictionaries ?
– Tony Mathew
Sep 23 at 18:57

I liked this approach. However in my case, for the same above dictionary values, I am trying to take the difference. i.e x-y. diff= { k: x.get(k, 0) - y.get(k, 0) for k in set(x) | set(y) } print(diff) And this gives me : {'only_y': -200, 'both2': -18, 'only_x': 100, 'both1': -9} I am concerned about the only_y value above, as it changed to negative 200 instead of retaining 200. Even though you already answered the actual question, could you please suggest the better way of catching the negative values for the keys that are unique?
– Panchu
Sep 29 at 22:29

@Panchu: how about sub = lambda a, b: a if b is None else b if a is None else a -b and then {k: sub(x.get(k), y.get(k)) for ... etc
– georg
Sep 30 at 0:51

how do we implement this if we have n number of dictionaries ?
– Tony Mathew
Sep 23 at 18:57

I liked this approach. However in my case, for the same above dictionary values, I am trying to take the difference. i.e x-y. diff= { k: x.get(k, 0) - y.get(k, 0) for k in set(x) | set(y) } print(diff) And this gives me : {'only_y': -200, 'both2': -18, 'only_x': 100, 'both1': -9} I am concerned about the only_y value above, as it changed to negative 200 instead of retaining 200. Even though you already answered the actual question, could you please suggest the better way of catching the negative values for the keys that are unique?
– Panchu
Sep 29 at 22:29

@Panchu: how about sub = lambda a, b: a if b is None else b if a is None else a -b and then {k: sub(x.get(k), y.get(k)) for ... etc
– georg
Sep 30 at 0:51

add a comment |

up vote
24
down vote

You can perform +, -, &, and | (intersection and union) on collections.Counter().

So we can do the following (Note: only positive count values will remain in the dictionary):

from collections import Counter



x = {'both1':1, 'both2':2, 'only_x': 100 }

y = {'both1':10, 'both2': 20, 'only_y':200 }



z = dict(Counter(x)+Counter(y))



print(z) # {'both2': 22, 'only_x': 100, 'both1': 11, 'only_y': 200}

To address adding values where the result may be zero or negative use Counter.update() for addition and Counter.subtract() for subtraction:

x = {'both1':0, 'both2':2, 'only_x': 100 }

y = {'both1':0, 'both2': -20, 'only_y':200 }

xx = Counter(x)

yy = Counter(y)

xx.update(yy)

dict(xx) # {'both2': -18, 'only_x': 100, 'both1': 0, 'only_y': 200}

edited Jan 5 '17 at 18:01

answered Jun 20 '15 at 4:08

Scott

2,90221735

1

What if 'both1': 0 in x and y and I want to have 'both1': 0 in z? With this solution there would be no 'both1' key in z.
– sergej
Jan 5 '17 at 9:16

@sergej That's interesting. Looking at the collections.Counter() link it appears that '+' only keeps positive value counts (> 0). However x.update(y) (where x,y are of type Counter) adds both objects to include 0 and negative value counts. I'll add this to the answer.
– Scott
Jan 5 '17 at 17:48

This is the most pythonic answer.
– BenP
Oct 16 at 6:59

add a comment |

up vote
24
down vote

You can perform +, -, &, and | (intersection and union) on collections.Counter().

So we can do the following (Note: only positive count values will remain in the dictionary):

from collections import Counter



x = {'both1':1, 'both2':2, 'only_x': 100 }

y = {'both1':10, 'both2': 20, 'only_y':200 }



z = dict(Counter(x)+Counter(y))



print(z) # {'both2': 22, 'only_x': 100, 'both1': 11, 'only_y': 200}

To address adding values where the result may be zero or negative use Counter.update() for addition and Counter.subtract() for subtraction:

x = {'both1':0, 'both2':2, 'only_x': 100 }

y = {'both1':0, 'both2': -20, 'only_y':200 }

xx = Counter(x)

yy = Counter(y)

xx.update(yy)

dict(xx) # {'both2': -18, 'only_x': 100, 'both1': 0, 'only_y': 200}

edited Jan 5 '17 at 18:01

answered Jun 20 '15 at 4:08

Scott

2,90221735

1

What if 'both1': 0 in x and y and I want to have 'both1': 0 in z? With this solution there would be no 'both1' key in z.
– sergej
Jan 5 '17 at 9:16

@sergej That's interesting. Looking at the collections.Counter() link it appears that '+' only keeps positive value counts (> 0). However x.update(y) (where x,y are of type Counter) adds both objects to include 0 and negative value counts. I'll add this to the answer.
– Scott
Jan 5 '17 at 17:48

This is the most pythonic answer.
– BenP
Oct 16 at 6:59

add a comment |

up vote
24
down vote

You can perform +, -, &, and | (intersection and union) on collections.Counter().

So we can do the following (Note: only positive count values will remain in the dictionary):

from collections import Counter



x = {'both1':1, 'both2':2, 'only_x': 100 }

y = {'both1':10, 'both2': 20, 'only_y':200 }



z = dict(Counter(x)+Counter(y))



print(z) # {'both2': 22, 'only_x': 100, 'both1': 11, 'only_y': 200}

To address adding values where the result may be zero or negative use Counter.update() for addition and Counter.subtract() for subtraction:

x = {'both1':0, 'both2':2, 'only_x': 100 }

y = {'both1':0, 'both2': -20, 'only_y':200 }

xx = Counter(x)

yy = Counter(y)

xx.update(yy)

dict(xx) # {'both2': -18, 'only_x': 100, 'both1': 0, 'only_y': 200}

edited Jan 5 '17 at 18:01

answered Jun 20 '15 at 4:08

Scott

2,90221735

You can perform +, -, &, and | (intersection and union) on collections.Counter().

So we can do the following (Note: only positive count values will remain in the dictionary):

from collections import Counter



x = {'both1':1, 'both2':2, 'only_x': 100 }

y = {'both1':10, 'both2': 20, 'only_y':200 }



z = dict(Counter(x)+Counter(y))



print(z) # {'both2': 22, 'only_x': 100, 'both1': 11, 'only_y': 200}

To address adding values where the result may be zero or negative use Counter.update() for addition and Counter.subtract() for subtraction:

x = {'both1':0, 'both2':2, 'only_x': 100 }

y = {'both1':0, 'both2': -20, 'only_y':200 }

xx = Counter(x)

yy = Counter(y)

xx.update(yy)

dict(xx) # {'both2': -18, 'only_x': 100, 'both1': 0, 'only_y': 200}

edited Jan 5 '17 at 18:01

answered Jun 20 '15 at 4:08

Scott

2,90221735

edited Jan 5 '17 at 18:01

answered Jun 20 '15 at 4:08

Scott

2,90221735

answered Jun 20 '15 at 4:08

Scott

2,90221735

answered Jun 20 '15 at 4:08

Scott

2,90221735

1

What if 'both1': 0 in x and y and I want to have 'both1': 0 in z? With this solution there would be no 'both1' key in z.
– sergej
Jan 5 '17 at 9:16

@sergej That's interesting. Looking at the collections.Counter() link it appears that '+' only keeps positive value counts (> 0). However x.update(y) (where x,y are of type Counter) adds both objects to include 0 and negative value counts. I'll add this to the answer.
– Scott
Jan 5 '17 at 17:48

This is the most pythonic answer.
– BenP
Oct 16 at 6:59

add a comment |

1

What if 'both1': 0 in x and y and I want to have 'both1': 0 in z? With this solution there would be no 'both1' key in z.
– sergej
Jan 5 '17 at 9:16

@sergej That's interesting. Looking at the collections.Counter() link it appears that '+' only keeps positive value counts (> 0). However x.update(y) (where x,y are of type Counter) adds both objects to include 0 and negative value counts. I'll add this to the answer.
– Scott
Jan 5 '17 at 17:48

This is the most pythonic answer.
– BenP
Oct 16 at 6:59

What if 'both1': 0 in x and y and I want to have 'both1': 0 in z? With this solution there would be no 'both1' key in z.
– sergej
Jan 5 '17 at 9:16

@sergej That's interesting. Looking at the collections.Counter() link it appears that '+' only keeps positive value counts (> 0). However x.update(y) (where x,y are of type Counter) adds both objects to include 0 and negative value counts. I'll add this to the answer.
– Scott
Jan 5 '17 at 17:48

This is the most pythonic answer.
– BenP
Oct 16 at 6:59

add a comment |

up vote
17
down vote

You could use defaultdict for this:

from collections import defaultdict



def dsum(*dicts):

    ret = defaultdict(int)

    for d in dicts:

        for k, v in d.items():

            ret[k] += v

    return dict(ret)



x = {'both1':1, 'both2':2, 'only_x': 100 }

y = {'both1':10, 'both2': 20, 'only_y':200 }



print(dsum(x, y))

This produces

{'both1': 11, 'both2': 22, 'only_x': 100, 'only_y': 200}

answered May 5 '12 at 12:43

NPE


up vote
9
down vote

Additional notes based on the answers of georg, NPE and Scott.

Here's my test method. I've updated it recently to include tests with MUCH larger dictionaries:

Firstly I used the following data:

import random

x = {'xy1': 1, 'xy2': 2, 'xyz': 3, 'only_x': 100}
y = {'xy1': 10, 'xy2': 20, 'xyz': 30, 'only_y': 200}
z = {'xyz': 300, 'only_z': 300}

small_tests = [x, y, z]

# 200,000 random 8 letter keys
keys = [''.join(random.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(8)) for _ in range(200000)]

a, b, c = {}, {}, {}

# 50/50 chance of a value being assigned to each dictionary, some keys will be missed but meh
for key in keys:
    if random.getrandbits(1):
        a[key] = random.randint(0, 1000)
    if random.getrandbits(1):
        b[key] = random.randint(0, 1000)
    if random.getrandbits(1):
        c[key] = random.randint(0, 1000)

large_tests = [a, b, c]

print("a:", len(a), "b:", len(b), "c:", len(c))
#: a: 100069 b: 100385 c: 99989

Now each of the methods:

from collections import defaultdict, Counter

def georg_method(tests):
    return {k: sum(t.get(k, 0) for t in tests) for k in set.union(*[set(t) for t in tests])}

def georg_method_nosum(tests):
    # If you know you will have exactly 3 dicts
    return {k: tests[0].get(k, 0) + tests[1].get(k, 0) + tests[2].get(k, 0) for k in set.union(*[set(t) for t in tests])}

def npe_method(tests):
    ret = defaultdict(int)
    for d in tests:
        for k, v in d.items():
            ret[k] += v
    return dict(ret)

# Note: There is a bug with scott's method. See below for details.
def scott_method(tests):
    return dict(sum((Counter(t) for t in tests), Counter()))

def scott_method_nosum(tests):
    # If you know you will have exactly 3 dicts
    return dict(Counter(tests[0]) + Counter(tests[1]) + Counter(tests[2]))

methods = {"georg_method": georg_method, "georg_method_nosum": georg_method_nosum,
           "npe_method": npe_method,
           "scott_method": scott_method, "scott_method_nosum": scott_method_nosum}

Finally, the results:

Results: Small Tests

for name, method in methods.items():
    print("Method:", name)
    %timeit -n10000 method(small_tests)
#: Method: npe_method
#: 10000 loops, best of 3: 5.16 µs per loop
#: Method: georg_method_nosum
#: 10000 loops, best of 3: 8.11 µs per loop
#: Method: georg_method
#: 10000 loops, best of 3: 11.8 µs per loop
#: Method: scott_method_nosum
#: 10000 loops, best of 3: 42.4 µs per loop
#: Method: scott_method
#: 10000 loops, best of 3: 65.3 µs per loop

Results: Large Tests

Naturally, couldn't run anywhere near as many loops

for name, method in methods.items():
    print("Method:", name)
    %timeit -n10 method(large_tests)
#: Method: npe_method
#: 10 loops, best of 3: 227 ms per loop
#: Method: georg_method_nosum
#: 10 loops, best of 3: 327 ms per loop
#: Method: georg_method
#: 10 loops, best of 3: 455 ms per loop
#: Method: scott_method_nosum
#: 10 loops, best of 3: 510 ms per loop
#: Method: scott_method
#: 10 loops, best of 3: 600 ms per loop

Conclusion

╔═══════════════════════════╦═══════╦═════════════════════════════╗
║                           ║       ║   Best of 3 Time Per Loop   ║
║         Algorithm         ║  By   ╠══════════════╦══════════════╣
║                           ║       ║  small_tests ║  large_tests ║
╠═══════════════════════════╬═══════╬══════════════╬══════════════╣
║ defaultdict sum           ║ NPE   ║      5.16 µs ║   227,000 µs ║
║ set unions without sum()  ║ georg ║      8.11 µs ║   327,000 µs ║
║ set unions with sum()     ║       ║      11.8 µs ║   455,000 µs ║
║ Counter() without sum()   ║ Scott ║      42.4 µs ║   510,000 µs ║
║ Counter() with sum()      ║       ║      65.3 µs ║   600,000 µs ║
╚═══════════════════════════╩═══════╩══════════════╩══════════════╝

Important. YMMV.
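
%timeit is an IPython magic and will not run in a plain Python script. A rough equivalent using the standard timeit module might look like the sketch below. The two functions and the tiny input are stand-ins so the block is self-contained, and the loop counts are arbitrary. Absolute numbers will differ from the table above.

```python
import timeit
from collections import Counter, defaultdict

def defaultdict_sum(dicts):
    ret = defaultdict(int)
    for d in dicts:
        for k, v in d.items():
            ret[k] += v
    return dict(ret)

def counter_sum(dicts):
    return dict(sum((Counter(d) for d in dicts), Counter()))

# Stand-in data; swap in larger dictionaries to reproduce a "large" run.
tests = [{'xy1': 1, 'only_x': 100}, {'xy1': 10, 'only_y': 200}]

loops = 1000
results = {}
for func in (defaultdict_sum, counter_sum):
    # repeat() returns one total per run; min() is the "best of 3" figure.
    best = min(timeit.repeat(lambda: func(tests), number=loops, repeat=3))
    results[func.__name__] = best / loops  # seconds per call
    print(func.__name__, "%.2f us per loop" % (results[func.__name__] * 1e6))
```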

edited May 23 '17 at 12:18

Community♦

answered Feb 28 '16 at 23:47

SCB


up vote
2
down vote

Another option is to use reduce. This allows you to sum-merge an arbitrary collection of dictionaries:

from functools import reduce

collection = [
    {'a': 1, 'b': 1},
    {'a': 2, 'b': 2},
    {'a': 3, 'b': 3},
    {'a': 4, 'b': 4, 'c': 1},
    {'a': 5, 'b': 5, 'c': 1},
    {'a': 6, 'b': 6, 'c': 1},
    {'a': 7, 'b': 7},
    {'a': 8, 'b': 8},
    {'a': 9, 'b': 9},
]


def reducer(accumulator, element):
    for key, value in element.items():
        accumulator[key] = accumulator.get(key, 0) + value
    return accumulator


total = reduce(reducer, collection, {})


assert total['a'] == sum(d.get('a', 0) for d in collection)
assert total['b'] == sum(d.get('b', 0) for d in collection)
assert total['c'] == sum(d.get('c', 0) for d in collection)

print(total)

Execution:

{'a': 45, 'b': 45, 'c': 3}

Advantages:

Simple, clear, Pythonic.

Schema-less, as long as all values are "summable".

O(n) time complexity in the total number of key/value pairs, and no extra memory beyond the result dictionary.
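
One detail worth spelling out: the reducer mutates its accumulator, so the empty dict passed as the third argument to reduce matters. A sketch of what could go wrong without it, using a made-up two-element list:

```python
from functools import reduce

def reducer(accumulator, element):
    for key, value in element.items():
        accumulator[key] = accumulator.get(key, 0) + value
    return accumulator

# With an initializer, the inputs are left untouched.
safe_input = [{'a': 1}, {'a': 2, 'b': 5}]
safe_total = reduce(reducer, safe_input, {})

# Without one, reduce uses the first dict as the accumulator and mutates it.
unsafe_input = [{'a': 1}, {'a': 2, 'b': 5}]
unsafe_total = reduce(reducer, unsafe_input)

print(safe_total, safe_input[0])        # {'a': 3, 'b': 5} {'a': 1}
print(unsafe_total is unsafe_input[0])  # True
```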

answered Sep 9 '17 at 7:59

Havok


up vote
1
down vote

I suspect you're looking for dict's update method:

>>> d1 = {1:2,3:4}
>>> d2 = {5:6,7:8}
>>> d1.update(d2)
>>> d1
{1: 2, 3: 4, 5: 6, 7: 8}

answered May 5 '12 at 11:50

zigg


I don't see how you can suspect that when the question does not say anything about merge behavior. update on a dictionary will overwrite values when keys are identical; maybe he's summing unique occurrences of a hash in which case using update is destructive.
– JosefAssad
May 5 '12 at 11:55
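
To make the point in this comment concrete, here is a short sketch with made-up keys showing that update() overwrites on shared keys, next to a loop that sums instead:

```python
d1 = {'shared': 2, 'only_d1': 4}
d2 = {'shared': 6, 'only_d2': 8}

overwritten = dict(d1)
overwritten.update(d2)       # value from d2 replaces the one from d1

summed = dict(d1)
for key, value in d2.items():
    summed[key] = summed.get(key, 0) + value

print(overwritten)  # {'shared': 6, 'only_d1': 4, 'only_d2': 8}
print(summed)       # {'shared': 8, 'only_d1': 4, 'only_d2': 8}
```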

Well, I have already tried it like that, but the results don't sum.
– badc0re
May 5 '12 at 11:57

@JosefAssad You are right.
– badc0re
May 5 '12 at 12:02

I took "merge" in the question to mean the same as update. "sum"—which I assume means one ends up with duplicate keys—is something you can't do with a dict. A list of tuples e.g. [(1,2),(3,4)] would be a start for this. @DameJovanoski: you need to edit your question to explain what you really want to accomplish. My bad for guessing.
– zigg
May 5 '12 at 12:03

I am sorry for the mix-up, I had a bad night yesterday :D
– badc0re
May 5 '12 at 12:13


up vote
1
down vote

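from functools import reduce  # reduce is not a builtin in Python 3

d1 = {'apples': 2, 'banana': 1}
d2 = {'apples': 3, 'banana': 2}

# Fold each (key, value) pair of d2 into a copy of d1, summing on shared keys.
# dict.update() returns None, so "or d" hands the accumulator back to reduce.
merged = reduce(
    lambda d, i: (
        d.update(((i[0], d.get(i[0], 0) + i[1]),)) or d
    ),
    d2.items(),  # was d2.iteritems(), which exists only in Python 2
    d1.copy(),
)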

There is also a pretty simple replacement for dict.update() (note that it overwrites values for duplicate keys rather than summing them):

merged = dict(d1, **d2)
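
A caveat that may be worth knowing: because ** passes d2 as keyword arguments, dict(d1, **d2) only works in Python 3 when every key of d2 is a string. The sketch below, with made-up data, shows the failure and the {**d1, **d2} unpacking form (Python 3.5+), which accepts any hashable key. Both forms overwrite rather than sum.

```python
d1 = {'apples': 2, 'banana': 1}
d2 = {'apples': 3, 'cherry': 7}

merged = dict(d1, **d2)
print(merged)  # {'apples': 3, 'banana': 1, 'cherry': 7}

int_keys = {1: 'one'}
try:
    dict(d1, **int_keys)
    keyword_merge_failed = False
except TypeError:
    keyword_merge_failed = True   # keyword argument names must be strings

# Dict unpacking inside a literal has no such restriction.
unpacked = {**d1, **int_keys}
print(keyword_merge_failed, unpacked)
```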

answered Dec 2 '13 at 19:37

renskiy


I liked this tip: merged = dict(d1, **d2)
– arannasousa
Jan 13 '17 at 23:34


up vote
0
down vote

If you want to create a new dict, the way | does for sets, use (values for duplicate keys are overwritten, not summed):

>>> dict({'a': 1,'c': 2}, **{'c': 1})

{'a': 1, 'c': 1}

answered Jan 22 '16 at 20:33

Bartosz Foder


up vote
0
down vote

class dict_merge(dict):
    def __add__(self, other):
        # Build a new dict_merge; neither operand is modified.
        result = dict_merge({})
        for key in self.keys():
            if key in other.keys():
                # Key present in both: sum the values.
                result[key] = self[key] + other[key]
            else:
                result[key] = self[key]
        for key in other.keys():
            # Keys found only in other are copied over unchanged.
            if key not in self.keys():
                result[key] = other[key]
        return result


a = dict_merge({"a":2, "b":3, "d":4})
b = dict_merge({"a":1, "b":2})
c = dict_merge({"a":5, "b":6, "c":5})
d = dict_merge({"a":8, "b":6, "e":5})

print(a + b + c + d)

# prints: {'a': 16, 'b': 17, 'd': 4, 'c': 5, 'e': 5}
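
Since the class defines __add__, the built-in sum() can fold a whole list of such objects, provided an empty instance is passed as the start value (the default start of 0 would raise a TypeError). A compact sketch, with a hypothetical SummingDict standing in for the class above:

```python
class SummingDict(dict):
    """Hypothetical stand-in: a dict whose + sums values key by key."""

    def __add__(self, other):
        result = SummingDict(self)  # copy, so neither operand is modified
        for key, value in other.items():
            result[key] = result.get(key, 0) + value
        return result

parts = [
    SummingDict({"a": 2, "b": 3, "d": 4}),
    SummingDict({"a": 1, "b": 2}),
    SummingDict({"a": 5, "b": 6, "c": 5}),
]

# The start value must be an instance so the first addition uses __add__.
total = sum(parts, SummingDict())
print(total)  # {'a': 8, 'b': 11, 'd': 4, 'c': 5}
```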

answered Apr 25 '17 at 3:01

John Mutuma

12018

add a comment |

