What is the time complexity to access/find HashMap bucket(Not value in bucket)?
Suppose we have two different hashMaps, say map1 and map2.
- map1 has 1000 entries with 1000 buckets.
- map2 has 999999 entries with 999999 buckets.
And suppose, we have an object "obj1" with hashCode "1234" and we put this object as a key in both map1 and map2(with value "xyz").
Will it require more time to find "obj1" value in map2?
Will the time complexity be still O(1) for accessing obj1 from both map1 and map2?
java data-structures hashmap
add a comment |
Suppose we have two different hashMaps, say map1 and map2.
- map1 has 1000 entries with 1000 buckets.
- map2 has 999999 entries with 999999 buckets.
And suppose, we have an object "obj1" with hashCode "1234" and we put this object as a key in both map1 and map2(with value "xyz").
Will it require more time to find "obj1" value in map2?
Will the time complexity be still O(1) for accessing obj1 from both map1 and map2?
java data-structures hashmap
Why do you think it would be different? Finding the hash bucket given a hashcode is a modulo operation and an array access. Neither of those cost more if you use bigger numbers. There are obviously factors that influence it much more, like how many hashcode collisions there are (do all objects in the map have the same hashcode?)
– Krease
Nov 28 '18 at 17:10
@Krease - In HashMap, it creates array to hold hashcode, we call them as Arrayof HashCodes/Buckets. So If I want to find hashcode from that Array, I still needs to traverse through it right?
– Dr. Blue
Nov 28 '18 at 17:17
2
You don’t “traverse” through the array of buckets, you do a modulo operation on the hash with the number of buckets - his gives you they bucket the object lives in. If you have collisions (multiple objects in the bucket), then you traverse through that. Your question doesn’t specify anything about collisions though
– Krease
Nov 28 '18 at 17:27
Yes. I got your point.You mean to say "hashcode % sizeOfmap" right?
– Dr. Blue
Nov 29 '18 at 0:39
Yes. The%
character is the modulo operation in Java.
– Krease
Nov 29 '18 at 0:47
add a comment |
Suppose we have two different hashMaps, say map1 and map2.
- map1 has 1000 entries with 1000 buckets.
- map2 has 999999 entries with 999999 buckets.
And suppose, we have an object "obj1" with hashCode "1234" and we put this object as a key in both map1 and map2(with value "xyz").
Will it require more time to find "obj1" value in map2?
Will the time complexity be still O(1) for accessing obj1 from both map1 and map2?
java data-structures hashmap
Suppose we have two different hashMaps, say map1 and map2.
- map1 has 1000 entries with 1000 buckets.
- map2 has 999999 entries with 999999 buckets.
And suppose, we have an object "obj1" with hashCode "1234" and we put this object as a key in both map1 and map2(with value "xyz").
Will it require more time to find "obj1" value in map2?
Will the time complexity be still O(1) for accessing obj1 from both map1 and map2?
java data-structures hashmap
java data-structures hashmap
asked Nov 28 '18 at 17:02
Dr. BlueDr. Blue
304
304
Why do you think it would be different? Finding the hash bucket given a hashcode is a modulo operation and an array access. Neither of those cost more if you use bigger numbers. There are obviously factors that influence it much more, like how many hashcode collisions there are (do all objects in the map have the same hashcode?)
– Krease
Nov 28 '18 at 17:10
@Krease - In HashMap, it creates array to hold hashcode, we call them as Arrayof HashCodes/Buckets. So If I want to find hashcode from that Array, I still needs to traverse through it right?
– Dr. Blue
Nov 28 '18 at 17:17
2
You don’t “traverse” through the array of buckets, you do a modulo operation on the hash with the number of buckets - his gives you they bucket the object lives in. If you have collisions (multiple objects in the bucket), then you traverse through that. Your question doesn’t specify anything about collisions though
– Krease
Nov 28 '18 at 17:27
Yes. I got your point.You mean to say "hashcode % sizeOfmap" right?
– Dr. Blue
Nov 29 '18 at 0:39
Yes. The%
character is the modulo operation in Java.
– Krease
Nov 29 '18 at 0:47
add a comment |
Why do you think it would be different? Finding the hash bucket given a hashcode is a modulo operation and an array access. Neither of those cost more if you use bigger numbers. There are obviously factors that influence it much more, like how many hashcode collisions there are (do all objects in the map have the same hashcode?)
– Krease
Nov 28 '18 at 17:10
@Krease - In HashMap, it creates array to hold hashcode, we call them as Arrayof HashCodes/Buckets. So If I want to find hashcode from that Array, I still needs to traverse through it right?
– Dr. Blue
Nov 28 '18 at 17:17
2
You don’t “traverse” through the array of buckets, you do a modulo operation on the hash with the number of buckets - his gives you they bucket the object lives in. If you have collisions (multiple objects in the bucket), then you traverse through that. Your question doesn’t specify anything about collisions though
– Krease
Nov 28 '18 at 17:27
Yes. I got your point.You mean to say "hashcode % sizeOfmap" right?
– Dr. Blue
Nov 29 '18 at 0:39
Yes. The%
character is the modulo operation in Java.
– Krease
Nov 29 '18 at 0:47
Why do you think it would be different? Finding the hash bucket given a hashcode is a modulo operation and an array access. Neither of those cost more if you use bigger numbers. There are obviously factors that influence it much more, like how many hashcode collisions there are (do all objects in the map have the same hashcode?)
– Krease
Nov 28 '18 at 17:10
Why do you think it would be different? Finding the hash bucket given a hashcode is a modulo operation and an array access. Neither of those cost more if you use bigger numbers. There are obviously factors that influence it much more, like how many hashcode collisions there are (do all objects in the map have the same hashcode?)
– Krease
Nov 28 '18 at 17:10
@Krease - In HashMap, it creates array to hold hashcode, we call them as Arrayof HashCodes/Buckets. So If I want to find hashcode from that Array, I still needs to traverse through it right?
– Dr. Blue
Nov 28 '18 at 17:17
@Krease - In HashMap, it creates array to hold hashcode, we call them as Arrayof HashCodes/Buckets. So If I want to find hashcode from that Array, I still needs to traverse through it right?
– Dr. Blue
Nov 28 '18 at 17:17
2
2
You don’t “traverse” through the array of buckets, you do a modulo operation on the hash with the number of buckets - his gives you they bucket the object lives in. If you have collisions (multiple objects in the bucket), then you traverse through that. Your question doesn’t specify anything about collisions though
– Krease
Nov 28 '18 at 17:27
You don’t “traverse” through the array of buckets, you do a modulo operation on the hash with the number of buckets - his gives you they bucket the object lives in. If you have collisions (multiple objects in the bucket), then you traverse through that. Your question doesn’t specify anything about collisions though
– Krease
Nov 28 '18 at 17:27
Yes. I got your point.You mean to say "hashcode % sizeOfmap" right?
– Dr. Blue
Nov 29 '18 at 0:39
Yes. I got your point.You mean to say "hashcode % sizeOfmap" right?
– Dr. Blue
Nov 29 '18 at 0:39
Yes. The
%
character is the modulo operation in Java.– Krease
Nov 29 '18 at 0:47
Yes. The
%
character is the modulo operation in Java.– Krease
Nov 29 '18 at 0:47
add a comment |
2 Answers
2
active
oldest
votes
Finding the bucket is O(1) in a HashMap
, always, independent of the capacity (number of buckets).
Let's say your obj1
has a hash code of 1234567
. The core of a HashMap
is not to search for the correct bucket (as a TreeMap would do), but to compute its position, and immediately access the bucket with that number. That's where the hash code comes into the game.
The computation is obj.hashCode() % capacity
, and the resulting number gives the index into the bucketsArray
.
For the small hash map, it's
1234567 % 1000
=567
, meaning that the relevant bucket isbucketsArray[567]
.For the big one, it's
1234567 % 999999
=234568
, resulting inbucketsArray[234568]
.
The time necessary for computing the division rest is constant, independent of the values. The time for accessing an array with a given index is constant as well, so it's O(1).
We have only talked about finding the bucket. If the bucket contains multiple entries, a linear search completes the hash map access, and that's O(K) with K being the (average? maximum?) number of entries in a bucket.
add a comment |
I thin it would be best to answer with a code and diagram. We all know what hashing (one way) function is it is. Basically it takes arbitrary input and returns number (in java it is int but that is not always the case). And int in java has 32 bits. That means that it can be between -2,147,483,648 and 2,147,483,647. Every object on every java heep that exist can calculate it's hash (using method from java.util.Object class) and it has to be in that interval.
Now let's assume that we have 3 objects.
21234 = obj1.hashCode();
623424 = obj2.hashCode();
23124432 = obj3.hasCode();
and we want to add them to a hashMap that has 200 buckets. (this is not a working java code i typed it here)
public class MyHashMap {
private final Buckets buckets = new Buckets[200];
public boolean add(Object object){
int resultModulo = object.hashCode() % 200;
buckets[buckets].add(object);
}
}
Now for the final peace. For our object the resultModulo
will be 34(21234), 24(623424), 32(23124432). And the calculate number will not exceed 200.
Array is allocated as a continuous piece of memory. Just array of pointers(64-bits) not the actual objects. So the bucktes looks something like that
0xB80000xB80020xB80670xC1101 ....
1 2 3 4 .... 200
and so when your code invokes bucket[34],bucket[24],bucket[32] what the hardware does is this:
mov eax, bucktes[ecx*19]
; eax now contains the pointer to the
; 19 element in the array
; this is a one clock instruction
So that is why it doesn't matter how many bucket you have.
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53524583%2fwhat-is-the-time-complexity-to-access-find-hashmap-bucketnot-value-in-bucket%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
Finding the bucket is O(1) in a HashMap
, always, independent of the capacity (number of buckets).
Let's say your obj1
has a hash code of 1234567
. The core of a HashMap
is not to search for the correct bucket (as a TreeMap would do), but to compute its position, and immediately access the bucket with that number. That's where the hash code comes into the game.
The computation is obj.hashCode() % capacity
, and the resulting number gives the index into the bucketsArray
.
For the small hash map, it's
1234567 % 1000
=567
, meaning that the relevant bucket isbucketsArray[567]
.For the big one, it's
1234567 % 999999
=234568
, resulting inbucketsArray[234568]
.
The time necessary for computing the division rest is constant, independent of the values. The time for accessing an array with a given index is constant as well, so it's O(1).
We have only talked about finding the bucket. If the bucket contains multiple entries, a linear search completes the hash map access, and that's O(K) with K being the (average? maximum?) number of entries in a bucket.
add a comment |
Finding the bucket is O(1) in a HashMap
, always, independent of the capacity (number of buckets).
Let's say your obj1
has a hash code of 1234567
. The core of a HashMap
is not to search for the correct bucket (as a TreeMap would do), but to compute its position, and immediately access the bucket with that number. That's where the hash code comes into the game.
The computation is obj.hashCode() % capacity
, and the resulting number gives the index into the bucketsArray
.
For the small hash map, it's
1234567 % 1000
=567
, meaning that the relevant bucket isbucketsArray[567]
.For the big one, it's
1234567 % 999999
=234568
, resulting inbucketsArray[234568]
.
The time necessary for computing the division rest is constant, independent of the values. The time for accessing an array with a given index is constant as well, so it's O(1).
We have only talked about finding the bucket. If the bucket contains multiple entries, a linear search completes the hash map access, and that's O(K) with K being the (average? maximum?) number of entries in a bucket.
add a comment |
Finding the bucket is O(1) in a HashMap
, always, independent of the capacity (number of buckets).
Let's say your obj1
has a hash code of 1234567
. The core of a HashMap
is not to search for the correct bucket (as a TreeMap would do), but to compute its position, and immediately access the bucket with that number. That's where the hash code comes into the game.
The computation is obj.hashCode() % capacity
, and the resulting number gives the index into the bucketsArray
.
For the small hash map, it's
1234567 % 1000
=567
, meaning that the relevant bucket isbucketsArray[567]
.For the big one, it's
1234567 % 999999
=234568
, resulting inbucketsArray[234568]
.
The time necessary for computing the division rest is constant, independent of the values. The time for accessing an array with a given index is constant as well, so it's O(1).
We have only talked about finding the bucket. If the bucket contains multiple entries, a linear search completes the hash map access, and that's O(K) with K being the (average? maximum?) number of entries in a bucket.
Finding the bucket is O(1) in a HashMap
, always, independent of the capacity (number of buckets).
Let's say your obj1
has a hash code of 1234567
. The core of a HashMap
is not to search for the correct bucket (as a TreeMap would do), but to compute its position, and immediately access the bucket with that number. That's where the hash code comes into the game.
The computation is obj.hashCode() % capacity
, and the resulting number gives the index into the bucketsArray
.
For the small hash map, it's
1234567 % 1000
=567
, meaning that the relevant bucket isbucketsArray[567]
.For the big one, it's
1234567 % 999999
=234568
, resulting inbucketsArray[234568]
.
The time necessary for computing the division rest is constant, independent of the values. The time for accessing an array with a given index is constant as well, so it's O(1).
We have only talked about finding the bucket. If the bucket contains multiple entries, a linear search completes the hash map access, and that's O(K) with K being the (average? maximum?) number of entries in a bucket.
answered Nov 28 '18 at 18:44
Ralf KleberhoffRalf Kleberhoff
3,860156
3,860156
add a comment |
add a comment |
I thin it would be best to answer with a code and diagram. We all know what hashing (one way) function is it is. Basically it takes arbitrary input and returns number (in java it is int but that is not always the case). And int in java has 32 bits. That means that it can be between -2,147,483,648 and 2,147,483,647. Every object on every java heep that exist can calculate it's hash (using method from java.util.Object class) and it has to be in that interval.
Now let's assume that we have 3 objects.
21234 = obj1.hashCode();
623424 = obj2.hashCode();
23124432 = obj3.hasCode();
and we want to add them to a hashMap that has 200 buckets. (this is not a working java code i typed it here)
public class MyHashMap {
private final Buckets buckets = new Buckets[200];
public boolean add(Object object){
int resultModulo = object.hashCode() % 200;
buckets[buckets].add(object);
}
}
Now for the final peace. For our object the resultModulo
will be 34(21234), 24(623424), 32(23124432). And the calculate number will not exceed 200.
Array is allocated as a continuous piece of memory. Just array of pointers(64-bits) not the actual objects. So the bucktes looks something like that
0xB80000xB80020xB80670xC1101 ....
1 2 3 4 .... 200
and so when your code invokes bucket[34],bucket[24],bucket[32] what the hardware does is this:
mov eax, bucktes[ecx*19]
; eax now contains the pointer to the
; 19 element in the array
; this is a one clock instruction
So that is why it doesn't matter how many bucket you have.
add a comment |
I thin it would be best to answer with a code and diagram. We all know what hashing (one way) function is it is. Basically it takes arbitrary input and returns number (in java it is int but that is not always the case). And int in java has 32 bits. That means that it can be between -2,147,483,648 and 2,147,483,647. Every object on every java heep that exist can calculate it's hash (using method from java.util.Object class) and it has to be in that interval.
Now let's assume that we have 3 objects.
21234 = obj1.hashCode();
623424 = obj2.hashCode();
23124432 = obj3.hasCode();
and we want to add them to a hashMap that has 200 buckets. (this is not a working java code i typed it here)
public class MyHashMap {
private final Buckets buckets = new Buckets[200];
public boolean add(Object object){
int resultModulo = object.hashCode() % 200;
buckets[buckets].add(object);
}
}
Now for the final peace. For our object the resultModulo
will be 34(21234), 24(623424), 32(23124432). And the calculate number will not exceed 200.
Array is allocated as a continuous piece of memory. Just array of pointers(64-bits) not the actual objects. So the bucktes looks something like that
0xB80000xB80020xB80670xC1101 ....
1 2 3 4 .... 200
and so when your code invokes bucket[34],bucket[24],bucket[32] what the hardware does is this:
mov eax, bucktes[ecx*19]
; eax now contains the pointer to the
; 19 element in the array
; this is a one clock instruction
So that is why it doesn't matter how many bucket you have.
add a comment |
I thin it would be best to answer with a code and diagram. We all know what hashing (one way) function is it is. Basically it takes arbitrary input and returns number (in java it is int but that is not always the case). And int in java has 32 bits. That means that it can be between -2,147,483,648 and 2,147,483,647. Every object on every java heep that exist can calculate it's hash (using method from java.util.Object class) and it has to be in that interval.
Now let's assume that we have 3 objects.
21234 = obj1.hashCode();
623424 = obj2.hashCode();
23124432 = obj3.hasCode();
and we want to add them to a hashMap that has 200 buckets. (this is not a working java code i typed it here)
public class MyHashMap {
private final Buckets buckets = new Buckets[200];
public boolean add(Object object){
int resultModulo = object.hashCode() % 200;
buckets[buckets].add(object);
}
}
Now for the final peace. For our object the resultModulo
will be 34(21234), 24(623424), 32(23124432). And the calculate number will not exceed 200.
Array is allocated as a continuous piece of memory. Just array of pointers(64-bits) not the actual objects. So the bucktes looks something like that
0xB80000xB80020xB80670xC1101 ....
1 2 3 4 .... 200
and so when your code invokes bucket[34],bucket[24],bucket[32] what the hardware does is this:
mov eax, bucktes[ecx*19]
; eax now contains the pointer to the
; 19 element in the array
; this is a one clock instruction
So that is why it doesn't matter how many bucket you have.
I thin it would be best to answer with a code and diagram. We all know what hashing (one way) function is it is. Basically it takes arbitrary input and returns number (in java it is int but that is not always the case). And int in java has 32 bits. That means that it can be between -2,147,483,648 and 2,147,483,647. Every object on every java heep that exist can calculate it's hash (using method from java.util.Object class) and it has to be in that interval.
Now let's assume that we have 3 objects.
21234 = obj1.hashCode();
623424 = obj2.hashCode();
23124432 = obj3.hasCode();
and we want to add them to a hashMap that has 200 buckets. (this is not a working java code i typed it here)
public class MyHashMap {
private final Buckets buckets = new Buckets[200];
public boolean add(Object object){
int resultModulo = object.hashCode() % 200;
buckets[buckets].add(object);
}
}
Now for the final peace. For our object the resultModulo
will be 34(21234), 24(623424), 32(23124432). And the calculate number will not exceed 200.
Array is allocated as a continuous piece of memory. Just array of pointers(64-bits) not the actual objects. So the bucktes looks something like that
0xB80000xB80020xB80670xC1101 ....
1 2 3 4 .... 200
and so when your code invokes bucket[34],bucket[24],bucket[32] what the hardware does is this:
mov eax, bucktes[ecx*19]
; eax now contains the pointer to the
; 19 element in the array
; this is a one clock instruction
So that is why it doesn't matter how many bucket you have.
edited Nov 28 '18 at 21:41
answered Nov 28 '18 at 18:52
piotr szybickipiotr szybicki
6811511
6811511
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53524583%2fwhat-is-the-time-complexity-to-access-find-hashmap-bucketnot-value-in-bucket%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Why do you think it would be different? Finding the hash bucket given a hashcode is a modulo operation and an array access. Neither of those cost more if you use bigger numbers. There are obviously factors that influence it much more, like how many hashcode collisions there are (do all objects in the map have the same hashcode?)
– Krease
Nov 28 '18 at 17:10
@Krease - In HashMap, it creates array to hold hashcode, we call them as Arrayof HashCodes/Buckets. So If I want to find hashcode from that Array, I still needs to traverse through it right?
– Dr. Blue
Nov 28 '18 at 17:17
2
You don’t “traverse” through the array of buckets, you do a modulo operation on the hash with the number of buckets - his gives you they bucket the object lives in. If you have collisions (multiple objects in the bucket), then you traverse through that. Your question doesn’t specify anything about collisions though
– Krease
Nov 28 '18 at 17:27
Yes. I got your point.You mean to say "hashcode % sizeOfmap" right?
– Dr. Blue
Nov 29 '18 at 0:39
Yes. The
%
character is the modulo operation in Java.– Krease
Nov 29 '18 at 0:47