Ensuring one value per HashMap bucket/slot

Date: 2022-10-17

Problem Description

Is there a way to strictly limit the number of entries per HashMap bucket without tampering with the object's hashCode() function in Java?

The load factor is an average: (# of entries) / (# of buckets). In essence, say I have a HashMap with a capacity of 1000, and, for the sake of this example, a load factor of 1. The 100 objects I am going to store in the HashMap have a bad hashCode() function that always returns the same value for every object. When I am done storing the 100 objects, they will all map to the same HashMap bucket, and I end up with LinkedList performance. The load factor will sit silent because 100 entries / 1000 buckets = 0.1 < 1. Now what happens if I put in 1 M such objects? The HashMap will never be resized (not that it would help anyway), because the load factor threshold will never be triggered.
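
A minimal sketch of the scenario described above; the BadKey class, its constant hash value, and the map sizes are made up for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key class: every instance returns the same hash code,
// so all entries land in the same bucket.
final class BadKey {
    private final int id;

    BadKey(int id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        return o instanceof BadKey && ((BadKey) o).id == id;
    }

    @Override
    public int hashCode() {
        return 42; // constant: every key collides
    }
}

public class CollisionDemo {
    public static void main(String[] args) {
        // Initial capacity 1000, load factor 1.
        Map<BadKey, String> map = new HashMap<>(1000, 1.0f);
        for (int i = 0; i < 100; i++) {
            map.put(new BadKey(i), "value-" + i);
        }
        // 100 entries / 1000 buckets = 0.1 < 1, so no resize is ever
        // triggered, yet all 100 entries sit in one bucket and lookups
        // no longer run in O(1).
        System.out.println(map.get(new BadKey(50))); // "value-50"
    }
}
```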

I know this is an uncommon scenario in the real world, but I would like to improve my understanding. Is there a way in HashMap to prevent this, or at least to get some warning from the structure itself?

Recommended Answer

A HashMap will always calculate which bucket to use from the key's hash code. If every key has the same hash code, they will all map to the same bucket. You cannot prevent the behavior you describe without providing a better hashCode() implementation.
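
To see why the bucket choice is entirely determined by the hash code, here is a simplified sketch modeled on OpenJDK's HashMap (the method name bucketIndex is illustrative, not part of the actual API):

```java
// Modeled on OpenJDK's HashMap: the key's hash code is "spread" by
// XOR-ing its high bits into its low bits, then masked by (n - 1),
// where the table size n is always a power of two.
static int bucketIndex(Object key, int n) {
    int h = key.hashCode();
    int spread = h ^ (h >>> 16); // mix high bits into low bits
    return (n - 1) & spread;     // equivalent to spread % n for power-of-two n
}
```

If hashCode() is constant, spread is constant too, so every key yields the same index no matter how large the table grows; resizing cannot help.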

You could look at Map implementations that use open addressing (e.g. Trove's THashMap). They always hold just one entry per bucket. But performance will not improve: they merely handle collisions in a different way, and they cannot solve your root problem either: a bad hash code.
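
A minimal open-addressing (linear probing) sketch, to show the one-entry-per-slot layout the answer describes. The class name and fixed capacity are assumptions for illustration; a real implementation such as Trove's THashMap also handles resizing and deletion:

```java
// Linear probing: on a collision, step to the next slot instead of
// chaining, so each slot holds at most one entry.
final class ProbingMap<K, V> {
    private final Object[] keys;
    private final Object[] values;

    // Sketch only: no resizing, so capacity must exceed the number
    // of entries, and deletion is not supported.
    ProbingMap(int capacity) {
        keys = new Object[capacity];
        values = new Object[capacity];
    }

    void put(K key, V value) {
        int i = Math.floorMod(key.hashCode(), keys.length);
        while (keys[i] != null && !keys[i].equals(key)) {
            i = (i + 1) % keys.length; // probe the next slot
        }
        keys[i] = key;
        values[i] = value;
    }

    @SuppressWarnings("unchecked")
    V get(K key) {
        int i = Math.floorMod(key.hashCode(), keys.length);
        while (keys[i] != null) {
            if (keys[i].equals(key)) return (V) values[i];
            i = (i + 1) % keys.length;
        }
        return null; // hit an empty slot: key is absent
    }
}
```

Note that with a constant hashCode() every key starts probing from the same slot, so lookups still degrade to a linear scan. This illustrates the answer's point: open addressing changes how collisions are stored, not how often they occur.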
