I have 32 machine threads and one ConcurrentHashMap<Key, Value> map, which contains a lot of keys. Key defines a public method visit(). I want to visit() every element of map exactly once, using the processing power I have available and possibly some sort of thread pooling.
Things I could try:
- The map.keys() method. The resulting Enumeration<Key> could be iterated over using nextElement(), but since a call to key.visit() is very brief I won't manage to keep the threads busy. The Enumeration is inherently single-threaded.
- A HashSet<Key> instead; invoke its toArray() method and split the work on the array across all 32 threads. I seriously doubt this solution, since toArray() will likely be a single-threaded bottleneck.
- Inherit from ConcurrentHashMap, get my hands on the instances of its inner Segment<K,V>, try to group them into 32 groups and work on each group separately. This sounds like a hardcore approach, though.
- Similar magic with Enumeration<Key>.
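The first option can be sketched as a single feeder thread walking the Enumeration and handing each key to a fixed pool of workers. This is only a sketch under assumptions (the toy Key class and the counters are mine, not the question's real types); it also illustrates the stated objection, since per-task overhead dominates when visit() is very brief:

```java
import java.util.Enumeration;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class EnumerationFeeder {

    // Toy stand-in for the question's Key type (an assumption, not the real class).
    static class Key {
        final int id;
        Key(int id) { this.id = id; }
        void visit() { /* a very brief piece of work */ }
    }

    // One thread walks the Enumeration and hands each key to the pool.
    static int visitAll(ConcurrentHashMap<Key, Boolean> map, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        final AtomicInteger visited = new AtomicInteger();
        Enumeration<Key> keys = map.keys(); // inherently single-threaded feeder
        while (keys.hasMoreElements()) {
            final Key k = keys.nextElement();
            pool.submit(new Runnable() {
                public void run() {
                    k.visit();
                    visited.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return visited.get();
    }

    public static void main(String[] args) {
        ConcurrentHashMap<Key, Boolean> map = new ConcurrentHashMap<Key, Boolean>();
        for (int i = 0; i < 1000; i++) map.put(new Key(i), Boolean.TRUE);
        // Each element is visited exactly once, but one thread does all the feeding.
        System.out.println(visitAll(map, 32));
    }
}
```

Each key is submitted exactly once because map.keys() is consumed by a single thread; the scalability problem is precisely that this feeder is serial.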
Ideally:

ConcurrentHashMap<Key, Value> would define a method keysEnumerator(int approximatePosition), which could drop me an enumerator missing approximately the first 1/32 of the elements, i.e. map.keysEnumerator(map.size()/32).
Am I missing anything obvious? Has anybody run into a similar problem before?
EDIT
I've had a go at profiling to see whether this problem is actually going to affect performance in practice. As I don't have access to the cluster at the moment, I used my laptop and tried to extrapolate the results to a bigger dataset. On my machine I can create a ConcurrentHashMap with 2 million keys, and iterating over it, invoking the visit() method on every key, takes about 1 second. The program is supposed to scale to 85 million keys (and beyond). The cluster's processors are slightly faster, but it should still take about 40 seconds to iterate over the entire map. Now a few words about the logic flow of the program. The logic presented is sequential, i.e. no thread is allowed to proceed to the next step until all the threads in the previous step are finished:
That logic flow means a 40-second iteration is going to be repeated a few hundred times, say 100, which gives us a bit over an hour spent just visiting the nodes. With a set of 32 parallel iterators it could go down to just a few minutes, which is a significant performance improvement.
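The single-threaded figure behind this extrapolation (2 million keys in roughly 1 second, so 85 million in roughly 40 seconds on comparable hardware) can be reproduced with a throwaway harness such as the following; the Integer keys and the trivial loop body are stand-ins for the real Key.visit(), so absolute numbers will differ by machine:

```java
import java.util.concurrent.ConcurrentHashMap;

public class IterationTiming {

    // Time one single-threaded pass over all keys of a map of the given size.
    public static long iterationMillis(int size) {
        ConcurrentHashMap<Integer, Boolean> map =
                new ConcurrentHashMap<Integer, Boolean>();
        for (int i = 0; i < size; i++) map.put(Integer.valueOf(i), Boolean.TRUE);

        long start = System.nanoTime();
        long checksum = 0;
        for (Integer key : map.keySet()) {
            checksum += key.intValue(); // stand-in for key.visit()
        }
        long millis = (System.nanoTime() - start) / 1000000L;
        if (checksum < 0) throw new AssertionError(); // keep the loop observable
        return millis;
    }

    public static void main(String[] args) {
        System.out.println("2M keys iterated in "
                + iterationMillis(2000000) + " ms");
    }
}
```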
Now a few words on how ConcurrentHashMap works (or how I believe it works). Every ConcurrentHashMap consists of segments (16 by default). Every write to the hash map is synchronised on the relevant segment. So say we're trying to write two new keys, k1 and k2, to the hash map, and that they resolve to the same segment, say s1. If the writes are attempted simultaneously, one of them acquires the lock first and is added earlier than the other. What is the chance of two elements resolving to the same segment? If we have a good hash function and 16 segments, it is 1/16.
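The 1/16 figure is just the probability that two independent, well-spread hashes pick the same of 16 bins, and a quick simulation confirms it. Note the masking with & 15 mimics choosing one of 16 segments in the abstract; it is not ConcurrentHashMap's actual internal hash-spreading code:

```java
import java.util.Random;

public class SegmentCollision {

    // Fraction of random hash pairs that land in the same of 16 bins.
    public static double sameSegmentFraction(int trials, long seed) {
        Random rnd = new Random(seed);
        int same = 0;
        for (int t = 0; t < trials; t++) {
            int binA = rnd.nextInt() & 15; // pick one of 16 "segments"
            int binB = rnd.nextInt() & 15;
            if (binA == binB) same++;
        }
        return same / (double) trials;
    }

    public static void main(String[] args) {
        // Should print a value close to 1/16 = 0.0625.
        System.out.println(sameSegmentFraction(1000000, 42L));
    }
}
```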
I believe that ConcurrentHashMap should have a method concurrentKeys(), which would return an array of Enumerations, one per segment. I have a few ideas for how to add it to ConcurrentHashMap through inheritance, and I'll let you know if I succeed. For now the solution seems to be to create an array of ConcurrentHashMaps and pre-hash every key to resolve to one member of that array. I'll share that code as well, once it's ready.
EDIT
This is the same problem in a different language:
Parallel Iterators
The solution I will eventually go for is an array of ConcurrentHashMaps instead of one ConcurrentHashMap. This is ad hoc, but it seems relevant for my use case. I don't care about the second step being slow, as it doesn't affect my code's performance. The solution is:
Object Creation:
Array Population (single-threaded, not an issue):
Array Iteration (nicely multithreaded, performance gain):
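The author's actual code lives in the GitHub project mentioned below; as a self-contained sketch of the three steps above (the class and method names here are my assumptions, not the project's), the idea might look like this:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ShardedMap {

    // Toy stand-in for the question's Key type.
    static class Key {
        final int id;
        Key(int id) { this.id = id; }
        void visit() { /* brief work */ }
        @Override public int hashCode() { return id * 0x9E3779B9; } // spread the bits
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }
    }

    static final int SHARDS = 32; // one shard per machine thread

    // Object creation: an array of ConcurrentHashMaps, one per worker.
    static List<ConcurrentHashMap<Key, Boolean>> createShards() {
        List<ConcurrentHashMap<Key, Boolean>> shards =
                new ArrayList<ConcurrentHashMap<Key, Boolean>>(SHARDS);
        for (int i = 0; i < SHARDS; i++)
            shards.add(new ConcurrentHashMap<Key, Boolean>());
        return shards;
    }

    // Array population: pre-hash each key to resolve to one member of the array
    // (single-threaded, which the text above says is not an issue).
    static void put(List<ConcurrentHashMap<Key, Boolean>> shards, Key key) {
        int shard = (key.hashCode() & Integer.MAX_VALUE) % SHARDS;
        shards.get(shard).put(key, Boolean.TRUE);
    }

    // Array iteration: 32 threads, each iterating its own shard independently.
    static int visitAll(final List<ConcurrentHashMap<Key, Boolean>> shards) {
        ExecutorService pool = Executors.newFixedThreadPool(SHARDS);
        final AtomicInteger visited = new AtomicInteger();
        for (final ConcurrentHashMap<Key, Boolean> shard : shards) {
            pool.submit(new Runnable() {
                public void run() {
                    for (Key key : shard.keySet()) {
                        key.visit();
                        visited.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return visited.get();
    }

    public static void main(String[] args) {
        List<ConcurrentHashMap<Key, Boolean>> shards = createShards();
        for (int i = 0; i < 10000; i++) put(shards, new Key(i));
        System.out.println(visitAll(shards)); // every key visited exactly once
    }
}
```

Because every key resolves to exactly one shard, no two iterators ever see the same key, so the "exactly once" guarantee comes from the sharding itself rather than from any coordination between the 32 iterating threads.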
To see the proof-of-concept code (it has some dependencies on the project, so I can't post it here), head to my project on GitHub.
EDIT
Actually, implementing the above proof of concept for my system has proven to be time-consuming, bug-prone and grossly disappointing. Additionally, I've discovered I would miss many features of the standard library's ConcurrentHashMap. The solution I have been exploring recently, which looks much less ad hoc and much more promising, is to use Scala, which produces bytecode fully interoperable with Java. The proof of concept relies on the stunning library described in this paper, and AFAIK it is currently IMPOSSIBLE to achieve a corresponding solution in vanilla Java without writing thousands of lines of code, given the current state of the standard library and corresponding third-party libraries.
import scala.collection.parallel.mutable.ParHashMap

class Node(value: Int, id: Int) {
  var v = value
  var i = id
  override def toString(): String = v.toString
}

object testParHashMap {
  // Increment the value held by the Node of a (key, node) entry.
  def visit(entry: (Int, Node)) {
    entry._2.v += 1
  }

  def main(args: Array[String]) {
    val hm = new ParHashMap[Int, Node]()
    for (i <- 1 to 10) {
      val node = new Node(0, i)
      hm.put(node.i, node)
    }
    println("========== BEFORE ==========")
    hm.foreach { println }
    hm.foreach { visit } // foreach on a parallel collection runs across threads
    println("========== AFTER ==========")
    hm.foreach { println }
  }
}