我正在研究 libc++ 中 std::vector
的实现,我注意到它在内部保留了三个指针(一个指向开头,一个指向结尾,一个指向结尾分配的内存)而不是我本能地做的事情,即一个指向开始的指针和两个 size
和 capacity
成员.
I'm looking at the implementation of std::vector
in libc++ and I noticed that it internally keeps three pointers (one to the begin, one the end, and one to the end of the allocated memory) instead of what I'd instinctively do, i.e., one pointer to the begin and two size
and capacity
members.
这里是libc++的<vector>
中的代码(忽略压缩对,我知道是什么意思).
Here is the code from libc++'s <vector>
(ignore the compressed pair, I know what it means).
pointer __begin_;
pointer __end_;
__compressed_pair<pointer, allocator_type> __end_cap_;
我注意到其他标准库也做同样的事情(例如 Visual C++).我没有看到为什么这个解决方案应该比另一个更快的任何特殊原因,但我可能是错的.
I noticed that also other standard libraries do the same (e.g. Visual C++). I don't see any particular reason why this solution should be faster than the other one, but I might be wrong.
那么三指针"解决方案比指针+大小"解决方案更受欢迎有什么特殊原因吗?
So is there a particular reason the "three pointers" solution is preferred to the "pointer + sizes" one?
这是因为其基本原理是 性能 应该针对迭代器进行优化,而不是针对索引进行优化.
(换句话说,应该针对 begin()
/end()
优化性能,而不是针对 size()
/operator[]
.)
为什么?因为迭代器是通用指针,因此 C++ 鼓励使用它们,作为回报,当两者等效时,确保它们的性能与原始指针相匹配.
It's because the rationale is that performance should be optimized for iterators, not indices.
(In other words, performance should be optimized for begin()
/end()
, not size()
/operator[]
.)
Why? Because iterators are generalized pointers, and thus C++ encourages their use, and in return ensures that their performance matches those of raw pointers when the two are equivalent.
要了解为什么会出现性能问题,请注意典型的 for
循环如下:
To see why it's a performance issue, notice that the typical for
loop is as follows:
for (It i = items.begin(); i != items.end(); ++i)
...
除了在最琐碎的情况下,如果我们跟踪大小而不是指针,会发生什么比较 i != items.end()
会变成 i != items.begin() + items.size()
,执行的指令比您预期的要多.(在许多情况下,优化器通常很难分解出代码.)这会在紧密循环中显着减慢速度,因此避免了这种设计.
Except in the most trivial cases, if we kept track of sizes instead of pointers, what would happen is that the comparison i != items.end()
would turn into i != items.begin() + items.size()
, taking more instructions than you'd expect. (The optimizer generally has a hard time factoring out the code in many cases.) This slows things down dramatically in a tight loop, and hence this design is avoided.
(在尝试编写自己的 std::vector
替代品时,我已经验证这是一个性能问题.)
(I've verified this is a performance problem when trying to write my own replacement for std::vector
.)
正如 Yakk 在评论中指出的那样,当元素大小不是幂时,使用索引而不是指针也会导致生成乘法指令共 2 个,这是非常昂贵且在紧密循环中引人注目的.我在写这个答案时没有想到这一点,但这是一个让我感到困惑的现象(例如,请参阅此处)... 底线是,在一个紧密的循环中,一切都很重要.
As Yakk pointed out in the comments, using indices instead of pointers can also result in the generation of a multiplication instruction when the element sizes aren't powers of 2, which is pretty expensive and noticeable in a tight loop. I didn't think of this when writing this answer, but it's a phenomenon that's bitten me before (e.g. see here)... bottom line is, in a tight loop everything matters.
这篇关于为什么 libc++ std::vector 在内部保留三个指针而不是一个指针和两个大小?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持跟版网!