        Why isn't this C++ wrapper class being inlined?

        Date: 2023-10-17

                  Problem description

                  EDIT - Something is up with my build system. I'm still figuring out exactly what, but gcc was producing weird results (even though the source is a .cpp file); once I switched to g++, it worked as expected.

                  This is a very reduced test case for something I've been having trouble with, where using a numerical wrapper class (which I thought would be inlined away) made my program 10x slower.

                  This is independent of optimisation level (tried with -O0 and -O3).

                  Am I missing some detail in my wrapper class?

                  I have the following program, in which I define a class that wraps a double and provides the + operator:

                  #include <cstdio>
                  #include <cstdlib>
                  
                  // GCC/Clang attribute: request inlining even when optimisation is off.
                  #define INLINE __attribute__((always_inline)) inline

                  // Thin wrapper around a single double.
                  struct alignas(8) WrappedDouble {
                      double value;

                      INLINE friend const WrappedDouble operator+(const WrappedDouble& left, const WrappedDouble& right) {
                          return {left.value + right.value};
                      }
                  };
                  
                  #define doubleType WrappedDouble // either "double" or "WrappedDouble"
                  
                  int main() {
                      int N = 100000000;
                      doubleType* arr = (doubleType*)malloc(sizeof(doubleType)*N);
                      for (int i = 1; i < N; i++) {
                          arr[i] = arr[i - 1] + arr[i];
                      }
                  
                      free(arr);
                      printf("done\n");
                  
                      return 0;
                  }
                  

                  I thought that both variants would compile to the same thing: they do the same calculations, and everything is inlined.

                  However, they don't. The wrapped version produces a larger and slower result, regardless of optimisation level.

                  (This particular result is not significantly slower, but my actual use case includes more arithmetic.)

                  EDIT - I am aware that this isn't constructing my array elements. I thought this might produce less ASM so I could understand it better, but I can change it if it's a problem.

                  EDIT - I am also aware that I should be using new[]/delete[]. Unfortunately gcc refused to compile that, even though it was in a .cpp file. This was a symptom of my build system being screwed up, which is probably my actual problem.

                  EDIT - If I use g++ instead of gcc, it produces identical output.

                  EDIT - I posted the wrong version of the ASM (-O0 instead of -O3), so the assembly in this section isn't helpful.

                  I'm using Xcode's gcc on my Mac, on a 64-bit system. The result is the same for both variants, aside from the body of the for-loop.

                  Here's what it produces for the body of the loop if doubleType is double:

                  movq    -16(%rbp), %rax
                  movl    -20(%rbp), %ecx
                  subl    $1, %ecx
                  movslq  %ecx, %rdx
                  movsd   (%rax,%rdx,8), %xmm0    ## xmm0 = mem[0],zero
                  movq    -16(%rbp), %rax
                  movslq  -20(%rbp), %rdx
                  addsd   (%rax,%rdx,8), %xmm0
                  movq    -16(%rbp), %rax
                  movslq  -20(%rbp), %rdx
                  movsd   %xmm0, (%rax,%rdx,8)
                  

                  The WrappedDouble version is much longer:

                  movq    -40(%rbp), %rax
                  movl    -44(%rbp), %ecx
                  subl    $1, %ecx
                  movslq  %ecx, %rdx
                  shlq    $3, %rdx
                  addq    %rdx, %rax
                  movq    -40(%rbp), %rdx
                  movslq  -44(%rbp), %rsi
                  shlq    $3, %rsi
                  addq    %rsi, %rdx
                  movq    %rax, -16(%rbp)
                  movq    %rdx, -24(%rbp)
                  movq    -16(%rbp), %rax
                  movsd   (%rax), %xmm0           ## xmm0 = mem[0],zero
                  movq    -24(%rbp), %rax
                  addsd   (%rax), %xmm0
                  movsd   %xmm0, -8(%rbp)
                  movsd   -8(%rbp), %xmm0         ## xmm0 = mem[0],zero
                  movsd   %xmm0, -56(%rbp)
                  movq    -40(%rbp), %rax
                  movslq  -44(%rbp), %rdx
                  movq    -56(%rbp), %rsi
                  movq    %rsi, (%rax,%rdx,8)
                  

                  Recommended answer

                  Both versions result in identical assembly code with g++ and clang++ when you turn on optimizations with -O3.
