
        What are these seemingly useless callq instructions in my x86 object files for?

        Date: 2023-12-02

                  Question

                  I have some template-heavy C++ code, and I want to make sure the compiler optimizes it as much as possible, given how much information it has at compile time. To evaluate this, I decided to look at the disassembly of the object file it generates. Below is a snippet of what I got from objdump -dC:

                  0000000000000000 <bar<foo, 0u>::get(bool)>:
                     0:   41 57                   push   %r15
                     2:   49 89 f7                mov    %rsi,%r15
                     5:   41 56                   push   %r14
                     7:   41 55                   push   %r13
                     9:   41 54                   push   %r12
                     b:   55                      push   %rbp
                     c:   53                      push   %rbx
                     d:   48 81 ec 68 02 00 00    sub    $0x268,%rsp
                    14:   48 89 7c 24 10          mov    %rdi,0x10(%rsp)
                    19:   48 89 f7                mov    %rsi,%rdi
                    1c:   89 54 24 1c             mov    %edx,0x1c(%rsp)
                    20:   e8 00 00 00 00          callq  25 <bar<foo, 0u>::get(bool)+0x25>
                    25:   84 c0                   test   %al,%al
                    27:   0f 85 eb 00 00 00       jne    118 <bar<foo, 0u>::get(bool)+0x118>
                    2d:   48 c7 44 24 08 00 00    movq   $0x0,0x8(%rsp)
                    34:   00 00 
                    36:   4c 89 ff                mov    %r15,%rdi
                    39:   4d 8d b7 30 01 00 00    lea    0x130(%r15),%r14
                    40:   e8 00 00 00 00          callq  45 <bar<foo, 0u>::get(bool)+0x45>
                    45:   84 c0                   test   %al,%al
                    47:   88 44 24 1b             mov    %al,0x1b(%rsp)
                    4b:   0f 85 ef 00 00 00       jne    140 <bar<foo, 0u>::get(bool)+0x140>
                    51:   80 7c 24 1c 00          cmpb   $0x0,0x1c(%rsp)
                    56:   0f 85 24 03 00 00       jne    380 <bar<foo, 0u>::get(bool)+0x380>
                    5c:   48 8b 44 24 10          mov    0x10(%rsp),%rax
                    61:   c6 00 00                movb   $0x0,(%rax)
                    64:   80 7c 24 1b 00          cmpb   $0x0,0x1b(%rsp)
                    69:   75 25                   jne    90 <bar<foo, 0u>::get(bool)+0x90>
                    6b:   48 8b 74 24 10          mov    0x10(%rsp),%rsi
                    70:   4c 89 ff                mov    %r15,%rdi
                    73:   e8 00 00 00 00          callq  78 <bar<foo, 0u>::get(bool)+0x78>
                    78:   48 8b 44 24 10          mov    0x10(%rsp),%rax
                    7d:   48 81 c4 68 02 00 00    add    $0x268,%rsp
                    84:   5b                      pop    %rbx
                    85:   5d                      pop    %rbp
                    86:   41 5c                   pop    %r12
                    88:   41 5d                   pop    %r13
                    8a:   41 5e                   pop    %r14
                    8c:   41 5f                   pop    %r15
                    8e:   c3                      retq   
                    8f:   90                      nop
                    90:   4c 89 f7                mov    %r14,%rdi
                    93:   e8 00 00 00 00          callq  98 <bar<foo, 0u>::get(bool)+0x98>
                    98:   83 f8 04                cmp    $0x4,%eax
                    9b:   74 f3                   je     90 <bar<foo, 0u>::get(bool)+0x90>
                    9d:   85 c0                   test   %eax,%eax
                    9f:   0f 85 e4 08 00 00       jne    989 <bar<foo, 0u>::get(bool)+0x989>
                    a5:   49 83 87 b0 01 00 00    addq   $0x1,0x1b0(%r15)
                    ac:   01 
                    ad:   49 8d 9f 58 01 00 00    lea    0x158(%r15),%rbx
                    b4:   48 89 df                mov    %rbx,%rdi
                    b7:   e8 00 00 00 00          callq  bc <bar<foo, 0u>::get(bool)+0xbc>
                    bc:   49 8d bf 80 01 00 00    lea    0x180(%r15),%rdi
                    c3:   e8 00 00 00 00          callq  c8 <bar<foo, 0u>::get(bool)+0xc8>
                    c8:   48 89 df                mov    %rbx,%rdi
                    cb:   e8 00 00 00 00          callq  d0 <bar<foo, 0u>::get(bool)+0xd0>
                    d0:   4c 89 f7                mov    %r14,%rdi
                    d3:   e8 00 00 00 00          callq  d8 <bar<foo, 0u>::get(bool)+0xd8>
                    d8:   83 f8 04                cmp    $0x4,%eax
                  

                  The disassembly of this particular function continues, but one thing I noticed is the relatively large number of call instructions like this one:

                  20:   e8 00 00 00 00          callq  25 <bar<foo, 0u>::get(bool)+0x25>
                  

                  These instructions, always encoded as e8 00 00 00 00, occur frequently throughout the generated code and, from what I can tell, are nothing more than no-ops; they all seem to just fall through to the next instruction. That raises the question: is there a good reason why all these instructions are generated?

                  I'm concerned about the instruction cache footprint of the generated code, so wasting 5 bytes many times throughout a function seems counterproductive. It seems a bit heavyweight for a nop, unless the compiler is trying to preserve some kind of memory alignment or something similar. I wouldn't be surprised if that were the case.

                  I compiled my code with g++ 4.8.5 using -O3 -fomit-frame-pointer. For what it's worth, I saw similar code generation with clang 3.7.

                  Recommended answer

                  The 00 00 00 00 in e8 00 00 00 00 is the (relative) target address, and it is intended to be filled in by the linker. It doesn't mean that the call falls through. It just means you are disassembling an object file that has not been linked yet.
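To make the placeholder concrete, here is a minimal sketch of how a `call rel32` destination is computed and how a linker-style patch changes it. The function names `call_target` and `patch_call` and the callee address `0x1000` are hypothetical, chosen only for illustration. The patch formula assumes a PC-relative relocation of the form S + A - P with an addend of -4, which is what x86-64 ELF toolchains typically emit for such calls (R_X86_64_PC32 or R_X86_64_PLT32). Running `objdump -dCr` instead of `objdump -dC` should list those relocation entries next to each call.

```cpp
#include <cassert>
#include <cstdint>

// Decode the destination of a 5-byte `call rel32` (opcode 0xE8) whose first
// byte sits at address insn_addr. The 32-bit displacement is little-endian
// and is relative to the address of the NEXT instruction (insn_addr + 5).
std::uint64_t call_target(const std::uint8_t insn[5], std::uint64_t insn_addr) {
    assert(insn[0] == 0xE8);
    const std::uint32_t raw = std::uint32_t(insn[1]) |
                              std::uint32_t(insn[2]) << 8 |
                              std::uint32_t(insn[3]) << 16 |
                              std::uint32_t(insn[4]) << 24;
    const std::int32_t rel32 = static_cast<std::int32_t>(raw);
    return insn_addr + 5 +
           static_cast<std::uint64_t>(static_cast<std::int64_t>(rel32));
}

// Sketch of what a linker does for a PC-relative relocation: store
// S + A - P into the displacement field, where S is the callee address,
// P is the address of the field itself, and A is the addend (assumed -4).
void patch_call(std::uint8_t insn[5], std::uint64_t insn_addr,
                std::uint64_t callee_addr) {
    const std::uint64_t P = insn_addr + 1;  // the field starts after the opcode
    const std::int64_t A = -4;              // assumed addend for call rel32
    const std::uint32_t value = static_cast<std::uint32_t>(
        callee_addr + static_cast<std::uint64_t>(A) - P);
    for (int i = 0; i < 4; ++i) {
        insn[1 + i] = static_cast<std::uint8_t>(value >> (8 * i));
    }
}
```

With a zero displacement, the call at offset 0x20 decodes to 0x25, which is exactly the `callq 25` that objdump prints for the unlinked object file. After `patch_call` the same five bytes decode to the real callee instead.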

                  Also, a call to the next instruction, if that were the end result after the link phase, would not be a no-op, because it pushes a return address and therefore changes the stack (a strong hint that this is not what is going on in your case).
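The difference from a real no-op can be illustrated with a toy model. This is only a sketch: the `Cpu` struct and the `exec_call` and `exec_nop5` helpers are hypothetical names that model nothing beyond the instruction pointer and the stack of pushed return addresses.

```cpp
#include <cstdint>
#include <vector>

// Toy machine state: only the instruction pointer and the return addresses
// currently on the stack are modeled.
struct Cpu {
    std::uint64_t rip;
    std::vector<std::uint64_t> stack;
};

// Architectural effect of a 5-byte `call rel32`: push the address of the
// next instruction, then jump to next + rel32.
void exec_call(Cpu& cpu, std::int32_t rel32) {
    const std::uint64_t next = cpu.rip + 5;
    cpu.stack.push_back(next);
    cpu.rip = next + static_cast<std::uint64_t>(static_cast<std::int64_t>(rel32));
}

// A 5-byte nop only advances the instruction pointer.
void exec_nop5(Cpu& cpu) {
    cpu.rip += 5;
}
```

Starting both from offset 0x20, `exec_call(cpu, 0)` and `exec_nop5(cpu)` end at the same instruction pointer, 0x25, but only the call leaves an extra return address on the stack. A later `ret` would pop that address rather than the caller's, so a compiler would not use this encoding as padding.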
