      RGBA to ABGR: inline ARM NEON asm for iOS/Xcode

        Date: 2023-06-11


                  Question

                  This code (or very similar code; I haven't tried exactly this version) compiles with the Android NDK, but not with Xcode when targeting armv7+arm64 on iOS.

                  The errors are shown in the comments:

                  uint32_t *src;
                  uint32_t *dst;

                  #ifdef __ARM_NEON
                  __asm__ volatile(
                      "vld1.32 {d0, d1}, [%[src]] \n" // error: Vector register expected
                      "vrev32.8 q0, q0            \n" // error: Unrecognized instruction mnemonic
                      "vst1.32 {d0, d1}, [%[dst]] \n" // error: Vector register expected
                      :
                      : [src]"r"(src), [dst]"r"(dst)
                      : "d0", "d1"
                      );
                  #endif
                  

                  What's wrong with this code?

                  I rewrote the code using intrinsics:

                  uint8x16_t x = vreinterpretq_u8_u32(vld1q_u32(src));
                  uint8x16_t y = vrev32q_u8(x);
                  vst1q_u32(dst, vreinterpretq_u32_u8(y));
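
For reference, here is what the byte reversal amounts to for a single pixel. vrev32q_u8 (vrev32.8 in assembly) reverses the four bytes inside each 32-bit lane, which is what turns R,G,B,A in memory into A,B,G,R. The scalar sketch below is only an illustration: the helper names reverse_bytes32 and pixel_rgba_to_abgr are hypothetical and do not come from the original post.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reverse the four bytes of a 32-bit value.
   This is the per-lane effect of vrev32.8 / vrev32q_u8. */
static uint32_t reverse_bytes32(uint32_t p)
{
    return (p >> 24) |
           ((p >> 8) & 0x0000FF00u) |
           ((p << 8) & 0x00FF0000u) |
           (p << 24);
}

/* Hypothetical helper: convert one pixel stored as R,G,B,A bytes
   into A,B,G,R bytes, in place. memcpy avoids alignment and
   aliasing problems; the result does not depend on endianness,
   because reversing the value reverses the bytes in memory. */
static void pixel_rgba_to_abgr(uint8_t px[4])
{
    uint32_t v;
    memcpy(&v, px, sizeof v);
    v = reverse_bytes32(v);
    memcpy(px, &v, sizeof v);
}
```

The NEON version does the same thing to four pixels (16 bytes) at a time.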
                  

                  Disassembling that gives the following, which is a variation I had already tried:

                  vld1.32 {d16, d17}, [r0]!
                  vrev32.8    q8, q8
                  vst1.32 {d16, d17}, [r1]!
                  

                  So my code now looks like this, but it gives exactly the same errors:

                  __asm__ volatile("vld1.32 {d0, d1}, [%0]! \n"
                                   "vrev32.8 q0, q0         \n"
                                   "vst1.32 {d0, d1}, [%1]! \n"
                                   :
                                   : "r"(src), "r"(dst)
                                   : "d0", "d1"
                                   );
                  

                  Reading through the disassembly, I found a second version of the function. It turns out that arm64 uses a slightly different instruction set; for example, the arm64 assembly uses rev32.16b v0, v0 instead of vrev32.8. The whole function listing (which I can't make heads or tails of) is below:

                  _My_Function:
                  cmp     w2, #0
                  add w9, w2, #3
                  csel    w8, w9, w2, lt
                  cmp     w9, #7
                  b.lo    0x3f4
                  asr w9, w8, #2
                  ldr     x8, [x0]
                  mov  w9, w9
                  lsl x9, x9, #2
                  ldr q0, [x8], #16
                  rev32.16b   v0, v0
                  str q0, [x1], #16
                  sub x9, x9, #16
                  cbnz    x9, 0x3e0
                  ret
                  

                  Recommended answer

                  As stated in the edits to the original question, it turned out that I needed different assembly implementations for arm64 and armv7.

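                  #ifdef __ARM_NEON
                    #if __LP64__
                  // arm64: post-indexed ldr/str advance src and dst by 16 bytes
                  asm volatile("ldr q0, [%0], #16  \n"
                               "rev32.16b v0, v0   \n"
                               "str q0, [%1], #16  \n"
                               : "+r"(src), "+r"(dst) // both pointers are read and updated
                               :
                               : "v0", "memory"
                               );
                    #else
                  // armv7: the "!" writeback advances src and dst by 16 bytes
                  asm volatile("vld1.32 {d0, d1}, [%0]! \n"
                               "vrev32.8 q0, q0         \n"
                               "vst1.32 {d0, d1}, [%1]! \n"
                               : "+r"(src), "+r"(dst) // both pointers are read and updated
                               :
                               : "d0", "d1", "memory"
                               );
                    #endif
                  #else
                  // non-NEON fallback (not shown in the original answer)
                  #endif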
                  

                  The intrinsics code that I posted in the original question generated surprisingly good assembly, though, and it also produced the arm64 version for me, so using intrinsics may be the better idea in the future.
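
As a sketch of the intrinsics route, the three intrinsic calls from the question can be wrapped in a function that converts a whole buffer. The function name rgba_to_abgr, its signature, and the scalar tail loop are assumptions added for illustration; only the vld1q_u32 / vrev32q_u8 / vst1q_u32 sequence comes from the original post. The same source builds for armv7 and arm64, and falls back to plain C where NEON is unavailable.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: reverse the bytes of each 32-bit pixel
   (RGBA -> ABGR) for `count` pixels. src and dst may be the same
   buffer, but must not partially overlap. */
static void rgba_to_abgr(const uint32_t *src, uint32_t *dst, size_t count)
{
    size_t i = 0;

#if defined(__ARM_NEON)
    /* NEON path: four pixels (16 bytes) per iteration. */
    for (; i + 4 <= count; i += 4) {
        uint8x16_t x = vreinterpretq_u8_u32(vld1q_u32(src + i));
        uint8x16_t y = vrev32q_u8(x);
        vst1q_u32(dst + i, vreinterpretq_u32_u8(y));
    }
#endif

    /* Scalar path: handles the remaining 0..3 pixels, or the whole
       buffer on targets without NEON. */
    for (; i < count; ++i) {
        uint32_t p = src[i];
        dst[i] = (p >> 24) |
                 ((p >> 8) & 0x0000FF00u) |
                 ((p << 8) & 0x00FF0000u) |
                 (p << 24);
    }
}
```

Unlike the inline assembly, this leaves register allocation and the choice between the armv7 and arm64 encodings to the compiler, and it copes with pixel counts that are not a multiple of four.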
