
      1. SSE2 intrinsics - comparing unsigned integers

        Date: 2023-12-02

                  Question

                  I'm interested in identifying overflowing values when adding unsigned 8-bit integers, and clamping the result to 0xFF:

                  __m128i m1 = _mm_loadu_si128(/* 16 8-bit unsigned integers */);
                  __m128i m2 = _mm_loadu_si128(/* 16 8-bit unsigned integers */);
                  
                  __m128i m3 = _mm_adds_epu8(m1, m2);
                  

                  I would like to perform a "less than" comparison on these unsigned integers, similar to what _mm_cmplt_epi8 does for signed ones:

                  __m128i mask = _mm_cmplt_epi8 (m3, m1);
                  m1 = _mm_or_si128(m3, mask);
                  

                  If an "epu8" equivalent were available, mask would hold 0xFF where m3[i] < m1[i] (overflow!) and 0x00 otherwise, and we could clamp using the "or", so that m1 holds the addition result where it is valid and 0xFF where it overflowed.

                  The problem is that _mm_cmplt_epi8 performs a signed comparison. For instance, if m1[i] = 0x70 and m2[i] = 0x10, then m3[i] = 0x80 and mask[i] = 0xFF, which is obviously not what I need.
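
                  The failure case described above can be reproduced with a small sketch. The helper name signed_lt_lane0 is hypothetical and exists only for illustration: it broadcasts two byte values to all lanes, runs the signed compare, and returns lane 0 of the mask.

```c
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 */

/* Hypothetical helper (not part of any library): broadcast a and b to all
   16 lanes, run the SIGNED byte compare a < b, return lane 0 of the mask. */
static uint8_t signed_lt_lane0(uint8_t a, uint8_t b)
{
    __m128i va   = _mm_set1_epi8((char)a);
    __m128i vb   = _mm_set1_epi8((char)b);
    __m128i mask = _mm_cmplt_epi8(va, vb);     /* signed comparison */
    return (uint8_t)_mm_cvtsi128_si32(mask);   /* low byte is lane 0 */
}
```

                  With a = 0x80 and b = 0x70 the helper returns 0xFF, because 0x80 is read as -128 and -128 < 112, even though 128 < 112 is false for unsigned values.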

                  Using VS2012.

                  I would appreciate another approach for doing this. Thanks!

                  Recommended answer

                  One way of implementing compares for unsigned 8-bit vectors is to exploit _mm_max_epu8, which returns the element-wise maximum of unsigned 8-bit integers. Compare the (unsigned) maximum of two elements for equality with one of the source elements: max(a, b) == a holds exactly when a >= b. This translates to 2 instructions for >= or <=, and 3 instructions for > or <, since the strict comparisons invert the result of the non-strict ones.
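
                  As a rough per-lane model of this idea, the following scalar sketch mirrors what each byte lane would compute. The function names (cmpge_u8 and friends) are made up for illustration and are not intrinsics.

```c
#include <stdint.h>

/* Scalar model of one byte lane; names are illustrative only. */

/* 0xFF if a >= b (unsigned), else 0x00: max(a, b) == a. */
static uint8_t cmpge_u8(uint8_t a, uint8_t b)
{
    uint8_t mx = (a > b) ? a : b;       /* per-lane result of _mm_max_epu8 */
    return (mx == a) ? 0xFF : 0x00;     /* per-lane result of _mm_cmpeq_epi8 */
}

/* a <= b is b >= a with the operands swapped. */
static uint8_t cmple_u8(uint8_t a, uint8_t b)
{
    return cmpge_u8(b, a);
}

/* a > b is NOT (a <= b): invert every bit of the mask. */
static uint8_t cmpgt_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(cmple_u8(a, b) ^ 0xFF);
}

/* a < b is b > a with the operands swapped. */
static uint8_t cmplt_u8(uint8_t a, uint8_t b)
{
    return cmpgt_u8(b, a);
}
```

                  Because there are only 256 x 256 input pairs, a model like this can be checked exhaustively against the plain C operators.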

                  Example code:

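                  #include <stdio.h>
                  #include <stdint.h>
                  #include <emmintrin.h>    // SSE2
                  
                  // a >= b (unsigned): max(a, b) == a
                  #define _mm_cmpge_epu8(a, b) \
                          _mm_cmpeq_epi8(_mm_max_epu8(a, b), a)
                  
                  // a <= b (unsigned): b >= a
                  #define _mm_cmple_epu8(a, b) _mm_cmpge_epu8(b, a)
                  
                  // a > b (unsigned): NOT (a <= b)
                  #define _mm_cmpgt_epu8(a, b) \
                          _mm_xor_si128(_mm_cmple_epu8(a, b), _mm_set1_epi8(-1))
                  
                  // a < b (unsigned): b > a
                  #define _mm_cmplt_epu8(a, b) _mm_cmpgt_epu8(b, a)
                  
                  // Print the 16 lanes as unsigned bytes. Portable stand-in for the
                  // "%4vhhu" vector conversion, which is a non-standard printf extension.
                  static void print_epu8(const char *name, __m128i v)
                  {
                      uint8_t bytes[16];
                      int i;
                  
                      _mm_storeu_si128((__m128i *)bytes, v);
                      printf("%-4s =", name);
                      for (i = 0; i < 16; i++)
                          printf(" %4u", (unsigned)bytes[i]);
                      printf("\n");
                  }
                  
                  int main(void)
                  {
                      __m128i va = _mm_setr_epi8(0,   0,   1,   1,   1, 127, 127, 127, 128, 128, 128, 254, 254, 254, 255, 255);
                      __m128i vb = _mm_setr_epi8(0, 255,   0,   1, 255,   0, 127, 255,   0, 128, 255,   0, 254, 255,   0, 255);
                  
                      __m128i v_ge = _mm_cmpge_epu8(va, vb);
                      __m128i v_le = _mm_cmple_epu8(va, vb);
                      __m128i v_gt = _mm_cmpgt_epu8(va, vb);
                      __m128i v_lt = _mm_cmplt_epu8(va, vb);
                  
                      print_epu8("va", va);
                      print_epu8("vb", vb);
                      print_epu8("v_ge", v_ge);
                      print_epu8("v_le", v_le);
                      print_epu8("v_gt", v_gt);
                      print_epu8("v_lt", v_lt);
                  
                      return 0;
                  }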
                  

                  Compile and run:

                  $ gcc -Wall _mm_cmplt_epu8.c && ./a.out 
                  va   =    0    0    1    1    1  127  127  127  128  128  128  254  254  254  255  255
                  vb   =    0  255    0    1  255    0  127  255    0  128  255    0  254  255    0  255
                  v_ge =  255    0  255  255    0  255  255    0  255  255    0  255  255    0  255  255
                  v_le =  255  255    0  255  255    0  255  255    0  255  255    0  255  255    0  255
                  v_gt =    0    0  255    0    0  255    0    0  255    0    0  255    0    0  255    0
                  v_lt =    0  255    0    0  255    0    0  255    0    0  255    0    0  255    0    0
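
                  To connect the answer back to the clamping problem in the question, here is a minimal sketch under a few assumptions: the function names cmplt_epu8 and add_clamp_epu8 are hypothetical, plain functions are used instead of macros so each argument is evaluated once, and the addition is done with the wrapping _mm_add_epi8 so that there is an overflow to detect.

```c
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 */

/* Hypothetical helper: unsigned a < b per byte.
   a < b  <=>  NOT (max(a, b) == a). */
static __m128i cmplt_epu8(__m128i a, __m128i b)
{
    __m128i ge = _mm_cmpeq_epi8(_mm_max_epu8(a, b), a);  /* a >= b */
    return _mm_xor_si128(ge, _mm_set1_epi8(-1));          /* invert mask */
}

/* Hypothetical helper: wrapping add, then force overflowed lanes to 0xFF. */
static __m128i add_clamp_epu8(__m128i m1, __m128i m2)
{
    __m128i sum  = _mm_add_epi8(m1, m2);   /* wraps modulo 256 */
    __m128i mask = cmplt_epu8(sum, m1);    /* 0xFF where sum < m1 (overflow) */
    return _mm_or_si128(sum, mask);        /* overflowed lanes become 0xFF */
}
```

                  Note that _mm_adds_epu8, as used in the question, is already a saturating add, so its result should match add_clamp_epu8 lane for lane; the explicit mask is mainly useful when the overflow positions themselves are needed.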
                  
