                  pandas groupby and filter

                  Date: 2023-08-29

                  Question

                  I have a DataFrame:

                  import pandas as pd

                  df = pd.DataFrame({'ID': [1, 1, 2, 2, 3, 3],
                                     'YEAR': [2011, 2012, 2012, 2013, 2013, 2014],
                                     'V': [0, 1, 1, 0, 1, 0],
                                     'C': [0, 11, 22, 33, 44, 55]})
                  

                  I would like to group by ID and select the rows with V = 0 within each group.

                  This does not seem to work:

                  print(df.groupby(['ID']).filter(lambda x: x['V'] == 0)) 
                  

                  It raises an error:

                  TypeError: filter function returned a Series, but expected a scalar bool

                  How can I use filter to achieve this? Thank you.

                  EDIT: The condition on V may differ for each group, e.g. V == 0 for ID 1 and V == 1 for ID 2. This information is available in another DataFrame:

                  # One row per ID, giving the V value to select for that ID
                  df = pd.DataFrame({'ID': [1, 2, 3],
                                     'V': [0, 1, 0]})
                  

                  So how can I filter rows within each group?

                  Recommended answer

                  I think groupby is not necessary here. If you only need all rows where V is 0, use boolean indexing:

                  print(df[df.V == 0])
                     ID  YEAR  V   C
                  0   1  2011  0   0
                  3   2  2013  0  33
                  5   3  2014  0  55
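To make the boolean indexing step concrete, here is a minimal sketch using the sample data from the question. The variable names `mask`, `by_mask` and `by_query` are chosen for illustration only; `DataFrame.query` is shown as an alternative spelling of the same row selection.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({'ID': [1, 1, 2, 2, 3, 3],
                   'YEAR': [2011, 2012, 2012, 2013, 2013, 2014],
                   'V': [0, 1, 1, 0, 1, 0],
                   'C': [0, 11, 22, 33, 44, 55]})

# A boolean Series with one True/False entry per row.
mask = df['V'] == 0

# Indexing with the mask keeps only the rows where it is True.
by_mask = df[mask]

# The same selection written as a query string.
by_query = df.query('V == 0')

print(by_mask)
```

Both selections keep the original row labels, so the result can still be aligned with `df` afterwards.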
                  

                  But if you need to return all groups in which at least one value of column V equals 0, add any(), because filter needs a single True or False per group to keep or drop all rows of that group:

                  print(df.groupby(['ID']).filter(lambda x: (x['V'] == 0).any()))
                     ID  YEAR  V   C
                  0   1  2011  0   0
                  1   1  2012  1  11
                  2   1  2012  1  22
                  3   2  2013  0  33
                  4   3  2013  1  44
                  5   3  2014  0  55
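The scalar returned by the function decides the fate of the whole group. As a rough illustration, the sketch below groups by YEAR (so that the groups differ) and contrasts `any()` with `all()`; the names `has_zero` and `only_zero` are made up for this example.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({'ID': [1, 1, 2, 2, 3, 3],
                   'YEAR': [2011, 2012, 2012, 2013, 2013, 2014],
                   'V': [0, 1, 1, 0, 1, 0],
                   'C': [0, 11, 22, 33, 44, 55]})

grouped = df.groupby('YEAR')

# Keep a group if at least one of its rows has V == 0.
has_zero = grouped.filter(lambda g: (g['V'] == 0).any())

# Keep a group only if every one of its rows has V == 0.
only_zero = grouped.filter(lambda g: (g['V'] == 0).all())

print(has_zero.index.tolist())
print(only_zero.index.tolist())
```

In both cases the function reduces the per-group Series to one boolean, which is what avoids the TypeError from the question.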
                  

                  A better test is to group by a different column. Here the rows with YEAR 2012 are filtered out because none of them has V == 0:

                  print(df.groupby(['YEAR']).filter(lambda x: (x['V'] == 0).any()))
                     ID  YEAR  V   C
                  0   1  2011  0   0
                  3   2  2013  0  33
                  4   3  2013  1  44
                  5   3  2014  0  55
                  

                  If performance is important, use GroupBy.transform with boolean indexing:

                  print(df[(df['V'] == 0).groupby(df['YEAR']).transform('any')]) 
                     ID  YEAR  V   C
                  0   1  2011  0   0
                  3   2  2013  0  33
                  4   3  2013  1  44
                  5   3  2014  0  55
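As a quick sanity check, the following sketch compares the filter version with the transform version on the sample data. It only checks that both select the same rows; it does not measure speed, and any performance difference would depend on the data size and the pandas version.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({'ID': [1, 1, 2, 2, 3, 3],
                   'YEAR': [2011, 2012, 2012, 2013, 2013, 2014],
                   'V': [0, 1, 1, 0, 1, 0],
                   'C': [0, 11, 22, 33, 44, 55]})

# Approach 1: a Python function is called once per group.
by_filter = df.groupby('YEAR').filter(lambda g: (g['V'] == 0).any())

# Approach 2: build one boolean mask aligned with df's index, then index with it.
mask = (df['V'] == 0).groupby(df['YEAR']).transform('any')
by_mask = df[mask]

# Both approaches should keep the same row labels.
print(by_filter.index.equals(by_mask.index))
```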
                  

                  Details:

                  print((df['V'] == 0).groupby(df['YEAR']).transform('any')) 
                  0     True
                  1    False
                  2    False
                  3     True
                  4     True
                  5     True
                  Name: V, dtype: bool
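The answer above does not cover the EDIT in the question, where the required V value differs per ID. One possible approach, sketched here under the assumption that the second DataFrame holds exactly one row per ID, is to look up the wanted V for each row and compare. The names `wanted`, `wanted_v`, `selected` and `merged` are hypothetical and not part of the original thread.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({'ID': [1, 1, 2, 2, 3, 3],
                   'YEAR': [2011, 2012, 2012, 2013, 2013, 2014],
                   'V': [0, 1, 1, 0, 1, 0],
                   'C': [0, 11, 22, 33, 44, 55]})

# Lookup table from the question's EDIT: the V value to keep for each ID.
# Assumption: each ID appears exactly once in this table.
wanted = pd.DataFrame({'ID': [1, 2, 3],
                       'V': [0, 1, 0]})

# Option 1: map each row's ID to its wanted V and compare.
# This keeps the original row labels of df.
wanted_v = df['ID'].map(wanted.set_index('ID')['V'])
selected = df[df['V'] == wanted_v]

# Option 2: an inner merge on both columns keeps only rows whose
# (ID, V) pair appears in the lookup table. The result gets a new 0..n-1 index.
merged = df.merge(wanted, on=['ID', 'V'], how='inner')

print(selected)
```

The map-based version is convenient when the original index matters later; the merge-based version reads more directly when the lookup table has several key columns.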
                  
