        Compare two dataframes of different length row by row and add columns for each row with equal values

        Date: 2023-11-07

                  Problem description

                  I have two dataframes of different length in Python pandas, like this:

                  df1:                                 df2:
                  
                        Column1  Column2 Column3            ColumnA ColumnB 
                      0    1       a       r              0    1       a
                      1    2       b       u              1    1       d
                      2    3       c       k              2    1       e
                      3    4       d       j              3    2       r
                      4    5       e       f              4    2       w
                                                          5    3       y 
                                                          6    3       h
                  

                  What I want to do is compare Column1 of df1 with ColumnA of df2. For each "hit", that is, each row of df2 whose ColumnA value equals the Column1 value of a row in df1, I want to append a column to df1 holding the ColumnB value from the df2 row where the hit was found, so that the result looks like this:

                  df1:
                  
                     Column1  Column2  Column3  Column4 Column5  Column6
                  0     1        a        r        a       d        e
                  1     2        b        u        r       w
                  2     3        c        k        y       h
                  3     4        d        j
                  4     5        e        f
                  

                  What I have tried so far is:

                  for row in df1, df2:
                      if df1['Column1'] == df2['ColumnA']:
                          print('yey!')
                  

                  This gave me an error saying that I could not compare two dataframes of different length. So I tried:

                  for row in df1, df2:
                      if df2[df2['ColumnA'].isin(df1['Column1'])]:
                          print('lalala')
                      else:
                          print('Nope')
                  

                  This "works" in the sense that I get output, but I do not think it iterates over the rows and compares them, since it only prints 'lalala' twice. After some more research I found a way to iterate over each row of a dataframe:
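A likely reason the message appears exactly twice: `for row in df1, df2` loops over a two-item tuple holding the frames themselves, not over their rows. The sketch below, built on the sample data from the question, shows that, and also what `isin` returns: one boolean per row of the calling column, which is the row-wise "hit" test being looked for. The names `loop_items` and `hits` are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame({'Column1': [1, 2, 3, 4, 5],
                    'Column2': ['a', 'b', 'c', 'd', 'e'],
                    'Column3': ['r', 'u', 'k', 'j', 'f']})
df2 = pd.DataFrame({'ColumnA': [1, 1, 1, 2, 2, 3, 3],
                    'ColumnB': ['a', 'd', 'e', 'r', 'w', 'y', 'h']})

# The loop variable takes exactly two values: df1 itself, then df2 itself.
loop_items = [type(item).__name__ for item in (df1, df2)]
print(loop_items)

# isin works on columns of different length: it returns one boolean per
# row of df1['Column1'], True where the value also occurs in df2['ColumnA'].
hits = df1['Column1'].isin(df2['ColumnA'])
print(hits.tolist())
```

Rows 0 to 2 of df1 are the ones with at least one hit in df2; rows 3 and 4 have none.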

                  for index, row in df1.iterrows():
                      print(row['Column1'])
                  

                  But I do not know how to use this to compare the columns of the two dataframes and get the output I want.
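For illustration, here is one way the `iterrows` loop could be extended to collect the hits for each row of df1. This is only a sketch on the question's sample data, not the approach taken in the answer; `hits_per_row` and `matches` are names made up for this example.

```python
import pandas as pd

df1 = pd.DataFrame({'Column1': [1, 2, 3, 4, 5],
                    'Column2': ['a', 'b', 'c', 'd', 'e'],
                    'Column3': ['r', 'u', 'k', 'j', 'f']})
df2 = pd.DataFrame({'ColumnA': [1, 1, 1, 2, 2, 3, 3],
                    'ColumnB': ['a', 'd', 'e', 'r', 'w', 'y', 'h']})

hits_per_row = {}
for index, row in df1.iterrows():
    # Select the ColumnB values of every df2 row whose ColumnA equals
    # the Column1 value of the current df1 row.
    matches = df2.loc[df2['ColumnA'] == row['Column1'], 'ColumnB']
    hits_per_row[int(row['Column1'])] = matches.tolist()

print(hits_per_row)
```

Looping row by row is easy to follow but slow on large frames, which is one reason to prefer grouping and merging.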

                  Any help on how to do this would be much appreciated.

                  Recommended answer

                  I recommend using the DataFrame API, which lets you work with data frames through operations such as join, merge, and groupby. My solution is below:

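                  import pandas as pd

                  df1 = pd.DataFrame({'Column1': [1, 2, 3, 4, 5],
                                      'Column2': ['a', 'b', 'c', 'd', 'e'],
                                      'Column3': ['r', 'u', 'k', 'j', 'f']})

                  # The key column of df2 is named 'Column1' here (ColumnA in the
                  # question) so that both frames can be merged on the same name.
                  df2 = pd.DataFrame({'Column1': [1, 1, 1, 2, 2, 3, 3],
                                      'ColumnB': ['a', 'd', 'e', 'r', 'w', 'y', 'h']})

                  buffers = []
                  for name, group in df2.groupby('Column1'):
                      # One-row frame holding the key of this group.
                      buffer_df = pd.DataFrame({'Column1': group['Column1'][:1]})
                      # One new column per ColumnB value: Column_0, Column_1, ...
                      for i, value in enumerate(group['ColumnB']):
                          buffer_df['Column_' + str(i)] = value
                      buffers.append(buffer_df)

                  # DataFrame.append was removed in pandas 2.0; pd.concat replaces it.
                  dfs = pd.concat(buffers)

                  # Left join keeps every row of df1, with NaN where there is no hit.
                  result = pd.merge(df1, dfs, how='left', on='Column1')
                  print(result)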
                  

                  The result is:

                     Column1 Column2 Column3 Column_0 Column_1 Column_2
                  0        1       a       r        a        d        e
                  1        2       b       u        r        w      NaN
                  2        3       c       k        y        h      NaN
                  3        4       d       j      NaN      NaN      NaN
                  4        5       e       f      NaN      NaN      NaN
                  

                  P.S. More details:

                  1) For df2, I build groups by 'Column1'. Each group is itself a data frame, for example:

                     Column1 ColumnB
                  0        1       a
                  1        1       d
                  2        1       e
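To see every group rather than only the first, the grouping step can be run on its own. The sketch below assumes the same df2 as in the solution; `group_sizes` is a helper name introduced here.

```python
import pandas as pd

df2 = pd.DataFrame({'Column1': [1, 1, 1, 2, 2, 3, 3],
                    'ColumnB': ['a', 'd', 'e', 'r', 'w', 'y', 'h']})

group_sizes = {}
for name, group in df2.groupby('Column1'):
    # name is the shared Column1 value; group is the sub-frame of df2
    # containing the rows with that value, with their original index.
    print(name)
    print(group)
    group_sizes[int(name)] = len(group)

print(group_sizes)
```

The size of the largest group determines how many Column_i columns the final result needs.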
                  

                  2) For each group, I build the data frame buffer_df:

                     Column1 Column_0 Column_1 Column_2
                  0        1        a        d        e
                  

                  3) After that, I combine the buffer frames into the data frame dfs:

                     Column1 Column_0 Column_1 Column_2
                  0        1        a        d        e
                  3        2        r        w      NaN
                  5        3        y        h      NaN
                  

                  4) Finally, I left-join df1 with dfs to obtain the desired result.
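The four steps above can probably also be written without an explicit Python loop. The following is a sketch of an alternative technique, not the answer's method: it numbers the hits within each key using `cumcount`, reshapes them into wide form with `pivot`, and then does the same left join. It keeps the question's original column name `ColumnA`; `position` and `wide` are names chosen for this example.

```python
import pandas as pd

df1 = pd.DataFrame({'Column1': [1, 2, 3, 4, 5],
                    'Column2': ['a', 'b', 'c', 'd', 'e'],
                    'Column3': ['r', 'u', 'k', 'j', 'f']})
df2 = pd.DataFrame({'ColumnA': [1, 1, 1, 2, 2, 3, 3],
                    'ColumnB': ['a', 'd', 'e', 'r', 'w', 'y', 'h']})

# Number the rows inside each ColumnA group: 0, 1, 2, ...
position = df2.groupby('ColumnA').cumcount()

# Reshape to one row per key and one column per position.
wide = (df2.assign(position=position)
           .pivot(index='ColumnA', columns='position', values='ColumnB')
           .add_prefix('Column_')
           .reset_index())
wide.columns.name = None  # drop the leftover 'position' axis label

# Left join keeps df1 rows without a hit; their new columns are NaN.
result = (df1.merge(wide, how='left', left_on='Column1', right_on='ColumnA')
             .drop(columns='ColumnA'))
print(result)
```

On the sample data this gives the same columns as the loop-based solution, Column_0 to Column_2.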

                  2)* buffer_df is built up iteratively:

                  step0 (buffer_df = pd.DataFrame({'Column1': group['Column1'][:1]})):
                              Column1
                           5       3
                  
                  step1 (buffer_df['Column_0'] = group['ColumnB'][5]):      
                              Column1 Column_0
                           5       3       y
                  
                  step2 (buffer_df['Column_1'] = group['ColumnB'][6]):
                              Column1 Column_0 Column_1
                           5       3       y       h
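The trace above can be reproduced for the group with key 3 as follows. This is a small sketch that assumes the df2 from the solution; `snapshots` is a made-up name used to record the columns after each step.

```python
import pandas as pd

df2 = pd.DataFrame({'Column1': [1, 1, 1, 2, 2, 3, 3],
                    'ColumnB': ['a', 'd', 'e', 'r', 'w', 'y', 'h']})

# The group for key 3 keeps its original row labels, 5 and 6.
group = df2[df2['Column1'] == 3]

# step0: a one-row frame holding only the key, labelled 5.
buffer_df = pd.DataFrame({'Column1': group['Column1'][:1]})

snapshots = []
for i, (index, value) in enumerate(group['ColumnB'].items()):
    # step1 adds Column_0 from row 5, step2 adds Column_1 from row 6.
    buffer_df['Column_' + str(i)] = value
    snapshots.append(list(buffer_df.columns))
    print(buffer_df)
```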
                  
