Need to transpose a pandas DataFrame

Date: 2023-10-19

Problem description

I have a DataFrame that looks like this:

      col1          id
 0      a           10
 1      b           20
 2      c           30
 3      b           10
 4      d           10
 5      a           30
 6      e           40

My desired output is this:

    a   b   c   d   e
10  1   1   0   1   0
20  0   1   0   0   0
30  1   0   1   0   0
40  0   0   0   0   1

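I tried this code:

import pandas as pd

# Helper column of ones to use as the pivoted values
df['dummies'] = 1
# pivot raises ValueError if the same (id, col1) pair appears more than once
df.pivot(index='id', columns='col1', values='dummies')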

I get an error:

    137 
    138         if mask.sum() < len(self.index):
--> 139             raise ValueError('Index contains duplicate entries, '
    140                              'cannot reshape')
    141 

ValueError: Index contains duplicate entries, cannot reshape

There are duplicate ids because multiple values in col1 can be attributed to a single id.
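
A note on the cause: in the seven sample rows shown above, every (id, col1) pair is unique, so the error most likely comes from the full data repeating at least one pair, not merely from an id appearing on several rows. The sketch below is a hypothetical reproduction. It assumes the sample frame plus one extra repeated row; that extra row and the names df_dup, sample_has_repeats and pivot_failed are illustrative and not part of the original data.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "col1": ["a", "b", "c", "b", "d", "a", "e"],
    "id": [10, 20, 30, 10, 10, 30, 40],
})

# Check whether any (id, col1) pair occurs more than once in the sample.
sample_has_repeats = bool(df.duplicated(subset=["id", "col1"]).any())

# Hypothetical extra row that repeats the existing pair (10, "a").
df_dup = pd.concat(
    [df, pd.DataFrame({"col1": ["a"], "id": [10]})],
    ignore_index=True,
)
df_dup["dummies"] = 1

try:
    df_dup.pivot(index="id", columns="col1", values="dummies")
    pivot_failed = False
except ValueError:
    # pivot cannot place two values into the same (id, col1) cell.
    pivot_failed = True
```

Running df.duplicated(subset=["id", "col1"]) on the real data should show which rows trigger the error.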

How can I achieve the desired output?

Thanks!

Recommended answer

You can use pd.crosstab:

In [329]: pd.crosstab(df.id, df.col1)
Out[329]:
col1  a  b  c  d  e
id
10    1  1  0  1  0
20    0  1  0  0  0
30    1  0  1  0  0
40    0  0  0  0  1
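
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch. It assumes the frame is built from the sample rows in the question. The variable names counts and indicator are illustrative, and the clip step is an optional addition that is not part of the original answer.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "col1": ["a", "b", "c", "b", "d", "a", "e"],
    "id": [10, 20, 30, 10, 10, 30, 40],
})

# Count how often each col1 value occurs for each id.
counts = pd.crosstab(df["id"], df["col1"])

# Optional: cap every count at 1 to get a pure 0/1 indicator table.
# This only changes anything when an (id, col1) pair repeats.
indicator = counts.clip(upper=1)
```

crosstab returns counts, so on data with repeated (id, col1) pairs a cell can exceed 1. The clip step is one way to bring it back to the 0/1 layout in the question.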

Alternatively, use pivot_table:

In [336]: df.pivot_table(index='id', columns='col1', aggfunc=len, fill_value=0)
Out[336]:
col1  a  b  c  d  e
id
10    1  1  0  1  0
20    0  1  0  0  0
30    1  0  1  0  0
40    0  0  0  0  1
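
A variation that stays closer to the attempt in the question is sketched below. It is not the same call as above: it assumes the dummies helper column from the question is kept and summed explicitly. Unlike pivot, pivot_table aggregates repeated (id, col1) pairs instead of raising.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "col1": ["a", "b", "c", "b", "d", "a", "e"],
    "id": [10, 20, 30, 10, 10, 30, 40],
})

# Helper column of ones, as in the question's attempt.
df["dummies"] = 1

# Sum the ones per (id, col1) cell; cells with no rows are filled with 0.
table = df.pivot_table(
    index="id",
    columns="col1",
    values="dummies",
    aggfunc="sum",
    fill_value=0,
)
```

Depending on the pandas version, the resulting cells may be stored as integers or floats; the values are the same either way.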

Alternatively, use groupby and unstack:

In [339]: df.groupby(['id', 'col1']).size().unstack(fill_value=0)
Out[339]:
col1  a  b  c  d  e
id
10    1  1  0  1  0
20    0  1  0  0  0
30    1  0  1  0  0
40    0  0  0  0  1
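
The one-liner above chains two steps, which the following sketch separates to show the intermediate result. It assumes the sample frame from the question; pair_counts and table are illustrative names.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "col1": ["a", "b", "c", "b", "d", "a", "e"],
    "id": [10, 20, 30, 10, 10, 30, 40],
})

# Step 1: one count per (id, col1) pair, as a Series with a two-level index.
pair_counts = df.groupby(["id", "col1"]).size()

# Step 2: move the inner index level (col1) into the columns;
# pairs that never occur are filled with 0 instead of NaN.
table = pair_counts.unstack(fill_value=0)
```

On the sample data this should give the same table as pd.crosstab(df["id"], df["col1"]).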
