I have a dataframe like the one shown below:
import pandas as pd

# Create an example dataframe about a fictional army
raw_data = {'regiment': ['Nighthawks', 'Nighthawks', 'Nighthawks', 'Nighthawks'],
            'company': ['1st', '1st', '2nd', '2nd'],
            'deaths': ['kkk', 52, '25', 616],
            'battles': [5, '42', 2, 2],
            'size': ['l', 'll', 'l', 'm']}
df = pd.DataFrame(raw_data, columns=['regiment', 'company', 'deaths', 'battles', 'size'])
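My goal is to convert every string inside the dataframe to upper case, so that it looks like this:
Note: all columns have dtype object and that must not change; the output must contain only object columns. I want to avoid converting the columns one by one; I would like to do it generically over the whole dataframe.
This is what I have tried so far, without success: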
df.str.upper()
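Solution:
df.str.upper() fails because the .str accessor is defined on Series, not on DataFrame. Instead, use apply() to process the dataframe column by column: astype(str) converts every value in the column to a string, and the .str accessor then lets you call upper() on each of those strings. Note that after this, all columns hold strings and have dtype object.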
In [17]: df
Out[17]:
     regiment company deaths battles size
0  Nighthawks     1st    kkk       5    l
1  Nighthawks     1st     52      42   ll
2  Nighthawks     2nd     25       2    l
3  Nighthawks     2nd    616       2    m
In [18]: df.apply(lambda x: x.astype(str).str.upper())
Out[18]:
     regiment company deaths battles size
0  NIGHTHAWKS     1ST    KKK       5    L
1  NIGHTHAWKS     1ST     52      42   LL
2  NIGHTHAWKS     2ND     25       2    L
3  NIGHTHAWKS     2ND    616       2    M
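Because astype(str) turns every value into a string, an integer such as 52 in 'deaths' comes back as the string '52'. If the non-string values should keep their original type, one possible variant (not part of the answer above) is to upper-case element by element and touch only str values. A minimal sketch, assuming the same example data; the helper name upper_strings_only is made up for illustration:

```python
import pandas as pd

raw_data = {'regiment': ['Nighthawks', 'Nighthawks', 'Nighthawks', 'Nighthawks'],
            'company': ['1st', '1st', '2nd', '2nd'],
            'deaths': ['kkk', 52, '25', 616],
            'battles': [5, '42', 2, 2],
            'size': ['l', 'll', 'l', 'm']}
df = pd.DataFrame(raw_data)


def upper_strings_only(frame):
    """Hypothetical helper: upper-case str values, leave all other values as they are."""
    return frame.apply(
        # Series.map applies the function to every element of the column
        lambda col: col.map(lambda v: v.upper() if isinstance(v, str) else v)
    )


result = upper_strings_only(df)
print(result)
```

The trade-off is speed: mapping a Python function over every cell is slower than the vectorised .str.upper() call, which only matters for large dataframes.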
You can later convert the 'battles' column back to a numeric dtype using to_numeric():
In [42]: df2 = df.apply(lambda x: x.astype(str).str.upper())
In [43]: df2['battles'] = pd.to_numeric(df2['battles'])
In [44]: df2
Out[44]:
     regiment company deaths  battles size
0  NIGHTHAWKS     1ST    KKK        5    L
1  NIGHTHAWKS     1ST     52       42   LL
2  NIGHTHAWKS     2ND     25        2    L
3  NIGHTHAWKS     2ND    616        2    M
In [45]: df2.dtypes
Out[45]:
regiment    object
company     object
deaths      object
battles      int64
size        object
dtype: object
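The to_numeric() call above works for 'battles' because every value in that column parses as a number. The 'deaths' column contains 'KKK', so the same call would raise an error there. A minimal sketch of how this could be handled, assuming it is acceptable for unparseable values to become NaN; the variable names are illustrative only:

```python
import pandas as pd

# The 'deaths' column as it looks after astype(str).str.upper()
deaths = pd.Series(['KKK', '52', '25', '616'])

# With the default errors='raise', the non-numeric 'KKK' triggers a ValueError
try:
    pd.to_numeric(deaths)
    raised = False
except ValueError:
    raised = True

# errors='coerce' replaces unparseable values with NaN instead of raising
numeric_deaths = pd.to_numeric(deaths, errors='coerce')
print(numeric_deaths)
```

Note that the coerced column becomes float64, since NaN cannot be stored in a plain integer column.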