A summary of pandas methods for reading row and column data

2024-01-02 08:58:16

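1. Data can be read by row or by column, and addressed either by position (integer coordinates) or by label (index values). The indexers are df.iloc[], df.loc[], df.iat[], and df.at[]; all of them take square brackets, not parentheses. df.ix[] was deprecated in pandas 0.20 and removed in pandas 1.0, so it appears here only for reference.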
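
As a quick orientation, the four indexers can be compared side by side on a tiny frame. This is only a sketch: the frame `toy` and its labels (`r1`, `a`, and so on) are made up for illustration and are not part of the example used in the rest of this post.

```python
import pandas as pd

# Hypothetical toy frame, used only to illustrate the indexers.
toy = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=["r1", "r2", "r3"])

row_by_label = toy.loc["r2"]       # Series: the row labelled 'r2'
row_by_pos = toy.iloc[1]           # Series: the row at position 1
cell_by_label = toy.at["r2", "b"]  # scalar, addressed by labels
cell_by_pos = toy.iat[1, 1]        # scalar, addressed by positions

print(row_by_label.equals(row_by_pos))
print(cell_by_label, cell_by_pos)
```

loc and iloc accept single values, lists, and slices; at and iat accept exactly one row and one column and return a single value.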

2. Build a DataFrame, assign columns and index, modify and add data, and read the row and column indexes

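import pandas as pd

# Column labels are kept in Chinese. Roughly: 省份 = province, 年份 = year,
# 总人数 = total population, 高考人数 = gaokao candidates, 高数 = advanced math.
data = {'省份': ['北京', '上海', '广州', '深圳'],
        '年份': ['2017', '2018', '2019', '2020'],
        '总人数': ['2200', '1900', '2170', '1890'],
        '高考人数': ['6.3', '5.9', '6.0', '5.2']}
# '高数' is not a key of data, so that column starts as NaN and is filled in below.
df = pd.DataFrame(data, columns=['省份', '年份', '总人数', '高考人数', '高数'],
                  index=['one', 'two', 'three', 'four'])
df['高数'] = ['90', '95', '92', '98']
print("Row index: {}".format(list(df.index)))
print("Column index: {}".format(list(df.columns)))
print(df.index[1:3])
print(df.columns[1])
print(df.columns[1:3])
print(df)

Row index: ['one', 'two', 'three', 'four']
Column index: ['省份', '年份', '总人数', '高考人数', '高数']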
Index(['two', 'three'], dtype='object')
年份
Index(['年份', '总人数'], dtype='object')
       省份    年份   总人数 高考人数  高数
one    北京  2017  2200  6.3  90
two    上海  2018  1900  5.9  95
three  广州  2019  2170  6.0  92
four   深圳  2020  1890  5.2  98
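
The example above only adds a column. The other operations named in the heading (assigning columns and index, modifying data, adding a row) might look like the sketch below. The frame `small` is a hypothetical two-row stand-in, not the df from this post.

```python
import pandas as pd

# Hypothetical frame created without labels, so pandas numbers rows and columns 0, 1, ...
small = pd.DataFrame([["北京", "2017"], ["上海", "2018"]])

small.columns = ["省份", "年份"]        # assign column labels
small.index = ["one", "two"]            # assign row labels
small.loc["three"] = ["广州", "2019"]   # add a row by assigning to a new label
small.loc["two", "年份"] = "2021"       # modify a single cell

print(small)
```

Assigning to a label that does not exist yet through loc enlarges the frame; the same assignment through iloc would fail, because iloc only addresses positions that already exist.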

3. Selecting columns, by name and by position

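print(df['省份'])  # one column by name
print(df.省份)  # one column by name, attribute style
print(df[['省份', '总人数']])  # several columns by name
print(df[df.columns[1:4]])  # columns via a slice of the column index
print(df.iloc[:, 1])  # one column by position
print(df.iloc[:, [1, 3]])  # several columns by position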

one      北京
two      上海
three    广州
four     深圳
Name: 省份, dtype: object
one      北京
two      上海
three    广州
four     深圳
Name: 省份, dtype: object
       省份   总人数
one    北京  2200
two    上海  1900
three  广州  2170
four   深圳  1890
         年份   总人数 高考人数
one    2017  2200  6.3
two    2018  1900  5.9
three  2019  2170  6.0
four   2020  1890  5.2
one      2017
two      2018
three    2019
four     2020
Name: 年份, dtype: object
         年份 高考人数
one    2017  6.3
two    2018  5.9
three  2019  6.0
four   2020  5.2
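
Two details of the calls above are easy to miss: a single name returns a Series while a list of names returns a DataFrame, and attribute-style access only works when the column name does not collide with an existing DataFrame attribute. A small sketch, assuming a hypothetical frame `df6` with a column deliberately named `count`:

```python
import pandas as pd

# Hypothetical frame; the 'count' column is chosen to clash with DataFrame.count.
df6 = pd.DataFrame({"省份": ["北京", "上海"], "count": [1, 2]})

as_series = df6["省份"]     # single label -> Series
as_frame = df6[["省份"]]    # list of labels -> one-column DataFrame

is_method = callable(df6.count)  # attribute access finds the method, not the column
col = df6["count"]               # bracket access always returns the column

print(type(as_series).__name__, type(as_frame).__name__)
print(is_method, list(col))
```

For that reason bracket access is the safer default, and attribute access is best kept for interactive use.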

4. Selecting rows and cells by position

print(df.iloc[1])
print(df.iloc[1, 3])
print(df.iloc[[1], [3]])
print(df.loc[df.index[1:3]])  # rows by index label; redundant, same result as df.iloc[1:3]
print(df.iloc[1:3])
print(df.iloc[[1, 3]])
print(df.iloc[[1,2,3], [2,4]])

省份        上海
年份      2018
总人数     1900
高考人数     5.9
高数        95
Name: two, dtype: object
5.9
    高考人数
two  5.9
       省份    年份   总人数 高考人数  高数
two    上海  2018  1900  5.9  95
three  广州  2019  2170  6.0  92
       省份    年份   总人数 高考人数  高数
two    上海  2018  1900  5.9  95
three  广州  2019  2170  6.0  92
      省份    年份   总人数 高考人数  高数
two   上海  2018  1900  5.9  95
four  深圳  2020  1890  5.2  98
        总人数  高数
two    1900  95
three  2170  92
four   1890  98
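
The shape of the result depends on whether iloc receives integers or lists, which is why df.iloc[1, 3] prints a bare value and df.iloc[[1], [3]] prints a small table. The sketch below shows this on a hypothetical three-row frame `df5`, along with negative positions.

```python
import pandas as pd

# Hypothetical frame, smaller than the df used in this post.
df5 = pd.DataFrame({"省份": ["北京", "上海", "广州"],
                    "年份": ["2017", "2018", "2019"]},
                   index=["one", "two", "three"])

cell = df5.iloc[1, 1]        # two integers -> scalar
row = df5.iloc[1]            # one integer -> Series
block = df5.iloc[[1], [1]]   # two lists -> 1x1 DataFrame
last = df5.iloc[-1]          # negative positions count from the end

print(cell, type(row).__name__, block.shape, last["省份"])
```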

5. Selecting rows and cells by label with loc

print(df.loc['two'])
print(df.loc['two', '省份'])
print(df.loc['two':'three'])
print(df.loc[['one', 'three']])
print(df.loc[['one', 'three'], ['省份', '年份']])

省份        上海
年份      2018
总人数     1900
高考人数     5.9
高数        95
Name: two, dtype: object
上海
       省份    年份   总人数 高考人数  高数
two    上海  2018  1900  5.9  95
three  广州  2019  2170  6.0  92
       省份    年份   总人数 高考人数  高数
one    北京  2017  2200  6.3  90
three  广州  2019  2170  6.0  92
       省份    年份
one    北京  2017
three  广州  2019
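
Note that df.loc['two':'three'] returned two rows: a label slice includes its end label, whereas a position slice excludes its end position. A minimal sketch of the difference, using a hypothetical one-column frame `df4`:

```python
import pandas as pd

# Hypothetical frame with the same row labels as the df in this post.
df4 = pd.DataFrame({"省份": ["北京", "上海", "广州", "深圳"]},
                   index=["one", "two", "three", "four"])

label_slice = df4.loc["two":"three"]  # includes the row labelled 'three'
pos_slice = df4.iloc[1:3]             # positions 1 and 2; position 3 is excluded
shorter = df4.iloc[1:2]               # position 1 only

print(list(label_slice.index), list(pos_slice.index), list(shorter.index))
```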

6. The removed ix indexer, and scalar access with iat and at

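# df.ix was removed in pandas 1.0. The original calls were df.ix[1:3] and
# df.ix[:, [1, 3]]; with this string index they behaved like the iloc calls below.
print(df.iloc[1:3])
print(df.iloc[:, [1, 3]])
print(df.iat[1,3])  # single value by position
print(df.at['two', '省份'])  # single value by label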

       省份    年份   总人数 高考人数  高数
two    上海  2018  1900  5.9  95
three  广州  2019  2170  6.0  92
         年份 高考人数
one    2017  6.3
two    2018  5.9
three  2019  6.0
four   2020  5.2
5.9
上海
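
The main use of ix was mixing a position on one axis with a label on the other. On current pandas the usual approach is to convert one of the two so that loc or iloc can be used throughout. The sketch below shows both directions and also that at and iat can assign a single cell; the frame `df3` is a hypothetical two-row stand-in.

```python
import pandas as pd

# Hypothetical frame, smaller than the df used in this post.
df3 = pd.DataFrame({"省份": ["北京", "上海"], "年份": ["2017", "2018"]},
                   index=["one", "two"])

# Row by position, column by label:
v1 = df3.loc[df3.index[1], "省份"]             # turn the position into a label
v2 = df3.iloc[1, df3.columns.get_loc("省份")]  # turn the label into a position

# at and iat can also set a single cell.
df3.at["two", "年份"] = "2019"
df3.iat[0, 1] = "2016"

print(v1, v2)
print(df3)
```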
