Python Data Analysis: Detecting Missing Values in Pandas

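Background

When we load data from a database into Pandas, None values are converted to NaN or NaT. When we write Pandas data back to the database, however, these values have to be converted back to None, otherwise an error is raised. We therefore need a way to handle missing values in Pandas.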
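To see this conversion in isolation, the following minimal sketch builds two Series by hand (the values are made up for illustration and do not come from a database) and checks that None ends up as NaN in a numeric column and as NaT in a datetime column.

```python
import pandas as pd

# A numeric column: the None is stored as NaN and the dtype is float64.
numbers = pd.Series([1.0, None])

# A datetime column: the None is stored as NaT.
dates = pd.to_datetime(pd.Series(["2019-08-10 10:00:00", None]))

print(numbers.dtype, numbers[1])
print(dates.dtype, dates[1])

# pd.isnull treats NaN, NaT and None alike.
print(pd.isnull(numbers[1]), pd.isnull(dates[1]), pd.isnull(None))
```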

Sample data

   id         name  password   sn  sex  age  amount  content  remark  login_date  login_at           created_at
0   1  123456789.0       NaN  NaN  NaN   20     NaN      NaN     NaN         NaN       NaT  2019-08-10 10:00:00
1   2          NaN       NaN  NaN  NaN   20     NaN      NaN     NaN         NaN       NaT  2019-08-10 10:00:00

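Detecting missing values

If the value of a column is missing, convert it to None.

import pandas as pd

def judge_null(column):
    # pd.isnull is True for NaN, NaT and None; any other value is returned unchanged.
    if pd.isnull(column):
        return None
    return column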
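As a quick check of the helper, the sketch below calls it on each kind of missing value and on two ordinary values. The inputs are made up for illustration.

```python
import numpy as np
import pandas as pd


def judge_null(column):
    # Same helper as above: missing values become None, everything else passes through.
    if pd.isnull(column):
        return None
    return column


print(judge_null(np.nan))   # NaN counts as missing
print(judge_null(pd.NaT))   # NaT counts as missing
print(judge_null(None))     # None stays None
print(judge_null(20))       # a regular number is returned unchanged
print(judge_null("abc"))    # so is a regular string
```

Note that the helper expects a single scalar value. For a list or array, pd.isnull returns an array of booleans, and the `if` test then raises a ValueError.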

Handling missing values

Process the missing values column by column.

df['id'] = df.apply(lambda row: judge_null(row['id']), axis=1)
df['name'] = df.apply(lambda row: judge_null(row['name']), axis=1)
df['password'] = df.apply(lambda row: judge_null(row['password']), axis=1)
df['sn'] = df.apply(lambda row: judge_null(row['sn']), axis=1)
df['sex'] = df.apply(lambda row: judge_null(row['sex']), axis=1)
df['age'] = df.apply(lambda row: judge_null(row['age']), axis=1)
df['amount'] = df.apply(lambda row: judge_null(row['amount']), axis=1)
df['content'] = df.apply(lambda row: judge_null(row['content']), axis=1)
df['remark'] = df.apply(lambda row: judge_null(row['remark']), axis=1)
df['login_date'] = df.apply(lambda row: judge_null(row['login_date']), axis=1)
df['login_at'] = df.apply(lambda row: judge_null(row['login_at']), axis=1)
df['created_at'] = df.apply(lambda row: judge_null(row['created_at']), axis=1)
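The assignments above repeat the same pattern once per column. As one possible alternative, the sketch below loops over the columns and rebuilds each one with an explicit object dtype. The small frame is made up for illustration and only mirrors the kinds of gaps in the sample data. The explicit `dtype=object` is our own addition: a float64 column cannot hold None, so when pandas infers the dtype of a result that mixes numbers and None, the None may be stored as NaN again.

```python
import numpy as np
import pandas as pd


def judge_null(column):
    # Missing values (NaN, NaT, None) become None; other values pass through.
    if pd.isnull(column):
        return None
    return column


# Hypothetical data: a partly missing float column and an all-missing datetime column.
df = pd.DataFrame({
    "id": [1, 2],
    "name": [123456789.0, np.nan],
    "login_at": pd.to_datetime([None, None]),
})

for col in list(df.columns):
    # dtype=object keeps None as None instead of letting pandas infer float64.
    df[col] = pd.Series(
        [judge_null(value) for value in df[col]],
        index=df.index,
        dtype=object,
    )

print(df)
print(df.dtypes)
```

The trade-off is that every column becomes object dtype, which is fine for handing rows to a database driver but slower for further numeric work in Pandas.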

Data after processing

   id         name  password    sn   sex  age  amount  content  remark  login_date  login_at           created_at
0   1  123456789.0      None  None  None   20    None     None    None        None      None  2019-08-10 10:00:00
1   2         None      None  None  None   20    None     None    None        None      None  2019-08-10 10:00:00

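Additional notes

Configure Pandas to display all rows, all columns, and longer values.

# Show all columns
pd.set_option('display.max_columns', None)
# Show all rows
pd.set_option('display.max_rows', None)
# Set the display width of a value to 100 (the default is 50)
pd.set_option('display.max_colwidth', 100)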
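If the wider display is only needed for a single inspection, the same options can be applied temporarily with `pd.option_context`, which restores the previous values when the block ends. A minimal sketch with a made-up one-column frame:

```python
import pandas as pd

# Remember the current setting so the temporary change can be observed.
before = pd.get_option("display.max_colwidth")

with pd.option_context(
    "display.max_columns", None,
    "display.max_rows", None,
    "display.max_colwidth", 100,
):
    inside = pd.get_option("display.max_colwidth")
    # An 80-character value is printed in full while the options are active.
    print(pd.DataFrame({"content": ["x" * 80]}))

after = pd.get_option("display.max_colwidth")
print(before, inside, after)
```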

The corresponding CREATE TABLE statement

create table test
(
    id         int(10)        not null primary key,
    name       varchar(32)    null,
    password   char(10)       null,
    sn         bigint         null,
    sex        tinyint(1)     null,
    age        int(5)         null,
    amount     decimal(10, 2) null,
    content    text           null,
    remark     json           null,
    login_date date           null,
    login_at   datetime       null,
    created_at timestamp      null
);
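As a rough end-to-end check, the sketch below writes rows containing a missing value into a database and reads them back. It uses Python's built-in sqlite3 with an in-memory database purely as a stand-in for the real server, a cut-down hypothetical table `test_demo` with only three of the columns, and made-up data. The driver, connection setup, and placeholder syntax for the actual database will differ.

```python
import sqlite3

import numpy as np
import pandas as pd

# Made-up rows; the second name is missing.
df = pd.DataFrame({"id": [1, 2], "name": ["alice", np.nan], "age": [20, 20]})

# Turn each row into a plain tuple, replacing NaN/NaT with None.
rows = [
    tuple(None if pd.isnull(value) else value for value in row)
    for row in df.itertuples(index=False, name=None)
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table test_demo (id integer primary key, name text, age integer)"
)
conn.executemany(
    "insert into test_demo (id, name, age) values (?, ?, ?)", rows
)

# The None passed as a parameter is stored as NULL and read back as None.
stored = conn.execute(
    "select id, name, age from test_demo order by id"
).fetchall()
print(stored)
conn.close()
```

Converting to plain tuples at the last moment sidesteps the question of which dtype a DataFrame column can hold, since a Python tuple can carry None next to any other value.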