# 1. Handling Missing Values in Pandas
![image-20211217203512276](https://gitee.com/alcoholfree/python-project/raw/master/img/202112172035373.png)
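## 1.1 Skipping empty rows
Pass `skiprows` to a reader such as `pd.read_excel` or `pd.read_csv` to skip rows at the top of the file, for example empty rows above the header.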
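
A minimal sketch of `skiprows`, assuming a hypothetical CSV whose first two lines are not data. The file content and column names are made up for illustration; `pd.read_excel` accepts the same parameter.

```python
import io
import pandas as pd

# Hypothetical file content: two junk lines precede the real header.
raw = "Sales report\nGenerated by export tool\nname,score\nAnn,90\nBob,85\n"

# skiprows=2 skips the first two lines, so the third line becomes the header.
df = pd.read_csv(io.StringIO(raw), skiprows=2)
print(df)
```
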
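## 1.2 Dropping columns or rows that are entirely empty
Call `df.dropna(axis="index"/"columns", how="any"/"all", inplace=True/False)`. `axis` selects whether rows (`"index"`) or columns (`"columns"`) are dropped. `how="all"` drops only those that are entirely empty, while `how="any"` drops those with at least one missing value. `inplace=True` modifies the DataFrame directly instead of returning a new one.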
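
The following sketch shows `how="all"` on both axes. The DataFrame and its column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one column and one row contain nothing but NaN.
df = pd.DataFrame({
    "name": ["Ann", np.nan, "Bob"],
    "score": [90.0, np.nan, np.nan],
    "empty": [np.nan, np.nan, np.nan],
})

# Drop columns in which every value is missing.
no_empty_cols = df.dropna(axis="columns", how="all")

# Then drop rows in which every remaining value is missing.
cleaned = no_empty_cols.dropna(axis="index", how="all")
print(cleaned)
```

Bob's row survives even though his score is missing, because `how="all"` only removes rows with no values at all.
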
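## 1.3 Filling missing values
Forward-fill a column so that each missing value takes the last valid value above it: `df.loc[:, 'column_name'] = df['column_name'].ffill()`. Older code writes this as `df['column_name'].fillna(method="ffill")`, which has been deprecated since pandas 2.1.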
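
A small forward-fill sketch on a hypothetical column in which a name appears only on the first row of each group:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the name is only written once per group.
df = pd.DataFrame({"name": ["Ann", np.nan, np.nan, "Bob", np.nan]})

# Each NaN takes the most recent non-missing value above it.
df["name"] = df["name"].ffill()
print(df)
```
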
# 2. Data Structures
![image-20211217205338641](https://gitee.com/alcoholfree/python-project/raw/master/img/202112172053672.png)
## 2.1 Creating a Series
### 2.1.1 Creating a Series from a list
`s = pd.Series([xx, xx, xx])`
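### 2.1.2 Creating a Series with a label index
`s = pd.Series([x, x, x, x], index=['x', 'xx', 'xxx', 'xxxx'])`

Look up a value by its label, where `'label'` stands for one of the index labels: `print(s['label'])`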
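
A short sketch of both constructions, with hypothetical values and labels:

```python
import pandas as pd

# Without an explicit index, pandas assigns positions 0, 1, 2, ...
s1 = pd.Series([10, 20, 30])
print(list(s1.index))

# With an explicit index, each value can be looked up by its label.
s2 = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])
print(s2["b"])
```
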
### 2.1.3 Creating a Series from a dictionary
`data = {'x': 12, 'xx': 123, 'xxx': 1234}`

`s = pd.Series(data)`
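
As a runnable version of the lines above: the dictionary keys become the index labels and the dictionary values become the Series values.

```python
import pandas as pd

data = {"x": 12, "xx": 123, "xxx": 1234}
s = pd.Series(data)

print(list(s.index))
print(s["xx"])
```
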
## 2.2 DataFrame
The most common way to create a DataFrame is to read it from a file.
### 2.2.1 Creating a DataFrame from a dictionary of lists
`data = {'attribute1': ['x', 'xx', 'xxx'], 'attribute2': ['x', 'xx', 'xxx']}`

`df = pd.DataFrame(data)`
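
A minimal sketch with hypothetical column names and values. Each dictionary key becomes a column, and the lists must all have the same length.

```python
import pandas as pd

# Hypothetical data: every list holds one value per row.
data = {"name": ["Ann", "Bob", "Cid"], "age": [23, 35, 41]}
df = pd.DataFrame(data)

print(df)
print(df.shape)  # (number of rows, number of columns)
```
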
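# 3. Index
## 3.1 Querying data with the index
`df.set_index("column_name", inplace=True, drop=False)` makes the named column the index. With `drop=False`, that column also stays among the regular columns.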
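
A sketch of an index-based lookup with `.loc` after `set_index`. The `user_id` and `rating` columns are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"user_id": [101, 102, 103], "rating": [4.0, 3.5, 5.0]})

# Use user_id as the index; drop=False also keeps it as a regular column.
df.set_index("user_id", inplace=True, drop=False)

# Select a row by its index label rather than by a boolean condition.
row = df.loc[102]
print(row["rating"])
```
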
# 4. Merging with merge
![image-20211217214139143](https://gitee.com/alcoholfree/python-project/raw/master/img/202112172141233.png)
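
The four join types in the subsections below can be compared on one pair of tables. This is a sketch with hypothetical data, using the `how` parameter of `pd.merge`.

```python
import pandas as pd

left = pd.DataFrame({"key": ["K0", "K1", "K2"], "A": ["A0", "A1", "A2"]})
right = pd.DataFrame({"key": ["K1", "K2", "K3"], "B": ["B1", "B2", "B3"]})

# inner: only keys present in both tables.
inner = pd.merge(left, right, on="key", how="inner")
# outer: keys from either table; missing cells become NaN.
outer = pd.merge(left, right, on="key", how="outer")
# left: every key of the left table.
left_join = pd.merge(left, right, on="key", how="left")
# right: every key of the right table.
right_join = pd.merge(left, right, on="key", how="right")

for name, result in [("inner", inner), ("outer", outer),
                     ("left", left_join), ("right", right_join)]:
    print(name, result["key"].tolist())
```
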
## 4.1 inner join
![image-20211218094319181](https://gitee.com/alcoholfree/python-project/raw/master/img/202112180943277.png)
## 4.2 outer join
![image-20211218094508888](https://gitee.com/alcoholfree/python-project/raw/master/img/202112180945924.png)
## 4.3 right join
![image-20211218094434887](https://gitee.com/alcoholfree/python-project/raw/master/img/202112180944921.png)
## 4.4 left join
![image-20211218094414483](https://gitee.com/alcoholfree/python-project/raw/master/img/202112180944310.png)
## 4.5 Non-key columns with the same name
![image-20211218094709543](https://gitee.com/alcoholfree/python-project/raw/master/img/202112180947575.png)
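The key column is left unchanged. For non-key columns that share a name, pass `suffixes=('suffix1', 'suffix2')` to control how they are renamed: the first string is appended to the column name from the left table and the second to the column name from the right table. The default is `('_x', '_y')`.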
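
A sketch of `suffixes` on two hypothetical tables that share a non-key column called `value`:

```python
import pandas as pd

left = pd.DataFrame({"key": ["K0", "K1"], "value": [1, 2]})
right = pd.DataFrame({"key": ["K0", "K1"], "value": [10, 20]})

# The suffixes are appended to the clashing column names; the key is untouched.
merged = pd.merge(left, right, on="key", suffixes=("_left", "_right"))
print(list(merged.columns))
```
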
# 5. Concatenating with concat
![image-20211221135634828](https://gitee.com/alcoholfree/python-project/raw/master/img/202112211356888.png)
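- `axis=0` concatenates vertically (stacks rows).
- `axis=1` concatenates horizontally (places columns side by side).
- `ignore_index=True` renumbers the rows from 0 instead of keeping each dataset's own index.
- `join='inner'/'outer'`: `inner` keeps only the columns shared by both datasets, `outer` keeps all columns. The default is `outer`.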
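`df1.append(df2)` returns a new DataFrame with the rows of `df2` placed after those of `df1`. `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0; `pd.concat([df1, df2])` gives the same result.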
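
The `concat` parameters above can be sketched on two hypothetical DataFrames that share only column `A`:

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df2 = pd.DataFrame({"A": [5, 6], "C": [7, 8]})

# Vertical, default outer join: all columns kept, gaps filled with NaN,
# rows renumbered 0..3 because of ignore_index=True.
stacked = pd.concat([df1, df2], axis=0, ignore_index=True)

# Vertical, inner join: only the shared column A is kept.
shared = pd.concat([df1, df2], axis=0, join="inner", ignore_index=True)

# Horizontal: columns placed side by side, rows aligned on the index.
side_by_side = pd.concat([df1, df2], axis=1)

print(stacked)
print(shared)
print(side_by_side)
```
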