Flagging rows that meet a condition
The data is as follows:
import pandas as pd
import numpy as np
data=[['Amy',13],['Susan',19],['Tom',14],['Ella',18]]
df = pd.DataFrame(data,columns=['Name','Age'])
df
| | Name | Age |
|---|---|---|
| 0 | Amy | 13 |
| 1 | Susan | 19 |
| 2 | Tom | 14 |
| 3 | Ella | 18 |
If Age is 18 or greater, show Yes in the adult column; otherwise show No.
df['adult']=np.where(df['Age']>=18,'Yes','No')
df
| | Name | Age | adult |
|---|---|---|---|
| 0 | Amy | 13 | No |
| 1 | Susan | 19 | Yes |
| 2 | Tom | 14 | No |
| 3 | Ella | 18 | Yes |
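`np.where` covers a two-way flag. For more than two labels, `np.select` takes a list of conditions and a matching list of labels. The following is only a minimal sketch: the `band` column, its labels (`child`, `teen`, `adult`) and the cut-off ages are assumptions made up for illustration and are not part of the original example.

```python
import numpy as np
import pandas as pd

data = [['Amy', 13], ['Susan', 19], ['Tom', 14], ['Ella', 18]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

# Two-way flag, same rule as above: Age >= 18 -> 'Yes', otherwise 'No'
df['adult'] = np.where(df['Age'] >= 18, 'Yes', 'No')

# Hypothetical three-way banding (labels and cut-offs are assumptions).
# Conditions are checked in order; the first one that is True wins.
conditions = [df['Age'] < 14, df['Age'] < 18]
labels = ['child', 'teen']
df['band'] = np.select(conditions, labels, default='adult')

print(df)
```

Rows that match none of the conditions receive the `default` value, so every row ends up with a label.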
Flagging rows that meet multiple conditions
The data is as follows:
data=[['Beijing',4000],['ShangHai',5000],['Beijing',3500],['Nanjing',2000],['Beijing',3800]]
df1=pd.DataFrame(data,columns=['city','price'])
df1
| | city | price |
|---|---|---|
| 0 | Beijing | 4000 |
| 1 | ShangHai | 5000 |
| 2 | Beijing | 3500 |
| 3 | Nanjing | 2000 |
| 4 | Beijing | 3800 |
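If city is Beijing and price is greater than or equal to 3800, mark the row with 1 in a new sing column. Rows that do not satisfy both conditions are left as NaN.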
df1.loc[(df1['city']=='Beijing') & (df1['price']>=3800),'sing']=1
df1
| | city | price | sing |
|---|---|---|---|
| 0 | Beijing | 4000 | 1.0 |
| 1 | ShangHai | 5000 | NaN |
| 2 | Beijing | 3500 | NaN |
| 3 | Nanjing | 2000 | NaN |
| 4 | Beijing | 3800 | 1.0 |
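Because `.loc` assigns only to the matching rows, the other rows of `sing` stay `NaN` and the column is stored as float, which is why the flag prints as `1.0`. If a plain 0/1 integer flag is preferred (an assumption, the original does not ask for it), one possible sketch is to build the boolean mask first and cast it to `int`:

```python
import pandas as pd

data = [['Beijing', 4000], ['ShangHai', 5000], ['Beijing', 3500],
        ['Nanjing', 2000], ['Beijing', 3800]]
df1 = pd.DataFrame(data, columns=['city', 'price'])

# Both conditions must hold; each comparison needs its own parentheses
# because & binds more tightly than == and >=.
mask = (df1['city'] == 'Beijing') & (df1['price'] >= 3800)

# Assumption: non-matching rows should be 0 rather than NaN.
df1['sing'] = mask.astype(int)

print(df1)
```

The mask can also be reused for filtering, for example `df1[mask]` to keep only the flagged rows.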