# Set index_col to the name of the first column ("Date", found in cell A1 when the file is opened in Excel) and parse the row labels as dates
fifa_data = pd.read_csv(fifa_filepath, index_col="Date", parse_dates=True)
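A minimal sketch of what `index_col` and `parse_dates` do, assuming a tiny made-up CSV held in memory instead of the real file (the `csv_text` contents and the `demo` name are illustrative only, not taken from the dataset):

```python
import io

import pandas as pd

# Made-up three-row CSV standing in for the real file (toy values)
csv_text = """Date,Shape of You
2017-01-06,10
2017-01-07,20
2017-01-08,30
"""

# index_col="Date" turns the first column into the row labels;
# parse_dates=True asks pandas to parse those labels as dates
demo = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

print(type(demo.index).__name__)
print(list(demo.columns))
```

With `parse_dates=True` the row labels should come back as a `DatetimeIndex` rather than plain strings, which is what lets seaborn space the x-axis by date.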
# Plot the data with Seaborn
plt.figure(figsize=(14,6))
plt.title("Daily Global Streams of Popular Songs in 2017-2018")
sns.lineplot(data=spotify_data)
# Print all column names
list(spotify_data.columns)
# Rotate the x-axis tick labels
plt.xticks(rotation=-45)
# Plot a single column
# Line chart showing daily global streams of 'Shape of You'
sns.lineplot(data=spotify_data['Shape of You'], label="Shape of You")
# Draw a bar chart
plt.figure(figsize=(10,6))
# Bar chart showing average arrival delay for Spirit Airlines flights by month
sns.barplot(x=flight_data.index, y=flight_data['NK'])
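# Heatmap
sns.heatmap - This tells the notebook that we want to create a heatmap.
data=data_airplot - This tells the notebook to use all of the entries in the flight data to create the heatmap.
annot=True - This ensures that the value of each cell is shown on the chart. (Leaving this out removes the numbers from every cell!)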
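The three pieces above combine into a single call. A minimal sketch, assuming a made-up 2x2 table in place of the flight data (`toy_delays` and its values are hypothetical; the `Agg` backend line is only there so the script runs without a display):

```python
import matplotlib

matplotlib.use("Agg")  # draw off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up 2x2 table standing in for the flight-delay data
toy_delays = pd.DataFrame(
    {"AA": [1.5, -2.0], "NK": [4.0, 0.5]},
    index=pd.Index([1, 2], name="Month"),
)

plt.figure(figsize=(6, 4))
# annot=True writes each cell's value on top of its colored square
ax = sns.heatmap(data=toy_delays, annot=True)

# One text label is attached to the axes per cell when annot=True
print(len(ax.texts))
```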
# Return the index of the largest value in a list
np.argmax(alist)
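A small sketch of `np.argmax` on a plain Python list, using made-up values (`alist` here is a list, not a function, so it is passed without parentheses):

```python
import numpy as np

alist = [3, 9, 4, 9]  # made-up values; the maximum 9 appears twice

# np.argmax returns the position of the largest value;
# on ties it returns the first occurrence
idx = np.argmax(alist)

print(idx, alist[idx])
```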
# Draw a scatter plot
sns.scatterplot(x=candy_data['sugarpercent'], y=candy_data['winpercent'])
# Color-code the points by a category (passed as an extra argument to sns.scatterplot)
hue=candy_data['chocolate']
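To show how `hue` fits into the scatter plot call, here is a minimal sketch that assumes a made-up four-row table in place of the candy data (`toy_candy` and its values are hypothetical):

```python
import matplotlib

matplotlib.use("Agg")  # draw off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up rows standing in for the candy data
toy_candy = pd.DataFrame({
    "sugarpercent": [0.1, 0.4, 0.7, 0.9],
    "winpercent": [40.0, 55.0, 62.0, 70.0],
    "chocolate": ["No", "Yes", "No", "Yes"],
})

fig, ax = plt.subplots()
# hue colors each point by its category and adds a legend
sns.scatterplot(
    x=toy_candy["sugarpercent"],
    y=toy_candy["winpercent"],
    hue=toy_candy["chocolate"],
    ax=ax,
)
```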
# Scatter plot with a regression line
sns.regplot(x=candy_data['sugarpercent'], y=candy_data['winpercent'])
# Scatter plot with two regression lines (one per color-coded group)
sns.lmplot(x="bmi", y="charges", hue="smoker", data=insurance_data)
# Draw a categorical scatter plot (the result looks like a small flower; works best when the x-axis has two categories, e.g. Yes or No)
sns.swarmplot(x=candy_data['chocolate'],y=candy_data['winpercent'])
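# Draw a histogram
sns.distplot(a=iris_data['Petal Length (cm)'], kde=False)
kde=False - controls whether the kernel density estimate (KDE) curve is drawn on top of the histogram; False leaves it out.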
# KDE plot: kernel density estimate (can be thought of as a smoothed histogram)
sns.kdeplot(data=iris_data['Petal Length (cm)'], shade=True)
# 2D KDE plot
sns.jointplot(x=iris_data['Petal Length (cm)'], y=iris_data['Sepal Width (cm)'], kind="kde")
# Chart categories
Since it's not always easy to decide how to best tell the story behind your data, we've broken the chart types into three broad categories to help with this.
- Trends - A trend is defined as a pattern of change.
  - sns.lineplot - Line charts are best to show trends over a period of time, and multiple lines can be used to show trends in more than one group.
- Relationship - There are many different chart types that you can use to understand relationships between variables in your data.
  - sns.barplot - Bar charts are useful for comparing quantities corresponding to different groups.
  - sns.heatmap - Heatmaps can be used to find color-coded patterns in tables of numbers.
  - sns.scatterplot - Scatter plots show the relationship between two continuous variables; if color-coded, we can also show the relationship with a third categorical variable.
  - sns.regplot - Including a regression line in the scatter plot makes it easier to see any linear relationship between two variables.
  - sns.lmplot - This command is useful for drawing multiple regression lines, if the scatter plot contains multiple, color-coded groups.
  - sns.swarmplot - Categorical scatter plots show the relationship between a continuous variable and a categorical variable.
- Distribution - We visualize distributions to show the possible values that we can expect to see in a variable, along with how likely they are.
  - sns.distplot - Histograms show the distribution of a single numerical variable.
  - sns.kdeplot - KDE plots (or 2D KDE plots) show an estimated, smooth distribution of a single numerical variable (or two numerical variables).
  - sns.jointplot - This command is useful for simultaneously displaying a 2D KDE plot with the corresponding KDE plots for each individual variable.
# Seaborn themes
sns.set_style("dark")
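Seaborn has five themes: (1) "darkgrid", (2) "whitegrid", (3) "dark", (4) "white", and (5) "ticks". To switch, use a command like the one in the code cell above, filling in the theme you want. The names are passed in lowercase.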
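A minimal sketch that cycles through the five theme names and records whether each one turns on the background grid (the `themes` and `grid_on` names are illustrative; `sns.axes_style()` with no arguments returns the style settings currently in effect):

```python
import seaborn as sns

themes = ["darkgrid", "whitegrid", "dark", "white", "ticks"]

grid_on = {}
for theme in themes:
    sns.set_style(theme)  # same call as above, with a different theme name
    # Read back the active style to see whether this theme draws grid lines
    grid_on[theme] = bool(sns.axes_style()["axes.grid"])

print(grid_on)
```

The two "grid" themes should be the only ones that enable grid lines; the other three differ in background color and tick marks.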