3.3 The Five-Number Summary; Boxplots
The deciles divide a data set into tenths (10 equal parts), the quintiles divide a data set into fifths (5 equal parts), and the quartiles divide a data set into quarters (4 equal parts).
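As a minimal sketch of these definitions, the snippet below uses `statistics.quantiles` from Python's standard library to compute the cut points. The data set `scores` is made up for illustration, and the library's default interpolation rule is an assumption here: textbooks and software packages use slightly different conventions, so the values may differ a little from a hand calculation.

```python
from statistics import median, quantiles

# Hypothetical data set, made up for illustration.
scores = list(range(1, 21))

# n equal parts are separated by n - 1 cut points.
quartiles = quantiles(scores, n=4)   # 3 cut points -> quarters
quintiles = quantiles(scores, n=5)   # 4 cut points -> fifths
deciles = quantiles(scores, n=10)    # 9 cut points -> tenths

print("quartiles:", quartiles)
print("quintiles:", quintiles)
print("deciles:  ", deciles)

# The second quartile is the median of the data set.
print("median:   ", median(scores))
```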
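An extreme observation need not be an outlier; it may instead be an indication of skewness. When you find an outlier, try to determine its cause: if it results from a measurement error, it can be removed, but when there is no obvious explanation the outlier must be investigated carefully, because it may have some other, unexpected cause.
How do we identify outliers?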
Observations that lie below the lower limit or above the upper limit are potential outliers. To determine whether a potential outlier is truly an outlier, you should perform further data analyses by constructing a histogram, stem-and-leaf diagram, and other appropriate graphics that we present later.
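The rule above can be sketched in a few lines. This sketch assumes the usual definitions of the limits, lower limit = Q1 - 1.5 * IQR and upper limit = Q3 + 1.5 * IQR, where IQR = Q3 - Q1. The function name `find_potential_outliers` and the data are hypothetical, and the quartiles come from `statistics.quantiles`, whose convention may differ slightly from the textbook's.

```python
from statistics import quantiles


def find_potential_outliers(data):
    """Return (lower_limit, upper_limit, potential_outliers) for data."""
    q1, _, q3 = quantiles(data, n=4)  # library quartile convention (assumption)
    iqr = q3 - q1                     # interquartile range
    lower_limit = q1 - 1.5 * iqr
    upper_limit = q3 + 1.5 * iqr
    # Only observations strictly outside the limits are flagged.
    flagged = [x for x in data if x < lower_limit or x > upper_limit]
    return lower_limit, upper_limit, flagged


# Hypothetical data set, made up for illustration.
sample = [1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 40]
print(find_potential_outliers(sample))
```

The flagged values are only candidates; as noted above, a histogram or stem-and-leaf diagram is still needed to judge whether they are truly outliers.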
Boxplots: The adjacent values of a data set are the most extreme observations that still lie within the lower and upper limits.
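A minimal sketch of this definition follows. The function name `adjacent_values` is hypothetical, and the limits are passed in as already computed numbers; the values 2.0 and 26.0 used below are assumed for illustration only.

```python
def adjacent_values(data, lower_limit, upper_limit):
    """Return the smallest and largest observations within the limits."""
    # Keep only the observations that lie within (or on) the limits.
    inside = [x for x in data if lower_limit <= x <= upper_limit]
    return min(inside), max(inside)


# Hypothetical data set and limits, made up for illustration.
sample = [1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 40]
print(adjacent_values(sample, lower_limit=2.0, upper_limit=26.0))
```

When no observation falls outside the limits, the adjacent values are simply the minimum and maximum of the data set.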
In a boxplot, the two lines emanating from the box are called whiskers.
Symbols other than an asterisk are often used to plot potential outliers.
The fourth quarter has the greatest variation of all.
Boxplots are especially suited for comparing two or more data sets.
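One numeric counterpart to side-by-side boxplots is to compare five-number summaries (minimum, Q1, median, Q3, maximum). The sketch below does this for two made-up groups; the function name `five_number_summary` and both data sets are hypothetical, and the quartiles again follow the `statistics.quantiles` convention rather than any particular textbook's.

```python
from statistics import quantiles


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) for data."""
    q1, q2, q3 = quantiles(data, n=4)  # library quartile convention (assumption)
    return min(data), q1, q2, q3, max(data)


# Two hypothetical groups, made up for illustration.
group_a = [12, 15, 17, 18, 20, 22, 25]
group_b = [5, 9, 14, 18, 23, 30, 41]

for name, group in (("A", group_a), ("B", group_b)):
    low, q1, med, q3, high = five_number_summary(group)
    print(f"group {name}: min={low} Q1={q1} median={med} Q3={q3} max={high} IQR={q3 - q1}")
```

Here the two groups share the same median, but group B has a wider interquartile range and overall range, which is the kind of difference that side-by-side boxplots make visible at a glance.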
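Various distribution shapes and their corresponding boxplots:
For small data sets, boxplots can be unreliable in identifying distribution shape (more precisely, no graph of the distribution is reliable for a small data set, because the points are too few to trace out a clear curve); using a stem-and-leaf diagram or a dotplot is generally better.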