Spark-Based Machine Learning in Practice (4) - Data Visualization

# 0 Related Source Code

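1 The Role of Data Visualization and Common Methods

1.1 Why Visualize Data

1.1.1 What Is Data Visualization?

◆ Presenting data in the form of graphics and images

◆ People can form an intuitive sense of data in three or fewer dimensions

1.1.2 Benefits of Data Visualization

◆ Makes it easier to discover and understand the information contained in the data

◆ Makes it easier to discuss the data with others

1.2 Common Methods of Data Visualization

◆ For web applications, ECharts, Highcharts, D3.js, and similar libraries are typically used

◆ For Python, a mainstay of data analysis, visualization libraries such as matplotlib are used

◆ Data analysts who do not program typically use Excel and similar tools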

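2 Getting to Know ECharts

◆ ECharts is a JavaScript data visualization library open-sourced by Baidu; its rendering is built on ZRender

◆ Although it cannot be called the best visualization library, it has a large market share in China, so this tutorial uses ECharts.

◆ ECharts offers a rich set of chart types; we only need a few of them

2.1 Learning to Draw with ECharts

◆ We will work through the documentation on the official website to learn the basics of using ECharts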

◆ Workflow:

  • Define the page structure
  • Declare the DOM container
  • Fill in and parse the data
  • Render the data
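
The four steps above can be sketched as follows. This is a minimal sketch, not the tutorial's own code. It assumes ECharts has already been loaded on the page with a script tag (so a global `echarts` exists) and that the page structure contains a sized container such as `<div id="main" style="width: 600px; height: 400px;"></div>`. The element id `main` and the helper names `buildOption` and `renderChart` are hypothetical; the sample values are the first three entries of the arrays used in section 3.1.

```javascript
// Step 3: fill in and parse the data into an ECharts option object.
// `rows` is a hypothetical array of { year, value } records.
function buildOption(rows) {
    return {
        title: { text: 'Minimal example' },
        tooltip: {},
        xAxis: { data: rows.map(function (r) { return r.year; }) },
        yAxis: {},
        series: [{
            name: 'beijing',
            type: 'bar',
            data: rows.map(function (r) { return r.value; })
        }]
    };
}

// Steps 2 and 4: look up the DOM container, create the chart, render it.
function renderChart(containerId, option) {
    var dom = document.getElementById(containerId);
    var chart = echarts.init(dom);
    chart.setOption(option);
    return chart;
}

var rows = [
    { year: 2009, value: 0.4806 },
    { year: 2007, value: 0.4839 },
    { year: 2006, value: 0.318 }
];
var option = buildOption(rows);

// Only render when running in a browser with ECharts loaded.
if (typeof document !== 'undefined' && typeof echarts !== 'undefined') {
    renderChart('main', option);
}
```

Keeping the option construction separate from the DOM calls means the data-handling step can be checked without a browser.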

◆ The chart types we mainly study are line charts, bar charts, and scatter plots
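
On a Cartesian grid these three chart types share the same option shape, and only the `type` field of the series changes. The sketch below illustrates that point; the helper `optionFor` and the constant `CHART_TYPES` are hypothetical names, and the three sample values come from the arrays used in section 3.1.

```javascript
// The three chart types this tutorial focuses on.
var CHART_TYPES = ['line', 'bar', 'scatter'];

// Hypothetical helper: same categories and values, different series type.
function optionFor(type, categories, values) {
    if (CHART_TYPES.indexOf(type) === -1) {
        throw new Error('Unsupported chart type: ' + type);
    }
    return {
        tooltip: {},
        xAxis: { type: 'category', data: categories },
        yAxis: { type: 'value' },
        series: [{ type: type, data: values }]
    };
}

var years = ['2006', '2007', '2009'];
var values = [0.318, 0.4839, 0.4806];

// One option object per chart type, all built from the same data.
var options = CHART_TYPES.map(function (type) {
    return optionFor(type, years, values);
});
```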

3 Displaying Data as Charts with ECharts

3.1 An Example ECharts Chart

Simple line chart

  • Replace the x-axis labels with the year data
  • Replace the series values with the rainfall data
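
Adapting the simple line chart example amounts to swapping its category labels for the years and its series values for the rainfall figures. The sketch below uses only the first five entries of the year and rainfall arrays from this section. Reversing both arrays so that the line runs in chronological order is an assumption added here, not something the original example does.

```javascript
// First five entries of the year and rainfall arrays used in this section
// (the source lists them newest-first).
var years = [2009, 2007, 2006, 2005, 2004];
var rainfall = [0.4806, 0.4839, 0.318, 0.4107, 0.4835];

// Assumption: use reversed copies so the x axis runs oldest-first.
// slice() copies first, so the original arrays are left untouched.
var option = {
    title: { text: 'Simple line chart' },
    tooltip: { trigger: 'axis' },
    xAxis: { type: 'category', data: years.slice().reverse() },
    yAxis: { type: 'value' },
    series: [{
        name: 'beijing',
        type: 'line',
        data: rainfall.slice().reverse()
    }]
};
```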

Bar chart with animation delay

var xAxisData = [2009,2007,2006,2005,2004,2003,2002,2001,2000,1999,1998,1997,1996,1995,1994,1993,1992,1991,1990,1989,1988,1987,1986,1985,1984,1983,1982,1981,1980,1979,1978,1977,1976,1975,1974,1973,1972,1971,1970,1969,1968,1967,1966,1965,1964,1963,1962,1961,1960,1959,1958,1957,1956,1955,1954,1953,1952,1951,1950,1949];
var data = [0.4806,0.4839,0.318,0.4107,0.4835,0.4445,0.3704,0.3389,0.3711,0.2669,0.7317,0.4309,0.7009,0.5725,0.8132,0.5067,0.5415,0.7479,0.6973,0.4422,0.6733,0.6839,0.6653,0.721,0.4888,0.4899,0.5444,0.3932,0.3807,0.7184,0.6648,0.779,0.684,0.3928,0.4747,0.6982,0.3742,0.5112,0.597,0.9132,0.3867,0.5934,0.5279,0.2618,0.8177,0.7756,0.3669,0.5998,0.5271,1.406,0.6919,0.4868,1.1157,0.9332,0.9614,0.6577,0.5573,0.4816,0.9109,0.921];

option = {
    title: {
        text: 'Bar Animation Delay'
    },
    legend: {
        data: ['beijing'],
        align: 'left'
    },
    toolbox: {
        // y: 'bottom',
        feature: {
            magicType: {
                type: ['stack', 'tiled']
            },
            dataView: {},
            saveAsImage: {
                pixelRatio: 2
            }
        }
    },
    tooltip: {},
    xAxis: {
        data: xAxisData,
        silent: false,
        splitLine: {
            show: false
        }
    },
    yAxis: {
    },
    series: [{
        name: 'beijing',
        type: 'bar',
        data: data,
        animationDelay: function (idx) {
            return idx * 10;
        }
    }
],
    animationEasing: 'elasticOut',
    animationDelayUpdate: function (idx) {
        return idx * 5;
    }
};
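
`animationDelay` is called with the index of each data item and returns a delay in milliseconds, so with `idx * 10` the bars appear one after another from left to right. The sketch below simply evaluates the same callback outside ECharts to make the arithmetic visible; the names `barCount`, `delays`, `firstDelay`, and `lastDelay` are illustrative only.

```javascript
// Same callback as in the option above: delay in milliseconds per bar index.
function animationDelay(idx) {
    return idx * 10;
}

var barCount = 60; // length of xAxisData and data in the listing above
var delays = [];
for (var i = 0; i < barCount; i++) {
    delays.push(animationDelay(i));
}

var firstDelay = delays[0];            // 0: the first bar starts immediately
var lastDelay = delays[barCount - 1];  // 590: the last bar starts 590 ms later
```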

Adding a second series named shanghai, which reuses the same data:

var xAxisData = [2009,2007,2006,2005,2004,2003,2002,2001,2000,1999,1998,1997,1996,1995,1994,1993,1992,1991,1990,1989,1988,1987,1986,1985,1984,1983,1982,1981,1980,1979,1978,1977,1976,1975,1974,1973,1972,1971,1970,1969,1968,1967,1966,1965,1964,1963,1962,1961,1960,1959,1958,1957,1956,1955,1954,1953,1952,1951,1950,1949];
var data = [0.4806,0.4839,0.318,0.4107,0.4835,0.4445,0.3704,0.3389,0.3711,0.2669,0.7317,0.4309,0.7009,0.5725,0.8132,0.5067,0.5415,0.7479,0.6973,0.4422,0.6733,0.6839,0.6653,0.721,0.4888,0.4899,0.5444,0.3932,0.3807,0.7184,0.6648,0.779,0.684,0.3928,0.4747,0.6982,0.3742,0.5112,0.597,0.9132,0.3867,0.5934,0.5279,0.2618,0.8177,0.7756,0.3669,0.5998,0.5271,1.406,0.6919,0.4868,1.1157,0.9332,0.9614,0.6577,0.5573,0.4816,0.9109,0.921];

option = {
    title: {
        text: 'Bar Animation Delay'
    },
    legend: {
        data: ['beijing','shanghai'],
        align: 'left'
    },
    toolbox: {
        // y: 'bottom',
        feature: {
            magicType: {
                type: ['stack', 'tiled']
            },
            dataView: {},
            saveAsImage: {
                pixelRatio: 2
            }
        }
    },
    tooltip: {},
    xAxis: {
        data: xAxisData,
        silent: false,
        splitLine: {
            show: false
        }
    },
    yAxis: {
    },
    series: [
    {
        name: 'beijing', 
        type: 'bar',
        data: data,
        animationDelay: function (idx) {
            return idx * 10;
        }
    },
    {
        name: 'shanghai', 
        type: 'bar',
        data: data,
        animationDelay: function (idx) {
            return idx * 10;
        }
    }
],
    animationEasing: 'elasticOut',
    animationDelayUpdate: function (idx) {
        return idx * 5;
    }
};
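
The second listing differs from the first only in having one more legend entry and one more series object. With more cities, both can be derived from a single mapping, as in the sketch below. `buildSeries` and `cityData` are hypothetical names. As in the listing above, both cities reuse the same values, because no separate Shanghai data is given.

```javascript
// Hypothetical helper: turn { cityName: values } into ECharts bar series.
function buildSeries(cityData) {
    return Object.keys(cityData).map(function (name) {
        return {
            name: name,
            type: 'bar',
            data: cityData[name],
            animationDelay: function (idx) {
                return idx * 10;
            }
        };
    });
}

// Placeholder values: a short prefix of `data` from the listing above.
var sample = [0.4806, 0.4839, 0.318];
var cityData = { beijing: sample, shanghai: sample };

// Legend entries and series are generated from the same keys,
// so their names cannot drift apart.
var series = buildSeries(cityData);
var legend = { data: Object.keys(cityData), align: 'left' };
```

The resulting `series` and `legend` values can be dropped into the `option` object in place of the hand-written ones.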


Spark Machine Learning in Practice series
