09Hive_ETL_Data_Analysis

  1. Prepare the tables

    1. Create the raw table: video_ori

    create table video_ori(
        videoId string, 
        uploader string, 
        age int, 
        category array<string>, 
        length int, 
        views int, 
        rate float, 
        ratings int, 
        comments int,
        relatedId array<string>)
    row format delimited fields terminated by "\t"
    collection items terminated by "&"
    stored as textfile;
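
    The two delimiters above mean that each input line holds tab-separated fields, and that the array columns (category, relatedId) are split on "&" within their field. A minimal sketch of loading a file and checking that the arrays parsed as expected; the path /opt/data/video is a hypothetical placeholder, not something given in the original:

    ```sql
    -- Hypothetical local path (an assumption); replace it with the real
    -- location of the cleaned video data files.
    load data local inpath '/opt/data/video' into table video_ori;

    -- category and relatedId are arrays, so they can be indexed and measured.
    select videoId,
           category[0]     as first_category,
           size(relatedId) as related_count
    from video_ori
    limit 5;
    ```

    If first_category comes back NULL for every row, the field or collection delimiter probably does not match the data file.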

    2. Create the raw table: user_ori

    create table user_ori(
        uploader string,
        videos int,
        friends int)
    row format delimited 
    fields terminated by "\t" 
    stored as textfile;
    

    3. Create video_orc, stored as ORC with Snappy compression

    create table video_orc(
        videoId string, 
        uploader string, 
        age int, 
        category array<string>, 
        length int, 
        views int, 
        rate float, 
        ratings int, 
        comments int,
        relatedId array<string>)
    stored as orc
    tblproperties("orc.compress"="SNAPPY");
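
    ORC is a binary columnar format, so the ORC table is usually filled from the text table rather than from the raw files directly. A minimal sketch, assuming video_ori has already been loaded:

    ```sql
    -- Both tables declare the same columns in the same order,
    -- so select * lines up with the target schema.
    insert into table video_orc
    select * from video_ori;

    -- Sanity check: the row count should match video_ori.
    select count(*) from video_orc;
    ```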

    4. Create video_user_orc, the user table stored as ORC with Snappy compression

    create table video_user_orc(
        uploader string,
        videos int,
        friends int)
    row format delimited 
    fields terminated by "\t" 
    stored as orc
    tblproperties("orc.compress"="SNAPPY");
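
    The user table can be filled the same way, and its storage settings can then be inspected. A sketch, assuming user_ori has already been loaded:

    ```sql
    -- Copy the user rows from the text table into the ORC table.
    insert into table video_user_orc
    select * from user_ori;

    -- Shows the input/output format and the table parameters,
    -- including orc.compress, to confirm ORC with Snappy is in effect.
    describe formatted video_user_orc;
    ```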
    

     
