Hive QL Module: Execution Plan Optimization Analysis

Transform

Transform is the base class for query optimizations. Each optimization is implemented by a subclass that overrides transform(ParseContext pctx).

/**
 * Optimizer interface. All the rule-based optimizations implement this
 * interface. All the transformations are invoked sequentially. They take the
 * current parse context (which contains the operator tree among other things),
 * perform all the optimizations, and then return the updated parse context.
 */
public abstract class Transform {
  /**
   * All transformation steps implement this interface.
   * 
   * @param pctx
   *          input parse context
   * @return ParseContext
   * @throws SemanticException
   */
  public abstract ParseContext transform(ParseContext pctx) throws SemanticException;
  
  public void beginPerfLogging() {
    PerfLogger perfLogger = SessionState.getPerfLogger();
    perfLogger.perfLogBegin(this.getClass().getName(), PerfLogger.OPTIMIZER);
  }

  public void endPerfLogging() {
    PerfLogger perfLogger = SessionState.getPerfLogger();
    perfLogger.perfLogEnd(this.getClass().getName(), PerfLogger.OPTIMIZER);
  }

  public void endPerfLogging(String additionalInfo) {
    PerfLogger perfLogger = SessionState.getPerfLogger();
    perfLogger.perfLogEnd(this.getClass().getName(), PerfLogger.OPTIMIZER, additionalInfo);
  }
}
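
To make the "invoked sequentially" contract from the class comment concrete, here is a minimal, self-contained sketch. It does not use Hive's real classes: SketchParseContext, SketchTransform, RemoveNoopSketch and SketchOptimizer are hypothetical stand-ins invented for illustration, and perf logging and SemanticException are left out. It only shows the general pattern of one subclass overriding transform and a driver feeding each rule's output into the next.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Hive's ParseContext. The real class holds the
// operator tree among other things; here it is just a list of operator names.
class SketchParseContext {
  final List<String> operators = new ArrayList<>();
}

// Simplified stand-in for Transform (assumption: no perf logging, no checked exception).
abstract class SketchTransform {
  abstract SketchParseContext transform(SketchParseContext pctx);
}

// Hypothetical rule: removes every operator named "NOOP" from the plan.
class RemoveNoopSketch extends SketchTransform {
  @Override
  SketchParseContext transform(SketchParseContext pctx) {
    pctx.operators.removeIf(op -> op.equals("NOOP"));
    return pctx;
  }
}

// Hypothetical driver: applies the registered rules one after another,
// passing the context returned by one rule to the next.
class SketchOptimizer {
  private final List<SketchTransform> transformations = new ArrayList<>();

  void add(SketchTransform t) {
    transformations.add(t);
  }

  SketchParseContext optimize(SketchParseContext pctx) {
    for (SketchTransform t : transformations) {
      pctx = t.transform(pctx);
    }
    return pctx;
  }
}
```

In this sketch a new rule only needs to extend SketchTransform and be registered with add; the driver does not need to know what the rule does.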

SimpleFetchOptimizer

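By inspecting the execution plan, this optimizer decides, based on the data volume and on whether operations such as GROUP BY are required, whether the rows can be returned directly from HDFS, which avoids running a MapReduce job.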

/**
 * Tries to convert simple fetch query to single fetch task, which fetches rows directly
 * from location of table/partition.
 */
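
The decision described above can be sketched roughly as follows. This is not SimpleFetchOptimizer's actual code: FetchDecisionSketch, canConvertToFetch and all of its parameters are hypothetical names, and the real optimizer inspects the operator tree rather than taking boolean flags. The sketch only illustrates the two checks named in the text, the kind of operations in the query and the size of the input.

```java
// Hypothetical illustration of the fetch-conversion decision; not Hive code.
class FetchDecisionSketch {

  /**
   * @param hasGroupBy     assumed flag: the query contains a GROUP BY
   * @param hasOtherShuffle assumed flag: the query needs some other operation
   *                        that requires a shuffle stage
   * @param inputBytes     assumed estimate of the data to be read
   * @param thresholdBytes assumed upper bound for direct fetching
   * @return true if the query could be served by reading files directly
   */
  static boolean canConvertToFetch(boolean hasGroupBy, boolean hasOtherShuffle,
                                   long inputBytes, long thresholdBytes) {
    // Operations such as GROUP BY need a full job, so direct fetch is ruled out.
    if (hasGroupBy || hasOtherShuffle) {
      return false;
    }
    // Otherwise fetch directly only when the input is small enough.
    return inputBytes <= thresholdBytes;
  }
}
```

For example, under these assumptions a plain scan over 10 MB with a 1 GB threshold would qualify, while the same scan with a GROUP BY would not.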