【变换和量化是整个视频编码技术系列中理论性和研究性较强的一部分,本文暂时不去研究变换的原理、推导过程等,只是调试一下在参考代码中对预测残差进行变换的实现过程。变换的原理在未来会作为一个研究方向进行探讨。】
1、HM中Intra模式的主要实现逻辑
以Intra的亮度模式为例。主要实现代码实现于TEncSearch::estIntraPredQT方法中。TEncSearch::estIntraPredQT实现时,首先获取当前CU的分割子块的个数,并且对每个子块分别进行预测、变换量化操作(代码中称之为PU Loop)。在每一次的PU Loop中,编码器首先遍历35种预测模式,对每一种模式进行参考像素配置、预测和率失真代价判断,并选择出某几个最优的候选模式。对这几个选取出的最优模式,递归地进行变换、量化、熵编码操作(代码中称之为Mode loop)。以下伪代码可以作为参考:
Void TEncSearch::estIntraPredQT( TComDataCU* pcCU, TComYuv* pcOrgYuv, TComYuv* pcPredYuv, TComYuv* pcResiYuv, TComYuv* pcRecoYuv, UInt& ruiDistC, Bool bLumaOnly ) { UInt uiNumPU = pcCU->getNumPartInter();//当前CU的分割模式下,子块的个数 for( UInt uiPU = 0; uiPU < uiNumPU; uiPU++, uiPartOffset += uiQNumParts ) { // 获取参考像素,对参考像素进行滤波 pcCU->getPattern()->initAdiPattern(); for( Int modeIdx = 0; modeIdx < 35; modeIdx++ ) { predIntraLumaAng();//获取各个预测模式的结果 //计算预测模式的代价 UInt uiSad = m_pcRdCost->calcHAD(); UInt iModeBits = xModeBitsIntra(); xUpdateCandList();//更新候选模式的cost值 }//Mode loop //递归编码Intra CU,包括变换、量化等 xRecurIntraCodingQT(); }// PU loop }
2、xRecurIntraCodingQT以及变换量化的实现
本段主要通过代码注释讨论变换和量化的方法:
Void TEncSearch::xRecurIntraCodingQT( TComDataCU* pcCU, UInt uiTrDepth, UInt uiAbsPartIdx, Bool bLumaOnly, TComYuv* pcOrgYuv, TComYuv* pcPredYuv, TComYuv* pcResiYuv, UInt& ruiDistY, UInt& ruiDistC, Double& dRDCost ) { UInt uiFullDepth = pcCU->getDepth( 0 ) + uiTrDepth; UInt uiLog2TrSize = g_aucConvertToBit[ pcCU->getSlice()->getSPS()->getMaxCUWidth() >> uiFullDepth ] + 2; Bool bCheckFull = ( uiLog2TrSize <= pcCU->getSlice()->getSPS()->getQuadtreeTULog2MaxSize() ); /*判断是否将当前的PU按照整个模式进行变换和量化。当前示例下,PU大小为64×64,因此uiLog2TrSize==6,而sps中指定的最大TU为32,因此bCheckFull为false。*/ Bool bCheckSplit = ( uiLog2TrSize > pcCU->getQuadtreeTULog2MinSizeInCU(uiAbsPartIdx) ); /*比较当前U大小同CU中允许的最小TU的大小,只有大于最小TU大小时才允许进行分割,对于当前PU,MinTUSize为16,因此允许进行分割*/ Double dSingleCost = MAX_DOUBLE; UInt uiSingleDistY = 0; UInt uiSingleDistC = 0; UInt uiSingleCbfY = 0; UInt uiSingleCbfU = 0; UInt uiSingleCbfV = 0; Bool checkTransformSkip = pcCU->getSlice()->getPPS()->getUseTransformSkip(); UInt widthTransformSkip = pcCU->getWidth ( 0 ) >> uiTrDepth; UInt heightTransformSkip = pcCU->getHeight( 0 ) >> uiTrDepth; Int bestModeId = 0; Int bestModeIdUV[2] = {0, 0}; checkTransformSkip &= (widthTransformSkip == 4 && heightTransformSkip == 4); checkTransformSkip &= (!pcCU->getCUTransquantBypass(0)); checkTransformSkip &= (!((pcCU->getQP( 0 ) == 0) && (pcCU->getSlice()->getSPS()->getUseLossless()))); if ( m_pcEncCfg->getUseTransformSkipFast() ) { checkTransformSkip &= (pcCU->getPartitionSize(uiAbsPartIdx)==SIZE_NxN); } if( bCheckFull ) {//按照一整个TU进行变换、量化 if(checkTransformSkip == true) {//skip模式 //...... } else { pcCU ->setTransformSkipSubParts ( 0, TEXT_LUMA, uiAbsPartIdx, uiFullDepth ); //----- store original entropy coding status ----- if( m_bUseSBACRD && bCheckSplit ) { m_pcRDGoOnSbacCoder->store( m_pppcRDSbacCoder[ uiFullDepth ][ CI_QT_TRAFO_ROOT ] ); } //----- code luma block with given intra prediction mode and store Cbf----- dSingleCost = 0.0; xIntraCodingLumaBlk( pcCU, uiTrDepth, uiAbsPartIdx, pcOrgYuv, pcPredYuv, pcResiYuv, uiSingleDistY ); //对亮度TU进行变换和量化编码 if( bCheckSplit ) { uiSingleCbfY = pcCU->getCbf( uiAbsPartIdx, TEXT_LUMA, uiTrDepth ); } //----- code chroma blocks with given intra prediction mode and store Cbf----- if( !bLumaOnly ) { pcCU ->setTransformSkipSubParts ( 0, TEXT_CHROMA_U, uiAbsPartIdx, uiFullDepth ); //如果包含色度信息,编码色度TU pcCU ->setTransformSkipSubParts ( 0, TEXT_CHROMA_V, uiAbsPartIdx, uiFullDepth ); xIntraCodingChromaBlk ( pcCU, uiTrDepth, uiAbsPartIdx, pcOrgYuv, pcPredYuv, pcResiYuv, uiSingleDistC, 0 ); xIntraCodingChromaBlk ( pcCU, uiTrDepth, uiAbsPartIdx, pcOrgYuv, pcPredYuv, pcResiYuv, uiSingleDistC, 1 ); if( bCheckSplit ) { uiSingleCbfU = pcCU->getCbf( uiAbsPartIdx, TEXT_CHROMA_U, uiTrDepth ); uiSingleCbfV = pcCU->getCbf( uiAbsPartIdx, TEXT_CHROMA_V, uiTrDepth ); } } //----- determine rate and r-d cost ----- UInt uiSingleBits = xGetIntraBitsQT( pcCU, uiTrDepth, uiAbsPartIdx, true, !bLumaOnly, false ); dSingleCost = m_pcRdCost->calcRdCost( uiSingleBits, uiSingleDistY + uiSingleDistC ); } } if( bCheckSplit ) {//分割成4个TU,进行递归编码 //----- store full entropy coding status, load original entropy coding status ----- if( m_bUseSBACRD ) { if( bCheckFull ) { m_pcRDGoOnSbacCoder->store( m_pppcRDSbacCoder[ uiFullDepth ][ CI_QT_TRAFO_TEST ] ); m_pcRDGoOnSbacCoder->load ( m_pppcRDSbacCoder[ uiFullDepth ][ CI_QT_TRAFO_ROOT ] ); } else { m_pcRDGoOnSbacCoder->store( m_pppcRDSbacCoder[ uiFullDepth ][ CI_QT_TRAFO_ROOT ] ); } } //----- code splitted block ----- Double dSplitCost = 0.0; UInt uiSplitDistY = 0; UInt uiSplitDistC = 0; UInt uiQPartsDiv = pcCU->getPic()->getNumPartInCU() >> ( ( uiFullDepth + 1 ) << 1 ); UInt uiAbsPartIdxSub = uiAbsPartIdx; UInt uiSplitCbfY = 0; UInt uiSplitCbfU = 0; UInt uiSplitCbfV = 0; for( UInt uiPart = 0; uiPart < 4; uiPart++, uiAbsPartIdxSub += uiQPartsDiv ) { xRecurIntraCodingQT( pcCU, uiTrDepth + 1, uiAbsPartIdxSub, bLumaOnly, pcOrgYuv, pcPredYuv, pcResiYuv, uiSplitDistY, uiSplitDistC, dSplitCost );//本函数递归调用 uiSplitCbfY |= pcCU->getCbf( uiAbsPartIdxSub, TEXT_LUMA, uiTrDepth + 1 ); if(!bLumaOnly) { uiSplitCbfU |= pcCU->getCbf( uiAbsPartIdxSub, TEXT_CHROMA_U, uiTrDepth + 1 ); uiSplitCbfV |= pcCU->getCbf( uiAbsPartIdxSub, TEXT_CHROMA_V, uiTrDepth + 1 ); } } //...... }
在该函数调用的xIntraCodingLumaBlk中,按照给定的预测模式获取CU相应的预测数据,并与实际的像素值求取差值数据作为残差。随后,调用transformNxN函数进行变换编码。
Void TEncSearch::xIntraCodingLumaBlk( TComDataCU* pcCU, UInt uiTrDepth, UInt uiAbsPartIdx, TComYuv* pcOrgYuv, TComYuv* pcPredYuv, TComYuv* pcResiYuv, UInt& ruiDist, Int default0Save1Load2/*默认为0*/ ) { //...... //===== init availability pattern ===== Bool bAboveAvail = false; Bool bLeftAvail = false; if(default0Save1Load2 != 2) { pcCU->getPattern()->initPattern ( pcCU, uiTrDepth, uiAbsPartIdx ); pcCU->getPattern()->initAdiPattern( pcCU, uiAbsPartIdx, uiTrDepth, m_piYuvExt, m_iYuvExtStride, m_iYuvExtHeight, bAboveAvail, bLeftAvail ); //===== get prediction signal ===== predIntraLumaAng( pcCU->getPattern(), uiLumaPredMode, piPred, uiStride, uiWidth, uiHeight, bAboveAvail, bLeftAvail );//针对给定模式进行帧内预测,获取预测像素块 // save prediction //...... } else { // load prediction Pel* pPred = piPred; Pel* pPredBuf = m_pSharedPredTransformSkip[0]; Int k = 0; for( UInt uiY = 0; uiY < uiHeight; uiY++ ) { for( UInt uiX = 0; uiX < uiWidth; uiX++ ) { pPred[ uiX ] = pPredBuf[ k ++ ]; } pPred += uiStride; } } //===== get residual signal ===== { // get residual Pel* pOrg = piOrg; Pel* pPred = piPred; Pel* pResi = piResi; for( UInt uiY = 0; uiY < uiHeight; uiY++ ) { for( UInt uiX = 0; uiX < uiWidth; uiX++ ) { pResi[ uiX ] = pOrg[ uiX ] - pPred[ uiX ]; } pOrg += uiStride; pResi += uiStride; pPred += uiStride; } } //===== transform and quantization ===== //--- init rate estimation arrays for RDOQ --- if( useTransformSkip? m_pcEncCfg->getUseRDOQTS():m_pcEncCfg->getUseRDOQ()) { m_pcEntropyCoder->estimateBit( m_pcTrQuant->m_pcEstBitsSbac, uiWidth, uiWidth, TEXT_LUMA ); } //--- transform and quantization --- UInt uiAbsSum = 0; pcCU ->setTrIdxSubParts ( uiTrDepth, uiAbsPartIdx, uiFullDepth ); m_pcTrQuant->setQPforQuant ( pcCU->getQP( 0 ), TEXT_LUMA, pcCU->getSlice()->getSPS()->getQpBDOffsetY(), 0 ); #if RDOQ_CHROMA_LAMBDA m_pcTrQuant->selectLambda (TEXT_LUMA); #endif m_pcTrQuant->transformNxN ( pcCU, piResi, uiStride, pcCoeff, #if ADAPTIVE_QP_SELECTION pcArlCoeff, #endif uiWidth, uiHeight, uiAbsSum, TEXT_LUMA, uiAbsPartIdx,useTransformSkip );/*对预测残差进行变换和量化编码*/ //--- set coded block flag --- pcCU->setCbfSubParts ( ( uiAbsSum ? 1 : 0 ) << uiTrDepth, TEXT_LUMA, uiAbsPartIdx, uiFullDepth ); //--- inverse transform --- if( uiAbsSum ) { Int scalingListType = 0 + g_eTTable[(Int)TEXT_LUMA]; assert(scalingListType < 6); m_pcTrQuant->invtransformNxN( pcCU->getCUTransquantBypass(uiAbsPartIdx), TEXT_LUMA,pcCU->getLumaIntraDir( uiAbsPartIdx ), piResi, uiStride, pcCoeff, uiWidth, uiHeight, scalingListType, useTransformSkip ); } else { Pel* pResi = piResi; memset( pcCoeff, 0, sizeof( TCoeff ) * uiWidth * uiHeight ); for( UInt uiY = 0; uiY < uiHeight; uiY++ ) { memset( pResi, 0, sizeof( Pel ) * uiWidth ); pResi += uiStride; } } //===== reconstruction ===== //...... }其中,transformNxN函数实现了变换和量化功能,其核心功能在于调用了三个函数:xTransformSkip、xT和xQuant,分别实现skip模式变换、常规残差数据的变换和量化编码。
Void TComTrQuant::transformNxN( TComDataCU* pcCU, Pel* pcResidual, UInt uiStride, TCoeff* rpcCoeff, UInt uiWidth, UInt uiHeight, UInt& uiAbsSum, TextType eTType, UInt uiAbsPartIdx, Bool useTransformSkip ) { if (pcCU->getCUTransquantBypass(uiAbsPartIdx)) { uiAbsSum=0; for (UInt k = 0; k<uiHeight; k++) { for (UInt j = 0; j<uiWidth; j++) { rpcCoeff[k*uiWidth+j]= pcResidual[k*uiStride+j]; uiAbsSum += abs(pcResidual[k*uiStride+j]); } } return; } UInt uiMode; //luma intra pred if(eTType == TEXT_LUMA && pcCU->getPredictionMode(uiAbsPartIdx) == MODE_INTRA ) { uiMode = pcCU->getLumaIntraDir( uiAbsPartIdx ); } else { uiMode = REG_DCT; } uiAbsSum = 0; assert( (pcCU->getSlice()->getSPS()->getMaxTrSize() >= uiWidth) ); Int bitDepth = eTType == TEXT_LUMA ? g_bitDepthY : g_bitDepthC; if(useTransformSkip) { xTransformSkip(bitDepth, pcResidual, uiStride, m_plTempCoeff, uiWidth, uiHeight );//skip模式变换 } else { xT(bitDepth, uiMode, pcResidual, uiStride, m_plTempCoeff, uiWidth, uiHeight );//常规残差变换 } xQuant( pcCU, m_plTempCoeff, rpcCoeff, uiWidth, uiHeight, uiAbsSum, eTType, uiAbsPartIdx );//变换编码 }xT函数实现比较简单,主要为调用xTrMxN函数实现变换功能,该函数的实现为:
void xTrMxN(Int bitDepth, Short *block,Short *coeff, Int iWidth, Int iHeight, UInt uiMode) { Int shift_1st = g_aucConvertToBit[iWidth] + 1 + bitDepth-8; // log2(iWidth) - 1 + g_bitDepth - 8 Int shift_2nd = g_aucConvertToBit[iHeight] + 8; // log2(iHeight) + 6 Short tmp[ 64 * 64 ]; if( iWidth == 4 && iHeight == 4) { if (uiMode != REG_DCT)//对于4×4Intra块,采用dst整数变换 { fastForwardDst(block,tmp,shift_1st); // Forward DST BY FAST ALGORITHM, block input, tmp output fastForwardDst(tmp,coeff,shift_2nd); // Forward DST BY FAST ALGORITHM, tmp input, coeff output } else { partialButterfly4(block, tmp, shift_1st, iHeight); partialButterfly4(tmp, coeff, shift_2nd, iWidth); } } else if( iWidth == 8 && iHeight == 8)//8×8整数变换 { partialButterfly8( block, tmp, shift_1st, iHeight ); partialButterfly8( tmp, coeff, shift_2nd, iWidth ); } else if( iWidth == 16 && iHeight == 16)//16×16整数变换<pre name="code" class="cpp" style="font-size: 18px;"><span style="font-family: 'Microsoft YaHei';"> {</span>partialButterfly16( block, tmp, shift_1st, iHeight ); partialButterfly16( tmp, coeff, shift_2nd, iWidth ); } else if( iWidth == 32 && iHeight == 32)//32×32整数变换 { partialButterfly32( block, tmp, shift_1st, iHeight ); partialButterfly32( tmp, coeff, shift_2nd, iWidth ); }} 该函数调用不同预设的蝶形变换函数实现对某个大小残差信号矩阵的变换,以4×4为例:
void partialButterfly4(Short *src,Short *dst,Int shift, Int line) { Int j; Int E[2],O[2]; Int add = 1<<(shift-1); for (j=0; j<line; j++) { /* E and O */ E[0] = src[0] + src[3]; O[0] = src[0] - src[3]; E[1] = src[1] + src[2]; O[1] = src[1] - src[2]; dst[0] = (g_aiT4[0][0]*E[0] + g_aiT4[0][1]*E[1] + add)>>shift; dst[2*line] = (g_aiT4[2][0]*E[0] + g_aiT4[2][1]*E[1] + add)>>shift; dst[line] = (g_aiT4[1][0]*O[0] + g_aiT4[1][1]*O[1] + add)>>shift; dst[3*line] = (g_aiT4[3][0]*O[0] + g_aiT4[3][1]*O[1] + add)>>shift; src += 4; dst ++; } }