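I. The Sampling Flow
As the flowchart in the previous section showed, the actual rendering for image drawing happens in the blitRect function of some blitter. Let's start with a concrete blitRect implementation.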
void SkARGB32_Shader_Blitter::blitRect(int x, int y, int width, int height) {
    SkASSERT(x >= 0 && y >= 0 &&
             x + width <= fDevice.width() && y + height <= fDevice.height());

    uint32_t* device = fDevice.getAddr32(x, y);
    size_t deviceRB = fDevice.rowBytes();
    SkShader::Context* shaderContext = fShaderContext;
    SkPMColor* span = fBuffer;

    if (fConstInY) {
        if (fShadeDirectlyIntoDevice) {
            // shade the first row directly into the device
            shaderContext->shadeSpan(x, y, device, width);
            span = device;
            while (--height > 0) {
                device = (uint32_t*)((char*)device + deviceRB);
                memcpy(device, span, width << 2);
            }
        } else {
            shaderContext->shadeSpan(x, y, span, width);
            SkXfermode* xfer = fXfermode;
            if (xfer) {
                do {
                    xfer->xfer32(device, span, width, NULL);
                    y += 1;
                    device = (uint32_t*)((char*)device + deviceRB);
                } while (--height > 0);
            } else {
                SkBlitRow::Proc32 proc = fProc32;
                do {
                    proc(device, span, width, 255);
                    y += 1;
                    device = (uint32_t*)((char*)device + deviceRB);
                } while (--height > 0);
            }
        }
        return;
    }

    if (fShadeDirectlyIntoDevice) {
        void* ctx;
        SkShader::Context::ShadeProc shadeProc = shaderContext->asAShadeProc(&ctx);
        if (shadeProc) {
            do {
                shadeProc(ctx, x, y, device, width);
                y += 1;
                device = (uint32_t*)((char*)device + deviceRB);
            } while (--height > 0);
        } else {
            do {
                shaderContext->shadeSpan(x, y, device, width);
                y += 1;
                device = (uint32_t*)((char*)device + deviceRB);
            } while (--height > 0);
        }
    } else {
        SkXfermode* xfer = fXfermode;
        if (xfer) {
            do {
                shaderContext->shadeSpan(x, y, span, width);
                xfer->xfer32(device, span, width, NULL);
                y += 1;
                device = (uint32_t*)((char*)device + deviceRB);
            } while (--height > 0);
        } else {
            SkBlitRow::Proc32 proc = fProc32;
            do {
                shaderContext->shadeSpan(x, y, span, width);
                proc(device, span, width, 255);
                y += 1;
                device = (uint32_t*)((char*)device + deviceRB);
            } while (--height > 0);
        }
    }
}
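Here shadeSpan fetches n values from the shader, starting at coordinate (x, y), and writes them into the dst buffer.
For image drawing the shader is SkBitmapProcShader; here is its implementation: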
void SkBitmapProcShader::BitmapProcShaderContext::shadeSpan(int x, int y, SkPMColor dstC[],
                                                            int count) {
    const SkBitmapProcState& state = *fState;
    if (state.getShaderProc32()) {
        state.getShaderProc32()(state, x, y, dstC, count);
        return;
    }

    uint32_t buffer[BUF_MAX + TEST_BUFFER_EXTRA];
    SkBitmapProcState::MatrixProc mproc = state.getMatrixProc();
    SkBitmapProcState::SampleProc32 sproc = state.getSampleProc32();
    int max = state.maxCountForBufferSize(sizeof(buffer[0]) * BUF_MAX);

    SkASSERT(state.fBitmap->getPixels());
    SkASSERT(state.fBitmap->pixelRef() == NULL ||
             state.fBitmap->pixelRef()->isLocked());

    for (;;) {
        int n = count;
        if (n > max) {
            n = max;
        }
        SkASSERT(n > 0 && n < BUF_MAX*2);
#ifdef TEST_BUFFER_OVERRITE
        for (int i = 0; i < TEST_BUFFER_EXTRA; i++) {
            buffer[BUF_MAX + i] = TEST_PATTERN;
        }
#endif
        mproc(state, buffer, n, x, y);
#ifdef TEST_BUFFER_OVERRITE
        for (int j = 0; j < TEST_BUFFER_EXTRA; j++) {
            SkASSERT(buffer[BUF_MAX + j] == TEST_PATTERN);
        }
#endif
        sproc(state, buffer, n, dstC);

        if ((count -= n) == 0) {
            break;
        }
        SkASSERT(count > 0);
        x += n;
        dstC += n;
    }
}
The flow is:
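1. If a shaderProc exists, use it directly.
2. Compute max, the number of pixels that fit in the coordinate buffer; each pass handles n = min(count, max) pixels.
3. mproc computes n coordinates, and sproc fetches colors at those coordinates.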
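The batching in steps 2 and 3 can be sketched in isolation. This is a minimal illustration, not Skia's code: CoordProc, SampleProc and shadeSpanSketch are hypothetical names, and the callback signatures are simplified stand-ins for MatrixProc and SampleProc32.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for Skia's MatrixProc / SampleProc32.
using CoordProc  = void (*)(uint32_t* xy, int n, int x, int y);
using SampleProc = void (*)(const uint32_t* xy, int n, uint32_t* dst);

// Shade `count` pixels starting at device position (x, y),
// processing at most `maxPerBatch` pixels per pass.
inline void shadeSpanSketch(CoordProc mproc, SampleProc sproc,
                            int x, int y, uint32_t* dst,
                            int count, int maxPerBatch) {
    assert(count > 0 && maxPerBatch > 0);
    std::vector<uint32_t> buffer(maxPerBatch);  // scratch space for packed coordinates
    while (count > 0) {
        const int n = count < maxPerBatch ? count : maxPerBatch;
        mproc(buffer.data(), n, x, y);  // stage 1: device position -> packed source coords
        sproc(buffer.data(), n, dst);   // stage 2: packed source coords -> colors
        count -= n;
        x += n;
        dst += n;
    }
}
```

Passing two captureless lambdas as mproc and sproc is enough to try it; the point of the split is that either stage can be swapped out without touching the other.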
Note the three function pointers used above:
state.getShaderProc32
mproc = state.getMatrixProc
sproc = state.getSampleProc32
These three function pointers are set when the blitter is first created:
SkBlitter::Choose -> SkShader::createContext -> SkBitmapProcShader::onCreateContext -> SkBitmapProcState::chooseProcs
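1. (Optimization) When the requested quality is above SkPaint::kLow_FilterLevel, try to pre-scale the image.
2. Choose the matrix function: chooseMatrixProc.
3. Choose the sample function:
(1) High quality: setBitmapFilterProcs
(2) kLow_FilterLevel or kNone_FilterLevel: compute a flags index, selecting the function according to how the matrix transforms x and y and to the sampling requirements
4. (Optimization) When the conditions are met, choose a shader function, which replaces the matrix and sample functions.
5. (Optimization) platformProcs() goes on to pick optimized versions of the sample functions.
For RGB565 targets, SkShader's shadeSpan16 method is used instead. Its logic is similar and is not covered here.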
bool SkBitmapProcState::chooseProcs(const SkMatrix& inv, const SkPaint& paint) {
    SkASSERT(fOrigBitmap.width() && fOrigBitmap.height());

    fBitmap = NULL;
    fInvMatrix = inv;
    fFilterLevel = paint.getFilterLevel();

    SkASSERT(NULL == fScaledCacheID);

    // possiblyScaleImage will look to see if it can rescale the image as a
    // preprocess; either by scaling up to the target size, or by selecting
    // a nearby mipmap level. If it does, it will adjust the working
    // matrix as well as the working bitmap. It may also adjust the filter
    // quality to avoid re-filtering an already perfectly scaled image.
    if (!this->possiblyScaleImage()) {
        if (!this->lockBaseBitmap()) {
            return false;
        }
    }
    // The above logic should have always assigned fBitmap, but in case it
    // didn't, we check for that now...
    // TODO(dominikg): Ask humper@ if we can just use an SkASSERT(fBitmap)?
    if (NULL == fBitmap) {
        return false;
    }

    // If we are "still" kMedium_FilterLevel, then the request was not fulfilled by possiblyScale,
    // so we downgrade to kLow (so the rest of the sniffing code can assume that)
    if (SkPaint::kMedium_FilterLevel == fFilterLevel) {
        fFilterLevel = SkPaint::kLow_FilterLevel;
    }

    bool trivialMatrix = (fInvMatrix.getType() & ~SkMatrix::kTranslate_Mask) == 0;
    bool clampClamp = SkShader::kClamp_TileMode == fTileModeX &&
                      SkShader::kClamp_TileMode == fTileModeY;

    if (!(clampClamp || trivialMatrix)) {
        fInvMatrix.postIDiv(fOrigBitmap.width(), fOrigBitmap.height());
    }

    // Now that all possible changes to the matrix have taken place, check
    // to see if we're really close to a no-scale matrix. If so, explicitly
    // set it to be so. Subsequent code may inspect this matrix to choose
    // a faster path in this case.

    // This code will only execute if the matrix has some scale component;
    // if it's already pure translate then we won't do this inversion.
    if (matrix_only_scale_translate(fInvMatrix)) {
        SkMatrix forward;
        if (fInvMatrix.invert(&forward)) {
            if (clampClamp ? just_trans_clamp(forward, *fBitmap)
                           : just_trans_general(forward)) {
                SkScalar tx = -SkScalarRoundToScalar(forward.getTranslateX());
                SkScalar ty = -SkScalarRoundToScalar(forward.getTranslateY());
                fInvMatrix.setTranslate(tx, ty);
            }
        }
    }

    fInvProc = fInvMatrix.getMapXYProc();
    fInvType = fInvMatrix.getType();
    fInvSx = SkScalarToFixed(fInvMatrix.getScaleX());
    fInvSxFractionalInt = SkScalarToFractionalInt(fInvMatrix.getScaleX());
    fInvKy = SkScalarToFixed(fInvMatrix.getSkewY());
    fInvKyFractionalInt = SkScalarToFractionalInt(fInvMatrix.getSkewY());

    fAlphaScale = SkAlpha255To256(paint.getAlpha());

    fShaderProc32 = NULL;
    fShaderProc16 = NULL;
    fSampleProc32 = NULL;
    fSampleProc16 = NULL;

    // recompute the triviality of the matrix here because we may have
    // changed it!
    trivialMatrix = (fInvMatrix.getType() & ~SkMatrix::kTranslate_Mask) == 0;

    if (SkPaint::kHigh_FilterLevel == fFilterLevel) {
        // If this is still set, that means we wanted HQ sampling
        // but couldn't do it as a preprocess. Let's try to install
        // the scanline version of the HQ sampler. If that process fails,
        // downgrade to bilerp.

        // NOTE: Might need to be careful here in the future when we want
        // to have the platform proc have a shot at this; it's possible that
        // the chooseBitmapFilterProc will fail to install a shader but a
        // platform-specific one might succeed, so it might be premature here
        // to fall back to bilerp. This needs thought.
        if (!this->setBitmapFilterProcs()) {
            fFilterLevel = SkPaint::kLow_FilterLevel;
        }
    }

    if (SkPaint::kLow_FilterLevel == fFilterLevel) {
        // Only try bilerp if the matrix is "interesting" and
        // the image has a suitable size.
        if (fInvType <= SkMatrix::kTranslate_Mask ||
            !valid_for_filtering(fBitmap->width() | fBitmap->height())) {
            fFilterLevel = SkPaint::kNone_FilterLevel;
        }
    }

    // At this point, we know exactly what kind of sampling the per-scanline
    // shader will perform.
    fMatrixProc = this->chooseMatrixProc(trivialMatrix);
    // TODO(dominikg): SkASSERT(fMatrixProc) instead? chooseMatrixProc never returns NULL.
    if (NULL == fMatrixProc) {
        return false;
    }

    ///////////////////////////////////////////////////////////////////////

    // No need to do this if we're doing HQ sampling; if filter quality is
    // still set to HQ by the time we get here, then we must have installed
    // the shader procs above and can skip all this.
    if (fFilterLevel < SkPaint::kHigh_FilterLevel) {
        int index = 0;
        if (fAlphaScale < 256) {  // note: this distinction is not used for D16
            index |= 1;
        }
        if (fInvType <= (SkMatrix::kTranslate_Mask | SkMatrix::kScale_Mask)) {
            index |= 2;
        }
        if (fFilterLevel > SkPaint::kNone_FilterLevel) {
            index |= 4;
        }
        // bits 3,4,5 encoding the source bitmap format
        switch (fBitmap->colorType()) {
            case kN32_SkColorType:
                index |= 0;
                break;
            case kRGB_565_SkColorType:
                index |= 8;
                break;
            case kIndex_8_SkColorType:
                index |= 16;
                break;
            case kARGB_4444_SkColorType:
                index |= 24;
                break;
            case kAlpha_8_SkColorType:
                index |= 32;
                fPaintPMColor = SkPreMultiplyColor(paint.getColor());
                break;
            default:
                // TODO(dominikg): Should we ever get here? SkASSERT(false) instead?
                return false;
        }

#if !SK_ARM_NEON_IS_ALWAYS
        static const SampleProc32 gSkBitmapProcStateSample32[] = {
            S32_opaque_D32_nofilter_DXDY, S32_alpha_D32_nofilter_DXDY,
            S32_opaque_D32_nofilter_DX, S32_alpha_D32_nofilter_DX,
            S32_opaque_D32_filter_DXDY, S32_alpha_D32_filter_DXDY,
            S32_opaque_D32_filter_DX, S32_alpha_D32_filter_DX,

            S16_opaque_D32_nofilter_DXDY, S16_alpha_D32_nofilter_DXDY,
            S16_opaque_D32_nofilter_DX, S16_alpha_D32_nofilter_DX,
            S16_opaque_D32_filter_DXDY, S16_alpha_D32_filter_DXDY,
            S16_opaque_D32_filter_DX, S16_alpha_D32_filter_DX,

            SI8_opaque_D32_nofilter_DXDY, SI8_alpha_D32_nofilter_DXDY,
            SI8_opaque_D32_nofilter_DX, SI8_alpha_D32_nofilter_DX,
            SI8_opaque_D32_filter_DXDY, SI8_alpha_D32_filter_DXDY,
            SI8_opaque_D32_filter_DX, SI8_alpha_D32_filter_DX,

            S4444_opaque_D32_nofilter_DXDY, S4444_alpha_D32_nofilter_DXDY,
            S4444_opaque_D32_nofilter_DX, S4444_alpha_D32_nofilter_DX,
            S4444_opaque_D32_filter_DXDY, S4444_alpha_D32_filter_DXDY,
            S4444_opaque_D32_filter_DX, S4444_alpha_D32_filter_DX,

            // A8 treats alpha/opaque the same (equally efficient)
            SA8_alpha_D32_nofilter_DXDY, SA8_alpha_D32_nofilter_DXDY,
            SA8_alpha_D32_nofilter_DX, SA8_alpha_D32_nofilter_DX,
            SA8_alpha_D32_filter_DXDY, SA8_alpha_D32_filter_DXDY,
            SA8_alpha_D32_filter_DX, SA8_alpha_D32_filter_DX
        };

        static const SampleProc16 gSkBitmapProcStateSample16[] = {
            S32_D16_nofilter_DXDY, S32_D16_nofilter_DX,
            S32_D16_filter_DXDY, S32_D16_filter_DX,

            S16_D16_nofilter_DXDY, S16_D16_nofilter_DX,
            S16_D16_filter_DXDY, S16_D16_filter_DX,

            SI8_D16_nofilter_DXDY, SI8_D16_nofilter_DX,
            SI8_D16_filter_DXDY, SI8_D16_filter_DX,

            // Don't support 4444 -> 565
            NULL, NULL, NULL, NULL,
            // Don't support A8 -> 565
            NULL, NULL, NULL, NULL
        };
#endif

        fSampleProc32 = SK_ARM_NEON_WRAP(gSkBitmapProcStateSample32)[index];
        index >>= 1;    // shift away any opaque/alpha distinction
        fSampleProc16 = SK_ARM_NEON_WRAP(gSkBitmapProcStateSample16)[index];

        // our special-case shaderprocs
        if (SK_ARM_NEON_WRAP(S16_D16_filter_DX) == fSampleProc16) {
            if (clampClamp) {
                fShaderProc16 = SK_ARM_NEON_WRAP(Clamp_S16_D16_filter_DX_shaderproc);
            } else if (SkShader::kRepeat_TileMode == fTileModeX &&
                       SkShader::kRepeat_TileMode == fTileModeY) {
                fShaderProc16 = SK_ARM_NEON_WRAP(Repeat_S16_D16_filter_DX_shaderproc);
            }
        } else if (SK_ARM_NEON_WRAP(SI8_opaque_D32_filter_DX) == fSampleProc32 && clampClamp) {
            fShaderProc32 = SK_ARM_NEON_WRAP(Clamp_SI8_opaque_D32_filter_DX_shaderproc);
        }

        if (NULL == fShaderProc32) {
            fShaderProc32 = this->chooseShaderProc32();
        }
    }

    // see if our platform has any accelerated overrides
    this->platformProcs();

    return true;
}
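The flags-to-index computation in chooseProcs is easier to follow pulled out on its own. The sketch below mirrors the bit layout shown in the code (bit 0 alpha, bit 1 DX, bit 2 filter, bits 3 to 5 source format); SrcFormat and sampleProcIndex are hypothetical names introduced only for illustration.

```cpp
#include <cstdint>

// Hypothetical enum; the values are the offsets OR-ed into the index in chooseProcs.
enum class SrcFormat : int { kN32 = 0, k565 = 8, kIndex8 = 16, k4444 = 24, kA8 = 32 };

// Builds the index into the 32-bit sample-proc table.
//   bit 0     : paint alpha is not fully opaque (the "alpha" variants)
//   bit 1     : matrix is at most scale + translate (the "DX" variants)
//   bit 2     : bilinear filtering is on (the "filter" variants)
//   bits 3..5 : source bitmap format
inline int sampleProcIndex(bool hasAlpha, bool scaleTranslateOnly,
                           bool filter, SrcFormat fmt) {
    int index = 0;
    if (hasAlpha)           index |= 1;
    if (scaleTranslateOnly) index |= 2;
    if (filter)             index |= 4;
    index |= static_cast<int>(fmt);
    return index;
}

// The 16-bit table has no opaque/alpha split, so bit 0 is shifted away.
inline int sampleProcIndex16(int index32) { return index32 >> 1; }
```

For example, an opaque N32 source drawn with a scale-only matrix and bilinear filtering gives index 6, which is the slot holding S32_opaque_D32_filter_DX in the table above.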
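II. MatrixProc and SampleProc
MatrixProc's job is to generate a set of coordinates. SampleProc then fetches pixels at those coordinates and combines the samples.
Let's work backwards and look at sampleProc first, to see how this coordinate set is used:
The nofilter_DX family:
The nofilter_DXDY family: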
void MAKENAME(_nofilter_DXDY)(const SkBitmapProcState& s,
                              const uint32_t* SK_RESTRICT xy,
                              int count, DSTTYPE* SK_RESTRICT colors) {
    for (int i = (count >> 1); i > 0; --i) {
        XY = *xy++;
        SkASSERT((XY >> 16) < (unsigned)s.fBitmap->height() &&
                 (XY & 0xFFFF) < (unsigned)s.fBitmap->width());
        src = ((const SRCTYPE*)(srcAddr + (XY >> 16) * rb))[XY & 0xFFFF];
        *colors++ = RETURNDST(src);

        XY = *xy++;
        SkASSERT((XY >> 16) < (unsigned)s.fBitmap->height() &&
                 (XY & 0xFFFF) < (unsigned)s.fBitmap->width());
        src = ((const SRCTYPE*)(srcAddr + (XY >> 16) * rb))[XY & 0xFFFF];
        *colors++ = RETURNDST(src);
    }
    if (count & 1) {
        XY = *xy++;
        SkASSERT((XY >> 16) < (unsigned)s.fBitmap->height() &&
                 (XY & 0xFFFF) < (unsigned)s.fBitmap->width());
        src = ((const SRCTYPE*)(srcAddr + (XY >> 16) * rb))[XY & 0xFFFF];
        *colors++ = RETURNDST(src);
    }
}
These two families simply read the image pixel at coordinate (x, y).
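As a rough illustration of the nofilter_DXDY case, the sketch below packs y into the high 16 bits and x into the low 16 bits, matching the `XY >> 16` and `XY & 0xFFFF` expressions in the code above. packNoFilterXY and sampleNearest are hypothetical helpers, and the source is assumed to be a tightly packed 32-bit image (no row padding).

```cpp
#include <cstddef>
#include <cstdint>

// Pack a source coordinate the way the nofilter_DXDY sampler reads it:
// y in the high 16 bits, x in the low 16 bits.
inline uint32_t packNoFilterXY(unsigned x, unsigned y) {
    return (static_cast<uint32_t>(y) << 16) | (x & 0xFFFFu);
}

// Nearest-neighbor sampling: one packed coordinate in, one pixel out.
// `rowPixels` is the number of pixels per source row (assumed: no padding).
inline void sampleNearest(const uint32_t* src, std::size_t rowPixels,
                          const uint32_t* xy, int count, uint32_t* dst) {
    for (int i = 0; i < count; ++i) {
        const unsigned y = xy[i] >> 16;
        const unsigned x = xy[i] & 0xFFFFu;
        dst[i] = src[y * rowPixels + x];
    }
}
```

Because no blending happens, the sampler is a pure table lookup; all the interesting work (transform, tiling) has already been done by the matrix proc.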
The filter_DX family:
The filter_DXDY family:
void MAKENAME(_filter_DX)(const SkBitmapProcState& s,
                          const uint32_t* SK_RESTRICT xy,
                          int count, DSTTYPE* SK_RESTRICT colors) {
    SkASSERT(count > 0 && colors != NULL);
    SkASSERT(s.fFilterLevel != SkPaint::kNone_FilterLevel);
    SkDEBUGCODE(CHECKSTATE(s);)

#ifdef PREAMBLE
    PREAMBLE(s);
#endif
    const char* SK_RESTRICT srcAddr = (const char*)s.fBitmap->getPixels();
    size_t rb = s.fBitmap->rowBytes();
    unsigned subY;
    const SRCTYPE* SK_RESTRICT row0;
    const SRCTYPE* SK_RESTRICT row1;

    // setup row ptrs and update proc_table
    {
        uint32_t XY = *xy++;
        unsigned y0 = XY >> 14;
        row0 = (const SRCTYPE*)(srcAddr + (y0 >> 4) * rb);
        row1 = (const SRCTYPE*)(srcAddr + (XY & 0x3FFF) * rb);
        subY = y0 & 0xF;
    }

    do {
        uint32_t XX = *xy++;    // x0:14 | 4 | x1:14
        unsigned x0 = XX >> 14;
        unsigned x1 = XX & 0x3FFF;
        unsigned subX = x0 & 0xF;
        x0 >>= 4;

        FILTER_PROC(subX, subY,
                    SRC_TO_FILTER(row0[x0]),
                    SRC_TO_FILTER(row0[x1]),
                    SRC_TO_FILTER(row1[x0]),
                    SRC_TO_FILTER(row1[x1]),
                    colors);
        colors += 1;
    } while (--count != 0);

#ifdef POSTAMBLE
    POSTAMBLE(s);
#endif
}

void MAKENAME(_filter_DXDY)(const SkBitmapProcState& s,
                            const uint32_t* SK_RESTRICT xy,
                            int count, DSTTYPE* SK_RESTRICT colors) {
    SkASSERT(count > 0 && colors != NULL);
    SkASSERT(s.fFilterLevel != SkPaint::kNone_FilterLevel);
    SkDEBUGCODE(CHECKSTATE(s);)

#ifdef PREAMBLE
    PREAMBLE(s);
#endif
    const char* SK_RESTRICT srcAddr = (const char*)s.fBitmap->getPixels();
    size_t rb = s.fBitmap->rowBytes();

    do {
        uint32_t data = *xy++;
        unsigned y0 = data >> 14;
        unsigned y1 = data & 0x3FFF;
        unsigned subY = y0 & 0xF;
        y0 >>= 4;

        data = *xy++;
        unsigned x0 = data >> 14;
        unsigned x1 = data & 0x3FFF;
        unsigned subX = x0 & 0xF;
        x0 >>= 4;

        const SRCTYPE* SK_RESTRICT row0 = (const SRCTYPE*)(srcAddr + y0 * rb);
        const SRCTYPE* SK_RESTRICT row1 = (const SRCTYPE*)(srcAddr + y1 * rb);

        FILTER_PROC(subX, subY,
                    SRC_TO_FILTER(row0[x0]),
                    SRC_TO_FILTER(row0[x1]),
                    SRC_TO_FILTER(row1[x0]),
                    SRC_TO_FILTER(row1[x1]),
                    colors);
        colors += 1;
    } while (--count != 0);

#ifdef POSTAMBLE
    POSTAMBLE(s);
#endif
}
After the four neighboring pixels are fetched, they are passed through the filter.
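Dizzy yet? It boils down to this:
nofilter_DX: the first 32-bit value holds y, and each remaining 32-bit value packs two x coordinates.
nofilter_DXDY: 16 bits hold x and 16 bits hold y. This is nearest-neighbor sampling: just read the value at (x, y).
filter_DXDY family: each coordinate is one 32-bit value in 14:4:14 layout, with Y and X values interleaved (Y first, then X). The middle 4 bits are the fractional offset between the two neighbors, scaled by 16 and approximated as an integer.
filter_DX family: the first value is the Y coordinate stored as 14:4:14, and every following value is an X coordinate, also stored as 14:4:14. The two outer fields are the neighboring coordinates and the middle field is the distance scaled by 16. This covers the case where y is constant along a row (pure scale, or translation by a fractional amount); it is bilinear interpolation all the same.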
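The 14:4:14 layout and the way the 4-bit fraction drives the blend can be sketched as follows. packFilter, unpackFilter and bilerp are hypothetical helpers; the blend works on a single 8-bit channel and is only meant to show the weighting, not to reproduce Skia's FILTER_PROC.

```cpp
#include <cstdint>

struct FilterCoord {
    unsigned i0;   // first neighbor index (14 bits)
    unsigned sub;  // fractional offset in 1/16 steps (4 bits)
    unsigned i1;   // second neighbor index (14 bits)
};

// Pack as i0:14 | sub:4 | i1:14, the layout read by the filter samplers.
inline uint32_t packFilter(unsigned i0, unsigned sub, unsigned i1) {
    return (((static_cast<uint32_t>(i0) << 4) | (sub & 0xFu)) << 14) | (i1 & 0x3FFFu);
}

inline FilterCoord unpackFilter(uint32_t v) {
    const unsigned hi = v >> 14;  // i0:14 | sub:4
    return FilterCoord{hi >> 4, hi & 0xFu, v & 0x3FFFu};
}

// Single-channel bilinear blend. subX/subY are in [0, 15]; the four weights
// always sum to 256, so the result is a plain >> 8 away from the weighted sum.
inline unsigned bilerp(unsigned subX, unsigned subY,
                       unsigned a00, unsigned a01,
                       unsigned a10, unsigned a11) {
    const unsigned w00 = (16 - subX) * (16 - subY);
    const unsigned w01 = subX * (16 - subY);
    const unsigned w10 = (16 - subX) * subY;
    const unsigned w11 = subX * subY;
    return (a00 * w00 + a01 * w01 + a10 * w10 + a11 * w11) >> 8;
}
```

With subX = 8 and subY = 0 the result lies halfway between a00 and a01, which is what a fractional x offset of 8/16 should produce.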
Now let's look at how matrixproc is implemented,
starting with the code of chooseMatrixProc:
SkBitmapProcState::MatrixProc SkBitmapProcState::chooseMatrixProc(bool trivial_matrix) {
//    test_int_tileprocs();
    // check for our special case when there is no scale/affine/perspective
    if (trivial_matrix) {
        SkASSERT(SkPaint::kNone_FilterLevel == fFilterLevel);
        fIntTileProcY = choose_int_tile_proc(fTileModeY);
        switch (fTileModeX) {
            case SkShader::kClamp_TileMode:
                return clampx_nofilter_trans;
            case SkShader::kRepeat_TileMode:
                return repeatx_nofilter_trans;
            case SkShader::kMirror_TileMode:
                return mirrorx_nofilter_trans;
        }
    }

    int index = 0;
    if (fFilterLevel != SkPaint::kNone_FilterLevel) {
        index = 1;
    }
    if (fInvType & SkMatrix::kPerspective_Mask) {
        index += 4;
    } else if (fInvType & SkMatrix::kAffine_Mask) {
        index += 2;
    }

    if (SkShader::kClamp_TileMode == fTileModeX &&
        SkShader::kClamp_TileMode == fTileModeY) {
        // clamp gets special version of filterOne
        fFilterOneX = SK_Fixed1;
        fFilterOneY = SK_Fixed1;
        return SK_ARM_NEON_WRAP(ClampX_ClampY_Procs)[index];
    }

    // all remaining procs use this form for filterOne
    fFilterOneX = SK_Fixed1 / fBitmap->width();
    fFilterOneY = SK_Fixed1 / fBitmap->height();

    if (SkShader::kRepeat_TileMode == fTileModeX &&
        SkShader::kRepeat_TileMode == fTileModeY) {
        return SK_ARM_NEON_WRAP(RepeatX_RepeatY_Procs)[index];
    }

    fTileProcX = choose_tile_proc(fTileModeX);
    fTileProcY = choose_tile_proc(fTileModeY);
    fTileLowBitsProcX = choose_tile_lowbits_proc(fTileModeX);
    fTileLowBitsProcY = choose_tile_lowbits_proc(fTileModeY);
    return GeneralXY_Procs[index];
}
Some of these functions cannot be found by searching for their symbols. Notice that SkBitmapProcState.cpp includes the header SkBitmapProcState_matrix.h several times:
#if !SK_ARM_NEON_IS_ALWAYS
#define MAKENAME(suffix)        ClampX_ClampY ## suffix
#define TILEX_PROCF(fx, max)    SkClampMax((fx) >> 16, max)
#define TILEY_PROCF(fy, max)    SkClampMax((fy) >> 16, max)
#define TILEX_LOW_BITS(fx, max) (((fx) >> 12) & 0xF)
#define TILEY_LOW_BITS(fy, max) (((fy) >> 12) & 0xF)
#define CHECK_FOR_DECAL
#include "SkBitmapProcState_matrix.h"
The header reads as follows:
/*
 * Copyright 2011 Google Inc.
 *
 * Use of this source code is governed by a BSD-style license that can be
 * found in the LICENSE file.
 */

#include "SkMath.h"
#include "SkMathPriv.h"

#define SCALE_FILTER_NAME       MAKENAME(_filter_scale)
#define AFFINE_FILTER_NAME      MAKENAME(_filter_affine)
#define PERSP_FILTER_NAME       MAKENAME(_filter_persp)

#define PACK_FILTER_X_NAME  MAKENAME(_pack_filter_x)
#define PACK_FILTER_Y_NAME  MAKENAME(_pack_filter_y)

#ifndef PREAMBLE
    #define PREAMBLE(state)
    #define PREAMBLE_PARAM_X
    #define PREAMBLE_PARAM_Y
    #define PREAMBLE_ARG_X
    #define PREAMBLE_ARG_Y
#endif

// declare functions externally to suppress warnings.
void SCALE_FILTER_NAME(const SkBitmapProcState& s,
                       uint32_t xy[], int count, int x, int y);
void AFFINE_FILTER_NAME(const SkBitmapProcState& s,
                        uint32_t xy[], int count, int x, int y);
void PERSP_FILTER_NAME(const SkBitmapProcState& s,
                       uint32_t* SK_RESTRICT xy, int count,
                       int x, int y);

static inline uint32_t PACK_FILTER_Y_NAME(SkFixed f, unsigned max,
                                          SkFixed one PREAMBLE_PARAM_Y) {
    unsigned i = TILEY_PROCF(f, max);
    i = (i << 4) | TILEY_LOW_BITS(f, max);
    return (i << 14) | (TILEY_PROCF((f + one), max));
}

static inline uint32_t PACK_FILTER_X_NAME(SkFixed f, unsigned max,
                                          SkFixed one PREAMBLE_PARAM_X) {
    unsigned i = TILEX_PROCF(f, max);
    i = (i << 4) | TILEX_LOW_BITS(f, max);
    return (i << 14) | (TILEX_PROCF((f + one), max));
}

void SCALE_FILTER_NAME(const SkBitmapProcState& s,
                       uint32_t xy[], int count, int x, int y) {
    SkASSERT((s.fInvType & ~(SkMatrix::kTranslate_Mask |
                             SkMatrix::kScale_Mask)) == 0);
    SkASSERT(s.fInvKy == 0);

    PREAMBLE(s);

    const unsigned maxX = s.fBitmap->width() - 1;
    const SkFixed one = s.fFilterOneX;
    const SkFractionalInt dx = s.fInvSxFractionalInt;
    SkFractionalInt fx;

    {
        SkPoint pt;
        s.fInvProc(s.fInvMatrix, SkIntToScalar(x) + SK_ScalarHalf,
                                 SkIntToScalar(y) + SK_ScalarHalf, &pt);
        const SkFixed fy = SkScalarToFixed(pt.fY) - (s.fFilterOneY >> 1);
        const unsigned maxY = s.fBitmap->height() - 1;
        // compute our two Y values up front
        *xy++ = PACK_FILTER_Y_NAME(fy, maxY, s.fFilterOneY PREAMBLE_ARG_Y);
        // now initialize fx
        fx = SkScalarToFractionalInt(pt.fX) - (SkFixedToFractionalInt(one) >> 1);
    }

#ifdef CHECK_FOR_DECAL
    if (can_truncate_to_fixed_for_decal(fx, dx, count, maxX)) {
        decal_filter_scale(xy, SkFractionalIntToFixed(fx),
                           SkFractionalIntToFixed(dx), count);
    } else
#endif
    {
        do {
            SkFixed fixedFx = SkFractionalIntToFixed(fx);
            *xy++ = PACK_FILTER_X_NAME(fixedFx, maxX, one PREAMBLE_ARG_X);
            fx += dx;
        } while (--count != 0);
    }
}

void AFFINE_FILTER_NAME(const SkBitmapProcState& s,
                        uint32_t xy[], int count, int x, int y) {
    SkASSERT(s.fInvType & SkMatrix::kAffine_Mask);
    SkASSERT((s.fInvType & ~(SkMatrix::kTranslate_Mask |
                             SkMatrix::kScale_Mask |
                             SkMatrix::kAffine_Mask)) == 0);

    PREAMBLE(s);
    SkPoint srcPt;
    s.fInvProc(s.fInvMatrix,
               SkIntToScalar(x) + SK_ScalarHalf,
               SkIntToScalar(y) + SK_ScalarHalf, &srcPt);

    SkFixed oneX = s.fFilterOneX;
    SkFixed oneY = s.fFilterOneY;
    SkFixed fx = SkScalarToFixed(srcPt.fX) - (oneX >> 1);
    SkFixed fy = SkScalarToFixed(srcPt.fY) - (oneY >> 1);
    SkFixed dx = s.fInvSx;
    SkFixed dy = s.fInvKy;
    unsigned maxX = s.fBitmap->width() - 1;
    unsigned maxY = s.fBitmap->height() - 1;

    do {
        *xy++ = PACK_FILTER_Y_NAME(fy, maxY, oneY PREAMBLE_ARG_Y);
        fy += dy;
        *xy++ = PACK_FILTER_X_NAME(fx, maxX, oneX PREAMBLE_ARG_X);
        fx += dx;
    } while (--count != 0);
}

void PERSP_FILTER_NAME(const SkBitmapProcState& s,
                       uint32_t* SK_RESTRICT xy, int count,
                       int x, int y) {
    SkASSERT(s.fInvType & SkMatrix::kPerspective_Mask);

    PREAMBLE(s);
    unsigned maxX = s.fBitmap->width() - 1;
    unsigned maxY = s.fBitmap->height() - 1;
    SkFixed oneX = s.fFilterOneX;
    SkFixed oneY = s.fFilterOneY;

    SkPerspIter iter(s.fInvMatrix,
                     SkIntToScalar(x) + SK_ScalarHalf,
                     SkIntToScalar(y) + SK_ScalarHalf, count);

    while ((count = iter.next()) != 0) {
        const SkFixed* SK_RESTRICT srcXY = iter.getXY();
        do {
            *xy++ = PACK_FILTER_Y_NAME(srcXY[1] - (oneY >> 1), maxY,
                                       oneY PREAMBLE_ARG_Y);
            *xy++ = PACK_FILTER_X_NAME(srcXY[0] - (oneX >> 1), maxX,
                                       oneX PREAMBLE_ARG_X);
            srcXY += 2;
        } while (--count != 0);
    }
}

#undef MAKENAME
#undef TILEX_PROCF
#undef TILEY_PROCF
#ifdef CHECK_FOR_DECAL
    #undef CHECK_FOR_DECAL
#endif

#undef SCALE_FILTER_NAME
#undef AFFINE_FILTER_NAME
#undef PERSP_FILTER_NAME

#undef PREAMBLE
#undef PREAMBLE_PARAM_X
#undef PREAMBLE_PARAM_Y
#undef PREAMBLE_ARG_X
#undef PREAMBLE_ARG_Y

#undef TILEX_LOW_BITS
#undef TILEY_LOW_BITS
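Now it's clear: these function names are assembled by macros. (Godlike code...)
I won't go through the coordinate math in detail; it can be derived from first principles. There are three coordinate modes: CLAMP (out-of-range coordinates are pinned to the edge), REPEAT (out-of-range coordinates wrap around to the start), and MIRROR (out-of-range coordinates reverse the sampling direction).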
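Both ideas, macro-assembled function names and the three tile modes, fit in one small sketch. Everything here is illustrative: DEFINE_NOFILTER_SCALE stands in for the repeatedly included header, it works on plain integer pixel indices rather than Skia's fixed-point values, and the generated names (ClampX_nofilter_scale and so on) are invented for this example.

```cpp
#include <cstdint>

// Tile an integer index i into [0, n) under each mode (n > 0 assumed).
inline int clampTile(int i, int n)  { return i < 0 ? 0 : (i >= n ? n - 1 : i); }
inline int repeatTile(int i, int n) { return ((i % n) + n) % n; }
inline int mirrorTile(int i, int n) {
    const int m = ((i % (2 * n)) + 2 * n) % (2 * n);  // position within one 2n-long period
    return m < n ? m : 2 * n - 1 - m;                 // second half runs backwards
}

// Hypothetical stand-in for the multiply-included header: each expansion stamps
// out one coordinate generator whose name is pasted together with ##.
#define DEFINE_NOFILTER_SCALE(PREFIX, TILE)                                   \
    inline void PREFIX##_nofilter_scale(uint32_t* xs, int count,              \
                                        int x, int dx, int width) {           \
        for (int i = 0; i < count; ++i, x += dx) {                            \
            xs[i] = static_cast<uint32_t>(TILE(x, width));                    \
        }                                                                     \
    }

DEFINE_NOFILTER_SCALE(ClampX, clampTile)    // defines ClampX_nofilter_scale
DEFINE_NOFILTER_SCALE(RepeatX, repeatTile)  // defines RepeatX_nofilter_scale
DEFINE_NOFILTER_SCALE(MirrorX, mirrorTile)  // defines MirrorX_nofilter_scale
#undef DEFINE_NOFILTER_SCALE
```

This also shows why a symbol search fails: ClampX_nofilter_scale never appears as a literal token in the source, only after preprocessing.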
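The sampleProc functions are assembled the same way and are not covered in detail.
III. Advanced Interpolation
Bilinear interpolation is good enough in most cases, but the result is still unsatisfying when an image is enlarged. Advanced interpolation gives better quality, at a large cost in performance.
At present this path cannot be reached from Android's Java code; I don't know whether Chromium uses it.
Key points:
1. setBitmapFilterProcs checks whether advanced interpolation is supported. If it is, shaderProc is set to highQualityFilter32/highQualityFilter16 (that is, a function that computes coordinates and samples pixels on its own).
2. highQualityFilter first maps through the transformation matrix to find the source point.
3. Using the sampling window of SkBitmapFilter, highQualityFilter looks up a weight for every point in the window according to its distance from the source point, then sums the weighted values to get the final pixel.
4. SkBitmapFilter supplies the weights by table lookup; the precomputation is done by its subclasses.
5. Skia currently uses the Mitchell bicubic filter.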
SK_CONF_DECLARE(const char *, c_bitmapFilter, "bitmap.filter", "mitchell", "Which scanline bitmap filter to use [mitchell, lanczos, hamming, gaussian, triangle, box]");
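To make points 3 to 5 concrete, here is a minimal 1-D sketch of Mitchell-weighted resampling. It uses the standard Mitchell-Netravali kernel with B = C = 1/3 and evaluates the weights directly instead of using a precomputed table; it is not taken from SkBitmapFilter.cpp, and mitchell and resample1D are hypothetical names.

```cpp
#include <cmath>

// Mitchell-Netravali cubic kernel; support is |x| < 2.
inline double mitchell(double x, double B = 1.0 / 3.0, double C = 1.0 / 3.0) {
    x = std::fabs(x);
    if (x < 1.0) {
        return ((12 - 9 * B - 6 * C) * x * x * x +
                (-18 + 12 * B + 6 * C) * x * x +
                (6 - 2 * B)) / 6.0;
    }
    if (x < 2.0) {
        return ((-B - 6 * C) * x * x * x +
                (6 * B + 30 * C) * x * x +
                (-12 * B - 48 * C) * x +
                (8 * B + 24 * C)) / 6.0;
    }
    return 0.0;
}

// Resample a 1-D signal of n samples (n > 0) at source position `pos`.
// Each sample in the 4-wide window is weighted by its distance from `pos`;
// indices outside the signal are clamped to the edge.
inline double resample1D(const double* src, int n, double pos) {
    const int base = static_cast<int>(std::floor(pos));
    double sum = 0.0;
    double weightSum = 0.0;
    for (int i = base - 1; i <= base + 2; ++i) {
        const int ci = i < 0 ? 0 : (i >= n ? n - 1 : i);
        const double w = mitchell(pos - i);
        sum += w * src[ci];
        weightSum += w;
    }
    return sum / weightSum;  // normalize so a flat signal stays flat
}
```

A 2-D filter applies the same weighting over a window in both axes, and a table-driven version would sample mitchell() at fixed steps once and look the weights up afterwards.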
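See external/skia/src/core/SkBitmapFilter.cpp for the full code. Although this code is almost never exercised, the formulas in it are well worth borrowing and need only small changes to be used in a GLSL shader.
After reading this code, one can make an irresponsible guess: Skia's original design considered only nearest-neighbor and bilinear interpolation, so it adopted this template approach to keep the amount of code to a minimum. MatrixProc and SampleProc can also be SIMD-optimized separately later on (Intel SSE and ARM NEON) to improve performance.
But for linear interpolation, the two-step method (coordinate generation followed by sampling) is not algorithmically optimal to begin with, and shader functions later had to be introduced to optimize certain scenarios. Higher-order interpolation cannot be implemented within this design, so it was bolted on like a patch.
IV. Summary
A few takeaways from reading this part of the code.
First: drawing an image looks simple, but actually rendering it is anything but, and chasing quality brings in all sorts of extra variations.
Second: where performance matters, templates are a disaster. When a function has to be rewritten and it is template-generated, you must define a new function and swap it in, which quickly makes the code look messy.
Third: from the image-drawing perspective, Skia's rendering performance is already very good but far from the limit. There is still room for optimization if this path turns into a bottleneck. The discussion of Skia performance is left to the final chapter of this Skia series.
Fourth: OpenGL + GLSL really is easier and far more efficient; software rendering performs poorly in complex scenes.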