TensorFlow, Google's Deep Learning Library: The Image Processing Module

Module: tf.image

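This post covers image processing in TensorFlow. Like the file I/O module discussed in an earlier post, this module is mostly imported from the Python side of the library.

According to the official documentation, the module mainly provides the following functions.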

Functions

adjust_brightness(...): Adjust the brightness of RGB or Grayscale images.

adjust_contrast(...): Adjust contrast of RGB or grayscale images.

adjust_gamma(...): Performs Gamma Correction on the input image.

adjust_hue(...): Adjust hue of an RGB image.

adjust_saturation(...): Adjust saturation of an RGB image.

central_crop(...): Crop the central region of the image.

convert_image_dtype(...): Convert image to dtype, scaling its values if needed.

crop_and_resize(...): Extracts crops from the input image tensor and bilinearly resizes them (possibly with aspect ratio change) to a common output size specified by crop_size.

crop_to_bounding_box(...): Crops an image to a specified bounding box.

decode_and_crop_jpeg(...): Decode and Crop a JPEG-encoded image to a uint8 tensor.

decode_bmp(...): Decode the first frame of a BMP-encoded image to a uint8 tensor.

decode_gif(...): Decode the first frame of a GIF-encoded image to a uint8 tensor.

decode_image(...): Convenience function for decode_bmp, decode_gif, decode_jpeg, and decode_png.

decode_jpeg(...): Decode a JPEG-encoded image to a uint8 tensor.

decode_png(...): Decode a PNG-encoded image to a uint8 or uint16 tensor.

draw_bounding_boxes(...): Draw bounding boxes on a batch of images.

encode_jpeg(...): JPEG-encode an image.

encode_png(...): PNG-encode an image.

extract_glimpse(...): Extracts a glimpse from the input tensor.

extract_jpeg_shape(...): Extract the shape information of a JPEG-encoded image.

flip_left_right(...): Flip an image horizontally (left to right).

flip_up_down(...): Flip an image vertically (upside down).

grayscale_to_rgb(...): Converts one or more images from Grayscale to RGB.

hsv_to_rgb(...): Convert one or more images from HSV to RGB.

image_gradients(...): Returns image gradients (dy, dx) for each color channel.

is_jpeg(...): Convenience function to check if the 'contents' encodes a JPEG image.

non_max_suppression(...): Greedily selects a subset of bounding boxes in descending order of score.

pad_to_bounding_box(...): Pad image with zeros to the specified height and width.

per_image_standardization(...): Linearly scales image to have zero mean and unit variance.

psnr(...): Returns the Peak Signal-to-Noise Ratio between a and b.

random_brightness(...): Adjust the brightness of images by a random factor.

random_contrast(...): Adjust the contrast of an image by a random factor.

random_flip_left_right(...): Randomly flip an image horizontally (left to right).

random_flip_up_down(...): Randomly flips an image vertically (upside down).

random_hue(...): Adjust the hue of an RGB image by a random factor.

random_saturation(...): Adjust the saturation of an RGB image by a random factor.

resize_area(...): Resize images to size using area interpolation.

resize_bicubic(...): Resize images to size using bicubic interpolation.

resize_bilinear(...): Resize images to size using bilinear interpolation.

resize_image_with_crop_or_pad(...): Crops and/or pads an image to a target width and height.

resize_images(...): Resize images to size using the specified method.

resize_nearest_neighbor(...): Resize images to size using nearest neighbor interpolation.

rgb_to_grayscale(...): Converts one or more images from RGB to Grayscale.

rgb_to_hsv(...): Converts one or more images from RGB to HSV.

rgb_to_yiq(...): Converts one or more images from RGB to YIQ.

rgb_to_yuv(...): Converts one or more images from RGB to YUV.

rot90(...): Rotate image(s) counter-clockwise by 90 degrees.

sample_distorted_bounding_box(...): Generate a single randomly distorted bounding box for an image.

sobel_edges(...): Returns a tensor holding Sobel edge maps.

ssim(...): Computes SSIM index between img1 and img2.

ssim_multiscale(...): Computes the MS-SSIM between img1 and img2.

total_variation(...): Calculate and return the total variation for one or more images.

transpose_image(...): Transpose image(s) by swapping the height and width dimension.

yiq_to_rgb(...): Converts one or more images from YIQ to RGB.

yuv_to_rgb(...): Converts one or more images from YUV to RGB.
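To give a feel for how the functions in this list are called, here is a minimal sketch that applies a few of them to a small synthetic image. It assumes TensorFlow 2.x with eager execution (under 1.x the tensors would have to be evaluated in a session), and the array `pixels` is made up purely for illustration.

```python
import numpy as np
import tensorflow as tf

# A tiny made-up RGB image: height 4, width 6, 3 channels, dtype uint8.
pixels = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)
image = tf.constant(pixels)

# Mirror the image horizontally: the first column becomes the last.
flipped = tf.image.flip_left_right(image)

# Rotate counter-clockwise by 90 degrees: height and width swap.
rotated = tf.image.rot90(image)

# Collapse the three color channels into a single grayscale channel.
gray = tf.image.rgb_to_grayscale(image)

# Convert uint8 values in [0, 255] to float32 values in [0, 1].
as_float = tf.image.convert_image_dtype(image, tf.float32)

print(flipped.shape, rotated.shape, gray.shape, as_float.dtype)
```

Each call returns a new tensor and leaves `image` unchanged, so the operations can be chained freely in a preprocessing pipeline.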

Next, let's look at a few of the commonly used ones in more detail.

The decode_jpeg function

tf.image.decode_jpeg(
    contents,
    channels=0,
    ratio=1,
    fancy_upscaling=True,
    try_recover_truncated=False,
    acceptable_fraction=1,
    dct_method='',
    name=None
)

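Correspondingly, TensorFlow also provides decoders for other image formats, such as PNG, GIF, and BMP.

This function decodes a JPEG-encoded image into a uint8 tensor.

The channels parameter sets the desired number of color channels in the decoded image. It accepts 0 (use the number of channels in the JPEG itself), 1 (grayscale), or 3 (RGB).

The ratio parameter downscales the image while decoding. Accepted values are 1, 2, 4, and 8.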
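The effect of channels and ratio is easiest to see on the shape of the decoded tensor. The sketch below assumes TensorFlow 2.x with eager execution. To avoid depending on a file on disk, it first builds a synthetic image and encodes it with tf.image.encode_jpeg. The image contents and variable names are made up for illustration.

```python
import numpy as np
import tensorflow as tf

# Synthetic RGB image, 48 pixels high and 64 wide, with a red left half.
pixels = np.zeros((48, 64, 3), dtype=np.uint8)
pixels[:, :32, 0] = 255

# Encode to JPEG bytes; the result is a scalar string tensor,
# which is what decode_jpeg expects as `contents`.
jpeg_bytes = tf.image.encode_jpeg(tf.constant(pixels))

# channels=0 (the default): keep the channel count stored in the JPEG.
original = tf.image.decode_jpeg(jpeg_bytes)

# channels=1: decode to a single grayscale channel.
gray = tf.image.decode_jpeg(jpeg_bytes, channels=1)

# ratio=2: downscale by a factor of 2 in each dimension while decoding.
half = tf.image.decode_jpeg(jpeg_bytes, channels=3, ratio=2)

print(original.dtype, original.shape)  # uint8, (48, 64, 3)
print(gray.shape)                      # (48, 64, 1)
print(half.shape)                      # (24, 32, 3)
```

Downscaling through ratio happens inside the decoder, so it can be cheaper than decoding at full size and resizing afterwards when only a thumbnail is needed.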

To be continued...
