(Repost) FFmpeg Decoding Workflow

2024-03-08 13:53:46

http://www.douban.com/note/228831821/

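FFmpeg decoding workflow (the steps use the legacy API names; later sections also use newer names such as avformat_open_input()):

1. Register all container formats and codecs: av_register_all()

2. Open the file: av_open_input_file()

3. Extract stream information from the file: av_find_stream_info()

4. Iterate over all streams and find the one whose type is CODEC_TYPE_VIDEO

5. Find the matching decoder: avcodec_find_decoder()

6. Open the codec: avcodec_open()

7. Allocate memory for the decoded frame: avcodec_alloc_frame()

8. Repeatedly read frame data from the stream: av_read_frame()

9. Check the frame type and, for video frames, call: avcodec_decode_video()

10. When decoding is finished, release the decoder: avcodec_close()

11. Close the input file: av_close_input_file()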

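Main data structures:

Basic concepts:

Codecs, data frames, media streams, and containers are the four basic concepts of a digital media processing system.

First, some terminology:

Container/File: a multimedia file in a specific format.

Media stream (Stream): a continuous piece of data along the time axis, such as a segment of audio, video, or subtitle data. It may be compressed or uncompressed; compressed data must be associated with a specific codec.

Data frame/packet (Frame/Packet): a media stream usually consists of a large number of frames. For compressed data, a frame is the smallest unit a codec processes. Frames belonging to different media streams are usually interleaved within the container.

Codec: a codec converts between compressed data and raw data, one frame at a time.

FFmpeg uses the structures AVFormatContext, AVStream, AVCodecContext, AVCodec, and AVPacket to abstract these basic elements. The original post illustrated their relationships with a figure.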

AVCodecContext:

This structure describes a codec context and holds the many parameters a codec needs. Some of the more important fields are listed below:

typedef struct AVCodecContext {

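/**
 * Some codecs need / can use extradata such as Huffman tables.
 * MJPEG: Huffman tables
 * RV10: additional flags
 * MPEG4: global headers (they can be in the bitstream or here)
 * The allocated memory should be FF_INPUT_BUFFER_PADDING_SIZE bytes larger
 * than extradata_size to avoid problems if it is read with the bitstream reader.
 * The bytewise contents of extradata must not depend on the architecture or CPU endianness.
 * - encoding: set/allocated/freed by libavcodec.
 * - decoding: set/allocated/freed by the user.
 */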

uint8_t *extradata;

int extradata_size;

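/**
 * This is the fundamental unit of time (in seconds) in terms
 * of which frame timestamps are represented. For fixed-fps content,
 * the time base should be 1/framerate and timestamp increments should be
 * identically 1.
 * - encoding: MUST be set by the user.
 * - decoding: set by libavcodec.
 */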

AVRational time_base;

enum CodecID codec_id;

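/**
 * fourcc (LSB first, so "ABCD" -> ('D'<<24) + ('C'<<16) + ('B'<<8) + 'A').
 * This is used to work around some encoder bugs.
 * A demuxer should set this to what is stored in the field used to identify the codec.
 * If there are multiple such fields in a container, the demuxer should choose the one
 * that maximizes the information about the codec used.
 * If the codec tag field in a container is larger than 32 bits, the demuxer should
 * remap the longer ID to 32 bits with a table or other structure. Alternatively a new
 * extra_codec_tag + size could be added, but a clear advantage must be demonstrated
 * first.
 * - encoding: set by the user; if not, the default based on codec_id will be used.
 * - decoding: set by the user; converted to uppercase by libavcodec during init.
 */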

unsigned int codec_tag;

......

/**
 * Size of the frame reordering buffer in the decoder.
 * For MPEG-2 it is 1 for IPB or 0 for low-delay IP.
 * - encoding: set by libavcodec.
 * - decoding: set by libavcodec.
 */

int has_b_frames;

/**
 * Number of bytes per packet if constant and known, or 0.
 * Used by some WAV-based audio codecs.
 */

int block_align;

/**
 * Bits per sample/pixel from the demuxer (needed for huffyuv).
 * - encoding: set by libavcodec.
 * - decoding: set by the user.
 */

int bits_per_coded_sample;

.....

} AVCodecContext;

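If you use libavcodec on its own, the caller must initialize this information. If you use the whole FFmpeg library, it is initialized during the calls to avformat_open_input and avformat_find_stream_info, from the file header and from header data inside the media streams. The main fields are explained below:

extradata/extradata_size: this buffer holds extra information the decoder may need; it is filled in during av_read_frame. Generally, the demuxer for a given format fills in extradata when it reads the format header. If the demuxer does not do so, for example because the header contains no codec information at all, the corresponding parser keeps looking for it in the demultiplexed media stream. If no extra information is found, the buffer pointer is NULL.

time_base: the basic unit of time, in seconds, in which frame timestamps are expressed.

width/height: the width and height of the video.

sample_rate/channels: the audio sample rate and number of channels.

sample_fmt: the raw audio sample format.

codec_name/codec_type/codec_id/codec_tag: information about the codec.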
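The "LSB first" layout of codec_tag described in the struct comment can be sketched in plain C. The helper names make_fourcc and fourcc_to_string are invented for this illustration and are not FFmpeg functions; the sketch only shows the byte packing, under the assumption that the tag is four 8-bit characters.

```c
#include <stdint.h>

/* Hypothetical helper (not FFmpeg API): pack four characters into a
 * 32-bit tag with the first character in the least significant byte. */
static uint32_t make_fourcc(char a, char b, char c, char d)
{
    return  (uint32_t)(uint8_t)a         |
           ((uint32_t)(uint8_t)b << 8)   |
           ((uint32_t)(uint8_t)c << 16)  |
           ((uint32_t)(uint8_t)d << 24);
}

/* Hypothetical helper: unpack a tag into a NUL-terminated 5-byte buffer. */
static void fourcc_to_string(uint32_t tag, char out[5])
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((tag >> (8 * i)) & 0xff);
    out[4] = '\0';
}
```

Unpacking a tag this way is handy when printing codec_tag in a log, since the raw integer is hard to read.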

AVStream:

This structure describes a media stream and is defined as follows:

typedef struct AVStream {

int index;

AVCodecContext *codec;

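/**
 * Real base frame rate of the stream.
 * This is the lowest frame rate with which all timestamps can be
 * represented accurately (it is the least common multiple of all
 * frame rates in the stream). Note that this value is just a guess!
 * For example, if the time base is 1/90000 and all frames have either
 * approximately 3600 or 1800 timer ticks, then r_frame_rate will be 50/1.
 */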

AVRational r_frame_rate;

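/**
 * This is the fundamental unit of time (in seconds) in terms
 * of which frame timestamps are represented. For fixed-fps content,
 * the time base should be 1/framerate and timestamp increments should be 1.
 */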

AVRational time_base;

......

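/**
 * Decoding: pts of the first frame of the stream, in stream time base.
 * Only set this if you are absolutely 100% sure that the value you set
 * it to really is the pts of the first frame.
 * This may be undefined (AV_NOPTS_VALUE).
 * @note The ASF header does NOT contain a correct start_time, so the ASF
 * demuxer must NOT set this.
 */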

int64_t start_time;

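/**
 * Decoding: duration of the stream, in stream time base.
 * If a source file does not specify a duration, but does specify
 * a bitrate, this value will be estimated from bitrate and file size.
 */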

int64_t duration;

#if LIBAVFORMAT_VERSION_INT < (53<<16)

char language[4];

#endif

......

/* stream info */

int64_t timestamp;

#if LIBAVFORMAT_VERSION_INT < (53<<16)

char title[512];

char author[512];

char copyright[512];

char comment[512];

char album[512];

int year;

int track;

char genre[32];

#endif

int ctx_flags;

int64_t data_offset;

int index_built;

int mux_rate;

unsigned int packet_size;

int preload;

int max_delay;

#define AVFMT_NOOUTPUTLOOP -1

#define AVFMT_INFINITEOUTPUTLOOP 0

int loop_output;

int flags;

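#define AVFMT_FLAG_GENPTS 0x0001 ///< Generate missing pts even if it requires parsing future frames.

#define AVFMT_FLAG_IGNIDX 0x0002 ///< Ignore index.

#define AVFMT_FLAG_NONBLOCK 0x0004 ///< Do not block when reading packets from input.

#define AVFMT_FLAG_IGNDTS 0x0008 ///< Ignore DTS on frames that contain both DTS and PTS.

#define AVFMT_FLAG_NOFILLIN 0x0010 ///< Do not infer any values from other values, just return what is stored in the container.

#define AVFMT_FLAG_NOPARSE 0x0020 ///< Do not use AVParsers; you must also set AVFMT_FLAG_NOFILLIN, because the fill-in code works on frames and no parsing means no frames. Seeking to frames also cannot work if parsing to find frame boundaries has been disabled.

#define AVFMT_FLAG_RTP_HINT 0x0040 ///< Add RTP hinting to the output file.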

int loop_input;

......

CODEC_ID_MPEG1VIDEO,

CODEC_ID_MPEG2VIDEO, ///< preferred ID for MPEG-1/2 video decoding

CODEC_ID_MPEG2VIDEO_XVMC,

CODEC_ID_H261,

CODEC_ID_H263,

...

};

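Usually, if a media format has complete and correct header information, calling avformat_open_input is enough to obtain these two parameters. If for some reason avformat_open_input cannot obtain them, the task falls to avformat_find_stream_info.

Next, the time base of the codec for each media stream must also be obtained.

In addition, for audio codecs the following are needed:

sample rate,

number of channels,

bit depth,

frame length (required by some codecs).

For video codecs, they are:

picture size,

color space and format.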

av_read_frame

int av_read_frame(AVFormatContext *s, AVPacket *pkt);

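This function reads media data from a multimedia file or stream; the data is stored in the AVPacket structure pkt. For audio, pkt holds one or more audio frames if the bit rate is constant, and exactly one audio frame if the bit rate is variable. For video, pkt holds one video frame. Note that before calling this function again, you must release the resources held by pkt with av_free_packet.

pkt->stream_index tells you which stream the data belongs to, and therefore its media type, so the data can be handed to the matching decoder for further processing.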

av_seek_frame

int av_seek_frame(AVFormatContext *s, int stream_index, int64_t timestamp, int flags);

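This function provides random access to a media file by moving the file's read/write position. Three modes are supported:

Time-based random access: the read/write position is moved to a given point in time, so that a subsequent av_read_frame call returns media data whose timestamp equals that point. This is typically used to implement fast-forward and rewind in a media player.

File-offset-based random access: equivalent to seek on an ordinary file; timestamp is interpreted as a file offset.

Frame-number-based random access: timestamp is the frame number of the media data to access.

Parameters:

s: an AVFormatContext pointer, the structure returned by avformat_open_input.

stream_index: selects the media stream. For time-based random access, the third parameter timestamp is in units of this stream's time base. If it is negative, no particular stream is specified; FFmpeg picks a default stream using its own algorithm, and timestamp is then in units of AV_TIME_BASE (microseconds).

timestamp: the timestamp; its unit depends on the other parameters.

flags: the seek mode. AVSEEK_FLAG_BYTE means seek by byte offset, AVSEEK_FLAG_FRAME means seek by frame number, and anything else means seek by time.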
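Because the unit of timestamp depends on stream_index, a caller usually has to convert between microseconds and stream ticks before seeking. The sketch below shows that arithmetic in plain C so it compiles without FFmpeg. The struct rational type and both function names are stand-ins invented for this example; in real code FFmpeg's AVRational and av_rescale_q (used later in the test program) do this job.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for FFmpeg's AVRational, so the sketch compiles on its own. */
struct rational { int num; int den; };

/* Microseconds per second, the same unit as AV_TIME_BASE (1000000). */
#define US_PER_SECOND 1000000

/* Convert a position in microseconds into ticks of a stream time base
 * (one tick lasts tb.num / tb.den seconds), rounding to nearest.
 * Assumes non-negative input that is small enough not to overflow. */
static int64_t us_to_stream_ticks(int64_t us, struct rational tb)
{
    assert(tb.num > 0 && tb.den > 0 && us >= 0);
    int64_t num = us * tb.den;
    int64_t den = (int64_t)US_PER_SECOND * tb.num;
    return (num + den / 2) / den;
}

/* Inverse conversion: stream ticks back to microseconds. */
static int64_t stream_ticks_to_us(int64_t ticks, struct rational tb)
{
    assert(tb.num > 0 && tb.den > 0 && ticks >= 0);
    int64_t num = ticks * tb.num * US_PER_SECOND;
    return (num + tb.den / 2) / tb.den;
}
```

With a 1/90000 time base, for instance, a seek target of two seconds corresponds to 180000 ticks when a stream index is given, and to 2000000 when stream_index is negative.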

av_close_input_file:

void av_close_input_file(AVFormatContext *s);

Closes a media file: releases resources and closes the underlying I/O.

avcodec_find_decoder:

AVCodec *avcodec_find_decoder(enum CodecID id);

AVCodec *avcodec_find_decoder_by_name(const char *name);

Searches the registered codecs by codec id or by decoder name and returns a pointer to the matching AVCodec structure.

avcodec_open:

int avcodec_open(AVCodecContext *avctx, AVCodec *codec);

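This function initializes the AVCodecContext structure for the given AVCodec. Before calling it, you must either allocate an AVCodecContext with avcodec_alloc_context, or take the AVCodecContext of the relevant media stream from the file opened with avformat_open_input. You also need the AVCodec structure returned by avcodec_find_decoder.

The function also initializes the corresponding decoder.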

avcodec_decode_video2

int avcodec_decode_video2(AVCodecContext *avctx, AVFrame *picture, int *got_picture_ptr, AVPacket *avpkt);

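Decodes one video frame. got_picture_ptr indicates whether any decoded data was output.

The input data is in an AVPacket structure and the output data is in an AVFrame structure. AVFrame is a data structure defined in avcodec.h: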

typedef struct AVFrame {

FF_COMMON_FRAME

} AVFrame;

FF_COMMON_FRAME defines many fields, most of which are used internally by FFmpeg. The ones that matter most to users are:

#define FF_COMMON_FRAME \

......

uint8_t *data[4];\

int linesize[4];\

int key_frame;\

int pict_type;\

int64_t pts;\

int reference;\

......

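FFmpeg stores raw image data internally in planar form: the pixels are split into several planes (R/G/B or Y/U/V). The pointers in the data array point to the start of each of the up to four planes, and the linesize array holds the row width, in bytes, of the buffer that stores each plane: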

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++

+++data[0]->#################################++++++++++++

++++++++++++###########picture data##########++++++++++++

........................

++++++++++++#################################++++++++++++

|<--------------------linesize[0]---------------------->|

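In addition, key_frame indicates whether the picture is a key frame; pict_type is the coding type of the picture: I(1)/P(2)/B(3)...; pts is a timestamp in units of time_base, which for some decoders such as H.261, H.263, and MPEG4 can be obtained from the header information; reference indicates whether the picture is used as a reference.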
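To make the data/linesize relationship concrete, here is a small sketch of how a single pixel could be looked up in a planar YUV 4:2:0 picture. The struct planar_image type and both function names are invented for this example and only mirror the data and linesize fields of AVFrame; the sketch assumes 8-bit samples and chroma planes subsampled by two in both directions.

```c
#include <stdint.h>

/* Minimal stand-in for the data/linesize fields of AVFrame. */
struct planar_image {
    const uint8_t *data[4];
    int linesize[4];
};

/* Sample at column x, row y of one plane. Rows are linesize bytes apart,
 * which may be more than the visible width because of padding. */
static uint8_t plane_sample(const struct planar_image *img,
                            int plane, int x, int y)
{
    return img->data[plane][(long)y * img->linesize[plane] + x];
}

/* In YUV 4:2:0 each chroma sample covers a 2x2 block of luma samples,
 * so the chroma coordinates are the luma coordinates divided by two. */
static void yuv420p_pixel(const struct planar_image *img,
                          int x, int y, uint8_t yuv[3])
{
    yuv[0] = plane_sample(img, 0, x, y);
    yuv[1] = plane_sample(img, 1, x / 2, y / 2);
    yuv[2] = plane_sample(img, 2, x / 2, y / 2);
}
```

Indexing with linesize rather than the picture width is the important part: using the width would read the wrong bytes whenever rows are padded.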

avcodec_decode_audio4

int avcodec_decode_audio4(AVCodecContext *avctx, AVFrame *frame, int *got_frame_ptr, AVPacket *avpkt);

Decodes one audio frame. The input data is in an AVPacket structure, the output data is in frame, and got_frame_ptr indicates whether any data was output.

avcodec_close

int avcodec_close(AVCodecContext *avctx);

Closes the decoder and releases the resources allocated in avcodec_open.

Test program

#include <stdio.h>

#include <stdlib.h>

#include <string.h>

#include <sys/time.h>

#include "libavutil/avstring.h"

#include "libavformat/avformat.h"

#include "libavdevice/avdevice.h"

#include "libavcodec/opt.h"

#include "libswscale/swscale.h"

#define DECODED_AUDIO_BUFFER_SIZE 192000

struct options

{

int streamId;

int frames;

int nodec;

int bplay;

int thread_count;

int64_t lstart;

char finput[256];

char foutput1[256];

char foutput2[256];

};

int parse_options(struct options *opts, int argc, char** argv)

{

int optidx;

char *optstr;

if (argc < 2) return -1;

opts->streamId = -1;

opts->lstart = -1;

opts->frames = -1;

opts->foutput1[0] = 0;

opts->foutput2[0] = 0;

opts->nodec = 0;

opts->bplay = 0;

opts->thread_count = 0;

strcpy(opts->finput, argv[1]);

optidx = 2;

while (optidx < argc)

{

optstr = argv[optidx++];

if (*optstr++ != '-') return -1;

switch (*optstr++)

{

case 's': //< stream id

opts->streamId = atoi(optstr);

break;

case 'f': //< frames

opts->frames = atoi(optstr);

break;

case 'k': //< skipped

opts->lstart = atoll(optstr);

break;

case 'o': //< output

strcpy(opts->foutput1, optstr);

strcat(opts->foutput1, ".mpg");

strcpy(opts->foutput2, optstr);

strcat(opts->foutput2, ".raw");

break;

case 'n': //decoding and output options

if (strcmp("dec", optstr) == 0)

opts->nodec = 1;

break;

case 'p':

opts->bplay = 1;

break;

case 't':

opts->thread_count = atoi(optstr);

break;

default:

return -1;

}

}

return 0;

}

void show_help(char* program)

{

printf("Simple FFmpeg test program\n");

printf("Usage: %s inputfile [-sstreamid [-fframes] [-kskipped] [-ooutput_filename(without extension)] [-p] [-tthread_count]]\n",

program);

return;

}

static void log_callback(void* ptr, int level, const char* fmt, va_list vl)

{

vfprintf(stdout, fmt, vl);

}

/* Audio renderer code (OSS) */

#include <fcntl.h>

#include <unistd.h>

#include <sys/ioctl.h>

#include <sys/soundcard.h>

#define OSS_DEVICE "/dev/dsp0"

struct audio_dsp

{

int audio_fd;

int channels;

int format;

int speed;

};

int map_formats(enum SampleFormat format)

{

switch(format)

{

case SAMPLE_FMT_U8:

return AFMT_U8;

case SAMPLE_FMT_S16:

return AFMT_S16_LE;

default:

return AFMT_U8;

}

}

int set_audio(struct audio_dsp* dsp)

{

if (dsp->audio_fd == -1)

{

printf("Invalid audio DSP descriptor!\n");

return -1;

}

if (-1 == ioctl(dsp->audio_fd, SNDCTL_DSP_SETFMT, &dsp->format))

{

printf("Cannot set DSP sample format!\n");

return -1;

}

if (-1 == ioctl(dsp->audio_fd, SNDCTL_DSP_CHANNELS, &dsp->channels))

{

printf("Cannot set DSP channel count!\n");

return -1;

}

if (-1 == ioctl(dsp->audio_fd, SNDCTL_DSP_SPEED, &dsp->speed))

{

printf("Cannot set DSP sample rate!\n");

return -1;

}

return 0;

}

int play_pcm(struct audio_dsp* dsp, unsigned char *buf, int size)

{

if (dsp->audio_fd == -1)

{

printf("Invalid audio DSP descriptor!\n");

return -1;

}

if (-1 == write(dsp->audio_fd, buf, size))

{

printf("Cannot write to audio DSP!\n");

return -1;

}

return 0;

}

#include <sys/mman.h>

#include <linux/fb.h>

#define FB_DEVICE "/dev/fb0"

enum pic_format

{

eYUV_420_Planer,

};

struct video_fb

{

int video_fd;

struct fb_var_screeninfo vinfo;

struct fb_fix_screeninfo finfo;

unsigned char *fbp;

AVFrame *frameRGB;

struct

{

int x;

int y;

} video_pos;

};

int open_video(struct video_fb *fb, int x, int y)

{

int screensize;

fb->video_fd = open(FB_DEVICE, O_RDWR); /* a shared PROT_READ|PROT_WRITE mapping needs a read/write descriptor */

if (fb->video_fd == -1) return -1;

if (ioctl(fb->video_fd, FBIOGET_FSCREENINFO, &fb->finfo)) return -2;

if (ioctl(fb->video_fd, FBIOGET_VSCREENINFO, &fb->vinfo)) return -2;

printf("Video device: resolution %dx%d, %dbpp\n", fb->vinfo.xres, fb->vinfo.yres, fb->vinfo.bits_per_pixel);

screensize = fb->vinfo.xres * fb->vinfo.yres * fb->vinfo.bits_per_pixel / 8;

fb->fbp = (unsigned char *) mmap(0, screensize, PROT_READ|PROT_WRITE, MAP_SHARED, fb->video_fd, 0);

if (fb->fbp == MAP_FAILED) return -3;

if (x >= fb->vinfo.xres || y >= fb->vinfo.yres)

{

return -4;

}

else

{

fb->video_pos.x = x;

fb->video_pos.y = y;

}

fb->frameRGB = avcodec_alloc_frame();

if (!fb->frameRGB) return -5;

return 0;

}

#if 0

int show_picture(struct video_fb *fb, AVFrame *frame, int width, int height, enum pic_format format)

{

struct SwsContext *sws;

int i;

unsigned char *dest;

unsigned char *src;

if (fb->video_fd == -1) return -1;

if ((fb->video_pos.x >= fb->vinfo.xres) || (fb->video_pos.y >= fb->vinfo.yres)) return -2;

if (fb->video_pos.x + width > fb->vinfo.xres)

{

width = fb->vinfo.xres - fb->video_pos.x;

}

if (fb->video_pos.y + height > fb->vinfo.yres)

{

height = fb->vinfo.yres - fb->video_pos.y;

}

if (format == PIX_FMT_YUV420P)

{

sws = sws_getContext(width, height, format, width, height, PIX_FMT_RGB32, SWS_FAST_BILINEAR, NULL, NULL, NULL);

if (sws == 0)

{

return -3;

}

if (sws_scale(sws, frame->data, frame->linesize, 0, height, fb->frameRGB->data, fb->frameRGB->linesize))

{

return -3;

}

dest = fb->fbp + (fb->video_pos.x+fb->vinfo.xoffset) * (fb->vinfo.bits_per_pixel/8) +(fb->video_pos.y+fb->vinfo.yoffset) * fb->finfo.line_length;

for (i = 0; i < height; i++)

{

memcpy(dest, src, width*4);

src += fb->frameRGB->linesize[0];

dest += fb->finfo.line_length;

}

}

return 0;

}

#endif

void close_video(struct video_fb *fb)

{

if (fb->video_fd != -1)

{

munmap(fb->fbp, fb->vinfo.xres * fb->vinfo.yres * fb->vinfo.bits_per_pixel / 8);

close(fb->video_fd);

fb->video_fd = -1;

}

}

int main(int argc, char **argv)

{

AVFormatContext* pCtx = 0;

AVCodecContext *pCodecCtx = 0;

AVCodec *pCodec = 0;

AVPacket packet;

AVFrame *pFrame = 0;

FILE *fpo1 = NULL;

FILE *fpo2 = NULL;

int nframe;

int err;

int got_picture;

int picwidth, picheight, linesize;

unsigned char *pBuf;

int i;

int64_t timestamp;

struct options opt;

int usefo = 0;

struct audio_dsp dsp = { -1, 0, 0, 0 }; /* audio_fd starts at -1 so cleanup can tell whether the device was opened */

int dusecs;

float usecs1 = 0;

float usecs2 = 0;

struct timeval elapsed1, elapsed2;

int decoded = 0;

av_register_all();

av_log_set_callback(log_callback);

av_log_set_level(50);

if (parse_options(&opt, argc, argv) < 0 || (strlen(opt.finput) == 0))

{

show_help(argv[0]);

return 0;

}

err = avformat_open_input(&pCtx, opt.finput, 0, 0);

if (err < 0)

{

printf("\n->(avformat_open_input)\tERROR:\t%d\n", err);

goto fail;

}

err = avformat_find_stream_info(pCtx, 0);

if (err < 0)

{

printf("\n->(avformat_find_stream_info)\tERROR:\t%d\n", err);

goto fail;

}

if (opt.streamId < 0)

{

av_dump_format(pCtx, 0, pCtx->filename, 0);

goto fail;

}

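else if ((unsigned int)opt.streamId >= pCtx->nb_streams)

{

printf("\n->StreamId\tERROR\n");

goto fail;

}

else

{

printf("\n extradata of stream %d (%dB):", opt.streamId, pCtx->streams[opt.streamId]->codec->extradata_size);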

for (i = 0; i < pCtx->streams[opt.streamId]->codec->extradata_size; i++)

{

if (i == 0) printf("\n");

printf("%2x ", pCtx->streams[opt.streamId]->codec->extradata[i]);

}

}

/* Try to open the output files */

if (strlen(opt.foutput1) && strlen(opt.foutput2))

{

fpo1 = fopen(opt.foutput1, "wb");

fpo2 = fopen(opt.foutput2, "wb");

if (!fpo1 || !fpo2)

{

printf("\n->error opening output files\n");

goto fail;

}

usefo = 1;

}

else

{

usefo = 0;

}

if (opt.streamId >= pCtx->nb_streams)

{

printf("\n->StreamId\tERROR\n");

goto fail;

}

if (opt.lstart > 0)

{

err = av_seek_frame(pCtx, opt.streamId, opt.lstart, AVSEEK_FLAG_ANY);

if (err < 0)

{

printf("\n->(av_seek_frame)\tERROR:\t%d\n", err);

goto fail;

}

}

/* Decoder configuration */

if (!opt.nodec)

{

pCodecCtx = pCtx->streams[opt.streamId]->codec;

if (opt.thread_count <= 16 && opt.thread_count > 0 )

{

pCodecCtx->thread_count = opt.thread_count;

pCodecCtx->thread_type = FF_THREAD_FRAME;

}

pCodec = avcodec_find_decoder(pCodecCtx->codec_id);

if (!pCodec)

{

printf("\n->cannot find codec!\n");

goto fail;

}

err = avcodec_open2(pCodecCtx, pCodec, 0);

if (err < 0)

{

printf("\n->(avcodec_open)\tERROR:\t%d\n", err);

goto fail;

}

pFrame = avcodec_alloc_frame();

/* Prepare the devices */

if (opt.bplay)

{

/* Audio device */

dsp.audio_fd = open(OSS_DEVICE, O_WRONLY);

if (dsp.audio_fd == -1)

{

printf("\n-> cannot open audio device\n");

goto fail;

}

dsp.channels = pCodecCtx->channels;

dsp.speed = pCodecCtx->sample_rate;

dsp.format = map_formats(pCodecCtx->sample_fmt);

if (set_audio(&dsp) < 0)

{

printf("\n-> cannot configure audio device\n");

goto fail;

}

/* Video device */

}

}

nframe = 0;

while(nframe < opt.frames || opt.frames == -1)

{

gettimeofday(&elapsed1, NULL);

err = av_read_frame(pCtx, &packet);

if (err < 0)

{

printf("\n->(av_read_frame)\tERROR:\t%d\n", err);

break;

}

gettimeofday(&elapsed2, NULL);

dusecs = (elapsed2.tv_sec - elapsed1.tv_sec)*1000000 + (elapsed2.tv_usec - elapsed1.tv_usec);

usecs2 += dusecs;

timestamp = av_rescale_q(packet.dts, pCtx->streams[packet.stream_index]->time_base, (AVRational){1, AV_TIME_BASE});

printf("\nFrame No %5d stream#%d\tsize %6dB, timestamp:%6lld, dts:%6lld, pts:%6lld, ", nframe++, packet.stream_index, packet.size,

timestamp, packet.dts, packet.pts);

if (packet.stream_index == opt.streamId)

{

#if 0

for (i = 0; i < 16; i++)

{

if (i == 0) printf("\n pktdata: ");

printf("%2x ", packet.data[i]);

}

printf("\n");

#endif

if (usefo)

{

fwrite(packet.data, packet.size, 1, fpo1);

fflush(fpo1);

}

if (pCtx->streams[opt.streamId]->codec->codec_type == AVMEDIA_TYPE_VIDEO && !opt.nodec)

{

picheight = pCtx->streams[opt.streamId]->codec->height;

picwidth = pCtx->streams[opt.streamId]->codec->width;

gettimeofday(&elapsed1, NULL);

avcodec_decode_video2(pCodecCtx, pFrame, &got_picture, &packet);

decoded++;

gettimeofday(&elapsed2, NULL);

dusecs = (elapsed2.tv_sec - elapsed1.tv_sec)*1000000 + (elapsed2.tv_usec - elapsed1.tv_usec);

usecs1 += dusecs;

if (got_picture)

{

printf("[Video: type %d, ref %d, pts %lld, pkt_pts %lld, pkt_dts %lld]",

pFrame->pict_type, pFrame->reference, pFrame->pts, pFrame->pkt_pts, pFrame->pkt_dts);

if (pCtx->streams[opt.streamId]->codec->pix_fmt == PIX_FMT_YUV420P)

{

if (usefo)

{

linesize = pFrame->linesize[0];

pBuf = pFrame->data[0];

for (i = 0; i < picheight; i++)

{

fwrite(pBuf, picwidth, 1, fpo2);

pBuf += linesize;

}

linesize = pFrame->linesize[1];

pBuf = pFrame->data[1];

for (i = 0; i < picheight/2; i++)

{

fwrite(pBuf, picwidth/2, 1, fpo2);

pBuf += linesize;

}

linesize = pFrame->linesize[2];

pBuf = pFrame->data[2];

for (i = 0; i < picheight/2; i++)

{

fwrite(pBuf, picwidth/2, 1, fpo2);

pBuf += linesize;

}

fflush(fpo2);

}

if (opt.bplay)

{

}

}

}

}

else if (pCtx->streams[opt.streamId]->codec->codec_type == AVMEDIA_TYPE_AUDIO && !opt.nodec)

{

int got;

gettimeofday(&elapsed1, NULL);

avcodec_decode_audio4(pCodecCtx, pFrame, &got, &packet);

decoded++;

gettimeofday(&elapsed2, NULL);

dusecs = (elapsed2.tv_sec - elapsed1.tv_sec)*1000000 + (elapsed2.tv_usec - elapsed1.tv_usec);

usecs1 += dusecs;

if (got)

{

printf("[Audio: %5dB raw data, decoding time: %d]", pFrame->linesize[0], dusecs);

if (usefo)

{

fwrite(pFrame->data[0], pFrame->linesize[0], 1, fpo2);

fflush(fpo2);

}

if (opt.bplay)

{

play_pcm(&dsp, pFrame->data[0], pFrame->linesize[0]);

}

}

}

}

/* Release every packet, whatever its stream, before the next av_read_frame() call */

av_free_packet(&packet);

}

if (!opt.nodec && pCodecCtx)

{

avcodec_close(pCodecCtx);

}

printf("\n%d frames parsed, average %.2f us per frame\n", nframe, usecs2/nframe);

printf("%d frames decoded, average %.2f us per frame\n", decoded, usecs1/decoded);

fail:

if (pCtx)

{

avformat_close_input(&pCtx);

}

if (fpo1)

{

fclose(fpo1);

}

if (fpo2)

{

fclose(fpo2);

}

if (pFrame)

{

av_free(pFrame);

}

if (opt.bplay && (dsp.audio_fd != -1))

{

close(dsp.audio_fd);

}

return 0;

}

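This article uses an H264 video stream as an example to explain the steps for decoding stream data.

To stay focused, it discusses only decoding the video stream and nothing else (such as setting up a development environment). If you need that information, please contact me.

Preparing variables

Define an AVCodecContext. If you use classes, you can make it a class member. Here I define it as a global variable.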

static AVCodecContext * g_pCodecCtx = NULL;

Define an AVFrame. An AVFrame describes one multimedia frame; the decoded data is placed in it.

static AVFrame * g_pavfFrame = NULL;

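Initializing the decoder

Now initialize your decoder. I wrapped the whole initialization in one function, and unless you have a better idea, I suggest you do the same. The function looks like this: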

BOOL H264_Init()

{

…

}

Initialize libavcodec. FFmpeg requires this function to be called before any other:

avcodec_init();

Register all codecs. Registering only the H264 codec might be enough; I have not tried it:

av_register_all();

Get the H264 decoder:

AVCodec * pCodec = avcodec_find_decoder(CODEC_ID_H264);

Create an AVCodecContext and initialize it with default values:

g_pCodecCtx = avcodec_alloc_context();

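Change some members of g_pCodecCtx. You should obtain these values from the encoding side:

g_pCodecCtx->time_base.num = 1; // these two lines: 25 frames per second

g_pCodecCtx->time_base.den = 25;

g_pCodecCtx->bit_rate = 0; // initialize to 0

g_pCodecCtx->frame_number = 1; // one video frame per packet

g_pCodecCtx->codec_type = CODEC_TYPE_VIDEO;

g_pCodecCtx->width = 704; // these two lines: video width and height

g_pCodecCtx->height = 576;

Open the codec. If that succeeds, allocate the AVFrame: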

if(avcodec_open(g_pCodecCtx, pCodec) >= 0)

{

g_pavfFrame = avcodec_alloc_frame();// Allocate video frame

}

The complete decoder initialization code:

Decoding

If you only need YUV 420I data, a single call is enough:

avcodec_decode_video(g_pCodecCtx, g_pavfFrame, (int *)&nGot, (unsigned __int8 *)pSrcData, dwDataLen);

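Here nGot reports whether decoding succeeded: after avcodec_decode_video returns, a nonzero nGot means a frame was decoded; otherwise no video frame was produced.

pSrcData is a chunk of H264-encoded stream data to decode, and dwDataLen is its length in bytes.

The decoded video frame (YUV data) is stored in g_pavfFrame; g_pavfFrame->data[0], g_pavfFrame->data[1], and g_pavfFrame->data[2] hold the Y, U, and V data. The sample code below packs the YUV data into a single block of memory, laid out as:

The function has a return value: on success it returns the number of bitstream bytes consumed by this call, 0 if no frame could be decoded, and a negative value on error. For simplicity, I assume here that pSrcData contains exactly one video frame.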
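Since the original listing and its layout description were lost, here is one way the packing step could look. It is a sketch under the assumption that the target layout is the Y plane followed by U and then V with row padding removed, the same order the test program above writes to its .raw file. The function names i420_size and pack_i420 are invented for this example, and even width and height are assumed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes needed for a tightly packed YUV 4:2:0 picture
 * (assumes even width and height). */
static size_t i420_size(int width, int height)
{
    return (size_t)width * height * 3 / 2;
}

/* Copy the three planes into dst as Y, then U, then V, dropping any
 * row padding. data and linesize have the same meaning as in AVFrame. */
static void pack_i420(uint8_t *dst, uint8_t *const data[3],
                      const int linesize[3], int width, int height)
{
    for (int p = 0; p < 3; p++) {
        /* Chroma planes are half the luma size in both directions. */
        int w = p ? width / 2 : width;
        int h = p ? height / 2 : height;
        for (int y = 0; y < h; y++) {
            memcpy(dst, data[p] + (size_t)y * linesize[p], (size_t)w);
            dst += w;
        }
    }
}
```

A caller would allocate i420_size(width, height) bytes for the output buffer and pass g_pavfFrame->data and g_pavfFrame->linesize after a successful decode.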

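Likewise, for modularity and easier maintenance, I wrapped decoding in a function:

BOOL H264_Decode(const PBYTE pSrcData, const DWORD dwDataLen, PBYTE pDeData, int * pnWidth, int * pnHeight)

pSrcData – the data to decode

dwDataLen – number of bytes of data to decode

pDeData – receives the decoded YUV data

pnWidth, pnHeight – receive the width and height of the video

The complete code:

Releasing the decoder

That already completes the task of this article, but to be responsible we should finish what we started.

There is not much to say about releasing; it is self-explanatory. Again, I wrapped it in a function:

(Apologies: the article was written in Word with each code block in a text box, and they turned into images when pasted here.)

Below is code that shows how to decode and how to convert YV12 data to 32-bit ARGB data: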

#include "avcodec.h"
#include "h264decoder.h"

typedef unsigned char byte_t;
typedef unsigned int uint_t;

struct AVCodec *fCodec = NULL;               // Codec
struct AVCodecContext *fCodecContext = NULL; // Codec context
struct AVFrame *fVideoFrame = NULL;          // Decoded frame

int fDisplayWidth = 0;
int fDisplayHeight = 0;
int *fColorTable = NULL;

// Buffer-based wrapper around avcodec_decode_video2()
int avcodec_decode_video(AVCodecContext *avctx, AVFrame *picture,
                         int *got_picture_ptr,
                         const uint8_t *buf, int buf_size)
{
    AVPacket avpkt;
    av_init_packet(&avpkt);
    avpkt.data = (uint8_t *)buf; // AVPacket.data is not const
    avpkt.size = buf_size;
    // HACK for CorePNG to decode as normal PNG by default
    avpkt.flags = AV_PKT_FLAG_KEY;
    return avcodec_decode_video2(avctx, picture, got_picture_ptr, &avpkt);
}

// Clamp a value to the range 0..255
#define RGB_V(v) (((v) < 0) ? 0 : (((v) > 255) ? 255 : (v)))

void DeleteYUVTable()
{
    av_free(fColorTable);
    fColorTable = NULL;
}

// Precompute the U/V contributions to R, G and B for all 256 chroma values
void CreateYUVTable()
{
    int i;
    int u, v;
    int *u_b_tab = NULL;
    int *u_g_tab = NULL;
    int *v_g_tab = NULL;
    int *v_r_tab = NULL;

    fColorTable = (int *)av_malloc(4 * 256 * sizeof(int));
    u_b_tab = &fColorTable[0 * 256];
    u_g_tab = &fColorTable[1 * 256];
    v_g_tab = &fColorTable[2 * 256];
    v_r_tab = &fColorTable[3 * 256];

    for (i = 0; i < 256; i++) {
        u = v = (i - 128);
        u_b_tab[i] = (int) ( 1.772 * u);
        u_g_tab[i] = (int) ( 0.34414 * u);
        v_g_tab[i] = (int) ( 0.71414 * v);
        v_r_tab[i] = (int) ( 1.402 * v);
    }
}

void DisplayYUV_32(uint_t *displayBuffer, int videoWidth, int videoHeight, int outPitch)
{
    int *u_b_tab = &fColorTable[0 * 256];
    int *u_g_tab = &fColorTable[1 * 256];
    int *v_g_tab = &fColorTable[2 * 256];
    int *v_r_tab = &fColorTable[3 * 256];

    // YV12: [Y:MxN] [U:M/2xN/2] [V:M/2xN/2]
    byte_t* y = fVideoFrame->data[0];
    byte_t* u = fVideoFrame->data[1];
    byte_t* v = fVideoFrame->data[2];

    int src_ystride = fVideoFrame->linesize[0];
    int src_uvstride = fVideoFrame->linesize[1];

    int i, line;
    int r, g, b;

    int ub, ug, vg, vr;

    int width = videoWidth;
    int height = videoHeight;

    // Crop to the display: center horizontally, cut off the bottom
    if (width > fDisplayWidth) {
        width = fDisplayWidth;
        y += (videoWidth - fDisplayWidth) / 2;
        u += (videoWidth - fDisplayWidth) / 4;
        v += (videoWidth - fDisplayWidth) / 4;
    }

    if (height > fDisplayHeight) {
        height = fDisplayHeight;
    }

    for (line = 0; line < height; line++) {
        byte_t* yoff = y + line * src_ystride;
        byte_t* uoff = u + (line / 2) * src_uvstride;
        byte_t* voff = v + (line / 2) * src_uvstride;
        //uint_t* buffer = displayBuffer + (height - line - 1) * outPitch;
        uint_t* buffer = displayBuffer + line * outPitch;

        for (i = 0; i < width; i++) {
            ub = u_b_tab[*uoff];
            ug = u_g_tab[*uoff];
            vg = v_g_tab[*voff];
            vr = v_r_tab[*voff];

            b = RGB_V(*yoff + ub);
            g = RGB_V(*yoff - ug - vg);
            r = RGB_V(*yoff + vr);

            // Opaque alpha in the top byte, then b, g, r as in the original
            *buffer = 0xff000000 | (b << 16) | (g << 8) | r;

            buffer++;
            yoff++;

            // One chroma sample is shared by two horizontal pixels
            if ((i % 2) == 1) {
                uoff++;
                voff++;
            }
        }
    }
}

// Returns 1 on success, 0 if already initialized, -1 on failure
int avc_decode_init(int width, int height)
{
    if (fCodecContext != NULL) {
        return 0;
    }
    avcodec_init();
    avcodec_register_all();
    fCodec = avcodec_find_decoder(CODEC_ID_H264);
    if (fCodec == NULL) {
        return -1; // H264 decoder not available
    }

    fDisplayWidth = width;
    fDisplayHeight = height;

    CreateYUVTable();

    fCodecContext = avcodec_alloc_context();
    if (avcodec_open(fCodecContext, fCodec) < 0) {
        av_free(fCodecContext);
        fCodecContext = NULL;
        DeleteYUVTable();
        return -1;
    }
    fVideoFrame = avcodec_alloc_frame();

    return 1;
}

int avc_decode_release()
{
    if (fCodecContext) {
        // avcodec_close() releases priv_data; the context itself came from av_malloc
        avcodec_close(fCodecContext);
        av_free(fCodecContext);
        fCodecContext = NULL;
    }

    if (fVideoFrame) {
        av_free(fVideoFrame);
        fVideoFrame = NULL;
    }

    DeleteYUVTable();
    return 1;
}

int avc_decode(char* buf, int nalLen, char* out)
{
    byte_t* data = (byte_t*)buf;
    int gotPicture = 0;

    int ret = avcodec_decode_video(fCodecContext, fVideoFrame, &gotPicture, data, nalLen);
    if (ret <= 0) {
        return ret;
    }
    if (!gotPicture) {
        return 0; // data consumed, but no complete frame yet
    }

    int width = fCodecContext->width;
    int height = fCodecContext->height;
    DisplayYUV_32((uint_t*)out, width, height, fDisplayWidth);
    return ret;
}
