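Audio/Video Tutorial: Lesson 4

A small community focused on audio and video

Audio/Video Tutorial Series

Course Objectives

Learn how to decode video frames with FFmpeg, understand the difference between AVPacket and AVFrame, and master the decoding pipeline from compressed data to raw YUV data.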

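Key Concepts

1. The difference between AVPacket and AVFrame

AVPacket (packet):

Holds compressed, encoded data
For video it normally carries one compressed frame; for audio it may carry several
Obtained by reading the media file (via av_read_frame)
Must be decoded before its content can be used

AVFrame (frame):

Holds decoded, raw data
A video frame contains pixel data, usually YUV
An audio frame contains PCM samples
Can be rendered or processed directly

Relationship: AVPacket (compressed data) → decoder → AVFrame (raw data)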

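2. Decoding flow: packet → frame

FFmpeg decodes through the send/receive API (avcodec_send_packet + avcodec_receive_frame), available since FFmpeg 3.1:

1. Open the decoder (avcodec_open2)
2. Read a packet (av_read_frame) → AVPacket
3. Send the packet to the decoder (avcodec_send_packet)
4. Receive frames from the decoder (avcodec_receive_frame) → AVFrame
5. Process the decoded frame data
6. Repeat steps 2-5 until the end of the file
7. Send a NULL packet to flush the decoder (this retrieves the frames it still buffers)
8. Close the decoder

Important notes:

One AVPacket may produce zero, one, or several AVFrames
Call avcodec_receive_frame in a loop until it returns AVERROR(EAGAIN), AVERROR_EOF, or an error
At the end of the file, send a NULL packet to flush the decoder's buffer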
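
The control flow above can be tried out without linking FFmpeg. The following is only a sketch: ToyDecoder, Status, and decode_all are hypothetical names invented for this illustration, not FFmpeg API. The toy decoder simply holds back a fixed number of frames, which is enough to show why the receive loop and the final flush are both needed.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Return codes loosely modeled on 0, AVERROR(EAGAIN) and AVERROR_EOF.
enum class Status { Ok, Again, Eof };

// Toy decoder that holds back `delay` frames, the way a real decoder holds
// frames for reordering. Hypothetical type, NOT part of FFmpeg.
class ToyDecoder {
public:
    explicit ToyDecoder(std::size_t delay) : delay_(delay) {}

    // Like avcodec_send_packet: a null pointer starts draining (flush).
    Status send(const int* packet) {
        if (draining_) return Status::Eof;  // no input accepted after flush
        if (packet == nullptr) { draining_ = true; return Status::Ok; }
        pending_.push_back(*packet);
        return Status::Ok;
    }

    // Like avcodec_receive_frame: Again means "send more input first".
    Status receive(int& frame) {
        const std::size_t keep = draining_ ? 0 : delay_;
        if (pending_.size() > keep) {
            frame = pending_.front();
            pending_.pop_front();
            return Status::Ok;
        }
        return draining_ ? Status::Eof : Status::Again;
    }

private:
    std::size_t delay_;
    bool draining_ = false;
    std::deque<int> pending_;
};

// Runs the full loop; each "packet" decodes to a "frame" with the same value.
std::vector<int> decode_all(const std::vector<int>& packets, std::size_t delay, bool flush) {
    ToyDecoder dec(delay);
    std::vector<int> frames;
    int frame = 0;
    for (int p : packets) {
        Status sent = dec.send(&p);
        assert(sent == Status::Ok);
        (void)sent;
        // Drain everything that is ready before sending the next packet.
        while (dec.receive(frame) == Status::Ok) frames.push_back(frame);
    }
    if (flush) {
        dec.send(nullptr);  // plays the role of the NULL packet
        while (dec.receive(frame) == Status::Ok) frames.push_back(frame);
    }
    return frames;
}
```

With five packets and a delay of two, decode_all returns three frames when flush is false and all five when it is true, which is the loss that skipping step 7 would cause.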

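3. Understanding YUV formats

YUV is a color encoding that separates brightness from color and is widely used in video processing:

YUV components:

Y (luma): brightness
U (Cb): blue-difference chroma
V (Cr): red-difference chroma

Common YUV formats:

YUV420P (planar):
- The most commonly used format
- Y, U, and V are stored in three separate planes
- Layout when written out contiguously: [Y plane][U plane][V plane]
- Each chroma plane has 1/4 as many samples as the luma plane (halved both horizontally and vertically)
YUV422P:
- Y, U, and V are stored separately
- Chroma is halved horizontally and kept at full vertical resolution
YUV444P:
- Y, U, and V are stored separately
- Chroma is not subsampled (highest quality)

YUV420P memory layout example (1280x720, 8 bits per sample, no row padding):

Y plane: 1280 * 720 = 921,600 bytes
U plane: 640 * 360 = 230,400 bytes
V plane: 640 * 360 = 230,400 bytes
Total: 1,382,400 bytes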
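
The same arithmetic can be written as a small helper that also covers YUV422P and YUV444P. This is a minimal sketch assuming 8-bit samples and no row padding; PlaneSizes and packed_plane_sizes are names made up for this example, and the shift parameters express the subsampling as a power of two.

```cpp
#include <cassert>
#include <cstddef>

// Tightly packed plane sizes (no row padding) for 8-bit planar YUV.
struct PlaneSizes {
    std::size_t y;       // bytes in the Y plane
    std::size_t chroma;  // bytes in EACH of the U and V planes
    std::size_t total() const { return y + 2 * chroma; }
};

// shift_w / shift_h: log2 of the horizontal / vertical chroma subsampling.
// YUV420P -> (1, 1), YUV422P -> (1, 0), YUV444P -> (0, 0).
PlaneSizes packed_plane_sizes(int width, int height, int shift_w, int shift_h) {
    assert(width > 0 && height > 0);
    assert(shift_w >= 0 && shift_w <= 2 && shift_h >= 0 && shift_h <= 2);
    // Round up so that odd dimensions still get a full chroma sample.
    const std::size_t cw = (static_cast<std::size_t>(width) + (1u << shift_w) - 1) >> shift_w;
    const std::size_t ch = (static_cast<std::size_t>(height) + (1u << shift_h) - 1) >> shift_h;
    return { static_cast<std::size_t>(width) * height, cw * ch };
}
```

For 1280x720, packed_plane_sizes(1280, 720, 1, 1) reproduces the 921,600 / 230,400 / 1,382,400 figures listed above.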

Accessing YUV data:

AVFrame* frame = ...;
// Y plane (luma)
uint8_t* y_data = frame->data[0];
int y_linesize = frame->linesize[0];  // bytes per row, may exceed the width

// U plane (blue-difference chroma)
uint8_t* u_data = frame->data[1];
int u_linesize = frame->linesize[1];

// V plane (red-difference chroma)
uint8_t* v_data = frame->data[2];
int v_linesize = frame->linesize[2];
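
To make the indexing concrete, here is a rough sketch that reads the Y, U, and V values belonging to one pixel of a YUV420P picture. Yuv420View and sample_at are hypothetical names for this example; the struct only mirrors the data and linesize fields of AVFrame so that the snippet compiles without FFmpeg.

```cpp
#include <cassert>
#include <cstdint>

// Minimal view of one YUV420P picture. The field names mirror AVFrame's
// data/linesize, but this struct is a stand-in, not an FFmpeg type.
struct Yuv420View {
    const std::uint8_t* data[3];
    int linesize[3];
    int width;
    int height;
};

struct YuvSample { std::uint8_t y, u, v; };

YuvSample sample_at(const Yuv420View& f, int row, int col) {
    assert(row >= 0 && row < f.height && col >= 0 && col < f.width);
    YuvSample s;
    // Rows are linesize bytes apart, not width bytes apart.
    s.y = f.data[0][row * f.linesize[0] + col];
    // One chroma sample covers a 2x2 block of luma samples.
    s.u = f.data[1][(row / 2) * f.linesize[1] + (col / 2)];
    s.v = f.data[2][(row / 2) * f.linesize[2] + (col / 2)];
    return s;
}
```

The division by two applies only to 4:2:0 content; for YUV422P only the column would be halved, and for YUV444P neither.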

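4. The AVCodecContext structure

AVCodecContext is the codec context. It holds the configuration and state of one encoder or decoder instance.

Main fields:

codec: the codec in use (const AVCodec*)
width, height: video resolution
pix_fmt: pixel format (for example AV_PIX_FMT_YUV420P)
time_base: time base (when decoding, interpret frame timestamps with the stream's time_base instead)
extradata: codec-specific extra data (for example the H.264 SPS/PPS)

Note that codecpar is not a field of AVCodecContext. It belongs to AVStream (type AVCodecParameters*), and its values are copied into the context with avcodec_parameters_to_context.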

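Hands-on Practice

Practice 1: Opening the video decoder

API: avcodec_find_decoder, avcodec_alloc_context3, avcodec_parameters_to_context, avcodec_open2

// FFmpeg is a C library, so C++ code must wrap its headers in extern "C"
extern "C" {
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>
#include <libavutil/pixdesc.h>
}

// LOG is assumed to be the project's printf-style logging macro
AVFormatContext* fmt_ctx = nullptr;
if (avformat_open_input(&fmt_ctx, filename, nullptr, nullptr) < 0) {
    LOG("Could not open input");
    return;
}
if (avformat_find_stream_info(fmt_ctx, nullptr) < 0) {
    LOG("Could not read stream info");
    avformat_close_input(&fmt_ctx);
    return;
}

// Find the video stream
int video_index = -1;
for (unsigned int i = 0; i < fmt_ctx->nb_streams; i++) {
    if (fmt_ctx->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {
        video_index = static_cast<int>(i);
        break;
    }
}

if (video_index < 0) {
    LOG("No video stream found");
    avformat_close_input(&fmt_ctx);
    return;
}

// Get the stream's codec parameters
AVCodecParameters* codecpar = fmt_ctx->streams[video_index]->codecpar;

// Look up the decoder
const AVCodec* codec = avcodec_find_decoder(codecpar->codec_id);
if (codec == nullptr) {
    LOG("Codec not found");
    avformat_close_input(&fmt_ctx);
    return;
}

// Allocate the decoder context
AVCodecContext* codec_ctx = avcodec_alloc_context3(codec);
if (codec_ctx == nullptr) {
    LOG("Could not allocate codec context");
    avformat_close_input(&fmt_ctx);
    return;
}

// Copy the codec parameters into the decoder context
int ret = avcodec_parameters_to_context(codec_ctx, codecpar);
if (ret < 0) {
    LOG("Could not copy codec parameters");
    avcodec_free_context(&codec_ctx);
    avformat_close_input(&fmt_ctx);
    return;
}

// Open the decoder
ret = avcodec_open2(codec_ctx, codec, nullptr);
if (ret < 0) {
    char errbuf[AV_ERROR_MAX_STRING_SIZE];
    av_strerror(ret, errbuf, AV_ERROR_MAX_STRING_SIZE);
    LOG("Could not open codec: %s", errbuf);
    avcodec_free_context(&codec_ctx);
    avformat_close_input(&fmt_ctx);
    return;
}

// av_get_pix_fmt_name returns NULL for an unknown format
const char* pix_fmt_name = av_get_pix_fmt_name(codec_ctx->pix_fmt);
LOG("Decoder opened: %s, resolution: %dx%d, pix_fmt: %s",
    codec->name, codec_ctx->width, codec_ctx->height,
    pix_fmt_name ? pix_fmt_name : "unknown");

// Release both objects when finished:
// avcodec_free_context(&codec_ctx);
// avformat_close_input(&fmt_ctx);

Key points:

avcodec_find_decoder looks up a decoder by codec ID
avcodec_alloc_context3 allocates the decoder context
avcodec_parameters_to_context copies the stream's codec parameters into the context
avcodec_open2 opens the decoder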

Practice 2: Reading and decoding video frames

API: av_read_frame, avcodec_send_packet, avcodec_receive_frame

// Assume the file and the decoder are already open
AVFormatContext* fmt_ctx = ...;
AVCodecContext* codec_ctx = ...;
int video_index = ...;

// Allocate the packet and the frame
AVPacket* packet = av_packet_alloc();
AVFrame* frame = av_frame_alloc();

if (packet == nullptr || frame == nullptr) {
    LOG("Could not allocate packet or frame");
    av_packet_free(&packet);  // both free functions accept a null pointer
    av_frame_free(&frame);
    return;
}

int frame_count = 0;

// Read and decode
while (av_read_frame(fmt_ctx, packet) >= 0) {
    // Only handle packets from the video stream
    if (packet->stream_index == video_index) {
        // Send the packet to the decoder
        int ret = avcodec_send_packet(codec_ctx, packet);
        if (ret < 0) {
            LOG("Error sending packet: %d", ret);
            av_packet_unref(packet);
            continue;
        }

        // Receive decoded frames (one packet may yield zero or more)
        while (ret >= 0) {
            ret = avcodec_receive_frame(codec_ctx, frame);
            if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) {
                // EAGAIN: the decoder needs more input
                // EOF: the decoder is fully flushed, no more frames
                break;
            } else if (ret < 0) {
                LOG("Error receiving frame: %d", ret);
                break;
            }

            // One frame decoded successfully
            frame_count++;
            const char* fmt_name = av_get_pix_fmt_name((AVPixelFormat)frame->format);
            LOG("Decoded frame #%d: %dx%d, pts=%lld, format=%s",
                frame_count, frame->width, frame->height, (long long)frame->pts,
                fmt_name ? fmt_name : "unknown");
        }
    }

    // Release the packet's reference
    av_packet_unref(packet);
}

// Flush the decoder (send a NULL packet)
avcodec_send_packet(codec_ctx, nullptr);
while (true) {
    int ret = avcodec_receive_frame(codec_ctx, frame);
    if (ret == AVERROR_EOF) {
        break;  // the decoder is fully flushed
    } else if (ret < 0) {
        break;  // any other error also ends the drain
    }

    frame_count++;
    LOG("Decoded frame #%d (from buffer)", frame_count);
}

LOG("Total decoded frames: %d", frame_count);

// Release resources
av_frame_free(&frame);
av_packet_free(&packet);

Key points:

av_read_frame reads one packet (for video, normally exactly one compressed frame)
avcodec_send_packet sends the packet to the decoder
avcodec_receive_frame returns a decoded frame; call it in a loop until it returns AVERROR(EAGAIN)
After the end of the file, send a NULL packet to flush the decoder's buffer

Practice 3: Getting the decoded YUV data

API: AVFrame->data, AVFrame->linesize

// Assume frame holds a decoded picture
AVFrame* frame = ...;

// Pixel format
AVPixelFormat pix_fmt = (AVPixelFormat)frame->format;
const char* pix_fmt_name = av_get_pix_fmt_name(pix_fmt);
LOG("Pixel format: %s", pix_fmt_name ? pix_fmt_name : "unknown");

// Resolution
LOG("Resolution: %dx%d", frame->width, frame->height);

// For the YUV420P format
if (pix_fmt == AV_PIX_FMT_YUV420P) {
    // Chroma planes have half the rows, rounded up for odd heights
    int chroma_height = (frame->height + 1) / 2;

    // Y plane (luma); the sizes below include any row padding
    uint8_t* y_data = frame->data[0];
    int y_linesize = frame->linesize[0];
    int y_size = y_linesize * frame->height;

    // U plane (blue-difference chroma)
    uint8_t* u_data = frame->data[1];
    int u_linesize = frame->linesize[1];
    int u_size = u_linesize * chroma_height;

    // V plane (red-difference chroma)
    uint8_t* v_data = frame->data[2];
    int v_linesize = frame->linesize[2];
    int v_size = v_linesize * chroma_height;

    LOG("Y plane: %d bytes, linesize=%d", y_size, y_linesize);
    LOG("U plane: %d bytes, linesize=%d", u_size, u_linesize);
    LOG("V plane: %d bytes, linesize=%d", v_size, v_linesize);

    // Read the Y value of one pixel (row `row`, column `col`)
    // int row = 100, col = 200;
    // uint8_t y_value = y_data[row * y_linesize + col];
}

Key points:

frame->data[0], frame->data[1], and frame->data[2] point to the Y, U, and V planes
frame->linesize[0], frame->linesize[1], and frame->linesize[2] give the number of bytes per row (possibly including alignment padding)
linesize may be larger than width (because of memory alignment)
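
Because of that padding, a plane usually has to be copied row by row before it can be handed to code that expects tightly packed pixels. A minimal sketch of such a copy, assuming 8-bit samples; pack_plane is a hypothetical helper name, and in real code src and linesize would come from frame->data[i] and frame->linesize[i].

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy a plane whose rows start `linesize` bytes apart into a buffer whose
// rows are exactly `width` bytes long, dropping the padding of each row.
std::vector<std::uint8_t> pack_plane(const std::uint8_t* src, int linesize,
                                     int width, int height) {
    assert(src != nullptr && width > 0 && height > 0 && linesize >= width);
    std::vector<std::uint8_t> out(static_cast<std::size_t>(width) * height);
    for (int row = 0; row < height; ++row) {
        std::memcpy(out.data() + static_cast<std::size_t>(row) * width,
                    src + static_cast<std::size_t>(row) * linesize,
                    static_cast<std::size_t>(width));
    }
    return out;
}
```

For the chroma planes of YUV420P, pass the halved width and height rather than the frame dimensions.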

Practice 4: Counting decoded frames and measuring the frame rate

int frame_count = 0;
int64_t first_pts = AV_NOPTS_VALUE;
int64_t last_pts = AV_NOPTS_VALUE;
AVRational time_base = fmt_ctx->streams[video_index]->time_base;

// Count a frame and remember the first and last PTS
auto count_frame = [&](const AVFrame* f) {
    frame_count++;
    if (f->pts == AV_NOPTS_VALUE) {
        return;
    }
    if (first_pts == AV_NOPTS_VALUE) {
        first_pts = f->pts;
    }
    last_pts = f->pts;
};

// Inside the decode loop
while (av_read_frame(fmt_ctx, packet) >= 0) {
    if (packet->stream_index == video_index) {
        avcodec_send_packet(codec_ctx, packet);

        while (avcodec_receive_frame(codec_ctx, frame) >= 0) {
            count_frame(frame);
        }
    }
    av_packet_unref(packet);
}

// Flush so that frames still buffered in the decoder are counted too
avcodec_send_packet(codec_ctx, nullptr);
while (avcodec_receive_frame(codec_ctx, frame) >= 0) {
    count_frame(frame);
}

// Compute the measured frame rate
if (frame_count > 1 && first_pts != AV_NOPTS_VALUE && last_pts != AV_NOPTS_VALUE) {
    int64_t pts_diff = last_pts - first_pts;
    double time_diff = pts_diff * av_q2d(time_base);

    if (time_diff > 0) {
        // N frames span N - 1 frame intervals
        double actual_fps = (frame_count - 1) / time_diff;
        LOG("Decoded %d frames spanning %.2f seconds", frame_count, time_diff);
        LOG("Actual frame rate: %.2f fps", actual_fps);
    }
}

// Frame rate declared by the stream (0/0 when unknown)
AVRational avg_frame_rate = fmt_ctx->streams[video_index]->avg_frame_rate;
if (avg_frame_rate.den != 0) {
    LOG("Declared frame rate: %.2f fps", av_q2d(avg_frame_rate));
}
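
The frame-rate arithmetic can be checked on its own with made-up timestamps. A small sketch under stated assumptions: Rational is a hypothetical stand-in for AVRational so the snippet builds without FFmpeg, and the measured rate is taken as the number of frame intervals divided by the time between the first and last PTS.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for AVRational (num/den); hypothetical, for illustration only.
struct Rational { int num; int den; };

// Returns the measured frame rate, or 0.0 when it cannot be computed.
double measured_fps(std::int64_t first_pts, std::int64_t last_pts,
                    int frame_count, Rational time_base) {
    assert(time_base.den != 0);
    if (frame_count < 2 || last_pts <= first_pts) return 0.0;
    const double seconds =
        static_cast<double>(last_pts - first_pts) * time_base.num / time_base.den;
    // N frames span N - 1 frame intervals.
    return (frame_count - 1) / seconds;
}
```

For example, with a 1/90000 time base, 301 frames whose PTS runs from 0 to 900000 cover 10 seconds and 300 intervals, so the function returns 30 fps.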

Practice 5: Saving a YUV file and viewing it with ffplay

Goal: save the decoded YUV data to a file, then view it with the ffplay command-line tool.

// Assume the file and the decoder are already open
AVFormatContext* fmt_ctx = ...;
AVCodecContext* codec_ctx = ...;
int video_index = ...;

// Open the output file
FILE* yuv_file = fopen("output.yuv", "wb");
if (yuv_file == nullptr) {
    LOG("Could not open output file");
    return;
}

AVPacket* packet = av_packet_alloc();
AVFrame* frame = av_frame_alloc();
if (packet == nullptr || frame == nullptr) {
    LOG("Could not allocate packet or frame");
    av_packet_free(&packet);
    av_frame_free(&frame);
    fclose(yuv_file);
    return;
}

int frame_count = 0;

// Write `rows` rows of `bytes` bytes each, skipping the padding after every row
auto write_plane = [&](const uint8_t* data, int linesize, int bytes, int rows) {
    for (int y = 0; y < rows; y++) {
        fwrite(data + y * linesize, 1, bytes, yuv_file);
    }
};

// Write one frame as Y, then U, then V; frames in other formats are skipped
auto write_frame = [&](const AVFrame* f) {
    if (f->format != AV_PIX_FMT_YUV420P) {
        return;
    }
    int chroma_w = (f->width + 1) / 2;   // round up for odd dimensions
    int chroma_h = (f->height + 1) / 2;
    write_plane(f->data[0], f->linesize[0], f->width, f->height);
    write_plane(f->data[1], f->linesize[1], chroma_w, chroma_h);
    write_plane(f->data[2], f->linesize[2], chroma_w, chroma_h);
    frame_count++;
};

// Read and decode
while (av_read_frame(fmt_ctx, packet) >= 0) {
    if (packet->stream_index == video_index) {
        avcodec_send_packet(codec_ctx, packet);

        while (avcodec_receive_frame(codec_ctx, frame) >= 0) {
            write_frame(frame);
        }
    }
    av_packet_unref(packet);
}

// Flush the decoder
avcodec_send_packet(codec_ctx, nullptr);
while (avcodec_receive_frame(codec_ctx, frame) >= 0) {
    write_frame(frame);
}

// fclose flushes any buffered data itself; the explicit fflush is optional
fflush(yuv_file);
fclose(yuv_file);
LOG("Saved %d frames to output.yuv", frame_count);

// Clean up
av_frame_free(&frame);
av_packet_free(&packet);

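Viewing the YUV file with ffplay:

A YUV file is raw pixel data with no file header, so the resolution, pixel format, and frame rate must be given explicitly for correct playback.

# Basic usage: specify the resolution, pixel format, and frame rate
ffplay -f rawvideo -video_size 1280x720 -pixel_format yuv420p -framerate 30 output.yuv

# Options:
# -f rawvideo          : treat the input as raw video
# -video_size 1280x720 : video resolution
# -pixel_format yuv420p: pixel format
# -framerate 30        : frame rate (optional, controls playback speed)

# -loop sets how many times playback repeats (0 means forever); it does not limit the frames shown
ffplay -f rawvideo -video_size 1280x720 -pixel_format yuv420p -framerate 30 -loop 0 output.yuv

# View a specific range of frames (compute the frame size first)
# Frame size = width * height * 1.5 (YUV420P)
# Example: 1280 * 720 * 1.5 = 1,382,400 bytes per frame
# View frames 10-20:
# dd if=output.yuv bs=1382400 skip=9 count=11 | \
#   ffplay -f rawvideo -video_size 1280x720 -pixel_format yuv420p -

Key points:

A YUV file is raw data with no container format; the Y, U, and V planes are written one after another
When writing, linesize may be larger than width, so write only the actual width of each row and skip the padding bytes
ffplay needs the correct resolution, pixel format, and frame rate to display the file
YUV420P: frame size = width * height * 1.5 bytes
fclose() flushes buffered data to the file; calling fflush() first is optional, but check the return values to detect write errors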
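
The frame-size formula also tells where a given frame starts inside the raw file, which is what the dd example relies on. A short sketch assuming tightly packed 8-bit YUV420P; the function names are invented for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Bytes in one tightly packed 8-bit YUV420P frame.
std::int64_t yuv420p_frame_bytes(int width, int height) {
    assert(width > 0 && height > 0);
    const std::int64_t cw = (width + 1) / 2;   // chroma width, rounded up
    const std::int64_t ch = (height + 1) / 2;  // chroma height, rounded up
    return static_cast<std::int64_t>(width) * height + 2 * cw * ch;
}

// Byte offset of frame `frame_number` (1-based) from the start of the file.
std::int64_t frame_offset(int width, int height, std::int64_t frame_number) {
    assert(frame_number >= 1);
    return (frame_number - 1) * yuv420p_frame_bytes(width, height);
}

// Number of whole frames in a file of `file_bytes` bytes. A non-zero
// remainder suggests a wrong resolution or a truncated file.
std::int64_t whole_frames(std::int64_t file_bytes, int width, int height) {
    assert(file_bytes >= 0);
    return file_bytes / yuv420p_frame_bytes(width, height);
}
```

For 1280x720, frame 10 starts at byte 9 * 1,382,400 = 12,441,600, matching skip=9 with bs=1382400 in the dd command.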

Verifying the YUV file:

# Convert the YUV file back to MP4 with ffmpeg (checks that the data is correct)
ffmpeg -f rawvideo -video_size 1280x720 -pixel_format yuv420p -framerate 30 \
       -i output.yuv -c:v libx264 -preset fast -crf 23 output_verify.mp4

Using the wrapper class

The project provides a Demuxer class for demuxing; decoding can be built on top of it:

#include "format/Demuxer.h"

Demuxer demuxer;
if (demuxer.open("test.mp4")) {
    demuxer.findStreamInfo();

    int video_index = demuxer.getVideoStreamIndex();
    if (video_index >= 0) {
        AVCodecParameters* codecpar = demuxer.getCodecParameters(video_index);

        // Open the decoder, checking every step
        const AVCodec* codec = avcodec_find_decoder(codecpar->codec_id);
        AVCodecContext* codec_ctx = codec ? avcodec_alloc_context3(codec) : nullptr;
        if (codec_ctx != nullptr &&
            avcodec_parameters_to_context(codec_ctx, codecpar) >= 0 &&
            avcodec_open2(codec_ctx, codec, nullptr) >= 0) {
            // Read and decode
            AVPacket* packet = av_packet_alloc();
            AVFrame* frame = av_frame_alloc();

            while (demuxer.readFrame(packet) >= 0) {
                if (packet->stream_index == video_index) {
                    avcodec_send_packet(codec_ctx, packet);
                    while (avcodec_receive_frame(codec_ctx, frame) >= 0) {
                        // Process the decoded frame
                    }
                }
                av_packet_unref(packet);
            }

            // Flush the frames still buffered in the decoder
            avcodec_send_packet(codec_ctx, nullptr);
            while (avcodec_receive_frame(codec_ctx, frame) >= 0) {
                // Process the decoded frame
            }

            av_frame_free(&frame);
            av_packet_free(&packet);
        }

        // Clean up (safe even when codec_ctx is null)
        avcodec_free_context(&codec_ctx);
    }

    demuxer.close();
}

Running the tests

Building the project

Build the project before running the tests:

cmake --build build/Release

Running all Lesson 4 tests

# Note: in zsh, quote the argument so that * is not expanded as a glob
./build/Release/unit-test --gtest_filter="Lesson4_Decode.*"

Running the tests that use the API directly

./build/Release/unit-test --gtest_filter="Lesson4_Decode.*DirectAPI"

Running the tests that use the wrapper class

./build/Release/unit-test --gtest_filter="Lesson4_Decode.*UseClass"

Build and run (recommended)

Build and run with a single command:

cmake --build build/Release && ./build/Release/unit-test --gtest_filter="Lesson4_Decode.*"

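FAQ

Q1: What does it mean when `avcodec_send_packet` returns `AVERROR(EAGAIN)`?

A: The decoder cannot accept new input in its current state, typically because decoded frames are waiting to be read. Call avcodec_receive_frame to drain them, then send the same packet again.

Q2: Why does `avcodec_receive_frame` have to be called in a loop?

A: The decoder buffers frames internally (for example to reorder B-frames), so one AVPacket may yield no frame, one frame, or for some codecs several frames. Keep calling until it returns AVERROR(EAGAIN) or an error.

Q3: What else needs to be done after the end of the file?

A: Send a NULL packet (avcodec_send_packet(codec_ctx, nullptr)) to flush the decoder's buffer and retrieve all remaining frames.

Q4: Why can `linesize` be larger than `width`?

A: For memory alignment and performance, linesize may include padding bytes. When addressing pixel data, compute row offsets with linesize, not width.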
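
To get a feel for where the padding comes from, the following sketch rounds a row length up to a power-of-two boundary. The alignment values used with it are assumptions for illustration only: FFmpeg chooses the actual alignment itself, so real code should always read frame->linesize rather than compute it. align_up is a hypothetical helper name.

```cpp
#include <cassert>

// Round `bytes` up to the next multiple of `alignment` (a power of two).
int align_up(int bytes, int alignment) {
    assert(bytes >= 0 && alignment > 0 && (alignment & (alignment - 1)) == 0);
    return (bytes + alignment - 1) & ~(alignment - 1);
}
```

Under an assumed 64-byte alignment, a 1280-byte row needs no padding, while an 854-byte row would be stored in 896 bytes, leaving 42 padding bytes at the end of each row.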

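Q5: How can I tell whether the decoder supports a given pixel format?

A: AVFrame->format of a decoded frame is the pixel format the decoder outputs. Many 8-bit streams decode to YUV420P, but this is not guaranteed (10-bit content and hardware decoders produce other formats), so check the value and convert with libswscale if a specific format is required.

Q6: Can the decoded frame data be saved to a file?

A: Yes, by writing the YUV data to a file. A YUV file is raw pixel data with no file header; simply write the Y, U, and V planes one after another.

Steps for saving a YUV file:

Open the file in binary write mode
For each frame, write the Y, U, and V planes in order
Write only the actual width of each row and skip the padding bytes in linesize

Viewing the YUV file with ffplay:

ffplay -f rawvideo -video_size 1280x720 -pixel_format yuv420p -framerate 30 output.yuv

Q7: Why can't an ordinary player open a YUV file directly?

A: A YUV file is raw pixel data with no container format or file header, so a player cannot detect the resolution, pixel format, and similar information on its own. Use a tool such as ffplay and specify these parameters manually.

Q8: How can I verify that the saved YUV file is correct?

A: It can be verified in the following ways:

Play it with ffplay:

ffplay -f rawvideo -video_size 1280x720 -pixel_format yuv420p -framerate 30 output.yuv

Convert it back to MP4:

ffmpeg -f rawvideo -video_size 1280x720 -pixel_format yuv420p -framerate 30 \
       -i output.yuv -c:v libx264 -preset fast -crf 23 output_verify.mp4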

References

Original article; please credit the source when reposting: Audio/Video Tutorial: Lesson 4

Tags: video course

Category: video_lesson
