Hardware Video Encoding with VideoToolbox

Video encoding concepts

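Frame: each frame is a single still image.

GOP (group of pictures): a run of consecutive frames that starts with a key frame. A longer GOP generally compresses more efficiently, so quality is better at the same bitrate, at the cost of slower seeking and slower recovery from lost data.

Bitrate: the amount of data per second after the picture has been compressed.

Frame rate: the number of images displayed per second (above roughly 16 frames per second, the human eye perceives the motion as continuous).

Resolution: the image size, width × height in pixels.

Uncompressed data per second: frame rate × resolution × bytes per pixel.

Compression ratio: uncompressed data per second / bitrate (at a given resolution and frame rate, the higher the ratio, the worse the picture quality tends to be).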
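As a rough worked example of the last two definitions, the sketch below computes the uncompressed bit rate and the compression ratio. The function names and the 12 bits per pixel figure (typical of 4:2:0 pixel formats such as NV12) are assumptions made for illustration; they do not come from the text above.

```c
#include <stdint.h>

/* Uncompressed video data rate in bits per second.
   bits_per_pixel is 12 for 4:2:0 formats and 24 for packed RGB. */
uint64_t raw_bits_per_second(uint32_t width, uint32_t height,
                             uint32_t fps, uint32_t bits_per_pixel) {
    return (uint64_t)width * height * bits_per_pixel * fps;
}

/* Compression ratio = uncompressed bits per second / encoded bits per second. */
double compression_ratio(uint64_t raw_bps, uint64_t encoded_bps) {
    return (double)raw_bps / (double)encoded_bps;
}
```

For 1280 × 720 at 30 fps and 12 bits per pixel, raw_bits_per_second gives 331,776,000 bits per second. Against an encoded bitrate of 2,000,000 bits per second, the compression ratio is about 166:1.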

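Container format: a container that stores video information (streaming containers: TS, FLV; indexed containers: MP4, MOV, AVI).

Main purpose: a video file usually contains picture data, audio data, and some configuration information, and all of it has to be organized and packaged according to fixed rules.

Note: a video file's extension normally carries the name of its container format, so in practice the file format and the container format are treated as the same thing.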

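I-frame (key frame): an intra-coded frame that holds a complete picture. It is the reference frame for P-frames and B-frames, carries a relatively large amount of data, and is the first frame of a GOP; a GOP typically contains a single I-frame.

P-frame (predicted frame): stores only the difference from a preceding frame (an I-frame or an earlier P-frame). The decoder adds that difference to a buffered picture to produce the final picture. Because a P-frame holds difference data rather than a complete picture, its compression ratio is relatively high.

B-frame (bidirectional predicted frame): records the difference between this frame and the frames before and after it. It is predicted from a preceding I-frame or P-frame together with a following P-frame.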

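Intra-frame (spatial) compression: uses only the data of the current frame and ignores redundancy between neighboring frames, much like still-image compression. It generally uses a lossy algorithm. Because it encodes a complete image, the result can be decoded and displayed on its own. Intra-frame compression usually does not reach a very high compression ratio; it is comparable to JPEG encoding.

Inter-frame (temporal) compression: compresses by comparing data from different frames along the time axis. Storing the difference between frames loses no information by itself; in H.264 the residual is then transformed and quantized, which is where loss is introduced.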
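To make the idea of storing only the difference between frames concrete, here is a minimal sketch using plain per-pixel differencing on 8-bit samples. This is a deliberate simplification: real codecs such as H.264 use motion-compensated block prediction followed by a transform and quantization, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Encoder side: store cur - prev for every sample.
   int16_t covers the full difference range of -255..255. */
void frame_diff(const uint8_t *prev, const uint8_t *cur,
                int16_t *diff, size_t n) {
    for (size_t i = 0; i < n; i++) {
        diff[i] = (int16_t)((int)cur[i] - (int)prev[i]);
    }
}

/* Decoder side: rebuild the current frame from the cached
   previous frame plus the stored difference. */
void frame_apply(const uint8_t *prev, const int16_t *diff,
                 uint8_t *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = (uint8_t)((int)prev[i] + (int)diff[i]);
    }
}
```

When two frames are nearly identical, most entries of diff are zero or close to zero, which is what makes the difference cheaper to encode than the full frame.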

Bitrate formula

[Figure: bitrate calculation formula]

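H.264 video compression method:

1. Grouping: split the frames into groups (GOPs). To limit the amount of motion within a group, each group should not contain too many frames.

2. Frame typing: classify every frame in a group as one of three types: I-frame, B-frame, or P-frame.

3. Prediction: starting from the I-frame, predict the P-frames from the I-frame, then predict the B-frames from the I-frame and the P-frames.

4. Data transmission: finally, store and transmit the I-frame data together with the predicted difference information.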
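The grouping and frame typing steps can be sketched for the simplest case. The helper below assumes a fixed GOP length and an I P P P ... pattern with no B-frames; both the function name and that pattern are illustrative assumptions, and a real encoder is free to choose frame types differently.

```c
/* Frame type for a fixed-length GOP using an I P P P ... pattern.
   Returns 'I' for the first frame of each group and 'P' otherwise. */
char frame_type_for_index(unsigned frame_index, unsigned gop_length) {
    if (gop_length == 0) {
        return 'I'; /* treat a zero length as all-intra to avoid dividing by zero */
    }
    return (frame_index % gop_length == 0) ? 'I' : 'P';
}
```

With a GOP length of 30, frames 0, 30, 60, and so on are I-frames, and every other frame is a P-frame.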

H.264 NAL header parsing

If the slice carried by a NALU starts a frame, the start code is 4 bytes, 0x00000001; otherwise it is 3 bytes, 0x000001.
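A reader of such a byte stream has to accept both start code lengths. The following is a minimal sketch of a scanner that does so; the function name and its return convention are assumptions chosen for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Finds the first start code (00 00 01 or 00 00 00 01) at or after `from`.
   Returns its byte offset and writes its length (3 or 4) to *code_len,
   or returns -1 if no start code is found. */
long find_start_code(const uint8_t *buf, size_t len, size_t from, int *code_len) {
    for (size_t i = from; i + 3 <= len; i++) {
        if (buf[i] != 0 || buf[i + 1] != 0) {
            continue;
        }
        if (buf[i + 2] == 1) {
            *code_len = 3;
            return (long)i;
        }
        if (i + 4 <= len && buf[i + 2] == 0 && buf[i + 3] == 1) {
            *code_len = 4;
            return (long)i;
        }
    }
    return -1;
}
```

The NAL unit payload starts at the returned offset plus the start code length and runs until the next start code or the end of the buffer.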

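NAL header: forbidden_bit (1 bit), nal_reference_bit (priority, 2 bits), nal_unit_type (type, 5 bits). NAL units whose nal_unit_type is 1 to 5 are VCL NAL units; all other types are non-VCL NAL units. The nal_unit_type values are: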

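0: unspecified

1: coded slice of a non-IDR picture, without data partitioning

2: coded slice data partition A of a non-IDR picture

3: coded slice data partition B of a non-IDR picture

4: coded slice data partition C of a non-IDR picture

5: coded slice of an IDR picture

6: supplemental enhancement information (SEI)

7: sequence parameter set (SPS)

8: picture parameter set (PPS)

9: access unit delimiter

10: end of sequence

11: end of stream

12: filler data

13: sequence parameter set extension

14: prefix NAL unit

15: subset sequence parameter set

16-18: reserved

19: coded slice of an auxiliary coded picture, without data partitioning

20: coded slice extension

21-23: reserved

24-31: unspecified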
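The one-byte NAL header can be unpacked with a few shifts and masks. This sketch uses a hypothetical NalHeader struct and helper names; the bit layout is the 1 + 2 + 5 split described in this section.

```c
#include <stdint.h>

typedef struct {
    uint8_t forbidden_bit; /* 1 bit, expected to be 0 */
    uint8_t ref_idc;       /* 2 bits, priority of the unit */
    uint8_t unit_type;     /* 5 bits, one of the type values listed here */
} NalHeader;

/* Splits the first byte of a NAL unit into its three fields. */
NalHeader parse_nal_header(uint8_t byte) {
    NalHeader h;
    h.forbidden_bit = (uint8_t)((byte >> 7) & 0x01);
    h.ref_idc = (uint8_t)((byte >> 5) & 0x03);
    h.unit_type = (uint8_t)(byte & 0x1F);
    return h;
}

/* Types 1 to 5 carry coded slice data (VCL); everything else is non-VCL. */
int nal_is_vcl(uint8_t unit_type) {
    return unit_type >= 1 && unit_type <= 5;
}
```

For example, the header byte 0x67 decodes to priority 3 and type 7 (SPS), and 0x65 decodes to priority 3 and type 5 (a slice of an IDR picture).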

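The H.264 SPS and PPS carry the parameters needed to initialize an H.264 decoder, including the profile and level used for encoding, the picture width and height, and deblocking filter settings.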
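The profile and level are the easiest of these parameters to read, because they sit at fixed byte positions right after the NAL header. The sketch below assumes the SPS buffer begins with its one-byte NAL header (no start code or length prefix in front); the function name is hypothetical. Width and height are not handled here because they are Exp-Golomb coded further into the SPS.

```c
#include <stddef.h>
#include <stdint.h>

/* Reads profile_idc and level_idc from an SPS NAL unit.
   Layout: sps[0] NAL header, sps[1] profile_idc,
           sps[2] constraint flags, sps[3] level_idc.
   Returns 0 on success, -1 if the buffer is too short or is not an SPS. */
int sps_profile_level(const uint8_t *sps, size_t len,
                      uint8_t *profile_idc, uint8_t *level_idc) {
    if (len < 4 || (sps[0] & 0x1F) != 7) {
        return -1;
    }
    *profile_idc = sps[1];
    *level_idc = sps[3];
    return 0;
}
```

A profile_idc of 66 indicates the Baseline profile, and a level_idc of 30 indicates level 3.0.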

Data format before encoding or after decoding = CMSampleBuffer = CMTime + CMVideoFormatDesc + CVPixelBuffer

Data format after encoding = CMSampleBuffer = CMTime + CMVideoFormatDesc (image storage format) + CMBlockBuffer


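VideoToolbox encoding flow

1. Initialize the camera. When configuring the output, set its delegate and output queue, and handle the captured images in the delegate method.

2. Initialize VideoToolbox and set its properties.

3. Take each frame of data and encode it.

4. When a frame finishes encoding, check in the callback whether it is a key frame. If it is, call CMSampleBufferGetFormatDescription to get the CMFormatDescriptionRef, then call CMVideoFormatDescriptionGetH264ParameterSetAtIndex to get the SPS and PPS. Finally, replace the first four bytes of every NALU in the frame with 0x00 00 00 01 and write the data to the file.

5. Repeat steps 3 and 4.

6. Call VTCompressionSessionCompleteFrames to finish encoding, then destroy the session with VTCompressionSessionInvalidate and release it.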
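The byte rewriting in step 4 can be sketched in plain C, independent of the VideoToolbox types. The function below assumes the encoded frame is a sequence of NAL units, each preceded by a 4-byte big-endian length, held in one contiguous buffer; the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Overwrites each 4-byte big-endian length prefix with the start code
   00 00 00 01, in place. Returns the number of NAL units converted,
   or -1 if a length runs past the end of the buffer. On -1, units
   before the malformed one have already been rewritten. */
int avcc_to_annexb_in_place(uint8_t *buf, size_t len) {
    size_t off = 0;
    int count = 0;
    while (off < len) {
        if (len - off < 4) {
            return -1; /* not enough bytes left for a length prefix */
        }
        uint32_t nal_len = ((uint32_t)buf[off] << 24) |
                           ((uint32_t)buf[off + 1] << 16) |
                           ((uint32_t)buf[off + 2] << 8) |
                           (uint32_t)buf[off + 3];
        if (nal_len > len - off - 4) {
            return -1; /* declared length exceeds the remaining data */
        }
        buf[off] = 0;
        buf[off + 1] = 0;
        buf[off + 2] = 0;
        buf[off + 3] = 1;
        off += 4 + (size_t)nal_len;
        count++;
    }
    return count;
}
```

Because the length prefix and the start code are both 4 bytes, the conversion needs no extra memory. The SPS and PPS still have to be written separately, each behind its own start code, because they come from the format description rather than from the frame data.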


VideoToolbox sample code:

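#import "H264Encoder.h"
#import <VideoToolbox/VideoToolbox.h>

// Forward declaration: initVideoToolbox refers to this callback before its definition.
void didCompressH264(void *outputCallbackRefCon, void *sourceFrameRefCon, OSStatus status, VTEncodeInfoFlags infoFlags, CMSampleBufferRef sampleBuffer);

@interface H264Encoder()
@property (nonatomic, assign) int frameID;
@property (nonatomic, assign) VTCompressionSessionRef cEncodeingSession;
@property (nonatomic, strong) NSFileHandle *videoFileHandle;
@property (nonatomic, strong) dispatch_queue_t encodeQueue;
@end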

@implementation H264Encoder

- (instancetype)init
{
    self = [super init];
    if (self) {
        dispatch_sync(self.encodeQueue, ^{
            [self initVideoToolbox];
        });
    }
    return self;
}

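- (void)stopEncode
{
    if (self.cEncodeingSession != NULL) {
        // Flush pending frames before tearing the session down
        VTCompressionSessionCompleteFrames(self.cEncodeingSession, kCMTimeInvalid);
        VTCompressionSessionInvalidate(self.cEncodeingSession);
        CFRelease(self.cEncodeingSession);
        self.cEncodeingSession = NULL;
    }
    // Use the ivar so the lazy getter does not recreate the file here
    [_videoFileHandle closeFile];
    _videoFileHandle = nil;
}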

- (void)encodeH264:(CMSampleBufferRef)sampleBuffer {
    dispatch_sync(self.encodeQueue, ^{
        // The session is set to NULL after a failure or after stopEncode
        if (self.cEncodeingSession == NULL) {
            return;
        }
        NSLog(@"H264 encoding...");
        // Get the unencoded image data for this frame
        CVImageBufferRef imageBuffer = (CVImageBufferRef)CMSampleBufferGetImageBuffer(sampleBuffer);
        // Build the presentation timestamp from the current frame count
        CMTime ptime = CMTimeMake(self.frameID++, 1000);
        // Output flags filled in by the encoder, for example whether the frame is being encoded asynchronously or was dropped
        VTEncodeInfoFlags flags;
        OSStatus status = VTCompressionSessionEncodeFrame(self.cEncodeingSession, imageBuffer, ptime, kCMTimeInvalid, NULL, NULL, &flags);
        if (status != noErr) {
            NSLog(@"encode error status = %d", (int)status);
            VTCompressionSessionInvalidate(self.cEncodeingSession);
            CFRelease(self.cEncodeingSession);
            self.cEncodeingSession = NULL;
            return;
        }
    });
}

- (void)initVideoToolbox {
    // Counts which frame is being encoded
    self.frameID = 0;
    // Width and height of the captured video (taken from the screen bounds here)
    int width = [UIScreen mainScreen].bounds.size.width;
    int height = [UIScreen mainScreen].bounds.size.height;
    // Create the encoder; didCompressH264 is the output callback
    OSStatus status = VTCompressionSessionCreate(NULL, width, height, kCMVideoCodecType_H264,
                                                 NULL, NULL, NULL,
                                                 didCompressH264,
                                                 (__bridge void *)self, &_cEncodeingSession);
    if (status != noErr) {
        NSLog(@"failed to create encoder, status = %d", (int)status);
        return;
    }

    // Enable real-time encoding output
    VTSessionSetProperty(self.cEncodeingSession, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);
    VTSessionSetProperty(self.cEncodeingSession, kVTCompressionPropertyKey_ProfileLevel, kVTProfileLevel_H264_Baseline_AutoLevel);

    // Key frame interval (GOP size)
    int frameInterval = 30;
    CFNumberRef frameIntervalRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &frameInterval);
    VTSessionSetProperty(self.cEncodeingSession, kVTCompressionPropertyKey_MaxKeyFrameInterval, frameIntervalRef);
    CFRelease(frameIntervalRef);

    // Expected frame rate, not the actual frame rate
    int fps = 30;
    CFNumberRef fpsRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &fps);
    VTSessionSetProperty(self.cEncodeingSession, kVTCompressionPropertyKey_ExpectedFrameRate, fpsRef);
    CFRelease(fpsRef);

    // Average bitrate in bits per second. A higher bitrate gives a clearer picture and a low
    // one causes blocking artifacts, but a higher bitrate is also harder to transmit.
    int bigRate = width * height * 3 * 4 * 8;
    CFNumberRef bigRateRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bigRate);
    VTSessionSetProperty(self.cEncodeingSession, kVTCompressionPropertyKey_AverageBitRate, bigRateRef);
    CFRelease(bigRateRef);

    // Hard data rate limit: an array of [bytes, seconds] pairs, here bigRateLimit bytes per 1 second
    int bigRateLimit = width * height * 3 * 4;
    NSArray *dataRateLimits = @[@(bigRateLimit), @1];
    VTSessionSetProperty(self.cEncodeingSession, kVTCompressionPropertyKey_DataRateLimits, (__bridge CFArrayRef)dataRateLimits);

    // Get ready to encode
    VTCompressionSessionPrepareToEncodeFrames(self.cEncodeingSession);
}

#pragma mark - Encoding callback

// Called when a frame finishes encoding
void didCompressH264(void *outputCallbackRefCon, void *sourceFrameRefCon, OSStatus status, VTEncodeInfoFlags infoFlags, CMSampleBufferRef sampleBuffer)
{
    // CMSampleBufferRef = CMTime (timestamp) + CMVideoFormatDesc (image storage format) + CMBlockBuffer (encoded data)
    // sampleBuffer holds the H.264 encoded data
    NSLog(@"didCompressH264: status = %d  infoFlags = %u", (int)status, (unsigned int)infoFlags);

    // Encoding failed
    if (status != noErr) {
        return;
    }

    // The frame was dropped or its data is not ready yet
    if (sampleBuffer == NULL || !CMSampleBufferDataIsReady(sampleBuffer)) {
        NSLog(@"didCompressH264 data is not ready");
        return;
    }


    // Recover the encoder object so its Objective-C methods can be called
    H264Encoder * self = (__bridge H264Encoder*)outputCallbackRefCon;

    // Check whether the current frame is a key frame
    bool keyFrame = !CFDictionaryContainsKey(CFArrayGetValueAtIndex(CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, true), 0), kCMSampleAttachmentKey_NotSync);
    if (keyFrame) {
        // SPS: sequence parameter set; PPS: picture parameter set
        // Get the format description of the encoded image
        CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);
        // Get the SPS bytes, their length, and the parameter set count
        size_t spsCount, spsLength;
        const uint8_t *spsSet;
        OSStatus spsStatus = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format,
                                                                                0,
                                                                                &spsSet,
                                                                                &spsLength,
                                                                                &spsCount,
                                                                                0);
        if (spsStatus == noErr) {
            // Get the PPS
            size_t ppsCount, ppsLength;
            const uint8_t *ppsSet;
            OSStatus ppsStatus = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format,
                                                                                    1,
                                                                                    &ppsSet,
                                                                                    &ppsLength,
                                                                                    &ppsCount,
                                                                                    0);
            if (ppsStatus == noErr) {
                // Wrap the SPS and PPS in NSData and write them to the file
                NSData * spsData = [NSData dataWithBytes:spsSet length:spsLength];
                NSData * ppsData = [NSData dataWithBytes:ppsSet length:ppsLength];
                if (self) {
                    [self gotSpsPps:spsData pps:ppsData];
                }
            }
        }
    }


    // Get the block buffer that holds the encoded data
    CMBlockBufferRef dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    size_t length, totleLength;
    char *dataPointer;
    OSStatus blockStatus = CMBlockBufferGetDataPointer(dataBuffer,
                                                      0,
                                                      &length,
                                                      &totleLength,
                                                      &dataPointer);
    if (blockStatus == noErr) {
        size_t bufferOfSet = 0;
        // The first four bytes of each returned NALU are not the 00 00 00 01 start code
        // but the NALU length in big-endian byte order
        static const int AVCCHeaderLength = 4;
        // Walk the NALUs; written as an addition so the unsigned comparison cannot underflow
        while (bufferOfSet + AVCCHeaderLength < totleLength) {
            UInt32 NALUnitLength = 0;
            // Read the NAL unit length
            memcpy(&NALUnitLength, dataPointer + bufferOfSet, AVCCHeaderLength);

            // Convert from big-endian to host byte order
            NALUnitLength = CFSwapInt32BigToHost(NALUnitLength);

            // Stop if the declared length runs past the end of the buffer
            if (NALUnitLength > totleLength - bufferOfSet - AVCCHeaderLength) {
                break;
            }

            // Copy the NALU payload
            NSData * data = [[NSData alloc] initWithBytes:(dataPointer + AVCCHeaderLength + bufferOfSet) length:NALUnitLength];
            // Write the NALU to the file
            [self gotEncodedData:data isKeyFrame:keyFrame];

            // Advance to the next NALU
            bufferOfSet += AVCCHeaderLength + NALUnitLength;
        }
    }
}

- (void)gotSpsPps:(NSData*)sps pps:(NSData*)pps
{
    NSLog(@"gotSpsPps %lu - %lu", (unsigned long)sps.length, (unsigned long)pps.length);
    // 4-byte start code written in front of each parameter set
    const char bytres[] = "\x00\x00\x00\x01";
    size_t length = (sizeof bytres) - 1;
    NSData * byteHeader = [NSData dataWithBytes:bytres length:length];

    [self.videoFileHandle writeData:byteHeader];
    [self.videoFileHandle writeData:sps];
    [self.videoFileHandle writeData:byteHeader];
    [self.videoFileHandle writeData:pps];
}

- (void)gotEncodedData:(NSData*)data isKeyFrame:(BOOL)isKeyFrame
{
    NSLog(@"gotEncodedData = %lu", (unsigned long)data.length);

    if (self.videoFileHandle != NULL) {
        // 4-byte start code written in front of each NALU
        const char bytres[] = "\x00\x00\x00\x01";
        size_t length = (sizeof bytres) - 1;
        NSData * byteHeader = [NSData dataWithBytes:bytres length:length];
        [self.videoFileHandle writeData:byteHeader];
        [self.videoFileHandle writeData:data];
    }
}

#pragma mark - get

- (dispatch_queue_t)encodeQueue {
    if (!_encodeQueue) {
        _encodeQueue = dispatch_queue_create("encode_video_queue", DISPATCH_QUEUE_SERIAL);
    }
    return _encodeQueue;
}

- (NSFileHandle *)videoFileHandle {
    if (!_videoFileHandle) {
        NSString * filePath = [NSHomeDirectory() stringByAppendingPathComponent:@"/Documents/demo.h264"];
        [[NSFileManager defaultManager] removeItemAtPath:filePath error:nil];
        BOOL createFile = [[NSFileManager defaultManager] createFileAtPath:filePath contents:nil attributes:nil];
        NSAssert(createFile, @"create video path error");
        _videoFileHandle = [NSFileHandle fileHandleForWritingAtPath:filePath];
    }
    return _videoFileHandle;
}

@end