Server side: encoding audio and video data to H264 and AAC
This part took a lot of time. I had no background in this area and had to dig through a lot of material, yet solid resources on encoding with VideoToolbox and AudioToolbox are scarce: a web search turns up what looks like plenty of results, but on closer inspection they are mostly the same few articles recycled. I recommend going straight to the official documentation.
Downloads
GitHub:
Client: https://github.com/AmoAmoAmo/Smart_Device_Client
Server: https://github.com/AmoAmoAmo/Smart_Device_Server
I also wrote a macOS version of the server, but it still has some issues; if you are interested, take a look: https://github.com/AmoAmoAmo/Server_Mac
Encoding video data to H264 with VideoToolbox
Initialization: creating the session
// ----- 1. Create the session -----
int width = 640, height = 480;
OSStatus status = VTCompressionSessionCreate(NULL, width, height,
                                             kCMVideoCodecType_H264,
                                             NULL, NULL, NULL,
                                             didCompressH264,
                                             (__bridge void *)(self),
                                             &EncodingSession);
NSLog(@"H264: VTCompressionSessionCreate %d", (int)status);
if (status != 0)
{
    NSLog(@"H264: failed to create session");
    return;
}

// ----- 2. Configure the session -----
// Real-time encoding output (avoids buffering latency)
VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);
VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_ProfileLevel, kVTProfileLevel_H264_Baseline_AutoLevel);

// Keyframe (GOP size) interval
int frameInterval = 10;
CFNumberRef frameIntervalRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &frameInterval);
VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_MaxKeyFrameInterval, frameIntervalRef);

// Expected frame rate
int fps = 10;
CFNumberRef fpsRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &fps);
VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_ExpectedFrameRate, fpsRef);

// Average bit rate, in bits per second
int bitRate = width * height * 3 * 4 * 8;
CFNumberRef bitRateRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bitRate);
VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_AverageBitRate, bitRateRef);

// Hard cap on the data rate. Note that kVTCompressionPropertyKey_DataRateLimits
// expects a CFArray of [bytes, seconds] pairs, not a bare CFNumber.
int bitRateLimit = width * height * 3 * 4; // bytes
int secondsLimit = 1;                      // per second
CFNumberRef bitRateLimitRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bitRateLimit);
CFNumberRef secondsLimitRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &secondsLimit);
const void *limitValues[] = { bitRateLimitRef, secondsLimitRef };
CFArrayRef dataRateLimits = CFArrayCreate(kCFAllocatorDefault, limitValues, 2, &kCFTypeArrayCallBacks);
VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_DataRateLimits, dataRateLimits);

// Tell the encoder to start encoding
VTCompressionSessionPrepareToEncodeFrames(EncodingSession);
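For reference, here is a minimal sketch of the instance state these snippets assume. The names EncodingSession and frameID are taken from the code; the class extension itself is illustrative, not the repo's verbatim declaration:

// A sketch of the ivars used throughout this section (names from the code above)
@interface HJH264Encoder () {
    VTCompressionSessionRef EncodingSession; // created in step 1
    int frameID;                             // monotonically increasing frame counter
}
@end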
Encoding-completion callback
The encoded H264 data is delivered through this callback.
void didCompressH264(void *outputCallbackRefCon, void *sourceFrameRefCon, OSStatus status, VTEncodeInfoFlags infoFlags, CMSampleBufferRef sampleBuffer)
{
    if (status != 0) {
        return;
    }
    if (!CMSampleBufferDataIsReady(sampleBuffer)) {
        NSLog(@"didCompressH264 data is not ready ");
        return;
    }
    HJH264Encoder *encoder = (__bridge HJH264Encoder *)(outputCallbackRefCon);

    // ----- Extract SPS and PPS from keyframes -----
    // A frame is a keyframe if its attachments lack kCMSampleAttachmentKey_NotSync
    CFDictionaryRef attachments = (CFDictionaryRef)CFArrayGetValueAtIndex(CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, true), 0);
    bool keyframe = !CFDictionaryContainsKey(attachments, kCMSampleAttachmentKey_NotSync);
    if (keyframe) {
        CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);
        size_t sparameterSetSize, sparameterSetCount;
        const uint8_t *sparameterSet;
        OSStatus statusCode = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 0, &sparameterSet, &sparameterSetSize, &sparameterSetCount, 0);
        if (statusCode == noErr) {
            // Found the SPS; now check for the PPS
            size_t pparameterSetSize, pparameterSetCount;
            const uint8_t *pparameterSet;
            statusCode = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 1, &pparameterSet, &pparameterSetSize, &pparameterSetCount, 0);
            if (statusCode == noErr) {
                // Found the PPS; hand both parameter sets to the encoder object
                NSData *sps = [NSData dataWithBytes:sparameterSet length:sparameterSetSize];
                NSData *pps = [NSData dataWithBytes:pparameterSet length:pparameterSetSize];
                if (encoder) {
                    [encoder gotSpsPps:sps pps:pps];
                }
            }
        }
    }

    // --------- Emit the encoded data ----------
    CMBlockBufferRef dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    size_t length, totalLength;
    char *dataPointer;
    OSStatus statusCodeRet = CMBlockBufferGetDataPointer(dataBuffer, 0, &length, &totalLength, &dataPointer);
    if (statusCodeRet == noErr) {
        size_t bufferOffset = 0;
        static const int AVCCHeaderLength = 4;
        // The buffer is in AVCC format: each NAL unit is prefixed not with a
        // 00 00 00 01 start code but with its length in big-endian byte order.
        // Loop over the NAL units in the block buffer.
        while (bufferOffset < totalLength - AVCCHeaderLength) {
            uint32_t NALUnitLength = 0;
            // Read the NAL unit length
            memcpy(&NALUnitLength, dataPointer + bufferOffset, AVCCHeaderLength);
            // Convert from big-endian to host byte order
            NALUnitLength = CFSwapInt32BigToHost(NALUnitLength);
            NSData *data = [[NSData alloc] initWithBytes:(dataPointer + bufferOffset + AVCCHeaderLength) length:NALUnitLength];
            [encoder gotEncodedData:data isKeyFrame:keyframe];
            // Move to the next NAL unit in the block buffer
            bufferOffset += AVCCHeaderLength + NALUnitLength;
        }
    }
}
Feeding frames into the encoder
- (void)encode:(CMSampleBufferRef)sampleBuffer
{
    CVImageBufferRef imageBuffer = (CVImageBufferRef)CMSampleBufferGetImageBuffer(sampleBuffer);
    // Presentation timestamp; without it the timeline stretches out.
    // CMTimeMake(value, timescale): value / timescale = time in seconds
    CMTime presentationTimeStamp = CMTimeMake(frameID++, 1000);
    VTEncodeInfoFlags flags;
    OSStatus statusCode = VTCompressionSessionEncodeFrame(EncodingSession,
                                                          imageBuffer,
                                                          presentationTimeStamp,
                                                          kCMTimeInvalid,
                                                          NULL, NULL, &flags);
    if (statusCode != noErr) {
        NSLog(@"H264: VTCompressionSessionEncodeFrame failed with %d", (int)statusCode);
        VTCompressionSessionInvalidate(EncodingSession);
        CFRelease(EncodingSession);
        EncodingSession = NULL;
        return;
    }
}
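For context, encode: is typically driven from the AVCaptureVideoDataOutput delegate. A minimal sketch, assuming the capturing class owns a videoEncoder property and the capture session is configured elsewhere:

// A sketch: hand each captured video frame to the encoder.
// `videoEncoder` (an HJH264Encoder instance) is an assumed property.
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
    [self.videoEncoder encode:sampleBuffer];
}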
You can then pick up the encoded data in the callback above and send it to the client over a socket.
Remember to test and log at every stage; otherwise, hunting down bugs later will be painful.
A useful check here is to write the encoded data to a local file and open it with VLC to verify that the stream plays correctly.
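For that check, the two encoder methods invoked by the callback just need to prepend the Annex-B start code 00 00 00 01 and append the bytes to a file. A minimal sketch, where the fileHandle property (an NSFileHandle opened for writing) and the exact method bodies are assumptions rather than the repo's verbatim code:

// Write SPS/PPS and each NAL unit in Annex-B format so VLC can play the file.
- (void)gotSpsPps:(NSData *)sps pps:(NSData *)pps
{
    const char bytes[] = "\x00\x00\x00\x01";
    size_t length = (sizeof bytes) - 1; // drop the string literal's trailing '\0'
    NSData *startCode = [NSData dataWithBytes:bytes length:length];
    [self.fileHandle writeData:startCode];
    [self.fileHandle writeData:sps];
    [self.fileHandle writeData:startCode];
    [self.fileHandle writeData:pps];
}

- (void)gotEncodedData:(NSData *)data isKeyFrame:(BOOL)isKeyFrame
{
    const char bytes[] = "\x00\x00\x00\x01";
    size_t length = (sizeof bytes) - 1;
    NSData *startCode = [NSData dataWithBytes:bytes length:length];
    [self.fileHandle writeData:startCode];
    [self.fileHandle writeData:data];
}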
Finally, don't forget to shut down the encoder:
- (void)EndVideoToolBox
{
    VTCompressionSessionCompleteFrames(EncodingSession, kCMTimeInvalid);
    VTCompressionSessionInvalidate(EncodingSession);
    CFRelease(EncodingSession);
    EncodingSession = NULL;
}
Note: the process of encoding with VideoToolbox on macOS is covered in this post:
VideoToolbox video encoding: notes on encoding captured video on macOS, and converting YUV422 to YUV420
Encoding audio data to AAC with AudioToolbox
Setting up the encoder parameters
- (void)setupEncoderFromSampleBuffer:(CMSampleBufferRef)sampleBuffer
{
    AudioStreamBasicDescription inAudioStreamBasicDescription = *CMAudioFormatDescriptionGetStreamBasicDescription((CMAudioFormatDescriptionRef)CMSampleBufferGetFormatDescription(sampleBuffer));

    // Zero-initialize the output stream description; this matters.
    AudioStreamBasicDescription outAudioStreamBasicDescription = {0};
    // Sample rate at normal playback speed; for compressed formats this is the
    // decompressed rate. Must not be 0.
    outAudioStreamBasicDescription.mSampleRate = inAudioStreamBasicDescription.mSampleRate;
    outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC;    // target encoding format
    outAudioStreamBasicDescription.mFormatFlags = kMPEG4Object_AAC_LC; // AAC Low Complexity profile
    // Bytes per packet; 0 for variable-size packets, in which case
    // AudioStreamPacketDescription determines each packet's size.
    outAudioStreamBasicDescription.mBytesPerPacket = 0;
    // Frames per packet: 1 for uncompressed audio; a fixed larger value for
    // constant-frame formats (1024 for AAC); 0 for variable frame counts (e.g. Ogg).
    outAudioStreamBasicDescription.mFramesPerPacket = 1024;
    // Bytes per frame (start of one frame to the start of the next); 0 for compressed formats.
    outAudioStreamBasicDescription.mBytesPerFrame = 0;
    outAudioStreamBasicDescription.mChannelsPerFrame = 1; // number of channels
    outAudioStreamBasicDescription.mBitsPerChannel = 0;   // 0 for compressed formats
    outAudioStreamBasicDescription.mReserved = 0;         // pads to an 8-byte boundary; must be 0

    AudioClassDescription *description = [self getAudioClassDescriptionWithType:kAudioFormatMPEG4AAC
                                                               fromManufacturer:kAppleSoftwareAudioCodecManufacturer]; // software encoder

    // Create the converter
    OSStatus status = AudioConverterNewSpecific(&inAudioStreamBasicDescription,
                                                &outAudioStreamBasicDescription,
                                                1, description, &_audioConverter);
    if (status != 0) {
        NSLog(@"setup converter: %d", (int)status);
    }
}
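For reference, a sketch of the state the audio snippets assume. The identifiers come from the code in this section; the queue names, buffer size, and init body are illustrative assumptions:

// A sketch of AACEncoder's assumed state (ivar names from the code here)
@interface AACEncoder () {
    AudioConverterRef _audioConverter;  // created lazily on the first buffer
    uint8_t *_aacBuffer;                // output buffer for one encoded AAC packet
    size_t _aacBufferSize;
    char *_pcmBuffer;                   // PCM handed over to the converter
    size_t _pcmBufferSize;
    dispatch_queue_t _encoderQueue;
    dispatch_queue_t _callbackQueue;
}
@end

- (instancetype)init
{
    if (self = [super init]) {
        _encoderQueue  = dispatch_queue_create("AACEncoderQueue", DISPATCH_QUEUE_SERIAL);
        _callbackQueue = dispatch_queue_create("AACEncoderCallbackQueue", DISPATCH_QUEUE_SERIAL);
        _aacBufferSize = 1024;          // illustrative; large enough for one packet
        _aacBuffer = malloc(_aacBufferSize * sizeof(uint8_t));
    }
    return self;
}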
Looking up the codec
- (AudioClassDescription *)getAudioClassDescriptionWithType:(UInt32)type
                                           fromManufacturer:(UInt32)manufacturer
{
    static AudioClassDescription desc;

    UInt32 encoderSpecifier = type;
    OSStatus st;
    UInt32 size;
    st = AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders,
                                    sizeof(encoderSpecifier),
                                    &encoderSpecifier,
                                    &size);
    if (st) {
        NSLog(@"error getting audio format property info: %d", (int)(st));
        return nil;
    }

    unsigned int count = size / sizeof(AudioClassDescription);
    AudioClassDescription descriptions[count];
    st = AudioFormatGetProperty(kAudioFormatProperty_Encoders,
                                sizeof(encoderSpecifier),
                                &encoderSpecifier,
                                &size,
                                descriptions);
    if (st) {
        NSLog(@"error getting audio format property: %d", (int)(st));
        return nil;
    }

    for (unsigned int i = 0; i < count; i++) {
        if ((type == descriptions[i].mSubType) && (manufacturer == descriptions[i].mManufacturer)) {
            memcpy(&desc, &(descriptions[i]), sizeof(desc));
            return &desc;
        }
    }
    return nil;
}
Passing captured audio data to the encoder
- (void)encodeSampleBuffer:(CMSampleBufferRef)sampleBuffer
           completionBlock:(void (^)(NSData *encodedData, NSError *error))completionBlock
{
    CFRetain(sampleBuffer);
    dispatch_async(_encoderQueue, ^{
        if (!_audioConverter) {
            [self setupEncoderFromSampleBuffer:sampleBuffer];
        }
        CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
        CFRetain(blockBuffer);

        // Get _pcmBuffer and _pcmBufferSize via CMBlockBufferGetDataPointer
        OSStatus status = CMBlockBufferGetDataPointer(blockBuffer, 0, NULL, &_pcmBufferSize, &_pcmBuffer);
        NSError *error = nil;
        if (status != kCMBlockBufferNoErr) {
            error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
        }

        memset(_aacBuffer, 0, _aacBufferSize);
        AudioBufferList outAudioBufferList = {0};
        outAudioBufferList.mNumberBuffers = 1;
        outAudioBufferList.mBuffers[0].mNumberChannels = 1;
        outAudioBufferList.mBuffers[0].mDataByteSize = (int)_aacBufferSize;
        outAudioBufferList.mBuffers[0].mData = _aacBuffer;
        AudioStreamPacketDescription *outPacketDescription = NULL;
        UInt32 ioOutputDataPacketSize = 1;

        // Produces a buffer list of output data from the AudioConverter; the
        // supplied input callback is invoked whenever more PCM input is needed.
        status = AudioConverterFillComplexBuffer(_audioConverter,
                                                 inInputDataProc,
                                                 (__bridge void *)(self),
                                                 &ioOutputDataPacketSize,
                                                 &outAudioBufferList,
                                                 outPacketDescription);
        NSData *data = nil;
        if (status == 0) {
            // Prepend an ADTS header so each raw AAC packet is self-describing
            NSData *rawAAC = [NSData dataWithBytes:outAudioBufferList.mBuffers[0].mData
                                            length:outAudioBufferList.mBuffers[0].mDataByteSize];
            NSData *adtsHeader = [self adtsDataForPacketLength:rawAAC.length];
            NSMutableData *fullData = [NSMutableData dataWithData:adtsHeader];
            [fullData appendData:rawAAC];
            data = fullData;
        } else {
            error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
        }

        if (completionBlock) {
            dispatch_async(_callbackQueue, ^{
                completionBlock(data, error);
            });
        }
        CFRelease(sampleBuffer);
        CFRelease(blockBuffer);
    });
}
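The adtsDataForPacketLength: method called above is not shown in this section. A typical implementation is sketched below, assuming 44.1 kHz mono AAC-LC; the profile, sampling-frequency index, and channel configuration must match the converter's output format:

// Build the 7-byte ADTS header that precedes each raw AAC packet.
- (NSData *)adtsDataForPacketLength:(NSUInteger)packetLength
{
    int adtsLength = 7;
    char *packet = malloc(sizeof(char) * adtsLength);
    int profile = 2;   // AAC LC
    int freqIdx = 4;   // 44.1 kHz
    int chanCfg = 1;   // mono, front-center
    NSUInteger fullLength = adtsLength + packetLength;
    packet[0] = (char)0xFF; // syncword, high 8 bits
    packet[1] = (char)0xF9; // syncword low 4 bits, MPEG-2 ID, layer 00, no CRC
    packet[2] = (char)(((profile - 1) << 6) + (freqIdx << 2) + (chanCfg >> 2));
    packet[3] = (char)(((chanCfg & 3) << 6) + (fullLength >> 11));
    packet[4] = (char)((fullLength & 0x7FF) >> 3);
    packet[5] = (char)(((fullLength & 7) << 5) + 0x1F);
    packet[6] = (char)0xFC;
    return [NSData dataWithBytesNoCopy:packet length:adtsLength freeWhenDone:YES];
}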
The converter's input callback
OSStatus inInputDataProc(AudioConverterRef inAudioConverter,
                         UInt32 *ioNumberDataPackets,
                         AudioBufferList *ioData,
                         AudioStreamPacketDescription **outDataPacketDescription,
                         void *inUserData)
{
    AACEncoder *encoder = (__bridge AACEncoder *)(inUserData);
    UInt32 requestedPackets = *ioNumberDataPackets;

    size_t copiedSamples = [encoder copyPCMSamplesIntoBuffer:ioData];
    if (copiedSamples < requestedPackets) {
        // The PCM buffer isn't full yet
        *ioNumberDataPackets = 0;
        return -1;
    }
    *ioNumberDataPackets = 1;
    return noErr;
}

/**
 * Hand the buffered PCM to the converter
 */
- (size_t)copyPCMSamplesIntoBuffer:(AudioBufferList *)ioData
{
    size_t originalBufferSize = _pcmBufferSize;
    if (!originalBufferSize) {
        return 0;
    }
    ioData->mBuffers[0].mData = _pcmBuffer;
    ioData->mBuffers[0].mDataByteSize = (int)_pcmBufferSize;
    _pcmBuffer = NULL;
    _pcmBufferSize = 0;
    return originalBufferSize;
}
Finally, release the encoder where appropriate:
- (void)dealloc
{
    AudioConverterDispose(_audioConverter);
    free(_aacBuffer);
}
References
1. http://www.jianshu.com/p/9febe519732a#comment-13802063
2. http://www.jianshu.com/p/a671f5b17fc1
3. http://blog.csdn.net/hard_man/article/details/53511026
4. https://developer.apple.com/documentation/videotoolbox
Related articles
An iOS-Based Real-Time Network Audio/Video Transmission System (3): Encoding audio and video data to H264 and AAC with VideoToolbox
An iOS-Based Real-Time Network Audio/Video Transmission System (4): A custom socket protocol (TCP and UDP)
An iOS-Based Real-Time Network Audio/Video Transmission System (5): Hardware-decoding H264 with VideoToolbox
An iOS-Based Real-Time Network Audio/Video Transmission System (6): Playing audio with AudioQueue and rendering images with OpenGL