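The iPhone's audio quality is average and its volume is on the low side; as a device that is merely good enough to make sound, it is nothing remarkable. This post summarizes a problem I ran into when recording audio locally on iOS and generating a WAV file.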
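WAV file format error
In an earlier project I had to record audio, generate a WAV file, and submit it to the backend for voiceprint registration. When I generated the WAV file with the system API and uploaded it, the backend reported a file format error, yet the WAV files generated on Android did not trigger that error.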
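Locating the problem
- Is the WAV file generated locally on iOS malformed?
I exported the generated WAV file with iTunes File Sharing. It played normally on macOS and on Windows.
At first glance, then, the generated WAV file looked fine.
- Is the WAV upload feature broken?
I re-sent a WAV file generated on the Android side, and voiceprint registration succeeded. That ruled out the upload code on the client.
- Is the backend parsing the WAV file incorrectly?
The backend developer said the format error was not raised by his service: it only relays the request, and the error comes from Microsoft's backend (the project is an internal collaboration with Microsoft).
- Are there requirements on the recording parameters, such as sample rate or bit depth?
After asking around and a lot of buck-passing, I found the person who handles requirements with Microsoft internally (it felt like a DNS iterative query). There was no detailed documentation and no stated parameter constraints; I was told to experiment on my own.
- Compare the WAV files from the two platforms
After comparing them, I found that the WAV file generated by Apple does not start with the standard WAV header.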
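Introduction to WAV
WAVE (Waveform Audio File Format) uses the RIFF (Resource Interchange File Format) file structure.
WAV audio files are usually used to store raw PCM audio data and are commonly described as lossless audio.
Roughly speaking, a WAV file is a WAV header followed by PCM data: raw PCM wrapped in a file header. In essence, a WAV file is a RIFF file.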
WAV header
The WAV header is defined as shown in the figure below:
A typical WAV file has a 44-byte header, followed by the PCM data.
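The 44-byte layout is easier to follow as code, so here is a minimal sketch of it as a C struct (plain C, which also compiles inside an Objective-C file). The names `WavHeader` and `wav_header_make` are my own, chosen for illustration, and the struct only matches the on-disk bytes on a little-endian host, which is an assumption that holds for iOS devices.

```c
#include <stdint.h>
#include <string.h>

/* Canonical 44-byte WAV header: RIFF descriptor, "fmt " chunk, "data" chunk header.
   All multi-byte fields are little-endian on disk. */
struct WavHeader {
    char     riff_tag[4];      /* "RIFF" */
    uint32_t riff_size;        /* file size minus 8, i.e. 36 + data_size */
    char     wave_tag[4];      /* "WAVE" */
    char     fmt_tag[4];       /* "fmt " */
    uint32_t fmt_size;         /* 16 for PCM */
    uint16_t audio_format;     /* 1 = linear PCM */
    uint16_t num_channels;
    uint32_t sample_rate;
    uint32_t byte_rate;        /* sample_rate * block_align */
    uint16_t block_align;      /* num_channels * bits_per_sample / 8 */
    uint16_t bits_per_sample;
    char     data_tag[4];      /* "data" */
    uint32_t data_size;        /* number of PCM bytes that follow */
};

/* Every field is naturally aligned, so no padding is expected. */
_Static_assert(sizeof(struct WavHeader) == 44, "WAV header must be 44 bytes");

/* Hypothetical helper: fill in a header for data_size bytes of PCM audio. */
struct WavHeader wav_header_make(uint32_t sample_rate, uint16_t channels,
                                 uint16_t bits, uint32_t data_size)
{
    struct WavHeader h;
    memcpy(h.riff_tag, "RIFF", 4);
    h.riff_size = 36 + data_size;
    memcpy(h.wave_tag, "WAVE", 4);
    memcpy(h.fmt_tag, "fmt ", 4);
    h.fmt_size = 16;
    h.audio_format = 1;
    h.num_channels = channels;
    h.sample_rate = sample_rate;
    h.block_align = (uint16_t)(channels * bits / 8);
    h.byte_rate = sample_rate * h.block_align;
    h.bits_per_sample = bits;
    memcpy(h.data_tag, "data", 4);
    h.data_size = data_size;
    return h;
}
```

For 44100 Hz, 2 channels and 16 bits, this gives a block align of 4 and a byte rate of 176400.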
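Analyzing the WAV header
Use hexdump or xd to inspect the header of a WAV file:

    hexdump -n 44 m.wav   # show the first 44 bytes

For a standard file, this shows the whole 44-byte WAV header. Now look at the WAV file that an iOS recording produces, starting with the recorder settings: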

    NSMutableDictionary *settingDic = @{}.mutableCopy;
    [settingDic setObject:@(kAudioFormatLinearPCM) forKey:AVFormatIDKey];          // uncompressed linear PCM
    [settingDic setValue:@(44100) forKey:AVSampleRateKey];                         // 44.1 kHz
    [settingDic setValue:@(2) forKey:AVNumberOfChannelsKey];                       // stereo
    [settingDic setValue:@(16) forKey:AVLinearPCMBitDepthKey];                     // 16 bits per sample
    [settingDic setValue:@(AVAudioQualityHigh) forKey:AVEncoderAudioQualityKey];


    Jack-Mac-mini:~ Jack$ afinfo /Users/Jack/Desktop/Audio/test.wav
    File:           /Users/Jack/Desktop/Audio/test.wav
    File type ID:   WAVE
    Num Tracks:     1
    ----
    Data format:     2 ch,  44100 Hz, 'lpcm' (0x0000000C) 16-bit little-endian signed integer
                    no channel layout.
    estimated duration: 17.507846 sec
    audio bytes: 3088384
    audio packets: 772096
    bit rate: 1411200 bits per second
    packet size upper bound: 4
    maximum packet size: 4
    audio data file offset: 4096
    optimized
    source bit depth: I16
    ----


    Jack-Mac-mini:~ Jack$ xd -cl 4096 /Users/Jack/Desktop/Audio/test.wav
    File name: /Users/Jack/Desktop/Audio/test.wav

    00000000: 52 49 46 46 f8 2f 2f 00 57 41 56 45 4a 55 4e 4b RIFF.//.WAVEJUNK
    00000010: 1c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00000030: 66 6d 74 20 10 00 00 00 01 00 02 00 44 ac 00 00 fmt.........D...
    00000040: 10 b1 02 00 04 00 10 00 46 4c 4c 52 a8 0f 00 00 ........FLLR....
    00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    ****
    00000fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00000fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00000ff0: 00 00 00 00 00 00 00 00 64 61 74 61 00 20 2f 00 ........data../.
    Total 4096 Byte

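The multi-byte fields in this dump are little-endian, so `44 ac 00 00` reads as 0x0000AC44 = 44100. As a small sketch, the 16 bytes of the `fmt ` chunk body (offsets 0x38 to 0x47 in the dump) can be decoded as follows. `WavFormat`, `read_le16`, `read_le32` and `parse_fmt` are hypothetical helper names of mine, not part of any Apple API.

```c
#include <stdint.h>

/* Read little-endian integers byte by byte, independent of host endianness. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decoded contents of a 16-byte PCM "fmt " chunk body. */
struct WavFormat {
    uint16_t audio_format;    /* 1 = linear PCM */
    uint16_t num_channels;
    uint32_t sample_rate;
    uint32_t byte_rate;
    uint16_t block_align;
    uint16_t bits_per_sample;
};

/* The fmt chunk body copied from offsets 0x38..0x47 of the dump. */
static const uint8_t kFmtBody[16] = {
    0x01, 0x00, 0x02, 0x00, 0x44, 0xac, 0x00, 0x00,
    0x10, 0xb1, 0x02, 0x00, 0x04, 0x00, 0x10, 0x00
};

struct WavFormat parse_fmt(const uint8_t body[16])
{
    struct WavFormat f;
    f.audio_format    = read_le16(body + 0);
    f.num_channels    = read_le16(body + 2);
    f.sample_rate     = read_le32(body + 4);
    f.byte_rate       = read_le32(body + 8);
    f.block_align     = read_le16(body + 12);
    f.bits_per_sample = read_le16(body + 14);
    return f;
}
```

Calling `parse_fmt(kFmtBody)` yields format 1 (PCM), 2 channels, 44100 Hz, a byte rate of 176400, a block align of 4 and 16 bits per sample, which agrees with the afinfo output. The `fmt ` chunk itself is normal; what differs from the 44-byte layout is the extra JUNK and FLLR chunks around it.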
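The header of the generated WAV file is 4096 bytes long: the PCM data only starts at offset 4096, after the extra JUNK and FLLR chunks.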
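A reader that assumes the samples start at byte 44 would treat the padding as audio or reject the file. A more tolerant reader walks the RIFF chunk list until it reaches `data`. The following is only a sketch of that idea; `wav_find_data_offset` and the tiny `kSample` buffer are hypothetical, and how the backend in this story actually parses files is not known.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the offset of the first PCM byte of the "data" chunk, or -1 if the
   buffer is not RIFF/WAVE, is truncated, or has no "data" chunk.
   If data_size is not NULL, it receives the size stored in the chunk header. */
long wav_find_data_offset(const uint8_t *buf, size_t len, uint32_t *data_size)
{
    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 || memcmp(buf + 8, "WAVE", 4) != 0)
        return -1;

    size_t pos = 12;                       /* first chunk follows "WAVE" */
    while (len - pos >= 8) {               /* room for a tag and a size field */
        uint32_t size = read_le32(buf + pos + 4);
        if (memcmp(buf + pos, "data", 4) == 0) {
            if (data_size != NULL)
                *data_size = size;
            return (long)(pos + 8);
        }
        /* Skip this chunk; chunk bodies are padded to an even length. */
        size_t advance = 8 + (size_t)size + (size & 1u);
        if (advance > len - pos)
            return -1;                     /* chunk runs past the buffer */
        pos += advance;
    }
    return -1;
}

/* Tiny illustrative file: a 4-byte JUNK chunk, a fmt chunk, 4 bytes of data. */
static const uint8_t kSample[60] = {
    'R','I','F','F', 52,0,0,0, 'W','A','V','E',
    'J','U','N','K', 4,0,0,0,  0,0,0,0,
    'f','m','t',' ', 16,0,0,0, 1,0, 1,0, 0x40,0x1f,0,0, 0x80,0x3e,0,0, 2,0, 16,0,
    'd','a','t','a', 4,0,0,0,  0,0,0,0
};
```

For the file dumped above, the walk would visit JUNK (28 bytes), `fmt ` (16 bytes) and FLLR (4008 bytes) before finding `data`, whose payload starts at offset 4096, the same value afinfo reports as the audio data file offset.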
Conversion approach
Have AVAudioRecorder record to a PCM file, then convert the PCM data into a WAV file with the standard 44-byte header.

    // Wrap the raw PCM file at originPath in a 44-byte WAV header and write the
    // result next to it with a .wav extension. Returns the new path, or nil on failure.
    - (NSString *)transformPCM2Wav:(NSString *)originPath
    {
        NSString *outPath = [[originPath stringByDeletingPathExtension] stringByAppendingString:@".wav"];
        NSData *data = [NSData dataWithContentsOfFile:originPath];
        if (data == nil) {
            return nil; // the PCM file could not be read
        }
        BOOL isSuccess = [[self writeWavHead:data] writeToFile:outPath atomically:YES];
        return isSuccess ? outPath : nil;
    }

    // Prepend a standard 44-byte WAV header to raw PCM data.
    // SampleRateKey, NumberOfChannels and LinearPCMBitDepth are constants defined
    // elsewhere in the project; they must match the recorder settings.
    - (NSData *)writeWavHead:(NSData *)audioData
    {
        long sampleRate = SampleRateKey;
        long numOfChannelsKey = NumberOfChannels;
        Byte waveHead[44];

        // RIFF chunk descriptor
        waveHead[0] = 'R'; waveHead[1] = 'I'; waveHead[2] = 'F'; waveHead[3] = 'F';
        // The RIFF size excludes the 8 bytes of "RIFF" + size itself: 44 - 8 + data length
        long totalDatalength = (long)[audioData length] + 36;
        waveHead[4] = (Byte)(totalDatalength & 0xff);
        waveHead[5] = (Byte)((totalDatalength >> 8) & 0xff);
        waveHead[6] = (Byte)((totalDatalength >> 16) & 0xff);
        waveHead[7] = (Byte)((totalDatalength >> 24) & 0xff);
        waveHead[8] = 'W'; waveHead[9] = 'A'; waveHead[10] = 'V'; waveHead[11] = 'E';

        // "fmt " chunk
        waveHead[12] = 'f'; waveHead[13] = 'm'; waveHead[14] = 't'; waveHead[15] = ' ';
        waveHead[16] = 16; // fmt chunk size: 16 for PCM
        waveHead[17] = 0;
        waveHead[18] = 0;
        waveHead[19] = 0;
        waveHead[20] = 1;  // audio format: 1 = linear PCM
        waveHead[21] = 0;
        waveHead[22] = (Byte)numOfChannelsKey;
        waveHead[23] = 0;
        waveHead[24] = (Byte)(sampleRate & 0xff);
        waveHead[25] = (Byte)((sampleRate >> 8) & 0xff);
        waveHead[26] = (Byte)((sampleRate >> 16) & 0xff);
        waveHead[27] = (Byte)((sampleRate >> 24) & 0xff);
        long byteRate = sampleRate * numOfChannelsKey * LinearPCMBitDepth / 8;
        waveHead[28] = (Byte)(byteRate & 0xff);
        waveHead[29] = (Byte)((byteRate >> 8) & 0xff);
        waveHead[30] = (Byte)((byteRate >> 16) & 0xff);
        waveHead[31] = (Byte)((byteRate >> 24) & 0xff);
        waveHead[32] = (Byte)(numOfChannelsKey * LinearPCMBitDepth / 8); // block align
        waveHead[33] = 0;
        waveHead[34] = LinearPCMBitDepth; // bits per sample
        waveHead[35] = 0;

        // "data" chunk header
        waveHead[36] = 'd'; waveHead[37] = 'a'; waveHead[38] = 't'; waveHead[39] = 'a';
        long totalAudiolength = (long)[audioData length];
        waveHead[40] = (Byte)(totalAudiolength & 0xff);
        waveHead[41] = (Byte)((totalAudiolength >> 8) & 0xff);
        waveHead[42] = (Byte)((totalAudiolength >> 16) & 0xff);
        waveHead[43] = (Byte)((totalAudiolength >> 24) & 0xff);

        // Header first, then the PCM samples
        NSMutableData *pcmData = [[NSMutableData alloc] initWithBytes:waveHead length:sizeof(waveHead)];
        [pcmData appendData:audioData];
        return pcmData;
    }

References
- 简单分析 WAV 文件 (A Brief Analysis of WAV Files)