The One Who Studies Computers

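What I have not sensed and realized for myself, you cannot give me, and even if you did I could not hold on to it. Only what I have sensed and realized myself can I possibly do, and only what I can do is truly mine.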

A WAV format problem with iOS audio recording

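The iPhone's sound quality is average and its volume is on the low side; as a device that merely has to make some noise, it is neither good nor bad. This post summarizes a problem I ran into with the WAV files that local recording on iOS produces.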

WAV file format error

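I was working on a project that had to record audio into a WAV file and submit it to the backend for voiceprint enrollment. When I generated the WAV file with the system API and uploaded it, the backend reported a file format error, yet the WAV files generated on Android did not trigger it.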

Locating the problem

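  1. Is the WAV file generated locally on iOS malformed?
    I exported the generated WAV file with iTunes File Sharing. It played correctly on the Mac and on Windows, so as a first rough conclusion the generated WAV format looked fine.

  2. Is the WAV upload code broken?
    I resent a WAV file generated on the Android side and voiceprint enrollment succeeded, which rules out a problem in the local upload code.

  3. Is the backend parsing the WAV file incorrectly?
    The backend developer said the format error was not raised on his side. He only forwards the request; the error came from Microsoft's backend (the project is an internal collaboration with Microsoft).

  4. Are there requirements on the recording parameters, such as sample rate or bit depth?
    After asking around a lot and a fair amount of buck-passing, I found the person on our side who handles requirements with Microsoft (it felt like an iterative DNS query). There was no detailed documentation and no stated parameter constraints; I was told to experiment on my own.

  5. Analyze the differences between the two WAV files
    After comparing them, I found that the WAV file Apple generates does not start with the standard WAV header.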

WAV overview

WAVE (Waveform Audio File Format) uses the RIFF (Resource Interchange File Format) file structure.

WAV audio files are usually used to store raw audio data in PCM format and are commonly described as lossless audio.

Roughly speaking, a WAV audio file is a WAV header followed by PCM data: the raw PCM samples wrapped in a file header. A WAV file is essentially a RIFF file.

WAV header

The header of a WAV audio file is defined as shown in the figure below:

A typical WAV file has a 44-byte header, and the PCM data follows it.
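The post does not say how the backend parses uploads, so the following is only a guess at the kind of strict check that accepts nothing but the canonical 44-byte layout. It is a minimal C sketch (plain C compiles unchanged inside an Objective-C project); `WavInfo`, `read_le16`, `read_le32` and `wav_parse_canonical` are illustrative names, not part of any SDK.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fields of the canonical 44-byte header (struct and names are illustrative). */
typedef struct {
    uint16_t audio_format;    /* offset 20: 1 = linear PCM */
    uint16_t channels;        /* offset 22 */
    uint32_t sample_rate;     /* offset 24 */
    uint32_t byte_rate;       /* offset 28: sample_rate * block_align */
    uint16_t block_align;     /* offset 32: channels * bits_per_sample / 8 */
    uint16_t bits_per_sample; /* offset 34 */
    uint32_t data_size;       /* offset 40: number of PCM bytes */
} WavInfo;

/* WAV stores every multi-byte number in little-endian order. */
static uint16_t read_le16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 if hdr has the canonical layout ("fmt " at offset 12 and
 * "data" at offset 36), -1 otherwise. */
int wav_parse_canonical(const uint8_t hdr[44], WavInfo *out) {
    assert(hdr != NULL && out != NULL);
    if (memcmp(hdr + 0, "RIFF", 4) != 0 || memcmp(hdr + 8, "WAVE", 4) != 0 ||
        memcmp(hdr + 12, "fmt ", 4) != 0 || memcmp(hdr + 36, "data", 4) != 0) {
        return -1;
    }
    out->audio_format    = read_le16(hdr + 20);
    out->channels        = read_le16(hdr + 22);
    out->sample_rate     = read_le32(hdr + 24);
    out->byte_rate       = read_le32(hdr + 28);
    out->block_align     = read_le16(hdr + 32);
    out->bits_per_sample = read_le16(hdr + 34);
    out->data_size       = read_le32(hdr + 40);
    return 0;
}
```

A parser written this way fails as soon as any other chunk sits between `WAVE` and `fmt `, even though such a file can still be valid RIFF.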

Analyzing the WAV header

Use hexdump or xd to look at the header of a WAV file:

hexdump -n 44 m.wav      # show the first 44 bytes

That is the standard 44-byte WAV header. Now let's look at the WAV file that iOS recording produces, starting with the recorder settings:

// Recorder settings: 16-bit linear PCM, 44.1 kHz, stereo.
NSMutableDictionary *settingDic = [NSMutableDictionary dictionary];
settingDic[AVFormatIDKey] = @(kAudioFormatLinearPCM);
settingDic[AVSampleRateKey] = @(44100);
settingDic[AVNumberOfChannelsKey] = @(2);
settingDic[AVLinearPCMBitDepthKey] = @(16);
settingDic[AVEncoderAudioQualityKey] = @(AVAudioQualityHigh);

Jack-Mac-mini:~ Jack$ afinfo /Users/Jack/Desktop/Audio/test.wav
File: /Users/Jack/Desktop/Audio/test.wav
File type ID: WAVE
Num Tracks: 1
----
Data format: 2 ch, 44100 Hz, 'lpcm' (0x0000000C) 16-bit little-endian signed integer
no channel layout.
estimated duration: 17.507846 sec
audio bytes: 3088384
audio packets: 772096
bit rate: 1411200 bits per second
packet size upper bound: 4
maximum packet size: 4
audio data file offset: 4096
optimized
source bit depth: I16
----
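The numbers afinfo reports follow directly from the recorder settings and the size of the audio data. The C sketch below redoes that arithmetic so the output can be cross-checked; `PcmStats` and `pcm_stats` are hypothetical helper names introduced only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values derived from the PCM parameters (illustrative struct). */
typedef struct {
    uint32_t block_align; /* bytes per frame: one sample for every channel */
    uint32_t byte_rate;   /* bytes per second */
    uint32_t bit_rate;    /* bits per second */
    uint64_t frames;      /* number of frames ("audio packets" in afinfo) */
    uint64_t duration_ms; /* duration in milliseconds, rounded down */
} PcmStats;

PcmStats pcm_stats(uint32_t sample_rate, uint32_t channels,
                   uint32_t bit_depth, uint64_t audio_bytes) {
    assert(sample_rate > 0 && channels > 0);
    assert(bit_depth >= 8 && bit_depth % 8 == 0);
    PcmStats s;
    s.block_align = channels * bit_depth / 8;
    s.byte_rate   = sample_rate * s.block_align;
    s.bit_rate    = s.byte_rate * 8;
    s.frames      = audio_bytes / s.block_align;
    s.duration_ms = audio_bytes * 1000 / s.byte_rate;
    return s;
}
```

For 44100 Hz, 2 channels, 16 bits and 3088384 audio bytes this gives a block size of 4 bytes, 1411200 bits per second, 772096 frames and roughly 17.5 seconds, which matches the afinfo listing. The audio data itself therefore looks consistent; only its position in the file is unusual.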
Jack-Mac-mini:~ Jack$ xd -cl  4096  /Users/Jack/Desktop/Audio/test.wav
File name: /Users/Jack/Desktop/Audio/test.wav

00000000: 52 49 46 46 f8 2f 2f 00 57 41 56 45 4a 55 4e 4b RIFF.//.WAVEJUNK
00000010: 1c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000030: 66 6d 74 20 10 00 00 00 01 00 02 00 44 ac 00 00 fmt.........D...
00000040: 10 b1 02 00 04 00 10 00 46 4c 4c 52 a8 0f 00 00 ........FLLR....
00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
****
00000fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000ff0: 00 00 00 00 00 00 00 00 64 61 74 61 00 20 2f 00 ........data../.
Total 4096 Byte

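The header of the generated WAV file is 4096 bytes long: extra JUNK and FLLR chunks surround the fmt chunk, and the PCM data only starts at offset 4096.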
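The dump shows that the file is still well-formed RIFF: every chunk is a 4-byte id, a 4-byte little-endian size, and a payload. A parser that walks the chunks instead of assuming fixed offsets would find the audio data regardless of padding. The C sketch below illustrates that idea; `wav_find_chunk` and `riff_le32` are hypothetical names, and the function only inspects a buffer the caller has already read (for the file above, the first 4096 bytes are enough to reach the `data` chunk header).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reads a 32-bit little-endian value. */
static uint32_t riff_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Walks the chunks that follow "RIFF <size> WAVE" and reports where the
 * payload of the chunk named `id` starts and how large it claims to be.
 * Returns 0 on success, -1 if the buffer is not RIFF/WAVE or the chunk
 * header is not inside the buffer. */
int wav_find_chunk(const uint8_t *buf, size_t len, const char id[4],
                   size_t *payload_offset, uint32_t *payload_size) {
    assert(buf != NULL && id != NULL);
    assert(payload_offset != NULL && payload_size != NULL);
    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 ||
        memcmp(buf + 8, "WAVE", 4) != 0) {
        return -1;
    }
    size_t pos = 12; /* first chunk header after "WAVE" */
    while (len - pos >= 8) {
        uint32_t size = riff_le32(buf + pos + 4);
        if (memcmp(buf + pos, id, 4) == 0) {
            *payload_offset = pos + 8;
            *payload_size = size;
            return 0;
        }
        /* Chunks are word aligned: an odd size is followed by a pad byte. */
        uint64_t step = 8u + (uint64_t)size + (size & 1u);
        if (step > (uint64_t)(len - pos)) {
            return -1; /* the next chunk header lies outside the buffer */
        }
        pos += (size_t)step;
    }
    return -1;
}
```

Applied to the header dumped above, the walk visits JUNK (28 bytes), fmt (16 bytes) and FLLR (4008 bytes) before it reaches `data` with a payload offset of 4096, the same offset afinfo reports.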

Conversion approach

Record with AVAudioRecorder into a raw PCM file, then convert the PCM data into a WAV file with the standard 44-byte header.

/**
 Converts a raw PCM file into a WAV file.

 @param originPath Path of the PCM file
 @return Path of the WAV file, or nil on failure
 */
- (NSString *)transformPCM2Wav:(NSString *)originPath
{
    NSString *outPath = [[originPath stringByDeletingPathExtension] stringByAppendingString:@".wav"];
    NSData *data = [NSData dataWithContentsOfFile:originPath];
    if (data == nil) {
        return nil; // the PCM file could not be read
    }
    BOOL isSuccess = [[self writeWavHead:data] writeToFile:outPath atomically:YES];
    return isSuccess ? outPath : nil;
}

// Prepends a 44-byte WAV header to the PCM data.
// SampleRateKey, NumberOfChannels and LinearPCMBitDepth are constants defined
// elsewhere in the project; they must match the recorder settings.
- (NSData *)writeWavHead:(NSData *)audioData {
    long sampleRate = SampleRateKey;
    long numOfChannelsKey = NumberOfChannels;
    Byte waveHead[44];
    waveHead[0] = 'R';
    waveHead[1] = 'I';
    waveHead[2] = 'F';
    waveHead[3] = 'F';

    // RIFF chunk size = file size minus the 8 bytes of "RIFF" and this field,
    // i.e. data length + 36 (not + 44).
    long riffChunkSize = [audioData length] + 36;
    waveHead[4] = (Byte)(riffChunkSize & 0xff);
    waveHead[5] = (Byte)((riffChunkSize >> 8) & 0xff);
    waveHead[6] = (Byte)((riffChunkSize >> 16) & 0xff);
    waveHead[7] = (Byte)((riffChunkSize >> 24) & 0xff);

    waveHead[8] = 'W';
    waveHead[9] = 'A';
    waveHead[10] = 'V';
    waveHead[11] = 'E';

    waveHead[12] = 'f';
    waveHead[13] = 'm';
    waveHead[14] = 't';
    waveHead[15] = ' ';

    waveHead[16] = 16; // size of the 'fmt ' chunk payload
    waveHead[17] = 0;
    waveHead[18] = 0;
    waveHead[19] = 0;

    waveHead[20] = 1; // audio format: 1 = linear PCM
    waveHead[21] = 0;

    waveHead[22] = (Byte)numOfChannelsKey; // number of channels
    waveHead[23] = 0;

    waveHead[24] = (Byte)(sampleRate & 0xff);
    waveHead[25] = (Byte)((sampleRate >> 8) & 0xff);
    waveHead[26] = (Byte)((sampleRate >> 16) & 0xff);
    waveHead[27] = (Byte)((sampleRate >> 24) & 0xff);

    // bytes per second = sample rate * block size
    long byteRate = sampleRate * numOfChannelsKey * LinearPCMBitDepth / 8;
    waveHead[28] = (Byte)(byteRate & 0xff);
    waveHead[29] = (Byte)((byteRate >> 8) & 0xff);
    waveHead[30] = (Byte)((byteRate >> 16) & 0xff);
    waveHead[31] = (Byte)((byteRate >> 24) & 0xff);

    // block size in bytes = number of channels * bits per sample / 8
    waveHead[32] = (Byte)(numOfChannelsKey * LinearPCMBitDepth / 8);
    waveHead[33] = 0;

    waveHead[34] = LinearPCMBitDepth; // bits per sample
    waveHead[35] = 0;

    waveHead[36] = 'd';
    waveHead[37] = 'a';
    waveHead[38] = 't';
    waveHead[39] = 'a';

    // size of the PCM data in bytes
    long totalAudiolength = [audioData length];

    waveHead[40] = (Byte)(totalAudiolength & 0xff);
    waveHead[41] = (Byte)((totalAudiolength >> 8) & 0xff);
    waveHead[42] = (Byte)((totalAudiolength >> 16) & 0xff);
    waveHead[43] = (Byte)((totalAudiolength >> 24) & 0xff);

    // header first, then the untouched PCM samples
    NSMutableData *wavData = [[NSMutableData alloc] initWithBytes:waveHead length:sizeof(waveHead)];
    [wavData appendData:audioData];
    return wavData;
}

References

  1. 简单分析 WAV 文件 (A simple analysis of WAV files)