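Background
Decibels
The decibel (dB) is a logarithmic unit, commonly used to describe sound levels.
Suppose there are two sound sources A and B, where the power P2 of source B is twice the power P1 of source A, i.e. P2/P1 = 2.
Then, with everything else equal (frequency of the sound, listening distance), the difference in level between the two sounds is
10 log (P2/P1) = 10 log 2 ≈ 3 dB // sounds whose power differs by a factor of two differ in level by about 3 dB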
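The 3 dB figure can be checked numerically. The snippet below is a minimal sketch; the function name power_ratio_to_db is made up for this illustration and is not part of the original post.

```python
import math

def power_ratio_to_db(p2, p1):
    """Level difference in dB between two powers (hypothetical helper)."""
    return 10.0 * math.log10(p2 / p1)

print(round(power_ratio_to_db(2.0, 1.0), 2))   # 3.01
print(round(power_ratio_to_db(10.0, 1.0), 2))  # 10.0
```

So "3 dB" is a rounded value: doubling the power adds about 3.01 dB, while a tenfold increase adds exactly 10 dB.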
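But sometimes we see 20 log() instead. What is going on?
The 20 log form is normally used with sound pressure. Power is proportional to the square of the sound pressure, so the two forms are equivalent: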
20 log (p2/p1) dB = 10 log (p2^2/p1^2) dB = 10 log (P2/P1) dB
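A quick numerical check of this identity, as a sketch: both helper names are hypothetical, and the power ratio is taken to be exactly the square of the pressure ratio, as the text assumes.

```python
import math

def pressure_ratio_to_db(p2, p1):
    """Level difference from a sound-pressure ratio (20 log form)."""
    return 20.0 * math.log10(p2 / p1)

def power_ratio_to_db(P2, P1):
    """Level difference from a power ratio (10 log form)."""
    return 10.0 * math.log10(P2 / P1)

pressure_ratio = 3.0
from_pressure = pressure_ratio_to_db(pressure_ratio, 1.0)
from_power = power_ratio_to_db(pressure_ratio ** 2, 1.0)  # power ~ pressure squared

# Both forms give the same level difference (up to floating-point rounding).
print(math.isclose(from_pressure, from_power))
```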
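Standard sound level
Reference sound pressure: 20 μPa, regarded as the threshold of human hearing
What does 0 dB mean?
sound level = 20 log (p_measured/p_ref) = 20 log 1 = 0 dB
It only means that the measured sound pressure happens to equal the reference pressure of 20 μPa. It does not mean there is no sound: this pressure sits at the limit of what the human ear can perceive, but the vibration still exists. Likewise, -20 dB indicates an even weaker vibration, with a sound pressure only 1/10 of the reference.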
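As a small sketch of the same formula, the function below converts a pressure in pascals to a sound level relative to 20 μPa. The names P_REF and spl_db are chosen for this illustration only.

```python
import math

P_REF = 20e-6  # reference sound pressure in Pa (20 μPa)

def spl_db(p_measured, p_ref=P_REF):
    """Sound level in dB relative to p_ref (hypothetical helper)."""
    return 20.0 * math.log10(p_measured / p_ref)

print(spl_db(20e-6))           # 0.0, pressure equals the reference
print(round(spl_db(2e-6), 2))  # -20.0, one tenth of the reference
print(round(spl_db(1.0), 2))   # 93.98, a pressure of 1 Pa
```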
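Sound and distance
Suppose the source radiates a total power P uniformly in all directions, and let I be the power received per unit area (the intensity) at distance r:
I = P/(4πr^2)
So I is inversely proportional to the square of the distance: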
I2/I1 = (r1^2)/(r2^2)
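In other words: if we double the distance, the sound pressure is halved, the intensity falls to one quarter, and the sound level drops by about 6 dB (10 log (1/4) ≈ -6 dB).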
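The inverse-square relation can be tried out with a short sketch. It assumes an ideal point source radiating uniformly with no reflections or absorption; the function names are made up for this example.

```python
import math

def intensity(power, r):
    """Intensity at distance r from a source radiating uniformly."""
    return power / (4.0 * math.pi * r ** 2)

def level_change_db(r1, r2):
    """Change in sound level when moving from distance r1 to r2."""
    return 10.0 * math.log10((r1 / r2) ** 2)

# Doubling the distance: intensity falls to a quarter, level drops ~6 dB.
print(round(intensity(1.0, 2.0) / intensity(1.0, 1.0), 4))  # 0.25
print(round(level_change_db(1.0, 2.0), 2))                  # -6.02
```

A tenfold increase in distance gives level_change_db(1.0, 10.0), which is -20 dB under the same assumptions.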
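Volume
Volume represents the intensity of a sound. It can be measured from the amplitude of the signal within a window (one frame). Two measures are commonly used:
(1) The sum of the absolute values of the amplitudes in each frame:
volume = Σ_{i=1}^{n} |s_i|
where s_i is the i-th sample of the frame and n is the number of samples in the frame. This measure is cheap to compute but does not match human loudness perception very well.
(2) Ten times the base-10 logarithm of the sum of the squared samples:
volume = 10 log10 (Σ_{i=1}^{n} s_i^2)
Its unit is the decibel (dB). As a logarithmic measure of intensity it matches how the ear perceives loudness more closely, but it is slightly more expensive to compute.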
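To make the two measures concrete, here is a tiny worked example on a single four-sample frame. The frame values are invented for illustration and already have zero mean and zero median, so no offset removal is needed.

```python
import numpy as np

# A made-up frame of four samples, centered on zero.
frame = np.array([1.0, -1.0, 2.0, -2.0])

abs_sum = np.sum(np.abs(frame))                      # |1|+|-1|+|2|+|-2| = 6
log_energy_db = 10.0 * np.log10(np.sum(frame ** 2))  # 10*log10(1+1+4+4)

print(abs_sum)                  # 6.0
print(round(log_energy_db, 6))  # 10.0
```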
A Python implementation of the volume computation:
import math
import numpy as np

# method 1: absSum
def calVolume(waveData, frameSize, overLap):
    """Sum of absolute amplitudes for each frame."""
    wlen = len(waveData)
    step = frameSize - overLap  # hop between the starts of two frames
    frameNum = int(math.ceil(wlen*1.0/step))
    volume = np.zeros((frameNum,1))
    for i in range(frameNum):
        # frames at the end of the signal may be shorter than frameSize
        curFrame = waveData[i*step:min(i*step+frameSize,wlen)]
        curFrame = curFrame - np.median(curFrame) # zero-justified
        volume[i] = np.sum(np.abs(curFrame))
    return volume

# method 2: 10 times log10 of square sum
def calVolumeDB(waveData, frameSize, overLap):
    """10*log10 of the sum of squared amplitudes for each frame."""
    wlen = len(waveData)
    step = frameSize - overLap  # hop between the starts of two frames
    frameNum = int(math.ceil(wlen*1.0/step))
    volume = np.zeros((frameNum,1))
    for i in range(frameNum):
        curFrame = waveData[i*step:min(i*step+frameSize,wlen)]
        curFrame = curFrame - np.mean(curFrame) # zero-justified
        # note: a completely silent frame gives log10(0) = -inf
        volume[i] = 10*np.log10(np.sum(curFrame*curFrame))
    return volume
#--------main.py-------
import wave
import pylab as pl
import numpy as np
import volume as vp  # assumes the two functions above are saved as volume.py

# ============ test the algorithm =============
# read wave file and get parameters (assumes 16-bit mono PCM).
fw = wave.open('aeiou.wav','rb')
params = fw.getparams()
print(params)
nchannels, sampwidth, framerate, nframes = params[:4]
strData = fw.readframes(nframes)
fw.close()
# np.frombuffer replaces the deprecated binary mode of np.fromstring
waveData = np.frombuffer(strData, dtype=np.int16)
# convert to float first so that abs(-32768) cannot overflow int16
waveData = waveData.astype(np.float64)
waveData = waveData/np.max(np.abs(waveData)) # normalization

# calculate volume
frameSize = 256
overLap = 128
volume11 = vp.calVolume(waveData,frameSize,overLap)
volume12 = vp.calVolumeDB(waveData,frameSize,overLap)

# plot the wave
# time axes: one per sample, and one per frame
time = np.arange(0, nframes)*(1.0/framerate)
time2 = np.arange(0, len(volume11))*(frameSize-overLap)*1.0/framerate
pl.subplot(311)
pl.plot(time, waveData)
pl.ylabel("Amplitude")
pl.subplot(312)
pl.plot(time2, volume11)
pl.ylabel("absSum")
pl.subplot(313)
pl.plot(time2, volume12, c="g")
pl.ylabel("Decibel(dB)")
pl.xlabel("time (seconds)")
pl.show()
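If aeiou.wav is not at hand, the dB measure can still be sanity-checked on a synthetic signal. The sketch below is self-contained: frame_volume_db is a hypothetical compact re-implementation of the second method, and the 440 Hz tone, the 8000 Hz sampling rate and the 1024-sample length are arbitrary choices for the test. Halving the amplitude should lower every frame by 20 log 2, about 6.02 dB.

```python
import numpy as np

def frame_volume_db(signal, frame_size, overlap):
    """Per-frame 10*log10 of the zero-mean sum of squares (hypothetical helper)."""
    step = frame_size - overlap
    n_frames = int(np.ceil(len(signal) / step))
    out = np.zeros(n_frames)
    for i in range(n_frames):
        frame = signal[i * step:min(i * step + frame_size, len(signal))]
        frame = frame - np.mean(frame)  # remove the DC offset
        out[i] = 10.0 * np.log10(np.sum(frame * frame))
    return out

fs = 8000                           # assumed sampling rate in Hz
t = np.arange(1024) / fs
loud = np.sin(2 * np.pi * 440 * t)  # test tone at full amplitude
quiet = 0.5 * loud                  # same tone at half amplitude

diff = frame_volume_db(loud, 256, 128) - frame_volume_db(quiet, 256, 128)
print(np.round(diff, 2))            # each frame differs by about 6.02 dB
```

Because the frames are compared against each other, the absolute dB values depend on the normalization of the samples; only the differences carry meaning here.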