# Finding beats in an MP3 file

https://dl.dropboxusercontent.com/u/24197429/noisy-beats.mp3

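• Handle any mp3 file of similar "complexity". It may fail on noisy recordings or on melodies played very fast; I don't care.
• Be reasonably precise. The tolerance is +/- 50 ms, so if a beat occurs at 1500 ms and your solution reports 1400 ms, that is unacceptable.
• Use only free software. Calling ffmpeg is allowed, as is using any freely available third-party software for your language of choice.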
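The +/- 50 ms rule can be turned into a small check. The sketch below is only an illustration: the helper name `unmatched_beats` and the matching rule (a reference beat counts as found if any reported beat lies within the tolerance) are assumptions, since the question does not say how results are scored.

```python
def unmatched_beats(reported_ms, reference_ms, tolerance_ms=50.0):
    """Return the reference beats that have no reported beat within tolerance_ms.

    Hypothetical helper: the matching rule is an assumption, not part of the question.
    """
    return [ref for ref in reference_ms
            if not any(abs(ref - rep) <= tolerance_ms for rep in reported_ms)]


# The example from the rule above: a beat at 1500 ms reported as 1400 ms is a miss,
# while 510 ms and 1040 ms are close enough to 500 ms and 1000 ms.
print(unmatched_beats([510, 1040, 1400], [500, 1000, 1500]))  # → [1500]
```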

Fabinout, 2014

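@BjörnLindqvist You shouldn't take suggestions for improvement to heart. Unless some earlier comments have been deleted, I don't see anything negative here, only suggestions for improvement.
Gareth, 2014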

# Python 2.7, 492 bytes (beats.mp3 only)

## Golfed code

```python
import sys
from math import *
from numpy import *
from pydub import AudioSegment
p=square(AudioSegment.from_mp3(sys.argv[1]).set_channels(1).get_array_of_samples())
n=len(p)
t=arange(n)/44.1
h=array([.54-.46*cos(i/477) for i in range(3001)])
p=convolve(p,h, 'same')
d=[p[i]-p[max(0,i-500)] for i in xrange(n)]
e=sort(d)
e=d>e[int(.94*n)]
i=0
while i<n:
 if e[i]:
  u=o=0
  j=i
  while u<2e3:
   u=0 if e[j] else u+1
   #u=(0,u+1)[e[j]]
   o+=e[j]
   j+=1
  if o>500:
   print "%g"%t[argmax(d[i:j])+i]
  i=j
 i+=1
```

## Ungolfed code

```python
# Import stuff
import sys
from math import *
from numpy import *
from pydub import AudioSegment

# Read in the audio file, convert from stereo to mono
song = AudioSegment.from_mp3(sys.argv[1]).set_channels(1).get_array_of_samples()

# Convert to power by squaring it
signal = square(song)
numSamples = len(signal)

# Create an array with the times stored in ms, instead of samples
# (dividing by 44.1 assumes a 44.1 kHz sample rate)
times = arange(numSamples)/44.1

# Create a Hamming window and filter the data with it. This gets rid of a lot of
# high frequency stuff. Writing 477.0 forces float division under Python 2.
h = array([.54-.46*cos(i/477.0) for i in range(3001)])
# mode='same' keeps the output aligned with the input, so there is no time shift
signal = convolve(signal, h, 'same')

# Differentiate the filtered signal to find where the power jumps up.
# To reduce noise from the operation, instead of using the previous sample,
# use the sample 500 samples ago.
diff = [signal[i] - signal[max(0,i-500)] for i in xrange(numSamples)]

# Identify the top 6% of the derivative values as possible beats
ecdf = sort(diff)
exceedsThresh = diff > ecdf[int(.94*numSamples)]

# Actually identify possible peaks
i = 0
while i < numSamples:
    if exceedsThresh[i]:
        underThresh = overThresh = 0
        j = i
        # Keep scanning until 2000 consecutive samples are under the threshold
        # (~45 ms at 44.1 kHz) or the end of the signal is reached
        while underThresh < 2000 and j < numSamples:
            underThresh = 0 if exceedsThresh[j] else underThresh+1
            overThresh += exceedsThresh[j]
            j += 1
        # If more than 500 of those samples were over the threshold, take the
        # maximum one to be the beat
        if overThresh > 500:
            print "%g"%times[argmax(diff[i:j])+i]
        i = j
    i += 1
```
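The window expression `.54-.46*cos(i/477)` in the golfed code is a shorthand worth unpacking. The standard Hamming window of length N is 0.54 - 0.46*cos(2*pi*i/(N-1)); with N = 3001 the factor 2*pi/3000 is roughly 1/477. The sketch below (written for Python 3, standard library only, with variable names chosen here for illustration) compares the shorthand against the textbook formula and also shows what integer floor division, which is what Python 2.7 applies to `i/477` when both operands are integers, does to the window.

```python
import math

N = 3001  # window length used in the answer

# Textbook Hamming window: 0.54 - 0.46*cos(2*pi*i/(N-1))
hamming = [0.54 - 0.46 * math.cos(2 * math.pi * i / (N - 1)) for i in range(N)]

# The answer's shorthand evaluated with float division (1/477 is close to 2*pi/3000)
shortcut = [0.54 - 0.46 * math.cos(i / 477.0) for i in range(N)]

# The same expression with floor division, as Python 2.7 evaluates int/int
staircase = [0.54 - 0.46 * math.cos(i // 477) for i in range(N)]

# Largest difference between the shorthand and the textbook window
max_dev = max(abs(a - b) for a, b in zip(hamming, shortcut))

# Number of distinct coefficient values produced by the floor-division variant
levels = sorted(set(round(v, 3) for v in staircase))

print("max deviation from Hamming window:", max_dev)
print("distinct staircase levels:", len(levels))
```

With float division the shorthand stays within a few thousandths of the true window. With floor division the window collapses to a handful of flat steps that still rise and fall like a Hamming window, so it continues to act as a crude low-pass filter.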

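## Why I miss notes on the other files (and why they are incredibly challenging)

`beats2.mp3` is extremely challenging. Here is its spectrogram: [image: spectrogram of `beats2.mp3`]. In the first section there are several distinct lines, but some notes really bleed across those lines. To identify notes reliably, you would have to start tracking the pitch of each note (fundamental and harmonics) and look for where the pitch changes. And once you have the first section working, the second section runs at twice the tempo!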
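To make the idea of "looking for where the pitch changes" concrete, here is a deliberately simplified sketch. It is not part of the answer's solution: it tracks only the single strongest DFT bin per frame (no harmonics), uses a naive pure-Python DFT, and runs on a synthetic two-tone signal. The function names, sample rate, and frame length are all assumptions chosen for illustration.

```python
import cmath
import math


def dominant_bin(frame):
    """Return the index of the strongest DFT bin of a frame, ignoring DC."""
    n = len(frame)
    best_bin, best_mag = 1, 0.0
    for k in range(1, n // 2):
        # Naive DFT of bin k: correlate the frame with a complex exponential
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin


def pitch_changes(samples, rate, frame_len):
    """Return start times (ms) of frames whose dominant bin differs from the previous frame."""
    changes = []
    previous = None
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        current = dominant_bin(samples[start:start + frame_len])
        if previous is not None and current != previous:
            changes.append(1000.0 * start / rate)
        previous = current
    return changes


def tone(freq, count, rate):
    """Generate `count` samples of a sine wave at `freq` Hz."""
    return [math.sin(2 * math.pi * freq * t / rate) for t in range(count)]


RATE = 8000   # Hz, assumed for the synthetic signal
FRAME = 128   # samples per frame, giving 62.5 Hz per bin

# 1024 samples of 500 Hz followed by 1024 samples of 1000 Hz
samples = tone(500, 1024, RATE) + tone(1000, 1024, RATE)
print(pitch_changes(samples, RATE, FRAME))  # → [128.0]
```

On real music the frames would need to overlap, the notes would not sit exactly on bin centres, and several harmonics would have to be followed at once, which is part of why the file is so hard.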

Rɪᴋᴇʀ

Dominic A.
