Recognise a specific sound (not speech)
Question
What kind of ML model would one use to recognise a specific sound, that is, to roughly compare and recognise waveforms? (I already have the data I need.)
I could only find material about recognising speech, not a specific sound.
Answer 1
Score: -1
This is a multiclass classification problem, and there are multiple ways to solve it.
You'll need to train a model on a dataset representative of the sounds you expect. Assuming you have the data, you can choose any multiclass classification algorithm, depending on the training time, prediction time, and accuracy you need.
One critical step is processing the data so that it fits the classifier. There are many methods you can use, but for audio most rely on a frequency representation (Fourier transform). Mel spectrograms<sup>2,3</sup> seem to yield good results here. It's also possible to treat the spectrogram as an image and use image classification methods.
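As a rough illustration of the frequency representation mentioned above, here is a minimal sketch of a log-mel spectrogram written with plain NumPy. The function names (`hz_to_mel`, `mel_filterbank`, `log_mel_spectrogram`), the parameter values (`n_fft=512`, `hop=256`, `n_mels=40`) and the synthetic 1 kHz test tone are all assumptions made for the example, not something from the original answer. In practice a library such as librosa or torchaudio would normally do this step.

```python
import numpy as np

def hz_to_mel(hz):
    # Common HTK-style mel formula (one of several conventions in use).
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(sample_rate, n_fft, n_mels):
    """Triangular filters evenly spaced on the mel scale.

    Returns an array of shape (n_mels, n_fft // 2 + 1).
    """
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    bank = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        left, center, right = hz_points[i:i + 3]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def log_mel_spectrogram(signal, sample_rate, n_fft=512, hop=256, n_mels=40):
    """Return a log-mel spectrogram in dB, shape (n_mels, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # (n_frames, n_fft // 2 + 1)
    mel_power = power @ mel_filterbank(sample_rate, n_fft, n_mels).T
    return 10.0 * np.log10(mel_power + 1e-10).T

# Synthetic stand-in for a recording: one second of a 1 kHz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000.0 * t)

spec = log_mel_spectrogram(tone, sample_rate)
peak_band = int(spec.mean(axis=1).argmax())  # mel band with the most energy
print(spec.shape, peak_band)
```

The resulting 2-D array can be fed to a classifier directly or handled like a single-channel image.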
As you mention Python, and if you really don't know where to start, I'd suggest using a deep learning library such as PyTorch.
A CNN<sup>2</sup> seems to yield good results for this particular task.
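For a sense of what such a CNN might look like, the following is a small PyTorch sketch that treats each spectrogram as a one-channel image. The class name `SoundCNN`, the layer sizes, the three-class setup and the random tensors standing in for real spectrograms and labels are all assumptions for illustration. A real model would be trained over many batches and epochs, not the single optimisation step shown.

```python
import torch
from torch import nn

class SoundCNN(nn.Module):
    """Tiny CNN for (batch, 1, n_mels, n_frames) spectrogram inputs (hypothetical)."""

    def __init__(self, n_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # makes the model independent of clip length
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = SoundCNN(n_classes=3)

# Random placeholders for a batch of 8 spectrograms and their labels.
batch = torch.randn(8, 1, 40, 61)
labels = torch.randint(0, 3, (8,))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One training step: forward pass, loss, backward pass, parameter update.
logits = model(batch)
loss = loss_fn(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

predicted_classes = logits.argmax(dim=1)
```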
The PyTorch documentation contains beginner tutorials for image or audio classification.
To recap, you'll need to:
- Obtain and label data for each type of sound you want to classify.
- Preprocess audio using a frequency representation.
- Train a model on the data.
- Use the model to predict the class of new data.
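The four steps above can be sketched end to end on toy data. This example deliberately swaps in a nearest-centroid classifier for the deep learning model, to keep it short and dependency-light. The synthetic noisy tones, the helper names (`make_clip`, `features`, `predict`) and the three class frequencies are made up for illustration and stand in for real labelled recordings.

```python
import numpy as np

rng = np.random.default_rng(0)
SAMPLE_RATE = 8000

def make_clip(freq_hz, seconds=0.5):
    """Hypothetical stand-in for a labelled recording: a noisy sine tone."""
    t = np.arange(int(SAMPLE_RATE * seconds)) / SAMPLE_RATE
    return np.sin(2 * np.pi * freq_hz * t) + 0.3 * rng.standard_normal(t.size)

def features(clip, n_fft=256):
    """Step 2: frequency representation (log power spectrum averaged over frames)."""
    n_frames = len(clip) // n_fft
    frames = clip[:n_frames * n_fft].reshape(n_frames, n_fft) * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power.mean(axis=0) + 1e-10)

# Step 1: obtain and label data (class 0 = 440 Hz, 1 = 1000 Hz, 2 = 2500 Hz).
class_freqs = [440.0, 1000.0, 2500.0]
train_x = np.array([features(make_clip(f)) for f in class_freqs for _ in range(20)])
train_y = np.repeat(np.arange(len(class_freqs)), 20)

# Step 3: "train" by storing the mean feature vector of each class.
centroids = np.array([train_x[train_y == c].mean(axis=0)
                      for c in range(len(class_freqs))])

def predict(clip):
    """Step 4: assign a new clip to the class with the closest centroid."""
    distances = np.linalg.norm(centroids - features(clip), axis=1)
    return int(np.argmin(distances))

# New, unseen clips, one per class.
predictions = [predict(make_clip(f)) for f in class_freqs]
print(predictions)
```

With real recordings the classes will overlap far more than these clean tones do, which is where a mel spectrogram and a CNN start to pay off.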