September 13, 2020 16:06:05

Voice recognition running constantly in the background on Android

Question

I want to develop an application that does something whenever it recognizes a keyword. It needs to be in listening mode all the time, including in the background.
I came across this and this. I tried running it, but it does not work when I speak.
Actually, I read that it still doesn't support my native language. Is that the reason?
I want to know how it works. Does it do speech-to-text and save the result in asset files? Does it run in the background? Does it use AI models? How does it behave when two apps need the mic resource in parallel? What about noise? Does it work with the Neural Networks API? How can I start developing such a thing?

Thanks!

Answer 1

Score: 1

It is great that you tried Vosk offline speech recognition on Android. Here are some answers to your questions:

> Actually, I read that it still doesn't support my native language.

If you mean Hebrew, we might support it in the future, and you can build it yourself.

> Is that the reason?

You didn't provide enough information to answer this. Please explain in more detail what exactly does not work.

> I want to know how it works.

Extensive documentation on speech recognition is available in lectures, courses, and books. For example, you can find an introduction here: https://www.youtube.com/watch?v=q67z7PTGRi8

> Does it do speech-to-text and save the result in asset files?

It does speech-to-text, but it doesn't save the results to assets; it just displays them. You cannot modify assets: they are static and packaged read-only with the app.

> Does it run in the background?

Yes.

> Does it use AI models?

Sure.

> How does it behave when two apps need the mic resource in parallel?

On Android it is not possible to record audio from two apps in parallel; the second one will be blocked.

> What about noise?

It is robust to noise.

> Does it work with the Neural Networks API?

No, it does not use it; the library is portable.

> How can I start developing such a thing?

Get some basic understanding and start writing the code. If you have further questions, you can ask them in the Telegram chat.
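
As a possible starting point for the "do something when a keyword is recognized" part, here is a minimal plain-Java sketch. It is not Vosk's API: `KeywordSpotter`, `extractText`, and `containsKeyword` are hypothetical names made up for illustration. It assumes the recognizer hands you a small JSON string with a `"text"` (or `"partial"`) field holding a lowercase transcript without punctuation or escaped quotes; check what your recognizer actually returns before relying on this.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any speech library.
final class KeywordSpotter {
    // Matches "text" : "..." or "partial" : "..." in a small JSON object.
    // Assumption: the transcript itself contains no escaped quotes.
    private static final Pattern FIELD =
        Pattern.compile("\"(?:text|partial)\"\\s*:\\s*\"([^\"]*)\"");

    /** Returns the transcript from a result string, or "" if none is found. */
    static String extractText(String resultJson) {
        Matcher m = FIELD.matcher(resultJson);
        return m.find() ? m.group(1) : "";
    }

    /** True if the keyword occurs as a whole word or phrase (case-insensitive). */
    static boolean containsKeyword(String transcript, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT).trim();
        if (needle.isEmpty()) {
            return false; // an empty keyword never matches
        }
        // Normalize whitespace and pad with spaces so "light" does not match "lights".
        String padded = " "
            + transcript.toLowerCase(Locale.ROOT).trim().replaceAll("\\s+", " ")
            + " ";
        return padded.contains(" " + needle + " ");
    }

    public static void main(String[] args) {
        // Assumed shape of a recognizer result; replace with real output.
        String result = "{ \"text\" : \"please turn on the lights\" }";
        if (containsKeyword(extractText(result), "lights")) {
            System.out.println("keyword detected");
        }
    }
}
```

In an app, you would call something like `containsKeyword` on each result the recognizer delivers and trigger your action when it returns true. Matching on whole words keeps short keywords from firing inside longer words.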
