Question

I have two hi-res local video recording files from a podcast interview.

I would like to merge them into one output file with the speaker showing at all times.

So we'd need to analyse the audio tracks to see who is speaking (the guest has priority) and then create an array of timestamps for each speaker.

Volume analysis example using ffmpeg similar to what I'm describing

Then I'd like to use Azure Media Services (AMS) to merge the video files based on the timestamps (e.g. host.mp4 as the source for 20 seconds, then guest.mp4 for 30 seconds, and so on).

How would I go about this?

Answer 1

Score: 2

This sounds like the speaker enumeration feature in Azure Video Indexer: https://learn.microsoft.com/en-us/azure/azure-video-indexer/video-indexer-overview#videoaudio-ai-features
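
For the timestamp step described in the question, here is a minimal local sketch in Python. It does not use AMS or Video Indexer. It assumes you have already measured a loudness value (in dB) for each fixed-length window of the host and guest audio, for example with ffmpeg, and it only shows the "guest has priority" logic plus one possible way to turn the result into an ffmpeg filtergraph. The names `Segment`, `speaker_segments`, `concat_filtergraph`, the 1-second window and the -35 dB threshold are all hypothetical choices for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class Segment:
    source: str   # file to show, e.g. "guest.mp4"
    start: float  # seconds
    end: float    # seconds


def speaker_segments(
    host_db: Sequence[float],
    guest_db: Sequence[float],
    window: float = 1.0,          # assumed window length in seconds
    threshold_db: float = -35.0,  # assumed "is speaking" level
    host_file: str = "host.mp4",
    guest_file: str = "guest.mp4",
) -> List[Segment]:
    """Turn per-window loudness readings into a list of camera segments.

    The guest is shown whenever the guest level reaches the threshold.
    The host is shown when only the host reaches it. When neither does,
    the previous choice is held (the host at the very start).
    """
    segments: List[Segment] = []
    current = host_file
    for i, (h, g) in enumerate(zip(host_db, guest_db)):
        if g >= threshold_db:
            current = guest_file
        elif h >= threshold_db:
            current = host_file
        start, end = i * window, (i + 1) * window
        if segments and segments[-1].source == current:
            segments[-1].end = end  # extend the running segment
        else:
            segments.append(Segment(current, start, end))
    return segments


def concat_filtergraph(segments: List[Segment], inputs: Dict[str, int]) -> str:
    """Build a video-only ffmpeg filter_complex string (trim + concat).

    `inputs` maps each file name to its ffmpeg input index (-i order).
    Audio is not handled here and would need to be mixed separately.
    """
    parts = []
    labels = []
    for n, seg in enumerate(segments):
        idx = inputs[seg.source]
        parts.append(
            f"[{idx}:v]trim=start={seg.start}:end={seg.end},"
            f"setpts=PTS-STARTPTS[v{n}]"
        )
        labels.append(f"[v{n}]")
    parts.append(f"{''.join(labels)}concat=n={len(segments)}:v=1:a=0[outv]")
    return ";".join(parts)


if __name__ == "__main__":
    host = [-20, -20, -50, -50, -20]   # made-up readings, one per second
    guest = [-60, -30, -30, -60, -60]
    segs = speaker_segments(host, guest)
    for s in segs:
        print(s)
    print(concat_filtergraph(segs, {"host.mp4": 0, "guest.mp4": 1}))
```

The resulting string could be passed to ffmpeg along the lines of `ffmpeg -i host.mp4 -i guest.mp4 -filter_complex "<graph>" -map "[outv]" out.mp4`. Very short segments will cause rapid cutting, so in practice you would probably want a larger window or a minimum segment length.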
