Azure Face API - How to manage very large volumes (more than 30 million faces)

Question

I am using LargeFaceGroup to store faces. The use case I am dealing with involves more than 30 million faces, and I need to run the Face-Identify call against these 30 million images as well.

The limitation of LargeFaceGroup is that it can only hold up to 1 million faces. If I use 30 LargeFaceGroups, I will have to make 30 Face-Identify calls to find a match among 30 million faces, i.e., 30 API transactions to find a match for a single face.
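
To make that fan-out concrete, here is a minimal sketch of the pattern against the REST Identify endpoint using Python's `requests`. The endpoint, key, and shard names are placeholders of mine, not values from the question:

```python
import requests

# Placeholder endpoint, key, and shard names; substitute your own values.
ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
KEY = "<subscription-key>"
HEADERS = {"Ocp-Apim-Subscription-Key": KEY, "Content-Type": "application/json"}

# 30 million faces sharded across 30 LargeFaceGroups of up to 1 million each.
GROUP_IDS = [f"faces-shard-{i:02d}" for i in range(30)]

def identify_across_groups(probe_face_id: str) -> list:
    """One probe face costs one Face-Identify transaction per shard: 30 in total."""
    candidates = []
    for group_id in GROUP_IDS:
        body = {
            "faceIds": [probe_face_id],
            "largeFaceGroupId": group_id,
            "maxNumOfCandidatesReturned": 1,
        }
        resp = requests.post(f"{ENDPOINT}/face/v1.0/identify",
                             headers=HEADERS, json=body)
        resp.raise_for_status()
        for result in resp.json():  # one result per probe faceId
            candidates.extend(result["candidates"])
    # The best match across all 30 shards wins.
    return sorted(candidates, key=lambda c: c["confidence"], reverse=True)
```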

I have a few questions:

  1. Is there a more efficient way to deal with large volumes?
  2. How can I optimize API cost and time? (For example, I have found that up to 10 faceIds can be passed to Face-Identify, cutting the number of API transactions tenfold; see the sketch after this list.)
  3. Can I detect/add/delete faces in batches, or will I have to make an API transaction for each individual face?
  4. What is the search time for Face-Identify in a LargeFaceGroup? Does it depend on the number of faces present in the LargeFaceGroup?
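
For question 2, a sketch of the 10-faceId batching, reusing the placeholder `ENDPOINT`/`KEY`/`HEADERS` constants from the sketch above; the chunk size of 10 reflects the per-call maximum mentioned in the question:

```python
def identify_batched(probe_face_ids: list, group_id: str) -> list:
    """Pack up to 10 probe faceIds into each Face-Identify transaction."""
    results = []
    for i in range(0, len(probe_face_ids), 10):  # 10 faceIds per call maximum
        body = {
            "faceIds": probe_face_ids[i:i + 10],
            "largeFaceGroupId": group_id,
        }
        resp = requests.post(f"{ENDPOINT}/face/v1.0/identify",
                             headers=HEADERS, json=body)
        resp.raise_for_status()
        results.extend(resp.json())  # one entry per probe faceId
    return results
```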

Answer 1

Score: 0

After a discussion with the Azure Face API product team, I got answers to these questions.

  1. To handle large volumes, we should use PersonDirectory to store faces. It can handle up to 75 million faces, and there is no training cost with the PersonDirectory data structure either. (See the first sketch after this list.)

  2. As mentioned in the first point, training cost can be eliminated. Time can also be optimized: you can request more than 10 TPS from Azure, and they will allow it. Other API calls such as Detect, Add Face, and Delete Face cannot be optimized. (Some hacks, like stitching multiple images into one and then calling Detect on the composite, can save API calls; check whether this suits your use case - see the second sketch after this list.)
    Also make sure you are not making redundant API calls, such as two Detect calls on the same image: save the faceId and make the subsequent calls within 24 hours.

  3. Apart from the Detect hack, you will have to call the API for each individual image/face.

  4. I am not sure about the response time for an individual query, but when handling large volumes the main concern is the throughput of the API, and throughput can be increased from 10 TPS to a higher limit as needed.
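
To make point 1 concrete, a minimal sketch of Identify against PersonDirectory, reusing the placeholder constants from the earlier sketches. The `v1.0-preview` route and the `personIds: ["*"]` body are my reading of the preview documentation linked below, so treat them as assumptions to verify rather than a confirmed API shape:

```python
def identify_in_person_directory(probe_face_ids: list) -> list:
    """Identify probe faces against the whole PersonDirectory, one call per batch."""
    results = []
    for i in range(0, len(probe_face_ids), 10):  # same 10-faceId batching as v1.0
        body = {
            "faceIds": probe_face_ids[i:i + 10],
            # "*" asks the service to search every person in the directory
            # (my reading of the preview docs; verify before relying on it).
            "personIds": ["*"],
        }
        resp = requests.post(f"{ENDPOINT}/face/v1.0-preview/identify",
                             headers=HEADERS, json=body)
        resp.raise_for_status()
        results.extend(resp.json())
    return results
```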
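
And for the stitching hack in point 2, a sketch of the idea with Pillow: paste several images into one grid, spend a single Detect transaction on the composite, and map each returned faceRectangle back to its source tile. The 2x2 grid and 512-pixel tiles are arbitrary choices of mine, and you would need to confirm the composite stays within the service's image-size limits:

```python
import io

from PIL import Image  # pip install Pillow

def detect_stitched(image_paths: list, tile: int = 512) -> dict:
    """Stitch up to 4 images into a 2x2 grid and run Detect once on the result."""
    sheet = Image.new("RGB", (tile * 2, tile * 2))
    tile_to_path = {}
    for idx, path in enumerate(image_paths[:4]):
        x, y = (idx % 2) * tile, (idx // 2) * tile
        # Resizing may distort faces; a real implementation should pad instead.
        sheet.paste(Image.open(path).resize((tile, tile)), (x, y))
        tile_to_path[(x, y)] = path

    buf = io.BytesIO()
    sheet.save(buf, format="JPEG")
    resp = requests.post(
        f"{ENDPOINT}/face/v1.0/detect?returnFaceId=true",
        headers={"Ocp-Apim-Subscription-Key": KEY,
                 "Content-Type": "application/octet-stream"},
        data=buf.getvalue(),
    )
    resp.raise_for_status()

    # Attribute each detected face back to its source image via its rectangle.
    faces_by_image = {path: [] for path in image_paths[:4]}
    for face in resp.json():
        rect = face["faceRectangle"]
        origin = ((rect["left"] // tile) * tile, (rect["top"] // tile) * tile)
        faces_by_image[tile_to_path[origin]].append(face)
    return faces_by_image
```

Note that faces spanning tile boundaries or shrunk too far by the resize can be missed, which is part of checking whether the hack suits your use case.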

Face API Doc - https://westus.dev.cognitive.microsoft.com/docs/services/face-v1-0-preview/operations/563879b61984550f30395239
