Azure Face API - How to manage very large volumes (more than 30 million faces)
Question
I am using `LargeFaceGroup` to store the faces. The use case I am dealing with involves more than 30 million faces, and I need to run the `Face-Identify` call on these 30 million images as well.
The limitation of `LargeFaceGroup` is that it can only hold up to 1 million faces. If I use 30 `LargeFaceGroup`s, I will have to make 30 `Face-Identify` calls to find a match among 30 million faces, hence 30 API transactions to find a match for a single face.
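To make the fan-out concrete, here is a minimal sketch of what 30 `LargeFaceGroup`s imply for a single probe face. The group IDs and probe ID are hypothetical placeholders; the request-body fields follow the documented `Face-Identify` REST shape, and actually posting them is left as a comment:

```python
# Sketch: identifying one probe face across 30 LargeFaceGroups means
# building (and paying for) 30 separate Face-Identify transactions.
# Group IDs and the probe faceId below are hypothetical placeholders.

def build_identify_payloads(probe_face_id, group_ids, max_candidates=1):
    """One Face-Identify request body per LargeFaceGroup."""
    return [
        {
            "faceIds": [probe_face_id],           # the face we want to match
            "largeFaceGroupId": gid,              # one group per call
            "maxNumOfCandidatesReturned": max_candidates,
        }
        for gid in group_ids
    ]

group_ids = [f"faces-shard-{i:02d}" for i in range(30)]   # 30 x 1M faces
payloads = build_identify_payloads("probe-face-uuid", group_ids)
# Each payload would be POSTed to /face/v1.0/identify with your key.
print(len(payloads))  # 30 API transactions for a single probe face
```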
I have a few questions:
- Is there a more efficient way to deal with large volumes?
- How can I optimize API cost and time? (For example, I have found that we can pass up to 10 `faceIds` to `Face-Identify`, reducing the API transactions tenfold.)
- Can I also detect/add/delete faces in batches, or will I have to make an API transaction for each individual face?
- What is the search time for `Face-Identify` in a `LargeFaceGroup`? Does it depend on the number of faces present in the `LargeFaceGroup`?
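The 10-`faceIds` batching mentioned above can be sketched as a simple chunking step before the fan-out. This assumes only the documented 10-face limit per `Face-Identify` call; the probe and group IDs are illustrative:

```python
# Sketch: batch probe faceIds 10 at a time before fanning out to groups.
# With B probe faces and G groups, calls drop from B*G to ceil(B/10)*G.

MAX_FACE_IDS_PER_IDENTIFY = 10  # documented Face-Identify limit

def batch_identify_payloads(probe_face_ids, group_ids):
    payloads = []
    for start in range(0, len(probe_face_ids), MAX_FACE_IDS_PER_IDENTIFY):
        batch = probe_face_ids[start:start + MAX_FACE_IDS_PER_IDENTIFY]
        for gid in group_ids:
            payloads.append({"faceIds": batch, "largeFaceGroupId": gid})
    return payloads

probes = [f"face-{i}" for i in range(100)]            # 100 probe faces
groups = [f"faces-shard-{i:02d}" for i in range(30)]  # 30 groups
calls = batch_identify_payloads(probes, groups)
print(len(calls))  # 300 calls instead of 100 * 30 = 3000
```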
Answer 1
Score: 0
After a discussion with the Azure Face API product team, I got answers to these questions.
- To handle large volumes, we should use `PersonDirectory` to store faces. It can handle up to 75 million faces. There is also no training cost with the `PersonDirectory` data structure.
- As mentioned in the first point, training costs can be eliminated. Time can be optimized: you can request more than 10 TPS from Azure, and they will allow it. Other API calls such as `detect`, `Add-Face`, and `Delete-Face` cannot be optimized. (Some hacks, like stitching multiple images into one and then calling `detect` on the result, can save API calls; check whether this suits your use case.) Rather, make sure you are not making redundant API calls such as two `detect` calls for the same image: save the `faceId` and make any subsequent calls with it within 24 hours.
- Apart from the `detect` hack, you will have to call the API for each individual image/face.
- I am not sure about the response time for an individual query, but when handling large volumes what matters is the throughput of the API, and throughput can be increased from 10 TPS to some desired upper limit.
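To see why throughput, rather than single-query latency, dominates at this scale, a rough back-of-the-envelope calculation (the 10 TPS figure is the default quota mentioned above; the raised 100 TPS figure is a hypothetical example):

```python
# Rough ingest-time estimate: one Add-Face transaction per face,
# so total time ~= faces / TPS. Figures are illustrative only.

FACES = 30_000_000
SECONDS_PER_DAY = 86_400

def ingest_days(faces, tps):
    return faces / tps / SECONDS_PER_DAY

print(round(ingest_days(FACES, 10), 1))   # 34.7 days at the default 10 TPS
print(round(ingest_days(FACES, 100), 1))  # 3.5 days at a raised 100 TPS
```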
Face API Doc - https://westus.dev.cognitive.microsoft.com/docs/services/face-v1-0-preview/operations/563879b61984550f30395239
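The stitching hack from the second point (packing several images into one sheet so a single `detect` call covers all of them) amounts to laying tiles out on a grid. A minimal layout sketch, assuming fixed-size tiles; the actual pasting could be done with any image library, and the tile size and grid width here are illustrative choices:

```python
# Sketch: compute paste positions for packing N fixed-size images into
# one grid sheet, so one detect call can cover all of them.

import math

def grid_layout(n_images, tile_w, tile_h, cols):
    """Return (x, y) top-left paste positions and the sheet size."""
    rows = math.ceil(n_images / cols)
    positions = [((i % cols) * tile_w, (i // cols) * tile_h)
                 for i in range(n_images)]
    sheet_size = (cols * tile_w, rows * tile_h)
    return positions, sheet_size

positions, sheet = grid_layout(9, 200, 200, cols=3)
print(sheet)         # (600, 600): one 3x3 sheet instead of 9 detect calls
print(positions[4])  # (200, 200): centre tile of the grid
```

After one `detect` call on the sheet, each returned face rectangle can be mapped back to its source image by which grid cell it falls in; whether the smaller per-face resolution still detects reliably is something to validate for your own images.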