Azure Face API - How to manage very large volumes (more than 30 million faces)

Question

I am using LargeFaceGroup to store faces. My use case involves more than 30 million faces, and I also need to run Face-Identify calls against those 30 million images.

The limitation of LargeFaceGroup is that it can hold only up to 1 million faces. With 30 LargeFaceGroups, I would have to make 30 Face-Identify calls to search all 30 million faces, which means 30 API transactions to find a match for a single face.

I have a few questions:

  1. Is there a more efficient way to deal with large volumes?
  2. How can I optimize API cost and time? (For example, I have found that up to 10 faceIds can be passed to Face-Identify, which reduces the number of API transactions tenfold.)
  3. Can I detect/add/delete faces in batches, or do I have to make an API transaction for each individual face?
  4. What is the search time for Face-Identify in a LargeFaceGroup? Does it depend on the number of faces in the LargeFaceGroup?
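
To make the arithmetic in the question concrete, here is a minimal sketch that counts Face-Identify transactions. It assumes, as described above, that each call targets one LargeFaceGroup and carries up to 10 faceIds. The function name `identify_transactions` is hypothetical and nothing here calls the Azure API.

```python
import math


def identify_transactions(num_query_faces: int, num_groups: int,
                          faces_per_call: int = 10) -> int:
    """Count the Face-Identify calls needed to check every query face
    against every group, batching up to `faces_per_call` faceIds per call."""
    calls_per_group = math.ceil(num_query_faces / faces_per_call)
    return calls_per_group * num_groups


# One query face still costs one call per group.
print(identify_transactions(1, 30))     # 30
# Ten query faces fit in one call per group, so the total stays at 30.
print(identify_transactions(10, 30))    # 30
print(identify_transactions(1000, 30))  # 3000
```

Batching lowers the cost per query face, but the total still scales with the number of groups, which is why a single larger container is attractive.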

Answer 1

Score: 0

After a discussion with the Azure Face API product team, I got answers to these questions.

  1. To handle large volumes, use PersonDirectory to store faces. It can handle up to 75 million faces, and the PersonDirectory data structure has no training cost.

  2. As mentioned in the first point, training costs can be eliminated. Time can also be optimized: you can request a rate higher than 10 TPS from Azure, and they will allow it. Other API calls such as detect, Add-Face, and Delete-Face cannot be optimized. (Some hacks, such as stitching multiple images into one and then calling detect on it, can save API calls; check whether this suits your use case.)
    Instead, focus on avoiding redundant API calls, such as calling detect twice on the same image: save the faceId and make subsequent calls within 24 hours.

  3. Apart from the detect hack, you have to call the API for each individual image/face.

  4. I am not sure about the response time for an individual query, but when handling large volumes the concern is API throughput, which can be raised from 10 TPS to a higher limit on request.
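
To put the throughput point into numbers, here is a rough back-of-the-envelope sketch. It assumes one API transaction per face and a perfectly sustained rate. The 100 TPS figure is only an illustrative value for a raised limit, not a documented quota, and `hours_to_process` is a hypothetical helper.

```python
def hours_to_process(num_calls: int, tps: float) -> float:
    """Hours needed to issue `num_calls` transactions at a steady `tps`."""
    return num_calls / tps / 3600


# 30 million transactions at the 10 TPS mentioned in the answer.
print(round(hours_to_process(30_000_000, 10), 1))   # 833.3 (about 35 days)
# The same workload at an assumed, illustrative 100 TPS.
print(round(hours_to_process(30_000_000, 100), 1))  # 83.3
```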

Face API Doc - https://westus.dev.cognitive.microsoft.com/docs/services/face-v1-0-preview/operations/563879b61984550f30395239
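
The advice about avoiding a second detect call can be sketched as a small cache that reuses a saved faceId for 24 hours. This is only an illustration: `FaceIdCache` and `fake_detect` are hypothetical names, and `fake_detect` stands in for whatever function actually calls detect and returns a faceId.

```python
import time
from typing import Callable, Dict, Tuple

# The answer says a saved faceId can be reused for subsequent calls within 24 hours.
FACE_ID_TTL_SECONDS = 24 * 60 * 60


class FaceIdCache:
    """Remember the faceId per image so detect is not called twice within the TTL."""

    def __init__(self, detect: Callable[[str], str],
                 ttl: float = FACE_ID_TTL_SECONDS,
                 clock: Callable[[], float] = time.time) -> None:
        self._detect = detect
        self._ttl = ttl
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def face_id_for(self, image_key: str) -> str:
        now = self._clock()
        cached = self._entries.get(image_key)
        if cached is not None and now - cached[1] < self._ttl:
            return cached[0]  # still valid: no extra detect transaction
        face_id = self._detect(image_key)  # expired or unseen: one detect call
        self._entries[image_key] = (face_id, now)
        return face_id


# Stand-in for a real detect call; it only counts how often it is invoked.
calls = []

def fake_detect(image_key: str) -> str:
    calls.append(image_key)
    return f"face-id-{len(calls)}"


cache = FaceIdCache(fake_detect)
first = cache.face_id_for("img-001.jpg")
second = cache.face_id_for("img-001.jpg")
print(first == second, len(calls))  # True 1
```

The clock is injectable so that expiry can be tested without waiting; in real use the default `time.time` would be enough.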

  • Posted by huangapple on January 8, 2023 at 21:43:12
  • Source: https://go.coder-hub.com/75048220.html