Cosmos .NET SDK bulk execution without exceeding provisioned throughput

Question

I have a NoSQL Cosmos DB container with manually provisioned throughput.

I have an application that upserts a lot of documents at once using the v3 .NET SDK for Cosmos DB with bulk execution mode enabled.

I am seeing 429 (Too Many Requests) errors surfaced at the application level. How can I avoid this?

I am using bulk execution mode because I want to upsert the documents as quickly as possible. But of course I don't want to exceed my manually provisioned throughput. What can I do?

This article mentions that max RU consumption increases with bulk execution mode.
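
For reference, a minimal sketch of the setup described above, assuming the Microsoft.Azure.Cosmos v3 package; the database/container names, the MyDoc type, and the /Pk partition key are placeholders rather than the real schema:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class BulkUpsertSetup
{
    // Hypothetical document shape; "id" and the /Pk partition key path are placeholders.
    public record MyDoc(string id, string Pk);

    public static async Task UpsertAllAsync(string connectionString, IReadOnlyList<MyDoc> docs)
    {
        // AllowBulkExecution turns on bulk mode: the SDK groups concurrent
        // point operations into fewer, larger service requests.
        var client = new CosmosClient(connectionString, new CosmosClientOptions
        {
            AllowBulkExecution = true
        });

        Container container = client.GetContainer("my-database", "my-container"); // placeholders

        // Queue all upserts and let the SDK batch them; throttling errors that
        // exhaust the built-in retries surface when the tasks are awaited.
        var tasks = new List<Task>();
        foreach (var doc in docs)
        {
            tasks.Add(container.UpsertItemAsync(doc, new PartitionKey(doc.Pk)));
        }
        await Task.WhenAll(tasks);
    }
}
```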

Answer 1

Score: 1

The total RU consumed does not increase, but because you are sending more operations per second, the RU consumed per second increases. The goal of bulk mode is to remove the client-side bottleneck and let you take advantage of all the RU available for ingestion.

At the moment there is no way to define an RU limit. You control RU consumption through the size/number of documents you are processing through Bulk, so you can pick a batch size you feel comfortable with based on your RU usage and split your whole data set into subsets of that size, sending one batch at a time.
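
A minimal sketch of that batching idea, assuming Microsoft.Azure.Cosmos v3 with bulk mode already enabled on the client, .NET 6+ for Enumerable.Chunk, and the same hypothetical MyDoc/Pk shape as in the question:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class BatchedBulkUpsert
{
    // Hypothetical document shape; "id" and the /Pk partition key path are placeholders.
    public record MyDoc(string id, string Pk);

    // Splits the data into fixed-size batches and awaits each batch before
    // sending the next, so RU consumption is bounded by what one batch costs
    // rather than by how fast the client can enqueue operations.
    public static async Task UpsertInBatchesAsync(
        Container container, IEnumerable<MyDoc> docs, int batchSize)
    {
        foreach (var batch in docs.Chunk(batchSize)) // Enumerable.Chunk needs .NET 6+
        {
            var tasks = batch.Select(d =>
                container.UpsertItemAsync(d, new PartitionKey(d.Pk)));
            await Task.WhenAll(tasks);

            // Optional breather so throttled retries can drain before more work is queued.
            await Task.Delay(TimeSpan.FromMilliseconds(200));
        }
    }
}
```

The batch size and the optional delay are the two knobs here: smaller batches and longer pauses trade ingestion speed for headroom under the provisioned RU/s.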

Answer 2

Score: 0

> I am using bulk execution mode because I want to upsert the documents as quickly as possible.

If you get 429s, then you are reaching your goal of "as quickly as possible" for the given RU limit. See Diagnose and troubleshoot Azure Cosmos DB request rate too large (429) exceptions:

> In general, for a production workload, if you see between 1-5% of requests with 429 responses, and your end to end latency is acceptable, this is a healthy sign that the RU/s are being fully utilized. No action is required.

So, 429s are not necessarily a bad thing, unless your client keeps trying to push in too many docs at the same time, overwhelms the RU limit and the retry mechanism completely, and the 429s turn into failed business operations. Then it's a bad sign.

The built-in retry mechanism should be able to avoid hard fails by automatically respecting the x-ms-retry-after-ms header (https://learn.microsoft.com/en-us/rest/api/cosmos-db/common-cosmosdb-rest-response-headers) on retries:
> The number of milliseconds to wait to retry the operation after an initial operation received HTTP status code 429 and was throttled.
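
If the defaults are too tight for an ingestion burst, the v3 SDK also lets you widen that retry budget through CosmosClientOptions. A minimal sketch, with illustrative values rather than recommendations (the SDK defaults are 9 attempts and 30 seconds):

```csharp
using System;
using Microsoft.Azure.Cosmos;

// Sketch: widening the built-in 429 retry budget. The connection string is a placeholder.
var client = new CosmosClient(
    "<connection-string>",
    new CosmosClientOptions
    {
        AllowBulkExecution = true,
        // Retries per rate-limited request before the 429 surfaces to your code (default: 9).
        MaxRetryAttemptsOnRateLimitedRequests = 20,
        // Total time the SDK spends honoring x-ms-retry-after-ms waits (default: 30 seconds).
        MaxRetryWaitTimeOnRateLimitedRequests = TimeSpan.FromSeconds(120)
    });
```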

As already mentioned by Matias, the key is to help your app avoid piling up too many requests: add batching on the client and wait after each batch for the RU limit to catch up before sending in the next pile. This way the auto-retry mechanism keeps using the full RU limit, the business side slows down when waits are needed, and eventually all docs get sent.

Though, increasing the RU limit to match your ingestion rate also helps ;)
