AppendBlockAsync with more than 4mb works locally but not on Azure app service
Question
I am trying to reproduce an issue I see on our production server when using AppendBlobs.
The docs state
> Each block in an append blob can be a different size, up to a maximum of 4 MiB, and an append blob can include up to 50,000 blocks. The maximum size of an append blob is therefore slightly more than 195 GiB (4 MiB X 50,000 blocks).
Which rings true with what I'm seeing on our production app, and sure enough I see these exceptions:
> The request body is too large and exceeds the maximum permissible limit.
> RequestId:3cb3ffd7-001e-0087-5789-ae3e0c000000
> Time:2023-07-04T15:10:01.2687679Z
> Status: 413 (The request body is too large and exceeds the maximum permissible limit.)
> ErrorCode: RequestBodyTooLarge
The problem I am having is I cannot reproduce this issue in a test.
I've got a minimal reproducible example below, which essentially creates a memory stream of a specified size by serializing a bunch of GUIDs to a string.
I then use `AppendBlob` to append the blob...

I can see the `memoryStream.Length` is indeed greater than 4 MB.
However, the puzzling thing is, this works. The file is uploaded to Blob Storage correctly, without exception.
I have seen ways to 'fix' the exception (chunking the memory stream, for example), but I wanted to reproduce the problem in a test first, and I can't seem to reproduce the error anywhere.
Any ideas what is happening?
```csharp
[Fact]
public async Task Can_append_blob_even_if_larger_than_4mb()
{
    var containerClient = new BlobServiceClient(ConnectionString)
        .GetBlobContainerClient("test-123");
    await containerClient.CreateIfNotExistsAsync();

    var outputFilename = $"Test-{DateTime.UtcNow.Ticks}.txt";
    var appendBlobClient = containerClient.GetAppendBlobClient(outputFilename);
    await appendBlobClient.CreateIfNotExistsAsync();

    var json = JsonConvert.SerializeObject(CreateList(6));
    var memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(json));

    await appendBlobClient.AppendBlockAsync(memoryStream);
}

private static List<object> CreateList(int sizeInMb)
{
    const int mbInBytes = 1024 * 1024;
    var maxSizeInBytes = sizeInMb * mbInBytes;
    var totalSize = 0;
    var list = new List<object>();

    while (totalSize < maxSizeInBytes)
    {
        var obj = Guid.NewGuid();
        var serializedObj = JsonConvert.SerializeObject(obj);
        var objectSize = Encoding.UTF8.GetBytes(serializedObj).Length;

        if (objectSize + totalSize > maxSizeInBytes)
        {
            break;
        }

        list.Add(obj);
        totalSize += objectSize;
    }

    return list;
}
```
Answer 1
Score: 3
Have a look at the source of `int AppendBlobClient.AppendBlobMaxAppendBlockBytes`.

This is:

```csharp
public virtual int AppendBlobMaxAppendBlockBytes =>
    ClientConfiguration.Version < BlobClientOptions.ServiceVersion.V2022_11_02
        ? Constants.Blob.Append.Pre_2022_11_02_MaxAppendBlockBytes
        : Constants.Blob.Append.MaxAppendBlockBytes;
```
Where these constants are:

```csharp
public const int Pre_2022_11_02_MaxAppendBlockBytes = 4 * Constants.MB; // 4MB
public const int MaxAppendBlockBytes = 100 * Constants.MB; // 100MB
```
This larger size is not (yet) documented.

This is defined in the package `Azure.Storage.Blobs` Version `12.17.0`, released 2023-07-11. However, in the previous package version, `12.16.0`, we see something different:

```csharp
public virtual int AppendBlobMaxAppendBlockBytes => Constants.Blob.Append.MaxAppendBlockBytes;

const int MaxAppendBlockBytes = 4 * Constants.MB; // 4MB
```
Hypothesis:

- The test code and container are using the new, larger 100 MB value.
- The failing code is using the smaller 4 MB value.

It looks like this check is carried out by _Azure_ and not by the Azure client code, so updating the Azure client package does not in itself fix the issue; in fact it can make things worse if it tells you that you can write 100 MB when you cannot, or vice versa.

This could explain why a recently-created container works where an older one does not. Is this size limit in Azure a container setting or an account setting? Can it be changed non-destructively on existing containers? Unfortunately this is not yet documented.
You can control this setting with options, e.g.

```csharp
var opts = new BlobClientOptions(BlobClientOptions.ServiceVersion.V2021_12_02);
var client = new AppendBlobClient(uri, creds, opts);
```
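For instance, here is a minimal sketch (my addition, not from the original post) that reuses the question's `ConnectionString` and `CreateList` helpers and pins the test to an older service version; if the hypothesis above is right, the same append should now fail with the 413 `RequestBodyTooLarge` error:

```csharp
// Sketch: pin the client to a pre-2022-11-02 service version so the service
// applies the old 4 MiB per-block limit, then retry the same ~6 MB append.
var opts = new BlobClientOptions(BlobClientOptions.ServiceVersion.V2021_12_02);
var containerClient = new BlobServiceClient(ConnectionString, opts)
    .GetBlobContainerClient("test-123");
await containerClient.CreateIfNotExistsAsync();

var appendBlobClient = containerClient.GetAppendBlobClient($"Test-{DateTime.UtcNow.Ticks}.txt");
await appendBlobClient.CreateIfNotExistsAsync();

// With the pinned version the client reports the 4 MB limit (4 * 1024 * 1024).
Console.WriteLine(appendBlobClient.AppendBlobMaxAppendBlockBytes);

// If the hypothesis holds, this call should throw a RequestFailedException with
// status 413 (RequestBodyTooLarge) instead of succeeding.
var json = JsonConvert.SerializeObject(CreateList(6));
using var memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(json));
await appendBlobClient.AppendBlockAsync(memoryStream);
```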
But while this sets `int AppendBlobClient.AppendBlobMaxAppendBlockBytes`, it does not in itself cause "chunking"; you still have to append chunks of the maximum size yourself.
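A minimal chunking sketch might look like the following (my illustration, not from the original answer; the `AppendInChunksAsync` name and its parameters are assumptions):

```csharp
using Azure.Storage.Blobs.Specialized;

// Sketch: append an arbitrarily large stream by splitting it into blocks no
// larger than the limit the client reports for its configured service version.
static async Task AppendInChunksAsync(AppendBlobClient appendBlobClient, Stream source)
{
    int maxBlockSize = appendBlobClient.AppendBlobMaxAppendBlockBytes; // 4 MB or 100 MB
    var buffer = new byte[maxBlockSize];
    int bytesRead;

    while ((bytesRead = await source.ReadAsync(buffer, 0, maxBlockSize)) > 0)
    {
        // Wrap only the bytes actually read so the final (partial) block is sized correctly.
        using var block = new MemoryStream(buffer, 0, bytesRead, writable: false);
        await appendBlobClient.AppendBlockAsync(block);
    }
}
```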
The docs still mostly just talk about 4 MiB, but see here in "Versioning for Azure Storage":
> The max append block content length has been raised from 4 MiB to 100 MiB.