Why is this Cosmos DB query better than a point read?

Question

After reading about Cosmos DB partitioning and watching countless videos on it, I swore by point reads alone.

In the NoSQL .NET SDK this translates to using ReadItemAsync<T>. But I seem to have found a cheaper method, and I would like to know whether that is expected and, if so, why.

Testing on the emulator, I have a container with a partition key path set to /userId. It holds 200 documents, each representing a User POCO of the form:

public class User
{
    public User(string userId)
    {
        UserId = userId;
    }
    public required string Id { get; set; }

    public string UserId { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string FullName { get; set; }
    public string UserName { get; set; }
    public string Email { get; set; }

    public IList<Order> Orders { get; set; }

    public IList<Sale<Order, User>> Sales { get; set; }

    public string Type { get; } = nameof(User);
}

Importantly for what follows, each user has an average of 500 orders and 500 sales. I understand that I should keep documents lightweight, but this is a test. The average document size comes to roughly 120 kB, which I believe is still acceptable (well under 2 MB).

More specifically, the app is designed such that I expect customers to do:

  1. Rare reads of the entire User object (i.e. the complete document in the container)
  2. More frequent reads of a part of the document, where I want only a view model of the User.

Thus, I have in addition the class:

public class UserViewModel
{
   public string FirstName { get; set; }
}

which essentially represents a view of the User class without the Orders and Sales. For testing, I focus only on the first name.

Question 1

If I am interested only in the view model, to display it to the user, how can I ask Cosmos DB not to return the entire User object to the web app (ASP.NET Core), but rather to do the selection on the Cosmos DB server and send me just the view model?


Test 1:
Listen to the gods of Cosmos DB and do a point read, using UserViewModel as the type parameter:

        var user = await _container.ReadItemAsync<UserViewModel>("fbea1444-8dcb-cb89-8cde-7adf8a580812", new PartitionKey("920a8e38-481a-3874-f14e-6de85029419f"));

The result of this point read on the logical partition gives

    "resource": {
      "firstName": "Reggie"
    },
    "statusCode": 200,
    "diagnostics": { },
    "requestCharge": 4.76

If, instead, I deserialize to User

        var user = await _container.ReadItemAsync<User>("fbea1444-8dcb-cb89-8cde-7adf8a580812", new PartitionKey("920a8e38-481a-3874-f14e-6de85029419f"));

I obtain of course the (huge) User as a resource, but the request charge is the same, 4.76 RUs.

Question 2: Does the identical RU charge mean that Cosmos DB sends back the entire document even when I use the first option, i.e. UserViewModel, and that it is my web app that does the cast?


Test 2: Where Clause on the partition userId

Hoping for a lower charge, I abandoned the point read and used GetItemLinqQueryable instead:

        // Get LINQ IQueryable object
        var queryable = _container.GetItemLinqQueryable<User>();

        // Construct LINQ query
        var matches = queryable
            .Where(b => b.UserId == "920a8e38-481a-3874-f14e-6de85029419f")
            .Select(u => new UserViewModel() { FirstName = u.FirstName });

        // Convert to feed iterator
        using var linqFeed = matches.ToFeedIterator();

        var totalCharge = 0.0;
        // Iterate query result pages
        while (linqFeed.HasMoreResults)
        {
            var response = await linqFeed.ReadNextAsync();
            result.Add($"Request Charge {response.RequestCharge} {totalCharge += response.RequestCharge} RUs");
            result.Add(response);
        }

The result is

  "Request Charge 3.03 3.03 RUs",
  [
    {
      "firstName": "Reggie"
    }
  ]

Question 3

Why is this method (3.03 RUs) more efficient than a point read (4.76 RUs)?


Test 3: Where clause on FirstName (not the partition key)

Using the code above, I also tested a Where clause on something other than the partition key property UserId, i.e.

        var matches = queryable
              .Where(b => b.FirstName == "Reggie")
              .Select(u => new UserViewModel() { FirstName = u.FirstName });

Question 4

The request charge is still 3.03 RUs. Why is that?


Test 4 Where clause with no Select

        // Get LINQ IQueryable object
        var queryable = _container.GetItemLinqQueryable<UserViewModel>();

        // Construct LINQ query
        var matches = queryable
              .Where(b => b.FirstName == "Reggie");

Question 5

The request charge is now 3.8 RUs. Why is it higher than the 3.03 RUs?

Adding a Select clause brings the charge back to 3.03 RUs, even though a Select seems redundant given that I already use UserViewModel in GetItemLinqQueryable.

Note that I also tested adding a UserId property to the view model, thinking this might be a partition issue, and replaced the Where clause above with .Where(b => b.UserId == "920a8e38-481a-3874-f14e-6de85029419f"). The request charge is unchanged: 3.8 RUs > 3.03 RUs.

These questions should help me understand how best to ask the Cosmos DB servers to do the job of sending me only the bits I need. Test 1 above failed, apparently because ReadItemAsync<T> does not ask Cosmos DB to "cast" on the server and send the much smaller view model over the wire. Instead, judging by these tests, it fetches the entire document from Cosmos DB and does the cast on my own app server.

Thank you

Answer 1

Score: 2

> I obtain of course the (huge) User as a resource, but the request charge is the same, 4.76 RUs

The model (type) is only used for deserialization; ReadItem always returns the entire document. If your model has fewer properties, the others are simply not deserialized, but they are still part of the response.
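
A minimal sketch of that point, assuming System.Text.Json and a made-up, trimmed-down payload (the Cosmos SDK uses its own configured serializer, and `DeserializationDemo`, `FullDocument` and the `orders` array are hypothetical names chosen for illustration). The whole JSON string reaches the client, and the smaller type just ignores the properties it does not declare:

```csharp
using System;
using System.Text.Json;

public class UserViewModel
{
    public string FirstName { get; set; }
}

public static class DeserializationDemo
{
    // Hypothetical payload standing in for a full User document.
    public const string FullDocument =
        "{\"id\":\"1\",\"userId\":\"u1\",\"firstName\":\"Reggie\",\"orders\":[{\"id\":\"o1\"},{\"id\":\"o2\"}]}";

    // The entire string is parsed on the client; properties that
    // UserViewModel does not declare (id, userId, orders) are skipped.
    public static UserViewModel ReadAsViewModel(string json)
    {
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        return JsonSerializer.Deserialize<UserViewModel>(json, options);
    }
}
```

Calling `DeserializationDemo.ReadAsViewModel(DeserializationDemo.FullDocument)` gives a view model whose `FirstName` is populated, even though the `orders` array was part of the input all along. The type parameter changes what the client keeps, not what the service sends.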

> Why is this method (3.03 RUs) more efficient than a point read (4.76 RUs)?

Probably because the data volume is smaller. It is not a fair comparison, because the query applies a projection and thus reduces the volume of data returned from the service.

> The request charge is now 3.8 RUs. Why is it higher than the 3.03 RUs?

Your query is different: it has a filter but no projection (no Select), so the service processes and returns more data.
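
One way to check what each LINQ expression actually asks the service for is to print the SQL the SDK generates. The sketch below reuses the `User` and `UserViewModel` classes from the question and the SDK's `ToQueryDefinition` extension from `Microsoft.Azure.Cosmos.Linq`. The class and method names (`LinqInspection`, `FilterOnlySql`, `FilterAndProjectSql`) are made up for illustration, and a `Container` instance is assumed to be available:

```csharp
using System.Linq;
using Microsoft.Azure.Cosmos;
using Microsoft.Azure.Cosmos.Linq;

public static class LinqInspection
{
    // SQL generated for Test 4: a filter with no Select.
    public static string FilterOnlySql(Container container) =>
        container.GetItemLinqQueryable<UserViewModel>()
            .Where(u => u.FirstName == "Reggie")
            .ToQueryDefinition()
            .QueryText;

    // SQL generated for Test 3: the same filter plus a projection.
    public static string FilterAndProjectSql(Container container) =>
        container.GetItemLinqQueryable<User>()
            .Where(u => u.FirstName == "Reggie")
            .Select(u => new UserViewModel { FirstName = u.FirstName })
            .ToQueryDefinition()
            .QueryText;
}
```

Comparing the two strings should show whether the type parameter passed to GetItemLinqQueryable changes the selected columns, or whether only the explicit Select does.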

The bottom line: a ReadItem will be better than a SELECT * WHERE Id/PartitionKey. But if the documents are big, a SELECT <single property> WHERE Id/PartitionKey whose projection leaves out the large properties might be cheaper, because the volume of data is much lower.
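
As a hedged sketch of that last option, the query can be written in SQL directly and scoped to a single logical partition. It assumes the documents are stored with camelCase property names (`firstName`), as the responses in the question suggest, and `ProjectionQuery` and `ReadViewModelAsync` are hypothetical names. The actual charge depends on document size and indexing, so it is worth measuring rather than assuming:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class ProjectionQuery
{
    public static async Task<(List<UserViewModel> Items, double RequestCharge)> ReadViewModelAsync(
        Container container, string id, string userId)
    {
        // Project only the small property; Orders and Sales never leave the service.
        var query = new QueryDefinition("SELECT c.firstName FROM c WHERE c.id = @id")
            .WithParameter("@id", id);

        // Restrict the query to one logical partition (partition key path /userId).
        var options = new QueryRequestOptions { PartitionKey = new PartitionKey(userId) };

        var items = new List<UserViewModel>();
        double charge = 0;

        using FeedIterator<UserViewModel> iterator =
            container.GetItemQueryIterator<UserViewModel>(query, requestOptions: options);

        while (iterator.HasMoreResults)
        {
            FeedResponse<UserViewModel> page = await iterator.ReadNextAsync();
            charge += page.RequestCharge; // accumulate RUs across pages
            items.AddRange(page);
        }

        return (items, charge);
    }
}
```

Returning the accumulated request charge alongside the items makes it easy to compare this path with `ReadItemAsync` on the same document.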

huangapple
  • Published on August 5, 2023, 00:02:37
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76837539.html