September 19, 2020, 17:29:32

Is it better to do one db request and process data in application, or fetch data from db in multiple queries

Question

I have about 100 inventory items of different types stored in the database for a single property. Is it better to run one query per type, like this:

List<Inventory> type1s = inventoryRepo.findByPropertyIdAndType(propertyId, Type1);
List<Inventory> type2s = inventoryRepo.findByPropertyIdAndType(propertyId, Type2);
Map<InventoryType, List<Inventory>> typeListMap = new HashMap<>();
typeListMap.put(Type1, type1s);
typeListMap.put(Type2, type2s);

or to fetch everything in one query and group the results in the application, like this:

List<Inventory> inventories = inventoryRepo.findByPropertyIdAndTypeIn(propertyId,
        Arrays.asList(Type1, Type2));
Map<InventoryType, List<Inventory>> typeListMap = inventories.stream().collect(
        Collectors.groupingBy(Inventory::getType, Collectors.toList()));

Note: the database is PostgreSQL.
I think the second approach is better, following the rule of minimizing database calls. But am I missing other key aspects that should be considered?

Answer 1

Score: 2

As always with questions like this, the answer is: it depends.

If there is not much data to process, a single round trip is usually optimal.

On the other hand, as the amount of data grows, problems arise:

The application may run out of memory, and an OutOfMemoryError may be thrown.
The database may become unresponsive because it has to process a costly query.

In that case, batching is usually the most reasonable approach.
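
A minimal sketch of the batching idea, assuming a hypothetical `fetchPage` function (offset, limit) that stands in for a paged repository query; it is not part of the question's `inventoryRepo`. Each batch is handled and released before the next one is loaded, so memory use is bounded by the batch size rather than by the full result set:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

class BatchingSketch {

    /**
     * Reads rows in fixed-size batches until a short or empty page signals the end.
     * fetchPage is a hypothetical stand-in for a paged query: (offset, limit) -> rows.
     * Returns the number of queries issued.
     */
    static <T> int processInBatches(BiFunction<Integer, Integer, List<T>> fetchPage,
                                    int batchSize,
                                    Consumer<List<T>> handler) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int offset = 0;
        int queries = 0;
        while (true) {
            List<T> page = fetchPage.apply(offset, batchSize);
            queries++;
            if (!page.isEmpty()) {
                handler.accept(page);
            }
            if (page.size() < batchSize) {
                return queries; // a short page means there is nothing left to read
            }
            offset += batchSize;
        }
    }

    /** In-memory stand-in for a table, used only to make the sketch runnable. */
    static List<Integer> fakeTable(int rows) {
        List<Integer> table = new ArrayList<>();
        for (int i = 0; i < rows; i++) {
            table.add(i);
        }
        return table;
    }

    public static void main(String[] args) {
        List<Integer> table = fakeTable(25);
        BiFunction<Integer, Integer, List<Integer>> fetchPage = (offset, limit) ->
                table.subList(Math.min(offset, table.size()),
                              Math.min(offset + limit, table.size()));
        List<Integer> batchSizes = new ArrayList<>();
        int queries = processInBatches(fetchPage, 10, page -> batchSizes.add(page.size()));
        System.out.println(queries + " queries, batch sizes " + batchSizes);
    }
}
```

Against a real database, the paged query needs a stable ORDER BY so that consecutive pages neither overlap nor skip rows. For the roughly 100 rows in the question, none of this is needed; the sketch only applies once the result set is too large to hold comfortably in memory.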

Answer 2

Score: 0

It is better to fetch the data from the database in one go and then process it in the application, since keeping the load on the database low is generally advisable if you ever need to scale.
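
A self-contained sketch of the fetch-once-and-group approach, using a hypothetical `Inventory` class and `InventoryType` enum in place of the question's entities. One detail worth noting: `Collectors.groupingBy` only creates keys for types that actually occur in the result, whereas the two-query version in the question always puts an entry for each type. The hypothetical `groupByType` helper below fills in empty lists for requested types that returned no rows:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingSketch {

    // Hypothetical stand-ins for the question's InventoryType and Inventory.
    enum InventoryType { TYPE1, TYPE2 }

    static final class Inventory {
        final String name;
        final InventoryType type;

        Inventory(String name, InventoryType type) {
            this.name = name;
            this.type = type;
        }

        InventoryType getType() {
            return type;
        }
    }

    /** Groups rows fetched by a single query; requested types with no rows get an empty list. */
    static Map<InventoryType, List<Inventory>> groupByType(List<Inventory> rows,
                                                           List<InventoryType> requested) {
        Map<InventoryType, List<Inventory>> grouped = rows.stream().collect(
                Collectors.groupingBy(Inventory::getType,
                        () -> new EnumMap<>(InventoryType.class),
                        Collectors.toList()));
        for (InventoryType type : requested) {
            grouped.putIfAbsent(type, Collections.emptyList());
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Pretend these rows came back from one findByPropertyIdAndTypeIn call.
        List<Inventory> rows = Arrays.asList(
                new Inventory("chair", InventoryType.TYPE1),
                new Inventory("desk", InventoryType.TYPE1));
        Map<InventoryType, List<Inventory>> grouped =
                groupByType(rows, Arrays.asList(InventoryType.TYPE1, InventoryType.TYPE2));
        grouped.forEach((type, items) -> System.out.println(type + ": " + items.size()));
    }
}
```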

Answer 3

Score: -1

Up to a point, the fastest approach is to gather all data at once. Above that point, it would be advisable to separate the data into chunks of, say, 10K rows at a time.

It's up to you to discover the point at which it would be best to split, and the optimal chunk size.

I can't think of any scenario in which selecting one row at a time would be faster.
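
To make the comparison concrete, the number of round trips for a given chunk size is a ceiling division. The sketch below uses a hypothetical row count and a hypothetical `roundTrips` helper; it only counts queries and makes no claim about actual timings, which depend on the network, the query, and the database:

```java
class RoundTrips {

    /** Number of queries needed to read totalRows in chunks of chunkSize rows. */
    static long roundTrips(long totalRows, long chunkSize) {
        if (totalRows < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("totalRows must be >= 0 and chunkSize > 0");
        }
        return (totalRows + chunkSize - 1) / chunkSize; // ceiling division
    }

    public static void main(String[] args) {
        long rows = 25_000; // hypothetical table size
        System.out.println("one row at a time: " + roundTrips(rows, 1));      // 25000
        System.out.println("10K-row chunks:    " + roundTrips(rows, 10_000)); // 3
        System.out.println("all at once:       " + roundTrips(rows, rows));   // 1
    }
}
```

Every round trip carries a fixed cost for network latency and query setup, which is why reading one row at a time pays that cost once per row.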
