March 7, 2023 16:51:35

How do I tune symbol column capacity in QuestDB?

Question

Say I have a symbol column with 1M unique values and 10M rows per day, and I want to add an index on that column.

How do I tune symbol and index capacity to make sure that QuestDB performance is optimal?

The default values are 256:

CREATE TABLE my_table (symb SYMBOL CAPACITY 256 INDEX CAPACITY 256);

Answer 1

Score: 1

Internally, symbols use a symbol table, i.e. a mapping from string values to internal IDs (32-bit integers).

There are two separate capacity settings:

Symbol table capacity: symb SYMBOL CAPACITY N. This should be at least as large as the expected number of unique symbol values. You can think of the symbol table as a persistent hash table: if there are too few buckets, lookups will perform unnecessary bucket scans.
Index block capacity: symb SYMBOL INDEX CAPACITY M. We recommend keeping the default value, which is 256. Index blocks are part of a persistent linked list that stores row IDs for a given symbol value. There is no need to tweak this capacity because the linked list grows when needed.

You should set the symbol table capacity as big as the expected number of unique symbol values while keeping the default value for the index block capacity:

CREATE TABLE my_table (symb SYMBOL CAPACITY 1000000 INDEX);
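For the scenario in the question, a fuller table definition might look like the sketch below. The table name trades_example and the ts and price columns are hypothetical placeholders, and the explicit INDEX CAPACITY 256 simply spells out the default mentioned above. The final query assumes the count_distinct aggregate is available in your QuestDB version. It is one way to check how many unique values an existing column actually holds before choosing N.

```sql
-- Hypothetical schema for illustration: only the symb column comes from the question.
CREATE TABLE trades_example (
    ts TIMESTAMP,
    -- Symbol table sized for ~1M unique values; index block capacity left at the default.
    symb SYMBOL CAPACITY 1000000 INDEX CAPACITY 256,
    price DOUBLE
) TIMESTAMP(ts) PARTITION BY DAY;

-- On a table that already holds data, compare the real cardinality
-- with the capacity you configured.
SELECT count_distinct(symb) FROM trades_example;
```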

There is also a CACHE/NOCACHE setting, which enables or disables the on-heap cache used for symbol lookups. It is enabled by default, and we recommend disabling it only when your symbol column has a very large number of unique values (well over a million) or when you have so many symbol columns that the caches for all of them would not fit in the JVM heap.
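As a rough illustration of the cache setting, the sketch below creates a high-cardinality column with the cache disabled and then toggles it on an existing table. The table name many_symbols, the ts column, and the capacity of 50000000 are made-up values for the example, not recommendations.

```sql
-- Hypothetical table with a very high-cardinality symbol column and no on-heap cache.
CREATE TABLE many_symbols (
    ts TIMESTAMP,
    symb SYMBOL CAPACITY 50000000 NOCACHE
) TIMESTAMP(ts) PARTITION BY DAY;

-- The cache can also be switched on or off after the table exists.
ALTER TABLE many_symbols ALTER COLUMN symb CACHE;
ALTER TABLE many_symbols ALTER COLUMN symb NOCACHE;
```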

One more thing to note: while indexes help avoid full scans in certain queries, they slow down inserts. We recommend starting with no index and adding one only if you find query performance insufficient.
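The "start without an index" approach could be sketched as follows. The table name ticks_example, the ts column, and the filter value 'ABC' are hypothetical. The ALTER statement adds an index with the default block capacity.

```sql
-- Start with a correctly sized symbol table but no index.
CREATE TABLE ticks_example (
    ts TIMESTAMP,
    symb SYMBOL CAPACITY 1000000
) TIMESTAMP(ts) PARTITION BY DAY;

-- A filter on the symbol column is the kind of query an index can speed up.
SELECT * FROM ticks_example WHERE symb = 'ABC';

-- If such queries turn out to be too slow, add the index later.
ALTER TABLE ticks_example ALTER COLUMN symb ADD INDEX;
```

Measuring insert throughput before and after adding the index shows whether the trade-off is worth it for your workload.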

