英文:
How to create an index in Postgres on an expression involving TEXT columns and a DATE column, concatenated?
问题
我想能够搜索与我在数据库中拥有的内容“相似”的人(例如,在他们的姓名或出生日期中有错别字的人)。我想要以以下方式使用levenshtein
函数:
create temp table person (first_name text, last_name text, dob date);
select * from person where levenshtein('johndoe2000-01-01', lower(first_name || last_name || dob)) < 4;
(我知道这可能可以通过使用YYYYMMDD
而不是YYYY-MM-DD
来进行优化,但这与我的问题无关。)
我认为可以通过索引加速搜索。然而,我不知道如何创建一个能够工作的索引:
create index fuzzy on person ((lower(first_name || last_name || dob)));
结果显示ERROR: functions in index expression must be marked IMMUTABLE
。我尝试了一些方法:使用concat
代替||
,显式或使用to_char
将dob
强制转换为text
,但都无济于事。我做错了什么?问题似乎源于混合使用text
和date
- 我可以创建只包含text
列或只包含date
列的索引而没有问题。
英文:
I want to be able to search for people "similar" to what I have in the database (e.g. a person with a typo in their name or date of birth). I want to use the levenshtein
function for that in the following way:
create temp table person (first_name text, last_name text, dob date);
select * from person where levenshtein('johndoe2000-01-01', lower(first_name || last_name || dob)) < 4;
(I am aware this can be probably optimized a bit by e.g. using YYYYMMDD
instead of YYYY-MM-DD
, but that is irrelevant to my question.)
I thought that I could speed up the search with an index. However, I don't know how to create one that would work:
create index fuzzy on person ((lower(first_name || last_name || dob)));
results in ERROR: functions in index expression must be marked IMMUTABLE
. I tried a few things: using concat
instead of ||
, casting dob
to text
explicitly or with to_char
, but to no avail. What am I doing wrong? It seems that the problem results from mixing text
and date
– I can create an index with only text
columns or only the date
column without a problem.
答案1
得分: 1
这个练习没有意义:你永远无法使用索引来加速使用 levenshtein()
的查询。
英文:
The exercise is pointless: you can never use an index to speed up a query that uses levenshtein()
.
答案2
得分: 0
将日期转换为字符串取决于可能会更改的数据库设置,因此它不是不可变的。
select '2021-04-03'::date::text;
2021-04-03
set datestyle TO 'german';
select '2021-04-03'::date::text;
03.04.2021
您可以以多种方式解决这个问题。但一旦解决,它仍然无法作为Levenshtein距离通常无法建立索引。
英文:
Casting dates to strings depends on database setting that might change, and so it is not immutable.
select '2021-04-03'::date::text;
2021-04-03
set datestyle TO 'german';
select '2021-04-03'::date::text;
03.04.2021
You could solve that problem in a variety of ways. But once you do, it still wouldn't work as levenshtein distance is generally not indexable.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论