How to hash a string column in Power Query
Question
I need a function written in native Power Query (M) that hashes a text string. I have tried using Web.Page with JavaScript, but it never waits for the script to complete.
I would like it to return an integer.
What are some good ways to do this?
Answer 1
Score: 2
Using the algorithm from the linked JavaScript version, we can hash a string in Power Query with list functions. The purpose is to convert a GUID or file name to an integer hash to save memory.
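let
    HashFunction = (input) =>
        let
            // Split the text into single characters
            ListChars = Text.ToList(input),
            // Map each character to its numeric code
            ListNumbers = List.Transform(ListChars, each Character.ToNumber(_)),
            // Fold over the codes: hash = (hash * 31 + code) mod 9223372036854775807
            HashNumber = List.Accumulate(
                ListNumbers,
                0,
                (state, current) => Number.Mod(state * 31 + current, 9223372036854775807)
            )
        in
            HashNumber
in
    HashFunction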
The function converts the string to a list of characters and then converts each character to its numeric code with Character.ToNumber.
Each step multiplies the running hash by a constant (31), adds the current character code, and takes the result modulo 9223372036854775807 (2^63 - 1), which keeps it within the signed 64-bit integer range rather than the 32-bit range.
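Since the question asks about hashing a column, here is a minimal sketch of applying the function row by row. The sample table and the column names "FileName" and "Hash" are assumptions made up for illustration, and the function is repeated in compact form only so that the snippet runs on its own.

```powerquery
let
    // Same logic as HashFunction above, written compactly
    HashFunction = (input as text) as number =>
        List.Accumulate(
            List.Transform(Text.ToList(input), Character.ToNumber),
            0,
            (state, current) => Number.Mod(state * 31 + current, 9223372036854775807)
        ),
    // Hypothetical sample data; "FileName" is an assumed column name
    Source = #table(type table [FileName = text], {{"ab"}, {"report.xlsx"}}),
    // Add the hash as a new column.
    // For "ab" the fold gives (0 * 31 + 97) * 31 + 98 = 3105
    WithHash = Table.AddColumn(Source, "Hash", each HashFunction([FileName]), type number)
in
    WithHash
```

In a real query, Source would be the previous step of the query, and the column reference inside `each` would be whichever text column needs hashing.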
Edit: The function above has a high collision rate for similar strings.
The function below works better. It expects a query called 'prime', defined elsewhere, that holds a prime number from the sequence 13, 131, 1313... and is passed in as the seed.
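let
    BKDRHashFunction = (input, seed) =>
        let
            // Split the text into single characters
            ListChars = Text.ToList(input),
            // Map each character to its numeric code
            ListNumbers = List.Transform(ListChars, each Character.ToNumber(_)),
            // Fold over the codes: hash = (hash * seed + code) mod 2147483647
            HashNumber = List.Accumulate(
                ListNumbers,
                0,
                (state, current) => Number.Mod(state * seed + current, 2147483647)
            )
        in
            HashNumber
in
    BKDRHashFunction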
The collision rate appears to be much better for this one.
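One way to check the collision rate on your own data is to hash every distinct value and look for hash values shared by more than one input. The sketch below assumes a seed of 131 (taken from the sequence mentioned above) and uses a made-up list of file names; both are placeholders, not part of the original answer.

```powerquery
let
    // BKDR-style hash as defined above, repeated so the snippet runs on its own
    BKDRHashFunction = (input as text, seed as number) as number =>
        List.Accumulate(
            List.Transform(Text.ToList(input), Character.ToNumber),
            0,
            (state, current) => Number.Mod(state * seed + current, 2147483647)
        ),
    // Hypothetical inputs; replace with the distinct values of the real column
    Inputs = {"file_001.csv", "file_002.csv", "file_010.csv"},
    Source = Table.FromList(Inputs, Splitter.SplitByNothing(), {"Value"}),
    // Seed 131 is an assumption; pass the 'prime' query here instead if it exists
    WithHash = Table.AddColumn(Source, "Hash", each BKDRHashFunction([Value], 131), Int64.Type),
    // One row per hash value, with the number of distinct inputs that produced it
    Grouped = Table.Group(
        WithHash,
        {"Hash"},
        {{"InputCount", each List.Count(List.Distinct([Value])), Int64.Type}}
    ),
    // Any row that survives this filter is a collision
    Collisions = Table.SelectRows(Grouped, each [InputCount] > 1)
in
    Collisions
```

An empty result means no collisions among the tested values. With a modulus of 2147483647 and a small seed such as 131, every intermediate product stays far below 2^53, so the arithmetic should remain exact even if M evaluates numbers as double-precision values.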