英文:
deterministic and reversible permutation function to store a message
问题
我想通过改变CSV文件列中行的顺序来隐藏一个秘密消息。
我正在寻找一个确定性和可逆的排列函数。
假设我有行列表A=[1,2,3,4,5]按字典序排序。我有5! = 120个可能的排列。这意味着我可以保存一个6位的消息,因为2^6 = 64 < 120。
我想要以下函数encode和decode。
A = [1,2,3,4,5]
message = '010010'
B = encode(A, message) # 例如,[2,1,3,4,5]
B = [2,1,3,4,5]
message = decode(B) # 获取消息
我可以计算所有排列,但如果有很多行,这将花费太长时间。这个函数必须能够处理很多行,例如10,000行。所以我在这里寻求帮助,如果你有一些建议。
英文:
I'd like to hide a secret message in a column of a csv file by changing the order of the row.
I'm looking for a deterministic and reversible permutation function.
Suppose I have the row list A=[1,2,3,4,5] ordered lexicographically. I have 5! = 120 possible permutations. This means I can save a 6-bit message because 2^6 = 64 < 120.
I would like to have the following functions encode and decode.
A = [1,2,3,4,5]
message = '010010'
B = encode(A, message) # [2,1,3,4,5] for example
B = [2,1,3,4,5]
message = decode(B) # get back message
I can compute all permutations, but that would take too long if there are a lot of lines. The function must work with a lot a lines, for instance 10 000 lines.
So I'm asking for help here if you have some suggestion.
答案1
得分: 3
想象一个按升序排列的所有排列的列表(最高有效元素在前)。让 N
代表元素的数量。
您想要确定您的排列在此列表中的(以0为基准的)索引。
您可以逐个元素地进行计算。如果您的第一个元素是3,您知道在以1或2开始的排列之前有 2*(N-1)!
种排列。
然后转到下一个元素,如果它是2,您会得到 1*(N-2)!
种排列,以3,1 开头。
依此类推。最后,您就能得到在列表中的排列的索引。
要进行反向操作,您需要将您的索引分解为模 1!
、2!
、3!
、... 的部分,就像您会分解一个十进制数字的各位数一样。这是因为较小的阶乘总是能够整除较大的阶乘而不产生余数。
最后,您将您的索引号解释为二进制数。
英文:
Imagine a list of all permutations which is sorted in ascending order (most significant element first). Let N
be the number of elements.
You want to determine the (0-based) index of your permutation in this list.
You can go element by element. If your fist element is a 3, you know that there are 2*(N-1)!
many permutations before that starting with 1 or 2.
Then go to the next element, if its a 2, you get 1*(N-2)!
permutations before, starting with 3,1.
Ans so on. At the end you have the index of the permutation in the list.
For the reverse have to decompose your index into the 1!
, 2!
, 3!
, ... part with modulo. Like you would decompose the digits of a base10 number. This works because smaller factorials always divide the bigger factorials without remainder.
You finally interpret your index-number as binary.
答案2
得分: 2
这里有一个在线性时间内将排列和排名相互转换的算法。然而,它使用的排名不是词典序的。它有点奇怪,但是是一致的。我将提供两个函数,一个用于将排名转换为排列,另一个用于执行相反的操作。
首先,要取消排名(从排名到排列的转换):
初始化:
n = 排列的长度
r = 所需的排名
p = 包含n个元素的恒等排列 [0, 1, ..., n]
unrank(n, r, p)
如果 n > 0 则
交换(p[n-1], p[r mod n])
unrank(n-1, floor(r/n), p)
结束
接下来,要进行排名:
初始化:
p = 输入排列
q = 输入排列的逆排列(线性时间内计算,q] = i,其中 0 <= i < n)
n = p的长度
rank(n, p, q)
如果 n=1 则 返回 0
s = p[n-1]
交换(p[n-1], p[q[n-1]])
交换(q展开收缩, q[n-1])
返回 s + n * rank(n-1, p, q)
这两者的运行时间都是 O(n)。
有一篇很好的易读论文解释了为什么这个算法有效:《在线性时间内排列和取消排列排列》,作者是 Myrvold & Ruskey,发表于《信息处理通讯》第79卷,第6期,2001年9月30日,第281-284页。
论文链接:Ranking & Unranking Permutations in Linear Time
英文:
Here's an algorithm to convert between permutations and ranks in linear time. However, the ranking it uses is not lexicographic. It's weird, but consistent. I'm going to give two functions, one that converts from a rank to a permutation, and one that does the inverse.
First, to unrank (go from rank to permutation)
Initialize:
n = length(permutation)
r = desired rank
p = identity permutation of n elements [0, 1, ..., n]
unrank(n, r, p)
if n > 0 then
swap(p[n-1], p[r mod n])
unrank(n-1, floor(r/n), p)
fi
end
Next, to rank:
Initialize:
p = input permutation
q = inverse input permutation (in linear time, q] = i for 0 <= i < n)
n = length(p)
rank(n, p, q)
if n=1 then return 0 fi
s = p[n-1]
swap(p[n-1], p[q[n-1]])
swap(q
展开收缩, q[n-1])
return s + n * rank(n-1, p, q)
end
The running time of both of these is O(n).
There's a nice, readable paper explaining why this works: Ranking & Unranking Permutations in Linear Time, by Myrvold & Ruskey, Information Processing Letters Volume 79, Issue 6, 30 September 2001, Pages 281–284.
http://webhome.cs.uvic.ca/~ruskey/Publications/RankPerm/MyrvoldRuskey.pdf
答案3
得分: 1
简单方法
归并排序具有方便的特性,即如果输入中没有重复元素,那么每次使用的比较都会对输出产生不同的影响。您可以利用这个特性轻松地编码一条消息。
首先,将您的列按照排序顺序排列,然后执行归并排序,但是每次比较两个元素时,使用您的消息中的一位来确定结果。
为了解码您的消息,您必须以相反的顺序执行归并排序操作。在每次取消归并时,您可以使用每个元素的值来确定它来自哪个输入列表,这样就解码了一位消息。
这种方法相当不错,但它并不总是提供绝对最大数量的位。有时它给您提供了稍微少一些,有时则更多。
如果您真的需要最大数量的位数,那么...
更难的方法
在执行选择排序时,存在正好有N!个可能的“下一个最小元素”结果序列,每个不同的序列都会产生不同的输出。
首先,按顺序排列您的元素,然后执行选择排序,但是不要搜索最小元素,而是使用算术编码从您的消息中解码所选元素的索引。
使用一个合理大小的寄存器进行算术编码,这将接近最优的floor(log(N!)),只有1或2位的差距。
英文:
Easy way
Merge sort has the convenient property that every comparison used has a distinct effect on the output if there are no duplicate elements in the input. You can use this property to encode a message easily.
Start with your column in sorted order, and then perform a merge sort, except that each time you would compare two elements, use a bit from your message to determine the result instead.
In order to decode your message, you have to perform the merge sort operation in reverse. In each un-merge, you can use the value of each element to determine which input list it came from, which decodes one bit of message.
This method is pretty good, but it doesn't always provide the absolute maximum number of bits. Sometimes it gives you a little less, and sometimes more.
If you really need the maximum number of bits, then...
Harder Way
When performing a selection sort, there are exactly N! possible sequences of "next minimum element" results, and each different sequence produces different output.
Start with your elements in order, and then perform a selection sort, but instead of doing a search for the minimum element, use arithmetic decoding to decode the index of the chosen element from your message.
Using a reasonably sized register for the arithmetic encoding, this will within 1 or 2 bits of the optimal floor(log(N!)).
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论