英文:
Tail recursive remove duplicate consecutive entries in list
问题
I'm here to provide the translated content as requested. Here's the translation of the text you provided:
我正在尝试解决99个OCaml问题中的问题8,该问题要求编写一个名为compress
的函数,用于移除列表中的连续重复项:
assert (
compress
[ "a"; "a"; "a"; "a"; "b"; "c"; "c"; "a"; "a"; "d"; "e"; "e"; "e"; "e" ]
= [ "a"; "b"; "c"; "a"; "d"; "e" ])
我提出了以下解决方案:
let head = function x :: _ -> Some x | [] -> None
let compress list =
let rec fn old_list new_list =
match (old_list, new_list) with
| h :: t, _ ->
fn t (if Some h = head new_list then new_list else h :: new_list)
| _ -> new_list
in
List.rev (fn list [])
;;
网站提供的示例解决方案如下:
let rec compress = function
| a :: (b :: _ as t) -> if a = b then compress t else a :: compress t
| smaller -> smaller;;
起初,我认为我的解决方案更高效,因为它是尾递归的,而提供的解决方案显然不是(需要在堆栈上保留a :: compress t
中的a
)。然而,当我测试我的代码是否尾递归时:
assert (
(compress [@tailcall])
[ "a"; "a"; "a"; "a"; "b"; "c"; "c"; "a"; "a"; "d"; "e"; "e"; "e"; "e" ]
= [ "a"; "b"; "c"; "a"; "d"; "e" ])
它给我一个警告,说它不是尾递归的。为什么?
从我的理解来看,我的解决方案不需要在堆栈上保留任何状态,这应该使它成为尾递归的。
编辑
还尝试将[@tailcall]
应用于fn
,通过List.rev ((fn [@tailcall]) list [])
,但收到相同的警告。
英文:
I'm attempting problem 8 of the 99 OCaml problems, which asks you to write a function compress
that removes consecutive duplicate entires in a list:
assert (
compress
[ "a"; "a"; "a"; "a"; "b"; "c"; "c"; "a"; "a"; "d"; "e"; "e"; "e"; "e" ]
= [ "a"; "b"; "c"; "a"; "d"; "e" ])
I came to the following solution:
let head = function x :: _ -> Some x | [] -> None
let compress list =
let rec fn old_list new_list =
match (old_list, new_list) with
| h :: t, _ ->
fn t (if Some h = head new_list then new_list else h :: new_list)
| _ -> new_list
in
List.rev (fn list [])
;;
The provided example solution by the website is as follows:
let rec compress = function
| a :: (b :: _ as t) -> if a = b then compress t else a :: compress t
| smaller -> smaller;;
At first, I thought that my solution was more efficient as it is tail recursive, while the provided solution is clearly not (requires us to keep the a
in a :: compress t
on the stack). However, when I test if my code is tail recursive:
assert (
(compress [@tailcall])
[ "a"; "a"; "a"; "a"; "b"; "c"; "c"; "a"; "a"; "d"; "e"; "e"; "e"; "e" ]
= [ "a"; "b"; "c"; "a"; "d"; "e" ])
It gives me a warning saying it's not tail recursive. Why?
From my understanding, my solution doesn't require keeping any state on the stack, which should make it tail recursive.
EDIT
Also tried applying the [@tailcall]
to fn
directly via List.rev ((fn [@tailcall]) list [])
, get the same warning.
答案1
得分: 2
Here are the translated portions of your text:
当你进行断言时,compress
不在尾位置,(
=)
是。 这与你的 compress
函数实现无关。
同样,在表达式 List.rev ((fn [@tailcall]) list [])
中,fn
的调用不在尾调用位置。
你可以通过尝试以下方式进行测试:
请注意,尾递归并不总是意味着_更高效_。尾递归函数通常_不太_高效,但它们可以用于大型数据而不会导致栈溢出。如果你处理可能导致这种情况的数据,那么建议重新评估你正在使用的数据结构。
截至 OCaml 4.14 版本,我们也可以使用 tail_mod_cons
使 compress
成为尾递归。
或者,你可以考虑使用延续传递方式来实现这个功能。
作为另一种有趣的尾递归替代方法,你可以编写一个生成 sequence 的 compress
函数。为了做到这一点,我们的 aux
函数需要接受一个参数来跟踪上次看到的值。因为在开始时不会有上次看到的值,所以使用 option
类型是有意义的。
以下是翻译好的部分。如果需要更多翻译,请提供具体的文本。
英文:
When you make your assertion, compress
isn't in tail position, (=)
is. This has no bearing on your compress
function implementation.
# assert (
((=) [@tailcall])
(compress ["a"; "a"; "b"])
["a"; "b"]
);;
- : unit = ()
Similarly, in the expression List.rev ((fn [@tailcall]) list [])
, the call of fn
is not in the tail call position.
You can test this by trying:
let compress list =
let rec fn old_list new_list =
match (old_list, new_list) with
| h :: t, _ ->
(fn[@tailcall]) t (if Some h = head new_list then new_list else h :: new_list)
| _ -> new_list
in
List.rev (fn list [])
Note also that tail-recursive does not always mean more efficient. Tail-recursive functions are often less efficient, but they can be used on large data without a stack overflow. If you are dealing with data likely to cause this, it suggests you may need to re-evaluate the data structure you're using.
We could also, as of OCaml 4.14 make compress
tail-recursive with tail_mod_cons
.
let[@tail_mod_cons] rec compress =
function
| a :: (b :: _ as t) ->
if a = b then compress t
else a :: compress t
| smaller -> smaller;;
Alternatively, you might implement this with continuation passing.
let compress lst =
let rec aux k = function
| ([] | [_]) as lst -> k lst
| a::(b::_ as t) when a = b -> aux k t
| a::t -> aux (fun i -> k (a :: i)) t
in
aux Fun.id lst
As yet another fun alternative that is tail-recursive, you might write a compress
function which generates a sequence. In order to do this, our aux
function needs to take an argument to keep track of the last value seen. Because at the beginning there will not be a last value seen, the option
type makes sense.
# let compress lst =
let rec aux lst last_seen () =
match lst, last_seen with
| [], _ -> Seq.Nil
| x::xs, Some x' when x = x' -> aux xs last_seen ()
| x::xs, _ -> Seq.Cons (x, aux xs (Some x))
in
aux lst None;;
val compress : 'a list -> 'a Seq.t = <fun>
# compress [1;1;1;3;3;4;6;6;7;4]
|> Seq.take 3
|> List.of_seq;;
- : int list = [1; 3; 4]
答案2
得分: 1
以下是已翻译的内容:
"Okay, so I figured it out.
1) Is it tail recursive?
To test if the functions were tail recursive, I decided to just try and break them by inducing a stack overflow:
let _ = compress (List.init 10_000_000 (fun x -> Some x))
For my implementation, this works just fine. The provided solution, on the other hand, results in a segvault, which I'm assuming is from stack overflow:
[1] 70387 segmentation fault ./a.out
So we can conclude that my implementation is indeed tail recursive, while the other one is not.
2) Which one is faster?
I used the following to test the speed of both functions:
let _ =
let l = List.init 250000 (fun x -> x) in
let t = Sys.time () in
let f = compress l in
Printf.printf "Execution time: %fs\n" (Sys.time () -. t);
f
Note, 250000
is the limit on my machine before things stack-overflowed.
For the tail recursive implementation, it was around 0.018s
.
For the non-tail recursive implementation, it was around 0.013s
.
So seems that the overhead of function calls is not enough to make the non-tail recursive implementation slower than the tail recursive one, which requires 2 passes of the list.
I should also be noted, this is also for the worst case, where our input list List.init 250000 (fun x -> x)
is a list with all unique elements. The amount of stack space required for the non-tail recursive implementation is proportional to the number of unique elements, not the number of elements in the list, because in the left branch of if a = b then compress t else a :: compress t
, we don't use any stack space. I tested this by changing the list to only hold a constant and made the list much larger List.init 100000000 (fun x -> 0)
, and the non-tail recursive implementation no longer seg faults. It seems the OCaml compiler is smart enough to know that if the left branch of if a = b then compress t else a :: compress t
is tail recursive, and only allocates a stack if the right branch is hit."
英文:
Okay, so I figured it out.
1) Is it tail recursive?
To test if the functions were tail recursive, I decided to just try and break them by inducing a stack overflow:
let _ = compress (List.init 10_000_000 (fun x -> Some x))
For my implementation, this works just fine. The provided solution, on the other hand, results in a segvault, which I'm assuming is from stack overflow:
[1] 70387 segmentation fault ./a.out
So we can conclude that my implementation is indeed tail recursive, while the other one is not.
2) Which one is faster?
I used the following to test the speed of both functions:
let _ =
let l = List.init 250000 (fun x -> x) in
let t = Sys.time () in
let f = compress l in
Printf.printf "Execution time: %fs\n" (Sys.time () -. t);
f
Note, 250000
is the limit on my machine before things stack-overflowed.
For the tail recursive implementation, it was around 0.018s
.
For the non-tail recursive implementation, it was around 0.013s
.
So seems that the overhead of function calls is not enough to make the non-tail recursive implementation slower than the tail recursive one, which requires 2 passes of the list.
I should also be noted, this is also for the worst case, where our input list List.init 250000 (fun x -> x)
is a list with all unique elements. The amount of stack space required for the non-tail recursive implementation is proportional to the number of unique elements, not the number of elements in the list, because in the left branch of if a = b then compress t else a :: compress t
, we don't use any stack space. I tested this by changing the list to only hold a constant and made the list much larger List.init 100000000 (fun x -> 0)
, and the non-tail recursive implementation no longer seg faults. It seems the OCaml compiler is smart enough to know that if the left branch of if a = b then compress t else a :: compress t
is tail recursive, and only allocates a stack if the right branch is hit.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论