How to persist an object in Haskell to a memory location via unsafePerformIO or similar

# Question

In a Haskell system I don't have much control over, I provide a syntactically pure function with the following signature:

    doTheWork :: Int -> TheInput -> MyResult
    doTheWork counter input = ...

This function does some work and returns its result. However, the function is called again and again (until the system resources are used up) with incremented `Int`s to do more work on the same problem (the _same_ `TheInput` data).

To take advantage of being called repeatedly, rather than starting from scratch each time, I would need to store the intermediate `MyResult`s. Of course, the "proper", semantically pure way would be for the system to pass the previous result along with each call; i.e. the function signature would be something like `doTheWork :: MyResult -> Int -> TheInput -> MyResult`. But I have no control over the signature. My function has to conform to the signature shown at the beginning of the question! And I cannot move the function into the `IO` monad of the program (because I don't have control over where the function is called).

So I was thinking of storing the `MyResult`s myself in memory via `unsafePerformIO` or similar. However, I got stuck on the "save to memory" part. I can print the results to stdout with something like

```haskell
import System.IO.Unsafe (unsafePerformIO)

storeMyResults :: MyResult -> MyResult
storeMyResults results =
  unsafePerformIO $ do
    print results  -- requires a Show instance for MyResult
    return results
```

and then use `storeMyResults results` at the end of `doTheWork` when returning the results, to write them out each time the function has computed new ones.

But I have no idea how to write the `MyResult` value to memory and then read it back the next time I get called. I think I could write it to a file, but that's unnecessarily slow.

Is there a way to store a value in a memory location?


# Answer 1
**Score**: 3

The typical way to do this is to `unsafePerformIO` yourself a top-level [`IORef`](https://hackage.haskell.org/package/base-4.18.0.0/docs/Data-IORef.html). `IORef`s normally provide "mutable memory location" functionality within `IO`, but you can also break one out of `IO`.

    import Data.IORef (IORef, newIORef, readIORef, writeIORef)
    import System.IO.Unsafe (unsafePerformIO)

    data MemoRecord = MemoRecord Int TheInput MyResult

    {-# NOINLINE cacheCell #-}
    cacheCell :: IORef (Maybe MemoRecord)
    cacheCell = unsafePerformIO (newIORef Nothing)

At this point, you have a mutable, program-global variable, like you would in C. The `NOINLINE` pragma matters: without it GHC may inline `cacheCell` and create more than one `IORef`.

Proceed to wire this into `doTheWork`. Take care to save the inputs as well as the output, so you can check that a cached result really belongs to the current input. Again, do this in `IO` and then `unsafePerformIO` yourself out.

    -- really, it's up to you how you want to do this
    -- I'll just write something plausible so you know how

    doTheWork :: Int -> TheInput -> MyResult
    doTheWork i x = unsafePerformIO $ do
        -- look up the previous record, keeping it only if the input matches
        hint <- (check =<<) <$> readIORef cacheCell
        let r = doTheWorkHinted i x hint
        writeIORef cacheCell (Just (MemoRecord i x r))
        return r
      where
        -- check :: MemoRecord -> Maybe (Int, MyResult)
        check (MemoRecord j x' r')
          | x == x'   = Just (j, r') -- assuming Eq TheInput; otherwise you may have to resort to System.Mem.StableName, which can give false negatives
          | otherwise = Nothing

    -- given an int, the input, and maybe a previous result from the same input, find the result
    doTheWorkHinted :: Int -> TheInput -> Maybe (Int, MyResult) -> MyResult
    doTheWorkHinted = error "implement this"
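
As a concrete illustration of the top-level `IORef` approach in this answer, here is a minimal sketch with made-up types. It assumes `TheInput` is a list of `Int`s and `MyResult` is the sum of its first `counter` elements; both choices, and the running-sum workload, are hypothetical stand-ins for the question's real types. The result is forced before it is stored so the cache holds a value rather than a growing chain of thunks.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical stand-ins for the question's types.
type TheInput = [Int]
type MyResult = Int

-- Counter, input and result of the most recent call.
data MemoRecord = MemoRecord Int TheInput MyResult

-- NOINLINE keeps GHC from inlining the definition and creating several cells.
{-# NOINLINE cacheCell #-}
cacheCell :: IORef (Maybe MemoRecord)
cacheCell = unsafePerformIO (newIORef Nothing)

-- Sum of the first i elements. When a previous partial sum for a smaller
-- (or equal) counter is available, only the new elements are added.
doTheWorkHinted :: Int -> TheInput -> Maybe (Int, MyResult) -> MyResult
doTheWorkHinted i xs (Just (j, acc))
  | j <= i = acc + sum (take (i - j) (drop j xs))
doTheWorkHinted i xs _ = sum (take i xs)

{-# NOINLINE doTheWork #-}
doTheWork :: Int -> TheInput -> MyResult
doTheWork i xs = unsafePerformIO $ do
  -- Keep the cached record only if it was computed for the same input.
  hint <- (check =<<) <$> readIORef cacheCell
  let r = doTheWorkHinted i xs hint
  -- Force the result before storing it.
  r `seq` writeIORef cacheCell (Just (MemoRecord i xs r))
  return r
  where
    check (MemoRecord j xs' r')
      | xs == xs' = Just (j, r')
      | otherwise = Nothing
```

With these definitions, `doTheWork 3 [1 .. 10]` evaluates to `6` and `doTheWork 5 [1 .. 10]` to `15`. Because `doTheWorkHinted` returns the same value with or without a hint, the function stays observably pure even if laziness changes the order in which calls are evaluated. Note that this sketch uses plain `readIORef`/`writeIORef` and makes no attempt at thread safety.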

# Answer 2
**Score**: 2

This is too long for a comment, but a necessary remark.

Your description strongly suggests that you should _not_ dabble with any `IO` shenanigans to store those values. You don't even need to memoise function results (though that can also be done in a pure fashion, e.g. with MemoTrie).

Instead, you should simply flip the arguments around and then exploit partial application.

    doTheWork :: TheInput -> Int -> MyResult
    doTheWork inpData
        = \counter -> cheapSpecificComputation counter preciousSharedValue
      where preciousSharedValue = expensiveGeneralComputation inpData

This can then be used like

    map (doTheWork constantInput) counters  -- counters :: [Int], a long list of counters

where `preciousSharedValue` will only be computed once and then re-used for the whole list. But unlike with a global mutable reference, you don't need to worry about all the things that can go wrong, such as incorrectly switching context between different `TheInput` values. Laziness, parallelism and more make explicit mutable storage a nightmare.

The flipped order of arguments seems clearly the appropriate one for the way the function is going to be used. You should change the signature accordingly. Changing even a large code base to adjust for such a change is pretty easy in Haskell, since the compiler will check exactly where a call to the function needs to be modified.

If this is a library that needs to preserve its interface, then the way to go is to give the flipped version a new name, implement the old-signature one in terms of it, and deprecate it in favour of the flipped version:

    workTheDo :: TheInput -> Int -> MyResult
    workTheDo inpData = ...

    {-# DEPRECATED doTheWork "Use workTheDo" #-}
    doTheWork :: Int -> TheInput -> MyResult
    doTheWork = flip workTheDo

Here, `doTheWork` will work as it did before (and will not share the `preciousSharedValue` when mapped over a list).
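
To make the sharing in this answer visible, the following sketch fills in the two helper functions with placeholder bodies. The type synonyms, the bodies of `expensiveGeneralComputation` and `cheapSpecificComputation`, and the `trace` call are all assumptions added for illustration; only the shape of `doTheWork` comes from the answer.

```haskell
import Debug.Trace (trace)

-- Hypothetical stand-ins for the question's types.
type TheInput = [Int]
type MyResult = Int

-- Placeholder for the costly step that depends only on the input.
-- The trace message goes to stderr each time this is actually evaluated.
expensiveGeneralComputation :: TheInput -> Int
expensiveGeneralComputation xs = trace "expensive step evaluated" (sum xs)

-- Placeholder for the cheap per-counter step.
cheapSpecificComputation :: Int -> Int -> MyResult
cheapSpecificComputation counter shared = counter * shared

-- The where-binding sits outside the lambda, so one partial application
-- of doTheWork carries a single shared thunk for preciousSharedValue.
doTheWork :: TheInput -> Int -> MyResult
doTheWork inpData = \counter -> cheapSpecificComputation counter preciousSharedValue
  where
    preciousSharedValue = expensiveGeneralComputation inpData
```

Evaluating `map (doTheWork [1 .. 100]) [1, 2, 3]` in GHCi gives `[5050,10100,15150]`, and the trace message should appear only once, since all three calls reuse the same partially applied function. Writing `doTheWork inpData counter = ...` with both arguments on the left-hand side would not give that guarantee.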

huangapple
  • Published on May 21, 2023, 09:25:23
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76297935.html