Idiomatic formatting of error messages and other complex strings
Question
When creating a command-line app, one usually has to parse the command-line arguments and print an error message if the number of arguments is wrong or the arguments do not make sense. For the sake of simplicity, let's say that a program takes a positive integer as its only argument. Parsing and further program execution in Haskell can be done like this:
import System.Environment (getArgs)
import System.Exit (die)
import Text.Read (readMaybe)

-- runProg :: Int -> IO () is assumed to be defined elsewhere.
main :: IO ()
main = do
  args <- getArgs
  case args of
    [arg] -> case readMaybe arg :: Maybe Int of
      Just n | n > 0 -> runProg n
      Just n -> die $ "expected a positive integer (got: " <> show n <> ")"
      Nothing -> die $ "expected an integer (got: " <> arg <> ")"
    _ -> die $ "expected exactly one argument (got: " <> show (length args) <> ")"
Creating appropriate error messages feels clunky to me, especially having to use show anywhere I want to include a non-string argument. There is printf, but that feels... not Haskell-y. What would be the idiomatic approach here? Or is my bias against the methods I listed unjustified, and they are in fact idiomatic Haskell?
Answer 1
Score: 3
As per the comment, if you're actually parsing command-line arguments, you probably want to use optparse-applicative (or maybe optparse).
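For the question's single positive-integer argument, an optparse-applicative parser might look roughly like the sketch below. The names positiveInt, countArg and opts, the metavar "N" and the help texts are made up for illustration; the error strings are borrowed from the question. It assumes the optparse-applicative package is available.

```haskell
import Options.Applicative
import Text.Read (readMaybe)

-- Hypothetical reader: accepts only strictly positive integers.
positiveInt :: ReadM Int
positiveInt = eitherReader $ \s -> case readMaybe s of
  Just n | n > 0     -> Right n
         | otherwise -> Left ("expected a positive integer (got: " <> show n <> ")")
  Nothing            -> Left ("expected an integer (got: " <> s <> ")")

-- Parser for the single positional argument; "N" is only a placeholder name.
countArg :: Parser Int
countArg = argument positiveInt (metavar "N" <> help "A positive integer")

-- The parser together with a generated --help option.
opts :: ParserInfo Int
opts = info (countArg <**> helper) (fullDesc <> progDesc "Run the program on N")
```

A main along the lines of `execParser opts >>= runProg` would then hand the validated number to the question's runProg, with usage and error output produced by the library rather than by hand-built strings.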
More generally, I think a reasonably idiomatic way of constructing complex error messages in Haskell is to represent the errors with an algebraic data type:
data OptError
  = BadArgCount Int Int  -- expected, actual
  | NotInteger String
  | NotPositive Int
supply a pretty-printer:
errorMessage :: OptError -> String
errorMessage (BadArgCount exp act) =
  "expected " <> show exp <> " arguments, got " <> show act
errorMessage (NotInteger str) = "expected integer, got " <> show str
errorMessage (NotPositive n) = "expected positive integer, got " <> show n
and perform the processing in a monad that supports throwing errors:
import Control.Monad.Except (throwError)  -- from the mtl package
import Text.Read (readMaybe)

data Args = Args Int

processArgs :: [String] -> Either OptError Args
processArgs [x] = case readMaybe x of
  Just n | n > 0     -> pure $ Args n
         | otherwise -> throwError $ NotPositive n
  Nothing            -> throwError $ NotInteger x
processArgs xs = throwError $ BadArgCount 1 (length xs)
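To see how the three pieces fit together, here is a minimal, self-contained sketch of a driver. The runWith function is hypothetical and not part of the answer; the sketch also swaps throwError for plain Left and Right so that it depends only on base, and derives Eq and Show for convenience.

```haskell
import System.Exit (die)
import Text.Read (readMaybe)

data OptError
  = BadArgCount Int Int  -- expected, actual
  | NotInteger String
  | NotPositive Int
  deriving (Eq, Show)

newtype Args = Args Int
  deriving (Eq, Show)

errorMessage :: OptError -> String
errorMessage (BadArgCount expected actual) =
  "expected " <> show expected <> " arguments, got " <> show actual
errorMessage (NotInteger str) = "expected integer, got " <> show str
errorMessage (NotPositive n)  = "expected positive integer, got " <> show n

-- Same logic as the answer's processArgs, but with Left/Right instead of
-- throwError/pure (an assumption made here to avoid the mtl dependency).
processArgs :: [String] -> Either OptError Args
processArgs [x] = case readMaybe x of
  Just n | n > 0     -> Right (Args n)
         | otherwise -> Left (NotPositive n)
  Nothing            -> Left (NotInteger x)
processArgs xs = Left (BadArgCount 1 (length xs))

-- Hypothetical driver: the error is turned into text only at the outermost
-- layer, so processArgs itself stays pure and easy to test.
runWith :: (Int -> IO ()) -> [String] -> IO ()
runWith runProg args = case processArgs args of
  Left err       -> die (errorMessage err)
  Right (Args n) -> runProg n
```

With this in place, main could be as small as `getArgs >>= runWith runProg`, and the only call to show sits inside errorMessage.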
This is certainly overkill for argument processing in a small command-line utility, but it works well in other contexts that demand complex error reporting, and it has several advantages over the die ... approach:
- All the error messages are tabulated in one place, so you know exactly what errors the processArgs function can throw.
- Error construction is type checked, reducing the potential for errors in your error handling code.
- Error reporting is separated from error rendering. This is useful for internationalization, separate error reporting styles for terminal and non-terminal output, reuse of the functions in driver code that wants to handle errors itself, etc. It's also more ergonomic for development, since you don't have to take a break from "real coding" to make up a sensible error message. This typically results in better error reporting in the final product, since it encourages you to write a clear, consistent set of error messages all at once, after the core logic is finished.
- It facilitates refactoring the errors systematically, for example to add location information (not relevant for command line arguments, but relevant for errors in input files, for example), or to add hints/recommendations for correction.
- It's relatively easy to define a custom monad that also supports warnings and "non-fatal" errors that allow further error checking to continue, generating a list of errors all at once, instead of failing after the first error.
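The last point, collecting every error instead of stopping at the first, can be sketched as follows. Note that this uses a small hand-rolled applicative rather than the custom monad the answer has in mind, and everything in it (FieldError, Validation, positive, parseSize, the width/height example) is hypothetical and only meant to show the idea.

```haskell
import Text.Read (readMaybe)

-- Hypothetical error type for a program that takes a width and a height.
data FieldError
  = NotInteger String String  -- field name, offending input
  | NotPositive String Int    -- field name, offending value
  deriving (Eq, Show)

-- Like Either, but the Applicative instance combines the errors from both
-- sides instead of stopping at the first failure.
data Validation e a = Failure e | Success a
  deriving (Eq, Show)

instance Functor (Validation e) where
  fmap _ (Failure e) = Failure e
  fmap f (Success a) = Success (f a)

instance Semigroup e => Applicative (Validation e) where
  pure = Success
  Failure e1 <*> Failure e2 = Failure (e1 <> e2)
  Failure e1 <*> Success _  = Failure e1
  Success _  <*> Failure e2 = Failure e2
  Success f  <*> Success a  = Success (f a)

-- Validate one named field as a positive integer.
positive :: String -> String -> Validation [FieldError] Int
positive name raw = case readMaybe raw of
  Just n | n > 0     -> Success n
         | otherwise -> Failure [NotPositive name n]
  Nothing            -> Failure [NotInteger name raw]

-- Both fields are checked even when the first one is invalid.
parseSize :: String -> String -> Validation [FieldError] (Int, Int)
parseSize w h = (,) <$> positive "width" w <*> positive "height" h
```

Because the errors are ordinary values, the resulting list can be rendered in one pass at the end, in the same way errorMessage renders a single OptError.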
I haven't used this approach for command-line arguments, since I usually use optparse-applicative. But I have used it when coding up interpreters.
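In an interpreter setting, the location information mentioned in the list above could be added with a single wrapper type. This is only a sketch: SrcPos, EvalError, Located and both render functions are invented for illustration and do not come from any particular interpreter.

```haskell
-- Hypothetical source position for errors found in an input file.
data SrcPos = SrcPos { posFile :: FilePath, posLine :: Int, posCol :: Int }
  deriving (Eq, Show)

-- Hypothetical interpreter errors.
data EvalError
  = UnboundVariable String
  | DivideByZero
  deriving (Eq, Show)

-- Adding a location means adding one wrapper type; the individual error
-- constructors and their messages stay untouched.
data Located e = Located SrcPos e
  deriving (Eq, Show)

renderEvalError :: EvalError -> String
renderEvalError (UnboundVariable v) = "unbound variable " <> show v
renderEvalError DivideByZero        = "division by zero"

-- Prefix any rendered error with file:line:column.
renderLocated :: (e -> String) -> Located e -> String
renderLocated render (Located (SrcPos file line col) e) =
  file <> ":" <> show line <> ":" <> show col <> ": " <> render e
```

Since rendering is kept apart from error construction, a change like this touches the renderer and the places where errors are raised, but not the wording of each message.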