Is there a way to easily require and parse a specific length of input with nom?

Question

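Given a sequence of digits, `b"12345678..."`, is there an easy way to have `nom` parse just the first `n` digits of the input into one type, such as `u16`, then the subsequent `m` digits into another type, and so on? Presently, `nom` consumes the entire contiguous sequence of digits into a single value.

The current application is a sequence of digits representing a datetime, as in `YYYYMMDDHHmmSS`, from the input `(D:20230416140523Z00'00)`.

My attempts at combining `take(n)` have not been successful. [For instance...](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=00647dbe3504b0fd9710b29889c6d732)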

```rust
use nom::bytes::complete::take;
use nom::character::complete::u16;
use nom::combinator::{flat_map};
type IResult<'a, T> = nom::IResult<&'a [u8], T, ()>;

fn parse_fixed_length_prefix<'a>(input: &'a [u8], n: usize) -> IResult<u16> {
    flat_map(take(n), u16::<_, ()>)(input)
}

#[test]
fn test() {
    let output = parse_fixed_length_prefix(b"123456789", 4);
    assert_eq!(output, Ok((b"56789".as_ref(), 1234u16)));
    // expected: Ok(([56789], 1234))
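}
```

The above approach results in the following compile error, though I suspect there is a better approach that would avoid the problem altogether: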

```text
   Compiling playground v0.0.1 (/playground)
error[E0277]: expected a `FnMut<(_,)>` closure, found `Result<(_, u16), nom::Err<()>>`
   --> src/main.rs:7:5
    |
7   |     flat_map(take(n), u16::<_, ()>)(input)
    |     ^^^^^^^^ expected an `FnMut<(_,)>` closure, found `Result<(_, u16), nom::Err<()>>`
    |
    = help: the trait `FnMut<(_,)>` is not implemented for `Result<(_, u16), nom::Err<()>>`
    = help: the following other types implement trait `Parser<I, O, E>`:
              <And<F, G> as Parser<I, (O1, O2), E>>
              <AndThen<F, G, O1> as Parser<I, O2, E>>
              <Box<(dyn Parser<I, O, E> + 'a)> as Parser<I, O, E>>
              <Or<F, G> as Parser<I, O, E>>
              <nom::FlatMap<F, G, O1> as Parser<I, O2, E>>
              <nom::Into<F, O1, O2, E1, E2> as Parser<I, O2, E2>>
              <nom::Map<F, G, O1> as Parser<I, O2, E>>
    = note: required for `Result<(_, u16), nom::Err<()>>` to implement `Parser<_, _, _>`
note: required by a bound in `nom::combinator::flat_map`
   --> /playground/.cargo/registry/src/github.com-1ecc6299db9ec823/nom-7.1.3/src/combinator/mod.rs:213:6
    |
213 |   H: Parser<I, O2, E>,
    |      ^^^^^^^^^^^^^^^^ required by this bound in `flat_map`

For more information about this error, try `rustc --explain E0277`.
error: could not compile `playground` due to previous error
```


Answer 1

Score: 2

Don't use `nom::combinator::flat_map`. Use `nom::combinator::map_res` to transform the output of `take(n)` into the desired type, `u16`, then use `map` on the resulting `Result` to drop the leftover input, so that the closure yields a `Result` holding just the `u16`. `map_res` can then integrate that value into its own output, in this case `IResult<&[u8], u16, ()>`.

```rust
use nom::combinator::map_res;

// Take exactly `n` bytes, parse them as a `u16`, and keep only the parsed value.
fn parse_fixed_length_prefix(input: &[u8], n: usize) -> IResult<u16> {
    map_res(take(n), |v| u16::<_, ()>(v).map(|(_, a)| a))(input)
}
```

One of the things I like about Rust is that if it compiles, it works, even more so than in Haskell.

Updated solution, per the recommendation of @ChayimFriedman:

```rust
use nom::combinator::map_parser;

// Run the `u16` parser on just the `n` bytes that `take(n)` returns.
fn parse_fixed_length_prefix(input: &[u8], n: usize) -> IResult<u16> {
    map_parser(take(n), u16)(input)
}
```
