August 31, 2020, 01:59:55

Parsing picocli-based CLI usage output into structured data

Question

I have a set of picocli-based applications whose usage output I'd like to parse into structured data. I've written three different output parsers so far and I'm not happy with any of them (fragility, complexity, difficulty in extending, etc.). Any thoughts on how to cleanly parse this type of semi-structured output?

The usage output generally looks like this:

Usage: taker-mvo-2 [-hV] [-C=file] [-E=file] [-p=payoffs] [-s=millis] PENALTY
                    (ASSET SPREAD)...
Submits liquidity-taking orders based on mean-variance optimization of multiple
assets.
      PENALTY             risk penalty for payoff variance
      (ASSET SPREAD)...   Spread for creating market above fundamental value
                            for assets
  -C, --credential=file   credential file
  -E, --endpoint=file     marketplace endpoint file
  -h, --help              display this help message
  -p, --payoffs=payoffs   payoff states and probabilities (default: .fm/payoffs)
  -s, --sleep=millis      sleep milliseconds before acting (default: 2000)
  -V, --version           print product version and exit

I want to capture the program name and description, options, parameters, and parameter-groups along with their descriptions into an agent:

public class Agent {
    private String name;
    private String description = "";
    private List<Option> options;
    private List<Parameter> parameters;
    private List<ParameterGroup> parameterGroups;
}

The program name is taker-mvo-2 and the (possibly multi-line) description comes after the (possibly multi-line) arguments list:

Submits liquidity-taking orders based on mean-variance optimization of multiple assets.

Options (in square brackets) should be parsed into:

public class Option {
    private String shortName;
    private String parameter;
    private String longName;
    private String description;

}

The parsed options' JSON is:

options: [{
  "shortName": "h",
  "parameter": null,
  "longName": "help",
  "description": "display this help message"
}, {
  "shortName": "V",
  "parameter": null,
  "longName": "version",
  "description": "print product version and exit"
}, {
  "shortName": "C",
  "parameter": "file",
  "longName": "credential",
  "description": "credential file"
}, {
  "shortName": "E",
  "parameter": "file",
  "longName": "endpoint",
  "description": "marketplace endpoint file"
}, {
  "shortName": "p",
  "parameter": "payoffs",
  "longName": "payoffs",
  "description": "payoff states and probabilities (default: ~/.fm/payoffs)"
}]

Similarly, parameters should be parsed into:

public class Parameter {
    private String name;
    private String description;

}

and parameter-groups which are surrounded by ( and )... should be parsed into:

public class ParameterGroup {
    private List<String> parameters;
    private String description;

}

The first hand-written parser walks the buffer, capturing data as it goes. It works pretty well, but it looks horrible and is horrible to extend. The second hand-written parser uses regular expressions while walking the buffer; it looks better than the first but is still ugly and difficult to extend. The third parser also uses regular expressions. It is probably the best looking of the bunch, but still ugly and hard to maintain.

I thought this text would be pretty simple to parse manually but now I'm wondering if ANTLR might be a better tool for this. Any thoughts or alternative ideas?

Answer 1

Score: 1

Model

It sounds like what you need is a model. An object model that describes the command, its options, option parameter types, option description, option names, and similar for positional parameters, argument groups, and potentially subcommands.

Then, once you have an object model of your application, it is relatively straightforward to render this as JSON or as some other format.

Picocli has an object model

You could build this yourself, but if you are using picocli anyway, why not leverage picocli's strengths and use picocli's built-in model?

Accessing picocli's object model

Commands can access their own model

Within a picocli-based application, a @Command-annotated class can access its own picocli object model by declaring a @Spec-annotated field. Picocli will inject the CommandSpec into that field.

For example:

import java.io.File;

import picocli.CommandLine.Command;
import picocli.CommandLine.Model.CommandSpec;
import picocli.CommandLine.Model.OptionSpec;
import picocli.CommandLine.Option;
import picocli.CommandLine.Spec;

@Command(name = "taker-mvo-2", mixinStandardHelpOptions = true, version = "taker-mvo-2 0.2")
class TakerMvo2 implements Runnable {
    // ...

    @Option(names = {"-C", "--credential"}, description = "credential file")
    File file;

    @Spec CommandSpec spec; // injected by picocli

    public void run() {
        // print each option's longest name and its current value
        for (OptionSpec option : spec.options()) {
            System.out.printf("%s=%s%n", option.longestName(), option.getValue());
        }
    }
}

The picocli user manual has a more detailed example that uses the CommandSpec to loop over all options in a command to see if the option was defaulted or whether a value was specified on the command line.

Creating a model of any picocli command

An alternative way to access picocli's object model is to construct a CommandLine instance with the @Command-annotated class (or an object of that class). You can do this outside of your picocli application.

For example:

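import java.util.List;
import java.util.Map;

import picocli.CommandLine;
import picocli.CommandLine.Model.ArgGroupSpec;
import picocli.CommandLine.Model.CommandSpec;
import picocli.CommandLine.Model.OptionSpec;

class Agent {
    public static void main(String... args) {
        CommandLine cmd = new CommandLine(new TakerMvo2());
        CommandSpec spec = cmd.getCommandSpec();

        // get subcommands
        Map<String, CommandLine> subCmds = spec.subcommands();

        // get options as a list
        List<OptionSpec> options = spec.options();

        // get argument groups
        List<ArgGroupSpec> argGroups = spec.argGroups();

        // ...
    }
}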
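
Once the option data is in plain objects, rendering it in the JSON shape from the question takes little code. The following is only a minimal sketch using the JDK alone: `OptionJson`, `OptionInfo`, `stripDashes`, `quote` and `toJson` are hypothetical names that are not part of picocli, and the option values are typed in by hand from the usage message above. In a real application they would presumably be filled from `OptionSpec` accessors such as `shortestName()`, `longestName()`, `paramLabel()` and `description()`; check the picocli API documentation for their exact behavior before relying on them.

```java
import java.util.ArrayList;
import java.util.List;

class OptionJson {
    /** Hypothetical mirror of the question's Option class. */
    static final class OptionInfo {
        final String shortName;
        final String parameter;
        final String longName;
        final String description;

        OptionInfo(String shortName, String parameter, String longName, String description) {
            this.shortName = shortName;
            this.parameter = parameter;
            this.longName = longName;
            this.description = description;
        }
    }

    /** Turns an option name such as "-h" or "--help" into "h" or "help". */
    static String stripDashes(String name) {
        return name.replaceFirst("^-+", "");
    }

    /** Emits a JSON string literal, or the JSON literal null for a null input. */
    static String quote(String s) {
        if (s == null) {
            return "null";
        }
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c)); // other control characters
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    /** Renders one option as a JSON object with the question's four fields. */
    static String toJson(OptionInfo o) {
        return "{\"shortName\": " + quote(o.shortName)
                + ", \"parameter\": " + quote(o.parameter)
                + ", \"longName\": " + quote(o.longName)
                + ", \"description\": " + quote(o.description) + "}";
    }

    /** Renders a list of options as a JSON array. */
    static String toJson(List<OptionInfo> options) {
        List<String> parts = new ArrayList<>();
        for (OptionInfo o : options) {
            parts.add(toJson(o));
        }
        return "[" + String.join(", ", parts) + "]";
    }

    public static void main(String[] args) {
        List<OptionInfo> options = new ArrayList<>();
        options.add(new OptionInfo(stripDashes("-h"), null, stripDashes("--help"),
                "display this help message"));
        options.add(new OptionInfo(stripDashes("-C"), "file", stripDashes("--credential"),
                "credential file"));
        System.out.println(toJson(options));
    }
}
```

The JSON is written by hand here only to keep the sketch free of dependencies. With a JSON library on the classpath, the question's own `Option`, `Parameter` and `ParameterGroup` classes could be serialized directly instead.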
