How to parse code definition text into XML structure with the latest Boost Spirit?

Question

I'm new to C++ and this is my first time using Boost Spirit; I took on this task for my team to learn and work with C++ (I come from a web development background :)). Searching the internet, I found some great examples from this community (especially from Sehe), but I can't quite piece everything together for this task because of the complexity of the XML structure.

This parser will act as the middleman that translates structure code definitions (written by other teams) into XML, which multiple integration teams will use to generate code in the language of their choice, based on the XML structure.

Below is a small example of the code structure definition text (from an external file). This file could be very large depending on the task:

Class Simple caption;
Class Simple columns "Column Name";

Class Container CONTAINER_NAME ( 
  Complex OBJECT_NAME ( 
    Simple obj_id 
    Simple obj_property1
    Simple obj_attribute enumeration(EnumOption1, EnumOption2,EnumOption3,EnumOption4)
    Container OBJECT_ITEMS (
      Complex OBJECT_ITEM (
        Simple obj_item_name
        Container set_value (
          Simple obj_item_value
        )
      )
    )
  )
);

The parser should evaluate this and produce XML in the following format:

<task>
  <class>
    <simple>
      <identifier>caption</identifier>
      <literal>" "</literal>
    </simple>
  </class>
  <class>
    <simple>
      <identifier>caption</identifier>
      <literal>"Column Name"</literal>
    </simple>
  </class>
  <class>
    <container>
      <identifier>CONTAINER_NAME:CONTAINER_NAME</identifier>
      <literal>" "</literal>
      <complex>
        <identifier>CONTAINER_NAME:OBJECT_NAME</identifier>
        <literal>" "</literal>
        <simple>
          <identifier>CONTAINER_NAME:obj_id</identifier>
          <literal>" "</literal>
        </simple>
        <simple>
          <identifier>CONTAINER_NAME:obj_property1</identifier>
          <literal>" "</literal>
        </simple>
        <simple>
          <identifier>CONTAINER_NAME:obj_attribute</identifier>
          <literal>" "</literal>
          <enumeration>
            <word>EnumOption1</word>
            <word>EnumOption2</word>
            <word>EnumOption3</word>
            <word>EnumOption4</word>
          </enumeration>
        </simple>
        <container>
          <identifier>CONTAINER_NAME:OBJECT_ITEMS</identifier>
          <literal>" "</literal>
          <complex>
            <identifier>CONTAINER_NAME:OBJECT_ITEM</identifier>
            <literal>" "</literal>
            <simple>
              <identifier>CONTAINER_NAME:obj_item_name</identifier>
              <literal>" "</literal>
            </simple>
            <container>
              <identifier>CONTAINER_NAME:set_value</identifier>
              <literal>" "</literal>
              <simple>
                <identifier>CONTAINER_NAME:obj_item_value</identifier>
                <literal>" "</literal>
              </simple>
            </container>
          </complex>
        </container>
      </complex>
    </container>
  </class>
</task>

From what I've read, I will need the following (this is just my thought process, based on very basic knowledge):

  1. A grammar definition with rules for Class, Container, Complex, and Simple to parse the code definition text (my biggest challenge);
  2. Some kind of semantic actions/functions to create an XML node for each group (Simple, Complex, Container, Class, etc.). I see that I can use msxml6.dll as the XML generator, but I can't figure out how to hook them together.

I saw a few examples that construct an AST and then build XML from it, but the XML structure they use does not quite follow any standard, since a Container can contain a Complex, but a Complex can also contain a Container.

Any help, instruction, or example pointing me to where to begin would be greatly appreciated.

UPDATED

  1. A semicolon indicates the end of a Class block.
  2. Comments exist, but always on a separate line. There are no inline comments.
  3. There is no literal tag in the code definition. Literal content is inside double quotes. See line 2 of the updated code definition block.

Answer 1

Score: 1

<details>
<summary>English:</summary>
Okay, the explanations helped me realize the correspondence between the input and the XML. There's still a number of ... unclear specs, but let's roll with it.

# Parsing

----

1. ### _AST_

As always, I start out with the AST. This time, instead of basing it on the sample input, it was easier to base it on the output XML:

    namespace Ast {
        using boost::recursive_wrapper;

        using Id      = std::string;
        using Literal = std::string;
        using Enum    = std::vector<Id>;

        struct Base {
            Id      id;
            Literal literal;
        };

        struct Simple : Base {
            Enum enumeration;
        };

        struct Complex;
        struct Container;

        using Class = boost::variant<
            Simple,
            recursive_wrapper<Complex>,
            recursive_wrapper<Container>
        >;

        using Classes = std::vector<Class>;
        struct Container : Base { Class   element; };
        struct Complex   : Base { Classes members; };

        using Task = std::vector<Class>;
    } // namespace Ast

So far so good. No surprises. The main thing is using recursive variants to allow nesting complex/container types. As a side note, I reflected the common parts of all types as `Base`. Let's adapt these for use as Fusion sequences:

    BOOST_FUSION_ADAPT_STRUCT(Ast::Simple,    id, literal, enumeration);
    BOOST_FUSION_ADAPT_STRUCT(Ast::Complex,   id, literal, members)
    BOOST_FUSION_ADAPT_STRUCT(Ast::Container, id, literal, element)

Now Spirit will know how to propagate attributes without further help.
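
To get a feel for how the mutually recursive `Complex`/`Container` types behave once parsed, here is a rough standalone sketch. It is not the code from this answer: it swaps in `std::variant` with `std::shared_ptr` as a stand-in for `boost::variant` with `recursive_wrapper`, drops the `literal` field, and the `Sketch` namespace and `depth` function are made up for illustration. It only shows that a visitor can recurse through the nesting in either direction.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string>
#include <variant>
#include <vector>

namespace Sketch {
    struct Simple { std::string id; std::vector<std::string> enumeration; };
    struct Complex;
    struct Container;

    // Assumption: std::shared_ptr plays the role of boost::recursive_wrapper,
    // letting the variant refer to types that are still incomplete here.
    using Class = std::variant<Simple, std::shared_ptr<Complex>, std::shared_ptr<Container>>;

    struct Complex   { std::string id; std::vector<Class> members; };
    struct Container { std::string id; Class element; };

    // Nesting depth: a Simple counts as 1, anything else is 1 + its deepest child.
    struct Depth {
        std::size_t operator()(Simple const&) const { return 1; }
        std::size_t operator()(std::shared_ptr<Complex> const& c) const {
            std::size_t deepest = 0;
            for (auto const& m : c->members)
                deepest = std::max(deepest, std::visit(*this, m));
            return 1 + deepest;
        }
        std::size_t operator()(std::shared_ptr<Container> const& c) const {
            return 1 + std::visit(*this, c->element);
        }
    };

    inline std::size_t depth(Class const& c) { return std::visit(Depth{}, c); }
} // namespace Sketch
```

The XML generator further down follows the same shape: one overload per node type, with the variant dispatching to the right one.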
2. ### _Grammar_

The skeleton is easy, just mapping AST nodes to rules:

    template <typename It> struct Task : qi::grammar<It, Ast::Task()> {
        Task() : Task::base_type(start) {
            start = skip(space)[task_];
            // ...
        }

      private:
        qi::rule<It, Ast::Task()> start;

        using Skipper = qi::space_type;
        qi::rule<It, Ast::Task(), Skipper>      task_;
        qi::rule<It, Ast::Class(), Skipper>     class_;
        qi::rule<It, Ast::Simple(), Skipper>    simple_;
        qi::rule<It, Ast::Complex(), Skipper>   complex_;
        qi::rule<It, Ast::Container(), Skipper> container_;

        // lexemes:
        qi::rule<It, Ast::Id()>      id_;
        qi::rule<It, Ast::Literal()> literal_;
    };

Note I grouped the lexemes (that [do not allow a skipper](https://stackoverflow.com/a/17073965/85371)) and encapsulated the `space` skipper into the start rule.

Because "classes" can appear explicitly, but also without the leading `Class` keyword, I will introduce an extra rule `type_` so we can say:

    task_  = *class_ > eoi;
    type_  = simple_ | complex_ | container_;
    class_ = "Class" > type_ > ';';

And also use `type_` where Simple/Complex/Container is acceptable.

For the rest, there aren't many surprises, so let's show the whole constructor block:
    Task() : Task::base_type(start) {
        using namespace qi;

        start = skip(space)[task_];

        // lexemes:
        id_      = raw[alpha >> *('_' | alnum)];
        literal_ = '"' > *('\\' >> char_ | ~char_('"')) > '"';

        auto optlit = copy(literal_ | attr(std::string(" "))); // weird, but okay

        task_      = *class_ > eoi;
        type_      = simple_ | complex_ | container_;
        class_     = lit("Class") > type_ > ';';
        simple_    = lit("Simple") >> id_ >> optlit >> enum_;
        complex_   = lit("Complex") >> id_ >> optlit >> '(' >> *type_ >> ')';
        container_ = lit("Container") >> id_ >> optlit >> '(' >> type_ > ')';
        enum_      = -(lit("enumeration") >> '(' >> (id_ % ',') > ')');

        BOOST_SPIRIT_DEBUG_NODES(
            (task_)(class_)(type_)(simple_)(complex_)(container_)(enum_)(id_)(literal_))
    }

> _Note the other "extra" rule (`enum_`). Of course, I could have kept it all in the `simple_` rule instead._
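
If the two lexeme rules look cryptic, it may help to see roughly what they accept written out as plain C++. This is only a hand-rolled approximation under my own assumptions, not how Spirit implements them: `is_identifier` and `parse_literal` are hypothetical names, they work on one complete token rather than on a stream, and where the `>` expectation points would throw `qi::expectation_failure`, this sketch simply returns an empty optional.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Approximates id_ = raw[alpha >> *('_' | alnum)]:
// a letter first, then any mix of letters, digits and underscores.
inline bool is_identifier(std::string_view s) {
    if (s.empty() || !std::isalpha(static_cast<unsigned char>(s.front())))
        return false;
    for (char ch : s.substr(1))
        if (ch != '_' && !std::isalnum(static_cast<unsigned char>(ch)))
            return false;
    return true;
}

// Approximates literal_ = '"' > *('\\' >> char_ | ~char_('"')) > '"':
// returns the content with backslash escapes resolved, or nullopt if s is
// not exactly one complete quoted literal.
inline std::optional<std::string> parse_literal(std::string_view s) {
    if (s.size() < 2 || s.front() != '"') return std::nullopt;
    std::string out;
    std::size_t i = 1;
    while (i < s.size() && s[i] != '"') {
        if (s[i] == '\\') {
            if (++i == s.size()) return std::nullopt; // dangling backslash
        }
        out += s[i++]; // the escaped character is kept, the backslash is dropped
    }
    // The closing quote must be the last character of the token.
    if (i + 1 != s.size()) return std::nullopt;
    return out;
}
```

Note that an identifier cannot start with an underscore under this rule, which may or may not match what the definition files actually contain.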
Here's a **[Live Demo](http://coliru.stacked-crooked.com/a/7c57ff4ecf6708f8)** printing the raw AST for the sample input:

    - (caption " " {})
    - (columns "Column Name" {})
    - (CONTAINER_NAME " " (OBJECT_NAME " " {(obj_id " " {}), (obj_property1 " " {}), (obj_attribute " " {EnumOption1, EnumOption2, EnumOption3, EnumOption4}), (OBJECT_ITEMS " " (OBJECT_ITEM " " {(obj_item_name " " {}), (set_value " " (obj_item_value " " {}))}))}))

It's just a shame that all my pretty error handling code is not firing :) The output is obviously pretty ugly, so let's fix that.
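
For reference, the error handling mentioned here boils down to turning the failure position into a line number and a caret position (it lives in the `catch` block of the full demo). Below is a minimal standalone sketch of that calculation, assuming the position is available as an offset into the input; `Location` and `locate` are names I made up, and unlike the demo it searches for the line start from `pos - 1`, so a failure sitting exactly on a line break is still attributed to the line it ends.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>

struct Location {
    std::size_t      line;   // 1-based line number
    std::size_t      column; // 0-based offset within that line
    std::string_view text;   // the offending line, without its line break
};

inline Location locate(std::string_view input, std::size_t pos) {
    // find_last_of returns npos when there is no earlier line break;
    // npos + 1 wraps around to 0, which is the start of the input.
    std::size_t bol  = pos == 0 ? 0 : input.find_last_of("\r\n", pos - 1) + 1;
    std::size_t eol  = input.find_first_of("\r\n", pos);
    std::size_t line = 1 + std::count(input.begin(), input.begin() + bol, '\n');
    // substr clamps the length, so eol == npos just means "to the end".
    return {line, pos - bol, input.substr(bol, eol - bol)};
}
```

Printing `text`, then `column` spaces followed by `^--- here`, gives the same kind of diagnostic as the demo.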
# Generating XML

----

I'm not a Microsoft fan, and prefer other libraries for XML anyway (see https://stackoverflow.com/questions/9387610/what-xml-parser-should-i-use-in-c).

So I'll choose PugiXML here.
1. ### Generator

Simply put, we have to teach the computer how to convert any Ast node into XML:

    #include <pugixml.hpp>

    namespace Generate {
        using namespace Ast;

        struct XML {
            using Node = pugi::xml_node;

            // callable for variant visiting:
            template <typename T> void operator()(Node parent, T const& node) const { apply(parent, node); }

          private:
            void apply(Node parent, Ast::Class const& c) const {
                using std::placeholders::_1;
                boost::apply_visitor(std::bind(*this, parent, _1), c);
            }

            // Id and Literal are both std::string, so separate overloads for them
            // would collide; the element name is passed explicitly instead.
            void apply(Node parent, std::string const& s, char const* kind) const {
                named_child(parent, kind).text().set(s.c_str());
            }

            void apply(Node parent, Simple const& s) const {
                auto simple = named_child(parent, "simple");
                apply(simple, s.id, "identifier");
                apply(simple, s.literal, "literal");
                apply(simple, s.enumeration);
            }

            void apply(Node parent, Enum const& e) const {
                if (!e.empty()) {
                    auto enum_ = named_child(parent, "enumeration");
                    for (auto& v : e)
                        named_child(enum_, "word").text().set(v.c_str());
                }
            }

            void apply(Node parent, Complex const& c) const {
                auto complex_ = named_child(parent, "complex");
                apply(complex_, c.id, "identifier");
                apply(complex_, c.literal, "literal");
                for (auto& m : c.members)
                    apply(complex_, m);
            }

            void apply(Node parent, Container const& c) const {
                auto cont = named_child(parent, "container");
                apply(cont, c.id, "identifier");
                apply(cont, c.literal, "literal");
                apply(cont, c.element);
            }

            void apply(Node parent, Task const& t) const {
                auto task = named_child(parent, "task");
                for (auto& c : t)
                    apply(task.append_child("class"), c); // one <class> per top-level entry
            }

          private:
            Node named_child(Node parent, std::string const& name) const {
                auto child = parent.append_child();
                child.set_name(name.c_str());
                return child;
            }
        };
    } // namespace Generate

I'm not gonna say I typed this up error-free in a jiffy, but you'll recognize the pattern: it's following the Ast 1:1 to great success.
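
One thing the library quietly takes care of, as far as I know, is escaping: pugixml escapes characters like `<` and `&` in text when it serializes the document. If you ever swap it for hand-written string output, that becomes your job. A minimal sketch of what that involves, with `xml_escape` and `element` being hypothetical helpers and not part of any library:

```cpp
#include <string>
#include <string_view>

// Escape the characters that are special in XML text and attribute values.
inline std::string xml_escape(std::string_view text) {
    std::string out;
    out.reserve(text.size());
    for (char ch : text) {
        switch (ch) {
            case '&': out += "&amp;";  break;
            case '<': out += "&lt;";   break;
            case '>': out += "&gt;";   break;
            case '"': out += "&quot;"; break;
            default:  out += ch;       break;
        }
    }
    return out;
}

// Hypothetical helper: one leaf element such as <identifier>caption</identifier>.
inline std::string element(std::string_view name, std::string_view text) {
    std::string out;
    out += '<';
    out += name;
    out += '>';
    out += xml_escape(text);
    out += "</";
    out += name;
    out += '>';
    return out;
}
```

This matters mostly for the `literal` values, since those are free text from the definition file.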
# FULL DEMO
------
Integrating all the above, and printing the XML output:
**[Live On Compiler Explorer](https://compiler-explorer.com/z/9a17K4Gdf)**
    // #define BOOST_SPIRIT_DEBUG 1
    #include <boost/fusion/adapted.hpp>
    #include <boost/spirit/include/qi.hpp>
    #include <algorithm>  // std::count
    #include <functional> // std::bind
    #include <iomanip>
    #include <iostream>
    #include <string_view>
    namespace qi = boost::spirit::qi;

    namespace Ast {
        using boost::recursive_wrapper;

        using Id      = std::string;
        using Literal = std::string;
        using Enum    = std::vector<Id>;

        struct Base {
            Id      id;
            Literal literal;
        };

        struct Simple : Base {
            Enum enumeration;
        };

        struct Complex;
        struct Container;

        using Class = boost::variant<    //
            Simple,                      //
            recursive_wrapper<Complex>,  //
            recursive_wrapper<Container> //
            >;

        using Classes = std::vector<Class>;
        struct Container : Base { Class   element; };
        struct Complex   : Base { Classes members; };

        using Task = std::vector<Class>;
    } // namespace Ast

    BOOST_FUSION_ADAPT_STRUCT(Ast::Simple,    id, literal, enumeration);
    BOOST_FUSION_ADAPT_STRUCT(Ast::Complex,   id, literal, members)
    BOOST_FUSION_ADAPT_STRUCT(Ast::Container, id, literal, element)

    namespace Parser {
        template <typename It> struct Task : qi::grammar<It, Ast::Task()> {
            Task() : Task::base_type(start) {
                using namespace qi;

                start = skip(space)[task_];

                // lexemes:
                id_      = raw[alpha >> *('_' | alnum)];
                literal_ = '"' > *('\\' >> char_ | ~char_('"')) > '"';

                auto optlit = copy(literal_ | attr(std::string(" "))); // weird, but okay

                task_      = *class_ > eoi;
                type_      = simple_ | complex_ | container_;
                class_     = lit("Class") > type_ > ';';
                simple_    = lit("Simple") >> id_ >> optlit >> enum_;
                complex_   = lit("Complex") >> id_ >> optlit >> '(' >> *type_ >> ')';
                container_ = lit("Container") >> id_ >> optlit >> '(' >> type_ > ')';
                enum_      = -(lit("enumeration") >> '(' >> (id_ % ',') > ')');

                BOOST_SPIRIT_DEBUG_NODES(
                    (task_)(class_)(type_)(simple_)(complex_)(container_)(enum_)(id_)(literal_))
            }

          private:
            qi::rule<It, Ast::Task()> start;

            using Skipper = qi::space_type;
            qi::rule<It, Ast::Task(), Skipper>      task_;
            qi::rule<It, Ast::Class(), Skipper>     class_, type_;
            qi::rule<It, Ast::Simple(), Skipper>    simple_;
            qi::rule<It, Ast::Complex(), Skipper>   complex_;
            qi::rule<It, Ast::Container(), Skipper> container_;
            qi::rule<It, Ast::Enum(), Skipper>      enum_;

            // lexemes:
            qi::rule<It, Ast::Id()>      id_;
            qi::rule<It, Ast::Literal()> literal_;
        };
    } // namespace Parser

    #include <pugixml.hpp>

    namespace Generate {
        using namespace Ast;

        struct XML {
            using Node = pugi::xml_node;

            // callable for variant visiting:
            template <typename T> void operator()(Node parent, T const& node) const { apply(parent, node); }

          private:
            void apply(Node parent, Ast::Class const& c) const {
                using std::placeholders::_1;
                boost::apply_visitor(std::bind(*this, parent, _1), c);
            }

            // Id and Literal are both std::string, so the element name is passed in.
            void apply(Node parent, std::string const& s, char const* kind) const {
                named_child(parent, kind).text().set(s.c_str());
            }

            void apply(Node parent, Simple const& s) const {
                auto simple = named_child(parent, "simple");
                apply(simple, s.id, "identifier");
                apply(simple, s.literal, "literal");
                apply(simple, s.enumeration);
            }

            void apply(Node parent, Enum const& e) const {
                if (!e.empty()) {
                    auto enum_ = named_child(parent, "enumeration");
                    for (auto& v : e)
                        named_child(enum_, "word").text().set(v.c_str());
                }
            }

            void apply(Node parent, Complex const& c) const {
                auto complex_ = named_child(parent, "complex");
                apply(complex_, c.id, "identifier");
                apply(complex_, c.literal, "literal");
                for (auto& m : c.members)
                    apply(complex_, m);
            }

            void apply(Node parent, Container const& c) const {
                auto cont = named_child(parent, "container");
                apply(cont, c.id, "identifier");
                apply(cont, c.literal, "literal");
                apply(cont, c.element);
            }

            void apply(Node parent, Task const& t) const {
                auto task = named_child(parent, "task");
                for (auto& c : t)
                    apply(task.append_child("class"), c);
            }

          private:
            Node named_child(Node parent, std::string const& name) const {
                auto child = parent.append_child();
                child.set_name(name.c_str());
                return child;
            }
        };
    } // namespace Generate

    int main() {
        using It = std::string_view::const_iterator;
        static const Parser::Task<It> p;
        static const Generate::XML to_xml;

        for (std::string_view input :
             {
                 R"(Class Simple caption;
    Class Simple columns "Column Name";

    Class Container CONTAINER_NAME (
      Complex OBJECT_NAME (
        Simple obj_id
        Simple obj_property1
        Simple obj_attribute enumeration(EnumOption1, EnumOption2,EnumOption3,EnumOption4)
        Container OBJECT_ITEMS (
          Complex OBJECT_ITEM (
            Simple obj_item_name
            Container set_value (
              Simple obj_item_value
            )
          )
        )
      )
    );)",
             }) //
        {
            try {
                Ast::Task t;

                if (qi::parse(begin(input), end(input), p, t)) {
                    pugi::xml_document doc;
                    to_xml(doc.root(), t);
                    doc.print(std::cout, "  ", pugi::format_default);
                    std::cout << std::endl;
                } else {
                    std::cout << " -> INVALID" << std::endl;
                }
            } catch (qi::expectation_failure<It> const& ef) {
                // Locate the failure: offset, start of line, line number, end of line.
                auto f    = begin(input);
                auto pos  = ef.first - input.begin();
                auto bol  = input.find_last_of("\r\n", pos) + 1;
                auto line = std::count(f, f + bol, '\n') + 1;
                auto eol  = input.find_first_of("\r\n", pos);

                std::cerr << " -> EXPECTED " << ef.what_ << " in line:" << line << "\n"
                          << input.substr(bol, eol - bol) << "\n"
                          << std::setw(pos - bol) << ""
                          << "^--- here" << std::endl;
            }
        }
    }
Printing the coveted output:

    <task>
      <class>
        <simple>
          <identifier>caption</identifier>
          <literal> </literal>
        </simple>
      </class>
      <class>
        <simple>
          <identifier>columns</identifier>
          <literal>Column Name</literal>
        </simple>
      </class>
      <class>
        <container>
          <identifier>CONTAINER_NAME</identifier>
          <literal> </literal>
          <complex>
            <identifier>OBJECT_NAME</identifier>
            <literal> </literal>
            <simple>
              <identifier>obj_id</identifier>
              <literal> </literal>
            </simple>
            <simple>
              <identifier>obj_property1</identifier>
              <literal> </literal>
            </simple>
            <simple>
              <identifier>obj_attribute</identifier>
              <literal> </literal>
              <enumeration>
                <word>EnumOption1</word>
                <word>EnumOption2</word>
                <word>EnumOption3</word>
                <word>EnumOption4</word>
              </enumeration>
            </simple>
            <container>
              <identifier>OBJECT_ITEMS</identifier>
              <literal> </literal>
              <complex>
                <identifier>OBJECT_ITEM</identifier>
                <literal> </literal>
                <simple>
                  <identifier>obj_item_name</identifier>
                  <literal> </literal>
                </simple>
                <container>
                  <identifier>set_value</identifier>
                  <literal> </literal>
                  <simple>
                    <identifier>obj_item_value</identifier>
                    <literal> </literal>
                  </simple>
                </container>
              </complex>
            </container>
          </complex>
        </container>
      </class>
    </task>

> _I still don't understand how the `CONTAINER_NAME:` "namespacing" works, so I'll leave that to you to get right._
</details>
# Answer 2
**Score**: 0

Thanks again for this great lesson. To answer your question about the `CONTAINER_NAME:` namespace: it is simply for grouping (not my rule; the folks who came up with the definition structure want it that way).

So if we parse this line

    Class Simple caption;

then the outcome should be:

    <task>
      <class>
        <simple>
          <identifier>caption:caption</identifier>
          <literal>" "</literal>
        </simple>
      </class>
    </task>

The namespace `caption:` is added because this is the first child of the class. But if we are parsing

    Class Container CONTAINER_NAME (
      Complex OBJECT_NAME (
        Simple obj_id
        Container OBJECT_ITEMS (
          Complex OBJECT_ITEM (
            Simple obj_item_name
            Container set_value (
              Simple obj_item_value
            )
          )
        )
      )
    );

then the `CONTAINER_NAME:` namespace is prepended to the identifier names of all children:

    <class>
      <container>
        <identifier>CONTAINER_NAME:CONTAINER_NAME</identifier>
        <literal> </literal>
        <complex>
          <identifier>CONTAINER_NAME:OBJECT_NAME</identifier>
          <literal> </literal>
          <simple>
            <identifier>CONTAINER_NAME:obj_id</identifier>
            <literal> </literal>
          </simple>
          <container>
            <identifier>CONTAINER_NAME:OBJECT_ITEMS</identifier>
            <literal> </literal>
            <complex>
              <identifier>CONTAINER_NAME:OBJECT_ITEM</identifier>
              <literal> </literal>
              <simple>
                <identifier>CONTAINER_NAME:obj_item_name</identifier>
                <literal> </literal>
              </simple>
              <container>
                <identifier>CONTAINER_NAME:set_value</identifier>
                <literal> </literal>
                <simple>
                  <identifier>CONTAINER_NAME:obj_item_value</identifier>
                  <literal> </literal>
                </simple>
              </container>
            </complex>
          </container>
        </complex>
      </container>
    </class>

I added the following function to apply the namespace in the XML structure. It did the job, but I'm pretty sure you could come up with a one-liner for this... :)

    std::string get_namespace(Node parent, std::string const& ident) const {
        auto parent_name = std::string(parent.name());
        // Default for a direct child of <class>: the node's own name is the namespace.
        std::string ns = ident + ":" + ident;
        if (parent_name != "class") {
            // The parent is not <class>: reuse the namespace prefix (up to and
            // including the colon) from the parent's own identifier.
            std::string parent_id = parent.child("identifier").text().as_string();
            ns = parent_id.substr(0, parent_id.find(":") + 1) + ident;
        }
        return ns;
    }

Then I just call this function when handling the XML for Simple, Complex, and Container:

    void apply(Node parent, Simple const& s) const {
        auto simple = named_child(parent, "simple");
        apply(simple, get_namespace(parent, s.id), "identifier");
        apply(simple, s.literal, "literal");
        apply(simple, s.enumeration);
    }
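
The prefixing rule itself does not need the XML tree at all, which makes it easy to check in isolation. Here is a small sketch of the same rule as a free function, under the assumption that the caller passes the parent's already-qualified identifier (or an empty string directly under `<class>`); `qualified_id` is a hypothetical name, not part of the code above. One difference to be aware of: if the parent identifier contains no colon, this version uses the whole parent identifier as the namespace, whereas `get_namespace` would return the bare name.

```cpp
#include <string>
#include <string_view>

// parent_identifier: the already-qualified identifier of the enclosing node,
// or empty when the node sits directly under <class>.
inline std::string qualified_id(std::string_view parent_identifier, std::string_view ident) {
    std::string result;
    if (parent_identifier.empty()) {
        // Top level: the node's own name doubles as the namespace.
        result += ident;
    } else {
        // Nested: reuse everything before the first ':' of the parent
        // (the whole parent identifier if it has no ':').
        result += parent_identifier.substr(0, parent_identifier.find(':'));
    }
    result += ':';
    result += ident;
    return result;
}
```

Passing the prefix down through the `apply` calls, instead of reading it back from the parent's `<identifier>` node, would be another way to get the same result.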

Anyway, there's a lot more for me to do, as I also need to parse if-else and case statements, but this gives me a great starting point. Again, thanks for taking the time to share your knowledge with me.

huangapple
  • Published on June 1, 2023, 01:19:35
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76375924.html