2023-08-11 03:22:39


boost::ptree is taking too much memory during push_back and put_child

Question


datanode holds approximately 30 MB of data. When I push_back it into childnode, memory use grows by approximately 200 MB, and when I then put this childnode into the parent node with put_child, that operation takes roughly another 200 MB. I collected the memory statistics for these operations using perf.

Why do push_back and put_child take such a huge amount of memory?

Answer 1

Score: 1


Yes, property tree is not very efficient. It's more versatile than efficient.

For example, each tree node can have both a value and child nodes (even if the respective backend doesn't, like JSON). It also allows for lookup by "key" but at the same time, multiple child nodes with the same name may exist. To complete the picture, children cannot be stored in an ordered map, because the order in which nodes are inserted may matter.

ptree uses a multi-index container with several indices to serve all these requirements. On the bright side, you get strong guarantees (like put having different semantics than add) and very flexible invalidation rules (like any node-based container: all references and iterators stay valid through all operations, including any rehashes/reallocation, unless they refer to elements removed).

Add to this the fact that ptree doesn't allow one to customize the allocator (even though boost::multi_index_container supports allocators), and a lack of move-awareness, and you see why ptree will never win any efficiency prizes.

What You Need

Guessing from the tags in your question you seem to need JSON support.

Firstly, let's note once and for all that Boost Property Tree is NOT a JSON library¹.

Secondly, you're in luck. Boost 1.75 introduced Boost JSON! That not only IS a JSON library, but it even supports smart pool allocation, move semantics and in general allows highly efficient access, including streaming parsing/serialization of documents way bigger than would ever fit in memory.

See here for the documentation and examples: https://www.boost.org/doc/libs/1_83_0/libs/json/doc/html/json/quick_look.html

Also, if you can use C++14, note the examples in Boost Describe that use Boost JSON in ways that will knock the socks off any abuse of Property Tree that I've seen. In fact, recent Boost JSON supports this directly using value_from/value_to with very little additional work on your part. E.g.:

Live On Compiler Explorer

#include <boost/describe.hpp>
#include <boost/json/src.hpp> // header-only Boost JSON; include in exactly one translation unit
#include <cmath>              // M_PI (a POSIX extension, not standard C++)
#include <iostream>
#include <optional>
#include <string>
#include <type_traits>
namespace json = boost::json;
namespace MyLib {
    BOOST_DEFINE_ENUM_CLASS(Enum, foo, bar, qux)
    struct Base {
        Enum enumValue;
        double number;
    };
    BOOST_DESCRIBE_STRUCT(Base, (), (enumValue, number))
    struct Derived : Base {
        std::optional<std::string> maybeMessage;
    };
    BOOST_DESCRIBE_STRUCT(Derived, (Base), (maybeMessage))
} // namespace MyLib
// Tell Boost JSON to treat Derived as a described class
template <> struct json::is_described_class<MyLib::Derived> : std::true_type {};
int main() {
    using MyLib::Enum;
    MyLib::Derived objs[] = {
        {{Enum::bar, 42e-1}, "Hello world"},
        {{Enum::qux, M_PI}, {}},
    };
    for (MyLib::Derived const& obj : objs) {
        std::cout << json::value_from(obj) << std::endl;
    }
}

Prints

{"enumValue":"bar","number":4.2E0,"maybeMessage":"Hello world"}
{"enumValue":"qux","number":3.141592653589793E0,"maybeMessage":null}

¹ (sorry for yelling, but decades of telling people this does that to you)
