boost::ptree is taking too much memory during push_back and put_child




datanode holds approximately 30 MB of data. When I push_back it into childnode, memory use grows by approximately 200 MB, and when I put this childnode into the parent node, it grows by another 200 MB. I collected the memory stats for these operations using perf.

Why do push_back and put_child take such a huge amount of memory?





Yes, property tree is not very efficient. It's more versatile than efficient.

For example, each tree node can have both a value and child nodes (even if the respective backend doesn't, like JSON). It also allows for lookup by "key" but at the same time, multiple child nodes with the same name may exist. To complete the picture, children cannot be stored in an ordered map, because the order in which nodes are inserted may matter.

ptree uses a multi-index container with several indices to serve all these requirements. On the bright side, you get strong guarantees (like put having different semantics than add) and very flexible invalidation rules (like any node-based container: all references and iterators stay valid through all operations, including any rehashes/reallocation, unless they refer to removed elements).

Add to this the fact that ptree doesn't allow one to customize the allocator (even though boost::multi_index_container supports allocators), and a lack of move-awareness, and you see why ptree will never win any efficiency prizes.

What You Need

Guessing from the tags in your question you seem to need JSON support.

Firstly, let's note once and for all that Boost Property Tree is NOT a JSON library¹.

Secondly, you're in luck. Boost 1.75 introduced Boost JSON! That not only IS a JSON library, but it even supports smart pool allocation, move semantics and in general allows highly efficient access, including streaming parsing/serialization of documents way bigger than would ever fit in memory.

See the Boost JSON documentation for details and examples.

Also, if you can use C++14, note the examples in Boost Describe that use Boost JSON in ways that will knock the socks off any abuse of Property Tree that I've seen. In fact, recent Boost JSON supports this directly using value_from/value_to with very little additional work on your part. E.g.:


    #include <boost/describe.hpp>
    #include <boost/json/src.hpp> // header-only build of Boost JSON
    #include <cmath>              // M_PI (not standard C++; MSVC needs _USE_MATH_DEFINES)
    #include <iostream>
    #include <optional>
    #include <string>
    #include <type_traits>
    namespace json = boost::json;

    namespace MyLib {
        BOOST_DEFINE_ENUM_CLASS(Enum, foo, bar, qux)

        struct Base {
            Enum   enumValue;
            double number;
        };
        BOOST_DESCRIBE_STRUCT(Base, (), (enumValue, number))

        struct Derived : Base {
            std::optional<std::string> maybeMessage;
        };
        BOOST_DESCRIBE_STRUCT(Derived, (Base), (maybeMessage))
    } // namespace MyLib

    // Tell Boost JSON to treat Derived as a described class
    template <> struct json::is_described_class<MyLib::Derived> : std::true_type {};

    int main() {
        using MyLib::Enum;
        MyLib::Derived objs[] = {
            {{Enum::bar, 42e-1}, "Hello world"},
            {{Enum::qux, M_PI}, {}},
        };
        // value_from converts each described object into a json::value
        for (auto const& obj : objs) {
            std::cout << json::value_from(obj) << std::endl;
        }
    }


    {"enumValue":"bar","number":4.2E0,"maybeMessage":"Hello world"}
    {"enumValue":"qux","number":3.141592653589793E0,"maybeMessage":null}

¹ (sorry for yelling, but decades of telling people this does that to you)

