boost::ptree is taking too much memory during push_back and put_child
Question
datanode holds approximately 30MB of data. When I push_back it into childnode, the operation takes approximately 200MB of memory, and when I put this childnode into the parent node, that operation takes another 200MB. I collected memory statistics for push_back and put_child using perf.
Why do push_back and put_child take such a huge amount of memory?
Answer 1
Score: 1
Yes, property tree is not very efficient. It's more versatile than efficient.
For example, each tree node can have both a value and child nodes (even if the respective backend format, like JSON, doesn't support that). It also allows for lookup by "key", but at the same time multiple child nodes with the same name may exist. To complete the picture, children cannot simply be stored in an ordered map, because the order in which nodes are inserted may matter.
ptree uses a multi-index container with several indices to serve all these requirements. On the bright side, you get strong guarantees (like put having different semantics than add) and very flexible invalidation rules (like any node-based container: all references and iterators stay valid through all operations, including any rehashes/reallocation, unless they refer to removed elements).
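To get a feel for what "several indices" costs per node, here is a rough back-of-the-envelope model. It is only a sketch: NodeCostModel, bytes_per_node and estimate_tree_bytes are hypothetical names, the link counts assume one insertion-order index plus one by-key index, and the allocator figure is a guess. It is not Boost's actual memory layout.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical per-node cost model. Field names and the allocator figure are
// assumptions for illustration, not Boost's real layout.
struct NodeCostModel {
    std::size_t key_bytes       = sizeof(std::string); // child key
    std::size_t data_bytes      = sizeof(std::string); // node value
    std::size_t sequenced_links = 2 * sizeof(void*);   // prev/next, keeps insertion order
    std::size_t ordered_links   = 3 * sizeof(void*);   // parent/left/right, lookup by key
    std::size_t children_handle = sizeof(void*);       // handle to the node's own children
    std::size_t alloc_overhead  = 16;                  // assumed heap bookkeeping per node
};

// Fixed cost of one node, independent of how much data it holds.
inline std::size_t bytes_per_node(const NodeCostModel& m) {
    return m.key_bytes + m.data_bytes + m.sequenced_links + m.ordered_links +
           m.children_handle + m.alloc_overhead;
}

// Estimated footprint: fixed per-node cost plus the payload characters.
inline std::size_t estimate_tree_bytes(std::size_t node_count,
                                       std::size_t payload_bytes,
                                       const NodeCostModel& m = NodeCostModel{}) {
    assert(node_count > 0 || payload_bytes == 0); // payload needs at least one node
    return node_count * bytes_per_node(m) + payload_bytes;
}
```

Under these assumed figures, a tree made of many small values is dominated by the fixed per-node cost rather than by the payload, which is the kind of blow-up you are seeing. The real numbers depend on your standard library, allocator and Boost version, so measure rather than trust the model.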
Add to this the fact that ptree doesn't allow one to customize the allocator (even though boost::multi_index_container supports allocators), and a lack of move-awareness, and you see why ptree will never win any efficiency prizes.
What You Need
Guessing from the tags in your question, you seem to need JSON support.
Firstly, let's note once and for all that Boost Property Tree is NOT a JSON library¹.
Secondly, you're in luck. Boost 1.75 introduced Boost JSON! That not only IS a JSON library, but it even supports smart pool allocation, move semantics and in general allows highly efficient access, including streaming parsing/serialization of documents way bigger than would ever fit in memory.
See here for the documentation and examples: https://www.boost.org/doc/libs/1_83_0/libs/json/doc/html/json/quick_look.html
Also, if you can use C++14, note the examples in Boost Describe that use Boost JSON in ways that will knock the socks off any abuse of Property Tree that I've seen. In fact, recent Boost JSON supports this directly using value_from/value_to with very little additional work on your part. E.g.:
#include <boost/describe.hpp>
#include <boost/json/src.hpp> // pulls in the implementation, so nothing extra to link
#include <cmath>              // M_PI (a POSIX extension, not standard C++)
#include <iostream>
#include <optional>
#include <string>

namespace json = boost::json;

namespace MyLib {
    BOOST_DEFINE_ENUM_CLASS(Enum, foo, bar, qux)

    struct Base {
        Enum   enumValue;
        double number;
    };
    BOOST_DESCRIBE_STRUCT(Base, (), (enumValue, number))

    struct Derived : Base {
        std::optional<std::string> maybeMessage;
    };
    BOOST_DESCRIBE_STRUCT(Derived, (Base), (maybeMessage))
} // namespace MyLib

// Opt Derived in to Boost JSON's described-class conversion
template <> struct json::is_described_class<MyLib::Derived> : std::true_type {};

int main() {
    using MyLib::Enum;

    MyLib::Derived objs[] = {
        {{Enum::bar, 42e-1}, "Hello world"},
        {{Enum::qux, M_PI}, {}},
    };

    // Each described struct converts to a JSON object without hand-written glue
    for (const MyLib::Derived& obj : objs) {
        std::cout << json::value_from(obj) << std::endl;
    }
}
Prints
{"enumValue":"bar","number":4.2E0,"maybeMessage":"Hello world"}
{"enumValue":"qux","number":3.141592653589793E0,"maybeMessage":null}
¹ (sorry for yelling, but decades of telling people this does that to you)