Converting a flat table to a tree structure using range-v3
Question
I have a flat representation of a tree, shown in the table below.
The unsorted data, a std::vector<MyStruct>, is:
unsorted vector
(id) (path) (fn) (line) (extra)
1 /abc/file3.c foo0 10 1
2 /abc/file3.c foo0 15 2
3 /abc/file3.c foo0 20 1
4 /abc/file3.c foo1 30 1
5 /abc/file3.c foo1 35 2
6 /abc/file3.c foo1 40 1
7 /abc/file1.c foo2 10 1
8 /abc/file1.c foo2 15 2
9 /abc/file1.c foo2 20 1
10 /abc/file3.c baz1 70 1
11 /abc/file3.c baz1 75 2
12 /abc/file3.c baz1 80 1
13 /abc/file2.c bat 10 1
14 /abc/file2.c bat 15 2
15 /abc/file2.c bat 17 2
16 /abc/file2.c bat 20 1
17 /def/file2.c baz 70 1
18 /def/file2.c baz 71 1
19 /def/file2.c baz 72 1
20 /def/file2.c baz 73 1
The columns are 'ID', 'path', 'function', 'line number' and 'extra'. In tree form the data is ordered hierarchically as path->function->lineNumber: each path contains multiple functions, each of which contains multiple lines of interest (probe points).
Each row in this table is represented with this struct:
using Type = enum class Type : unsigned {
    One = 1,
    Two = 2
};
using MyStruct = struct MyStruct {
    unsigned id;
    std::string filename;
    std::string function;
    unsigned lineNum;
    Type type;
};
After sorting this data according to the hierarchy described above, using the following comparator:
// comparator used for sorting: orders by filename, function, line, then type
static const auto customComp = [](const auto& lhs, const auto& rhs) {
    return std::tie(lhs.filename, lhs.function, lhs.lineNum, lhs.type) <
           std::tie(rhs.filename, rhs.function, rhs.lineNum, rhs.type);
};
We end up with the correctly ordered vector:
sorted vector
(id) (path) (fn) (line) (extra)
7 /abc/file1.c foo2 10 1
8 /abc/file1.c foo2 15 2
9 /abc/file1.c foo2 20 1
13 /abc/file2.c bat 10 1
14 /abc/file2.c bat 15 2
15 /abc/file2.c bat 17 2
16 /abc/file2.c bat 20 1
10 /abc/file3.c baz1 70 1
11 /abc/file3.c baz1 75 2
12 /abc/file3.c baz1 80 1
1 /abc/file3.c foo0 10 1
2 /abc/file3.c foo0 15 2
3 /abc/file3.c foo0 20 1
4 /abc/file3.c foo1 30 1
5 /abc/file3.c foo1 35 2
6 /abc/file3.c foo1 40 1
17 /def/file2.c baz 70 1
18 /def/file2.c baz 71 1
19 /def/file2.c baz 72 1
20 /def/file2.c baz 73 1
I need to parse this data using the new ranges or range-v3 API to efficiently recreate the tree structure from which the table originated. I specify ranges here partly because I am learning my way through this complicated API, and partly because it seems to offer a very efficient way of handling large data sets through lazy evaluation.
The following code works (it is also on godbolt), but it seems wrong. I use a pair of nested chunk_by loops to parse the data, and I need a break to terminate the loop that wraps the inner chunk_by after its first iteration.
The main body of the code is here:
// comparator used for sorting: orders by filename, function, line, then type
static const auto customComp = [](const auto& lhs, const auto& rhs) {
    return std::tie(lhs.filename, lhs.function, lhs.lineNum, lhs.type) <
           std::tie(rhs.filename, rhs.function, rhs.lineNum, rhs.type);
};

int main() {
    print("unsorted vector", structs);
    // sort the probes so that equal filenames and functions are adjacent
    actions::sort(structs, customComp);
    // two probes belong to the same outer chunk if their filenames match
    const auto outerComp = [](auto&& lhs, auto&& rhs) {
        return lhs.filename == rhs.filename;
    };
    // two probes belong to the same inner chunk if their functions match
    const auto innerComp = [](auto&& lhs, auto&& rhs) {
        return lhs.function == rhs.function;
    };
    print("sorted vector", structs);
    std::cout << std::endl;
    // split sorted list of probes into chunks by filename
    for (const auto& sources : structs | views::chunk_by(outerComp)) {
        auto foo = sources.size();
        // this loop only exists to reach the first probe's filename; see the break below
        for (const auto& next : sources) {
            auto outcomes = 0;
            // split the probes of this file into chunks by function name
            for (const auto& functions : sources | views::chunk_by(innerComp)) {
                for (const auto& probe : functions) {
                    outcomes += (probe.type == Type::Two) ? 2 : 1;
                    std::cout << std::format("{}\n", probe);
                }
            }
            std::cout << next.filename << " outcomes [" << outcomes << "]\n";
            break;
        }
        std::cout << "\n";
    }
}
Would it be possible to perform the sort and the double chunking in a single for loop? Ideally I would like to use the composition (pipeline) form of the ranges API to achieve this.
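For comparison, the per-file totals that the nested loops print can also be computed in one pass over the sorted vector. The following is only a minimal sketch in plain standard C++ and is not a ranges-based solution: `Row` and `outcomesPerFile` are hypothetical names introduced here, `Row` is a reduced stand-in for MyStruct, and the type is assumed to be stored as a plain unsigned (1 or 2) rather than the Type enum.

```cpp
#include <algorithm>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for MyStruct, reduced to the fields the loop uses.
struct Row {
    std::string filename;
    std::string function;
    unsigned lineNum;
    unsigned type;  // assumed to be 1 or 2, mirroring Type::One / Type::Two
};

// Returns (filename, outcomes) pairs in sorted filename order.
std::vector<std::pair<std::string, int>> outcomesPerFile(std::vector<Row> rows) {
    // Same ordering as customComp: filename, function, line, type.
    std::sort(rows.begin(), rows.end(), [](Row const &l, Row const &r) {
        return std::tie(l.filename, l.function, l.lineNum, l.type) <
               std::tie(r.filename, r.function, r.lineNum, r.type);
    });

    std::vector<std::pair<std::string, int>> result;
    for (Row const &row : rows) {
        // A change of filename between neighbours starts a new group; this is
        // the same boundary test that chunk_by applies with outerComp.
        if (result.empty() || result.back().first != row.filename) {
            result.emplace_back(row.filename, 0);
        }
        // Type 2 counts as two outcomes, anything else as one.
        result.back().second += (row.type == 2) ? 2 : 1;
    }
    return result;
}
```

For the 20 sample rows this gives 4 for /abc/file1.c, 6 for /abc/file2.c, 12 for /abc/file3.c and 4 for /def/file2.c. The function takes the vector by value so the caller's unsorted data is left untouched; pass by reference instead if sorting in place is acceptable.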
Answer 1
Score: 1
Given that each record contains all the information necessary to do so, I'd just create something to represent the tree structure, and insert records into it, rather than sort, and then parse out ranges from the sorted records.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <map>
#include <sstream>
#include <string>
#include <tuple>   // std::tie
#include <vector>  // std::vector

// Keep the code self-contained, though in real use you would undoubtedly
// read the raw data from a file or something similar.
char const *rawData = R"(
1 /abc/file3.c foo0 10 1
2 /abc/file3.c foo0 15 2
3 /abc/file3.c foo0 20 1
4 /abc/file3.c foo1 30 1
5 /abc/file3.c foo1 35 2
6 /abc/file3.c foo1 40 1
7 /abc/file1.c foo2 10 1
8 /abc/file1.c foo2 15 2
9 /abc/file1.c foo2 20 1
10 /abc/file3.c baz1 70 1
11 /abc/file3.c baz1 75 2
12 /abc/file3.c baz1 80 1
13 /abc/file2.c bat 10 1
14 /abc/file2.c bat 15 2
15 /abc/file2.c bat 17 2
16 /abc/file2.c bat 20 1
17 /def/file2.c baz 70 1
18 /def/file2.c baz 71 1
19 /def/file2.c baz 72 1
20 /def/file2.c baz 73 1
)";
// One row of the flat table.
struct record {
    int id;
    std::string path;
    std::string fn;
    int lineNumber;
    int type;

    // Order by path, then function, then line, then type (id is ignored).
    bool operator<(record const &rhs) const {
        return std::tie(path, fn, lineNumber, type) <
               std::tie(rhs.path, rhs.fn, rhs.lineNumber, rhs.type);
    }

    friend std::istream &operator>>(std::istream &is, record &r) {
        return is >> r.id >> r.path >> r.fn >> r.lineNumber >> r.type;
    }

    friend std::ostream &operator<<(std::ostream &os, record const &r) {
        return os << r.id << "\t" << r.path << "\t" << r.fn << "\t"
                  << r.lineNumber << "\t" << r.type;
    }
};

// Leaf of the tree: one probe point inside a function.
struct Probe {
    int line;
    int type;

    friend std::ostream &operator<<(std::ostream &os, Probe const &p) {
        return os << "\t\t" << p.line << " " << p.type;
    }
};

// All probes of one function, kept in insertion order.
class FuncRec {
    std::vector<Probe> probes;

public:
    void insert(record const &rec) {
        probes.push_back(Probe{rec.lineNumber, rec.type});
    }

    friend std::ostream &operator<<(std::ostream &os, FuncRec const &f) {
        for (auto const &p : f.probes) {
            os << p << "\n";
        }
        return os;
    }
};

// All functions of one file, keyed (and therefore sorted) by function name.
class FileRec {
    std::map<std::string, FuncRec> functions;

public:
    void insert(record const &rec) {
        functions[rec.fn].insert(rec);
    }

    friend std::ostream &operator<<(std::ostream &os, FileRec const &file) {
        for (auto const &fn : file.functions) {
            os << "\t" << fn.first << "\n";
            os << fn.second;
        }
        return os;
    }
};

// Root of the tree: files keyed (and therefore sorted) by path.
class Tree {
    std::map<std::string, FileRec> files;

    void insert(record const &rec) {
        files[rec.path].insert(rec);
    }

public:
    Tree(std::vector<record> const &in) {
        for (auto const &r : in)
            insert(r);
    }

    friend std::ostream &operator<<(std::ostream &os, Tree const &t) {
        for (auto const &f : t.files) {
            os << f.first << "\n";
            os << f.second;
        }
        return os;
    }
};

int main() {
    std::stringstream infile(rawData);
    std::vector<record> recs { std::istream_iterator<record>(infile), {}};

    Tree tree{recs};
    std::cout << "Tree structure:\n";
    std::cout << tree;

    // In case you also want to show sorted structs:
    std::cout << "\nSorted records:\n";
    std::sort(recs.begin(), recs.end());
    for (auto const &r : recs) {
        std::cout << r << "\n";
    }
    std::cout << "\n";
}
This is probably a bit more elaborate than really needed. For example, FuncRec doesn't really accomplish much. We could just embed the vector of Probes in its parent (but I'm assuming this is a simplified version of something more elaborate, where FuncRec might serve more purpose).
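As a rough illustration of that simplification, the sketch below drops the FuncRec, FileRec and Tree classes and stores the probes directly in nested maps. The names `Row`, `ProbePoint`, `FlatTree`, `buildTree` and `outcomesFor` are hypothetical and chosen only for this example; `Row` mirrors the fields of `record`, and the outcome rule (type 2 counts as two, anything else as one) is taken from the question.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical row type for this sketch; mirrors the fields of `record`.
struct Row {
    int id;
    std::string path;
    std::string fn;
    int lineNumber;
    int type;
};

// Hypothetical leaf type; same data as `Probe`, without the stream operator.
struct ProbePoint {
    int line;
    int type;
};

// path -> function -> probes, with no intermediate wrapper classes.
using FlatTree =
    std::map<std::string, std::map<std::string, std::vector<ProbePoint>>>;

FlatTree buildTree(std::vector<Row> const &rows) {
    FlatTree tree;
    for (auto const &r : rows) {
        // operator[] creates the file and function nodes on first use.
        tree[r.path][r.fn].push_back(ProbePoint{r.lineNumber, r.type});
    }
    return tree;
}

// Sums the outcomes of one file; returns 0 if the path is not in the tree.
int outcomesFor(FlatTree const &tree, std::string const &path) {
    auto file = tree.find(path);
    if (file == tree.end()) {
        return 0;
    }
    int total = 0;
    for (auto const &fn : file->second) {
        for (auto const &p : fn.second) {
            total += (p.type == 2) ? 2 : 1;
        }
    }
    return total;
}
```

Files and functions come out in sorted order because std::map keeps its keys ordered, but the probes inside each function stay in input order. If line order matters, sort each vector after building the tree, or sort the rows before inserting them.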