perf_event_open - limit when monitoring multiple events
Question
Does anyone know if there is a limit to the number of PERF_TYPE_HARDWARE events that we can monitor in a single PERF_FORMAT_GROUP group?
I am attempting to monitor multiple events and I am finding that I am able to monitor 5 events, but when I add a 6th hardware event the values of all registered events do not get updated.
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

struct read_format {
    uint64_t nr;            /* The number of events */
    struct {
        uint64_t value;     /* The value of the event */
        uint64_t id;        /* if PERF_FORMAT_ID */
    } values[];             /* nr entries */
};

int read_values(int group_fd);

int main() {
    /* Group leader: opened with group_fd = -1 */
    struct perf_event_attr attr1;
    memset(&attr1, 0, sizeof(attr1));   /* zero every field not set below */
    attr1.size = sizeof(attr1);
    attr1.type = PERF_TYPE_HARDWARE;
    attr1.config = PERF_COUNT_HW_CPU_CYCLES;
    attr1.read_format = PERF_FORMAT_GROUP | PERF_FORMAT_ID;
    int main_fd = syscall(__NR_perf_event_open, &attr1, 0, -1, -1, 0);
    uint64_t id1;
    ioctl(main_fd, PERF_EVENT_IOC_ID, &id1);
    ioctl(main_fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(main_fd, PERF_EVENT_IOC_ENABLE, 0);

    /* Group member: opened with group_fd = main_fd */
    struct perf_event_attr attr2;
    memset(&attr2, 0, sizeof(attr2));
    attr2.size = sizeof(attr2);
    attr2.type = PERF_TYPE_HARDWARE;
    attr2.config = PERF_COUNT_HW_CACHE_REFERENCES;
    attr2.read_format = PERF_FORMAT_GROUP | PERF_FORMAT_ID;
    int fd2 = syscall(__NR_perf_event_open, &attr2, 0, -1, main_fd, 0);
    uint64_t id2;
    ioctl(fd2, PERF_EVENT_IOC_ID, &id2);
    ioctl(fd2, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd2, PERF_EVENT_IOC_ENABLE, 0);

    /*
    commenting out attr3 through attr7. They are the same as attr2 except the following config:
    attr3.config = PERF_COUNT_HW_CACHE_MISSES;
    attr4.config = PERF_COUNT_HW_BRANCH_MISSES;
    attr5.config = PERF_COUNT_HW_BUS_CYCLES;
    attr6.config = PERF_COUNT_HW_STALLED_CYCLES_FRONTEND;
    attr7.config = PERF_COUNT_HW_STALLED_CYCLES_BACKEND;
    */

    // read_values(main_fd) and log "START"
    // action
    // read_values(main_fd) and log "END"
    return 0;
}

/* Reads every counter in the group through the leader's fd */
int read_values(int group_fd) {
    char buffer[4096];
    ssize_t read_bytes = read(group_fd, buffer, sizeof(buffer));
    if (read_bytes == -1) { return 1; }
    struct read_format* rf = (struct read_format*) buffer;
    uint64_t values[rf->nr];
    for (uint64_t i = 0; i < rf->nr; i++) {
        values[i] = rf->values[i].value;
    }
    return 0;
}
In the above code, the values logged at "START" and "END" differ as expected when I only open perf events for attr1 through attr5. However, when I open perf events for 6 events (or all 7 hardware events), all values logged at "START" and "END" remain exactly the same.
I am able to get values of all 7 events if I open each one on its own, e.g. int fd2 = syscall(__NR_perf_event_open, &attr2, 0, -1, -1 /*!!instead of main_fd!!*/, 0); and then perform a read() on each fd. But to reduce the number of read calls I'd prefer to read from main_fd and grab the values from there. Is there a limit to the number of hardware events that can be captured in a single PERF_FORMAT_GROUP? I've noticed that if the 6th and 7th events are software events, I don't see this issue.
Answer 1
Score: 1
Figured I'd post an answer to my own question as I went down the rabbit hole and came across this post. Per the "Event Groups" section:
> The number of available performance counters depend on the CPU. A group cannot contain more events than available counters. For example Intel Core CPUs typically have four generic performance counters for the core, plus three fixed counters for instructions, cycles and ref-cycles. Some special events have restrictions on which counter they can schedule, and may not support multiple instances in a single group. When too many events are specified in the group some of them will not be measured.
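
If you want to confirm from code that the group is simply never being scheduled, one option (a rough sketch, not a definitive recipe) is to add PERF_FORMAT_TOTAL_TIME_ENABLED and PERF_FORMAT_TOTAL_TIME_RUNNING to the leader's read_format. As I understand perf_event_open(2), a group that never gets onto the PMU should show time_enabled growing while time_running stays at 0. The struct name group_reading and the two helper functions below are my own hypothetical names, not part of any API, and the sketch assumes PERF_FORMAT_LOST is not set.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout of a group read() when the leader's attr.read_format is
 *   PERF_FORMAT_GROUP | PERF_FORMAT_ID |
 *   PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING
 * (field order as documented in perf_event_open(2)). */
struct group_reading {
    uint64_t nr;            /* number of events in the group */
    uint64_t time_enabled;  /* time the group was enabled */
    uint64_t time_running;  /* time the group was actually counting */
    struct {
        uint64_t value;     /* raw counter value */
        uint64_t id;        /* id from PERF_EVENT_IOC_ID */
    } values[];             /* nr entries */
};

/* Hypothetical helper: 1 if the group spent any time counting, else 0.
 * A result of 0 after a workload ran suggests the group did not fit. */
int group_was_scheduled(const struct group_reading *g) {
    return g->time_running > 0;
}

/* Hypothetical helper: estimate what the count would have been had the
 * group run for the whole enabled time (the usual multiplexing scaling).
 * Returns 0.0 for an out-of-range index or a group that never ran. */
double scaled_value(const struct group_reading *g, size_t i) {
    if (i >= g->nr || g->time_running == 0) {
        return 0.0;
    }
    return (double)g->values[i].value *
           (double)g->time_enabled / (double)g->time_running;
}
```

To use it, read() from main_fd into a buffer as in the question, cast the buffer to struct group_reading, and check group_was_scheduled() before trusting the values. If it reports 0, splitting the hardware events across two or more smaller groups (each with its own leader) is the workaround I would try first, at the cost of one read per group.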