query-exporter in Docker container not working


Question

I am trying to get query-exporter to run in a Docker container. With advice from the developer I have enabled IPv6 in docker by putting:

    {
      "experimental": true,
      "ip6tables": true
    }

in my docker daemon.json and restarted the daemon.

I am using the following docker-compose file:

    version: "3.3"
    services:
      prometheus:
        container_name: prometheus
        image: prom/prometheus
        restart: always
        volumes:
          - ./prometheus:/etc/prometheus/
          - prometheus_data:/prometheus
        command:
          - '--config.file=/etc/prometheus/prometheus.yml'
          - '--storage.tsdb.path=/prometheus'
          - '--web.console.libraries=/usr/share/prometheus/console_libraries'
          - '--web.console.templates=/usr/share/prometheus/consoles'
        ports:
          - 9090:9090
        networks:
          - prom_app_net
      grafana:
        container_name: grafana
        image: grafana/grafana
        user: '472'
        restart: always
        environment:
          GF_INSTALL_PLUGINS: 'grafana-clock-panel,grafana-simple-json-datasource'
        volumes:
          - grafana_data:/var/lib/grafana
          - ./grafana/provisioning/:/etc/grafana/provisioning/
          - './grafana/grafana.ini:/etc/grafana/grafana.ini'
        env_file:
          - ./grafana/.env_grafana
        ports:
          - 3000:3000
        depends_on:
          - prometheus
        networks:
          - prom_app_net
      mysql:
        image: mariadb:10.10
        hostname: mysql
        container_name: mysql
        environment:
          MYSQL_RANDOM_ROOT_PASSWORD: "yes"
          MYSQL_DATABASE: slurm_acct_db
          MYSQL_USER: slurm
          MYSQL_PASSWORD: password
        volumes:
          - var_lib_mysql:/var/lib/mysql
        networks:
          - slurm
        # network_mode: host
      slurmdbd:
        image: prom-slurm-cluster:${IMAGE_TAG:-21.08.6}
        build:
          context: .
          args:
            SLURM_TAG: ${SLURM_TAG:-slurm-21-08-6-1}
        command: ["slurmdbd"]
        container_name: slurmdbd
        hostname: slurmdbd
        volumes:
          - etc_munge:/etc/munge
          - etc_slurm:/etc/slurm
          - var_log_slurm:/var/log/slurm
          - cgroups:/sys/fs/cgroup:ro
        expose:
          - "6819"
        ports:
          - "6819:6819"
        depends_on:
          - mysql
        privileged: true
        cgroup: host
        networks:
          - slurm
        #network_mode: host
      slurmctld:
        image: prom-slurm-cluster:${IMAGE_TAG:-21.08.6}
        command: ["slurmctld"]
        container_name: slurmctld
        hostname: slurmctld
        volumes:
          - etc_munge:/etc/munge
          - etc_slurm:/etc/slurm
          - slurm_jobdir:/data
          - var_log_slurm:/var/log/slurm
          - etc_prometheus:/etc/prometheus
          - /sys/fs/cgroup:/sys/fs/cgroup:rw
        expose:
          - "6817"
          - "8080"
          - "8081"
          - "8082/tcp"
        ports:
          - 8080:8080
          - 8081:8081
          - 8082:8082/tcp
        depends_on:
          - "slurmdbd"
        privileged: true
        cgroup: host
        #network_mode: host
        networks:
          - slurm
      c1:
        image: prom-slurm-cluster:${IMAGE_TAG:-21.08.6}
        command: ["slurmd"]
        hostname: c1
        container_name: c1
        volumes:
          - etc_munge:/etc/munge
          - etc_slurm:/etc/slurm
          - slurm_jobdir:/data
          - var_log_slurm:/var/log/slurm
          - cgroups:/sys/fs/cgroup:ro
        expose:
          - "6818"
        depends_on:
          - "slurmctld"
        privileged: true
        cgroup: host
        #network_mode: host
        networks:
          - slurm
      c2:
        image: prom-slurm-cluster:${IMAGE_TAG:-21.08.6}
        command: ["slurmd"]
        hostname: c2
        container_name: c2
        volumes:
          - etc_munge:/etc/munge
          - etc_slurm:/etc/slurm
          - slurm_jobdir:/data
          - var_log_slurm:/var/log/slurm
          - cgroups:/sys/fs/cgroup:ro
        expose:
          - "6818"
          - "22"
        depends_on:
          - "slurmctld"
        privileged: true
        cgroup: host
        networks:
          - slurm
        #network_mode: host
    volumes:
      etc_munge:
      etc_slurm:
      slurm_jobdir:
      var_lib_mysql:
      var_log_slurm:
      grafana_data:
      prometheus_data:
      cgroups:
      etc_prometheus:
    networks:
      prom_app_net:
      slurm:
        enable_ipv6: true
        ipam:
          config:
            - subnet: 2001:0DB8::/112

Then installed query-exporter on the slurmctld container and run it with the following config.yaml:

    databases:
      db1:
        dsn: sqlite:////test.db
        connect-sql:
          - PRAGMA application_id = 123
          - PRAGMA auto_vacuum = 1
        labels:
          region: us1
          app: app1
    metrics:
      metric1:
        type: gauge
        description: A sample gauge
    queries:
      query1:
        interval: 5
        databases: [db1]
        metrics: [metric1]
        sql: SELECT random() / 1000000000000000 AS metric1

But it is not working - Prometheus lists the target as down.

But the container set-up seems to be fine, because if I run the following test exporter:

    from prometheus_client import start_http_server, Summary
    import random
    import time

    # Create a metric to track time spent and requests made.
    REQUEST_TIME = Summary('request_processing_seconds', 'Time spent processing request')

    # Decorate function with metric.
    @REQUEST_TIME.time()
    def process_request(t):
        """A dummy function that takes some time."""
        time.sleep(t)

    if __name__ == '__main__':
        # Start up the server to expose the metrics.
        start_http_server(8082)
        # Generate some requests.
        while True:
            process_request(random.random())

Prometheus can connect to the target fine.

Can anyone see what the problem could be?

Thanks!

Update

I ran query-exporter by hand on the slurmctld container. There isn't anything in the container logs about query-exporter:

    2023-07-10 10:11:37 ---> Starting the MUNGE Authentication service (munged) ...
    2023-07-10 10:11:37 ---> Waiting for slurmdbd to become active before starting slurmctld ...
    2023-07-10 10:11:37 -- slurmdbd is not available. Sleeping ...
    2023-07-10 10:11:39 -- slurmdbd is now active ...
    2023-07-10 10:11:39 ---> starting systemd ...

I think the test_query.py that works is using IPv4 on port 8082, while query-exporter is trying to bind IPv6.
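To illustrate why that would make the target unreachable, here is a small sketch (not part of the original setup) showing that an IPv4 client can reach a listener bound to 0.0.0.0, as the test exporter does, but not one bound only to an IPv6 address:

```python
import socket

def can_connect_ipv4(port):
    """Try an IPv4 TCP connection to localhost:port."""
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=1):
            return True
    except OSError:
        return False

# A listener bound to 0.0.0.0 accepts IPv4 clients (like test_query.py does).
v4 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
v4.bind(("0.0.0.0", 0))  # port 0 = pick any free port
v4.listen(1)
port_v4 = v4.getsockname()[1]
print(can_connect_ipv4(port_v4))  # True

# A listener bound only to an IPv6 address is invisible to IPv4 clients,
# which is the suspected situation with query-exporter's default bind.
v6 = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
v6.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
v6.bind(("::1", 0))
v6.listen(1)
port_v6 = v6.getsockname()[1]
print(can_connect_ipv4(port_v6))  # False
```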

docker port slurmctld gives:

    8080/tcp -> 0.0.0.0:8080
    8080/tcp -> [::]:8080
    8081/tcp -> 0.0.0.0:8081
    8081/tcp -> [::]:8081
    8082/tcp -> 0.0.0.0:8082
    8082/tcp -> [::]:8082

I guess I need to point Prometheus at 8082/tcp -> [::]:8082 when query-exporter runs, but I'm not sure how to do it.
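For reference, a Prometheus scrape job pointing at the exporter might look something like the fragment below. The job name is illustrative, and slurmctld:8082 assumes Prometheus can resolve the compose service name on a shared network (which is not the case with the compose file above, where prometheus and slurmctld sit on different networks - another thing worth checking):

```yaml
scrape_configs:
  - job_name: query-exporter        # hypothetical job name
    static_configs:
      - targets: ['slurmctld:8082'] # compose service name : exporter port
```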

Answer 1

Score: 0

Running with query-exporter config.yaml -H 0.0.0.0 -p 8082 gets it to work.

huangapple
  • Published on 2023-07-10 17:57:43
  • Original link: https://go.coder-hub.com/76652641.html