Question

I'm looking for a tool that, given an XLA-HLO computational graph, prints its runtime.
I know there is an HLO cost model (an analytical model) that prints the FLOPs of each operator node in the graph.
But is there any tool that prints the expected runtime, or any runtime-related value, for an XLA-HLO computational graph?

I need its source code or a sample usage of the tool. Thanks.

Answer 1

Score: 1

If you are using JAX, you can use the ahead-of-time lowering and compilation APIs to get a sense of how resource-heavy a computation is. For example:

import jax
import numpy as np

def f(M, x):
  for i in range(10):
    x = M @ x
  return x

M = np.random.randn(1000, 1000)
x = np.random.randn(1000)

print(jax.jit(f).lower(M, x).compile().cost_analysis())

[{'bytes accessed': 40080000.0,
  'bytes accessed operand 0 {}': 40000000.0,
  'bytes accessed operand 1 {}': 40000.0,
  'bytes accessed output {}': 40000.0,
  'flops': 20000000.0,
  'optimal_seconds': 0.0,
  'utilization operand 0 {}': 10.0,
  'utilization operand 1 {}': 10.0}]
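
Since 'optimal_seconds' is 0.0 in this output, it does not give a runtime by itself. One way to turn the reported 'flops' and 'bytes accessed' into a very rough lower bound is a roofline-style estimate. The sketch below is only an illustration: the helper name roofline_seconds and both peak hardware numbers are assumptions, not part of JAX or XLA, and you would need to substitute figures for your own device. It accepts either a dict or a one-element list of dicts, because the return type of cost_analysis() appears to differ between JAX versions.

```python
# Hypothetical hardware peaks; replace with the numbers for your own device.
ASSUMED_PEAK_FLOPS = 1e12      # FLOP/s (assumption, not measured)
ASSUMED_PEAK_BANDWIDTH = 1e11  # bytes/s (assumption, not measured)


def roofline_seconds(cost,
                     peak_flops=ASSUMED_PEAK_FLOPS,
                     peak_bandwidth=ASSUMED_PEAK_BANDWIDTH):
    """Rough lower bound on runtime from a cost_analysis() result."""
    # Accept both a plain dict and a list wrapping a single dict.
    if isinstance(cost, (list, tuple)):
        cost = cost[0]
    compute_time = cost.get("flops", 0.0) / peak_flops
    memory_time = cost.get("bytes accessed", 0.0) / peak_bandwidth
    # The slower of the two resources bounds the runtime from below.
    return max(compute_time, memory_time)


# Values copied from the cost_analysis() output shown above.
cost = [{"flops": 20000000.0, "bytes accessed": 40080000.0}]
estimate = roofline_seconds(cost)
print(f"rough lower bound: {estimate:.3e} s")
```

In real use you would pass the result of jax.jit(f).lower(M, x).compile().cost_analysis() directly. This is an analytical bound, not a prediction; to see the actual runtime you would still have to time the compiled function on the target hardware.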
