Passing tf.RaggedTensor to tfp.Distribution's methods in Python Tensorflow
Question
Suppose I have a tensorflow-probability
distribution
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions
distr = tfd.Normal(loc=0, scale=1)
and I want to evaluate negative log-likelihood on a ragged tensor
>>> type(rt)
tensorflow.python.ops.ragged.ragged_tensor.RaggedTensor
>>> rt.shape
TensorShape([50, None])
>>> distr.log_prob(rt) # Expected shape: [50, None]
ValueError: TypeError: object of type 'RaggedTensor' has no len()
I would like to use it in a loss function during model training (which means that I can't unstack the tensor's batch dimension):
>>> def negloglike(y, distr):
...     mean_over_samples = tf.reduce_mean(distr.log_prob(y), axis=-1)  # shape [batch_size]
...     return -tf.reduce_mean(mean_over_samples)
Attempting to do this produces an error:
<...>
line 2, in negloglike *
mean_over_samples = tf.reduce_mean(model.log_prob(y))
<...>
TypeError: Failed to convert elements of tf.RaggedTensor <...> to Tensor. <..>
I tried padding the RaggedTensor with NaNs to get a regular tf.Tensor and masking the NaN values after log_prob(y), but that made the model weights go NaN as well.
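The failure mode of the padding approach can be reproduced in isolation: once a NaN enters the forward pass, multiplying it by a zero mask does not remove it. NumPy is used here only to illustrate the IEEE-754 arithmetic; TensorFlow behaves the same way, and gradient computations propagate the NaN into the weights:

```python
import numpy as np

log_probs = np.array([0.5, np.nan])  # second entry came from a NaN pad
mask = np.array([1.0, 0.0])          # mask intended to drop the padded entry

masked = log_probs * mask            # nan * 0.0 is still nan
print(np.sum(masked))                # the padding poisons the whole reduction
```

A common remedy is to replace the padded entries with a safe dummy value before calling log_prob and only then mask, so no NaN is ever produced in the first place.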
Is there any way to overcome this obstacle?
Answer 1
Score: 2
I was able to find a solution that at least works. The most important detail is the tf.RaggedTensor.flat_values attribute: it turns the RaggedTensor into a regular Tensor by concatenating all the rows. (It works in a more involved way for RaggedTensors with multiple ragged dimensions.)
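The effect of flat_values (and of row_lengths(), used later in the answer) can be mimicked with nested lists and NumPy; the values below are made up for illustration and the TF attributes behave analogously for a single ragged dimension:

```python
import numpy as np

rows = [[0.5, -1.2, 0.3], [1.1], [-0.7, 2.0]]  # a ragged "tensor" as nested lists

flat_values = np.concatenate([np.asarray(r) for r in rows])  # all rows joined end to end
row_lengths = np.array([len(r) for r in rows])               # how many samples per row

print(flat_values)   # [ 0.5 -1.2  0.3  1.1 -0.7  2. ]
print(row_lengths)   # [3 1 2]
```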
Another thing is that I added an extra dimension to the distribution's batch_shape to make sure that distribution.log_prob() accepts the flattened RaggedTensor. However, this multiplies the number of computed probability values by roughly batch_size; most of them are discarded in the next step.
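The shape arithmetic here is ordinary broadcasting: a distribution with batch_shape (batch_size, 1) evaluated at total_samples flat points yields a (batch_size, total_samples) grid. A NumPy sketch with a hand-written Normal log-density standing in for tfd.Normal(...).log_prob (which broadcasts the same way):

```python
import numpy as np

batch_size, total_samples = 4, 6
loc = np.zeros((batch_size, 1))    # per-row parameters with a trailing axis of size 1
scale = np.ones((batch_size, 1))
x = np.linspace(-1.0, 1.0, total_samples)  # flattened samples, shape (total_samples,)

# (batch_size, 1) broadcast against (total_samples,) -> (batch_size, total_samples)
log_probs_2d = (-0.5 * ((x - loc) / scale) ** 2
                - np.log(scale) - 0.5 * np.log(2 * np.pi))
print(log_probs_2d.shape)  # roughly batch_size times more values than needed
```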
Finally, to obtain a 1D Tensor again, I used tf.gather_nd. It picks the correct values out of the 2D log_probs_2d into the 1D log_probs. The only operation left is averaging.
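The indexing step pairs sample i with its own row segment_ids[i], so the gather reads log_probs_2d[segment_ids[i], i], and the segment mean then averages per row. The same logic in NumPy (fancy indexing in place of tf.gather_nd, a loop in place of tf.math.segment_mean; the numbers are dummy values):

```python
import numpy as np

segment_ids = np.array([0, 0, 0, 1, 2, 2])               # sample -> row, as tf.repeat builds it
log_probs_2d = np.arange(18, dtype=float).reshape(3, 6)  # dummy (batch_size, total_samples) grid

total_samples = log_probs_2d.shape[1]
# equivalent of tf.gather_nd with indices [[seg_0, 0], [seg_1, 1], ...]
log_probs = log_probs_2d[segment_ids, np.arange(total_samples)]

# equivalent of tf.math.segment_mean: average the entries of each segment
mean_over_samples = np.array([log_probs[segment_ids == s].mean()
                              for s in np.unique(segment_ids)])
print(log_probs)           # [ 0.  1.  2.  9. 16. 17.]
print(mean_over_samples)   # [ 1.   9.  16.5]
```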
The full code is as follows:
def negloglike(y, distr):
    '''
    y: tf.RaggedTensor, shape (batch_size, None)
    distr: batch_shape==(None, 1), event_shape==()
    '''
    batch_size = tf.shape(y)[0]
    # tag each sample with the index of the row it came from
    segment_ids = tf.repeat(tf.range(batch_size), y.row_lengths())  # shape (total_samples,)
    total_samples = tf.shape(segment_ids)[0]
    flattened = y.flat_values  # shape (total_samples,)
    log_probs_2d = distr.log_prob(flattened)  # shape (batch_size, total_samples)
    ### computes ~batch_size times more values than needed; most are discarded below
    log_probs = tf.gather_nd(indices=tf.stack([segment_ids,
                                               tf.range(total_samples)], axis=1),
                             params=log_probs_2d)  # shape (total_samples,)
    mean_over_samples = tf.math.segment_mean(log_probs, segment_ids)  # shape (batch_size,)
    return -tf.reduce_mean(mean_over_samples)
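As a sanity check, the whole pipeline can be reproduced in NumPy for a single standard Normal shared by all rows (a reference sketch under that simplifying assumption, not the training-ready TF code):

```python
import numpy as np

def negloglike_numpy(rows, loc=0.0, scale=1.0):
    """NumPy mirror of negloglike for a Normal(loc, scale) shared by all rows."""
    flat = np.concatenate([np.asarray(r, dtype=float) for r in rows])
    segment_ids = np.repeat(np.arange(len(rows)), [len(r) for r in rows])
    # Normal log-density of every flattened sample
    log_probs = (-0.5 * ((flat - loc) / scale) ** 2
                 - np.log(scale) - 0.5 * np.log(2 * np.pi))
    # per-row mean, then mean over the batch, negated
    mean_over_samples = np.array([log_probs[segment_ids == s].mean()
                                  for s in range(len(rows))])
    return -mean_over_samples.mean()

rows = [[0.0, 1.0], [0.5]]
print(round(negloglike_numpy(rows), 4))  # 1.1064
```

Comparing this against the TF function on a small ragged batch is an easy way to confirm the gather/segment bookkeeping is right.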