Passing tf.RaggedTensor to tfp.Distribution’s methods in Python Tensorflow.


Question


Suppose I have a TensorFlow Probability distribution:

import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

distr = tfd.Normal(loc=0, scale=1)

and I want to evaluate the negative log-likelihood on a ragged tensor:

>>> type(rt)
tensorflow.python.ops.ragged.ragged_tensor.RaggedTensor
>>> rt.shape
TensorShape([50, None])
>>> distr.log_prob(rt)  # Expected shape: [50, None]
ValueError: TypeError: object of type 'RaggedTensor' has no len()

I would like to use it in a loss function during model training (which means that I can't unstack the tensor's batch dimension):

>>> def negloglike(y, distr):
        mean_over_samples = tf.reduce_mean(distr.log_prob(y), axis=-1)  # shape [batch_size]
        return -tf.reduce_mean(mean_over_samples)

Attempting this produces an error:

<...>
line 2, in negloglike  *
        mean_over_samples = tf.reduce_mean(model.log_prob(y))
<...>
TypeError: Failed to convert elements of tf.RaggedTensor <...> to Tensor. <..>

I tried padding the RaggedTensor with NaNs to get a regular tf.Tensor and then masking out the NaN values after log_prob(y), but that made the model weights go NaN as well.

Is there any way to overcome this obstacle?

Answer 1

Score: 2


I was able to find a solution that at least works. The most important detail is the tf.RaggedTensor.flat_values attribute, which turns the RaggedTensor into a regular Tensor by concatenating all of its rows. (It behaves in a more complicated way for RaggedTensors with multiple ragged dimensions.)
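As a small illustration of what flat_values returns, here is a sketch on a made-up ragged tensor (the values in rt are hypothetical). It also shows value_rowids(), which reports the row each flattened value came from and so carries the same information as segment_ids in the code further down.

```python
import tensorflow as tf

# Hypothetical ragged batch: 3 rows with 3, 1 and 2 samples.
rt = tf.ragged.constant([[1.0, 2.0, 3.0], [4.0], [5.0, 6.0]])

flat = rt.flat_values        # all rows concatenated, shape (6,)
row_ids = rt.value_rowids()  # row index of every flattened value, shape (6,)
lengths = rt.row_lengths()   # number of samples per row, shape (3,)

print(flat.numpy())     # [1. 2. 3. 4. 5. 6.]
print(row_ids.numpy())  # [0 0 0 1 2 2]
print(lengths.numpy())  # [3 1 2]
```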

I also added another dimension to the distribution's batch_shape so that distribution.log_prob() accepts the flattened RaggedTensor. However, this increases the number of computed log-probability values by a factor of roughly batch_size, and most of them are discarded in the next step.
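The shape blow-up comes from ordinary broadcasting. The sketch below reproduces it without TensorFlow Probability: it writes out the log-density of a unit-scale normal by hand as a stand-in for distr.log_prob, with a made-up loc of shape (3, 1) playing the role of a distribution with batch_shape (3, 1), and a made-up flat vector of 5 samples.

```python
import math
import tensorflow as tf

# Hypothetical parameters: one mean per batch row, with a trailing axis of size 1.
loc = tf.constant([[0.0], [1.0], [2.0]])    # shape (3, 1)
x = tf.constant([0.0, 1.0, 2.0, 3.0, 4.0])  # flattened samples, shape (5,)

# Hand-written log-density of Normal(loc, 1), standing in for distr.log_prob(x).
# (5,) broadcasts against (3, 1), so every row is evaluated on every sample.
log_density = -0.5 * tf.square(x - loc) - 0.5 * math.log(2.0 * math.pi)

print(log_density.shape)  # (3, 5)
```

Only one entry per column belongs to the row that actually produced the sample, which is why most of the table is wasted work.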

Finally, to get back to a 1D Tensor, I used tf.gather_nd, which picks the correct values from the 2D log_probs_2d into the 1D log_probs. The only operation left is averaging.
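To make the indexing concrete, here is a sketch of the gather step on a toy table. The table values and segment_ids are invented for illustration; table stands in for log_probs_2d with 3 rows and 6 flattened samples.

```python
import tensorflow as tf

# Stand-in for log_probs_2d: entry [i, j] is "row i's distribution at sample j".
table = tf.reshape(tf.range(18, dtype=tf.float32), [3, 6])

# Row that each of the 6 flattened samples belongs to (row lengths 3, 1, 2).
segment_ids = tf.constant([0, 0, 0, 1, 2, 2])
columns = tf.range(6)

# One (row, column) pair per sample, shape (6, 2).
indices = tf.stack([segment_ids, columns], axis=1)
picked = tf.gather_nd(params=table, indices=indices)  # shape (6,)

# Average the picked values within each row, shape (3,).
row_means = tf.math.segment_mean(picked, segment_ids)

print(picked.numpy())     # [ 0.  1.  2.  9. 16. 17.]
print(row_means.numpy())  # [ 1.   9.  16.5]
```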

The full code is as follows:

def negloglike(y, distr):
    '''
    y: tf.RaggedTensor, shape (batch_size, None)
    distr: batch_shape==(batch_size, 1), event_shape==()
    '''

    batch_size = tf.shape(y)[0]
    segment_ids = tf.repeat(tf.range(batch_size), y.row_lengths())  # shape (total_samples, )
    total_samples = tf.shape(segment_ids)[0]
    
    flattened = y.flat_values  # shape (total_samples, )
    log_probs_2d = distr.log_prob(flattened)  # shape (batch_size, total_samples)
    # Note: this computes batch_size times more values than are needed.

    log_probs = tf.gather_nd(indices=tf.stack([segment_ids,
                                               tf.range(total_samples)], axis=1), 
                             params=log_probs_2d)  # shape (total_samples, )
    
    mean_over_samples = tf.math.segment_mean(log_probs, segment_ids)  # shape (batch_size, )
    
    return -tf.reduce_mean(mean_over_samples)

huangapple
  • Posted on May 13, 2023 at 18:49:06
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76242328.html