Passing tf.RaggedTensor to tfp.Distribution's methods in Python Tensorflow
Question
Suppose I have a tensorflow-probability
distribution
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions
distr = tfd.Normal(loc=0, scale=1)
and I want to evaluate negative log-likelihood on a ragged tensor
>>> type(rt)
tensorflow.python.ops.ragged.ragged_tensor.RaggedTensor
>>> rt.shape
TensorShape([50, None])
>>> distr.log_prob(rt) # Expected shape: [50, None]
ValueError: TypeError: object of type 'RaggedTensor' has no len()
I would like to use it in a loss function during model training (which means that I can't unstack the tensor's batch dimension):
>>> def negloglike(y, distr):
...     mean_over_samples = tf.reduce_mean(distr.log_prob(y), axis=-1)  # shape [batch_size]
...     return -tf.reduce_mean(mean_over_samples)
Attempting to do this produces an error:
<...>
line 2, in negloglike *
mean_over_samples = tf.reduce_mean(model.log_prob(y))
<...>
TypeError: Failed to convert elements of tf.RaggedTensor <...> to Tensor. <..>
I tried padding the RaggedTensor with NaNs to get a regular tf.Tensor and masking the NaN values after log_prob(y), but that made the model weights go NaN as well.
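The failure mode of the padding approach can be reproduced in isolation: once a NaN enters the forward pass, multiplying it by a zero mask does not remove it. NumPy is used here only to illustrate the IEEE-754 arithmetic; TensorFlow behaves the same way, and gradient computations propagate the NaN into the weights:

```python
import numpy as np

log_probs = np.array([0.5, np.nan])  # second entry came from a NaN pad
mask = np.array([1.0, 0.0])          # mask intended to drop the padded entry

masked = log_probs * mask            # nan * 0.0 is still nan
print(np.sum(masked))                # the padding poisons the whole reduction
```

A common remedy is to replace the padded entries with a safe dummy value before calling log_prob and only then mask, so no NaN is ever produced in the first place.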
Is there any way to overcome this obstacle?
Answer 1
Score: 2
I was able to find a solution that at least works. The most important detail is the tf.RaggedTensor.flat_values attribute: it turns the RaggedTensor into a regular Tensor by concatenating all the rows. (It works in a more involved way for RaggedTensors with multiple ragged dimensions.)
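The effect of flat_values (and of row_lengths(), used later in the answer) can be mimicked with nested lists and NumPy; the values below are made up for illustration and the TF attributes behave analogously for a single ragged dimension:

```python
import numpy as np

rows = [[0.5, -1.2, 0.3], [1.1], [-0.7, 2.0]]  # a ragged "tensor" as nested lists

flat_values = np.concatenate([np.asarray(r) for r in rows])  # all rows joined end to end
row_lengths = np.array([len(r) for r in rows])               # how many samples per row

print(flat_values)   # [ 0.5 -1.2  0.3  1.1 -0.7  2. ]
print(row_lengths)   # [3 1 2]
```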
Another thing is that I added an extra dimension to the distribution's batch_shape to make sure that distribution.log_prob() accepts the flattened RaggedTensor. However, this multiplies the number of computed probability values by roughly batch_size; most of them are discarded in the next step.
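The shape arithmetic here is ordinary broadcasting: a distribution with batch_shape (batch_size, 1) evaluated at total_samples flat points yields a (batch_size, total_samples) grid. A NumPy sketch with a hand-written Normal log-density standing in for tfd.Normal(...).log_prob (which broadcasts the same way):

```python
import numpy as np

batch_size, total_samples = 4, 6
loc = np.zeros((batch_size, 1))    # per-row parameters with a trailing axis of size 1
scale = np.ones((batch_size, 1))
x = np.linspace(-1.0, 1.0, total_samples)  # flattened samples, shape (total_samples,)

# (batch_size, 1) broadcast against (total_samples,) -> (batch_size, total_samples)
log_probs_2d = (-0.5 * ((x - loc) / scale) ** 2
                - np.log(scale) - 0.5 * np.log(2 * np.pi))
print(log_probs_2d.shape)  # roughly batch_size times more values than needed
```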
Finally, to obtain a 1D Tensor again, I used tf.gather_nd. It picks the correct values out of the 2D log_probs_2d into the 1D log_probs. The only operation left is averaging.
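The indexing step pairs sample i with its own row segment_ids[i], so the gather reads log_probs_2d[segment_ids[i], i], and the segment mean then averages per row. The same logic in NumPy (fancy indexing in place of tf.gather_nd, a loop in place of tf.math.segment_mean; the numbers are dummy values):

```python
import numpy as np

segment_ids = np.array([0, 0, 0, 1, 2, 2])               # sample -> row, as tf.repeat builds it
log_probs_2d = np.arange(18, dtype=float).reshape(3, 6)  # dummy (batch_size, total_samples) grid

total_samples = log_probs_2d.shape[1]
# equivalent of tf.gather_nd with indices [[seg_0, 0], [seg_1, 1], ...]
log_probs = log_probs_2d[segment_ids, np.arange(total_samples)]

# equivalent of tf.math.segment_mean: average the entries of each segment
mean_over_samples = np.array([log_probs[segment_ids == s].mean()
                              for s in np.unique(segment_ids)])
print(log_probs)           # [ 0.  1.  2.  9. 16. 17.]
print(mean_over_samples)   # [ 1.   9.  16.5]
```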
The full code is as follows:
def negloglike(y, distr):
    '''
    y: tf.RaggedTensor, shape (batch_size, None)
    distr: batch_shape==(None, 1), event_shape==()
    '''
    batch_size = tf.shape(y)[0]
    # tag each sample with the index of the row it came from
    segment_ids = tf.repeat(tf.range(batch_size), y.row_lengths())  # shape (total_samples,)
    total_samples = tf.shape(segment_ids)[0]
    flattened = y.flat_values  # shape (total_samples,)
    log_probs_2d = distr.log_prob(flattened)  # shape (batch_size, total_samples)
    ### computes ~batch_size times more values than needed; most are discarded below
    log_probs = tf.gather_nd(indices=tf.stack([segment_ids,
                                               tf.range(total_samples)], axis=1),
                             params=log_probs_2d)  # shape (total_samples,)
    mean_over_samples = tf.math.segment_mean(log_probs, segment_ids)  # shape (batch_size,)
    return -tf.reduce_mean(mean_over_samples)
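As a sanity check, the whole pipeline can be reproduced in NumPy for a single standard Normal shared by all rows (a reference sketch under that simplifying assumption, not the training-ready TF code):

```python
import numpy as np

def negloglike_numpy(rows, loc=0.0, scale=1.0):
    """NumPy mirror of negloglike for a Normal(loc, scale) shared by all rows."""
    flat = np.concatenate([np.asarray(r, dtype=float) for r in rows])
    segment_ids = np.repeat(np.arange(len(rows)), [len(r) for r in rows])
    # Normal log-density of every flattened sample
    log_probs = (-0.5 * ((flat - loc) / scale) ** 2
                 - np.log(scale) - 0.5 * np.log(2 * np.pi))
    # per-row mean, then mean over the batch, negated
    mean_over_samples = np.array([log_probs[segment_ids == s].mean()
                                  for s in range(len(rows))])
    return -mean_over_samples.mean()

rows = [[0.0, 1.0], [0.5]]
print(round(negloglike_numpy(rows), 4))  # 1.1064
```

Comparing this against the TF function on a small ragged batch is an easy way to confirm the gather/segment bookkeeping is right.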