How does placing the output (word) labels on the initial transitions of the words in an FST lead to effective composition?

Question

I am going through hbka.pdf (WFST paper). https://cs.nyu.edu/~mohri/pub/hbka.pdf

A WFST figure for reference

Here the input label i, the output label o, and weight w of a transition are marked on the corresponding directed arc by i: o/w.

I do not understand how a transducer can output the entire word on the initial transition. It would make sense to me if the entire word were output on the final transition.

Later I saw the following on page 19:

"In order for this transducer to efficiently compose with G, the output (word) labels must be placed on the initial transitions of the words; other locations would lead to delays in the composition matching, which could consume significant time and space."

ChatGPT answers that "placing the output labels on the initial transitions of the words in the Word bigram transducer enables more efficient composition with another transducer by optimizing the matching and combination of transitions."

But how exactly does it happen?

"Placing the output labels on the initial transitions ensures that the word transitions in the Word bigram transducer align directly with the transitions in the other transducer."

But the finite-state transducer still has to work out the whole word from input phone symbols such as d, ey, dx, ax, so how can that word be the output of the initial transition?

Answer 1

Score: 0

As far as I understand it, the output is determined on the first transition, but it is only actually produced once a final state is reached. In a sense the output is a hypothesis, and the subsequent transitions test it. If no final state is reached, the output accumulated so far is discarded.
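To make this hypothesise-then-verify reading concrete, here is a minimal Python sketch. The toy lexicon (the words "data" and "dew", their phone strings, the state numbers) and the function name `transduce` are assumptions made for illustration, not something taken from the paper, and weights are left out.

```python
EPS = ""  # empty output label (epsilon)

# ARCS[state] = list of (input_phone, output_label, next_state).
# Hypothetical toy lexicon: the word label sits on the FIRST arc of each word.
ARCS = {
    0: [("d", "data", 1), ("d", "dew", 5)],
    1: [("ey", EPS, 2), ("ae", EPS, 2)],
    2: [("dx", EPS, 3), ("t", EPS, 3)],
    3: [("ax", EPS, 4)],
    5: [("uw", EPS, 6)],
}
FINAL = {4, 6}


def transduce(phones, start=0):
    """Return the outputs of all paths that consume `phones` and end in a final state."""
    # Each hypothesis is (current_state, output_so_far).
    hypotheses = [(start, [])]
    for phone in phones:
        extended = []
        for state, out in hypotheses:
            for inp, label, nxt in ARCS.get(state, []):
                if inp == phone:
                    extended.append((nxt, out + ([label] if label else [])))
        hypotheses = extended  # hypotheses that could not be extended are dropped
    # Only hypotheses that reached a final state actually yield output.
    return [out for state, out in hypotheses if state in FINAL]


print(transduce(["d", "ey", "dx", "ax"]))  # [['data']]
print(transduce(["d", "uw"]))              # [['dew']]
print(transduce(["d", "ey"]))              # [] : no final state reached
```

After reading "d" the sketch holds two hypotheses, "data" and "dew"; the later phones eliminate whichever one does not fit, and an incomplete input yields nothing at all.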

One advantage is that multiple paths with identical output (the upper path in your example image) do not have to repeat the output on every final transition. Also, if several inputs have similar endings, you can merge their paths later on, which might make the FST more efficient. Imagine how many English words end in /ing/, /ed/, or /s/: their paths can all lead into the same final states, but not if the output is generated at the end, because then the last transition of every word carries a different label.
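The effect on shared endings can be sketched with a small count. This is a simplification, not a real minimization algorithm: it treats each word as one linear path and considers two interior states mergeable only when the remaining arcs (input and output labels together) are identical. The words, phone strings, and function names are illustrative assumptions.

```python
EPS = "<eps>"


def word_path(word, phones, label_first):
    """Arcs (phone, output) for one word, with the word label on the first or last arc."""
    arcs = []
    for i, phone in enumerate(phones):
        labelled = (i == 0) if label_first else (i == len(phones) - 1)
        arcs.append((phone, word if labelled else EPS))
    return arcs


def interior_states_after_tail_merge(lexicon, label_first):
    """Count interior states when states with identical remaining arcs are merged."""
    tails = set()
    for word, phones in lexicon.items():
        arcs = word_path(word, phones, label_first)
        # The state reached after i arcs is identified by the arcs still to come.
        for i in range(1, len(arcs)):
            tails.add(tuple(arcs[i:]))
    return len(tails)


# Hypothetical pronunciations, chosen so that the endings overlap.
LEXICON = {
    "walking": ["w", "ao", "k", "ih", "ng"],
    "talking": ["t", "ao", "k", "ih", "ng"],
    "walked": ["w", "ao", "k", "t"],
}

print(interior_states_after_tail_merge(LEXICON, label_first=True))   # 7
print(interior_states_after_tail_merge(LEXICON, label_first=False))  # 11
```

With the label on the first arc, "walking" and "talking" share all four of their interior states. With the label on the last arc, every tail ends in a different word label, so nothing can be shared.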

I guess a further reason is that this makes the FST easier to manipulate when it is combined with other FSTs. When you merge two FSTs, it is always easier to push the output generation further back along the path than to deal with it when it already sits at the end of the path.
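The "delays in the composition matching" that the quoted passage warns about can be illustrated by counting the state pairs a naive composition has to visit. The sketch below rests on several assumptions: a four-word toy lexicon whose paths loop back to state 0, a grammar G that accepts only the word sequence "data dew", no weights, and no epsilon arcs in G (so no composition filter is needed). All names are hypothetical.

```python
EPS = "<eps>"


def build_lexicon(lexicon, label_first):
    """arcs[state] = [(phone, output, next_state)]; every word path returns to state 0."""
    arcs = {}
    next_id = 1
    for word, phones in lexicon.items():
        state = 0
        for i, phone in enumerate(phones):
            is_last = i == len(phones) - 1
            labelled = (i == 0) if label_first else is_last
            if is_last:
                target = 0
            else:
                target = next_id
                next_id += 1
            arcs.setdefault(state, []).append((phone, word if labelled else EPS, target))
            state = target
    return arcs


def reachable_pairs(lex_arcs, grammar, g_start=0):
    """State pairs (lexicon_state, grammar_state) visited while composing L with G."""
    start = (0, g_start)
    seen = {start}
    stack = [start]
    while stack:
        l_state, g_state = stack.pop()
        for _phone, out, l_next in lex_arcs.get(l_state, []):
            if out == EPS:
                pair = (l_next, g_state)                # L emits nothing, G stays put
            elif out in grammar.get(g_state, {}):
                pair = (l_next, grammar[g_state][out])  # word matches an arc of G
            else:
                continue                                # G does not allow this word here
            if pair not in seen:
                seen.add(pair)
                stack.append(pair)
    return seen


LEXICON = {
    "data": ["d", "ey", "dx", "ax"],
    "dew": ["d", "uw"],
    "walking": ["w", "ao", "k", "ih", "ng"],
    "talking": ["t", "ao", "k", "ih", "ng"],
}
# grammar[state][word] = next_state; accepts only "data dew".
GRAMMAR = {0: {"data": 1}, 1: {"dew": 2}}

first = len(reachable_pairs(build_lexicon(LEXICON, True), GRAMMAR))
last = len(reachable_pairs(build_lexicon(LEXICON, False), GRAMMAR))
print(first, last)  # 7 39
```

With the label on the first arc, a word that G does not allow is rejected before any of its phones are explored. With the label on the last arc, every phone path of every word is built in every grammar state, and the mismatch is only discovered at the end. Those dead ends can be trimmed afterwards, but they still have to be generated first, which is where the extra time and space go.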

huangapple
  • Posted on July 10, 2023, 14:11:45
  • If you repost this, please keep the link to this article: https://go.coder-hub.com/76651059.html