January 3, 2020, 20:50:13

Is there any way to merge rows to fill null values in Talend Open Studio?

Question

I'm having difficulty with a job in Talend Open Studio.

My question is: how can I fill null values with non-null values from the same column, taken from rows that share the same key?

Suppose I have source data like this.

And I'd like to get a result like the following:

I've tried several ways to solve this, but I couldn't find one that works.

If you have an idea, please help me.

Added: a more intuitive example

Each key might have multiple values for the same column. They should not be combined into a single row with commas, like "C-1, C-2, C-3". Instead, they should be filled in from the top, starting at the first row with the same key.

This is why the first ID has three rows while the second one has only one row.

Answer 1

Score: 0

Use a tMap with a coalesce-like expression. In the tMap you can join the two datasets (by default it performs a left join, which is what you need here). Then an expression like this:

A == null ? B : A

would give you what you need.
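For illustration only, here is that ternary as a small stand-alone Java helper. The class name CoalesceSketch, the method name coalesce, and the sample values are hypothetical. In an actual job the expression would be typed directly into a tMap output column, with A and B replaced by the main-flow and lookup-flow columns.

```java
// Hypothetical stand-alone illustration of the coalesce-like expression.
// This is plain Java, not a Talend routine.
class CoalesceSketch {
    // Returns a when it is non-null, otherwise b (same as: a == null ? b : a).
    static <T> T coalesce(T a, T b) {
        return a == null ? b : a;
    }

    public static void main(String[] args) {
        String fromMain = null;     // value from the main flow (missing)
        String fromLookup = "C-1";  // value from the joined lookup flow
        System.out.println(coalesce(fromMain, fromLookup)); // prints C-1
        System.out.println(coalesce("A-1", fromLookup));    // prints A-1
    }
}
```

Note that this only picks between two candidate values on one joined row. On its own it does not spread several values for the same key across several rows.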

Answer 2

Score: 0

I figured out one solution myself, so I'm sharing it.

The keys to the solution are the tDenormalize component and an additional key value for each row.

If you use only the tDenormalize component without an additional key column, you get multiple values in a single column of one row, separated by the delimiter you specified. That is exactly what I said I didn't want.

To get exactly the result I wanted in the question, give the rows an additional key value.

I did something like this as a preliminary step:

row2.tmpKey = Numeric.sequence(row1.EmployeeID + "PartA", 1, 1);

So, the raw data would look like this:

EE_ID,ColumnA,ColumnB,ColumnC,TmpKey
EE001,Part A value,null,null,1
EE001,null,Part B value,null,1
EE001,null,Part B value,null,2
EE001,null,null,Part C value,1
...

Then set "To denormalize columns: ColumnA, ColumnB, ColumnC" in the Basic settings of the tDenormalize component view.
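To make the idea easier to follow outside the Studio, here is a rough plain-Java sketch of the same logic. It does not use any Talend API. The class name FillNullsSketch, the method merge, and the sample values are all hypothetical. One assumption differs from the job above: the counter is kept per key and per column (rather than one sequence call per row), so a row holding several non-null columns is handled too. Rows are then grouped by key plus counter, which plays the role of TmpKey.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical emulation of "extra key + denormalize" in plain Java.
class FillNullsSketch {
    // Each input row is {key, colA, colB, colC, ...}; null marks a missing value.
    static List<String[]> merge(List<String[]> rows) {
        // One running counter per key + column, starting at 1 (the TmpKey idea).
        Map<String, Integer> counters = new HashMap<>();
        // One output row per key + counter value, in order of first appearance.
        Map<String, String[]> merged = new LinkedHashMap<>();
        for (String[] row : rows) {
            String key = row[0];
            for (int col = 1; col < row.length; col++) {
                if (row[col] == null) {
                    continue; // nothing to place for this column
                }
                int tmpKey = counters.merge(key + "#" + col, 1, Integer::sum);
                String[] out = merged.computeIfAbsent(key + "#" + tmpKey, k -> {
                    String[] fresh = new String[row.length];
                    fresh[0] = key;
                    return fresh;
                });
                out[col] = row[col]; // n-th value of a column lands in the n-th row
            }
        }
        return new ArrayList<>(merged.values());
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"EE001", "A-1", null, null},
            new String[] {"EE001", null, "B-1", null},
            new String[] {"EE001", null, "B-2", null},
            new String[] {"EE001", null, null, "C-1"},
            new String[] {"EE002", "A-2", null, null});
        for (String[] out : merge(rows)) {
            System.out.println(Arrays.toString(out));
        }
        // prints:
        // [EE001, A-1, B-1, C-1]
        // [EE001, null, B-2, null]
        // [EE002, A-2, null, null]
    }
}
```

Each key ends up with as many rows as its fullest column has values, and every column is filled from the top.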
