Sorting on similar values using Talend

Question

I have a CSV file. Using Talend Open Studio for Data Integration, I want to group rows that share the same DeptID, sort them by MID in ascending order, and assign the lowest MID in each group to SID for every row in that group. If a DeptID has only one row, SID simply takes that row's MID.

Input CSV (the SID column is empty):

DeptID  SID  StudentName  MID
111          Nancy        C1
111          Nancy        B1
111          Nancy        A1
222          James        Z1

I read the input file with tFileInputDelimited, sort on MID with tSortRow, and group the values with tAggregateRow.

This is the output I am getting (SID is still empty and the MID values are collapsed into a list):

DeptID  SID  StudentName  MID
111          Nancy        [A1,B1,C1]
222          James        [Z1]

The output CSV should be as follows:

DeptID  SID  StudentName  MID
111     A1   Nancy        A1
111     A1   Nancy        B1
111     A1   Nancy        C1
222     Z1   James        Z1

Answer 1

Score: 1

One simple solution is to read your input file twice: once as the main flow, to get the detail rows with the MID column, and once as a lookup, to get the minimum MID per DeptID, which becomes your SID column. Then join the two flows in a tMap on DeptID (with the "all matches" join type).
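Outside of Talend, the logic of this first approach can be sketched in plain Java. This is only an illustration, not the job itself: the class and method names are made up, each row is assumed to be a String array of {DeptID, StudentName, MID}, and MID values are assumed to compare correctly as plain strings.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the "main flow + lookup" approach.
public class MinMidLookup {

    // Lookup flow: the smallest MID for each DeptID.
    // Input rows are {DeptID, StudentName, MID}.
    static Map<String, String> minMidByDept(List<String[]> rows) {
        Map<String, String> min = new HashMap<>();
        for (String[] r : rows) {
            // Keep whichever MID sorts first (string comparison, an assumption).
            min.merge(r[0], r[2], (a, b) -> a.compareTo(b) <= 0 ? a : b);
        }
        return min;
    }

    // Main flow: sort by DeptID then MID, and join each row to the lookup on DeptID.
    // Output rows are {DeptID, SID, StudentName, MID}.
    static List<String[]> joinWithMinMid(List<String[]> rows) {
        Map<String, String> lookup = minMidByDept(rows);
        List<String[]> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparing((String[] r) -> r[0]).thenComparing(r -> r[2]));
        List<String[]> out = new ArrayList<>();
        for (String[] r : sorted) {
            out.add(new String[] {r[0], lookup.get(r[0]), r[1], r[2]});
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> input = new ArrayList<>();
        input.add(new String[] {"111", "Nancy", "C1"});
        input.add(new String[] {"111", "Nancy", "B1"});
        input.add(new String[] {"111", "Nancy", "A1"});
        input.add(new String[] {"222", "James", "Z1"});
        for (String[] row : joinWithMinMid(input)) {
            System.out.println(String.join(" ", row));
        }
    }
}
```

Every detail row survives the join because the lookup holds exactly one entry per DeptID, so the MID values are not collapsed into a list.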

Another solution is to use tMap's internal variables, which gets the work done with fewer components.

Once you have sorted your data with tSortRow, create two variables in a tMap placed after the tSortRow component:

  • "sequence" creates a counter per DeptID, starting at 1.

  • "currentVal" checks whether sequence equals 1: if so, it takes the current MID as SID; otherwise SID keeps its previous value.
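The two variables can be simulated in plain Java to show why the rows must already be sorted. This is a sketch under assumptions: the class and method names are hypothetical, and rows are String arrays of {DeptID, StudentName, MID} sorted by DeptID and then MID. In a real tMap the variable expressions might look like `Numeric.sequence(row1.DeptID, 1, 1)` for "sequence" and `Var.sequence == 1 ? row1.MID : Var.currentVal` for "currentVal", where `row1` is an assumed input flow name; check these against your own job.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of the two tMap variables.
public class SequenceSid {

    // Input rows are {DeptID, StudentName, MID}, already sorted by DeptID then MID.
    // Output rows are {DeptID, SID, StudentName, MID}.
    static List<String[]> assignSid(List<String[]> sortedRows) {
        Map<String, Integer> counters = new HashMap<>(); // one counter per DeptID
        String currentVal = null;                        // carried over from row to row
        List<String[]> out = new ArrayList<>();
        for (String[] r : sortedRows) {
            // "sequence": 1 for the first row of a DeptID, then 2, 3, ...
            int sequence = counters.merge(r[0], 1, Integer::sum);
            // "currentVal": take the MID of the first row in the group, else keep the old value.
            if (sequence == 1) {
                currentVal = r[2];
            }
            out.add(new String[] {r[0], currentVal, r[1], r[2]});
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> sorted = new ArrayList<>();
        sorted.add(new String[] {"111", "Nancy", "A1"});
        sorted.add(new String[] {"111", "Nancy", "B1"});
        sorted.add(new String[] {"111", "Nancy", "C1"});
        sorted.add(new String[] {"222", "James", "Z1"});
        for (String[] row : assignSid(sorted)) {
            System.out.println(String.join(" ", row));
        }
    }
}
```

Because the first row of each DeptID group carries the lowest MID after sorting, that value is the one remembered for the rest of the group. If the rows arrive unsorted, the first MID seen is used instead, which is generally not the minimum.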

Both solutions produce the SID value.

huangapple
  • Posted on August 10, 2023, 18:52:00
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76875039.html