Copy Stage on IBM DataStage
Question
I ran into something strange while using a Copy stage to insert data into a table.
All columns are processed in a Transformer, and two of them are special.
Column A uses the Index function to perform a LIKE-style match on a string.
Column B is the string that Column A is derived from; its values are sorted in alphabetical order within the column.
The following derivation uses the Index function to assign a value to Column A:
IF index(COL_B, "ABC", 1) >0 THEN 'ABC' ELSE COL_B
I expected Column A in the target table to contain either 'ABC' or the original string from Column B.
When I checked the table, however, Column A had not changed at all.
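For reference, the behaviour I expect from that derivation can be sketched in Python. This is only a model, not DataStage code: the `index` helper below is a hypothetical stand-in that assumes Index returns the 1-based start position of the requested occurrence and 0 when the substring is not found, and it does not try to reproduce edge cases such as an empty substring.

```python
def index(string: str, substring: str, occurrence: int) -> int:
    """Rough model of Index: 1-based start of the nth occurrence, 0 if absent."""
    if occurrence < 1 or not substring:
        return 0  # assumption: edge cases are not modelled here
    pos = -1
    for _ in range(occurrence):
        # Search for the next occurrence after the previous match position.
        pos = string.find(substring, pos + 1)
        if pos == -1:
            return 0
    return pos + 1  # convert Python's 0-based offset to a 1-based position


def derive_col_a(col_b: str) -> str:
    """Mirror of: IF index(COL_B, "ABC", 1) > 0 THEN 'ABC' ELSE COL_B"""
    return "ABC" if index(col_b, "ABC", 1) > 0 else col_b
```

With this model, any COL_B value containing "ABC" yields 'ABC' in Column A, and every other value is passed through unchanged.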
By the way, after I placed a Transformer between the Copy stage and the target table, Column A was updated.
I couldn't find the reason for this, or any clue or explanation, by searching online. Is this normal behaviour for a Copy stage? If so, our team needs to know that it can happen at any time.
- Original flow: columns A and B are not updated
- With a Transformer added between the Copy stage and the target table: columns A and B are updated
======================================================
[Updates]
1. The error can no longer be reproduced, even after creating a new job by copying the original job and creating a new target table on Oracle.
2. Once the data flow had been corrected by adding the second Transformer, both the original job and the copied job send the correct data without Stage Variables, even though the second Transformer was removed after the correction.
3. Because of update 1, Stage Variables in the Transformer of the original job work fine, but I could not test them against the incorrect data flow.
Answer 1
Score: 1
<s>A common reason for such code not to work is the order in which columns are processed.
The compiler might change that order when optimizing the job. This is actually valid, because even though we see the columns sorted top-down, they can be treated as unordered. (The Connector stage should map the columns by name.)</s> <sup>I might have mixed something up here, but the following suggestion is still valid:</sup>
In Trans_1, move your code into Stage Variables. These are processed top-down.
If the problem still exists:
- Try experimenting with the Force option of the Copy stage
- or contact IBM support for in-depth analysis.
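To illustrate the Stage Variable suggestion, here is a rough Python model of top-down evaluation. It is not DataStage code and says nothing about how the compiler actually behaves; the names `svHasAbc`, `svColA`, `STAGE_VARS` and `transform` are made up for this sketch. The point is only that each variable is computed in order from the input row, so a later variable can rely on an earlier one, and the output columns then read the finished values.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]
# Hypothetical stage variable list: (name, derivation) pairs, evaluated in order.
StageVar = Tuple[str, Callable[[Row, Dict[str, str]], str]]

STAGE_VARS: List[StageVar] = [
    # First variable: does the input COL_B contain "ABC"?
    ("svHasAbc", lambda row, sv: "Y" if "ABC" in row["COL_B"] else "N"),
    # Second variable: may use svHasAbc because it was evaluated earlier.
    ("svColA", lambda row, sv: "ABC" if sv["svHasAbc"] == "Y" else row["COL_B"]),
]


def transform(row: Row) -> Row:
    """Evaluate stage variables top-down, then build the output row."""
    sv: Dict[str, str] = {}
    for name, derivation in STAGE_VARS:
        sv[name] = derivation(row, sv)
    # Output derivations only read the input row and the finished variables.
    return {"COL_A": sv["svColA"], "COL_B": row["COL_B"]}
```

For example, `transform({"COL_A": "old", "COL_B": "xxABC"})` gives a row whose COL_A is 'ABC', regardless of what COL_A held on input.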