Excel Input Pentaho - String to Number
Question
I'm trying to read an xlsx file with an Excel input step in Pentaho, but it keeps giving me this error message: "Unexpected conversion error while converting value [v String] to a Number"
I have a value column that I'm trying to convert from String to Number.
In row 245 of my Excel file I have USD 11100.00, while the other rows contain just the values without the USD. Could that be the problem? If so, do you have any idea how to solve it?
I need to convert from string to number without pulling in the USD. Just the numbers.
Answer 1
Score: 1
The answer depends on how much room you have to impose your own format, and on how many "problems" you expect to find in your input data.
Your input is non-standard in the sense that the file mixes formats across rows/columns. You have various options:
- The file you need to read is generated by your company or by someone you know and can talk to, so you can reject the file for not following the standard format, and they can generate a new one in the expected format.
- You have no say over the file, for example because you are processing data from the internet or public data, so you have to deal with the mixed formats yourself.
- You ignore all the rows that don't follow the format you expect, and maybe generate a file with all the rejected rows (there are additional options in the Input step for that) so you can process them manually.
- You initially treat that column as a string and then use the Regexp step to extract only the numbers. If the regular expression can't extract a number, the processed column will be null, so the data is loaded with null values wherever it can't be read as a number.
Depending on the nature of your project, the volume of data, and the expected volume of non-standard data, any of the proposed solutions might work.
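The logic of the last option (read the column as a string, pull out the number with a regular expression, and fall back to null) can be sketched outside Pentaho. The snippet below is a minimal Python illustration, not Pentaho code: the pattern, the `extract_number` helper, and the sample values are assumptions made for the example. The pattern also assumes a dot as the decimal separator and no thousands separators, so a value like `11,100.00` would need a different pattern.

```python
import re
from typing import Optional

# Assumed pattern: optional sign, digits, optional decimal part.
# It does not handle thousands separators or comma decimals.
NUMBER_PATTERN = re.compile(r"[-+]?\d+(?:\.\d+)?")


def extract_number(raw: Optional[str]) -> Optional[float]:
    """Return the first number found in raw, or None if there is none."""
    if raw is None:
        return None
    match = NUMBER_PATTERN.search(raw)
    return float(match.group()) if match else None


# Hypothetical cell values read as strings from the value column.
values = ["USD 11100.00", "250.5", "n/a"]
parsed = [extract_number(v) for v in values]
print(parsed)
```

Rows where the result is `None` correspond to the rows that would end up with a null in the processed column, and they are also the candidates for a rejected-rows file if you prefer to review them manually.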