Phone numbers loaded into my C# .NET Core web app from an Excel file have encoding issues
Question
I'm loading a collection of rows from Excel into a MemoryStream. The Excel file contains a phone number column.
For example:
> phoneNumber="0511234567"
Until now I had no issues with my code. Then I received an Excel file that causes the following problem:
For the example above, if I run the following commands I get:
phoneNumber.Length
==> 11 // it should be 10
phoneNumber.Substring(0,2)
==> "0"
phoneNumber.Substring(0,1)
==> ""
phoneNumber.StartsWith("05")
==> true
In my code I check whether phoneNumber.Substring(0,2) == "05", and for this Excel file I get false, even though the actual value appears to start with "05".
As you can see above, the first char displays as nothing, yet it is counted as a character.
I assume this is related to encoding. I tried converting the string to bytes, decoding, and re-encoding, with no success.
I would appreciate any input.
Answer 1
Score: 0
I could reproduce the behavior by adding a null character to the beginning of the phone number:
var phoneNumber = "\00511234567";
So you probably have a non-printing character at the beginning of the string. To remove control characters from the string you can use the following:
phoneNumber = new string(phoneNumber.Where(c => !char.IsControl(c)).ToArray());
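To confirm which character is actually hiding at the front, you can print the numeric code and Unicode category of each char. This is a minimal sketch, assuming .NET 6+ top-level statements; the Describe helper is a made-up name, and the sample value is the null-prefixed string from above rather than your real data:

```csharp
using System;
using System.Linq;

// Sample value from the repro above: a null char followed by the digits.
var phoneNumber = "\00511234567";

// Hypothetical helper: render each char as "code (UnicodeCategory)".
static string Describe(string s) =>
    string.Join(", ", s.Select(c => $"{(int)c} ({char.GetUnicodeCategory(c)})"));

Console.WriteLine(phoneNumber.Length);                     // 11
Console.WriteLine(Describe(phoneNumber.Substring(0, 2)));  // 0 (Control), 48 (DecimalDigitNumber)
```

Running the same helper on a value read from the problematic file should show whether the extra char is a control character or something else.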
Update:
Since the first char has char code 8207 (U+200F, RIGHT-TO-LEFT MARK), which is not a control character, the filter above will not remove it. You could use char.IsDigit instead, and if the phone numbers are allowed to contain a hyphen you could add that as a filter condition:
phoneNumber = new string(phoneNumber.Where(c => char.IsDigit(c) || c == '-').ToArray());
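Here is a sketch of that filter combined with an ordinal prefix check, again assuming .NET 6+ top-level statements. KeepDigitsAndHyphens is a hypothetical helper name, and the sample value is made up. Note that char.IsDigit accepts any Unicode decimal digit, so this variant restricts the filter to ASCII 0-9, which may or may not be what you want. As far as I know, StartsWith(string) without a StringComparison argument performs a culture-sensitive comparison that can ignore zero-width characters such as U+200F, which would explain the true in the question; StringComparison.Ordinal compares the raw chars instead.

```csharp
using System;
using System.Linq;

// Hypothetical helper: keep only ASCII digits 0-9 and hyphens.
static string KeepDigitsAndHyphens(string s) =>
    new string(s.Where(c => (c >= '0' && c <= '9') || c == '-').ToArray());

// Made-up sample: U+200F (char code 8207) in front of the digits.
var raw = "\u200F0511234567";
var cleaned = KeepDigitsAndHyphens(raw);

Console.WriteLine(raw.StartsWith("05", StringComparison.Ordinal));      // False
Console.WriteLine(cleaned.StartsWith("05", StringComparison.Ordinal));  // True
Console.WriteLine(cleaned.Length);                                      // 10
```

Cleaning the value once, right after reading it from the file, keeps later checks such as Substring(0,2) == "05" working unchanged.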
Answer 2
Score: 0
The first character in your string is a Chinese symbol, so it might not be displayed in the debug output if you don't have Chinese language support installed locally. If the value always starts with a single unreadable symbol, I would suggest just using phoneNumber.Substring(1) to extract the phone number.
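If some rows are clean and others carry the extra symbol, an unconditional Substring(1) would cut off a real digit. A slightly more defensive sketch, assuming .NET 6+ top-level statements and assuming valid numbers always begin with an ASCII digit; SkipLeadingNonDigits is a hypothetical helper name:

```csharp
using System;

// Hypothetical helper: drop leading chars until the first ASCII digit.
static string SkipLeadingNonDigits(string s)
{
    var i = 0;
    while (i < s.Length && (s[i] < '0' || s[i] > '9'))
    {
        i++;
    }
    return s.Substring(i);
}

Console.WriteLine(SkipLeadingNonDigits("\u200F0511234567")); // 0511234567
Console.WriteLine(SkipLeadingNonDigits("0511234567"));       // 0511234567
```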