How to take a specific set of characters out of an overall string and save them to an array or list?

Question


I have a string that contains Unicode characters, and I am trying to extract each of them from the overall string and save it to a list/array.

This is the overall string:

"test &#128311; test &#128153; test &#128313;"

I want the following list:

1. &#128311;
2. &#128153;
3. &#128313;

Right now I am trying the following:

string[] emojiSeparators = new string[] { "&#", ";" };
string[] resultEmojis;

resultEmojis = noHtmlEmoji.Split(
  emojiSeparators, StringSplitOptions.RemoveEmptyEntries);

But the word "test" is also being added to the list.

I only want the Unicode characters saved to my list, so that I can iterate over them and process them.
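
For reference, here is a minimal sketch of why the Split approach also returns "test". It assumes the input holds the emoji as HTML numeric entities (which the "&#" and ";" separators suggest) and that C# 9 top-level statements are available. The sample value and the variable name parts are assumptions for illustration only.

```csharp
using System;

// Assumed input: the sample string with each emoji written as an HTML numeric entity.
string noHtmlEmoji = "test &#128311; test &#128153; test &#128313;";

string[] emojiSeparators = { "&#", ";" };

// Split cuts the string at every separator and returns ALL pieces in between,
// so the text outside the entities ("test ") comes back alongside the digits.
string[] parts = noHtmlEmoji.Split(emojiSeparators, StringSplitOptions.RemoveEmptyEntries);

foreach (string part in parts)
{
    // Brackets make the leading and trailing spaces visible.
    Console.WriteLine($"[{part}]");
}
```

Split has no way of telling which pieces were inside a separator pair, so filtering by position or content would be needed afterwards. Matching the wanted pieces directly avoids that step.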

Answer 1

Score: 3


I suggest matching with the help of a regular expression:

using System.Linq;
using System.Text.RegularExpressions;

...

string[] resultEmojis = Regex
  .Matches(noHtmlEmoji, @"&#[1-9][0-9]{5}(?=;)")
  .Cast<Match>()
  .Select(match => match.Value)
  .ToArray();

Pattern &#[1-9][0-9]{5}(?=;) explained:

&#       - the literal characters &#
[1-9]    - one digit in the range 1..9
[0-9]{5} - five digits in the range 0..9
(?=;)    - lookahead: a ; must follow, but it is not included in the match
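
As a rough end-to-end sketch, the pattern can be run against the sample string like this. It assumes the input holds the emoji as HTML numeric entities and that C# 9 top-level statements are available. The second query is a hypothetical extension, not part of the answer: it captures the digits of any decimal entity and decodes them with char.ConvertFromUtf32, which throws for values that are not valid code points.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Assumed input: the sample string with each emoji written as an HTML numeric entity.
string noHtmlEmoji = "test &#128311; test &#128153; test &#128313;";

// Same pattern as in the answer: "&#" plus six digits, ";" required but not captured.
string[] resultEmojis = Regex
    .Matches(noHtmlEmoji, @"&#[1-9][0-9]{5}(?=;)")
    .Cast<Match>()
    .Select(match => match.Value)
    .ToArray();

// Hypothetical variant: accept any number of digits, capture them in group 1,
// and turn each code point into the actual character (a surrogate pair for emoji).
string[] decodedEmojis = Regex
    .Matches(noHtmlEmoji, @"&#([0-9]+);")
    .Cast<Match>()
    .Select(match => char.ConvertFromUtf32(int.Parse(match.Groups[1].Value)))
    .ToArray();

Console.WriteLine(string.Join(", ", resultEmojis)); // &#128311, &#128153, &#128313
Console.WriteLine(decodedEmojis.Length);            // 3
```

Note that the answer's pattern keeps the leading &# and drops the trailing ; in each match, and it only recognises entities with exactly six digits. The variant with a capture group is one way to relax that if shorter or longer code points can occur.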


huangapple
  • Posted on June 30, 2023 at 01:27:25