如何正确地通过删除成对的相同字母来缩短字符串?

huangapple go评论77阅读模式
英文:

how do I shorten strings by removing pairs of identical letters correctly?

问题

我正在尝试构建一个程序,它应该接收一个由小写字母组成的字符串,并搜索最靠近字符串开头和结尾的相同字母对,并将它们删除。

例如,对于字符串"helloworld",第一个和最后一个'l'将被删除,'o'也将被删除。最终输出为"helwrd"。

以下是我目前编写的代码:

string[] testdata = { "helloworld" };
//string[] testdata = { "aaa" };
//string[] testdata = { "abaccbaabbaabcc" };
//string[] testdata = { "pepperonipizza" };
//string[] testdata = { "thequickbrownfoxjumpedoverthelazydog" };
string data = "helloworld";
foreach (string test in testdata)
{
    for (int i = 0; i < data.Length - 1; i++)
    {
        for(int di = data.Length - 1; di != i; di--)
        {
            if (data[i] == data[di])
            {
                data = data.Remove(di, 1);
                data = data.Remove(i, 1);
                i = 0;
                break;
            }
        }
    }
    Console.WriteLine(data);
}

我面临的问题

对于其他短语,如"pepperonipizza",最终结果应该是"rona",但实际上却是"eerona"。

同样的问题也存在于"thequickbrownfoxjumpedoverthelazydog",最终结果应该是"qickbwnfxjmpvlazyg",但实际上却是"hqickbwnfxjmpvhlazyg"。

我在调试时查看了 Watch 窗口,发现最后一对相同的字母不会被删除。因此,在 Watch 调试器和终端中的最终结果将是"eerona"和"hqickbwnfxjmpvhlazyg"。

Argument                             : Expected           : Actual
----------------------------------------------------------------------
helloworld                           : helwrd             :
aaa                                  : a                  :
abaccbaabbaabcc                      : b                  :
pepperonipizza                       : rona               : eerona
thequickbrownfoxjumpedoverthelazydog : qickbwnfxjmpvlazyg : hqickbwnfxjmpvhlazyg

请问有什么我可以帮助你的吗?

英文:

I am trying to build a program where it should be given a string of lowercase letters and will search for pairs of identical letters closest to the start and end of the string and remove them.

So for example helloworld the first and last 'l' would be removed, and the 'o' would be removed. The final output would be "helwrd".

here is what I coded so far:

string[] testdata = { &quot;helloworld&quot; };
//string[] testdata = { &quot;aaa&quot; };
//string[] testdata = { &quot;abaccbaabbaabcc&quot; };
//string[] testdata = { &quot;pepperonipizza&quot; };
//string[] testdata = { &quot;thequickbrownfoxjumpedoverthelazydog&quot; };
string data = &quot;helloworld&quot;;
foreach (string test in testdata)
{
   
        for (int i = 0; i &lt; data.Length - 1; i++)
        {
            /*if (data[i] == data[i + 1])
            {
                data = data.Remove(i, 2);
                
            */
            for(int di = data.Length - 1; di != i; di--)
        {
            if (data[i] == data[di])
            {
                data = data.Remove(di, 1);
                data = data.Remove(i, 1);
                i = 0;
                
                break;
            }
        }
        }
Console.WriteLine(data);
}

The problem I am facing

For other phrases such as &quot;pepperonipizza&quot; the final result should be rona but it comes up with eerona.

The same problem is present with &quot;thequickbrownfoxjumpedoverthelazydog&quot; the final result should be qickbwnfxjmpvlazyg but it comes up with hqickbwnfxjmpvhlazyg.

I looked at the Watch window whilst debugging and the last identical letters won't delete. so the final result in the watch debugger and the terminal will be eerona and hqickbwnfxjmpvhlazyg.

Argument                             : Expected           : Actual
----------------------------------------------------------------------
helloworld                           : helwrd             :
aaa                                  : a                  :
abaccbaabbaabcc                      : b                  :
pepperonipizza                       : rona               : eerona
thequickbrownfoxjumpedoverthelazydog : qickbwnfxjmpvlazyg : hqickbwnfxjmpvhlazyg

答案1

得分: 0

请尝试这个代码:

    string data = "pepperonipizza";
    foreach (string test in testdata)
    {
        for (int i = 0; i < data.Length - 1; i++)
        {
            for (int j = data.Length - 1; j > i; j--)
            {
                if (data[i] == data[j])
                {
                    data = data.Remove(j, 1);
                    data = data.Remove(i, 1);
                    i--;
                    break;
                }
            }
        }
        Console.WriteLine(data);
        Console.In.Read();
    }
英文:

Try this:

    string data = &quot;pepperonipizza&quot;;
    foreach (string test in testdata)
    {
        for (int i = 0; i &lt; data.Length - 1; i++)
        {
            for (int j = data.Length - 1; j &gt; i; j--)
            {
                if (data[i] == data[j])
                {
                    data = data.Remove(j, 1);
                    data = data.Remove(i, 1);
                    i--;
                    break;
                }
            }
        }
        Console.WriteLine(data);
        Console.In.Read();
    }

答案2

得分: 0

如果您有一个text,您可能希望使用线性时间算法-O(n),而不是嵌套循环。这种方法并不复杂:

  1. 计算每个符号的频率(freq),并让counts成为每个符号的计数。
  2. 然后我们逐个扫描text中的字母。
  3. 如果letter的频率是偶数,就丢弃它。
  4. 如果letter的频率是奇数,只保留一个具有(freq[letter] + 1) / 2计数的字母。

代码(Fiddle):

using System.Linq;
using System.Collections.Generic;

...

private static string MySolution(string text) {
  var freqs = text
    .GroupBy(item => item)
    .ToDictionary(group => group.Key, group => group.Count());

  var counts = freqs
    .ToDictionary(pair => pair.Key, pair => pair.Value);

  var sb = new StringBuilder(26);

  foreach (char letter in text)
    if (freqs[letter] % 2 != 0 && freqs[letter] == (counts[letter] -= 1) * 2 + 1)
      sb.Append(letter);

  return sb.ToString();
}

时间复杂度:O(n)

空间复杂度:O(1)

让我们来看一下:

string[] tests = new string[] {
  "helloworld",
  "aaa",
  "abaccbaabbaabcc",
  "pepperonipizza",
  "thequickbrownfoxjumpedoverthelazydog",
};
	  
var result = string.Join(Environment.NewLine, tests
  .Select(test => $"{test,50} :: {MySolution(test)}"));  
	  
Console.WriteLine(result);  

输出:

                                        helloworld :: helwrd
                                               aaa :: a
                                   abaccbaabbaabcc :: b
                                    pepperonipizza :: rona
              thequickbrownfoxjumpedoverthelazydog :: qickbwnfxjmpvlazyg

压力测试:

Random random = new Random(1234);

// 由一百万个随机字符组成的字符串
string text = string.Concat(Enumerable
  .Range(0, 1_000_000)
  .Select(_ => (char)('a' + random.Next(26))));
 
string result = MySolution(text);

Console.WriteLine(result);

结果(在几分之一秒后):

tljszwafhn
英文:

If you have a long text you may want a linear time algorithm - O(n) without nested loops. The approach is not that complex:

  1. Compute frequency of each symbol (freq) and let counts be count of each symbol
  2. Then we scan text, letter after letter.
  3. If frequency of letter is even, drop it
  4. If frequency of letter is odd, keep just one of it, that has (freq[letter] + 1) / 2 count

Code (Fiddle):

using System.Linq;
using System.Collections.Generic;

...

private static string MySolution(string text) {
  var freqs = text
    .GroupBy(item =&gt; item)
    .ToDictionary(group =&gt; group.Key, group =&gt; group.Count());

  var counts = freqs
    .ToDictionary(pair =&gt; pair.Key, pair =&gt; pair.Value);

  var sb = new StringBuilder(26);

  foreach (char letter in text)
    if (freqs[letter] % 2 != 0 &amp;&amp; freqs[letter] == (counts[letter] -= 1) * 2 + 1)
      sb.Append(letter);

  return sb.ToString();
}

Time Complexity: O(n)

Space Complexity: O(1)

Let's have a look:

string[] tests = new string[] {
  &quot;helloworld&quot;,
  &quot;aaa&quot;,
  &quot;abaccbaabbaabcc&quot;,
  &quot;pepperonipizza&quot;,
  &quot;thequickbrownfoxjumpedoverthelazydog&quot;,
};
	  
var result = string.Join(Environment.NewLine, tests
  .Select(test =&gt; $&quot;{test,50} :: {MySolution(test)}&quot;));  
	  
Console.WriteLine(result);  

Output:

                                        helloworld :: helwrd
                                               aaa :: a
                                   abaccbaabbaabcc :: b
                                    pepperonipizza :: rona
              thequickbrownfoxjumpedoverthelazydog :: qickbwnfxjmpvlazyg

Stress test:

Random random = new Random(1234);

// String with million random characters
string text = string.Concat(Enumerable
  .Range(0, 1_000_000)
  .Select(_ =&gt; (char)(&#39;a&#39; + random.Next(26))));
 
string result = MySolution(text);

Console.WriteLine(result);

Outcome (after fraction of second):

tljszwafhn

答案3

得分: 0

一个简单的解决方案是从字符串的开头开始迭代,如果一个字符的数量是偶数,就将其删除。如果是奇数,我们找到中间索引并将其存储在一个列表中。最后,我们将列表连接起来。

string input = "pepperonipizza";
string[] chars = new string[input.Length];
for(int i=0;i< input.Length;i++)
{
   int count= input.Count(c => c == input[i]);
    if (count % 2 == 1)
    {
        var indexes =input.Select((b, ind) => b.Equals(input[i]) ? ind : -1).Where(ind => ind != -1).ToList();
        chars[indexes[indexes.Count / 2]] = input[i].ToString();
    }
}
Console.WriteLine(string.Join("", chars.Where(item => item != null)));

结果:

"helloworld" ==> "helwrd"

"pepperonipizza" ==> "rona"

英文:

A simple solution is to iterate from the beginning of the string, if the number of a character is even, it will be removed. If it is odd, we find the middle index and store it in a list. Finally, we join the list:

string input = &quot;pepperonipizza&quot;;
string[] chars = new string[input.Length];
for(int i=0;i&lt; input.Length;i++)
{
   int count= input.Count(c =&gt; c == input[i]);
    if (count % 2 == 1)
    {
        var indexes =input.Select((b, ind) =&gt; b.Equals(input[i]) ? ind : -1).Where(ind =&gt; ind != -1).ToList();
        chars[indexes[indexes.Count / 2]] = input[i].ToString();
    }
}
Console.WriteLine(string.Join(&quot;&quot;, chars.Where(item =&gt; item != null)));

result:

> "helloworld" ==> "helwrd"
>
> "pepperonipizza" ==> "rona"

huangapple
  • 本文由 发表于 2023年8月8日 23:44:41
  • 转载请务必保留本文链接:https://go.coder-hub.com/76861180.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定