How to get the total number of elements in each tag?

Question

I have a string made up of lines containing some tags, and I want to count the total number of elements inside each tag.

<!-- language: lang-html -->

    <L1 //2 element
     <L1 //1 element
      <H1 content> 
     > 
     <L1 //3 element
      <H2 content> 
      <P content>
      <L1 //1 element
       <H3 content>
      >
     >
    >

<!-- end snippet -->

Here is my C# code snippet:

    var str = "<L1\r\n <L1\r\n  <H1 content> \r\n > \r\n <L1\r\n  <H2 content> \r\n  <P content>\r\n  <L1\r\n   <H3 content>\r\n  >\r\n >\r\n>";
    var list = str.Split(new string[] { "\r\n" }, StringSplitOptions.None);
    var array_num = new List<string>();
    int startpos = 0, endpos = 0, total = 0, newstartpos = 0;
    bool newtag = false;
    for (int i = 0; i < list.Length; i++)
    {
        if (list[i].Trim() == "<L1")
        {
            startpos = i;
            for (int Lindex = i + 1; Lindex < list.Length ; Lindex++)
            {
                var item = list[Lindex].Trim().ToString();
                if (list[Lindex].Trim().StartsWith("<L1") && list[Lindex].Trim().EndsWith(">"))
                {
                    total += 1;
                }
                if (list[Lindex].Trim() == "<L1")
                {
                    total += 2;
                    newstartpos = Lindex;
                    newtag = true;
                }
                if (list[Lindex].Trim() == ">" && newstartpos != 0)
                {
                    total -= 1;
                    endpos = Lindex;
                    newtag = false;
                }
                if (list[Lindex].Trim().StartsWith("<") && list[Lindex].Trim().EndsWith(">") && !newtag)
                {
                    total += 1;
                }
                if (list[Lindex].Trim() == ">" && newstartpos == 0)
                {
                    endpos = Lindex;
                    break;
                }
            }
            array_num.Add("start: " + startpos + " end: " + endpos + " count: " + total);
            startpos = 0;
            endpos = 0;
            total = 0;
            newstartpos = 0;
            newtag = false;
        }
    }

But when I run it, the contents of `array_num` are not what I expect. The actual output, annotated with what it should be, is:

<!-- language: lang-html -->

    start:0 end: 11 count: 2 //correct
    start:1 end: 3 count: 1 //correct
    start:4 end: 11 count: 1 //incorrect, should be 4 10 3
    start:7 end: 9 count: 1 //correct

<!-- end snippet -->

I'm also not sure my code works reliably on other inputs. If you have any ideas, or see something that should be adjusted, please let me know.



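# Answer 1
**Score**: 1

A stack is the best structure to use here, because it mirrors the nested structure of the content being parsed.

Here is one solution. It uses a stack and a `TagCounter` class. `TagCounter` tracks how many children a tag has, whether it is an L1 tag, and its index in the string, so that the results can be put in the correct order at the end: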

    internal class TagCounter
    {
        public TagCounter(bool isL1Tag, int index) 
        {
            ChildCount = 0;
            IsL1Tag = isL1Tag;
            Index = index;
        }

        public int ChildCount { get; set; }
        public bool IsL1Tag { get; private set; }
        public int Index { get; set; }
    }

Snippet to compute it:

    // Needed for Stack<T>, List<T>, Any(), OrderBy() and Select().
    using System.Collections.Generic;
    using System.Linq;

    var str = "<L1\r\n <L1\r\n  <H1 content> \r\n > \r\n <L1\r\n  <H2 content> \r\n  <P content>\r\n  <L1\r\n   <H3 content>\r\n  >\r\n >\r\n>";
    
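    var openTags = new Stack<TagCounter>();    // enclosing tags that are still open
    var parsedLTags = new List<TagCounter>();  // L1 tags that have been closed

    // Line breaks carry no meaning for this scan, so drop them.
    var shortenedString = str.Replace("\r\n", "");
    TagCounter? currentTag = null;             // innermost open tag, if any

    var stringLength = shortenedString.Length;

    for (var i = 0; i < stringLength; i++)
    {
        var nextChar = shortenedString[i];
        if (nextChar == '<')
        {
            // Every opening '<' is one more child of the enclosing tag.
            if (currentTag != null)
            {
                currentTag.ChildCount++;
            }

            var isL1Tag = shortenedString.Substring(i + 1, 2).Equals("L1");

            // Remember the enclosing tag so it can be restored on '>'.
            if (currentTag != null)
            {
                openTags.Push(currentTag);
            }
            currentTag = new TagCounter(isL1Tag, i);
        }
        else if (nextChar == '>')
        {
            // The null check guards against a stray '>' with no open tag.
            if (currentTag != null && currentTag.IsL1Tag)
            {
                parsedLTags.Add(currentTag);
            }

            // Step back out to the enclosing tag, if there is one.
            if (openTags.Any())
            {
                currentTag = openTags.Pop();
            }
            else
            {
                currentTag = null;
            }
        }
    }

    // Tags are recorded in closing order; sort by position to get document order.
    var result = parsedLTags.OrderBy(x => x.Index).Select(x => x.ChildCount).ToList();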
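
The question also asks for the start and end line of each block, not only the child count. The same stack idea can be applied line by line to get all three values. The following is only a sketch under some assumptions: each block opens with a line that trims to exactly `<L1`, closes with a line that trims to exactly `>`, and every other child sits on a single line of the form `<... >`. The names `TagRangeCounter` and `CountBlocks` are hypothetical and do not come from the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TagRangeCounter
{
    // Hypothetical helper: returns one (Start, End, Count) tuple per "<L1"
    // block, ordered by the line on which the block starts.
    public static List<(int Start, int End, int Count)> CountBlocks(string text)
    {
        var lines = text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);
        var open = new Stack<(int Start, int Count)>();          // blocks still open
        var done = new List<(int Start, int End, int Count)>();  // closed blocks

        for (var i = 0; i < lines.Length; i++)
        {
            var line = lines[i].Trim();
            if (line == "<L1")
            {
                // A nested block counts as one child of its parent.
                if (open.Count > 0)
                {
                    var parent = open.Pop();
                    open.Push((parent.Start, parent.Count + 1));
                }
                open.Push((i, 0));
            }
            else if (line == ">")
            {
                // An unmatched close is simply ignored in this sketch.
                if (open.Count == 0) continue;
                var block = open.Pop();
                done.Add((block.Start, i, block.Count));
            }
            else if (line.StartsWith("<") && line.EndsWith(">") && open.Count > 0)
            {
                // A single-line element is one child of the innermost open block.
                var parent = open.Pop();
                open.Push((parent.Start, parent.Count + 1));
            }
        }

        // Blocks close innermost-first; sort to report them in document order.
        return done.OrderBy(b => b.Start).ToList();
    }
}
```

Called on the `str` value from the question, `TagRangeCounter.CountBlocks(str)` yields `(0, 11, 2)`, `(1, 3, 1)`, `(4, 10, 3)` and `(7, 9, 1)`, which matches the values the question lists as expected, including the `4 10 3` row. Because each open block keeps its own counter on the stack, closing an inner block cannot disturb the count of the block around it.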

huangapple
  • Published on July 6, 2023, 15:28:00
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76626441.html