ABAP: String Operations: How can I replace a closing tag with its corresponding opening tag?

Question

I have a method with one changing parameter. The string passed in contains text and formatting tags. The method should recognize each closing tag (always written as </>) and replace it according to the corresponding opening tag, which can be <B>, <I> or <U>, so that it becomes </B>, </I> or </U> respectively.

For example:

  • the input string:
    Lorem <I>ipsum dolor sit amet</>, consetetur <B>sadipscing elitr</>, sed <U>diam</> nonumy eirmod tempor <B><I><U>invidunt</></></> ut labore et <B>dolore</> magna aliquyam erat, sed diam voluptua.
  • should become:
    Lorem <I>ipsum dolor sit amet</I>, consetetur <B>sadipscing elitr</B>, sed <U>diam</U> nonumy eirmod tempor <B><I><U>invidunt</U></I></B> ut labore et <B>dolore</B> magna aliquyam erat, sed diam voluptua.

I tried several approaches, including:

  • finding the last occurrence of each opening tag, picking the one with the highest offset, and then truncating the string at that offset,
  • reversing the string and looking for the closest opening tag.

Answer 1

Score: 2

I don't know whether this is the simplest or most efficient solution, but I have run some tests and it also handles nested tags properly.

The idea is: we run through the string and keep track of the opening tags in an internal table, used as a stack. An opening tag has the form <x>, where x can be any single character except '/'. Once we find a closing tag (</>), it is replaced in place with </x>, where x is the most recent opening tag taken from the internal table; that entry is then removed so that nested tags are closed from the inside out.

The method definition:

CLASS-METHODS closing_tag IMPORTING iv_html        TYPE string
                          RETURNING VALUE(rv_html) TYPE string.

And the implementation:

METHOD closing_tag.

  " Stack of the currently open tag names (one character each)
  DATA tags TYPE STANDARD TABLE OF char01.
  DATA position TYPE i VALUE 0.
  DATA position_temp TYPE i.

  rv_html = iv_html.
  " A tag is 3 characters long, so stop once fewer than 3 characters remain
  " (this also keeps the offset access below inside the string)
  WHILE position < strlen( rv_html ) - 2.
    IF rv_html+position(1) = '<'.
      position_temp = position + 2.
      " Check whether it is an HTML tag like <x>
      IF rv_html+position_temp(1) = '>'.
        SUBTRACT 1 FROM position_temp.
        IF rv_html+position_temp(1) = '/'.
          " Closing tag => retrieve the last opening tag
          IF tags IS INITIAL.
            " There was no opening tag: leave this </> unchanged
            ADD 3 TO position.
          ELSE.
            DATA(tag) = tags[ lines( tags ) ].
            DATA(closing) = |</| && tag && |>|.
            " Replace exactly the </> at the current position, not an
            " earlier unmatched </> that was left unchanged
            REPLACE SECTION OFFSET position LENGTH 3 OF rv_html WITH closing.
            DELETE tags INDEX lines( tags ).
            ADD 4 TO position.
          ENDIF.
        ELSE.
          " Opening tag => store it
          APPEND rv_html+position_temp(1) TO tags.
          ADD 3 TO position.
        ENDIF.
      ELSE.
        " There was a '<' but no '>' two characters later, so just move on
        ADD 1 TO position.
      ENDIF.
    ELSE.
      ADD 1 TO position.
    ENDIF.
  ENDWHILE.

ENDMETHOD.
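
One way to check the method against the question's examples is a small ABAP Unit test. The sketch below assumes the method lives in a global class called zcl_html_tools; that name, the test class name ltc_closing_tag and the test method names are made up for illustration and are not part of the answer.

```abap
" Local test class, e.g. in the test include of the (hypothetical) class zcl_html_tools
CLASS ltc_closing_tag DEFINITION FINAL FOR TESTING
  DURATION SHORT
  RISK LEVEL HARMLESS.

  PRIVATE SECTION.
    METHODS flat_tags        FOR TESTING.
    METHODS nested_tags      FOR TESTING.
    METHODS unmatched_closer FOR TESTING.
ENDCLASS.

CLASS ltc_closing_tag IMPLEMENTATION.

  METHOD flat_tags.
    " Each </> takes the name of the tag opened just before it
    cl_abap_unit_assert=>assert_equals(
      act = zcl_html_tools=>closing_tag( `Lorem <I>ipsum</>, <B>dolor</> sit` )
      exp = `Lorem <I>ipsum</I>, <B>dolor</B> sit` ).
  ENDMETHOD.

  METHOD nested_tags.
    " Nested tags must be closed from the inside out
    cl_abap_unit_assert=>assert_equals(
      act = zcl_html_tools=>closing_tag( `<B><I><U>invidunt</></></>` )
      exp = `<B><I><U>invidunt</U></I></B>` ).
  ENDMETHOD.

  METHOD unmatched_closer.
    " A </> without any open tag is left as it is
    cl_abap_unit_assert=>assert_equals(
      act = zcl_html_tools=>closing_tag( `x</>y` )
      exp = `x</>y` ).
  ENDMETHOD.

ENDCLASS.
```

The nested case is the interesting one, because it only passes if the most recently opened tag is closed first.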

To keep in mind:
My assumption is that every tag has the format <x>, with only one character inside the angle brackets. If that is not true (I don't know much about HTML tags), then the logic has to be modified accordingly.

Answer 2

Score: 0

Not fancy, but it does the job. The regex may need tweaking later if tag names longer than one letter occur.

DATA lv_input TYPE string. "your input string
DATA lv_result TYPE string.
DATA lt_match TYPE match_result_tab.
DATA lw_match TYPE match_result.
DATA lv_offset TYPE i.
DATA lt_tag_name TYPE STANDARD TABLE OF string.
DATA lv_tag_name TYPE string.

"extract all tag names
FIND ALL OCCURRENCES OF REGEX '<([A-Za-z])>' IN lv_input RESULTS lt_match.
lv_result = lv_input.
"replace unnamed closing tags with placeholder
REPLACE ALL OCCURRENCES OF REGEX '<([//])>' IN lv_result WITH '&'.

LOOP AT lt_match INTO lw_match.
  lv_offset = lw_match-submatches[ 1 ]-offset.
  "assuming tag names are one character long
  lv_tag_name = lv_input+lv_offset(1).
  APPEND lv_tag_name TO lt_tag_name.
ENDLOOP.

LOOP AT lt_tag_name INTO lv_tag_name.
  REPLACE FIRST OCCURRENCE OF '&' IN lv_result WITH |</{ lv_tag_name }>|.
ENDLOOP.
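
Note that the last loop above hands out tag names in the order in which the opening tags appear, so the nested part of the question's example, <B><I><U>invidunt</></></>, would be closed as </B></I></U> rather than </U></I></B>. The '&' placeholder would also clash with any '&' already present in the text. A possible variant, shown here only as a sketch, keeps the regex search but pairs the tags with a stack and also accepts tag names longer than one letter. The report name, class name, method name and the tag-name pattern (a letter followed by letters or digits) are assumptions, not part of the answer.

```abap
REPORT z_close_tags_sketch.

CLASS lcl_tag_closer DEFINITION FINAL.
  PUBLIC SECTION.
    CLASS-METHODS close_tags IMPORTING iv_html        TYPE string
                             RETURNING VALUE(rv_html) TYPE string.
ENDCLASS.

CLASS lcl_tag_closer IMPLEMENTATION.
  METHOD close_tags.
    TYPES: BEGIN OF ty_fix,
             offset TYPE i,      " offset of a </> in the original string
             name   TYPE string, " tag name it should receive
           END OF ty_fix.
    DATA open_tags TYPE STANDARD TABLE OF string WITH EMPTY KEY.
    DATA fixes     TYPE STANDARD TABLE OF ty_fix WITH EMPTY KEY.
    DATA name_off  TYPE i.
    DATA name_len  TYPE i.

    rv_html = iv_html.

    " Opening tags such as <B> or <EM>, and anonymous closing tags </>, in document order
    FIND ALL OCCURRENCES OF REGEX '<[A-Za-z][A-Za-z0-9]*>|</>'
         IN iv_html RESULTS DATA(matches).

    LOOP AT matches INTO DATA(match).
      " Everything between '<' and '>' is either a tag name or a single '/'
      name_off = match-offset + 1.
      name_len = match-length - 2.
      DATA(name) = iv_html+name_off(name_len).
      IF name = '/'.
        " Closing tag: pair it with the most recently opened tag, if there is one
        DATA(depth) = lines( open_tags ).
        IF depth > 0.
          APPEND VALUE #( offset = match-offset
                          name   = open_tags[ depth ] ) TO fixes.
          DELETE open_tags INDEX depth.
        ENDIF.
      ELSE.
        " Opening tag: push its name onto the stack
        APPEND name TO open_tags.
      ENDIF.
    ENDLOOP.

    " Apply the replacements from back to front, so that the offsets
    " taken from the original string remain valid
    DATA(idx) = lines( fixes ).
    WHILE idx > 0.
      DATA(fix) = fixes[ idx ].
      DATA(closing) = |</{ fix-name }>|.
      REPLACE SECTION OFFSET fix-offset LENGTH 3 OF rv_html WITH closing.
      idx = idx - 1.
    ENDWHILE.
  ENDMETHOD.
ENDCLASS.

START-OF-SELECTION.
  DATA(result) = lcl_tag_closer=>close_tags(
    `<B><I><U>invidunt</></></> ut <EM>labore</>` ).
  " Intended result: <B><I><U>invidunt</U></I></B> ut <EM>labore</EM>
  WRITE / result.
```

Working backwards through the collected offsets avoids a placeholder character altogether, and a </> that has no open tag is simply left untouched.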

huangapple
  • Published on June 12, 2023 at 18:05:25
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76455565.html