ABAP: String Operations: How can I replace a closing tag with its corresponding opening tag?
Question
To make things clearer and easier, I have a method with one changing parameter. The string passed in contains text and text formatting options. The method should recognize the closing tags (always </>) and change them based on the corresponding opening tags, which can be <B>, <I>, or <U>, so that they become </B>, </I>, and </U> respectively.
For example:
- the input string:
Lorem <I>ipsum dolor sit amet</>, consetetur <B>sadipscing elitr</>, sed <U>diam</> nonumy eirmod tempor <B><I><U>invidunt</></></> ut labore et <B>dolore</> magna aliquyam erat, sed diam voluptua.
- should become:
Lorem <I>ipsum dolor sit amet</I>, consetetur <B>sadipscing elitr</B>, sed <U>diam</U> nonumy eirmod tempor <B><I><U>invidunt</U></I></B> ut labore et <B>dolore</B> magna aliquyam erat, sed diam voluptua.
I tried many manipulations, including:
- finding the last occurrence of an opening tag, determining which tag to use based on the highest offset, and then truncating the string at that offset,
- reversing the string and finding the closest opening tag.
Answer 1
Score: 2
I don't know if it is the simplest or most efficient solution, but I have run some tests and it also handles nested tags properly.
The idea is: we run through the string and keep track of the opening tags in an internal table. An opening tag looks like <x>, where x can be any character except '/'. Once we find a closing tag (</>), it is replaced in place with </x>, where x is the most recent opening tag taken from the internal table.
The method definition:
CLASS-METHODS closing_tag
  IMPORTING iv_html        TYPE string
  RETURNING VALUE(rv_html) TYPE string.
And the implementation:
METHOD closing_tag.
  DATA tags TYPE STANDARD TABLE OF char01.
  DATA position TYPE i VALUE 0.
  DATA position_temp TYPE i.

  rv_html = iv_html.

  " Stop two characters before the end so that position + 2 stays inside the string
  WHILE position < strlen( rv_html ) - 2.
    IF rv_html+position(1) = '<'.
      position_temp = position + 2.
      " Check whether it is a tag of the form <x>
      IF rv_html+position_temp(1) = '>'.
        SUBTRACT 1 FROM position_temp.
        IF rv_html+position_temp(1) = '/'.
          " Closing tag => retrieve the last opening tag
          IF tags IS INITIAL.
            " There was no opening tag, so leave this </> unchanged
            ADD 3 TO position.
          ELSE.
            DATA(tag) = tags[ lines( tags ) ].
            DATA(closing) = |</{ tag }>|.
            " Replace exactly the </> at the current position, not an earlier unmatched one
            REPLACE SECTION OFFSET position LENGTH 3 OF rv_html WITH closing.
            DELETE tags INDEX lines( tags ).
            ADD 4 TO position.
          ENDIF.
        ELSE.
          " Opening tag => store it
          APPEND rv_html+position_temp(1) TO tags.
          ADD 3 TO position.
        ENDIF.
      ELSE.
        " There was a '<' but no '>' two characters later, so just move on
        ADD 1 TO position.
      ENDIF.
    ELSE.
      ADD 1 TO position.
    ENDIF.
  ENDWHILE.
ENDMETHOD.
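A call could look like the following minimal sketch. The class name zcl_tag_util is hypothetical, since the answer does not say which class hosts closing_tag; the result in the comment is what the loop above produces for this short nested input.

```abap
" zcl_tag_util is a hypothetical class name; use whichever class declares closing_tag.
DATA(lv_html) = zcl_tag_util=>closing_tag( iv_html = `<B><I>x</></>` ).
" lv_html now holds: <B><I>x</I></B>
```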
To keep in mind:
My assumption is that every tag has the format <x>, with only one character inside the brackets. If that is not true (I don't know much about HTML tags), then the logic has to be modified accordingly.
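If tag names can be longer than one character, one possible modification is sketched below. It is only an illustration, not the answer's tested code: the local class lcl_tag_demo and the method close_tags are hypothetical names, and the sketch assumes a tag name starts with a letter followed by letters or digits. It keeps the same stack idea but finds the tags with a regular expression and builds the result in a single pass.

```abap
CLASS lcl_tag_demo DEFINITION.
  PUBLIC SECTION.
    CLASS-METHODS close_tags
      IMPORTING iv_text        TYPE string
      RETURNING VALUE(rv_text) TYPE string.
ENDCLASS.

CLASS lcl_tag_demo IMPLEMENTATION.
  METHOD close_tags.
    DATA lt_open   TYPE STANDARD TABLE OF string WITH EMPTY KEY. " stack of open tag names
    DATA lt_match  TYPE match_result_tab.
    DATA lv_copied TYPE i. " offset up to which iv_text has been copied

    " Find every opening tag <name> and every anonymous closing tag </>
    FIND ALL OCCURRENCES OF REGEX '<[A-Za-z][A-Za-z0-9]*>|</>'
         IN iv_text RESULTS lt_match.

    LOOP AT lt_match INTO DATA(ls_match).
      " Copy the plain text between the previous tag and this one
      IF ls_match-offset > lv_copied.
        rv_text = rv_text && substring( val = iv_text
                                        off = lv_copied
                                        len = ls_match-offset - lv_copied ).
      ENDIF.

      DATA(lv_token) = substring( val = iv_text
                                  off = ls_match-offset
                                  len = ls_match-length ).
      IF lv_token <> `</>`.
        " Opening tag: push the name between '<' and '>' onto the stack
        APPEND substring( val = lv_token off = 1 len = ls_match-length - 2 ) TO lt_open.
      ELSEIF lt_open IS NOT INITIAL.
        " Closing tag: pop the most recent name and emit a named closing tag
        DATA(lv_last) = lines( lt_open ).
        lv_token = |</{ lt_open[ lv_last ] }>|.
        DELETE lt_open INDEX lv_last.
      ENDIF.
      " A </> without a matching opening tag is copied unchanged
      rv_text = rv_text && lv_token.
      lv_copied = ls_match-offset + ls_match-length.
    ENDLOOP.

    " Copy whatever follows the last tag
    IF lv_copied < strlen( iv_text ).
      rv_text = rv_text && substring( val = iv_text off = lv_copied ).
    ENDIF.
  ENDMETHOD.
ENDCLASS.
```

Building a new string avoids having to adjust offsets after each in-place replacement, at the cost of one extra copy of the text.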
Answer 2
Score: 0
Not fancy, but it does the job. The regex might need tweaking in the future if tag names longer than one letter occur.
DATA lv_input    TYPE string. " your input string
DATA lv_result   TYPE string.
DATA lt_match    TYPE match_result_tab.
DATA lw_match    TYPE match_result.
DATA lv_offset   TYPE i.
DATA lt_tag_name TYPE STANDARD TABLE OF string.
DATA lv_tag_name TYPE string.

" Extract all tag names
FIND ALL OCCURRENCES OF REGEX '<([A-Za-z])>' IN lv_input RESULTS lt_match.

lv_result = lv_input.
" Replace unnamed closing tags with a placeholder
" (assumes the text itself contains no '&')
REPLACE ALL OCCURRENCES OF REGEX '<([//])>' IN lv_result WITH '&'.

LOOP AT lt_match INTO lw_match.
  lv_offset = lw_match-submatches[ 1 ]-offset.
  " Assuming tag names are one character long
  lv_tag_name = lv_input+lv_offset(1).
  APPEND lv_tag_name TO lt_tag_name.
ENDLOOP.

" Note: names are applied in opening order, so nested tags such as
" <B><I>..</></> are closed in the wrong order
LOOP AT lt_tag_name INTO lv_tag_name.
  REPLACE FIRST OCCURRENCE OF '&' IN lv_result WITH |</{ lv_tag_name }>|.
ENDLOOP.