Java regex to remove styles from HTML tags for Jasper text field
Question
As stated in the title, I am looking for the safest Java regex to remove style attributes from HTML tags in text intended for a Jasper text field marked as HTML, without touching the content or breaking the tag structure. For example, for the following input received from the front end:
<p>This text contains <sub style="background-color:powderblue;">subscript</sub> text.</p>
Sorry for the unescaped quotes. I found that this code works fine:
String output = input.replaceAll("style=\"[^>]*\"","");
The output should then be:
<p>This text contains <sub>subscript</sub> text.</p>
Answer 1
Score: 1
First off, a regex on its own doesn't remove anything. A regex only describes a pattern that text either matches or doesn't; the removal is done by the method that applies it.
But apart from that, this code using replaceAll should do the trick:
String output = input.replaceAll(
"(<[^>]+?)\\s+style\\s*=\\s*['\"][^'\"]*['\"](.*?>)", "$1$2");
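To make the snippet above easier to try out, here is a minimal, self-contained sketch that wraps the same pattern in a small helper. The class name StyleStripper and the method name stripStyles are made up for illustration; only the regex itself comes from the answer. It assumes each tag carries at most one style attribute and that the attribute value does not contain the other kind of quote (for example style="font-family:'Arial'"), which this pattern does not handle correctly.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; the names are not from the original answer.
final class StyleStripper {

    // Same pattern as in the answer, compiled once so it can be reused.
    // Group 1: start of the tag up to (not including) the style attribute.
    // Group 2: the rest of the tag, including the closing '>'.
    private static final Pattern STYLE_ATTR = Pattern.compile(
            "(<[^>]+?)\\s+style\\s*=\\s*['\"][^'\"]*['\"](.*?>)");

    private StyleStripper() {
    }

    // Drops the style attribute from every tag that has one,
    // keeping the tag name, the other attributes and the text content.
    static String stripStyles(String html) {
        return STYLE_ATTR.matcher(html).replaceAll("$1$2");
    }

    public static void main(String[] args) {
        String input = "<p>This text contains "
                + "<sub style=\"background-color:powderblue;\">subscript</sub> text.</p>";
        System.out.println(stripStyles(input));
        // prints: <p>This text contains <sub>subscript</sub> text.</p>
    }
}
```

Because the leading \\s+ is part of the match, the whitespace in front of the attribute is removed together with it, so no stray space is left inside the tag. For markup that is not under your control, an HTML parser is usually a safer choice than a regex.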