What escapes " and & in setAttribute/getAttribute?

Question

If I use setAttribute to set a string containing a double quote (or an ambiguous ampersand) and then call getAttribute, nothing breaks and I get the same string back – some automatic mechanism quietly takes care of escaping the attribute value.

<!-- begin snippet: js hide: false console: true babel: false -->

<!-- language: lang-js -->

const e = document.createElement('i');
e.setAttribute('a', 'the "a" & b');

console.log('get attribute:', e.getAttribute('a'));
console.log('actual markup:', e.outerHTML.match(/a="(.+)"/)[1]);

<!-- end snippet -->

Is the browser doing this preemptively on its own?

Reading the DOM spec, I can’t infer from the given steps which mechanism automatically ensures proper escaping inside the attribute value (if the given value contains otherwise invalid characters).

It’s faintly disconcerting not to know who or what the thoughtful benefactor is – this ought to be specified somewhere.

Answer 1

Score: 1

The value of outerHTML is actually an HTML serialization of the element:

> Reading the value of outerHTML returns a string containing an HTML serialization of the element and its descendants.

https://developer.mozilla.org/en-US/docs/Web/API/Element/outerHTML

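So the escaping happens when the element is serialized, not when the attribute is set: double quotes inside the double-quoted attribute value are HTML-escaped, and so is the ampersand. That is entirely proper, since otherwise you would get broken HTML.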

The properly escaped HTML can then be parsed again, and getAttribute returns the original string:

<!-- begin snippet: js hide: false console: true babel: false -->

<!-- language: lang-js -->

const e = document.createElement('i');
e.setAttribute('a', 'the "a" & b');

const div = document.createElement('div');

div.innerHTML = e.outerHTML;
const attr = div.querySelector('i').getAttribute('a');
console.log(attr);

<!-- end snippet -->
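
To make the serialization step more concrete, here is a minimal sketch that runs without a DOM. The helpers `escapeAttributeValue` and `decodeAttributeValue` are hypothetical names, not browser APIs. They only approximate what the serializer and parser do for attribute values: a real serializer may escape additional characters, and a real parser understands many more character references.

```javascript
// Hypothetical helper, not a browser API: approximates the escaping an HTML
// serializer applies to an attribute value before wrapping it in double quotes.
function escapeAttributeValue(value) {
  return value
    .replace(/&/g, '&amp;')        // ampersand first, so the entities added below are not re-escaped
    .replace(/\u00A0/g, '&nbsp;')  // non-breaking space
    .replace(/"/g, '&quot;');      // the quote character that delimits the attribute
}

// Hypothetical helper: decodes only the three references produced above.
// A single pass avoids decoding the same text twice (e.g. "&amp;quot;").
function decodeAttributeValue(markup) {
  const table = { amp: '&', nbsp: '\u00A0', quot: '"' };
  return markup.replace(/&(amp|nbsp|quot);/g, (_, name) => table[name]);
}

const raw = 'the "a" & b';
const serialized = escapeAttributeValue(raw);

console.log(serialized);                               // the &quot;a&quot; &amp; b
console.log(decodeAttributeValue(serialized) === raw); // true
```

The point of the sketch is that the stored attribute value is never escaped; only its textual serialization is, and parsing that text reverses the escaping.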

Also, as @Pointy correctly noted in the comments:

> the "outer HTML" text is created from the DOM, without regard to the original HTML that created the (now modified) DOM.

huangapple
  • Posted on June 22, 2023, 18:27:40
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76530945.html