What escapes " and & in setAttribute/getAttribute?
Question
If I use `setAttribute` to set a string containing a double quote (or an ambiguous ampersand) and then call `getAttribute`, nothing breaks and I get the same string back: some automatic mechanism takes care of escaping the attribute value.
<!-- begin snippet: js hide: false console: true babel: false -->
<!-- language: lang-js -->
const e = document.createElement('i');
e.setAttribute('a', 'the "a" & b');
console.log( 'get attribute:', e.getAttribute('a') );
console.log( 'actual markup:', e.outerHTML.match(/a="(.+)"/)[1] );
<!-- end snippet -->
Is the browser doing this preemptively on its own?
Reading the DOM spec, I can’t infer from the given steps which mechanism automatically ensures proper escaping of the attribute value when the given value contains otherwise invalid characters.
It is faintly disconcerting not to know who or what the thoughtful benefactor is; this ought to be specified somewhere.
Answer 1
Score: 1
`outerHTML` is actually an HTML serialization of the element:
> Reading the value of outerHTML returns a string containing an HTML serialization of the element and its descendants.
https://developer.mozilla.org/en-US/docs/Web/API/Element/outerHTML
So in this case the double quotes inside the double-quoted attribute value are HTML-escaped in the serialized string. That is entirely appropriate, since otherwise the resulting HTML would be broken.
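To make the distinction concrete: the attribute value itself is kept as the raw string, and the escaping only shows up when the element is serialized to markup. Below is a minimal sketch of what attribute escaping during serialization might look like. `escapeAttributeValue` and `serializeAttribute` are hypothetical helpers written for illustration, not browser APIs, and the set of replaced characters is a simplified assumption; the HTML Standard defines the exact rules, which may cover more characters than shown here.

```javascript
// Simplified sketch of escaping an attribute value for HTML serialization.
// Hypothetical helper, not a browser API; the real rules live in the HTML Standard.
function escapeAttributeValue(value) {
  return String(value)
    .replace(/&/g, '&amp;')        // ampersands first, so entities added below are not re-escaped
    .replace(/\u00A0/g, '&nbsp;')  // no-break space
    .replace(/"/g, '&quot;');      // a raw double quote would end the attribute value early
}

// Hypothetical single-attribute serializer, for illustration only.
function serializeAttribute(name, value) {
  return `${name}="${escapeAttributeValue(value)}"`;
}

console.log(serializeAttribute('a', 'the "a" & b'));
// a="the &quot;a&quot; &amp; b"
```

The point of the sketch is that `setAttribute` and `getAttribute` never need this step, because they work on the stored string rather than on markup.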
The properly escaped HTML can then be parsed again, and the original attribute value comes back unchanged:
<!-- begin snippet: js hide: false console: true babel: false -->
<!-- language: lang-js -->
const e = document.createElement('i');
e.setAttribute('a', 'the "a" & b');
const div = document.createElement('div');
div.innerHTML = e.outerHTML;
const attr = div.querySelector('i').getAttribute('a');
console.log(attr);
<!-- end snippet -->
Also, as @Pointy correctly noted in the comments:
> the "outer HTML" text is created from the DOM, without regard to the original HTML that created the (now modified) DOM.