How to add optional chaining to strings with bracket access in JS?

Question

I created this regex to add optional chaining (?.) to a string: /(?<=\w+\.)(\w+)\.(?!\?)/g.

This regex works fine for property access with dot notation. For example, foo.bar.baz is transformed to foo?.bar?.baz.

But I also want to support properties accessed with brackets, such as foo.bar[0].baz.

So I changed the regex to accept either . or [: /(?<=\w+\.)(\w+)\.|\[(?!\?)/g, but the replacement removes the [ character.

Is it possible to change the regex so that it also matches and replaces properties accessed with brackets?

const inputs = [
  'foo.bar',        // foo?.bar
  'foo.bar.baz',    // foo?.bar?.baz
  'foo.bar.baz.go', // foo?.bar?.baz?.go
  'foo[0].bar',     // foo?.[0]?.bar
  'foo.bar[0]',     // foo?.bar?.[0]
  'foo.bar[0].bla', // foo?.bar?.[0]?.bla
  'foo',            // foo
  'foo.bar[text].some.bla[0]', // foo?.bar?.[text]?.some?.bla?.[0]
  "foo?.you['text'][text].some[0].bla", // foo?.you?.['text']?.[text]?.some?.[0]?.bla
];

const regex = /(?<=\w+\.)(\w+)\.(?!\?)/g;
const outputs = inputs.map((input) => input.replace(regex, '$1?.'));
console.log(outputs);

Answer 1

Score: 1

You can use:

const inputs = [
  'foo.bar',        // foo?.bar
  'foo.bar.baz',    // foo?.bar?.baz
  'foo.bar.baz.go', // foo?.bar?.baz?.go
  'foo[0].bar',     // foo?.[0]?.bar
  'foo.bar[0]',     // foo?.bar?.[0]
  'foo.bar[0].bla', // foo?.bar?.[0]?.bla
  'foo',            // foo
  'foo.bar[text].some.bla[0]', // foo?.bar?.[text]?.some?.bla?.[0]
  "foo?.you['text'][text].some[0].bla", // foo?.you?.['text']?.[text]?.some?.[0]?.bla
];

const regex = /(?<=[a-z_])\.(?!\?)|(?<=[\]a-z_])(?=\[)|(?<=])\./gi;
const outputs = inputs.map((input) => input.replace(regex, '?.'));
console.log(outputs);

The pattern matches:

  • (?<=[a-z_]) - immediately before the current location, there must be an ASCII letter or an underscore
  • \. - a dot
  • (?!\?) - that is not immediately followed by a ? char
  • | - or
  • (?<=[\]a-z_])(?=\[) - a position between a ], an underscore, or an ASCII letter and a [ char
  • | - or
  • (?<=])\. - a . that is immediately preceded by a ] char.
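
As a rough illustration (not part of the original answer), the sketch below first shows why the alternation from the question drops the [ character: the second branch, \[(?!\?), consumes the bracket while capture group 1 stays unset, so the replacement $1?. expands to just ?. in its place. It then wraps the regex from this answer in a helper. The name addOptionalChaining is hypothetical, and the last call shows a limitation that follows from the pattern not recognizing string literals.

```javascript
// The attempt from the question: the alternation has two branches,
// and only the first branch fills capture group 1.
const attempt = /(?<=\w+\.)(\w+)\.|\[(?!\?)/g;

// When the second branch matches "[", $1 is empty, so "[" is replaced by "?.".
console.log('foo.bar[0].baz'.replace(attempt, '$1?.')); // foo.bar?.0].baz

// The regex from the answer only consumes dots. Before "[" it matches an
// empty position, so inserting "?." there does not remove any character.
const chainRegex = /(?<=[a-z_])\.(?!\?)|(?<=[\]a-z_])(?=\[)|(?<=])\./gi;

// Hypothetical helper name, used here only for illustration.
function addOptionalChaining(path) {
  return path.replace(chainRegex, '?.');
}

console.log(addOptionalChaining('foo.bar[0].bla')); // foo?.bar?.[0]?.bla
console.log(addOptionalChaining('foo?.bar'));       // foo?.bar (already chained, unchanged)

// Limitation: string literals are not recognized, so a dot inside a
// quoted key is rewritten as well.
console.log(addOptionalChaining("foo['a.b']"));     // foo?.['a?.b']
```

Because every branch is either a single dot or a zero-width position, the replacement string does not need a capture group at all.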

Answer 2

Score: 0

You would need to identify string literals and decimal points to avoid replacing things that should stay as they are.

Below is a much longer regex that identifies string literals, decimal points, and insertion points just before a [.

I modified the test cases and added some more challenging ones:

const inputs = [
  'foo.bar',
  'foo.bar.baz',
  'foo.bar.baz?.go',
  'foo[0].bar',
  'foo.bar[0]',
  'foo.bar[0].bla',
  'foo',
  'foo.bar[text].some.bla[0]',
  "foo.you['tex.t']['t[ext'].some?.[0].bla",
  "foo[2.][9e1]",
  "2.3.toFixed?.['.']",
  "+1_2..toFixed",
  "'ab][c.'.toUpperCase",
  "foo.[']'].$$$._",
];

const regex = /(\[?(?:(['"`])(?:\\.|(?!\2).)*\2|\.(?=[\d.[\]])|[^'"`?.[])+)(?=[?.[])\??\.?/g;
const outputs = inputs.map((input) => input.replace(regex, '$1?.'));
console.log(outputs);

The first (large) capture group matches text that needs to remain unaltered:

  • (['"`])(?:\\.|(?!\2).)*\2 matches a string literal, taking into account that a backslash is an escape character, so the character that follows it should not be taken as the string delimiter. The group (['"`]) is the second group in the overall regex, so the closing delimiter is matched with \2.
  • \.(?=[\d.[\]]) matches a decimal point when it cannot be interpreted as a property accessor dot.
  • [^'"`?.[] matches any character other than '"`?.[. This restriction prevents backtracking from giving up the previous alternatives in favor of this one.
  • (?: | | )+ repeats matching the above alternatives as long as possible. This part of the match stops either at the end of the input or just before a ., a ? or a [. (Stopping at a quote is also possible, but then the input is malformed.)

After the first capture group has matched, we are at a potential insertion point for ?.. But it should not be inserted at the end of the input, so we have this assertion:

  • (?=[?.[]) requires that a character still follows (so we are not at the end of the input) and that it is one of ?.[. This requirement also avoids backtracking to an invalid insertion position.

Then the insertion point has been reached:

  • \??\.? matches the characters (if any) that need to be removed before inserting ?.. Note that this also matches an existing ?., which is then replaced by itself.

The replacement is:

  • $1 to reproduce what was captured in the first (large) capture group
  • ?. to insert the optional chaining operator character combination.
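
The answer's snippet only logs its results. As a hedged sketch, the regex described in the bullets above (including the \2 backreference that closes a string literal) can be wrapped in a helper and compared against a few of the test inputs. The helper name toOptionalChain is hypothetical, and the expected strings were worked out by hand from the pattern rather than taken from the original post.

```javascript
// Regex from the answer; \2 refers back to the opening quote character.
const regex = /(\[?(?:(['"`])(?:\\.|(?!\2).)*\2|\.(?=[\d.[\]])|[^'"`?.[])+)(?=[?.[])\??\.?/g;

// Hypothetical helper name, used here only for illustration.
function toOptionalChain(expr) {
  // $1 restores the untouched text; "?." replaces any ".", "?." or empty gap.
  return expr.replace(regex, '$1?.');
}

// Pairs of [input, expected output], derived by hand from the pattern.
const cases = [
  ['foo.bar[0].bla', 'foo?.bar?.[0]?.bla'],
  ['foo.bar.baz?.go', 'foo?.bar?.baz?.go'],
  ["foo.you['tex.t']['t[ext'].some?.[0].bla", "foo?.you?.['tex.t']?.['t[ext']?.some?.[0]?.bla"],
  ["2.3.toFixed?.['.']", "2.3?.toFixed?.['.']"],
];

for (const [input, expected] of cases) {
  const actual = toOptionalChain(input);
  console.log(actual === expected ? 'ok  ' : 'FAIL', input, '->', actual);
}
```

Dots and brackets inside the quoted keys stay untouched because the string-literal alternative consumes them as part of capture group 1, and the decimal point in 2.3 is kept because it is followed by a digit.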

huangapple
  • Posted on February 23, 2023, 20:15:49
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75544701.html