Regular expression to replace all non-letter characters with underscores while keeping the first # intact
Question
I use this regex to replace every run of symbols (any character that is not a letter or a digit) with an underscore, while leaving the leading # as it is:
text.replace(/(?!^)[^\p{L}\p{N}]+/gu, '_');
This works fine, but with Bengali letters, when the user types চার
it prints: চ__া_র
Please provide the correct regex.
Answer 1
Score: 2
The second char (\u09BE) is U+09BE BENGALI VOWEL SIGN AA, which belongs to the "Mark, spacing combining" Unicode category (Mc).
That means you need to add the mark Unicode category class, \p{M}, to the negated character class so that combining marks are no longer replaced:
/(?!^)[^\p{L}\p{N}\p{M}]+/gu
See the JavaScript demo:
<!-- begin snippet: js hide: false console: true babel: false -->
<!-- language: lang-js -->
console.log("চার".replace(/(?!^)[^\p{L}\p{N}\p{M}]+/gu, '_'))
<!-- end snippet -->
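To see the effect on input that actually starts with #, here is a minimal sketch that runs both patterns side by side. The names `sanitizeTag` and `withoutMarks` are hypothetical and used only for illustration; the regexes are the ones from the question and from this answer.

```javascript
// Hypothetical helper: the regex is the corrected one from the answer.
function sanitizeTag(text) {
  // (?!^) refuses to match at position 0, so a leading "#" is left alone.
  // \p{M} keeps combining marks (such as U+09BE) attached to their base letter.
  return text.replace(/(?!^)[^\p{L}\p{N}\p{M}]+/gu, '_');
}

// The original pattern from the question, kept for comparison.
function withoutMarks(text) {
  return text.replace(/(?!^)[^\p{L}\p{N}]+/gu, '_');
}

console.log(sanitizeTag('#চার'));          // #চার
console.log(sanitizeTag('#hello world!')); // #hello_world_
console.log(withoutMarks('#চার'));         // #চ_র (the vowel sign is replaced)
```

Note that \p{M} covers all three mark subcategories (Mn, Mc, Me), so it should also preserve non-spacing marks in other scripts, not only the Bengali vowel signs.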