Using Office 365, how can I replace diacritics with plain letters using an efficient formula?

Question


I want to turn letters with diacritics into plain letters, for example:

áàȧäǎāãåąⱥấầắằǡǻǟẫẵảȁȃẩẳạḁậặăâ =a

ḃƀɓḅḇ =b

ćċĉčçȼḉƈɔ =c

ḋďḑđɗḍḓḏðɖ =d

I'm using Office 365, but I need to achieve this with a formula, without add-ins or VBA. A LAMBDA function is fine.

There's a solution here, but:

  1. It's slow.
  2. It's hard to read.
  3. It needs a single string as input; it won't accept an array of words and return an array of words with the diacritics removed.
  4. If a diacritic happens not to be listed, it won't be caught.

Regarding points 2 and 4: the formula simply lists the diacritics to be replaced. Looking at Excel's character codes, there's no obvious pattern that can be used, so listing them might be the only way to do it.

I need a solution which doesn't use add-ins or VBA. The solution below was designed for Excel 2019, so it doesn't take advantage of LAMBDA functions, which I can use.

=CONCAT(IFERROR(IF(EXACT(UPPER(MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1)),MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1)),UPPER(MID(CONCATENATE("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbcccccccccddddddddddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeffggggggggghhhhhhhhhiiiiiiiiiiiiiiiiiiiijjjkkkkkkkklllllllllllll","mmmnnnnnnnnnnnnooooooooooooooooooooooooooooooooooooooooopppprrrrrrrrrrrrsssssssssttttttttttuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuvvwwwwwwxxyyyyyyyyyyyyzzzzzzz"),SEARCH(MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1),CONCATENATE("áàȧäǎāãåąⱥấầắằǡǻǟẫẵảȁȃẩẳạḁậặăâḃƀɓḅḇćċĉčçȼḉƈɔḋďḑđɗḍḓḏðɖéèėêëěĕēẽęȩɇếềḗḕễḝẻȅȇểẹḙḛệéèêëḟƒǵġĝǧğḡģǥɠḣĥḧȟḩħḥḫⱨıíìïǐĭīĩįɨḯỉȉȋịḭíìïîȷĵǰḱǩķƙᶄḳḵⱪĺŀľɫⱡļƚłḷḽḻḹꝈ","ḿṁṃńǹṅňñņɲṇṋṉñŋóòȯôöǒŏōõǫőốồɵøṓṑṍȫỗṏǿȭǭỏȍȏơổọớờỡộởợóòôöõṕṗᵽƥʀŕṙřŗɍɽȑȓṛṟṝśṡŝšṥṧṣṩşṫťƭṭʈṱṯⱦţŧúùûüǔŭūũůųűʉǘǜṹṻủȕȗưụṳứừṷṵữửựúùûüṽṿẃẁẇŵẅẉẋẍýỳẏŷÿȳỹɏỷƴỵýźżẑžƶẓẕ")),1)),MID(CONCATENATE("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbcccccccccddddddddddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeffggggggggghhhhhhhhhiiiiiiiiiiiiiiiiiiiijjjkkkkkkkklllllllllllll","mmmnnnnnnnnnnnnooooooooooooooooooooooooooooooooooooooooopppprrrrrrrrrrrrsssssssssttttttttttuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuvvwwwwwwxxyyyyyyyyyyyyzzzzzzz"),FIND(MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1),CONCATENATE("áàȧäǎāãåąⱥấầắằǡǻǟẫẵảȁȃẩẳạḁậặăâḃƀɓḅḇćċĉčçȼḉƈɔḋďḑđɗḍḓḏðɖéèėêëěĕēẽęȩɇếềḗḕễḝẻȅȇểẹḙḛệéèêëḟƒǵġĝǧğḡģǥɠḣĥḧȟḩħḥḫⱨıíìïǐĭīĩįɨḯỉȉȋịḭíìïîȷĵǰḱǩķƙᶄḳḵⱪĺŀľɫⱡļƚłḷḽḻḹꝈ","ḿṁṃńǹṅňñņɲṇṋṉñŋóòȯôöǒŏōõǫőốồɵøṓṑṍȫỗṏǿȭǭỏȍȏơổọớờỡộởợóòôöõṕṗᵽƥʀŕṙřŗɍɽȑȓṛṟṝśṡŝšṥṧṣṩşṫťƭṭʈṱṯⱦţŧúùûüǔŭūũůųűʉǘǜṹṻủȕȗưụṳứừṷṵữửựúùûüṽṿẃẁẇŵẅẉẋẍýỳẏŷÿȳỹɏỷƴỵýźżẑžƶẓẕ")),1)),MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1)))

Answer 1

Score: 2

Here is a recursive LAMBDA function that works for your problem. You can define it in the Name Manager, for example, or with the Advanced Formula Environment add-in (formula 1):

CLEANCHARS = LAMBDA(txt,old,new, IF(old <>"", 
 CLEANCHARS(SUBSTITUTE(txt,LEFT(old),new),RIGHT(old,LEN(old)-1),new),txt))

Now use it for your specific case, for example:

=CLEANCHARS(A1,"áàȧäǎāãåąⱥấầắằǡǻǟẫẵảȁȃẩẳạḁậặăâ","a")
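
One thing to keep in mind with the call above: SUBSTITUTE is case-sensitive, so only the lowercase characters in the list are replaced. A possible sketch for covering capitals as well is a second pass over the upper-cased list. It assumes Excel's UPPER maps each listed character to its capital form, which is worth spot-checking for the rarer characters. The names old and lowerDone are only illustrative:

```
=LET(old, "áàȧäǎāãåąⱥấầắằǡǻǟẫẵảȁȃẩẳạḁậặăâ",
  lowerDone, CLEANCHARS(A1, old, "a"),
  CLEANCHARS(lowerDone, UPPER(old), "A"))
```

The lowercase pass runs first, so any character that UPPER leaves unchanged has already been removed from the text and the second pass has nothing to match for it.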

Or, if you don't want to define a recursive function, you can use REDUCE (formula 2):

=LET(x,A1, CLEANCHARS, LAMBDA(txt,old,new,
  REDUCE(txt,SEQUENCE(LEN(old)),LAMBDA(ac,i,SUBSTITUTE(ac, MID(old,i,1),new)))),
 CLEANCHARS(x, "áàȧäǎāãåąⱥấầắằǡǻǟẫẵảȁȃẩẳạḁậặăâ", "a"))
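
To see what the REDUCE in formula 2 does, here is a small worked case with literal values. The sample text and the two-character list are made up for illustration. REDUCE starts from the text and applies one SUBSTITUTE per character of the list, feeding each result into the next step:

```
=REDUCE("voilà, está", SEQUENCE(LEN("áà")),
  LAMBDA(ac, i, SUBSTITUTE(ac, MID("áà", i, 1), "a")))
```

Step 1 replaces á and gives "voilà, esta"; step 2 replaces à and gives "voila, esta", which is the final result. Since this is an iteration over the list rather than a function calling itself, the length of the list is not tied to the recursion depth.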

Note: recursive functions are subject to Excel's current operand stack limit (check this link for more info). CLEANCHARS takes three parameters, so the limit works out to 1,024 / (3 + 1) = 256 recursive replacements. If you think you might reach it, use formula 2 instead. From the linked source:

> ...Furthermore, do note that the current operand stack limit in Excel is 1,024. This should be borne in mind together with calculation times, as the current recursion limit is set as 1,024 divided by (number of lambda parameters + 1).

Here is the output (for the recursive approach):
[Image: output of the recursive approach]

Now you can extend it to the general case and clean up an entire array at once. The formula spills the whole result from a single call, so there is no need to drag it down:

=LET(lk, A2:B3, BYROW(D2:D3, LAMBDA(x,REDUCE(x, SEQUENCE(ROWS(lk)),LAMBDA(ac,i, 
  CLEANCHARS(ac, INDEX(lk,i,1), INDEX(lk,i,2)))))))

Note: It can be adapted in a similar way to use the non-recursive approach (formula 2).

Here is the output:
[Image: output of the array formula]
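
As a sketch of the adaptation mentioned in the note above, the array formula could be combined with the non-recursive REDUCE version roughly as follows. It keeps the same layout as the array formula: A2:B3 holds one row per target letter, with the characters to replace in the first column and the replacement in the second, and D2:D3 holds the words. The names words and cleanOne are introduced here for illustration and are not part of the original answer:

```
=LET(lk, A2:B3, words, D2:D3,
  cleanOne, LAMBDA(txt, old, new,
    REDUCE(txt, SEQUENCE(LEN(old)), LAMBDA(ac, i, SUBSTITUTE(ac, MID(old, i, 1), new)))),
  BYROW(words, LAMBDA(x,
    REDUCE(x, SEQUENCE(ROWS(lk)), LAMBDA(ac, i,
      cleanOne(ac, INDEX(lk, i, 1), INDEX(lk, i, 2)))))))
```

The outer REDUCE walks the rows of the lookup table and the inner one walks the characters within a row, so nothing is defined recursively and no entry in the Name Manager is needed. Covering more letters should only require extending the lk range with further rows.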

huangapple
  • Posted on May 10, 2023, 21:53:21
  • Please keep the link to this article when reposting: https://go.coder-hub.com/76219266.html