June 1, 2023, 15:29:32

Julia: elegant way to identify 2 strings

Question

I wonder whether there is an elegant way in Julia to identify two strings that contain the same words.
I mean, there are 2 strings, for example

   1. I think this is good
   2. This is good, I think

Both have the same meaning, of course, but the word order is different.
I am not good at this kind of procedure. How do you usually do it? Do you put all the words into arrays and then check whether each element exists in the other?
I believe there is a marvelous way to do this in Julia.

Thanks in advance.

Answer 1

Score: 4

Here is a step-by-step example:

julia> using StatsBase
julia> strs = ["I think this is good", "This is good, I think"] # initial vector of strings
2-element Vector{String}:
 "I think this is good"
 "This is good, I think"
julia> split.(strs, r"\W", keepempty=false) # split them on non-word characters
2-element Vector{Vector{SubString{String}}}:
 ["I", "think", "this", "is", "good"]
 ["This", "is", "good", "I", "think"]
julia> map(x -> lowercase.(x), (split.(strs, r"\W", keepempty=false))) # lowercase all words
2-element Vector{Vector{String}}:
 ["i", "think", "this", "is", "good"]
 ["this", "is", "good", "i", "think"]
julia> sort.(map(x -> lowercase.(x), (split.(strs, r"\W", keepempty=false)))) # sort each entry
2-element Vector{Vector{String}}:
 ["good", "i", "is", "think", "this"]
 ["good", "i", "is", "think", "this"]
julia> countmap(sort.(map(x -> lowercase.(x), (split.(strs, r"\W", keepempty=false))))) # finally count the number of duplicates; you need StatsBase for this
Dict{Vector{String}, Int64} with 1 entry:
  ["good", "i", "is", "think", "this"] => 2
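
If you only need a yes/no answer for two strings, the same split, lowercase, and sort steps can be wrapped in a small function that uses Base only. This is a minimal sketch, not part of the original answer: the names `normalize_words` and `same_words` are made up for illustration, and it assumes that splitting on non-word characters is an acceptable definition of a "word" (so "don't" would become "don" and "t").

```julia
# Hypothetical helper: turn a string into a sorted vector of lowercase words.
# Splitting on runs of non-word characters drops spaces and punctuation.
normalize_words(s::AbstractString) =
    sort(lowercase.(split(s, r"\W+", keepempty=false)))

# Hypothetical helper: true when both strings contain the same words
# (with the same multiplicity), ignoring order, case, and punctuation.
same_words(a::AbstractString, b::AbstractString) =
    normalize_words(a) == normalize_words(b)

println(same_words("I think this is good", "This is good, I think"))  # true
println(same_words("I think this is good", "I think this is bad"))    # false
```

Sorting makes the comparison independent of word order while still counting repeated words, which matches what the `countmap` step above detects.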

Answer 2

Score: -1

string1 = "I think this is good"
string2 = "This is good, I think"
# Step 1 (remove leading/trailing whitespace and lowercase, so that "This" matches "this")
string1 = lowercase(strip(string1))
string2 = lowercase(strip(string2))
# Step 2 (split the strings into individual words on non-word characters, so that punctuation such as "," is dropped)
words1 = split(string1, r"\W+", keepempty=false)
words2 = split(string2, r"\W+", keepempty=false)
# Step 3 (sort the word vectors in alphabetical order)
sorted_words1 = sort(words1)
sorted_words2 = sort(words2)
# Step 4 (compare the sorted word vectors)
if sorted_words1 == sorted_words2
    println("Same")
else
    println("Different")
end
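
The question also asks whether it is enough to check that each word of one string exists in the other. A rough sketch of how that set-style check differs from comparing sorted word vectors is shown here; the function names `tokens`, `same_word_set`, and `same_word_multiset` are hypothetical and chosen only for this illustration.

```julia
# Hypothetical helper: lowercase words of a string, punctuation removed.
tokens(s::AbstractString) = lowercase.(split(s, r"\W+", keepempty=false))

# Set comparison: only asks whether the same distinct words occur.
same_word_set(a, b) = Set(tokens(a)) == Set(tokens(b))

# Sorted comparison: also requires each word to occur the same number of times.
same_word_multiset(a, b) = sort(tokens(a)) == sort(tokens(b))

println(same_word_set("good good day", "Good day"))       # true
println(same_word_multiset("good good day", "Good day"))  # false
```

Which of the two is appropriate depends on whether repeated words should matter for your data.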
