June 29, 2023 12:31:19

Count number of words with 3 or more letters from a string in R

Question

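Note: the R code fails because strsplit returns a list holding one character vector, not a flat vector of words. The for loop therefore runs once over that whole vector, nchar returns one length per word, and if (l >= 3) receives a condition of length 18 instead of 1. The fix is to extract or flatten the list into a plain vector of words (for example with [[1]] or unlist) before applying the condition.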

I have a string sentence. I need to count the words in the sentence that have 3 or more letters.

For example:

sentence <- 'I have a string sentence but i do not know how to get three lettered words from it'
# Total words = 18
# 3 or more lettered words = 12

How I do it in Python in a basic way:

words = sentence.split(' ')  # this creates a list of words
count = 0
for each in words:
    if len(each) >= 3:
        count = count + 1
print(count)

An alternative way in Python (a little crude, but it works):

print(len(list(filter(lambda word: len(word) >= 3, words))))

I tried doing the same thing in R:

words <- strsplit(sentence, split = ' ')
count <- 0
for (word in words) {
    l <- nchar(word)
    if (l >= 3) {
        count <- count + 1
    }
}
print(count)

This results in an error for me:

# Error in if (l >= 3) { : the condition has length > 1
# ERROR!
# Execution halted

When I checked this error on the web, it says that if we provide a vector to the if condition, then this error occurs. But I provided it with a simple numeric variable, so I do not understand what is causing this error.

Can someone please explain and help me out?

P.S.: I do not want to use any external package for this. I am learning R, so I want to do it with the basics.

Answer 1

Score: 4

You can use lengths

sentence <- 'I have a string sentence but i do not know how to get three lettered words from it'
lengths(strsplit(sentence, '\\s+'))
# [1] 18

To count words with at least three characters, take the first element of the resulting list, test whether nchar is >= 3, and sum the result.

sum(nchar(el(strsplit(sentence, "\\s+"))) >= 3)
# [1] 12

Or, using the native pipe (available from R 4.1.0):

strsplit(sentence, '\\s+') |> el() |> nchar() |> base::`>=`(3) |> sum()
# [1] 12
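
To see what each stage of that one-liner produces, here is a minimal sketch that unpacks it into named intermediate values. The variable names (parts, words, n_chars, is_long, n_long) are made up for illustration and are not part of the answer, and words are extracted with [[1]] rather than el(), simply because it is the more commonly seen base R idiom for taking the first list element.

```r
sentence <- 'I have a string sentence but i do not know how to get three lettered words from it'

parts   <- strsplit(sentence, '\\s+')  # a list with one element per input string (here: 1)
words   <- parts[[1]]                  # the character vector of words inside that list
n_chars <- nchar(words)                # integer vector: number of characters in each word
is_long <- n_chars >= 3                # logical vector: TRUE where a word has 3+ characters
n_long  <- sum(is_long)                # TRUE counts as 1, so this counts the long words

print(length(words))  # total number of words
print(n_long)         # number of words with 3 or more characters
```

Because nchar and >= are vectorised, no explicit loop is needed once the words are in a plain character vector.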

The regex '\\s+' (one or more whitespace characters), used instead of ' ', takes care of accidental runs of multiple whitespace characters.
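
A small sketch of the difference, using a made-up string (messy) that is not from the original post:

```r
messy <- 'a  big   gap'   # hypothetical input: two spaces, then three spaces

by_space <- strsplit(messy, ' ')[[1]]     # splits at every single space
by_regex <- strsplit(messy, '\\s+')[[1]]  # treats a run of whitespace as one separator

print(by_space)          # includes empty strings between consecutive spaces
print(by_regex)          # only the three words
print(length(by_space))  # inflated by the empty strings
print(length(by_regex))
```

The empty strings have zero characters, so they would not affect the count of words with 3 or more letters, but they would inflate the total word count. Note that a string starting with whitespace still yields an empty first element even with '\\s+', so trimming the input first (for example with trimws) may be worth considering.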

Note:

To clarify lengths vs. length:

length(list(1:3))
# [1] 1
lengths(list(1:3))
# [1] 3
sapply(list(1:3), length)  ## equiv.
# [1] 3
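
The same list-versus-vector distinction explains the error in the question's loop: strsplit returns a list of length 1, so for (word in words) runs a single pass in which word is the entire vector of 18 words. A minimal sketch of a corrected loop, assuming the loop structure from the question should be kept, extracts the first list element before iterating:

```r
sentence <- 'I have a string sentence but i do not know how to get three lettered words from it'

# strsplit returns a list; [[1]] extracts the character vector of words
words <- strsplit(sentence, split = ' ')[[1]]

count <- 0
for (word in words) {          # word is now a single string on each pass
    if (nchar(word) >= 3) {    # the condition has length 1, as if() requires
        count <- count + 1
    }
}
print(count)
```

This stays in base R, as the question asks, although the vectorised sum(nchar(...) >= 3) form is the more idiomatic choice.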
