Count number of words with 3 or more letters from a string in R
Question
I have a sentence stored as a string. I need to count the words in it that have three or more letters.
For example:
sentence <- 'I have a string sentence but i do not know how to get three lettered words from it'
# Total words = 18
# 3 or more lettered words = 12
How I do it in Python in a basic way:
words = sentence.split(' ')  # this creates a list of words
count = 0
for each in words:
    if len(each) >= 3:
        count = count + 1
print(count)
An alternative way in Python (a little crude, but it works):
print(len(list(filter(lambda word: len(word)>= 3, words))))
I tried doing the same thing in R:
words <- strsplit(sentence, split = ' ')
count <- 0
for (word in words) {
  l <- nchar(word)
  if (l >= 3) {
    count <- count + 1
  }
}
print(count)
This results in an error for me:
# Error in if (l >= 3) { : the condition has length > 1
# ERROR!
# Execution halted
When I looked this error up online, I read that it occurs when a vector is passed to the if condition. But I passed it a simple numeric variable, so I do not understand what is causing the error.
Can someone please explain and help me out?
P.S.: I do not want to use any external package for this. I am learning R, so I want to do it with the basics.
Answer 1
Score: 4
You can use lengths to get the total number of words:
sentence <- 'I have a string sentence but i do not know how to get three lettered words from it'
lengths(strsplit(sentence, '\\s+'))
# [1] 18
To count words with at least three characters, we take the first element of the resulting list (with el), test whether nchar is >= 3, and sum the result.
sum(nchar(el(strsplit(sentence, "\\s+"))) >= 3)
# [1] 12
or using pipes:
strsplit(sentence, '\\s+') |> el() |> nchar() |> base::`>=`(3) |> sum()
# [1] 12
The regex '\\s+' (one or more whitespace characters), used instead of ' ', takes care of (accidental) runs of multiple whitespace characters.
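A small sketch of the difference, using a made-up string with a doubled space (the name messy and its contents are illustrative, not from the question):

```r
# Hypothetical input with two spaces between "a" and "string"
messy <- "I have a  string"

# Splitting on a single space leaves an empty string where the extra space was,
# so the total "word" count is one too high
n_single <- lengths(strsplit(messy, " "))
print(n_single)
# [1] 5

# Splitting on one or more whitespace characters avoids the empty string
n_regex <- lengths(strsplit(messy, "\\s+"))
print(n_regex)
# [1] 4
```

The empty string has zero characters, so it would not change the count of words with three or more letters, only the total word count.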
Note:
To clarify lengths vs. length:
length(list(1:3))
# [1] 1
lengths(list(1:3))
# [1] 3
sapply(list(1:3), length) ## equiv.
# [1] 3
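As for why the loop in the question fails: as far as I can tell, strsplit returns a list with one element per input string, so for (word in words) runs once with word set to the whole vector of 18 words. nchar(word) then has length 18, and that is what if complains about. A minimal sketch of the same loop in base R, assuming a single input sentence, extracts the first list element before looping:

```r
sentence <- 'I have a string sentence but i do not know how to get three lettered words from it'

# strsplit() returns a list; [[1]] pulls out the character vector of words
# that belongs to our single sentence
words <- strsplit(sentence, split = ' ')[[1]]

count <- 0
for (word in words) {
  # word is now a single string, so nchar(word) is a single number
  if (nchar(word) >= 3) {
    count <- count + 1
  }
}
print(count)
# [1] 12
```

The vectorised sum(nchar(...) >= 3) shown earlier does the same thing without an explicit loop and is the more common style in R.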