How to calculate the total count of characters in a Python set
Question
I initially had one question, but while writing this I thought of another.
So I have a dataset as follows:
I used the following code to group the above data by id (apologies for not including the id column):
dff = df_new.groupby('id').agg({'brand':'first','title':'first','generic_keywords':'first',
'revised_keywords': lambda x: set(x)})
So my first question is: how can I get 'revised_keywords' without a space after each comma?
The expected output would look something like this (showing just the first row):
{highland cow christmas ornament,cow tree ornament,highland cow christmas decoration,highland cow ornament,cow ornament tree}
Now for my second and main question: once the rows are updated as above (without spaces after the commas), how can I calculate the total number of characters in each row?
len() just gives me the number of strings in each set.
So the output looks something like this:
5
16
19
But the expected output would look something like this:
123
261
347
(calculated in Excel, without the spaces after each comma, but counting the commas and the spaces between words).
Could I please get some help with this?
TIA
Answer 1
Score: 1
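For the first issue, the space after each comma can be removed with a small change to your lambda function: use ','.join(set(x)) instead of set(x).
A set is a Python object, so it is displayed using its own representation, which puts a space after each comma. join builds a plain string with exactly the separator you give it.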
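As a minimal sketch of that difference in plain Python (the keyword values are taken from the question; sorted() is an addition here, used only so the order is reproducible, since sets are unordered):

```python
keywords = {"cow tree ornament", "highland cow ornament", "cow ornament tree"}

# str() of a set gives its repr: braces, quoted items, and ", " between items.
as_repr = str(keywords)
print(as_repr)

# join builds one plain string using "," with no space after it.
joined = ",".join(sorted(keywords))
print(joined)
```

If the order of the keywords does not matter, sorted() can be dropped and the set joined directly.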
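For the count issue, you can use
.apply(lambda x: sum(len(word) for words in x for word in words.split(',')))
This adds up the characters of the keywords only, so the separating commas are not counted: for the five keywords shown in the question it gives 119 rather than the expected 123. To count the commas as well, take the length of the joined string instead, for example with .str.len() on the 'revised_keywords' column.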
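Putting both steps together might look like the sketch below. The DataFrame is a hypothetical stand-in for df_new with made-up rows, and the column names char_count and char_count_no_commas are illustrative, not from the question.

```python
import pandas as pd

# Hypothetical stand-in for df_new; only the columns needed here.
df_new = pd.DataFrame({
    "id": [1, 1, 1, 2, 2],
    "revised_keywords": [
        "cow tree ornament",
        "highland cow ornament",
        "cow tree ornament",  # duplicate, removed by set()
        "red mug",
        "blue mug",
    ],
})

# Deduplicate with set(), then join with "," (no space after the comma).
# sorted() is only there to make the keyword order reproducible.
dff = df_new.groupby("id").agg(
    {"revised_keywords": lambda x: ",".join(sorted(set(x)))}
)

# Length of the joined string: keyword characters plus the separating commas.
dff["char_count"] = dff["revised_keywords"].str.len()

# Keyword characters only, with the commas left out.
dff["char_count_no_commas"] = (
    dff["revised_keywords"].str.replace(",", "", regex=False).str.len()
)

print(dff)
```

The first count corresponds to what Excel's LEN would report on the joined text; the second matches the sum-of-lengths approach above.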