How to calculate the total count of characters in a Python set

Question

I started with one question, but a second one came up while I was writing this.
I have a dataset as follows:

(image of the dataset not available)

A. I used the following code to group the data above by id (apologies for not including the id column):

dff = df_new.groupby('id').agg({'brand':'first','title':'first','generic_keywords':'first',
                                  'revised_keywords': lambda x: set(x)})

My first question: how can I get 'revised_keywords' without a space after each comma?
The expected output would look something like this (showing just the first row):

{highland cow christmas ornament,cow tree ornament,highland cow christmas decoration,highland cow ornament,cow ornament tree}

Now for my second and main question: once the rows are updated as above (without spaces after the commas), how can I calculate the total number of characters in each row?
len() only gives me the number of strings in each set.

So the output looks like this:

5
16
19

But the expected output would look something like this:

123
261
347

(Calculated in Excel: the spaces after each comma are excluded, but the spaces between words and the commas themselves are counted.)

Could I please get some help with this?

TIA

Answer 1

Score: 1

For the first issue, the space after each comma can be removed with a small change to your lambda function: use ','.join(set(x)) instead of set(x). A set is displayed in Python's own format, with ', ' between its elements, whereas the joined string contains exactly the separator you give it.
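
As a minimal sketch of the difference, the plain-Python snippet below uses the keywords from the first row of the question. The name row_keywords and the use of sorted() are assumptions added here so that the result is reproducible; a set has no defined order, so ','.join(set(x)) on its own may return the keywords in any order.

```python
# Keywords from the first row of the question (the variable name is illustrative).
row_keywords = {
    "highland cow christmas ornament",
    "cow tree ornament",
    "highland cow christmas decoration",
    "highland cow ornament",
    "cow ornament tree",
}

# sorted() is an addition: sets are unordered, so sorting keeps the output stable.
joined = ",".join(sorted(row_keywords))

# The joined string has a bare comma between keywords, with no trailing space.
print(joined)
```

Inside the groupby call, the same idea would be written as lambda x: ','.join(sorted(set(x))).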

For the count issue, you can use

.apply(lambda x: sum(len(word) for words in x for word in words.split(',')))

This counts the characters in the keywords themselves; the separating commas are not included.
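
To check the numbers against the expected output in the question, here is a small sketch on the first row. The helper names keyword_chars and joined_chars are hypothetical. Summing the keyword lengths gives 119, while taking len() of the comma-joined string also counts the four separating commas and gives 123, which is the value the question expects for that row.

```python
def keyword_chars(keywords):
    """Total characters in the keywords, ignoring separators (hypothetical helper)."""
    return sum(len(keyword) for keyword in keywords)


def joined_chars(keywords):
    """Length of the comma-joined string, so the separating commas count too."""
    return len(",".join(keywords))


# First row from the question.
first_row = {
    "highland cow christmas ornament",
    "cow tree ornament",
    "highland cow christmas decoration",
    "highland cow ornament",
    "cow ornament tree",
}

print(len(first_row))            # 5   (number of strings, as the question observed)
print(keyword_chars(first_row))  # 119 (characters without the commas)
print(joined_chars(first_row))   # 123 (characters including the 4 commas)
```

With the grouped frame from the question, either helper could be applied to the column, for example dff['revised_keywords'].apply(joined_chars), assuming the column still holds sets rather than already-joined strings.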

This may also be helpful:

helping material

huangapple
  • Posted on June 30, 2023, 04:17:31
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76584363.html