January 9, 2023, 06:13:35

How to extract the 10 most frequent words in a text in form of a list of words in Python?

Question

I have a text and am trying to extract the 10 most frequent words in it. I use the Counter.most_common(10) method, but I am getting the output in the form of tuples that also contain the number of occurrences (which I don't need...). How can I fix this so that the output is just the words in the form of a list?

Note: I can't use the nltk library in the program to be created.

This is the code I wrote:

tuple(map(str, Counter(text).most_common(10)))

This is the output I am getting:

('science', 55)

This is the output I need:

["science"]

Answer 1

Score: 1

You need to get the first item in the pairs returned by Counter.most_common().

[t[0] for t in counter.most_common(10)]
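
To see why indexing with `t[0]` works, it may help to look at what `most_common(n)` returns: a list of `(element, count)` tuples, most frequent first. The following is a minimal sketch; the word list is made-up sample data, not the asker's text, and the names `pairs` and `top_words` are only for illustration.

```python
from collections import Counter

# Hypothetical sample data, not taken from the original question.
words = ["science", "art", "science", "math", "science", "art"]
counter = Counter(words)

# most_common(n) returns (element, count) pairs, most frequent first.
pairs = counter.most_common(2)
print(pairs)  # [('science', 3), ('art', 2)]

# Tuple unpacking does the same job as indexing each pair with t[0].
top_words = [word for word, _count in pairs]
print(top_words)  # ['science', 'art']
```

Unpacking into `word, _count` is purely a readability choice; the result is the same list as the `t[0]` version.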

Full demo:

from collections import Counter

text = """\
A Counter is a dict subclass for counting hashable objects. It is a collection
where elements are stored as dictionary keys and their counts are stored as
dictionary values. Counts are allowed to be any integer value including zero or
negative counts. The Counter class is similar to bags or multisets in other
languages.

Elements are counted from an iterable or initialized from another mapping (or
counter):

Counter objects have a dictionary interface except that they return a zero
count for missing items instead of raising a KeyError:

Setting a count to zero does not remove an element from a counter. Use del to
remove it entirely:

New in version 3.1.

Changed in version 3.7: As a dict subclass, Counter inherited the capability to
remember insertion order. Math operations on Counter objects also preserve
order. Results are ordered according to when an element is first encountered in
the left operand and then by the order encountered in the right operand.
"""

counter = Counter(text.split())

[t[0] for t in counter.most_common(10)]

gives

['a', 'to', 'Counter', 'are', 'in', 'is', 'the', 'dictionary', 'zero', 'or']
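
Note that `text.split()` in the demo splits on whitespace only, so tokens keep their case and any attached punctuation (for example, `Counter` and `counter):` are counted as different words). If that matters for your text, one possible approach is to lowercase the text and pull out runs of letters with the standard `re` module, which avoids nltk. This is only a sketch: the helper name `top_words`, the regular expression, and the sample sentence are assumptions for illustration, and the pattern handles ASCII letters only.

```python
import re
from collections import Counter


def top_words(text, n=10):
    """Return the n most frequent words, ignoring case and punctuation."""
    # Lowercase first, then keep runs of ASCII letters; an apostrophe is
    # allowed inside a word so that "don't" stays a single token.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return [word for word, _count in Counter(tokens).most_common(n)]


# Hypothetical sample sentence, not taken from the original question.
sample = "Science is fun. Science, they say, is hard; science!"
print(top_words(sample, 2))  # ['science', 'is']
```

Whether to normalize like this depends on the task; for text with accented or non-Latin characters the pattern would need to be widened.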
