2023-06-16 02:44:02

How to untruncate the print output. The pd.set_option() option does not work

Question

I am using pandas 1.4.4 and Python 3.9.13 in Jupyter notebook and trying to make my print output appear not truncated.

My code is below:

for column in categorical_cols[1:]:
    unique_values = df[column].value_counts()
    non_null_values = df[column].count()
    print(f'Column: {column} - Non value values: {non_null_values}')
    print(unique_values)
    print()

Basically, categorical_cols holds the names of the columns with non-numeric (object dtype) values. The code that produces it is below.

categorical_cols = df.select_dtypes(include=['object']).columns

With the for loop I am trying to find the unique values in each categorical column, which works fine. The only problem is that my output in Jupyter is truncated, as the screenshot I added shows.

Some info about df:
Int64Index: 34434 entries, 0 to 34433
Data columns (total 51 columns)

I have tried the pandas.set_option calls below, but none of them seem to work:

pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.pprint_nest_depth', 10)
pd.set_option('display.large_repr', 'info')

Answer 1

Score: 0

The screenshot that has been added indicates that by 'truncated' you seem to mean that the 'scrolled' view has been activated for the cell's output.

See this post and the answers to get an idea of what is going on here and how you can change it. See bottom of this answer for more about this mode of view that gets activated in classic notebook if you print a lot of output.

This happens when you print a lot of output to your cell. I speculate it may be an old, cautious behavior built into Jupyter to better handle a lot of output when computers weren't generally as powerful. Note that this doesn't happen in modern JupyterLab, so one solution is to start using JupyterLab. (I don't know what Jupyter Notebook Version 7 or higher does. I'm assuming that nbclassic is what you are using. See here if 'nbclassic' and 'Jupyter Notebook Version 7' don't make sense to you.)

Other solutions if you want to keep using nbclassic:

Toggle that mode off by clicking on the left-hand side of the cell's output area.
Based on IPython/Jupyter's %%capture and %store magics illustrated here, you can add %%capture out to the top of the cell that produces a lot of output. Then, in the next cell, run %store out.stdout >my_ton_of_text.txt to save all of the output to a file, which you can open in Jupyter or in your own text editor to peruse.
Edit your cell's code to accumulate a string in Python instead of printing it, and then use the %store magic covered above to save that string as a file.
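
The last two options can also be approximated in plain Python, without the magics. The following is only a minimal sketch under stated assumptions: collections.Counter stands in for df[column].value_counts(), sample_columns is made-up data rather than the asker's df, and build_report and capture_prints are hypothetical helper names. The file name reuses my_ton_of_text.txt from above, written to the system temp directory.

```python
from collections import Counter
from contextlib import redirect_stdout
from io import StringIO
from pathlib import Path
import tempfile


def build_report(columns):
    """Accumulate per-column value counts in one string instead of printing."""
    parts = []
    for name, values in columns.items():
        non_null = [v for v in values if v is not None]
        parts.append(f"Column: {name} - Non-null values: {len(non_null)}")
        # Counter.most_common() stands in for df[name].value_counts().
        for value, count in Counter(non_null).most_common():
            parts.append(f"{value}    {count}")
        parts.append("")  # blank line between columns
    return "\n".join(parts)


def capture_prints(func, *args, **kwargs):
    """Run func and return everything it printed to stdout as a string."""
    buffer = StringIO()
    with redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()


# Hypothetical sample data standing in for the object-dtype columns of df.
sample_columns = {
    "city": ["Oslo", "Bergen", "Oslo", None],
    "status": ["open", "open", "closed", "open"],
}

# Option 3: build the text first, then save it to a file.
report = build_report(sample_columns)
report_path = Path(tempfile.gettempdir()) / "my_ton_of_text.txt"
report_path.write_text(report, encoding="utf-8")

# Rough analogue of %%capture: collect what an existing function prints.
printed = capture_prints(print, report)
```

With a real DataFrame, the loop body would append str(df[column].value_counts()) instead of the Counter lines; the saved file can then be opened outside the notebook, where no scrolled view applies.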

Details on scrolled view

If I run the following code in a new notebook, I get the mode you show in your screenshot, because the code produces a lot of output:

for x in range(2000):
    print(x)

If I then save the notebook file and open it in a text editor, I see the following in the .ipynb source:

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "e0c96475",
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0\n",
      "1\n",
      "2\n",
      "3\n",

Note the "scrolled": true in the cell's metadata.
You can cycle through the different view options by clicking on the left side of the cell output area. One of the options turns the scrolled view off.
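
The same flag can be changed outside the notebook interface by editing the .ipynb JSON. The sketch below is an assumption-laden illustration, not a confirmed fix: it assumes that the cell metadata key "scrolled" accepts false and that nbclassic honors it when the notebook is reopened. disable_scrolling is a hypothetical helper name, and the embedded notebook text is a made-up two-cell example modeled on the excerpt above.

```python
import json


def disable_scrolling(notebook):
    """Set "scrolled" to False on code cells where it is True; return the count."""
    changed = 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        metadata = cell.setdefault("metadata", {})
        if metadata.get("scrolled") is True:
            metadata["scrolled"] = False
            changed += 1
    return changed


# Made-up minimal notebook: one scrolled code cell and one markdown cell.
notebook_text = """
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {"scrolled": true},
   "outputs": [],
   "source": ["for x in range(2000):\\n", "    print(x)"]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": ["Some notes"]
  }
 ],
 "metadata": {},
 "nbformat": 4,
 "nbformat_minor": 5
}
"""

notebook = json.loads(notebook_text)
changed = disable_scrolling(notebook)

# Serialize again; for a real file this text would be written back to the .ipynb.
updated_text = json.dumps(notebook, indent=1)
```

For a real notebook, the text would be read from and written back to the .ipynb file while the notebook is closed, ideally after making a backup copy.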
