Python Code and Output in Bookdown pdf are not in multiple lines

Question

I am writing Python code with R Markdown in bookdown. The Python code itself runs fine. The problem is that when the book is rendered to PDF, some long lines of Python code, and sometimes their output, run past the edge of the page and are cut off. Please see the images below.

In the image, the code for print ('The total number of rows and columns in the dataset is {} and {} respectively.'.format(iris_df.shape[0],iris_df.shape[1])) is not fully visible, although its output is. In another case, for new_col = iris_df.columns.str.replace('\(.*\)','').str.strip().str.upper().str.replace(' ','_'), neither the full line of code nor its output is visible. The sns.scatterplot () line has the same problem.

Is there any way to make sure that both the code and its output stay within the page in a bookdown PDF?

Note: I tried splitting the Python code across multiple lines in R Markdown, but that did not work; in most cases the code is not executed when it is written on multiple lines.

Here is the code that I used to generate the output shown in the image:

from sklearn import datasets
iris = datasets.load_iris()
iris.keys()
iris_df = pd.DataFrame (data = iris.data, columns = iris.feature_names)
iris_df['target'] = iris.target

iris_df.sample(frac = 0.05)
iris_df.shape
print ('The total number of rows and columns in the dataset is {} and {} respectively.'.format(iris_df.shape[0],iris_df.shape[1]))
iris_df.info()

new_col = iris_df.columns.str.replace('\(.*\)','').str.strip().str.upper().str.replace(' ','_')
          
new_col
iris_df.columns = new_col
iris_df.info()

sns.scatterplot(data = iris_df, x = 'SEPAL_LENGTH', y = 'SEPAL_WIDTH', hue = 'TARGET', palette = 'Set2')
plt.xlabel('Sepal Length'),
plt.ylabel('Sepal Width')
plt.title('Scatterplot of Sepal Length and Width for the Target Variable')
plt.show()

Answer 1

Score: 1

I do not know why writing the Python code on multiple lines did not work in your case, or whether you did it the right way (you didn't provide much information about that).

From PEP 8 – Style Guide for Python Code:

> The preferred way of wrapping long lines is by using Python's implied line continuation inside parentheses, brackets and braces. Long lines can be broken over multiple lines by wrapping expressions in parentheses. These should be used in preference to using a backslash for line continuation

So if you write your code following this suggestion, it should run fine in R Markdown (and in bookdown) too.
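As a minimal sketch of what PEP 8 describes, the snippet below splits one long statement in two ways: adjacent string literals inside parentheses, which Python joins into a single string, and call arguments spread over several lines. The values of `n_rows` and `n_cols` are placeholders standing in for `iris_df.shape[0]` and `iris_df.shape[1]`, so the snippet runs without pandas.

```python
# Placeholder values standing in for iris_df.shape (assumed for illustration).
n_rows, n_cols = 150, 5

# Adjacent string literals inside parentheses are joined into one string,
# so no source line has to run past the page width.
template = (
    'The total number of rows and columns in the dataset '
    'is {} and {} respectively.'
)

# The arguments of a call can also be spread over several lines.
message = template.format(
    n_rows,
    n_cols,
)
print(message)
```

Neither form needs a backslash, and the statement is still parsed as a single logical line, so it should behave the same inside a `{python}` chunk as in a plain script.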

In addition, since your intended output format is PDF, you can reduce the font size of the source code and output a little using LaTeX packages and commands. The LaTeX package fvextra provides some nice options for reducing font sizes and even for automatically wrapping long code lines.
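A smaller font helps with moderately long lines, but a very long output line can still run past the margin. One option on the Python side, which is not part of the original answer, is to wrap the text before printing it with the standard-library `textwrap` module. In this sketch the 60-character width and the `long_line` string are assumptions chosen for illustration; pick a width that suits your page layout.

```python
import textwrap

# Hypothetical long output line (stands in for any long printed string).
long_line = (
    'The total number of rows and columns in the dataset '
    'is 150 and 5 respectively.'
)

# Break the text at whitespace so that no line exceeds 60 characters.
wrapped = textwrap.fill(long_line, width=60)
print(wrapped)
```

This only affects strings you print yourself; output produced by libraries (for example a DataFrame's printed representation) is controlled by that library's own display settings.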

With all of this in mind, try the following.

(Note how I have wrapped all of the long lines in parentheses.)

intro.Rmd

# Hello bookdown 

```{r setup, include=FALSE}
library(reticulate)
# reticulate::py_install(c("scikit-learn","pandas", "matplotlib", "seaborn"))
use_virtualenv("r-reticulate/")
```


```{python}
import pandas as pd
import seaborn as sns
from sklearn import datasets
import matplotlib.pyplot as plt
```

```{python}
iris = datasets.load_iris()
iris_df = pd.DataFrame (data = iris.data, columns = iris.feature_names)
iris_df['target'] = iris.target

(print(
  'The total number of rows and columns in the dataset is {} and {} respectively.'
  .format(iris_df.shape[0],iris_df.shape[1])))
```


```{python}
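# With pandas 2.0 or later, str.replace() treats the pattern as a literal
# string unless regex=True is passed, so the "(cm)" suffix is kept here and
# the resulting names look like 'SEPAL_LENGTH_(CM)'.
new_col = (iris_df.columns
            .str.replace(r'\(.*\)', '')
            .str.strip()
            .str.upper()
            .str.replace(' ', '_'))
new_col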
```

\newpage

```{python}
iris_df.columns = new_col
sns.scatterplot(
  data = iris_df, 
  x = 'SEPAL_LENGTH_(CM)', 
  y = 'SEPAL_WIDTH_(CM)', 
  hue = 'TARGET', 
  palette = 'Set2')
plt.xlabel('Sepal Length')
plt.ylabel('Sepal Width')
plt.title('Scatterplot of Sepal Length and Width for the Target Variable')
plt.show()
```

Then add the following lines to your preamble.tex file:

\usepackage{fvextra}
\DefineVerbatimEnvironment{Highlighting}{Verbatim}{commandchars=\\\{\},fontsize=\footnotesize}

\makeatletter
\def\verbatim{\footnotesize\@verbatim \frenchspacing\@vobeyspaces \@xverbatim}
\makeatother

If you need a larger or smaller font size than this, try \small or \scriptsize in place of \footnotesize.

Then include that preamble.tex file via includes: in_header in the _output.yml file:

bookdown::pdf_book:
  includes:
    in_header: preamble.tex
  latex_engine: xelatex
  citation_package: natbib
  keep_tex: yes

huangapple
  • Posted on May 17, 2023, 12:01:21
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76268472.html