February 14, 2023 09:10:14

Python execution log

Question

I'd like to create a log for a Python script execution. For example:

import pandas as pd
data = pd.read_excel('example.xlsx')
data.head()

How can I create a log for this script in order to know who ran it, when it was executed, and when it finished? Also, supposing I take a sample of the DataFrame, how can I set a seed that I can share with another person so that they can run the script and get the same result?

Answer 1

Score: 2

You can use the logging module from Python's standard library. You'll need a few extra lines of code to configure it to record the information you want (the time of execution and the user running the script) and to specify the file in which the log messages are stored.

As for recording who ran the script, that depends on how you want to tell users apart. If the script is meant to run on a server, you might identify users by their IP addresses. Another option is the getpass module, which the example below uses.

Finally, when sampling from data, you can pass an integer seed to the random_state parameter so that the sample always contains the same rows.

Here's a modified version of your script with the previously mentioned changes:

# == Necessary Imports =========================================================
import logging
import pandas as pd
import getpass


# == Script Configuration ======================================================
# Set a seed to enable reproducibility
SEED = 1

# Get the username of the person who is running the script.
USERNAME = getpass.getuser()

# Set a format to the logs.
LOG_FORMAT = '[%(levelname)s | ' + USERNAME + ' | %(asctime)s] - %(message)s'

# Name of the file to store the logs.
LOG_FILENAME = 'script_execution.log'

# Minimum level at which messages are logged. By default, logging defines the
# following levels, in increasing order of severity:
# 1. DEBUG: detailed information, useful only when diagnosing a problem.
# 2. INFO: confirmation that everything is working as it should.
# 3. WARNING: information that requires the user's attention.
# 4. ERROR: an error has occurred and the script is unable to perform some function.
# 5. CRITICAL: a serious error has occurred and the script may be unable to keep running.
LOG_LEVEL = logging.INFO
# When you set the level, all messages of higher severity are also logged.
# For example, when you set the log level to `INFO`, all `WARNING`, `ERROR`
# and `CRITICAL` messages are also logged, but `DEBUG` messages are not.


# == Set up logging ============================================================
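logging.basicConfig(
    level=LOG_LEVEL,
    format=LOG_FORMAT,
    force=True,  # Replace any handlers already attached to the root logger (Python 3.8+).
    datefmt="%Y-%m-%d %H:%M:%S",
    # Append to the log file and also echo every message to the console.
    handlers=[logging.FileHandler(LOG_FILENAME, mode="a", encoding="utf-8"),
              logging.StreamHandler()]
)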


# == Script Start ==============================================================
# Log the script execution start
logging.info('Script started execution!')

# Read data from the Excel file
data = pd.read_excel('example.xlsx')

# Retrieve a sample with 50% of the rows from `data`.
# When `random_state` is set, `pd.DataFrame.sample` always returns
# the same rows, provided that `data` doesn't change.
sample_data = data.sample(frac=0.5, random_state=SEED)

# Other stuff
# ...

# Log when the script finishes execution
logging.info('Script finished execution!')

Running the code above prints the following messages to the console:

[INFO | erikingwersen | 2023-02-13 23:17:14] - Script started execution!
[INFO | erikingwersen | 2023-02-13 23:17:14] - Script finished execution!

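It also creates or updates a file named 'script_execution.log' containing the same messages that are printed to the console. Because the file name is a relative path, the file is written to the current working directory, which is the script's directory only if you run the script from there.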
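The question also asks when the script finished. The log timestamps already answer that, but it can be handy to record the elapsed time and any failure as well. What follows is only a minimal sketch under stated assumptions: `run_logged` is a hypothetical helper that is not part of the original answer, and `task` stands for whatever callable wraps the body of your script.

```python
import logging
import time


def run_logged(task, logger=None):
    """Run `task()` and log its start, finish, and elapsed time.

    `run_logged` is a hypothetical helper used for illustration only.
    """
    if logger is None:
        logger = logging.getLogger(__name__)

    start = time.perf_counter()
    logger.info("Script started execution!")
    try:
        result = task()
    except Exception:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception("Script failed after %.2f s", time.perf_counter() - start)
        raise
    logger.info("Script finished execution in %.2f s", time.perf_counter() - start)
    return result
```

With logging configured as in the script above, you could wrap the work in a function and call `run_logged(main, logging.getLogger())`, where `main` is your own function. Re-raising the exception keeps the script's normal failure behavior while still leaving a record in the log.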

Answer 2

Score: 1

To create a log

You can use Python's standard logging module.

Logging HOWTO — Python 3.11.2 documentation

import logging
logging.basicConfig(filename='example.log', encoding='utf-8', level=logging.DEBUG)
logging.debug('This message should go to the log file')
logging.info('So should this')
logging.warning('And this, too')
logging.error('And non-ASCII stuff, too, like Øresund and Malmö')

1.1 To know who ran the script

import getpass
getpass.getuser()

1.2 To know when it ran

import logging

# %(asctime)s adds the timestamp; clientip and user are custom fields supplied via `extra`.
FORMAT = '%(asctime)s %(clientip)-15s %(user)-8s %(message)s'
logging.basicConfig(format=FORMAT)
d = {'clientip': '192.168.0.1', 'user': 'fbloggs'}
logger = logging.getLogger('tcpserver')
logger.warning('Protocol problem: %s', 'connection reset', extra=d)
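Passing `extra` on every call gets repetitive when the user is the same for the whole run. One possible way to combine 1.1 and 1.2 is a `logging.Filter` that stamps each record with the username. This is a sketch rather than part of the original answer: `UserFilter` and `make_logger` are hypothetical names, and the format string is a shortened variant of the one above.

```python
import getpass
import logging


class UserFilter(logging.Filter):
    """Attach a username to every record as `record.user` (hypothetical helper)."""

    def __init__(self, user=None):
        super().__init__()
        # Fall back to the account running the script when no user is given.
        self.user = user if user is not None else getpass.getuser()

    def filter(self, record):
        record.user = self.user
        return True  # Never drop the record; only annotate it.


def make_logger(name, stream=None, user=None):
    """Build a logger whose messages carry a timestamp and the username."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)  # stream=None means sys.stderr
    handler.setFormatter(logging.Formatter("%(asctime)s %(user)-8s %(message)s"))
    handler.addFilter(UserFilter(user))
    logger.addHandler(handler)
    logger.propagate = False  # Avoid duplicate output through the root logger.
    return logger
```

After `logger = make_logger("script")`, a plain `logger.info("Script started execution!")` includes the username without any `extra` argument.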

Create a seed so you can share it with another person to execute it and have the same result

You can use the random_state parameter:

df['one_col'].sample(n=10, random_state=1)
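To check that a shared seed really reproduces the sample, you can draw twice with the same `random_state` and compare the results. The DataFrame below is made-up illustration data, not the asker's Excel file, and `SEED` is just an example value.

```python
import pandas as pd

SEED = 1  # Example value; share it together with the script.

# Made-up data standing in for the real DataFrame.
df = pd.DataFrame({"one_col": range(100)})

first = df["one_col"].sample(n=10, random_state=SEED)
second = df["one_col"].sample(n=10, random_state=SEED)

# Same seed and same data: both draws select the same rows in the same order.
print(first.equals(second))
```

The guarantee holds only while the underlying data stays the same, so anyone rerunning the script needs the same input file as well as the same seed.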
