Encoding error when printing data from clipboard, but works when the data is hardcoded

Question

I'm trying to copy all the text from an Amazon search results page (say the search term is "laptop") by sending Ctrl+A and Ctrl+C through PyAutoGUI. I then read the data with either pyperclip.paste() or pd.read_clipboard() and print it. Here's the code:

import pyautogui
import time
import pyperclip
import pandas as pd

keyword = 'laptop'

time.sleep(3)
pyautogui.click(x=750, y=135)
time.sleep(1)
pyautogui.write(keyword)
time.sleep(1)
pyautogui.press('enter')
time.sleep(5)
pyautogui.hotkey('ctrl', 'a')
pyautogui.hotkey('ctrl', 'c')
time.sleep(0.1)

#raw = pyperclip.paste()
raw = pd.read_clipboard()

print(raw)

Using Pandas gives this error:

Traceback (most recent call last):
  File "c:\Users\smfah\OneDrive\Desktop\tmp\regex.py", line 32, in <module>
    raw = pd.read_clipboard()
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\clipboards.py", line 88, in read_clipboard
    return read_csv(StringIO(text), sep=sep, **kwargs)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\util\_decorators.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\util\_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\readers.py", line 950, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\readers.py", line 611, in _read
    return parser.read(nrows)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\readers.py", line 1778, in read
    ) = self._engine.read(  # type: ignore[attr-defined]
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\python_parser.py", line 282, in read
    alldata = self._rows_to_cols(content)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\python_parser.py", line 1045, in _rows_to_cols
    self._alert_malformed(msg, row_num + 1)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\python_parser.py", line 765, in _alert_malformed
    raise ParserError(msg)
pandas.errors.ParserError: Expected 4 fields in line 726, saw 7. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.
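
Note on this traceback: the ParserError is probably not an encoding problem. pd.read_clipboard() passes the clipboard text to read_csv with a whitespace separator (sep=r'\s+' by default), so free-form page text whose lines contain different numbers of words cannot be arranged into a table. The sketch below only approximates that check with the standard library; the sample string and the helper name field_counts are made up for illustration.

```python
from collections import Counter


def field_counts(text: str) -> Counter:
    """Count how many non-blank lines split into each number of whitespace-separated fields."""
    return Counter(len(line.split()) for line in text.splitlines() if line.strip())


# Made-up sample standing in for text copied from a web page.
sample = "Results for laptop\nSponsored\nAcme 15 inch laptop 8GB RAM\n"
counts = field_counts(sample)
print(counts)
```

If more than one field count shows up, the text is not tabular, and keeping it as a plain string (for example from pyperclip.paste()) is likely the better starting point for regex work.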

And using Pyperclip gives this error:

Traceback (most recent call last):
  File "c:\Users\smfah\OneDrive\Desktop\tmp\regex.py", line 45, in <module>
    print(raw)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u200c' in position 60: character maps to <undefined>
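
Note on this traceback: it is raised inside print(), in cp1252.py, so the clipboard read itself appears to have succeeded and the failure is in writing U+200C (zero width non-joiner) to a console that uses cp1252. One possible workaround, sketched here with a hypothetical helper printable_in and a made-up sample string, is to replace the characters the target encoding cannot represent before printing:

```python
def printable_in(text: str, encoding: str = "cp1252") -> str:
    """Return text with characters the target encoding cannot represent replaced by '?'."""
    return text.encode(encoding, errors="replace").decode(encoding)


# U+200C (zero width non-joiner) is the character named in the traceback.
raw = "laptop\u200c deals"
cleaned = printable_in(raw)
print(cleaned)
```

This only changes what is printed; the original string in raw keeps all of its characters and can still be processed with regex.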

However, if I hardcode the text in the code editor (VS Code on Windows 11) and don't print it, I can work with the hardcoded data (e.g. apply regex to it).

text = '''long block of text'''

But I want to work on the text copied into the clipboard. I tried applying various solutions, but none worked for me.

Note: This issue does not occur on Ubuntu 22.04, so it looks like a Windows-related issue.

Any help will be greatly appreciated! Thanks!

Answer 1

Score: 1

The Windows clipboard can be accessed with win32clipboard, which is part of the pywin32 package. To get the latest text from the clipboard:

import win32clipboard

# get clipboard data as Unicode text (a str, not bytes)
win32clipboard.OpenClipboard()
try:
    data = win32clipboard.GetClipboardData(win32clipboard.CF_UNICODETEXT)
finally:
    # always release the clipboard, even if reading fails
    win32clipboard.CloseClipboard()
print(data)

Note that win32clipboard is not part of the standard library. It ships with pywin32, which can be installed with pip install pywin32.
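
Reading the clipboard a different way may not be enough on its own, since the traceback in the question points at print() encoding to cp1252. A minimal sketch of one way around that, assuming Python 3.7+ where text streams have reconfigure(), is to switch the output stream to UTF-8 before printing. The helper name force_utf8 is hypothetical.

```python
import sys


def force_utf8(stream=None) -> bool:
    """Switch a text stream (sys.stdout by default) to UTF-8; return True on success."""
    if stream is None:
        stream = sys.stdout
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is None:
        # Some replacement streams (e.g. in IDEs) do not support reconfigure().
        return False
    reconfigure(encoding="utf-8", errors="replace")
    return True


force_utf8()
print("laptop\u200c deals")
```

Setting the environment variable PYTHONIOENCODING=utf-8 before starting Python should have a similar effect without changing the script. Whether the character displays correctly still depends on the terminal and its font.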

huangapple
  • Posted on March 21, 2023 at 00:51:43
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75793125.html