Optimizing a scraper


how to optimize a scraper

Question


I'm trying to figure out how to optimize some code. The purpose is to go through a word list (10k words), make a search query for each word, and then get the last result, printing it if the result is before a certain date.

The code:

```python
import requests
import json
import os

ln = 0
os.system("clear")

with open("wordlist.txt") as file:
    lines = file.readlines()  # the original post left this assignment blank

for line in lines:
    try:
        query = str(line)
        responsetext = requests.get("https://us-central1-sandtable-8d0f7.cloudfunctions.net/api/creations?title=" + query).text
        responsedict = json.loads(responsetext)
        length = int(len(responsedict))
        if length != 0:
            item = responsedict[length - 1]
            itemtimestamp = item["data"]["timestamp"]
            if str(itemtimestamp[:4]) == "2018" and int(itemtimestamp[8:10]) <= 14:
                itemtitle = item["data"]["title"]
                # itemid = item["data"]["id"]
                itemurl = "https://sandspiel.club/#" + item["data"]["id"]
                print(" Title: " + str(itemtitle))
                # print(" Post ID: " + itemid)
                print(" Post URL: " + itemurl)
                print(" Post date: " + itemtimestamp[:10])
                print(" Timestamp: " + itemtimestamp)
                print(" Word: " + query)
                # print(" Post time: " + itemtimestamp[12:19])
                open('posts.txt', 'w').writelines(itemtitle + "\n" + itemurl + "\n" + itemtimestamp + "\n")
        pass
    except:
        print(query + str(length) + " Error!")
        continue
    ln += 1

print("\n\n done!")
```

# Answer 1

**Score:** 1

Use the Session object from the requests library so you can reuse the underlying TCP connection. You could also use a single file object, so that you don't have to open and close the file on every iteration; f-strings are cleaner too. If possible, use a smaller word list or look into parallel processing.

```python
import os

import requests

os.system("clear")
session = requests.Session()  # reuses the underlying TCP connection across requests

with open("wordlist.txt") as file, open("posts.txt", "w") as output_file:
    lines = file.readlines()  # the original answer left this assignment blank
    for line in lines:
        length = 0  # defined up front so the except block can print it even if the request fails
        try:
            query = line.strip()  # drop the trailing newline before building the URL
            response = session.get(f"https://us-central1-sandtable-8d0f7.cloudfunctions.net/api/creations?title={query}")
            response_dict = response.json()
            length = len(response_dict)
            if length != 0:
                item = response_dict[length - 1]
                item_data = item["data"]
                item_timestamp = item_data["timestamp"]
                if item_timestamp.startswith("2018") and int(item_timestamp[8:10]) <= 14:
                    item_title = item_data["title"]
                    item_url = f"https://sandspiel.club/#{item_data['id']}"
                    print(f" Title: {item_title}")
                    print(f" Post URL: {item_url}")
                    print(f" Post date: {item_timestamp[:10]}")
                    print(f" Timestamp: {item_timestamp}")
                    print(f" Word: {query}")
                    output_file.writelines([item_title + "\n", item_url + "\n", item_timestamp + "\n"])
        except Exception as e:
            print(f"{query} {length} Error: {e}")
            continue

print("\n\n done!")
```
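The answer's last suggestion, parallel processing, can be sketched with the standard library's thread pool. This is a minimal sketch, not part of the original answer; `run_parallel` and the worker signature are illustrative names:

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(words, worker, max_workers=8):
    """Apply worker to each word concurrently, preserving input order.

    worker is any one-argument function; for this scraper it would wrap
    the session.get(...) call from the answer's loop body.
    """
    # Threads fit here because the job is I/O-bound: while one thread
    # waits on the network, the others can issue their own requests.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, words))


# Illustrative usage (assumed names, mirroring the answer's code):
# results = run_parallel(lines, lambda q: session.get(API + q.strip()).json())
```

Note that `pool.map` preserves input order and re-raises any worker exception when its result is consumed, so the per-word try/except belongs inside the worker function.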

huangapple
  • Posted on April 17, 2023 at 01:23:53
  • Please keep this link when reposting: https://go.coder-hub.com/76029287.html