Is there a way to multithread or batch REST API calls in Python?

Question

I've got a very long list of keys, and I am calling a REST API with each key to GET some metadata about it.

The API can only accept one key at a time, but I wondered if there was a way I could batch or multi-thread the calls from my side?

Answer 1

Score: 0

Yes. One way to multithread REST API calls in Python is the concurrent.futures module, which provides a high-level interface for executing functions asynchronously using threads or processes. Because each call spends most of its time waiting on the network, a thread pool is usually sufficient here.

Here is an example that uses concurrent.futures to make the REST API calls on multiple threads, in batches:

import requests
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

API_ENDPOINT = 'https://api.example.com/metadata'

def get_metadata_for_key(key):
    # Fetch the metadata for one key; return None on any non-200 response.
    url = f"{API_ENDPOINT}/{key}"
    response = requests.get(url)
    if response.status_code == 200:
        return response.json()
    return None

def get_metadata(keys):
    results = []
    # A single iterator, so that each islice call continues where the last one stopped.
    key_iter = iter(keys)
    with ThreadPoolExecutor(max_workers=5) as executor:
        # Take 5 keys at a time until islice returns an empty batch.
        for batch in iter(lambda: list(islice(key_iter, 5)), []):
            futures = [executor.submit(get_metadata_for_key, key) for key in batch]
            # Wait for the whole batch; results stay in the same order as the keys.
            results += [future.result() for future in futures]
    return results

In this example, get_metadata takes a list of keys and uses a ThreadPoolExecutor to run get_metadata_for_key for each key, in batches of 5. islice takes the next 5 keys from a single iterator over the input; calling it on the list itself would return the first 5 keys every time and the loop would never end. executor.submit submits one task to the thread pool per key and returns a concurrent.futures.Future object. future.result() blocks until that task has finished and returns its result, which is appended to the results list. Keys whose request did not return status 200 appear as None in the output.

You can change the max_workers parameter to control the number of threads used to execute tasks. This example uses 5 threads.
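The thread pool already caps how many requests run at once, so the manual batching is optional. Below is a minimal sketch of the same idea using executor.map. The names fetch_all and fetch are illustrative and not from the answer above; fetch is assumed to be any function that takes one key and returns its metadata, such as get_metadata_for_key.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(keys, fetch, max_workers=5):
    """Call fetch(key) for every key on a thread pool.

    `fetch` is a caller-supplied function (an assumption of this sketch).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # At most max_workers calls run at the same time, and map yields
        # the results in the same order as the input keys.
        return list(executor.map(fetch, keys))
```

With this shape, a slow request only occupies one worker instead of holding up a whole batch. If fetch raises, the exception is re-raised when its result is collected, so a fetch function that returns None on failure (as above) keeps the run going.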

Answer 2

Score: 0

The other reply to this looks like ChatGPT so it should be ignored.

I did, however, use its code as a base to write a function that does what I want.

import requests
import xmltodict
from concurrent.futures import ThreadPoolExecutor
from tqdm import tqdm

API_ENDPOINT = 'https://api.example.com/metadata'

def get_metadata_for_key(key):
    url = f"{API_ENDPOINT}/{key}"
    response = requests.get(url)
    if response.status_code == 200:
        # Assumption: the API returns XML, so return the raw text for xmltodict.
        # For a JSON API, return response.json() and drop xmltodict.parse below.
        return response.text
    return None

def get_save_metadata(keys, workers):
    results = {}
    # Split the keys into batches of `workers` keys each.
    batches = [keys[i : i + workers] for i in range(0, len(keys), workers)]

    with ThreadPoolExecutor(max_workers=workers) as executor:
        for batch in tqdm(batches):     # tqdm shows a progress bar
            futures = {key: executor.submit(get_metadata_for_key, key) for key in batch}
            # Wait for the batch, then skip keys whose request failed (result is None).
            bodies = {k: f.result() for k, f in futures.items()}
            results.update({k: xmltodict.parse(v) for k, v in bodies.items() if v is not None})

    return results
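A possible variation, sketched under assumptions: submit every key up front and collect each result as soon as it finishes with concurrent.futures.as_completed, so one slow key does not delay the rest of its batch. The names fetch_by_key and fetch are hypothetical; fetch stands for any one-key function such as get_metadata_for_key, and XML parsing is left out to keep the sketch self-contained.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_by_key(keys, fetch, workers=5):
    """Return (results, errors): {key: value} and {key: exception}.

    `fetch` is a caller-supplied function (an assumption of this sketch).
    Keys for which fetch returns None are left out of results.
    """
    results = {}
    errors = {}
    with ThreadPoolExecutor(max_workers=workers) as executor:
        # Map each future back to the key it was submitted for.
        future_to_key = {executor.submit(fetch, key): key for key in keys}
        # as_completed yields futures in the order they finish.
        for future in as_completed(future_to_key):
            key = future_to_key[future]
            try:
                value = future.result()
            except Exception as exc:
                # Record the failure and keep processing the other keys.
                errors[key] = exc
                continue
            if value is not None:
                results[key] = value
    return results, errors
```

For a progress bar, the as_completed(...) call could be wrapped as tqdm(as_completed(future_to_key), total=len(future_to_key)).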

huangapple
  • Posted on 2023-08-11 02:46:46
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76878564.html