Should I use worker threads for sending more than 5000 HTTP requests?

Question

If I want to send messages to more than 5000 users in Node.js, which takes a long time, and I don't want it to be render blocking, is "Worker threads" what I need?

The documentation says:

> Workers (threads) are useful for performing CPU-intensive JavaScript operations. They do not help much with I/O-intensive work. The Node.js built-in asynchronous I/O operations are more efficient than Workers can be.

What is the "Node.js built-in asynchronous I/O operations" it is referring to?

I need to loop over more than 5000 contacts and, for each one:

  • store a new row in the database for each contact.
  • send a message with an Axios HTTP request to an external API for each contact.

Is this considered an "I/O operation"?

Answer 1

Score: 1

When using third-party APIs with rate-limiting, I like to pace my asynchronous requests to match their rate limits. The number of concurrent requests will vary based on latency, but it shouldn't ever be enough to overtax your resources.

const rateLimiter = {
    // Minimum gap in milliseconds between request starts, per label
    paces: { myList: 13 },
    // Timestamp (ms) of the most recent start, per label
    starts: {},
    sleep(ms) { return new Promise(resolve => setTimeout(resolve, ms)); },
    async pace(label) {
        if (this.starts[label]) {
            const remainingTime = this.paces[label] + this.starts[label] - Date.now();
            if (remainingTime > 0) await this.sleep(remainingTime);
        }
        this.starts[label] = Date.now();
    }
};

async function processList(myList) {
    for (const member of myList) {
        await rateLimiter.pace('myList');
        // APICall is a placeholder for your own request function. It is not awaited,
        // so requests overlap while their start times stay paced.
        APICall(member).then(/* handle it */).catch(console.error);
    }
}
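To try the pacing idea without a real API, here is a self-contained sketch. The names `makePacer`, `fakeApiCall`, and `processListPaced` are made up for illustration, and the 30 ms fake latency is an arbitrary assumption.

```javascript
// Returns an async function that waits until roughly `intervalMs`
// have passed since the previous call was released.
function makePacer(intervalMs) {
  let lastStart = 0;
  return async function pace() {
    const wait = lastStart + intervalMs - Date.now();
    if (wait > 0) await new Promise((resolve) => setTimeout(resolve, wait));
    lastStart = Date.now();
  };
}

// Hypothetical stand-in for a real API call; resolves after a short delay.
function fakeApiCall(member) {
  return new Promise((resolve) => setTimeout(() => resolve(`ok:${member}`), 30));
}

async function processListPaced(members, intervalMs) {
  const pace = makePacer(intervalMs);
  const pending = [];
  for (const member of members) {
    await pace();
    // Not awaited here, so slow responses do not delay the next start.
    pending.push(fakeApiCall(member));
  }
  // Wait for every request to settle before returning.
  return Promise.allSettled(pending);
}

processListPaced(['a', 'b', 'c'], 13).then((results) => console.log(results.length));
```

Collecting the promises and using `Promise.allSettled` means one failed request does not hide the outcome of the others.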

Answer 2

Score: -3

Yes, for your use case, the operations you described (storing a new row in the database and sending an HTTP request to an external API for each contact) are considered I/O operations. I/O stands for Input/Output, and it refers to operations that involve reading from or writing to external resources, such as databases, files, or network requests.

Node.js is well suited to I/O-intensive work because its I/O is non-blocking and asynchronous. When Node.js starts an I/O operation (e.g., reading a file or making an HTTP request), it does not block the program while waiting for the result. The operation is handed off to the operating system or to libuv, the single JavaScript thread moves on to other tasks, and your callback or promise continuation runs once the operation completes.
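A small sketch of what "non-blocking" means in practice. It uses `fs.stat` on the temp directory only as a convenient I/O call; the function name `demoNonBlocking` is made up for illustration.

```javascript
const { stat } = require('fs');
const os = require('os');

// Resolves with the order in which the two steps actually ran.
function demoNonBlocking() {
  return new Promise((resolve, reject) => {
    const order = [];
    // Start the I/O operation; the callback runs later, on a future
    // turn of the event loop.
    stat(os.tmpdir(), (err) => {
      if (err) return reject(err);
      order.push('stat finished');
      resolve(order);
    });
    // This runs immediately, before the stat result is delivered.
    order.push('main thread kept going');
  });
}

demoNonBlocking().then((order) => console.log(order.join(' -> ')));
// prints "main thread kept going -> stat finished"
```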

The Node.js built-in asynchronous I/O operations include:

  1. File System Operations: Reading and writing files using functions like fs.readFile, fs.writeFile, etc.

  2. Network Operations: Making HTTP requests or other network-related tasks using modules like http, https, net, etc.

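  3. Database Operations: Performing queries and CRUD (Create, Read, Update, Delete) operations through third-party driver packages such as mysql or mongodb. These are not built into Node.js, but they talk to the database over the same asynchronous network layer.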
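As a rough illustration of the built-in promise-based file API, the sketch below starts several writes at once and then reads the files back. The helper name `roundTrip` and the file names are assumptions for the example.

```javascript
const { mkdtemp, writeFile, readFile, rm } = require('fs/promises');
const os = require('os');
const path = require('path');

// Write several small files concurrently, then read them back.
async function roundTrip(names) {
  const dir = await mkdtemp(path.join(os.tmpdir(), 'io-demo-'));
  try {
    // Every write is started before any of them is awaited, so they overlap.
    await Promise.all(
      names.map((name) => writeFile(path.join(dir, name), `hello ${name}`))
    );
    return await Promise.all(
      names.map((name) => readFile(path.join(dir, name), 'utf8'))
    );
  } finally {
    // Clean up the temporary directory whether or not the I/O succeeded.
    await rm(dir, { recursive: true, force: true });
  }
}

roundTrip(['a.txt', 'b.txt']).then((contents) => console.log(contents));
```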

Now, regarding the use of Worker Threads for your specific scenario:

Worker Threads: As the documentation states, worker threads are primarily useful for CPU-intensive operations. If your task involves a significant amount of computational work that can be parallelized (like heavy calculations), then using Worker Threads could be beneficial. It allows you to offload CPU-intensive tasks to separate threads, leaving the main thread available for other work.
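For contrast, this is roughly what a CPU-bound job on a worker thread looks like. The worker source is passed as a string with `eval: true` only to keep the sketch in one file; `sumInWorker` and the summing loop are illustrative stand-ins for real computation.

```javascript
const { Worker } = require('worker_threads');

// Code that runs inside the worker thread.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  // CPU-bound loop: sum the integers from 1 to workerData.
  let total = 0;
  for (let i = 1; i <= workerData; i++) total += i;
  parentPort.postMessage(total);
`;

// Runs the sum on a separate thread and resolves with the result.
function sumInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: n });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

sumInWorker(1e6).then((total) => console.log(total));
```

The main thread stays free while the loop runs. None of this helps with HTTP requests or database writes, which spend their time waiting rather than computing.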

However, for I/O-intensive tasks like the ones you mentioned (database operations and making HTTP requests), using Worker Threads might not provide significant benefits. Node.js's built-in asynchronous I/O model is already designed to handle such tasks efficiently.

Handling I/O-Intensive Tasks Efficiently: To process 5000+ contacts without blocking the event loop, use asynchronous programming techniques such as Promises, async/await, or callbacks. Each I/O operation is started without blocking, and Node.js resumes your code when the result arrives.

Here's a high-level example using async/await (requires Node.js 7.6+):

const axios = require('axios');

// `db.storeContact` and 'external-api-url' are placeholders for your own
// database layer and endpoint.
async function processContacts(contacts) {
  for (const contact of contacts) {
    // Store a new row in the database. Each await finishes before the next
    // contact starts, so contacts are handled one at a time.
    await db.storeContact(contact);

    // Send the HTTP request to the external API
    await axios.post('external-api-url', contact);
  }
}

// Assuming you have an array of contacts
const contacts = [/* your array of 5000+ contacts */];
processContacts(contacts)
  .then(() => console.log('All contacts processed successfully!'))
  .catch((error) => console.error('Error processing contacts:', error));

By using asynchronous I/O operations with async/await or Promises, Node.js will efficiently handle the database and HTTP operations in a non-blocking way, making sure your application remains responsive and performant.
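Awaiting each contact in turn keeps the event loop free but handles only one contact at a time, which can be slow for 5000+ contacts. One possible middle ground is to cap the number of operations in flight. The sketch below is one way to do that; `mapWithLimit` and `sendToContact` are hypothetical names, and the limit of 2 is arbitrary.

```javascript
// Run `task` over every item with at most `limit` tasks in flight at once.
// Failures are recorded per item instead of aborting the whole batch.
async function mapWithLimit(items, limit, task) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = { ok: true, value: await task(items[index]) };
      } catch (error) {
        results[index] = { ok: false, error };
      }
    }
  }

  const runners = Array.from({ length: Math.min(limit, items.length) }, runner);
  await Promise.all(runners);
  return results;
}

// Hypothetical stand-in for "store a row, then send the message".
async function sendToContact(contact) {
  await new Promise((resolve) => setTimeout(resolve, 5));
  return `sent:${contact}`;
}

mapWithLimit(['a', 'b', 'c'], 2, sendToContact).then((r) => console.log(r));
```

A suitable limit depends on the external API's rate limit and on your database connection pool, so treat it as something to tune.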

Posted by huangapple on July 20, 2023, 12:03:55
When reposting, please keep the link to this article: https://go.coder-hub.com/76726602.html