2023-07-06 20:39:37 · Category: go

GNU Parallel not improving performance

Question

I have the following script, which zips a series of files and directories in the current directory in parallel and stores the archives in a temporary directory.


# cd into a directory

tmp_directory="/tmp/test-1"

# Zip each top-level entry in parallel (one job per core), then merge the archives
mkdir "${tmp_directory}" && \
find . -mindepth 1 -maxdepth 1 | \
parallel -j "$(parallel --number-of-cores)" zip -r "${tmp_directory}"/{/.}.zip {} && \
zipmerge -k ../test-1.zip "${tmp_directory}"

# a zip archive called test-1.zip should be generated at ../ (relative to the current directory)

Note that to run the above code, GNU parallel and zipmerge must be installed.

The problem is that zipping the directory sequentially, without parallel, is faster than zipping it with parallel. The process is definitely not I/O bound: according to top/htop, the CPU is at 100% and the read/write speeds are very low (about 10 MB/s). I'm running this on an 8-core machine.

One thing I noticed is that when I first start the script, the CPU usage on 4 cores jumps to 100% and then drops rapidly until only one core is fully utilized.

I have no idea why it's not speeding up.

EDIT: iostat output (M2 MacBook Air)

disk0 
    KB/t  tps  MB/s 
   25.62    4  0.10 
    8.00    8  0.06 
    8.00    8  0.06 
    0.00    0  0.00 
    0.00    0  0.00 
    0.00    0  0.00 
   84.55 1323 109.22 
  727.63  117 83.47 
  228.24  249 55.44 
  955.80   60 55.74 
  927.55   53 47.76 
  870.12   66 55.97 
  925.10   62 55.85 
  398.56  171 66.60 
  241.23  362 85.31 
  428.89 1925 806.33 
  180.50  765 134.81 
    4.46   95  0.41 
    6.13   15  0.09

Any help would be appreciated.

Answer 1

Score: 1

This sounds like it is I/O bound.

iostat -dkx 1

is excellent for seeing how busy a disk is. However much I love top, it is useless for this.
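The flags above are those of the Linux (sysstat) iostat. The question mentions a MacBook, and as far as I know the iostat shipped with macOS has no -x option, so a small wrapper can pick the invocation per platform. This is a rough sketch: iostat_cmd is a hypothetical helper name, and the macOS flags (-d for device statistics, -w for the interval) are my assumption of the closest equivalent.

```shell
# Hypothetical helper: print an iostat invocation for the given kernel name
# (defaults to the kernel name of the current machine).
# Assumption: Linux uses sysstat's iostat; macOS uses the BSD-style iostat.
iostat_cmd() {
  iostat_os=${1:-$(uname -s)}
  case "$iostat_os" in
    Linux)  echo "iostat -dkx 1" ;;   # extended per-device stats in kB, every second
    Darwin) echo "iostat -d -w 1" ;;  # device stats only, every second
    *)      echo "unsupported platform: $iostat_os" >&2; return 1 ;;
  esac
}
```

Running $(iostat_cmd) in a second terminal while the zip jobs are active shows the disk load. On Linux, the %util column is the one to watch; on macOS the output has the KB/t, tps and MB/s columns shown in the question.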

Your problem is likely not bandwidth, but seeking.

On magnetic drives it will often be faster to read sequentially than in parallel. NVMe drives typically have a sweet spot of parallelization (often 4 or 8). RAID and network drives will often also have a sweet spot, but it depends on the drive.
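One way to look for that sweet spot is simply to time the same job at several -j values. The following is a minimal sketch, not a rigorous benchmark: sweep_jobs and the JOBS variable are hypothetical names introduced here, timing is in whole seconds, and the file cache warms up after the first run, so later runs may look faster than they would on a cold cache.

```shell
# Hypothetical helper (not part of GNU parallel): run a command once per job
# count and print how long each run took, in whole seconds.
# Usage: sweep_jobs "<space-separated job counts>" command [args...]
# The command can read the current job count from the JOBS environment variable.
sweep_jobs() {
  sweep_counts=$1
  shift
  for sweep_j in $sweep_counts; do
    sweep_start=$(date +%s)
    JOBS=$sweep_j "$@" >/dev/null 2>&1 ||
      printf '%s jobs: command failed\n' "$sweep_j" >&2
    sweep_end=$(date +%s)
    printf '%s jobs: %ss\n' "$sweep_j" "$((sweep_end - sweep_start))"
  done
}

# Example, assuming GNU parallel and zip are installed and /tmp/sweep is a scratch path:
# sweep_jobs "1 2 4 8" sh -c '
#   rm -rf /tmp/sweep && mkdir /tmp/sweep &&
#   find . -mindepth 1 -maxdepth 1 | parallel -j "$JOBS" zip -qr /tmp/sweep/{/.}.zip {}'
```

If the time stops improving, or gets worse, beyond a certain job count, that count is probably close to the sweet spot for the drive in question.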

PS: parallel -j $(parallel --number-of-cores) is the same as parallel --use-cores-instead-of-threads
