Multiple brotli processes at the same time
Question
I currently have to compress several thousand files (~40-80 MB each) with brotli and get them ready for an S3 bucket.
From what I've researched so far, brotli can't multithread the compression, so brotli.exe uses only ~10% of the CPU. How can I iterate through the files in a folder and spawn multiple brotli.exe processes to work at the same time (8-10 processes should fill the CPU)?
Windows batch, PowerShell or VBS: I can try any suggestions.
At the moment, I'm running this batch:
for /R %%f in (*.) do (
"brotli" -Z "--output=E:\output\brotli\%%~nf" "%%f"
)
Answer 1
Score: 0
@ECHO OFF
SETLOCAL
:: set limit to #jobs
SET /a limit=8
:: establish a subdirectory in %temp%
SET "control=%temp%\brotlicontrol"
MD "%control%" 2>NUL
:: Dummy for testing
for %%f IN (fred anna george bill betty carl celia daphne john kelly ian zoe brian
tracey susan colin jane selina valerie david stephen) DO (
rem for /R %%f in (*.) do (
CALL :wait
START /min "brotli %%~nf" q75403766_2 "%%f"
)
GOTO :EOF
:wait
SET /a running=0
FOR /f %%y IN ('DIR /a-d /b "%control%\*.flg" 2^>nul ^|FIND /c ".flg" ') DO SET /a running=%%y
IF %running% geq %limit% timeout /t 1 >nul&GOTO wait
GOTO :eof
The above is the main batch, which starts a subsidiary batch (q75403766_2):
@echo off
setlocal
ECHO.>"%control%\%~n1.flg"
REM "brotli" -Z "--output=E:\output\brotli\%~n1" %1
:: Dummy - variable timeout 5-20 seconds
SET /a exectime=(%RANDOM% %% 16) + 5
timeout /t %exectime% >nul
del "%control%\%~n1.flg"
EXIT
I had %%f iterate through a list of names for testing. All you need to do is remove that test code and use your original code, which I remmed out, to process your list of files.
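As a minimal sketch of that swap, the main batch could look like this once the dummy name list is replaced by the loop from the question. It assumes the subsidiary batch is saved as q75403766_2.bat where cmd can find it, and it copies the *. mask from the question unchanged.

```batch
@ECHO OFF
SETLOCAL
:: Maximum number of brotli jobs allowed to run at once
SET /a limit=8
:: Control directory holding one .flg file per running job
SET "control=%temp%\brotlicontrol"
MD "%control%" 2>NUL
:: Loop from the question, used in place of the dummy name list
for /R %%f in (*.) do (
 CALL :wait
 START /min "brotli %%~nf" q75403766_2 "%%f"
)
GOTO :EOF

:wait
:: Count the .flg files; loop here while the limit is reached
SET /a running=0
FOR /f %%y IN ('DIR /a-d /b "%control%\*.flg" 2^>nul ^|FIND /c ".flg" ') DO SET /a running=%%y
IF %running% geq %limit% timeout /t 1 >nul&GOTO wait
GOTO :eof
```

Note that %%~nf keeps only the base name, so two files with the same name in different subfolders would share one flag file and one output name.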
The process calls the :wait routine, which counts the .flg files in the temporary directory and sets running to that value.
If the number running is greater than or equal to (geq) the limit established in the initialisation, the routine waits 1 second and tries again. Otherwise the :wait routine terminates and the subsidiary batch q75403766_2 is started minimised (/min) with the title brotli nameoffile. It's important that the first quoted parameter to start exists, as it's used as the title of the started process. You could use "" if you want (for no title), but you should not omit this title string.
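The title rule matters most when the program path itself has to be quoted. The sketch below illustrates how start treats its first quoted argument; the tool path is hypothetical and only there to show the parsing.

```batch
@echo off
:: Hypothetical install path, used only to illustrate argument parsing
set "tool=C:\Program Files\brotli\brotli.exe"

:: Pitfall (left commented out): with no title given, the quoted path is
:: consumed as the window title and START tries to run what follows it
rem START /min "%tool%" --help

:: Safer: supply an explicit title first, even an empty one
START "" /min "%tool%" --help
```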
The sub-process started (q75403766_2) first creates a .flg file with the name of the file being processed in the control directory, then runs the brotli job (remmed out again; I added a few lines to create a variable timeout to simulate the brotli processing time), then deletes the control file and exits.
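For real use, the subsidiary batch might be reduced to the sketch below, with the brotli line from the question enabled and the simulated delay removed. It assumes brotli is on the PATH and that the output folder from the question already exists.

```batch
@echo off
setlocal
:: Flag this job as running; %control% is inherited from the main batch
ECHO.>"%control%\%~n1.flg"
:: The brotli command from the question; %1 arrives already quoted
"brotli" -Z "--output=E:\output\brotli\%~n1" %1
:: Job done: remove the flag so the main batch can start another job
del "%control%\%~n1.flg"
EXIT
```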
The carets before the redirectors in the for loops tell cmd that the redirection is to be applied to the command being executed, not the for. 2>nul (+caret) says "redirect error messages (file not found) to nowhere (i.e. discard them)".
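To see the escaping in isolation, the counting line can be run on its own. This is a small sketch that reuses the control directory name from the main batch and reports how many jobs are currently flagged.

```batch
@echo off
setlocal
set "control=%temp%\brotlicontrol"
set /a running=0
:: Inside the quoted FOR /F command, > and | are escaped with ^ so they are
:: passed on to the DIR ... FIND command instead of being applied to FOR itself
for /f %%y in ('dir /a-d /b "%control%\*.flg" 2^>nul ^| find /c ".flg"') do set /a running=%%y
echo %running% job(s) currently flagged as running
```

With no .flg files present, the error message from dir is discarded and find /c reports 0.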