Will createJob/createTask work for my function? What is the difference between creating multiple jobs and creating multiple tasks in one job?
Question
I want to run multiple completely independent scripts in parallel. They differ from each other only by one or two parameters, so I wrote the main part as a function and pass the parameters through createJob and createTask as follows:
% Run_DMRG_HubbardKondo
UList = [1, 2, 4, 8];
J_UList = [-1, 0:0.2:2];
c = parcluster;
c.NumThreads = 3;
j = createJob(c);
for iU = 1:numel(UList)
    for iJ_U = 1:numel(J_UList)
        t = createTask(j, @DMRG_HubbardKondo, 0, {{UList(iU), J_UList(iJ_U)}});
    end
end
submit(j);
wait(j,'finished')
delete(j);
clear j t
exit
function DMRG_HubbardKondo(U_Job, J_U_Job)
    ...% (skipped)
end
What if I call createJob multiple times, each with a single createTask? I know createJob has options such as AttachedFiles. But with respect to independence, is there any difference between the two approaches? I ask about independence because the DMRG_HubbardKondo function calls setenv, as follows:
function DMRG_HubbardKondo(U_Job, J_U_Job)
    ...% (skipped)
    DirTmp = '/tmp/swan';
    setenv('LMA', DirTmp)
    Para.DateStr = datestr(datetime('now'), 30);
    % RCDir is named by the parameters and the datetime
    Para.RCDir = [DirTmp, '/RCStore', Para.DateStr, sprintf('U%.4gJ%.4g', [U_Job, J_U_Job])];
    k = [strfind(Para.Symm, 'SU2'), strfind(Para.Symm, '-v')];
    if ~isempty(k)
        RC = Para.RCDir
        if exist(RC, 'dir') == 0
            mkdir(RC); % create the directory if it does not exist
            fprintf('%s made.\n', RC) % pass the path as data, not as the format string
        end
        setenv('RC_STORE', RC);
        setenv('CG_VERBOSE', '0');
    end
    ... % (skipped)
end
The main part of DMRG_HubbardKondo uses some mex-compiled functions that implement something like the Wigner-Eckart theorem. Specifically, they generate and retrieve data (Clebsch-Gordan (CG) coefficients) in RCDir at every step. I guess those mex-compiled functions find the corresponding RCDir through "getenv", and I want to know whether createJob/createTask will work correctly with that.
In summary, my questions are:
- What is the difference between creating multiple tasks in one job and creating multiple jobs, each with one task?
- Will createJob/createTask work for my function?
I know sbatch will work if I write a script that passes the parameters to Submit.sh, as follows:
function GenSubmitsh(partition, nodeNo, TLim, NCore, mem, logName, JobName, ParaName, ScriptName)
    if isnan(nodeNo)
        nodeStr = '##SBATCH --nodelist=auto \n';
    else
        nodeStr = sprintf('#SBATCH --nodelist=node%g \n', nodeNo);
    end
    Submitsh = sprintf([
        '#!/bin/bash -l \n', ...
        '#SBATCH --partition=%s \n', ...
        nodeStr, ...
        '#SBATCH --exclude=node1051 \n', ...
        '#SBATCH --time=%s \n', ...
        '#SBATCH --nodes=1 \n', ...
        '#SBATCH --ntasks=1 \n', ...
        '#SBATCH --cpus-per-task=%g \n', ...
        '#SBATCH --mem=%s \n', ...
        '#SBATCH --output=%s \n', ...
        '#SBATCH --job-name=%s \n', ...
        '\n', ...
        '##Do not remove or change this line in GU_CLUSTER \n', ...
        '##export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK \n', ...
        '\n', ...
        'echo "Job Started At" \n', ...
        'date \n', ...
        '\n', ...
        'matlab -nodesktop -nojvm -nodisplay -r "ParaName=''%s'',%s" \n', ...
        '\n', ...
        'echo "Job finished at" \n', ...
        'date \n'], ...
        partition, TLim, NCore, mem, logName, JobName, ParaName, ScriptName);
    fileID = fopen('Submit.sh', 'w');
    fprintf(fileID, '%s', Submitsh);
    fclose(fileID);
end
I hope createJob/createTask will work equivalently, i.e. with the runs completely independent of each other.
Answer 1
Score: 1
There are only minor differences between multiple createJob calls, each with a single createTask, and a single createJob call with multiple createTask calls. I would say it is generally better to use a single Job with multiple Tasks, unless you have a specific reason not to. Here are some considerations:
- Having a single Job object allows some of the stages of the submission process to be done once instead of multiple times (e.g. some parts of attaching files).
- It is possible (although admittedly awkward) to vectorise the calls to createTask. (This doesn't affect execution.)
- On the MATLAB Job Scheduler (MJS) system, you can set more properties per Job object, such as a range of workers to be used during execution.
- When using schedulers such as SLURM, multiple Tasks of a single Job can be submitted to the scheduler as a "job array", which I believe can be more efficient for the scheduler itself.
- When using schedulers other than MJS, each Task runs in a fresh MATLAB worker process, regardless of whether it is the only Task in a Job or not.
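For reference, the vectorised form of createTask mentioned in the list could look roughly like the sketch below. It reuses the parameter lists and the DMRG_HubbardKondo function from the question; the names UGrid, JGrid and argSets are only illustrative and do not appear in the original code. It relies on createTask accepting a cell array of cell arrays and creating one Task per inner cell, and it assumes DMRG_HubbardKondo is on the path of the workers.

```matlab
% Parameter lists from the question.
UList   = [1, 2, 4, 8];
J_UList = [-1, 0:0.2:2];

% Enumerate every (U, J/U) combination.
[UGrid, JGrid] = ndgrid(UList, J_UList);

% Build a row cell array with one {U, J_U} argument list per combination.
% (argSets is a hypothetical name used only in this sketch.)
argSets = arrayfun(@(u, jOverU) {u, jOverU}, UGrid(:).', JGrid(:).', ...
                   'UniformOutput', false);

c = parcluster;          % default cluster profile, as in the question
c.NumThreads = 3;
j = createJob(c);

% A cell array of cell arrays creates one Task per inner cell in a single call.
createTask(j, @DMRG_HubbardKondo, 0, argSets);

submit(j);
wait(j, 'finished');
delete(j);
```

This creates the same set of Tasks as the nested loops in the question, so it is a matter of convenience rather than behaviour.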