Follow these instructions to configure advanced options during integration of MATLAB® Job Scheduler with your cluster.
Note
If this is the first time you integrate MATLAB Job Scheduler, see the following for the most common configuration options: Install and Configure MATLAB Parallel Server for MATLAB Job Scheduler and Network License Manager.
In the following instructions, matlabroot refers to the location of your installed MATLAB Parallel Server™ software. Where you see this term used in the instructions that follow, substitute the path to your location.
You can upgrade your MATLAB Job Scheduler clusters and continue to connect to them from MATLAB desktop clients that run Parallel Computing Toolbox release R2016a or later. To take advantage of this backward compatibility feature:
Install the latest version of MATLAB Parallel Server on your cluster. You must use this version to run MATLAB Job Scheduler on your cluster.
Install MATLAB Parallel Server for each release that you want to support in the cluster. For example, to use R2016a and R2016b with your cluster, install both the R2016a and R2016b releases of MATLAB Parallel Server.
Configure MATLAB Job Scheduler with the location of these installations. In the mjs_def configuration file, specify the location of each installation of MATLAB Parallel Server in the MJS_ADDITIONAL_MATLABROOTS variable. You can find this file in matlabroot/toolbox/parallel/bin, as mjs_def.sh for Linux and mjs_def.bat for Windows. For more information, see mjs.
With this configuration, the MATLAB Job Scheduler allows MATLAB clients from the installed releases to submit jobs to the cluster. The MATLAB Job Scheduler dynamically starts the right version of the MATLAB worker to run the job.
If this is the first installation of MATLAB Parallel Server on a cluster of Windows machines, you need to configure these hosts for job communications.
Note
If you do not have a Windows cluster, or if you have already installed a previous version of MATLAB Parallel Server on your Windows cluster, you can skip this step.
If you are using Windows® firewalls on your cluster nodes, follow these steps:
Log in as a user with administrator privileges.
Execute the following in a DOS command window.
matlabroot\toolbox\parallel\bin\addMatlabToWindowsFirewall.bat
This command adds MATLAB as an allowed program. If you are using other firewalls, you must configure them for similar accommodation.
The user that mjs runs as requires access to the cluster MATLAB installation location. By default, mjs runs as the user LocalSystem. If your network allows LocalSystem to access the install location, you can skip this step. (If you are not sure of your network configuration and the access provided for LocalSystem, contact the MathWorks install support team.)
Note
If LocalSystem cannot access the install location, you must run mjs as a different user.
You can set a different user with these steps:
With any standard text editor (such as WordPad), open the mjs_def file found at:
matlabroot\toolbox\parallel\bin\mjs_def.bat
Find the line for setting the MJSUSER parameter, and provide a value in the form domain\username:
set MJSUSER=mydomain\myusername
Provide the user password by setting the MJSPASS parameter:
set MJSPASS=password
Save the file.
The mjs service uses as many ports as required, starting with BASE_PORT. By default, BASE_PORT is 27350.
If you use a machine that runs a total of nJ job managers and nW workers, the mjs service reserves a total of 6+2*nJ+4*nW consecutive ports for its own use. All job managers and workers that are going to work together, even those on different hosts, must use the same base port. Otherwise, the job managers and workers cannot contact each other. In addition, MPI communication occurs on ports starting at BASE_PORT+1000 and uses 2*nW consecutive ports.
For example, if you use a machine with 1 job manager and 16 workers, then you need the following ranges of ports to be open:
27350 – 27422 for the mjs service.
28350 – 28382 for MPI communication.
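The port arithmetic above can be sketched as a small shell script. This is only an illustration: the function names are hypothetical, and the upper bound of each range is printed as the first port plus the port count, which is the convention the example above appears to follow.

```shell
#!/bin/sh
# Port arithmetic from the text above. Function names are illustrative only.

# Ports reserved by the mjs service: 6 + 2*nJ + 4*nW
mjs_port_count() {   # $1 = nJ (job managers), $2 = nW (workers)
    echo $((6 + 2 * $1 + 4 * $2))
}

# Ports used for MPI communication: 2*nW
mpi_port_count() {   # $1 = nW (workers)
    echo $((2 * $1))
}

# Print "first - last", with last = first + count as in the example above.
port_range() {       # $1 = first port, $2 = number of ports
    echo "$1 - $(($1 + $2))"
}

BASE_PORT=27350      # default value; change it if your cluster uses another base port
nJ=1
nW=16

echo "mjs service: $(port_range "$BASE_PORT" "$(mjs_port_count "$nJ" "$nW")")"
echo "MPI:         $(port_range "$((BASE_PORT + 1000))" "$(mpi_port_count "$nW")")"
```

With the default base port, 1 job manager, and 16 workers, the script prints the two ranges listed above. Change nJ, nW, or BASE_PORT to see which ports to open for a different machine.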
To connect from MATLAB to a cluster with a non-default BASE_PORT, you must append the value of BASE_PORT to the 'Host' property in the MATLAB Job Scheduler cluster profile. You must do this in the form Hostname:BASE_PORT, for example myMJSHost:44001.
If you have an older version of MATLAB Parallel Server running on your cluster nodes, you should stop the mjs services before starting the services of the new installation.
Open a DOS command window with the necessary privileges:
If you are using Windows 7 or Windows Vista™, you must run the command window with administrator privileges. Click the Windows menu Start > (All) Programs > Accessories; then right-click Command Window, and select Run as Administrator. This option is available only if you are running User Account Control (UAC).
If you are using Windows XP, open a DOS command window by selecting the Windows menu Start > Run, then in the Open field, type
cmd
In the command window, navigate to the folder of the old installation that contains the control scripts.
cd oldmatlabroot\toolbox\parallel\bin
Stop and uninstall the old service and remove its associated files by typing the following command.
mjs uninstall -clean
In releases before R2019a, the service is called mdce. Type the following commands instead.
cd oldmatlabroot\toolbox\distcomp\bin
mdce uninstall -clean
Note
Using the -clean flag permanently removes all existing job data. Be sure this data is no longer needed before removing it.
Repeat the instructions of this step on all worker nodes.
Log in as root. If you cannot log in as root, you must alter the following parameters in the oldmatlabroot/toolbox/parallel/bin/mjs_def.sh file to point to a folder for which you have write privileges: CHECKPOINTBASE, LOGBASE, PIDBASE, and LOCKBASE if applicable. In releases before R2019a, this file is oldmatlabroot/toolbox/distcomp/bin/mdce_def.sh instead.
On each cluster node, stop the mjs service and remove its associated files by typing the commands:
cd oldmatlabroot/toolbox/parallel/bin
./mjs stop -clean
In releases before R2019a, the service is called mdce. Type the following command instead.
cd oldmatlabroot/toolbox/distcomp/bin
./mdce stop -clean
Note
Using the -clean flag permanently removes all existing job data. Be sure this data is no longer needed before removing it.
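The choice between the mjs and mdce commands depends only on the release of the old installation. The following dry-run sketch prints the stop commands for a given release instead of running them. The helper name stop_commands and the /opt/MATLAB paths are hypothetical. The sketch assumes release names of the form R2018b.

```shell
#!/bin/sh
# Dry-run sketch: prints the stop commands for an old installation.
# stop_commands and the example paths are hypothetical.

stop_commands() {    # $1 = old installation root, $2 = release, e.g. R2018b
    year=${2#R}          # strip the leading "R"      -> 2018b
    year=${year%[ab]}    # strip the trailing a or b  -> 2018
    if [ "$year" -lt 2019 ]; then
        # Before R2019a, the service is called mdce
        echo "cd $1/toolbox/distcomp/bin"
        echo "./mdce stop -clean"
    else
        echo "cd $1/toolbox/parallel/bin"
        echo "./mjs stop -clean"
    fi
}

stop_commands /opt/MATLAB/R2018b R2018b
stop_commands /opt/MATLAB/R2019a R2019a
```

Because the -clean flag removes all existing job data, review the printed commands before running them on each node.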
Before starting the mjs service on your cluster nodes, set a security level. For instructions, see Set the Security Level. For additional security considerations, see Set MATLAB Job Scheduler Cluster Security.
You can start MATLAB Job Scheduler using a graphical interface or the command line. For instructions on how to use the graphical interface, see Configure the MATLAB Job Scheduler. To use the graphical interface, Admin Center, you must run it on a computer that has direct network connectivity to all the nodes of your cluster. If you cannot run Admin Center on such a computer, you must use the command-line interface. For instructions on how to use the command-line interface, follow the next steps.
Start the mjs Service
You must install the mjs service on all nodes (head node and worker nodes). Begin on the head node.
Open a DOS command window with the necessary privileges:
If you are using Windows 7 or Windows Vista, you must run the command window with administrator privileges. Click the Windows menu Start > (All) Programs > Accessories; then right-click Command Window, and select Run as Administrator. This option is available only if you are running User Account Control (UAC).
If you are using Windows XP, open a DOS command window by selecting the Windows menu Start > Run, then in the Open field, type:
cmd
In the DOS command window, navigate to the folder with the control scripts:
cd matlabroot\toolbox\parallel\bin
Install the mjs service by typing the command:
mjs install
Start the mjs service by typing the command:
mjs start
Repeat the instructions of this step on all worker nodes.
As an alternative to items 3–5, you can install and start the mjs service on several nodes remotely from one machine by typing:
cd matlabroot\toolbox\parallel\bin
remotemjs install -remotehost hostA,hostB,hostC . . .
remotemjs start -remotehost hostA,hostB,hostC . . .
where hostA,hostB,hostC refers to a list of your host names. Note that there are no spaces between host names, only a comma. If you need to indicate protocol, platform (such as in a mixed environment), or other information, see the help for remotemjs by typing:
remotemjs -help
Once installed, the mjs service starts running each time the machine reboots. The mjs service continues to run until explicitly stopped or uninstalled, regardless of whether a MATLAB Job Scheduler or worker session is running.
Start the MATLAB Job Scheduler
To start the MATLAB Job Scheduler, enter the following commands in a DOS command window. You do not have to be at the machine on which the MATLAB Job Scheduler runs, as long as you have access to the MATLAB Parallel Server installation.
In your DOS command window, navigate to the folder with the startup scripts:
cd matlabroot\toolbox\parallel\bin
Start the MATLAB Job Scheduler, using any unique text you want for the name <MyMJS>:
startjobmanager -name <MyMJS> -remotehost <MATLAB Job Scheduler host name> -v
Verify that the MATLAB Job Scheduler is running on the intended host.
nodestatus -remotehost <MATLAB Job Scheduler host name>
Note
If you are executing startjobmanager on the host where the MATLAB Job Scheduler runs, you do not need to specify the -remotehost flag.
If you have more than one MATLAB Job Scheduler on your cluster, each must have a unique name.
Start the Workers
Note
Before you can start a worker on a machine, the mjs service must already be running on that machine. If you are using the network license manager, it must be running on the network.
For each node used as a worker, enter the following commands in a DOS command window. You do not have to be at the machines where the MATLAB workers will run, as long as you have access to the MATLAB Parallel Server installation.
Navigate to the folder with the startup scripts:
cd matlabroot\toolbox\parallel\bin
Start the workers on each node, using the text for <MyMJS> that identifies the name of the MATLAB Job Scheduler you want this worker registered with. Enter this text on a single line:
startworker -jobmanagerhost <MATLAB Job Scheduler host name> -jobmanager <MyMJS> -remotehost <worker host name> -v
To run more than one worker session on the same node, give each worker a unique name by including the -name option on the startworker command, and run it for each worker on that node:
startworker ... -name <worker1 name>
startworker ... -name <worker2 name>
Verify that the workers are running.
nodestatus -remotehost <worker host name>
Repeat items 2–3 for all worker nodes.
For more information about mjs, MATLAB Job Scheduler, and worker processes, such as how to shut them down or customize them, see MATLAB Job Scheduler Cluster Customization.
Start the mjs Service
On each cluster node, start the mjs service by typing the commands:
cd matlabroot/toolbox/parallel/bin
./mjs start
Alternatively (on Linux, but not Macintosh), you can start the mjs service on several nodes remotely from one machine by typing
cd matlabroot/toolbox/parallel/bin
./remotemjs start -remotehost hostA,hostB,hostC . . .
where hostA,hostB,hostC refers to a list of your host names. Note that there are no spaces between host names, only a comma. If you need to indicate protocol, platform (such as in a mixed environment), or other information, see the help for remotemjs by typing
./remotemjs -help
Start the MATLAB Job Scheduler
To start the MATLAB Job Scheduler, enter the following commands. You do not have to be at the machine on which the MATLAB Job Scheduler runs, as long as you have access to the MATLAB Parallel Server installation.
Navigate to the folder with the startup scripts:
cd matlabroot/toolbox/parallel/bin
Start the MATLAB Job Scheduler, using any unique text you want for the name <MyMJS>. Enter this text on a single line.
./startjobmanager -name <MyMJS> -remotehost <MATLAB Job Scheduler host name> -v
Verify that the MATLAB Job Scheduler is running on the intended host:
./nodestatus -remotehost <MATLAB Job Scheduler host name>
Note
If you have more than one MATLAB Job Scheduler on your cluster, each must have a unique name.
Start the Workers
Note
Before you can start a worker on a machine, the mjs service must already be running on that machine. If you are using the network license manager, it must be running on the network.
For each computer hosting a MATLAB worker, enter the following commands. You do not have to be at the machines where the MATLAB workers run, as long as you have access to the MATLAB Parallel Server installation.
Navigate to the folder with the startup scripts:
cd matlabroot/toolbox/parallel/bin
Start the workers on each node, using the text for <MyMJS> that identifies the name of the MATLAB Job Scheduler you want this worker registered with. Enter this text on a single line:
./startworker -jobmanagerhost <MATLAB Job Scheduler host name> -jobmanager <MyMJS> -remotehost <worker host name> -v
To run more than one worker session on the same machine, give each worker a unique name with the -name option:
./startworker ... -name <worker1>
./startworker ... -name <worker2>
Verify that the workers are running. Repeat this command for each worker node:
./nodestatus -remotehost <worker host name>
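The startup sequence above can be sketched as a dry-run script that prints the commands for one MATLAB Job Scheduler and several worker nodes, so you can review them before running them from matlabroot/toolbox/parallel/bin. The values MyMJS, headnode, nodeA, and nodeB are placeholders, and the worker naming pattern (host name plus a counter) is an assumption chosen only to keep names unique on each node.

```shell
#!/bin/sh
# Dry-run sketch: prints the command sequence instead of running it.
# MJS_NAME, HEAD_HOST, and the worker host names are placeholder values.

MJS_NAME=MyMJS
HEAD_HOST=headnode

# Commands that start and verify the MATLAB Job Scheduler on the head node.
jobmanager_commands() {
    echo "./startjobmanager -name $MJS_NAME -remotehost $HEAD_HOST -v"
    echo "./nodestatus -remotehost $HEAD_HOST"
}

# Commands that start $2 uniquely named workers on host $1, then verify them.
worker_commands() {  # $1 = worker host name, $2 = number of workers
    i=1
    while [ "$i" -le "$2" ]; do
        # Hypothetical naming pattern: <host>_worker<index>
        echo "./startworker -jobmanagerhost $HEAD_HOST -jobmanager $MJS_NAME -remotehost $1 -name ${1}_worker$i -v"
        i=$((i + 1))
    done
    echo "./nodestatus -remotehost $1"
}

jobmanager_commands
for host in nodeA nodeB; do
    worker_commands "$host" 2
done
```

The mjs service must already be running on every host named in the output before you run the printed commands.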
For more information about mjs, MATLAB Job Scheduler, and worker processes, such as how to shut them down or customize them, see MATLAB Job Scheduler Cluster Customization.
Although this step is not required, it is helpful in case of a system crash. Once configured for this, the mjs service starts running each time the machine reboots. The mjs service continues to run until explicitly stopped, regardless of whether a MATLAB Job Scheduler or worker session is running.
You must have root privileges to do this step.
Choose your platform:
On Linux, register the mjs service on each cluster node as a known service and configure it to start automatically at system boot time by following these steps:
Create the following link, if it does not already exist:
ln -s matlabroot/toolbox/parallel/bin/mjs /etc/mjs
Create the following link to the boot script file:
ln -s matlabroot/toolbox/parallel/bin/mjs /etc/init.d/mjs
Set the boot script file permissions:
chmod 555 /etc/init.d/mjs
Find your default run level. If you have a SysV Linux machine, you can determine the default run level by booting your machine and immediately executing the runlevel command. The second value in the output is the default run level of your system. If your Linux machine does not support SysV, look in /etc/inittab for the default run level.
When you have determined the run level, create a link in the rc folder associated with that run level. For example, if the run level is 5, execute one of the following sets of platform-specific commands.
Debian and Fedora platforms:
cd /etc/rc5.d; ln -s ../init.d/mjs S99MJS
SUSE platform:
cd /etc/init.d/rc5.d; ln -s ../mjs S99MJS
Red Hat platform (non-Fedora):
cd /etc/rc.d/rc5.d; ln -s ../../init.d/mjs S99MJS
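For run levels other than 5, the same commands apply with the run level substituted into the rc folder name. The following dry-run sketch prints the link command for a platform and run level. The helper name rc_link_command and the platform keywords are hypothetical, and the sketch assumes the folder names follow the rcN.d pattern shown above.

```shell
#!/bin/sh
# Dry-run sketch: prints the link command for a platform and run level.
# rc_link_command and the platform keywords are hypothetical names.

rc_link_command() {  # $1 = debian | fedora | suse | redhat, $2 = run level
    case "$1" in
        debian|fedora) echo "cd /etc/rc$2.d; ln -s ../init.d/mjs S99MJS" ;;
        suse)          echo "cd /etc/init.d/rc$2.d; ln -s ../mjs S99MJS" ;;
        redhat)        echo "cd /etc/rc.d/rc$2.d; ln -s ../../init.d/mjs S99MJS" ;;
        *)             echo "unknown platform: $1" >&2; return 1 ;;
    esac
}

rc_link_command debian 5
rc_link_command suse 3
```

Run the printed command as root, after you create the /etc/init.d/mjs link described in the earlier steps.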
On Macintosh, register the mjs service on each cluster node as a known service with launchd, and configure it to start automatically at system boot time by following these steps:
Navigate to the toolbox folder and stop the running mjs service:
cd matlabroot/toolbox/parallel/bin
sudo ./mjs stop
Create the following link if it does not already exist:
sudo mkdir -p /usr/local/sbin/
sudo ln -s matlabroot/toolbox/parallel/bin/mjs /usr/local/sbin/mjs
Copy the launchd .plist file for mjs to /Library/LaunchDaemons:
sudo cp ./util/com.mathworks.mjs.plist /Library/LaunchDaemons
Open the copied .plist file in a text editor.
Ensure that the leading part of the StandardOutPath and StandardErrorPath fields matches the LOGBASE value as defined in the mjs_def.sh file. For example, if LOGBASE is /var/log/mjs, then you must define StandardOutPath and StandardErrorPath as follows:
<key>StandardOutPath</key>
<string>/var/log/mjs/launchctl.stdout</string>
<key>StandardErrorPath</key>
<string>/var/log/mjs/launchctl.stderr</string>
Restart your machine and observe that mjs is running using nodestatus:
cd matlabroot/toolbox/parallel/bin
./nodestatus
To verify that your MATLAB Parallel Server products are installed and configured correctly, create a cluster profile and validate it. For instructions, see Connect the MATLAB Client to the MATLAB Parallel Server Cluster. You can specify the number of workers to use when validating your profile, to avoid occupying the whole cluster. If your validation does not pass, contact the MathWorks Install Support Team, or see Troubleshoot Common Problems.
After you create a cluster profile, you can make any modifications appropriate for your applications, such as NumWorkersRange, AttachedFiles, or AdditionalPaths. To save your profile for other users, in the Cluster Profile Manager, select the profile and click Export, then save your profile to a file in a convenient location. Later, when running the Cluster Profile Manager, other users can import your profile by clicking Import. For more information about cluster profiles, see Discover Clusters and Use Cluster Profiles (Parallel Computing Toolbox).