/bin/jm_stop
/bin/jm_start
You should get a message that the manager has been started on sp2-16.
/p/sp2/bin/switch_stat
"switch_responds" should be 1 for all nodes. If it is not, and all nodes are up, then the switch needs to be restarted with:
/usr/lpp/ssp/bin/Estart
Note: You must log in as root to restart the switch. "su" will not work because it does not give you the Kerberos 4 tickets required for the other nodes.
If the Estart command complains that the primary node does not have a fault service worm running, you will need to do this first:
/usr/lpp/ssp/bin/pexec 1 /usr/lpp/ssp/css/rc.switch
Then retry the Estart command, as above.
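The retry sequence above can be sketched as a small shell function. This is only an illustration: it assumes that a failed Estart exits with a non-zero status (this document does not say so), and it does not try to recognize the specific fault service worm complaint. The names restart_switch, ESTART and START_WORM are made up for this sketch. It must still be run from a real root login, for the Kerberos reason noted above.

```shell
#!/bin/sh
# Sketch only: retry Estart once after starting the fault service worm.
# ESTART and START_WORM are hypothetical variables; they can be overridden
# so that the logic can be tried with stub commands.
ESTART="${ESTART:-/usr/lpp/ssp/bin/Estart}"
START_WORM="${START_WORM:-/usr/lpp/ssp/bin/pexec 1 /usr/lpp/ssp/css/rc.switch}"

restart_switch() {
    # First attempt. ASSUMPTION: a failed Estart exits with non-zero status.
    if $ESTART; then
        echo "switch restarted"
        return 0
    fi
    echo "Estart failed; starting the fault service worm on the primary node"
    $START_WORM || return 1
    # Second and final attempt.
    if $ESTART; then
        echo "switch restarted after retry"
        return 0
    fi
    echo "Estart still failing; investigate manually"
    return 1
}
```

After sourcing the file, run "restart_switch" as root. A failure for any reason triggers the rc.switch step here, so read the Estart output before relying on the retry.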
Rebooting nodes
If a node does not respond, or is otherwise behaving erratically, it might need to be rebooted.
To reboot one or more nodes, run one of the following as root on the control workstation (sp2-cw):
/usr/lpp/ssp/bin/cstartup sp2-04 ....
Reboots the given nodes; only works for nodes that are down. The node should be back up within about 5 minutes. If it does not come up, try the cshutdown command (see below).
/usr/lpp/ssp/bin/cstartup -Z sp2-04 ....
Shuts down, then reboots, nodes that are currently running.
/usr/lpp/ssp/bin/cshutdown -r sp2-04 ...
Shuts down and reboots a node. On average, a node takes about 5-10 minutes to come back up. This command powers off the node after shutdown, so it should work even in cases where the node won't respond to cstartup.
Notes:
- Several node names can be given to reboot several nodes concurrently.
- The cstartup and cshutdown commands wait until all the nodes are completely up before
returning control to the user.
- After the nodes are back up, the switch must be restarted
for those nodes to rejoin the switch.
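The notes above suggest a natural order: reboot the nodes, wait for the command to return, then restart the switch. A minimal sketch follows, assuming it is run as root on the control workstation and that cshutdown exits with a non-zero status on failure (an assumption, not something stated here). The names reboot_nodes, CSHUTDOWN and ESTART are hypothetical.

```shell
#!/bin/sh
# Sketch only: reboot the given nodes, then restart the switch.
# CSHUTDOWN and ESTART are hypothetical, overridable variables.
CSHUTDOWN="${CSHUTDOWN:-/usr/lpp/ssp/bin/cshutdown}"
ESTART="${ESTART:-/usr/lpp/ssp/bin/Estart}"

reboot_nodes() {
    # Require at least one node name.
    [ "$#" -gt 0 ] || { echo "usage: reboot_nodes node ..." >&2; return 2; }
    # cshutdown -r returns only once the nodes are completely up again.
    # ASSUMPTION: it exits non-zero if the reboot failed.
    $CSHUTDOWN -r "$@" || return 1
    # The rebooted nodes rejoin the switch only after it is restarted.
    $ESTART
}
```

For example, "reboot_nodes sp2-04" reboots sp2-04 and then runs Estart.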
Checking if a node is up
Standard methods can be used to check the status of a node (ping, etc.).
You might also want to try logging in to the suspect node; a node could be hung
but still be responding to ping.
/p/sp2/bin/node_stat
This command will show you the SP2's view of which nodes are up or down.
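The ping-then-login check described above can be sketched as follows. The defaults are assumptions: "ping -c 1" for a single probe and "rsh" for the remote login; substitute whatever remote shell the site uses. The names node_state, PING_CMD and LOGIN_CMD are hypothetical.

```shell
#!/bin/sh
# Sketch only: classify a node as down, hung, or up.
# PING_CMD and LOGIN_CMD are hypothetical, overridable variables;
# their default values are assumptions.
PING_CMD="${PING_CMD:-ping -c 1}"
LOGIN_CMD="${LOGIN_CMD:-rsh}"

node_state() {
    node="$1"
    # No ping reply: treat the node as down.
    if ! $PING_CMD "$node" >/dev/null 2>&1; then
        echo "down"
        return
    fi
    # Ping works; running a trivial remote command shows whether
    # logins still work on the node.
    if $LOGIN_CMD "$node" true </dev/null >/dev/null 2>&1; then
        echo "up"
    else
        echo "hung"
    fi
}
```

For example, "node_state sp2-04" prints one of down, hung or up. The sketch has no timeout, so a remote login that never returns will block it; in that case interrupt it and treat the node as hung.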