
Omega 2017.

1 Quick Start Guide - OCM

Document Version Control


Classification: Schlumberger Private - Customer Use
Release: 2017.1
Date: 07 Aug 2017
Source: QsgOCMShell
Copyright © 2017 Schlumberger. All rights reserved.
This document is copyright protected. No part of this document may be reproduced, stored in a retrieval system,
or transcribed in any form or by any means, electronic or mechanical, including photocopying and recording,
without the permission of Schlumberger. Further, the document contains information proprietary to Schlumberger
and should not be disclosed or distributed to third parties without the permission of Schlumberger. To the extent
that documents are provided in electronic format, Schlumberger grants permission for the document to be stored
electronically. All other restrictions set forth above regarding the document's use or distribution shall apply.

Schlumberger Trademarks & Service Marks


Schlumberger, the Schlumberger logotype, and other words or symbols used to identify the products and services
described herein are either trademarks, trade names, or service marks of Schlumberger and its licensors, or are the
property of their respective owners. These marks may not be copied, imitated or used, in whole or in part, without
the express prior written permission of Schlumberger. In addition, covers, page headers, custom graphics, icons,
and other design elements may be service marks, trademarks, and/or trade dress of Schlumberger, and may not be
copied, imitated, or used, in whole or in part, without the express prior written permission of Schlumberger.

Other company, product, and service names are the properties of their respective owners.

An asterisk (*) is used throughout this document to designate a mark of Schlumberger.


Table of Contents
Note

1 Set up OCM server

2 Install Oracle and create OCM instance

3 Install OCM
3.1 Check OCM server and other OCM nodes
3.2 Check OCM License-keys and setup OCM account environment settings
3.3 Check Omega configuration database entries
3.4 Prepare hardware database
3.5 Install ocmcontroller RPMs on OCI masters and compute nodes
3.6 Install OCM on OCM server
3.7 Extra commandline OCM admin tool (optional)

4 Start and configure OCM
4.1 OCM -> Config -> System Configuration
4.2 OCM -> Config -> Resource-Class Load-Unit Map
4.3 OCM -> Config -> Node-Group Definition and Accounting Factor
4.4 OCM -> Node-Group -> Change Status
4.5 OCM -> Node-Group -> Health Check
4.6 Max Resource Load and Max Allowed Jobs
4.7 OCM -> Scheduling
4.8 OCM -> Scheduling -> Project, Job, Tape rules
4.9 OCM -> Admin -> SFM Configuration
4.10 OCM -> Admin -> Services

5 Share nodes with 3rd party job scheduler

6 Acceptance tests

Note
• This quick start does not cover every detail in the OCM admin manual and user manual. Please read both
manuals and refer to them for anything skipped here. If the quick start and the OCM admin manual disagree,
follow the admin manual. The manuals are on the OmegaAndOCM DVD under the /02-OCM/01-Documentation
directory.

• This quick start covers a full OCM installation. For an OCM upgrade, please refer to
OCM_2.5_Release_Notes.pdf.

• If you are going to migrate from JSS to OCM, please refer to JSS_to_OCM_guide.pdf in the same directory
on the DVD.

• The directory paths shown here are Linux paths, so we write /02-OCM/02-Installation rather than
\02-OCM\02-Installation.



1 Set up OCM server
Please follow the Omega 2017.1ext System Preparation quick start guide to set up the new OCM server, masters
and compute nodes.

A script, ocm_server_checker.sh, may be used to check the server's setup and port availability.

In this guide, we use xxoc001, the sample host discussed in the System Preparation quick start guide, as the
OCM host.



2 Install Oracle and create OCM instance
Please refer to the Omega 2017.1ext Linux Installation quick start guide for Oracle installation and OCM
instance creation.

In this example, the OCM instance is xxocm001 on xxos001.

The OCM instance puts a very light load on the Oracle server. Most sites can use the same Oracle server as
the Omega OPM/RDM host.

A large site can use a separate Oracle server for OCM. Please consult Schlumberger on this.



3 Install OCM
The OmegaAndOCM DVD provides OCM 2.5.

We will use xxoc001 as the OCM server, xxmm00[1-2] as OCI masters, and xxa00[01-50] as compute nodes, as
described in section 1.4 of the System Preparation Quick Start Guide.

Copy /02-OCM/02-Installation and /02-OCM/03-Miscellaneous from the DVD to the /wg/omega/installations
directory.

As mentioned above, the Oracle instance xxocm001 is created on xxos001 in the Oracle Installation section of
the 2017.1ext Linux Installation Quick Start Guide.

3.1 Check OCM server and other OCM nodes


We should have /wg/omega, /wgjss, /wgjss/wgas, /tmp, /wglogs, and /local1/scr on the OCM server xxoc001.

Please refer to step 3.1 of the Quick Start Guide for System Preparation to check the directory permissions.

To check the OCM server:

# as root on OCM server xxoc001
su - root
ls -ald /wg/omega /wgjss /wgjss/wgas /tmp /wglogs /local1/scr

# create the following if we don't have these directories
mkdir -p /wgjss/wgas /wgjss/ocm/workorder /wgjss/ocm/omega2stat /wgjss/ocm/archive

chmod 777 /wgjss/wgas
chmod -R 2777 /wgjss/ocm
chown -R jssmgr:jssadmin /wgjss
ln -s /wgjss/wgas /wgas

# check /oracle is mounted or copied from xxos001:oracle
# OCM needs some Oracle libraries and sqlplus during initial OCM installation
ls -ald /oracle

# check if java 1.7.0.55 or above, or 1.8, is available
java -version

# if the java version is lower than 1.7.0.55, find a newer java openjdk rpm
# and install it as root
yum install java-1.xxxx-openjdk
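
The version comparison can also be scripted. The following is a minimal sketch, not part of the OCM kit: the function names parse_java_version and version_at_least are made up for illustration, and the sketch assumes the usual banner of the form java version "1.7.0_91", which java -version prints on stderr.

```shell
# Hypothetical helpers (not shipped with OCM) for the java version check.

# Extract the version string from a "java -version" banner.
parse_java_version() {
    printf '%s\n' "$1" | sed -n 's/.*version "\([0-9._]*\).*/\1/p' | head -n 1
}

# Succeed when version $1 >= version $2; fields are split on "." and "_"
# and compared as integers, so 1.7.0_91 and 1.7.0.91 are treated alike.
version_at_least() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        n = split(have, h, "[._]")
        m = split(want, w, "[._]")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            a = (i <= n) ? h[i] + 0 : 0
            b = (i <= m) ? w[i] + 0 : 0
            if (a > b) exit 0
            if (a < b) exit 1
        }
        exit 0
    }'
}

# Usage on a real host (the banner goes to stderr, hence 2>&1):
#   v=$(parse_java_version "$(java -version 2>&1)")
#   version_at_least "$v" 1.7.0_55 || echo "java $v is too old for OCM"
```

Comparing field by field avoids the trap of a plain string comparison, where 1.7.0_9 would sort after 1.7.0_55.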

In addition to the above preparation, there is a script to check the OCM server. It is on the OmegaAndOCM DVD
under /02-OCM/03-Miscellaneous/scripts. Below is sample output:

# this script has to run as root
[root@xxx ]# ./ocm_server_checker.sh

------ Checking OCM Server ------

1. Dir /tmp size = 1890664 KB

2. Oracle installation
Oracle Home = /oracle/12.1

3. GlassFish needs port 4848
It is available

4. OCM https needs port 8181
It is available

5. OCM http needs port 9090
It is available

6. Java Version 1.7.0_91
Java 1.8.0_131 or newer is recommended
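
For a quick manual check of the same three ports, one option is to filter the listening-socket table. This is only a sketch: port_is_free is a hypothetical helper, not part of ocm_server_checker.sh, and it assumes the local address is in the fourth column, as it is in the output of ss -ltn and netstat -ltn.

```shell
# Hypothetical helper: succeed when no listening socket uses port $1.
# Reads "ss -ltn" or "netstat -ltn" output on stdin; the local address
# (for example *:4848 or 0.0.0.0:4848) is expected in column 4.
port_is_free() {
    awk -v port="$1" '
        $4 ~ (":" port "$") { busy = 1 }
        END { exit (busy ? 1 : 0) }
    '
}

# Usage on the OCM server:
#   for p in 4848 8181 9090; do
#       ss -ltn | port_is_free "$p" && echo "port $p is available"
#   done
```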

To check OCI masters and compute nodes:

# check if /wg/omega and /wgjss are NFS mounted on the masters and compute nodes
ls /wg/omega /wgjss/ocm

# check if local /wglogs, /tmp, /local1/scr exist and have permission of 777
ls -ald /wglogs/ /tmp /local1/scr

# check if /etc/omega/installations exists
omegainst

# check if service omegalauncher runs; we should see omegalauncher up
service omegalauncher status

# check if java 1.7.0.55 or above, or 1.8, is available
java -version

# if the java version is lower than 1.7.0.55, find a newer java openjdk rpm
# and install it as root
yum install java-1.xxxx-openjdk
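
With many nodes, reading ls output by eye is error-prone. A possible way to automate the directory part of the node check is sketched below; check_node_dirs is a hypothetical name, and the MISSING and NOT-777 labels are chosen here for illustration only.

```shell
# Hypothetical helper: report directories that are missing or that do not
# grant rwx to user, group and others (mode 777; extra bits such as the
# sticky bit on /tmp are accepted).
check_node_dirs() {
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            printf 'MISSING %s\n' "$dir"
        elif [ -z "$(find "$dir" -prune -perm -777)" ]; then
            printf 'NOT-777 %s\n' "$dir"
        fi
    done
}

# Usage on a master or compute node (no output means all checks passed):
#   check_node_dirs /wglogs /tmp /local1/scr
```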

3.2 Check OCM License-keys and setup OCM account environment settings

To check if we have OCM license-keys:

omega2 lmstat -a | grep OCM

We should see at least the OCMBase license. Without this license we cannot run OCM, and the OCM web page
reports a license verification error.

If you purchased the OCM Dynamic license, you should see both the OCMBase and OCMDynamic licenses.

Set up the license environment variables for the ocm account:

su - ocm
# single quotes keep ${PATH} unexpanded until .cshrc is sourced
echo 'setenv SLBSLS_LICENSE_FILE @xxls001.dnsdomain.com' >> .cshrc
echo 'setenv PATH /wgas/Server/bin:/wgas/ocm/bin:${PATH}' >> .cshrc
source .cshrc

Note:

• OCM installs without checking licenses, and ocmadmin can even be restarted without them. OCM licenses are
checked only when we open the OCM server web page or submit OCI jobs.
• With an OCMDynamic license we can submit parallel OCI jobs using 'dynamic node allocation'. Without it, we
can only submit parallel OCI jobs using 'static node allocation'.
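
The license rules above can be turned into a small filter over the lmstat output. This is a sketch under assumptions: ocm_license_mode is a hypothetical helper, and it only looks for the feature names OCMBase and OCMDynamic anywhere in the text, without relying on any particular lmstat line format.

```shell
# Hypothetical helper: read "omega2 lmstat -a" output on stdin and print
#   none    - no OCMBase feature found, OCM cannot run
#   static  - OCMBase only, parallel OCI jobs use static node allocation
#   dynamic - OCMBase and OCMDynamic, dynamic node allocation is possible
ocm_license_mode() {
    features=$(cat)
    case $features in
        *OCMBase*) ;;
        *) echo none; return 0 ;;
    esac
    case $features in
        *OCMDynamic*) echo dynamic ;;
        *) echo static ;;
    esac
}

# Usage:
#   omega2 lmstat -a | ocm_license_mode
```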

3.3 Check Omega configuration database entries


Table: OCM related Configuration Entries in Configuration.omcdb

Name                                   Explanation                              Suggested Value

Omega.OCM.ServerLocation               OCM server name                          xxoc001.dnsdomain.com
Omega.WAN.Sites.Default.JobDirectory   Where Omega job submission files are     /wgjss/ocm/workorder
                                       picked up
Omega.WAN.Sites.Default.JobSubmitHost  Host where all Omega jobs are            xxoc001.dnsdomain.com
                                       submitted; use xxoc001 as it has local
                                       access to JobDirectory instead of NFS;
                                       this would improve job submission
                                       performance
Omega.HardwareDatabase                 Location of Omega hardware database      /wg/omega/setup/HardwareDb.omcdb
                                       file which has OCI masters, compute
                                       nodes and their network structure
                                       information

Note 1: Omega.WAN.Sites.Default.JobDirectory is where OCM picks up the job files. The directory should
generally be /wgjss/ocm/workorder.

Note 2: For a small center with a flat network, the Omega hardware database is no longer required in OCM 2.5
and Omega 2017.1.

Check and edit the above entries:

# as omadmin on OCM installation server xxoc001, make sure we have a display
omega2 config&

# browse to and open the /wg/omega/2017.1ext/share/config/Configuration.omcdb file
# refer to the table above to check and update the OCM entries

3.4 Prepare hardware database


The hardware database file holds the OCI masters, compute nodes and network topology information. Detailed
information can be found in Omega_2017.1_HWDB_user_manual.pdf under /02-OCM/01-Documentation on the
DVD.

Switch names can be arbitrary, such as core, level1switch1, level1switch2, level2switch1, level2switch2, etc.
OCM only cares about the structure of the network and which nodes connect to the same switch. Use core as the
root switch name.

GPU hosts that connect to different Infiniband (IB) islands need to be separated into different groups, even if
they are connected to the same ethernet switch. This is a hard requirement for IB-aware applications such as RTM,
FWI, and FD_MOD. It does not matter whether ethernet connectivity information is represented for the nodes. It is
sufficient for each of the IB switches in the HWDB to be connected to 'core'.

Here are some examples of network topology for our case.

Scenario 1:

All nodes connect to the same network switch.
The hardware database is no longer required for OCM and OCI.

Scenario 2:

xxa00[01-25] connect to ethernet switch 1
xxa00[26-50] connect to ethernet switch 2
No IB connection

deviceID,neighbor-deviceID,bandwidth
core,level1switch1,10
core,level1switch2,10
level1switch1,xxa[0001-0025],10
level1switch2,xxa[0026-0050],10

Scenario 3:

All xxa nodes are GPU nodes and connect to the same ethernet switch,
but xxa00[01-25] connect to IB island 1 and xxa00[26-50] connect to IB island 2

deviceID,neighbor-deviceID,bandwidth
core,vsre001,10
core,vsre002,10
vsre001,xxa[0001-0025],10
vsre002,xxa[0026-0050],10
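
A misspelled switch name silently detaches a whole branch from 'core', so it can help to sanity-check a topology list before importing it. The sketch below assumes the topology is kept as a CSV text file in the three-column layout shown in the scenarios; check_topology is a hypothetical helper and not part of the HWDB tools.

```shell
# Hypothetical helper: print every device that has neighbors of its own but
# is never listed as a neighbor of another device, i.e. a switch that is not
# reachable from "core". No output means every switch is attached.
check_topology() {
    awk -F, '
        NR == 1 && $1 == "deviceID" { next }        # skip the header row
        NF >= 2 { parent[$1] = 1; child[$2] = 1 }
        END {
            for (p in parent)
                if (p != "core" && !(p in child))
                    print p
        }
    ' "$1" | sort
}

# Usage (topology.csv is a placeholder file name):
#   check_topology topology.csv
```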

To create or edit the hardware database:

# as omadmin on xxoc001, make sure we have a display
# start HWDB explorer
omega2 hwconfig&

# click file -> new to enter nodes and network topology manually
# a large center can prepare nodes and network topology in a spreadsheet and import it
# save the hardware database to the default location /wg/omega/setup/HardwareDb.omcdb

3.5 Install ocmcontroller RPMs on OCI masters and compute nodes

# on each host xxmm00[1-2], xxa00[01-50]
su - root
# first make sure /wg/omega is mounted and omegalauncher is up
df -h /wg/omega
omegainst
service omegalauncher status

# assume we copied OCM installation files to /wg/omega/installations/02-OCM/02-Installation

cd /wg/omega/installations/02-OCM/02-Installation
rpm -Uvh ocmcontroller-201-1.noarch.rpm

# start ocmcontroller service
service ocmcontroller start
service ocmcontroller status

# check if it's in chkconfig
chkconfig --list | grep ocmcontroller
# if not, turn it on
chkconfig ocmcontroller on
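
Repeating these steps by hand on 52 hosts is tedious. One way to loop over the sample hosts is sketched below. The helpers expand_hosts and ocm_hosts are hypothetical, and the commented ssh loop assumes root can ssh to every node, which this guide does not state.

```shell
# Hypothetical helper: print host names <prefix><zero-padded index>.
# $1 = prefix, $2 = first index, $3 = last index, $4 = padded width.
# The body runs in a subshell so the loop counter does not leak.
expand_hosts() (
    i=$2
    while [ "$i" -le "$3" ]; do
        printf "%s%0${4}d\n" "$1" "$i"
        i=$((i + 1))
    done
)

# The sample site from this guide: xxmm00[1-2] and xxa00[01-50].
ocm_hosts() {
    expand_hosts xxmm 1 2 3
    expand_hosts xxa 1 50 4
}

# Usage, assuming root ssh access to every node (an assumption):
#   for host in $(ocm_hosts); do
#       ssh "root@$host" 'service ocmcontroller status'
#   done
```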

Optional: install the o2dk RPM and use o2dk to monitor the omegalauncher and ocmcontroller services on each host.

The o2dk RPM is provided on the same DVD under /01-OmegaLinux/03-Miscellaneous/RPMs/NoDesktop.

# check whether o2dk is already installed, then install or upgrade it
rpm -qa | grep o2dk
rpm -Uvh wg-o2dk-ext-11.3.7-2.el6.x86_64.rpm
vi /etc/o2dk.conf

# make sure we have this line in the file
service = ocmcontroller, omegalauncher

# check o2dk service is on
service o2dk status

# check o2dk will be started automatically
chkconfig --list | grep o2dk
chkconfig o2dk on
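
Instead of opening the file in vi on every host, the service line can be checked non-interactively. This is a sketch only: o2dk_monitors is a hypothetical helper, and it assumes nothing about /etc/o2dk.conf beyond the single "service = ..." line shown above.

```shell
# Hypothetical helper: succeed when the "service = ..." line in config
# file $1 lists service $2 as a whole word.
o2dk_monitors() {
    grep -E '^[[:space:]]*service[[:space:]]*=' "$1" | grep -qw -- "$2"
}

# Usage:
#   o2dk_monitors /etc/o2dk.conf ocmcontroller || echo "ocmcontroller is not monitored"
```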

3.6 Install OCM on OCM server


Install OCM root controller RPM on OCM server:

# on xxoc001, as root

# assume we copied OCM installation files to /wg/omega/installations/02-OCM/02-Installation
cd /wg/omega/installations/02-OCM/02-Installation
rpm -Uvh ocmrootcontroller-205096-1.noarch.rpm

# start ocmrootcontroller service
service ocmrootcontroller status
service ocmrootcontroller start
service ocmrootcontroller status

# check if ocmrootcontroller is turned on in chkconfig
chkconfig --list | grep ocmrootcontroller
# if not, turn it on
chkconfig ocmrootcontroller on

Optional: install the o2dk RPM and use o2dk to monitor the ocmrootcontroller service on the OCM server.

rpm -qa | grep o2dk
rpm -Uvh wg-o2dk-ext-11.3.7-2.el6.x86_64.rpm

vi /etc/o2dk.conf
# make sure we have this line in the file
service = ocmrootcontroller

# check o2dk service is on
service o2dk status

# check o2dk will be started automatically
chkconfig --list | grep o2dk
chkconfig o2dk on

Install the application server and engine on OCM server:

# on xxoc001, as ocm
# assume we copied OCM installation files to /wg/omega/installations/02-OCM/02-Installation

su - ocm
cd /wg/omega/installations/02-OCM/02-Installation
./OcmSetupExt.2.5.96.exe
# optional alternative: ./OcmSetupExt.2.5.96.exe --definepasswords
# note: this option will overwrite the default Oracle OCM table passwords

# when prompted for Oracle server machine name, type in: xxos001
# when prompted for OCM instance ID: xxocm001
# when prompted for server installer path, enter:
# /wg/omega/installations/02-OCM/02-Installation/ee6u4j7installer.bin
# type password for admin (for http://xxoc001.dnsdomain.com:4848/)
# type password for ocmadministrator (for http://xxoc001.dnsdomain.com:9090/ocm)
# take default action and finish the installation

# next, at the ocmadmin> prompt, add power users
ocmadmin> help
ocmadmin> serverstatus
# add power users that can change OCM configuration and restart OCM services
ocmadmin> addpoweruser

# specify user name and password; this password is independent of the linux NIS password
# in this example, we can add ocm, password ocm!, as one power user

# add regular users
ocmadmin> adduser
# specify user name and password

# start OCM services
ocmadmin> startocm
ocmadmin> exit

3.7 Extra commandline OCM admin tool (optional)


This step is optional.

A large site can use the command line OCM tools, which are provided in two RPMs.

These two ocm_admin_xxx.rpm files are under the /02-OCM/03-Miscellaneous/RPMs directory on the OmegaAndOCM DVD.

These command line tools can restart OCM and change node status. They are very powerful, so use them with care.

# as root on the OCM server
rpm -Uvh ocm_admin_resources_tool-205096-1.noarch.rpm
rpm -Uvh ocm_admin_services_tool-205096-1.noarch.rpm

Please refer to section 8, Command Line Tools to Manage OCM, of the OCM_Admin_Manual for details.



4 Start and configure OCM
# as root on xxoc001, make sure we have a display
# bring up firefox
firefox&

# Type the address http://xxoc001.dnsdomain.com:9090/ocm.
# Click the login button at the upper right corner of the web page and log in as ocm.
# Then click the Config tab at the top.

Below is a sample configuration screen. Please update 'System Configuration', 'Resource-Class Load-Unit Map',
'Node-Group Definition' and 'Accounting Factor'. Log in as ocm and click the 'Edit xxxx' link in each section.

4.1 OCM -> Config -> System Configuration


Job Submission Directory: this is where OCM retrieves new jobs. It should match the Omega configuration entry
Omega.WAN.Sites.Default.JobDirectory.

Job Archive Directory: this is where the archiver saves old jobs. Usually we set it to /wgjss/ocm/archive. This
directory needs to be writable by the ocm account.

Cluster Sharing Communication Directory: the directory for communication between OCM and a 3rd party
scheduler. We discuss this in detail at the end of this quick start guide.

Cluster Sharing Release Inactive Nodes after: the idle time after which external nodes are released back to the
3rd party scheduler.

Fast RDM Check: check the RDM pool quota for jobs at job submission time instead of during the job run. This
avoids wasting compute node time when there is not enough space in the RDM pool.

Precheck License: we recommend setting it to 'Yes', which means OCM checks licenses before it assigns nodes
to a job. If it is set to 'No', OCM assigns nodes to jobs regardless of whether licenses are available, and the
jobs or compute tasks may fail.

4.2 OCM -> Config -> Resource-Class Load-Unit Map


Usually the 'Resource Class' names here should match the Omega configuration entries in Omega.Batch.Queues.

It is recommended to use the sample load numbers in the screen shot above. If serial job load automation is
turned on, only 'heaviest' is used.

[root@triumph01 healthcheck]# omega2017.1ext dump-config | grep Omega.Batch.Queues -A 11


Omega.Batch.Queues
Priorities(in increasing order) = Default,Host,User,Project
Default/heaviest = Name=heaviest
Default/heavy = Name=heavy
Default/medium = Name=medium
Default/parallel = Name=parallel
Default/light = Name=light
Default/tape = Name=tape
Omega.Batch.RunClass
Priorities(in increasing order) = Default,Host,User,Project
Default/omega2 = Name=omega2

When we batch submit jobs, we need to specify the 'Target Queue' of compute nodes. This 'Target Queue' should
be one of the 'Resource Class' names we defined here.

This way, OCM knows how many resources the job will take and assigns nodes accordingly.
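
To compare the queue names with the 'Resource Class' names shown on the OCM page, the names can be pulled out of the dump-config output. The sketch below assumes the output layout of the sample listing above (a section line starting with "Omega." followed by "... = Name=<queue>" lines); list_queues is a hypothetical helper.

```shell
# Hypothetical helper: read dump-config output on stdin and print the queue
# names defined under Omega.Batch.Queues, one per line.
list_queues() {
    awk '
        /^[ \t]*Omega\./     { in_queues = ($1 == "Omega.Batch.Queues"); next }
        in_queues && /Name=/ { sub(/.*Name=/, ""); print }
    '
}

# Usage:
#   omega2017.1ext dump-config | list_queues
```

Each printed name is a valid 'Target Queue' only if a 'Resource Class' with the same name exists in OCM.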

4.3 OCM -> Config -> Node-Group Definition and Accounting Factor

If the Hardware Database is used, please note that all nodes must be in the Hardware Database before they are
imported and added to OCM.

Omega role nodes can only run serial Omega jobs.

Compute role nodes can only run OCI jobs as child nodes.

OCI-Master nodes are job servers for OCI jobs and work with the 'Compute' role nodes on OCI jobs.

Each node group can have a combination of the 'Omega', 'Compute', and 'OCI-Master' roles.

OffHours nodes are for small sites that want to add workstations to the Omega cluster to be used only during
off hours. Each OffHours node (Linux) should have Omega and ocmcontroller installed.

External nodes are nodes that are shared by OCM and a 3rd party scheduler. Uncheck this if you don't have a
3rd party scheduler. We discuss node sharing at the end of this guide.

Accounting Factor is the charge rate for different nodes. It is recommended to use the number of CPUs of each
type of node as its accounting factor.

If your site does not need this, you can just set it to 1.0.

4.4 OCM -> Node-Group -> Change Status


All new nodes added to OCM need to be enabled. Click the 'Node Group' tab, right click the node group, and
select 'Change Status'. When the dialog box pops up, select the 'Up' button.

Please note that the OffHours nodes will be enabled automatically by OCM in off hours (working hours are
defined earlier in System Configuration).

4.5 OCM -> Node-Group -> Health Check


When nodes are down, we can right click the node or node group and run a health check.

In the dialog box, type in '2017.1ext' and then submit.

The health check scripts are under the /wgas/healthcheck directory on the OCM server host. If you don't have IB
nodes or you don't use Mellanox IB cards, disable the IB check script by renaming it.

[root@triumph01 healthcheck]# ll
total 156
-rwxr-xr-x 1 ocm ocm 12574 Jul 18 16:18 custom_ib_health_check.sh
-r-xr-xr-x 1 ocm ocm 15230 Jul 18 16:18 ocm_healthcheck_generic.py
-r-xr-xr-x 1 ocm ocm 28810 Jul 18 16:18 ocm_healthcheck_mpi_compute1.py
-r-xr-xr-x 1 ocm ocm 42807 Jul 18 16:18 ocm_healthcheck_mpi_compute2.py
-r-xr-xr-x 1 ocm ocm 21953 Jul 18 16:18 ocm_healthcheck_oci_compute.py
-r-xr-xr-x 1 ocm ocm 22558 Jul 18 16:18 ocm_healthcheck_omega.py
[root@triumph01 healthcheck]# less custom_ib_health_check.sh

# disable IB check
mv custom_ib_health_check.sh custom_ib_health_check.sh.notused

You can also customize these health check scripts using Python scripts. Details can be found in the OCM Admin
Manual, section 3.6 Node Health Check Setup.

4.6 Max Resource Load and Max Allowed Jobs


To use the compute nodes' resources fully without overloading them, we can specify the max resource load and
max allowed jobs for each node group.

From OCM -> Node Groups, right click a node group to 'Change Max Resource Load' and 'Change Max Allowed
Jobs' for that node group.

Below is the recommended resource load table for different types of HPC nodes:

Node          Node Cores   Node Memory (GB)   Max Rload/Node   Max Jobs/Node
IvyBridge     20           128                39               13
SandyBridge   16           64                 33               9
Westmere      12           48                 28               7
Nehalem       8            24                 17               5

Note: from OCM -> Scheduling -> Config -> '_Automation: Default Serial Job Load', we can specify the default
serial job load. If this default value is bigger than the max allowed resource load on the serial nodes, serial
jobs won't run. We discuss this entry in the Scheduling part below.
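
The interaction of the two limits can be illustrated with a little arithmetic. The sketch below rests on a simplifying assumption that is not taken from the OCM manuals: a node accepts serial jobs of equal load until either the max resource load or the max allowed jobs would be exceeded. The helper name serial_slots and the per-job loads in the usage comments are made up for illustration.

```shell
# Hypothetical helper: how many equal-load serial jobs fit on one node.
# $1 = max rload per node, $2 = max jobs per node, $3 = load of one job.
# Assumes loads simply add up; OCM's real scheduler may behave differently.
serial_slots() {
    by_load=$(( $1 / $3 ))
    if [ "$by_load" -lt "$2" ]; then
        echo "$by_load"
    else
        echo "$2"
    fi
}

# Usage with rows of the table above and illustrative job loads:
#   serial_slots 39 13 3     # IvyBridge, job load 3
#   serial_slots 17 5 4      # Nehalem, job load 4
#   serial_slots 39 13 40    # job load above the max rload: no job can run
```

The last call shows the case from the note: when the job load exceeds the node's max resource load, the result is zero slots.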

4.7 OCM -> Scheduling


Click the 'Scheduling' -> Config tab. A sample screen is shown below:

Detailed instructions and explanations are in OCM_User_Manual_V2.5.pdf, section 4.7 Scheduling
Configuration and table 4-4.

Max Fraction of Nodes in Shareable Node Groups for Serial Jobs: a value between 0 and 1. '0' means serial jobs
cannot run on shareable node groups, '0.5' means serial jobs can use half of the nodes, and '1' means serial
jobs can use all nodes. A detailed explanation is in OCM_User_Manual_V2.5.pdf section 3.4.8.

Fraction of Test Nodes in a Selected Node Group: a value between 0 and 1. '0.5' means Test mode jobs can use
half of the nodes of the selected group, as defined in the Test job rules. Test mode jobs have the highest
priority to run. Detailed information is in section 6.5 of the OCM user guide.

Please note we leave the 'Scheduling' -> Config -> 'Require Pre-defined Project' field at 'No' to make things
easier. This means OCM allows jobs without a predefined rule to run. A default TEMPLATEPROJECT rule and a
default project rule will be created the first time a job of any project is submitted.

Allow Users to Manage Their Jobs -> checked. This eases the initial burden on the Omega admin of monitoring
the OCM job queue and allows users to manage their own jobs.

Apply Serial Job Load Automation -> checked. This lets OCM determine the rload of a serial job.

License for Dynamic OCI Jobs is updated automatically. OCM checks whether we have an OCMDynamic license on
the license server. If yes, this field is displayed as checked.

4.8 OCM -> Scheduling -> Project, Job, Tape rules


If the 'Scheduling' -> Config -> 'Require Pre-defined Project' field is set to 'No', we don't need to pre-define
project rules.

However, in many cases we need to create customized project rules to make better use of the compute resources.

The project rule creation window is shown below. A detailed explanation of project rules can be found in the
OCM user guide, section 4.3 Project rule.

Please note that Target Node Quota and Reserve Nodes are optional. These two fields can reserve nodes for the
project in selected node groups.

Job rules allow more specialized control of job scheduling and node allocation. Below is a screen shot. Detailed
information can be found in the OCM user guide, section 4.4 Job Rule.

Please note that processing mode 'Test' has higher priority. 'Test' mode jobs allow test users to run their
tests faster on specified nodes in a busy center.

Tape rules are for centers that have multiple tape drives connected to different nodes. In such a case, we can
use a tape rule to specify which tape drive can be used by which node group. A detailed explanation can be found
in the OCM user guide, section 4.6 Tape Job Scheduling.

4.9 OCM -> Admin -> SFM Configuration


This feature allows the user to specify the maximum number of nodes that can be distributed to an OCI job in
one scheduling cycle. We can also specify whether a small OCI job may use its master node to run child processes
(not recommended for big OCI jobs). Detailed information can be found in the OCM Admin guide, section 5.9 SFM
Configuration. This step is optional for a small center.

4.10 OCM -> Admin -> Services


Click the Admin tab at the top, then click Services. Check that all services here are in green status. If any
service is not green, click the button at the end of its line to start it. Only start the ClusterSharing service
if you share nodes between OCM and a 3rd party scheduler; mailService is optional.



5 Share nodes with 3rd party job scheduler
This step is optional for most sites.

OCM can share nodes with a 3rd party scheduler such as PBS Torque.

OCM and the 3rd party scheduler communicate through the 'ClusterSharing' service started in section 4.10. They
read and generate drop files in the 'Cluster Sharing Communication Directory' described in section 4.1.

Shared groups and nodes need to have the same names and be configured on both systems.

The client needs to write a simple Python script to communicate with OCM on node requests.

Please contact Schlumberger WesternGeco for details and assistance.



6 Acceptance tests
After OCM is set up, we can run the acceptance tests. Please refer to the Acceptance Tests section in the
2017.1ext Linux Installation Quick Start Guide and the acceptance test instructions in the acceptance test kit.
We can use batch submission mode instead of immediate mode to submit jobs.

To troubleshoot OCM related issues, we can check /wglogs/OCMROOT/server0.log on the OCM server, or
/wglogs/OCM/server0.log on each master or compute node.

To investigate OCM issues further, we can also check /wgas/Server/glassfish/domains/domain1/logs/server.log on
the OCM server.

OCM event logs can be viewed on the OCM 'Events' web page. They are also very helpful in troubleshooting OCM
issues.
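
When these logs are long, a first pass can be to pull out only the most recent problem lines. The sketch below is an assumption-laden helper, not an OCM tool: recent_problems is a made-up name, and the keywords SEVERE, WARNING and ERROR are a guess at what marks a problem line (SEVERE and WARNING are the usual Java logging levels seen in GlassFish logs; the OCM log format itself is not described in this guide).

```shell
# Hypothetical helper: show the last N lines of log file $1 that mention
# SEVERE, WARNING or ERROR (N = $2, default 20).
recent_problems() {
    grep -E 'SEVERE|WARNING|ERROR' "$1" | tail -n "${2:-20}"
}

# Usage on the OCM server:
#   recent_problems /wglogs/OCMROOT/server0.log
#   recent_problems /wgas/Server/glassfish/domains/domain1/logs/server.log 50
```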
