
3.4 Sensys CFGI User Guide


Introduction

The Sensys configuration file "orcm-site.xml" is used to feed the cluster information (node hierarchy, default MCA parameters, and authentication information) to the Sensys runtime daemons.

Sensys CFGI file syntax

The CFGI file is a plain-text ASCII file written in XML format. The grammar of the file is defined by the Backus-Naur description shown hereafter.

*<configuration> = <version> <role> <junction> [<scheduler>] [<workflows>] [<ipmi>] [<snmp>]
*<junction> = <type> <name> [<controller>] [<junction>*]
*<scheduler> = <shost> <port> [<mca-params>*]
*<workflows> = <workflow>+
*<ipmi> = <bmc_node>+
*<snmp> = <config>+

Because the CFGI is written in XML, the above tokens have an appropriate XML representation. For example, the <configuration> token is represented in XML by the pair of tags:

<configuration> … </configuration>

*<type> = {**cluster**|**row**|**rack**|**node**}

cluster is always the root of the hierarchy; there can be only one junction of type cluster, and it must always be present. row and rack are used in a 4-tier hierarchy: cluster, row, rack, node. node junctions are always leaf points in the hierarchy. With cluster as root and node as leaf, the hierarchy always has a unique start point and well-defined end points. Each junction must have a name, and the names of siblings must differ; for example, if a cluster contains two rows, those rows must not have the same name. Currently, at most 4 tiers of hierarchy are supported, in exactly the following order: cluster, row, rack, node. If a row or a rack is omitted, a fictitious equivalent will be automatically inserted.

*<name> = **regex**  

A regex expanding to one or more names. A special character, '@', is reserved to indicate that the name of the parent junction in the hierarchy will replace the '@' character. For example, if the parent junction is named "rack1" and its child junction's <name> is "@_node", then the child junction's name string is "rack1_node", where the parent's "rack1" replaced the '@' character.
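
For example, a minimal sketch of the '@' expansion described above (the junction names are illustrative):

<junction>
    <type>rack</type>
    <name>rack1</name>
    <junction>
        <type>node</type>
        <!-- '@' is replaced by the parent junction's name, so this expands to "rack1_node" -->
        <name>@_node</name>
    </junction>
</junction>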

<controller> = <host> <port> [<aggregator>] [<mca-params>*]
*<host> = **regex**  

A regex of an actual IP address or an IP-resolvable name of the machine hosting the controller. It can include the hierarchical operator '@' as explained in the <name> description above. If it does, the '@' operator refers to the immediate junction hosting this controller. There is a strong relationship between the hosting junction's name and its controller host value: if the hosting junction's name is a regex, one must use '@' as the controller host value; if the hosting junction's name is a cstring and not a regex, then the controller host value can also be a cstring. EXCEPTION: For a node junction, if the hosting junction's name is a cstring and not a regex, then its controller host value must be exactly that node junction's name. NOTE: When in doubt about a controller host name, use '@'. A sketch illustrating these rules follows the <aggregator> definition below.

*<aggregator> = {**yes**|**no**}

If yes, then that particular junction or scheduler will be designated as an aggregator.
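
To illustrate the host naming rules above, a hedged sketch follows (junction names and port are illustrative):

<!-- regex-named node junction: the controller host must be '@' -->
<junction>
    <type>node</type>
    <name>node[4:0-255]</name>
    <controller>
        <host>@</host>
        <port>55805</port>
    </controller>
</junction>
<!-- cstring-named node junction: the controller host must be exactly the junction's name -->
<junction>
    <type>node</type>
    <name>node0001</name>
    <controller>
        <host>node0001</host>
        <port>55805</port>
    </controller>
</junction>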

*<mca-params> = **cslist**  

A cslist of strings, where each string contains a single tag-value pair, with "=" as the separator. Use multiple statements for multiple specifications. All such statements must be grouped together in the XML file.
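
As a sketch, a controller carrying MCA parameters might look as follows; the parameter names here are hypothetical placeholders, not actual Sensys MCA parameters:

<controller>
    <host>agg01</host>
    <port>55805</port>
    <!-- hypothetical: each string is a single tag=value pair, given as a cslist -->
    <mca-params>param_one=value1,param_two=value2</mca-params>
</controller>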

<scheduler> = <shost> <port> [<mca-params>*]
*<shost> = **regex**  

A regex of an actual IP address or an IP-resolvable name of the machine hosting the scheduler. The same rules as for the controller <host> above apply, including the hierarchical '@' operator. NOTE: When in doubt about a host name, use '@'.

*<mca-params> = **cslist**  

Same as <mca-params> for the controller above: a cslist of strings, each containing a single tag=value pair, with all such statements grouped together in the XML file.

<workflows> = <workflow>+ [<aggregator>]  

You can find full workflow documentation and examples in section 1.7-Data-Smoothing-Algorithms-Analytics

*<aggregator> = **string**  

A string that contains a valid hostname indicating the aggregator(s) to which this workflow will be submitted.

<workflow> = name <step>+

*name = string
A string that contains this workflow name. This is user-defined.

<step> = name { (<hostname> <data_group>)
              | (<compute> [<db>])
              | (<win_size> <compute> [<type>])
              | ([<severity>] [<fault_type>] [<store_event>] [<notifier_action>] <label_mask> [<time_window>] <count_threshold>)
              | (<nodelist> <compute> [<interval>] [<timeout>])
              | (<msg_regex> <severity> <notifier>)
              | (<policy>)
              | (<policy> <suppress_repeat> [<category>] [<severity>] [<time>])
              | (<policy> <exec_name> [<exec_argv>]) }
*name = {**filter**|**aggregate**|**threshold**|**window**|**cott**|**spatial**|**genex**}  

The name of one of the multiple tasks that can be performed by a workflow.

*<hostname> = **regex**  

Used by filter. This indicates the nodes from which data will be filtered.

*<data_group> = {**componentpower**|**coretemp**|**dmidata**|**errcounts**|**freq**|**ipmi**|**mcedata**|**nodepower**|**resusage**|**sigar**|**snmp**|**syslog**|**udsensors**}  

Used by filter. This is a sensor name from which the data will be filtered.

*<compute> = {**average**|**min**|**max**|**sd**}  

Used by aggregate, window and spatial. Computes the average, minimum, maximum or standard deviation over the data gathered according to the sensor and node rules set in filter.

*<db> = {**yes**|**no**}

Used by aggregate. Optional. Defaults to no. Defines whether the data will be stored in the database.
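
A minimal aggregate step following the grammar above (values are illustrative):

<step name = "aggregate">
    <compute>average</compute>
    <db>yes</db>
</step>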

*<win_size> = **uint**  

Used by window. An unsigned integer that indicates the amount of time or the number of samples over which a computation (<compute>) will be performed. If the integer is followed by "h" (hours), "m" (minutes) or "s" (seconds, the default if no unit is provided), <type> must be time and the value is taken as a duration; otherwise <type> must be counter and the value is taken as a number of samples.

*<type> = {**counter**|**time**}  

Used by window. Optional. Defaults to time. Indicates whether <win_size> is given as an amount of time (time) or a number of samples (counter).
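
A minimal window step following the grammar above (values are illustrative), computing an average over a 5-minute time window:

<step name = "window">
    <win_size>5m</win_size>
    <compute>average</compute>
    <type>time</type>
</step>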

*<severity> = {**emerg**|**alert**|**crit**|**error**|**warn**}  

Used by cott, genex and threshold.
cott: Optional. Defaults to error. Acts as a severity filter for errors coming from the errcounts sensor.
genex: Filters the severity of the system message to be caught.
threshold: Takes effect under <suppress_repeat> mode. Filters the severity of the messages to be suppressed.

*<fault_type> = {**hard**|**soft**}  

Used by cott. Optional. Defaults to hard. Acts as a fault-type filter for errors coming from the errcounts sensor.

*<store_event> = {**yes**|**no**}  

Used by cott. Optional. Defaults to yes. Defines whether the event will be stored in the database.

*<notifier_action> = {**none**|**email**|**syslog**}  

Used by cott. Optional. Defaults to none. Defines the way in which the event will be communicated.

*<label_mask> = **string**  

Used by cott. A string to search for in the errcounts sensor data. "*" is treated as a wildcard.

*<time_window> = **uint**  

Used by cott. Optional. Defaults to 1 second. Defines the time window within which the data will be searched. Requires a character denoting the unit of the integer:
s - for seconds, m - for minutes, h - for hours, d - for days.

*<count_threshold> = **uint**  

Used by cott. This denotes the number of new item counts within <time_window> at which the event is fired (inclusive).
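
Putting these attributes together, a hedged sketch of a cott step (values are illustrative) that fires an event when 5 or more matching error counts occur within 10 minutes:

<step name = "cott">
    <severity>crit</severity>
    <fault_type>hard</fault_type>
    <label_mask>CPU*</label_mask>
    <time_window>10m</time_window>
    <count_threshold>5</count_threshold>
</step>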

*<nodelist> = **regex**  

Used by spatial. The list of nodes over which the aggregation will be conducted. It can also be a logical group.

*<interval> = **uint**  

Used by spatial. Optional. Defaults to 60 seconds. An unsigned integer whose default unit is seconds. It indicates the sleep length between two consecutive cycles, giving the user control over the granularity of cycling when it should not run as fast as possible.

*<timeout> = **uint**  

Used by spatial. Optional. Defaults to 60 seconds. An unsigned integer whose default unit is seconds. It is the maximum time to wait, after the first sample of a cycle arrives, before performing the computation. If a node failure occurs, this attribute avoids waiting endlessly.
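
A minimal spatial step following the grammar above (node list and values are illustrative):

<step name = "spatial">
    <nodelist>node[4:0-255]</nodelist>
    <compute>max</compute>
    <interval>30</interval>
    <timeout>60</timeout>
</step>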

*<msg_regex> = **string**  

Used by genex. A string containing a formal regular expression using POSIX character classes (ASCII/Unicode/shorthand expressions are not allowed). It is the regex filter that a system message must match to be caught.

*<notifier> = **smtp**  

Used by genex. Optional. There is only one value for this parameter: smtp. Must be specified if an email notification is required.

*<policy> = **cslist**  

Used by threshold. A comma-separated list of policies. It defines which messages will be written to the syslog. A policy has the format:

<threshold_type>**"|"**<threshold_value>**"|"**<severity>**"|"**<notification_mechanism>

*<suppress_repeat> = **yes**  

Used by threshold. Enables suppression of repeats for specific plugins. If no additional attributes are specified, all events generated by the plugin will be suppressed for the time period determined by the analytics_base_suppress_repeat MCA parameter.

*<category> = {**HARD_FAULT**|**SOFT_FAULT**}  

Used by threshold under the <suppress_repeat> mode. Defines whether hard or soft faults are being suppressed.

*<time> = **uint**  

Used by threshold under the <suppress_repeat> mode. All events that match the severity and category parameters will be suppressed for the time period specified by this parameter. All other events will be suppressed for the time period determined by the analytics_base_suppress_repeat MCA parameter. Requires a character that denotes the unit of the integer:
s - for seconds, m - for minutes, h - for hours.
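
For example, a hedged sketch of a threshold step in <suppress_repeat> mode (policy and values are illustrative):

<step name = "threshold">
    <policy>hi|80|crit|syslog</policy>
    <suppress_repeat>yes</suppress_repeat>
    <category>HARD_FAULT</category>
    <severity>crit</severity>
    <time>10m</time>
</step>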

*<exec_name> = **string**  

Used by threshold under the launch-exec mode. Defines the name of the executable to be launched by the scheduler when the filter conditions are met and a matching message is found. A logical group name is recommended.

*<exec_argv> = **cslist**  

Used by threshold under the launch-exec mode. Optional. Defaults to an empty list. A comma-separated list of arguments passed to the executable.
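
A hedged sketch of the launch-exec mode (the executable name and arguments are hypothetical):

<step name = "threshold">
    <policy>hi|90|crit|syslog</policy>
    <exec_name>power_capper</exec_name>
    <exec_argv>--cap,75</exec_argv>
</step>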

<policy> = <threshold_type>**"|"**<threshold_value>**"|"**<severity>**"|"**<notification_mechanism>
*<threshold_type> = {**hi**|**low**}
*<threshold_value> = **float**  

It can also be an integer. It is the threshold boundary of the sensor measurement.

*<severity> = {**emerg**|**alert**|**crit**|**err**|**warning**|**notice**|**info**|**debug**}  

The severity/priority of the RAS event triggered by the policy; severity levels follow the RFC 5424 syslog protocol.
There are some deprecated severity names that may still work on your system:
error (same as err), warn (same as warning), panic (same as emerg).

*<notification_mechanism> = {**syslog**|**smtp**}  

The notification mechanism used to communicate the event.
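
For example, a policy string built from the fields above (values are illustrative) that raises a crit RAS event to syslog when the measurement exceeds 95.5:

<policy>hi|95.5|crit|syslog</policy>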

<ipmi> = <bmc_node>+  

You can find full ipmi documentation and examples in section 2.3.6-ipmi

<bmc_node> = name <bmc_address> <user> <pass> <auth_method> <priv_level> <aggregator>

*name = string
A string that contains this BMC's name. This is user-defined.

*<bmc_address> = **string**  

The IP address of the BMC. This may not be the same as the IP address of the node.

*<user> = **string**  

The username on the remote BMC nodes used for retrieving the metrics via the IPMI interface.

*<pass> = **string**  

The password of the remote BMC nodes used for retrieving the metrics via the IPMI interface, for the username configured above.

*<auth_method> = {**NONE**|**MD2**|**MD5**|**UNUSED**|**PASSWORD**|**AUTH_OEM**}  

Optional. Defaults to PASSWORD. The authentication method.

*<priv_level> = {**CALLBACK**|**USER**|**OPERATOR**|**ADMIN**|**OEM**}  

Optional. Defaults to USER. The privilege level.

*<aggregator> = **string**  

The hostname of the aggregator that receives the metrics.

<snmp> = <config>+  

You can find full snmp documentation and examples in section 2.3.15-snmp

<config> = name version user pass auth sec location <aggregator> <hostname> <oids>

*name = string
A string that contains this snmp name. This is user-defined.

*version = {1|3}
Specifies the SNMP version to use.

*user = string
On SNMPv1, the community user of the device. On SNMPv3, the username of the credential used to access the device.

*pass = string
On SNMPv3, the password of the credential used to access the device.

*auth = {MD5|SHA1}
On SNMPv3. Optional. Defaults to MD5. The encryption mechanism.

*sec = {NOAUTH|AUTHNOPRIV|AUTHPRIV}
On SNMPv3. Optional. Defaults to AUTHPRIV. The security access method. When AUTHPRIV is used, the default protocol of the Net-SNMP library applies (DES in most cases, unless disabled at compile time in the library).

*location = string
Optional. Provides additional info for locating the device.

*<aggregator> = **string**  

A single string specifying which aggregator is in charge of collecting the SNMP data of the device.

*<hostname> = **regex**  

The device hostname or IP address. This can be a comma-separated list, a logical group and/or a regular expression.

*<oids> = **cslist**  

The comma-separated list of OIDs to query. Both numerical OIDs and textual MIB names are supported.

XML example

Typically the file is placed at $PATH2SENSYS/orcm/etc/orcm-site.xml. Sensys will look for this file by name.

The Sensys XML parser only parses a simplified XML format. The simplifications are as follows:

  • XML attributes are not supported
  • Quoted strings can only use double quotes (").

A prototype CFGI file written in XML is provided hereafter. It presents a cluster with the following configuration:

  • A 4-tier hierarchy: cluster, row, rack, node
  • 1 Scheduler
  • 1 row named row1 without a controller
  • 4 racks in the single row, locally called "agg01", "agg02", "agg03", "agg04"
  • 1024 nodes equally distributed among the racks, locally called “node0000”, …, “node1023”
  • Each rack and node has a controller

The example has inline comments which provide further details.

<?xml version="1.0" encoding="UTF-8" ?>
<configuration>
    <!-- Version is fixed to 3.1 -->
    <version>3.1</version>
    <!-- We need a single RECORD -->
    <role>RECORD</role>
    <junction>
        <!-- We need a single root for the hierarchy -->
        <type>cluster</type>
        <name>master3</name>
        <junction>
            <type>row</type>
            <name>row1</name>
            <junction>
                <type>rack</type>
                <name>agg01</name>
                <controller>
                    <host>agg01</host>
                    <port>55805</port>
                    <aggregator>yes</aggregator>
                </controller>
                <junction>
                    <type>node</type>
                    <name>node[4:0-255]</name>
                    <controller>
                        <!-- This controller takes its host name from its node junction's name -->
                        <!-- The @ operator does the unique selection -->
                        <host>@</host>
                        <port>55805</port>
                        <aggregator>no</aggregator>
                    </controller>
                </junction>
            </junction>
            <junction>
                <type>rack</type>
                <name>agg02</name>
                <controller>
                    <host>agg02</host>
                    <port>55805</port>
                    <aggregator>yes</aggregator>
                </controller>
                <junction>
                    <type>node</type>
                    <name>node[4:256-511]</name>
                    <controller>
                        <host>@</host>
                        <port>55805</port>
                        <aggregator>no</aggregator>
                    </controller>
                </junction>
            </junction>
            <junction>
                <type>rack</type>
                <name>agg03</name>
                <controller>
                    <host>agg03</host>
                    <port>55805</port>
                    <aggregator>yes</aggregator>
                </controller>
                <junction>
                    <type>node</type>
                    <name>node[4:512-767]</name>
                    <controller>
                        <host>@</host>
                        <port>55805</port>
                        <aggregator>no</aggregator>
                    </controller>
                </junction>
            </junction>
            <junction>
                <type>rack</type>
                <name>agg04</name>
                <controller>
                    <host>agg04</host>
                    <port>55805</port>
                    <aggregator>yes</aggregator>
                </controller>
                <junction>
                    <type>node</type>
                    <name>node[4:768-1023]</name>
                    <controller>
                        <host>@</host>
                        <port>55805</port>
                        <aggregator>no</aggregator>
                    </controller>
                </junction>
            </junction>
        </junction>
    </junction>
    <scheduler>
        <!-- shost identifies the node that houses the Sensys scheduler. Only one allowed -->
        <shost>master01</shost>
        <port>55820</port>
    </scheduler>
    <workflows>
        <aggregator>agg01</aggregator>
        <workflow name = "wf1">
            <step name = "filter">
                <data_group>syslog</data_group>
            </step>
            <step name = "genex">
                <msg_regex>access granted</msg_regex>
                <severity>info</severity>
                <notifier>smtp</notifier>
            </step>
        </workflow>
    </workflows>
    <ipmi>
        <bmc_node name="node0004">
            <bmc_address>192.168.0.104</bmc_address>
            <user>bmc_username_01</user>
            <pass>12345678</pass>
            <auth_method>PASSWORD</auth_method>
            <priv_level>USER</priv_level>
            <aggregator>agg01</aggregator>
        </bmc_node>
    </ipmi>
    <snmp>
        <config name="snmp1" version="3" user="user" pass="12345678" auth="MD5" sec="AUTHNOPRIV">
            <aggregator>agg01</aggregator>
            <hostname>server[2:0-20],server21</hostname>
            <oids>1.3.6.1.4.1.343.1.1.3.1,1.3.6.1.4.1.343.1.1.3.4</oids>
        </config>
        <config name="snmp2" version="1" user="user" location="X Lab">
            <aggregator>agg02</aggregator>
            <hostname>switches[2:0-20],switch21</hostname>
            <oids>1.3.6.1.4.1.343.1.1.3.1,1.3.6.1.4.1.343.1.1.3.4</oids>
        </config>
    </snmp>
</configuration>

Running multiple aggregators on one node

Currently, Sensys supports running multiple aggregators on one node. The reason for supporting this is that at extreme scales a single aggregator will be saturated, given the large number of compute nodes and the extreme volume of data. In this case, multiple aggregators are needed. However, we do not want each aggregator to run on a separate node, because we want as many compute nodes as possible to run applications. Given that a compute node will likely have much higher parallelism (e.g., 1000 cores) at extreme scales, running multiple aggregators on one node is a good choice.

To run multiple aggregators on the same node, Sensys needs to distinguish between them by the combination of logical hostname and port number. Assume the actual hostname of the node is master01 and the external IP address of the node is X.X.X.X. In the /etc/hosts file, there should be one line mapping the actual hostname to the external IP address in the format:

X.X.X.X   master01

For example, if the user/admin wants to run 4 aggregators on the same node, he/she can define logical hostnames (aliases) for the node by appending the aliases to master01 in the /etc/hosts file. Assuming the aliases are agg01, agg02, agg03 and agg04:

X.X.X.X   master01 agg01 agg02 agg03 agg04

The mapping of the logical hostnames to the node IP for the aggregators must also be copied into the /etc/hosts files of the corresponding compute nodes, so that the compute nodes can recognize the logical hostnames of the aggregators.

In the configuration file, each aggregator must be given a unique logical hostname as well as a unique port number. When running multiple aggregators on the same node, each aggregator needs to specify its unique port number (exactly the same one as in the configuration file) with the MCA parameter "--omca cfgi_base_port_number", or with the "-p" option.

An example to configure 4 aggregators on the same node with hostname master01 is shown as follows:

<?xml version="1.0" encoding="UTF-8" ?>
<configuration>
    <version>3.1</version>
    <role>RECORD</role>
    <junction>
        <type>cluster</type>
        <name>default_cluster</name>
        <junction>
            <type>row</type>
            <name>default_row</name>
            <junction>
                <type>rack</type>
                <name>rack1</name>
                <controller>
                    <host>agg01</host>
                    <port>55805</port>
                    <aggregator>yes</aggregator>
                </controller>
                <junction>
                    <type>node</type>
                    <name>node[4:0-255]</name>
                    <controller>
                        <host>@</host>
                        <port>55805</port>
                        <aggregator>no</aggregator>
                    </controller>
                </junction>
            </junction>
            <junction>
                <type>rack</type>
                <name>rack2</name>
                <controller>
                    <host>agg02</host>
                    <port>55806</port>
                    <aggregator>yes</aggregator>
                </controller>
                <junction>
                    <type>node</type>
                    <name>node[4:256-511]</name>
                    <controller>
                        <host>@</host>
                        <port>55805</port>
                        <aggregator>no</aggregator>
                    </controller>
                </junction>
            </junction>
            <junction>
                <type>rack</type>
                <name>rack3</name>
                <controller>
                    <host>agg03</host>
                    <port>55807</port>
                    <aggregator>yes</aggregator>
                </controller>
                <junction>
                    <type>node</type>
                    <name>node[4:512-767]</name>
                    <controller>
                        <host>@</host>
                        <port>55805</port>
                        <aggregator>no</aggregator>
                    </controller>
                </junction>
            </junction>
            <junction>
                <type>rack</type>
                <name>rack4</name>
                <controller>
                    <host>agg04</host>
                    <port>55808</port>
                    <aggregator>yes</aggregator>
                </controller>
                <junction>
                    <type>node</type>
                    <name>node[4:768-1023]</name>
                    <controller>
                        <host>@</host>
                        <port>55805</port>
                        <aggregator>no</aggregator>
                    </controller>
                </junction>
            </junction>
        </junction>
    </junction>
    <scheduler>
        <shost>master01</shost>
        <port>55820</port>
    </scheduler>
    <workflows>
        <aggregator>agg01</aggregator>
        <workflow name = "wf1">
            <step name = "filter">
                <data_group>syslog</data_group>
            </step>
            <step name = "genex">
                <msg_regex>access granted</msg_regex>
                <severity>info</severity>
                <notifier>smtp</notifier>
            </step>
        </workflow>
    </workflows>
    <ipmi>
        <bmc_node name="node0004">
            <bmc_address>192.168.0.104</bmc_address>
            <user>bmc_username_01</user>
            <pass>12345678</pass>
            <auth_method>PASSWORD</auth_method>
            <priv_level>USER</priv_level>
            <aggregator>agg01</aggregator>
        </bmc_node>
    </ipmi>
    <snmp>
        <config name="snmp1" version="3" user="user" pass="12345678" auth="MD5" sec="AUTHNOPRIV">
            <aggregator>agg01</aggregator>
            <hostname>server[2:0-20],server21</hostname>
            <oids>1.3.6.1.4.1.343.1.1.3.1,1.3.6.1.4.1.343.1.1.3.4</oids>
        </config>
        <config name="snmp2" version="1" user="user" location="X Lab">
            <aggregator>agg02</aggregator>
            <hostname>switches[2:0-20],switch21</hostname>
            <oids>1.3.6.1.4.1.343.1.1.3.1,1.3.6.1.4.1.343.1.1.3.4</oids>
        </config>
    </snmp>
</configuration>

To run the 4 aggregators on the same node, each aggregator needs to specify its port number as follows:

% ./orcmd --omca cfgi_base_port_number 55805
% ./orcmd --omca cfgi_base_port_number 55806
% ./orcmd --omca cfgi_base_port_number 55807
% ./orcmd --omca cfgi_base_port_number 55808

or use the short -p option as follows:

% ./orcmd -p 55805
% ./orcmd -p 55806
% ./orcmd -p 55807
% ./orcmd -p 55808
