Alarms

Alarms are the mechanism by which CipherTrust Manager as a Service notifies administrators of problems with the state of the service, one of its nodes, or select clients such as CTE. Alarms are raised when server or client records indicate that CipherTrust Manager as a Service or a client is unhealthy or is not configured securely. Alarms can be fetched via the REST API, the CLI, or the GUI.

Alarm states are listed in the following table:

| State | Description |
| --- | --- |
| off | The alarm is inactive and does not need to be investigated. |
| on | The alarm is active and should be investigated. It remains active until the condition that triggered it is no longer valid. |
| unknown | The alarm's state could not be determined and should be investigated. Typically this occurs when the service responsible for triggering the alarm failed to communicate its state (e.g. the service is down). |
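
The state table above can be read as a simple decision rule. The following is a minimal sketch, not part of the product: the names `AlarmState` and `needs_investigation` are hypothetical and only encode which states the table says should be investigated.

```python
from enum import Enum


class AlarmState(Enum):
    """The three alarm states listed in the table above."""
    OFF = "off"
    ON = "on"
    UNKNOWN = "unknown"


def needs_investigation(state: str) -> bool:
    """Return True for states the table says should be investigated.

    Raises ValueError for a state string that is not in the table.
    """
    return AlarmState(state) in (AlarmState.ON, AlarmState.UNKNOWN)


for value in ("off", "on", "unknown"):
    print(value, needs_investigation(value))
```

Treating `unknown` like `on` follows the table: an undetermined state may mean the reporting service itself is down.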

The CipherTrust Manager as a Service has the following built-in alarms for server problems:

| Name | Severity | Trigger | Remediation |
| --- | --- | --- | --- |
| Disk Full | Critical | The root file system's used-space percentage exceeds the configured threshold of 80%. | Increase the size of the node's disk, partition, or file system, or replace the node with a new node that has sufficient storage. |
| NAE TLS Disabled | Critical | The system is started with NAE's interface mode (see interfaces) configured to not use TLS. | Modify NAE's interface mode (see interfaces) to one of the options that specifies TLS, then restart (see services). |
| HSM Offline | Critical | The system has been unable to access the HSM for more than 15 seconds. NOTE: If connectivity to the HSM is not restored after 5 minutes, all services are shut down until the HSM becomes available. | Restore connectivity to the HSM. |
| Cluster Node Down | Critical | A node within a cluster is down. | Restore the connection of the down node to the cluster. |
| Cluster Node Certificate Expiration | Critical | An internal CipherTrust Manager as a Service certificate that is used for database access and clustering has expired, or will expire within 30 days. | Reboot the node when the alarm triggers. A new certificate is automatically generated after the reboot. |
| Syslog Connection Offline | Critical | A Syslog connection goes offline. | Restore connectivity to the specified Syslog connection. |
| Deprecated TLS version Enabled | Critical | TLSv1.0 or TLSv1.1 is set as the minimum TLS version for the NAE/KMIP interface on the CipherTrust Manager. | Set the minimum TLS version to TLSv1.2 or higher. |
| License Violation | Critical | A Virtual CipherTrust Manager as a Service k170v instance has more than four CPUs assigned to it. | Upgrade to a valid k470 license, or assign four or fewer CPUs to the k170v. |
| License Expiration | Warning | One or more licenses are set to expire in fewer than 90 days. The description indicates the licenses and the number of days until expiry. | |
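
To make the Disk Full trigger concrete, here is a rough sketch of an equivalent check that could be run on any host. It is an illustration only: the function names are hypothetical, and the way the product itself measures used space is not documented here, so the result may differ slightly from what the alarm reports.

```python
import shutil

# Threshold taken from the Disk Full row of the table above.
DISK_FULL_THRESHOLD_PERCENT = 80


def exceeds_threshold(total_bytes: int, used_bytes: int,
                      threshold: float = DISK_FULL_THRESHOLD_PERCENT) -> bool:
    """Return True when the used-space percentage is strictly above the threshold."""
    return 100.0 * used_bytes / total_bytes > threshold


def root_disk_full(path: str = "/") -> bool:
    """Apply the same rule to a mounted file system (assumed here to be '/')."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return exceeds_threshold(usage.total, usage.used)


print(exceeds_threshold(total_bytes=100, used_bytes=47))
```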

The CipherTrust Manager as a Service server also has dynamic alarms that are triggered based on server or client record conditions. Refer to: Configuring alarm triggers based on a record.

Consult the documentation for the specific connector for information on interpreting client alarms.

To list all alarms:
$ ksctl alarms list

Example response:

{
    "skip": 0,
    "limit": 10,
    "total": 8,
    "resources": [
        {
            "name": "License Violation",
            "state": "off",
            "triggeredAt": "2021-09-30T10:07:52.5529Z",
            "description": "Alarm triggers if CPU count exceeds the limit allowed in the license",
            "severity": "critical"
        },
        {
            "name": "Cluster Node Certificate Expiration",
            "state": "off",
            "triggeredAt": "2021-09-30T10:07:52.547729Z",
            "description": "Alarm triggers 30 days to certificate expiration (the certificate is currently valid for 2.0 years)",
            "severity": "critical"
        },
        {
            "name": "Cluster Node Down",
            "state": "off",
            "triggeredAt": "2021-09-30T10:07:52.541277Z",
            "description": "Cluster nodes down alarm triggers when any node is down (currently all nodes are up)",
            "severity": "critical"
        },
        {
            "name": "Disk Full",
            "state": "off",
            "triggeredAt": "2021-09-30T10:07:47.535595Z",
            "description": "Disk full alarm triggers above 80% of capacity (currently at 47%)",
            "severity": "critical"
        },
        {
            "name": "HSM Offline",
            "state": "off",
            "triggeredAt": "2021-09-30T10:07:44.730768Z",
            "description": "HSM is offline",
            "severity": "critical"
        },
        {
            "name": "Deprecated TLS version Enabled",
            "state": "unknown",
            "triggeredAt": "2021-09-30T10:07:37.527468Z",
            "description": "Deprecated TLS version(TLSv1.0/1.1) enabled on NAE/KMIP interface",
            "severity": "critical"
        },
        {
            "name": "KMIP Debug Logs Unmask Enabled",
            "state": "off",
            "triggeredAt": "2021-09-30T10:07:27.516841Z",
            "description": "Unmasking of sensitive data for KMIP debug logs is enabled",
            "severity": "critical"
        },
        {
            "name": "NAE TLS Disabled",
            "state": "off",
            "triggeredAt": "2021-09-30T10:07:17.510478Z",
            "description": "TLS is disabled on the NAE interface",
            "severity": "critical"
        }
    ]
}
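
A response like the one above can be post-processed by a script. The sketch below groups alarm names by state and flags a possibly incomplete listing. The function name `summarize_alarms` is hypothetical, the sample is a shortened copy of the response fields shown above, and the reading of `total` as the overall alarm count (so that a shorter `resources` list implies further pages) is an assumption.

```python
import json
from collections import defaultdict


def summarize_alarms(response_text: str):
    """Group alarm names by state and report whether more pages may remain."""
    response = json.loads(response_text)
    by_state = defaultdict(list)
    for alarm in response["resources"]:
        by_state[alarm["state"]].append(alarm["name"])
    # Assumption: "total" counts all alarms, so fewer returned
    # resources than "total" means the listing was truncated.
    truncated = response["total"] > len(response["resources"])
    return dict(by_state), truncated


# Shortened sample using the same fields as the example response.
sample = """
{
    "skip": 0,
    "limit": 10,
    "total": 2,
    "resources": [
        {"name": "Disk Full", "state": "off", "severity": "critical"},
        {"name": "Deprecated TLS version Enabled", "state": "unknown", "severity": "critical"}
    ]
}
"""

by_state, truncated = summarize_alarms(sample)
print(by_state.get("on", []) + by_state.get("unknown", []))
```

In practice the input text would come from saving the output of `ksctl alarms list` to a file or capturing it from a subprocess.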
To acknowledge that an alarm is under investigation:
$ ksctl alarms acknowledge -i 1650789b-e8e6-4915-91ab-b067e073f39f
To clear (i.e. turn off) an alarm:
$ ksctl alarms clear -i 1650789b-e8e6-4915-91ab-b067e073f39f
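
The two commands above can also be driven from a script. This is a minimal sketch that assumes `ksctl` is on the PATH and already configured. The helper names are hypothetical, and only the `acknowledge` and `clear` subcommands with the `-i` option shown above are used.

```python
import subprocess

ALLOWED_ACTIONS = ("acknowledge", "clear")


def alarm_command(action: str, alarm_id: str) -> list:
    """Build the argument list for 'ksctl alarms <action> -i <alarm_id>'."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    return ["ksctl", "alarms", action, "-i", alarm_id]


def run_alarm_command(action: str, alarm_id: str) -> str:
    """Run the command and return its standard output (requires ksctl)."""
    result = subprocess.run(alarm_command(action, alarm_id),
                            check=True, capture_output=True, text=True)
    return result.stdout


# Building the command does not require ksctl to be installed.
print(alarm_command("acknowledge", "1650789b-e8e6-4915-91ab-b067e073f39f"))
```

Passing the arguments as a list, rather than a single shell string, avoids quoting problems if an ID ever contains unexpected characters.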