CipherTrust Manager System Monitoring

Prometheus Metrics Endpoint

You can use the Prometheus metrics endpoint to connect the Prometheus monitoring system to CipherTrust Manager as a Service. You can configure Prometheus to scrape CipherTrust Manager as a Service continuously, providing metrics over time to help monitor overall system health, performance, and cryptographic activity.

A sample configuration with Prometheus and Grafana Docker images is available on GitHub. The Grafana data visualization application provides graph visualizations of the metrics that Prometheus collects.

Prerequisites for Sample Configuration

  • CipherTrust Manager as a Service 2.7.0 or later

  • Docker

  • Docker Compose (docker-compose)

Sample Configuration Setup

  1. On your CipherTrust Manager as a Service, enable Prometheus metrics, either through a POST to the /v1/system/metrics/prometheus/enable endpoint, or with the ksctl metrics prometheus enable CLI command.

    A token is returned, which Prometheus needs to scrape CipherTrust Manager as a Service.

    This token does not expire, but can be manually renewed with ksctl metrics prometheus renew-token or a POST to /v1/system/metrics/prometheus/renew-token.

  2. Get the token which Prometheus needs to scrape CipherTrust Manager as a Service, if needed. You can use GET with the /v1/system/metrics/prometheus/status endpoint or ksctl metrics prometheus status.

  3. In the Prometheus Metrics directory, edit the prometheus.yml file.

    At minimum, you must provide the CipherTrust Manager as a Service hostname or IP address in targets and the Prometheus API token in bearer_token. Prometheus can scrape multiple CipherTrust Manager as a Service instances, which might or might not share the same API token. This is an example configuration file with three CipherTrust Manager as a Service nodes, two of which share the same Prometheus API token:

    scrape_configs:
      - job_name: "CipherTrust Manager"
        scheme: "https"
        tls_config:
          #ca_file: "/trusted_cas/web-keysecure-local.pem"
          #server_name: "web.keysecure.local"
          insecure_skip_verify: true
        bearer_token: "1zplR4njZsRN5dNeWAFXhkL1x7MU9q4H"
        metrics_path: "/api/v1/system/metrics/prometheus"
        static_configs:
          - targets:
            - "1.1.1.1"
            - "1.1.1.2"
      - job_name: "CipherTrust Manager Staging"
        scheme: "https"
        tls_config:
          #ca_file: "/trusted_cas/web-keysecure-local.pem"
          #server_name: "web.keysecure.local"
          insecure_skip_verify: true
        bearer_token: "TnRHpdL9v8MnWv8DhN9xuAaKgPevMEZs"
        metrics_path: "/api/v1/system/metrics/prometheus"
        static_configs:
          - targets:
            - "1.1.1.3"
    
  4. Set up TLS certificate verification. By default, the Prometheus configuration sets insecure_skip_verify: true, which skips SSL/TLS certificate validation for the CipherTrust Manager as a Service server and is not recommended for production deployments.

    1. On CipherTrust Manager as a Service, download the certificate associated with the web interface, exporting it in PEM format.

      ksctl interfaces certificate get --name web --icertfile <desired-filename>.pem
      
    2. Use openssl to retrieve the Common Name (CN) of the certificate, which will become the server_name value in Prometheus.

      openssl x509 -noout -subject -in <your-file>.pem
      

      Example response:

      subject=C = US, ST = MD, L = Belcamp, O = Gemalto, CN = web.keysecure.local
      

      The CN value, web.keysecure.local, is the value needed for Prometheus.

    3. Copy the certificate file to the trusted_cas folder in the Prometheus Metrics directory.

    4. Edit the prometheus.yml file to include the ca_file path and the server_name of the certificate, and comment out the insecure_skip_verify parameter. For example:

      scrape_configs:
        - job_name: "CipherTrust Manager"
          scheme: "https"
          tls_config:
            ca_file: "/trusted_cas/web-keysecure-local.pem"
            server_name: "web.keysecure.local"
            #insecure_skip_verify: true
          bearer_token: "TnRHpdL9v8MnWv8DhN9xuAaKgPevMEZs"
          metrics_path: "/api/v1/system/metrics/prometheus"
          static_configs:
            - targets:
              - "1.1.1.1"
      
  5. In the Prometheus directory, run make up to start the stack.

    You can run make down to stop the stack, and make clear to stop the stack and remove all persisted data.

  6. Visit the Prometheus Dashboard in a browser at http://localhost:9090.

    1. Navigate to Status > Targets to ensure that Prometheus is scraping CipherTrust Manager as a Service. The state should display as UP for each node, with no errors.

    2. If you detect a problem, verify the metrics endpoint on CipherTrust Manager as a Service with ksctl metrics prometheus get --api-token <api-token> or with curl -k 'https://<hostname>/api/v1/system/metrics/prometheus' -H 'Authorization: Bearer <api-token>' --compressed. You can also use the Docker Compose logs to debug problems, with docker-compose logs -f.

  7. Visit the Grafana Dashboard in a browser at http://localhost:3000.

    1. Log in with the user admin and the password admin. Set a new password when prompted.

    2. Go to Dashboards > Home to view the included dashboards.
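
The REST calls named in steps 1, 2, and 6 can be sketched in Python as follows. This is a minimal illustration, not a reference client. The hostname cm.example.com and both token strings are placeholders. The sketch also assumes that the enable and status routes sit under the same /api prefix as the metrics path shown above, and that the management calls accept a bearer credential; check both against your deployment. The requests are only built here, not sent.

```python
import urllib.request

# Metrics path taken from the sample prometheus.yml above.
METRICS_PATH = "/api/v1/system/metrics/prometheus"


def build_request(hostname, suffix="", method="GET", token=None):
    """Build (but do not send) a request to a Prometheus metrics route."""
    url = f"https://{hostname}{METRICS_PATH}{suffix}"
    request = urllib.request.Request(url, method=method)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    return request


# Step 1: enable metrics. "admin_jwt" is a hypothetical management credential.
enable = build_request("cm.example.com", "/enable", method="POST", token="admin_jwt")

# Step 2: read the status, which returns the Prometheus API token.
status = build_request("cm.example.com", "/status", token="admin_jwt")

# Step 6: the same check as the curl command, using the Prometheus API token.
scrape = build_request("cm.example.com", token="prometheus_api_token")

for request in (enable, status, scrape):
    print(request.get_method(), request.full_url)
```

To send a request, pass it to urllib.request.urlopen. Unlike curl -k, urlopen verifies the server certificate by default, so the certificate exported in step 4 must be trusted by the client.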

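When several nodes share tokens, the scrape_configs entries from step 3 can be generated rather than written by hand. The following sketch assumes one job per API token and reuses only the keys that appear in the sample file. The helper name scrape_job and the placeholder token strings are illustrative and are not part of the sample repository. The output is JSON, which most YAML parsers accept as flow-style YAML; confirm that your Prometheus version loads it before relying on it.

```python
import json


def scrape_job(name, token, targets, ca_file=None, server_name=None):
    """Return one scrape_configs entry using the keys from the sample file."""
    if ca_file and server_name:
        tls_config = {"ca_file": ca_file, "server_name": server_name}
    else:
        # Mirrors the sample default; not recommended for production.
        tls_config = {"insecure_skip_verify": True}
    return {
        "job_name": name,
        "scheme": "https",
        "tls_config": tls_config,
        "bearer_token": token,
        "metrics_path": "/api/v1/system/metrics/prometheus",
        "static_configs": [{"targets": list(targets)}],
    }


config = {
    "scrape_configs": [
        # Two nodes that share one Prometheus API token.
        scrape_job("CipherTrust Manager", "<api-token-1>", ["1.1.1.1", "1.1.1.2"]),
        # One node with its own token.
        scrape_job("CipherTrust Manager Staging", "<api-token-2>", ["1.1.1.3"]),
    ]
}

print(json.dumps(config, indent=2))
```

Passing ca_file and server_name produces the verified TLS form from step 4 instead of insecure_skip_verify.
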
Metrics Prefixes

The following high-level categories of metrics are returned from the endpoint:

| Prefixes | Metrics Type | Prometheus Exporter, Package, or Integration |
| --- | --- | --- |
| ciphertrust_ | Metrics for specific CipherTrust resource applications. For example, size of the server audit log or key cache hits. | N/A |
| dummy_ | Custom internal metrics that can be disregarded. | N/A |
| docker_ | Metrics for the Docker containers that underlie CipherTrust Manager as a Service microservices, such as state, lifecycle, and resource usage. | docker_exporter |
| go_ | Metrics gathered by the Go runtime. Most useful for debugging purposes with Thales engineers. | go-metrics |
| http_, httpclient_ | Metrics about HTTP traffic to and from the CipherTrust Manager as a Service REST API endpoints. For example, response time and number of requests to an endpoint. | N/A |
| node_ | Metrics for the CipherTrust Manager as a Service host, such as CPU and disk details. | Node exporter |
| process_ | Metrics for microservices written in Go, including CPU, memory, and file descriptor usage as well as the process start time. | Process collector of the Go Prometheus package |
| promhttp_ | Measures the number of times individual microservices are called, broken down by HTTP status code. | promhttp Go package |
| sql_ | CipherTrust-specific metrics collected to analyze performance issues. Most useful for debugging purposes with Thales engineers. | N/A |
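
To see how a scrape breaks down across these prefixes, the text returned by the endpoint can be grouped by metric name. The following is a rough sketch that assumes the endpoint returns the standard Prometheus text exposition format. The sample lines are invented for illustration; in particular, the ciphertrust_, http_, and httpclient_ metric names shown are hypothetical.

```python
from collections import Counter

# Prefixes listed in the table above.
KNOWN_PREFIXES = (
    "ciphertrust_", "dummy_", "docker_", "go_", "httpclient_",
    "http_", "node_", "process_", "promhttp_", "sql_",
)


def count_by_prefix(exposition_text):
    """Count sample lines per known prefix in Prometheus text-format output."""
    counts = Counter()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        prefix = next((p for p in KNOWN_PREFIXES if line.startswith(p)), "other")
        counts[prefix] += 1
    return counts


# Invented sample output; real metric names will differ.
sample = """\
# TYPE go_goroutines gauge
go_goroutines 42
process_cpu_seconds_total 12.5
http_example_requests_total{code="200"} 7
httpclient_example_requests_total 3
ciphertrust_example_total 1
"""

print(count_by_prefix(sample))
```

Feeding it the body returned by the curl check in the setup steps gives a quick picture of which categories dominate a scrape.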

Available Metrics Dashboards

The following dashboards are displayed in Grafana for CipherTrust Manager as a Service:

  • CipherTrust Manager Developer - Metrics relevant to internal CipherTrust Manager as a Service developers to debug problems. This includes:

    • Average JWT processing time

    • Applications and Accounts Totals

    • Key Encryption Keys (KEKs) Count

    • Authorization Policies Cache Hits per Minute

    • Average Prometheus Metrics Scraping Response Time

  • CipherTrust Manager Host - Metrics about the health of the CipherTrust Manager as a Service host, including CPU details, memory details, network details, network connections, and disk details.

  • CipherTrust Manager HTTP Traffic - Metrics about HTTP traffic to the CipherTrust Manager as a Service. This includes:

    • Average HTTP Response Time Per Minute

    • HTTP Requests in the Last Minute

    • Average Network Latency Per Minute

    • Average CM HTTP Client Response Time Per Minute

    • HTTP 500 Errors in the Last Minute

  • CipherTrust Manager NAE - Basic metrics about the performance of the NAE-XML cryptographic interface. This includes XML response time and XML processing time.

  • CipherTrust Manager NAE Developer - More detailed metrics about operations and performance on the NAE-XML interface, intended for debugging. This includes:

    • Key Info Cache Misses Time Per Minute

    • Key Info Cache Hits Time Per Minute

    • XML Total Processing time

    • XML Parsing Time

    • XML Transmit Time

    • XML Receive Time

    • XML Execution Time

  • CipherTrust Manager Resources - Metrics about creation and use of objects on CipherTrust Manager as a Service, such as audit records, keys, licenses, backup, and users. This includes:

    • Audit Records Created Per Second Over The Last Minute

    • Audit Records Created In Last Five Minutes

    • Total Number of Audit Records

    • Total Number of Keys By Algorithm

    • Crypto Operations Per Second Over The Last Minute

    • Total Number Of Connector Licenses Deployed

    • Number of License Units Consumed

    • License Unit Consumption by Percentage

    • Total Number Of Group Users in the System

    • Total Number Of Key Rotations

    • Key Rotations In Last Five Minutes

    • Time Taken To Create Backup

    • Number of Backups taken

  • CipherTrust Manager Services - Metrics about the performance of individual microservices within CipherTrust Manager as a Service, intended for debug purposes. This includes:

    • CPU percentage

    • Memory usage

    • Network I/O (transmitting and receiving)

    • Disk I/O (reading and writing)

  • CipherTrust Manager Node Metrics - Metrics of the nodes in a clustered system showing the node connection information. This includes:

    • Write, flush, and replay lags

    • Sent, write, flush, and replay lag sizes

    • Whether replication is blocked

    • Whether node is connected

    • Connect time for a node

    • Apply rate

    • Catchup interval

    • Uptime for a connection
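
Several panel titles above, such as Audit Records Created Per Second Over The Last Minute, describe a per-second rate over a time window. The arithmetic behind such a figure can be sketched as follows, assuming the underlying metric is a monotonically increasing counter. This is a simplified illustration of the idea, not the query the dashboards actually use: the PromQL rate() function additionally handles counter resets and extrapolates to the window edges. The sample values are made up.

```python
def per_second_rate(first_sample, last_sample):
    """Average per-second increase of a counter between two (timestamp, value) samples."""
    (t0, v0), (t1, v1) = first_sample, last_sample
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    if v1 < v0:
        raise ValueError("counter decreased; it was probably reset")
    return (v1 - v0) / (t1 - t0)


# Hypothetical counter readings taken 60 seconds apart.
print(per_second_rate((0.0, 1200.0), (60.0, 1500.0)))  # 300 / 60 = 5.0
```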