Prometheus Metrics Endpoint
You can use the Prometheus metrics endpoint to connect the Prometheus monitoring system to CipherTrust Manager as a Service. You can set Prometheus to scrape the CipherTrust Manager as a Service continuously, providing metrics over time to help monitor overall system health, performance, and cryptographic activity.
A sample configuration with Prometheus and Grafana Docker images is available on GitHub. The Grafana data visualization application graphs the metrics that Prometheus collects.
Prerequisites for Sample Configuration
- CipherTrust Manager as a Service 2.7.0 or later
- Docker
- Docker Compose (`docker-compose`)
Sample Configuration Setup
- On your CipherTrust Manager as a Service, enable Prometheus metrics, either through a `POST` to the `/v1/system/metrics/prometheus/enable` endpoint, or with the `ksctl metrics prometheus enable` CLI command.

  A token is returned, which Prometheus needs to scrape CipherTrust Manager as a Service. This token does not expire, but can be manually renewed with `ksctl metrics prometheus renew-token` or a `POST` to `/v1/system/metrics/prometheus/renew-token`.
- Get the token which Prometheus needs to scrape CipherTrust Manager as a Service, if needed. You can use `GET` with the `/v1/system/metrics/prometheus/status` endpoint or `ksctl metrics prometheus status`.
- In the Prometheus Metrics directory, edit the `prometheus.yml` file.

  At minimum, you must provide the CipherTrust Manager as a Service hostname/IP in `targets` and the Prometheus API token in `bearer_token`. Prometheus can scrape multiple CipherTrust Manager as a Service instances, which might or might not share the API token. This is an example configuration file with three CipherTrust Manager as a Service nodes, of which two share the same Prometheus API token:

      scrape_configs:
        - job_name: "CipherTrust Manager"
          scheme: "https"
          tls_config:
            #ca_file: "/trusted_cas/web-keysecure-local.pem"
            #server_name: "web.keysecure.local"
            insecure_skip_verify: true
          bearer_token: "1zplR4njZsRN5dNeWAFXhkL1x7MU9q4H"
          metrics_path: "/api/v1/system/metrics/prometheus"
          static_configs:
            - targets:
                - "1.1.1.1"
                - "1.1.1.2"
        - job_name: "CipherTrust Manager Staging"
          scheme: "https"
          tls_config:
            #ca_file: "/trusted_cas/web-keysecure-local.pem"
            #server_name: "web.keysecure.local"
            insecure_skip_verify: true
          bearer_token: "TnRHpdL9v8MnWv8DhN9xuAaKgPevMEZs"
          metrics_path: "/api/v1/system/metrics/prometheus"
          static_configs:
            - targets:
                - "1.1.1.3"
- Set up TLS authentication. By default, the Prometheus configuration sets `insecure_skip_verify: true`, which is not recommended for production deployments because it skips SSL/TLS certificate validation for the CipherTrust Manager as a Service server.
  - On CipherTrust Manager as a Service, download the certificate associated with the web interface and export it in `pem` format.

        ksctl interfaces certificate get --name web --icertfile <desired-filename>.pem
  - Use openssl to retrieve the Common Name (CN) of the certificate, which will become the `server_name` value in Prometheus.

        openssl x509 -noout -subject -in <your-file>.pem

    Example response:

        subject=C = US, ST = MD, L = Belcamp, O = Gemalto, CN = web.keysecure.local

    The CN value, `web.keysecure.local`, is the value needed for Prometheus.
  - Copy the certificate file to the `trusted_cas` folder in the Prometheus Metrics directory.
  - Edit the `prometheus.yml` file to include the `ca_file` path and `server_name` of the certificate, and disable the `insecure_skip_verify` parameter. For example:

        scrape_configs:
          - job_name: "CipherTrust Manager"
            scheme: "https"
            tls_config:
              ca_file: "/trusted_cas/web-keysecure-local.pem"
              server_name: "web.keysecure.local"
              #insecure_skip_verify: true
            bearer_token: "TnRHpdL9v8MnWv8DhN9xuAaKgPevMEZs"
            metrics_path: "/api/v1/system/metrics/prometheus"
            static_configs:
              - targets:
                  - "1.1.1.1"
- In the Prometheus directory, run `make up` to start the stack.

  You can run `make down` to stop the stack and `make clear` to stop the stack and clear all persisted data.
- Visit the Prometheus Dashboard in a browser at http://localhost:9090.
  - Navigate to Status > Targets to ensure that Prometheus is scraping CipherTrust Manager as a Service. The state should display as `UP` for each node, with no errors.
  - If you detect a problem, verify the metrics endpoint on CipherTrust Manager as a Service with `ksctl metrics prometheus get --api-token <api-token>` or `curl -k 'https://<hostname>/api/v1/system/metrics/prometheus' -H 'Authorization: Bearer <api-token>' --compressed`. You can also use the Docker Compose logs to debug problems, with `docker-compose logs -f`.
- Visit the Grafana Dashboard in a browser at http://localhost:3000.
  - Log in with the user `admin` and the password `admin`. Set a new password when prompted.
  - Go to Dashboards > Home to view the included dashboards.
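The verification step above uses `curl` against the metrics endpoint. As a rough sketch, the same request could be built in Python with only the standard library. The function names `build_scrape_request` and `fetch_metrics` are hypothetical helpers and not part of any CipherTrust tooling; the path and the `Authorization: Bearer` header come from the `curl` example, and the CA file is assumed to be the certificate exported in the TLS step.

```python
import ssl
import urllib.request

# Path taken from the curl example in this section.
METRICS_PATH = "/api/v1/system/metrics/prometheus"


def build_scrape_request(hostname: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the metrics endpoint."""
    url = f"https://{hostname}{METRICS_PATH}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_token}"},
        method="GET",
    )


def fetch_metrics(hostname: str, api_token: str, ca_file: str) -> str:
    """Send the request, validating the server certificate against ca_file.

    Assumption: ca_file is the PEM certificate exported from the web interface.
    """
    context = ssl.create_default_context(cafile=ca_file)
    request = build_scrape_request(hostname, api_token)
    with urllib.request.urlopen(request, context=context, timeout=10) as response:
        return response.read().decode("utf-8")
```

Unlike `curl -k`, this sketch keeps certificate validation on, so the hostname you pass must match a name in the certificate.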
Metrics Prefixes
The following high-level categories of metrics are returned from the endpoint:
| Prefixes | Metrics Type | Prometheus Exporter, Package, or Integration |
|---|---|---|
| `ciphertrust_` | Metrics for specific CipherTrust resource applications. For example, size of the server audit log or key cache hits. | N/A |
| `dummy_` | Custom internal metrics that can be disregarded. | N/A |
| `docker_` | Metrics for the Docker containers that underlie CipherTrust Manager as a Service microservices, such as state, lifecycle, and resource usage. | docker_exporter |
| `go_` | Metrics gathered by the Go runtime. Most useful for debugging purposes with Thales engineers. | go-metrics |
| `http_`, `httpclient_` | Metrics about HTTP traffic to and from the CipherTrust Manager as a Service REST API endpoints. For example, response time and number of requests to an endpoint. | N/A |
| `node_` | Metrics for the CipherTrust Manager as a Service host, such as CPU and disk details. | Node exporter |
| `process_` | Metrics for microservices written in Go, including CPU, memory, and file descriptor usage as well as the process start time. | Process collector of the Go Prometheus package |
| `promhttp_` | Number of times individual microservices are called, broken down by HTTP status code. | promhttp Go package |
| `sql_` | CipherTrust-specific metrics collected to analyze performance issues. Most useful for debugging purposes with Thales engineers. | N/A |
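To see how a scrape breaks down across these categories, the response can be grouped by prefix. The following is a minimal sketch that assumes the endpoint returns the standard Prometheus text exposition format; the helper names and the sample metric `ciphertrust_example_total` are hypothetical and used only for illustration.

```python
import re
from collections import Counter

# Prefixes listed in the table above.
KNOWN_PREFIXES = (
    "ciphertrust_", "dummy_", "docker_", "go_", "http_",
    "httpclient_", "node_", "process_", "promhttp_", "sql_",
)


def metric_prefix(name: str) -> str:
    """Return the documented prefix a metric name starts with, or 'other'."""
    for prefix in KNOWN_PREFIXES:
        if name.startswith(prefix):
            return prefix
    return "other"


def count_by_prefix(exposition_text: str) -> Counter:
    """Count sample lines per prefix, skipping comments and blank lines."""
    counts = Counter()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first '{' (labels) or whitespace (value).
        name = re.split(r"[{\s]", line, maxsplit=1)[0]
        counts[metric_prefix(name)] += 1
    return counts


# Illustrative input; ciphertrust_example_total is a made-up metric name.
SAMPLE = """\
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 42
process_cpu_seconds_total 12.5
node_cpu_seconds_total{cpu="0",mode="idle"} 1000.0
ciphertrust_example_total 7
"""

counts = count_by_prefix(SAMPLE)
```

A breakdown like this can help decide which prefixes to keep or drop when storage in Prometheus is a concern.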
Available Metrics Dashboards
The following dashboards are displayed in Grafana for CipherTrust Manager as a Service:
- CipherTrust Manager Developer - Metrics that help internal CipherTrust Manager as a Service developers debug problems. This includes:
  - Average JWT processing time
  - Applications and Accounts Totals
  - Key Encryption Keys (KEKs) Count
  - Authorization Policies Cache Hits per Minute
  - Average Prometheus Metrics Scraping Response Time
- CipherTrust Manager Host - Metrics about the health of the CipherTrust Manager as a Service host, including CPU details, memory details, network details, network connections, and disk details.
- CipherTrust Manager HTTP Traffic - Metrics about HTTP traffic to the CipherTrust Manager as a Service. This includes:
  - Average HTTP Response Time Per Minute
  - HTTP Requests in the Last Minute
  - Average Network Latency Per Minute
  - Average CM HTTP Client Response Time Per Minute
  - HTTP 500 Errors in the Last Minute
- CipherTrust Manager NAE - Basic metrics about the performance of the NAE-XML cryptographic interface. This includes XML response time and XML processing time.
- CipherTrust Manager NAE Developer - More detailed metrics about operations and performance on the NAE-XML interface, intended for debugging. This includes:
  - Key Info Cache Misses Time Per Minute
  - Key Info Cache Hits Time Per Minute
  - XML Total Processing Time
  - XML Parsing Time
  - XML Transmit Time
  - XML Receive Time
  - XML Execution Time
- CipherTrust Manager Resources - Metrics about creation and use of objects on CipherTrust Manager as a Service, such as audit records, keys, licenses, backups, and users. This includes:
  - Audit Records Created Per Second Over The Last Minute
  - Audit Records Created In Last Five Minutes
  - Total Number of Audit Records
  - Total Number of Keys By Algorithm
  - Crypto Operations Per Second Over The Last Minute
  - Total Number Of Connector Licenses Deployed
  - Number of License Units Consumed
  - License Unit Consumption by Percentage
  - Total Number Of Group Users in the System
  - Total Number Of Key Rotations
  - Key Rotations In Last Five Minutes
  - Time Taken To Create Backup
  - Number of Backups Taken
- CipherTrust Manager Services - Metrics about the performance of individual microservices within CipherTrust Manager as a Service, intended for debugging. This includes:
  - CPU percentage
  - Memory usage
  - Network I/O (transmitting and receiving)
  - Disk I/O (reading and writing)
- CipherTrust Manager Node Metrics - Connection metrics for the nodes in a clustered system. This includes:
  - Write, flush, and replay lags
  - Sent, write, flush, and replay lag sizes
  - Whether replication is blocked
  - Whether node is connected
  - Connect time for a node
  - Apply rate
  - Catchup interval
  - Uptime for a connection
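Several panels above report a per-second value over a window, such as "Audit Records Created Per Second Over The Last Minute". The Grafana queries behind those panels are not shown in this section, so the following is only a sketch of the general idea: deriving a per-second rate from successive samples of an ever-increasing counter. The function name `per_second_rate` and the sample values are hypothetical. PromQL's `rate()` works along similar lines but also extrapolates to the window edges, which this simplified version does not.

```python
from typing import Sequence, Tuple


def per_second_rate(samples: Sequence[Tuple[float, float]]) -> float:
    """Average per-second increase of a counter over the sampled window.

    samples: (timestamp_seconds, counter_value) pairs in time order.
    A drop in value is treated as a counter reset (for example, a restart),
    so counting resumes from zero at that point.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")

    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        if current >= previous:
            increase += current - previous
        else:
            # Counter reset: everything counted since the reset is new.
            increase += current
    return increase / elapsed


# Illustrative samples taken 30 seconds apart over one minute.
steady = per_second_rate([(0, 100), (30, 130), (60, 160)])
with_reset = per_second_rate([(0, 100), (30, 10), (60, 40)])
```

Here `steady` covers 60 new records in 60 seconds, and `with_reset` shows how a restart in the middle of the window is absorbed rather than producing a negative rate.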