Prometheus Metrics Endpoint
You can use the Prometheus metrics endpoint to connect the Prometheus monitoring system to CipherTrust Manager. You can set Prometheus to scrape the CipherTrust Manager continuously, providing metrics over time to help monitor overall system health, performance, and cryptographic activity.
A sample configuration with Prometheus and Grafana Docker images is available on GitHub. Grafana, a data visualization application, graphs the metrics that Prometheus collects.
Prerequisites for Sample Configuration
- CipherTrust Manager 2.7.0 or later
- Docker
- Docker Compose (`docker-compose`)
Sample Configuration Setup
- On your CipherTrust Manager, enable Prometheus metrics, either through a `POST` to the `/v1/system/metrics/prometheus/enable` endpoint, or with the `ksctl metrics prometheus enable` CLI command. A token is returned, which Prometheus needs to scrape CipherTrust Manager.
  Note: This token does not expire, but can be manually renewed with `ksctl metrics prometheus renew-token` or a `POST` to `/v1/system/metrics/prometheus/renew-token`.
- Get the token which Prometheus needs to scrape CipherTrust Manager, if needed. You can use `GET` with the `/v1/system/metrics/prometheus/status` endpoint or `ksctl metrics prometheus status`.
- In the Prometheus Metrics directory, edit the `prometheus.yml` file. At minimum, you must provide the CipherTrust Manager hostname/IP in `targets` and the Prometheus API token in `bearer_token`. Prometheus can scrape multiple CipherTrust Managers, which might or might not share the API token. This is an example configuration file with three CipherTrust Manager nodes, of which two share the same Prometheus API token:

      scrape_configs:
        - job_name: "CipherTrust Manager"
          scheme: "https"
          tls_config:
            #ca_file: "/trusted_cas/web-keysecure-local.pem"
            #server_name: "web.keysecure.local"
            insecure_skip_verify: true
          bearer_token: "1zplR4njZsRN5dNeWAFXhkL1x7MU9q4H"
          metrics_path: "/api/v1/system/metrics/prometheus"
          static_configs:
            - targets:
                - "1.1.1.1"
                - "1.1.1.2"
        - job_name: "CipherTrust Manager Staging"
          scheme: "https"
          tls_config:
            #ca_file: "/trusted_cas/web-keysecure-local.pem"
            #server_name: "web.keysecure.local"
            insecure_skip_verify: true
          bearer_token: "TnRHpdL9v8MnWv8DhN9xuAaKgPevMEZs"
          metrics_path: "/api/v1/system/metrics/prometheus"
          static_configs:
            - targets:
                - "1.1.1.3"

- Set up TLS authentication. By default, the Prometheus configuration sets `insecure_skip_verify: true`, which is not recommended for production deployments because it skips SSL/TLS certificate validation for the CipherTrust Manager server.
  - On CipherTrust Manager, download the certificate associated with the web interface. Export it in `pem` format.

        ksctl interfaces certificate get --name web --icertfile <desired-filename>.pem

  - Use openssl to retrieve the Common Name (CN) of the certificate, which will become the `server_name` value in Prometheus.

        openssl x509 -noout -subject -in <your-file>.pem

    Example response:

        subject=C = US, ST = MD, L = Belcamp, O = Gemalto, CN = web.keysecure.local

    The CN value, `web.keysecure.local`, is the value needed for Prometheus.
  - Copy the certificate file to the `trusted_cas` folder in the Prometheus Metrics directory.
  - Edit the `prometheus.yml` file to include the `ca_file` path and `server_name` of the certificate, and disable the `insecure_skip_verify` parameter. For example:

        scrape_configs:
          - job_name: "CipherTrust Manager"
            scheme: "https"
            tls_config:
              ca_file: "/trusted_cas/web-keysecure-local.pem"
              server_name: "web.keysecure.local"
              #insecure_skip_verify: true
            bearer_token: "TnRHpdL9v8MnWv8DhN9xuAaKgPevMEZs"
            metrics_path: "/api/v1/system/metrics/prometheus"
            static_configs:
              - targets:
                  - "1.1.1.1"

- In the Prometheus directory, run `make up` to start the stack.
  Note: You can run `make down` to stop the stack and `make clear` to stop the stack and remove all persisted data.
- Visit the Prometheus Dashboard in a browser at `http://localhost:9090`.
  - Navigate to Status > Targets to ensure that Prometheus is scraping CipherTrust Manager. The state should display as `UP` for each node, with no errors.
  - If you detect a problem, verify the metrics endpoint on CipherTrust Manager with `ksctl metrics prometheus get --api-token <api-token>`, or `curl -k 'https://<hostname>/api/v1/system/metrics/prometheus' -H 'Authorization: Bearer <api-token>' --compressed`. You can also use the Docker Compose logs to debug problems, with `docker-compose logs -f`.
- Visit the Grafana Dashboard in a browser at `http://localhost:3000`.
  - Log in with the user `admin` and the password `admin`. Set a new password when prompted.
  - Go to Dashboards > Home to view the included dashboards.
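The manual check in the troubleshooting step can also be scripted. The following is a minimal Python sketch, not part of the sample configuration, that sends the same request as the `curl` command: a `GET` to `/api/v1/system/metrics/prometheus` with the API token as a bearer token. The function names (`build_metrics_request`, `build_tls_context`, `fetch_metrics`) and the hostname in the usage comment are hypothetical and chosen only for illustration. It assumes Python 3 and uses only the standard library.

```python
import ssl
import urllib.request
from typing import Optional

# Same path as metrics_path in prometheus.yml.
METRICS_PATH = "/api/v1/system/metrics/prometheus"


def build_metrics_request(hostname: str, api_token: str) -> urllib.request.Request:
    """Build a GET request for the metrics endpoint with the bearer token attached."""
    url = f"https://{hostname}{METRICS_PATH}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_token}"})


def build_tls_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Verify the server against ca_file when given; otherwise skip verification (like curl -k)."""
    if ca_file:
        return ssl.create_default_context(cafile=ca_file)
    context = ssl.create_default_context()
    # check_hostname must be turned off before verify_mode can be set to CERT_NONE.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


def fetch_metrics(hostname: str, api_token: str, ca_file: Optional[str] = None) -> str:
    """Return the raw metrics text; urllib raises URLError/HTTPError on failure."""
    request = build_metrics_request(hostname, api_token)
    context = build_tls_context(ca_file)
    with urllib.request.urlopen(request, timeout=10, context=context) as response:
        return response.read().decode("utf-8")


# Example (hypothetical hostname, placeholder token):
# text = fetch_metrics("cm.example.com", "<api-token>")
# print(text[:500])
```

Without `ca_file`, the sketch skips certificate validation just as `curl -k` does, so it carries the same caveat as `insecure_skip_verify: true`. It has no equivalent of the Prometheus `server_name` setting, so when a CA file is supplied, connect using a hostname that matches the certificate.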
Metrics Prefixes
The following high-level categories of metrics are returned from the endpoint:
| Prefixes | Metrics Type | Prometheus Exporter, Package, or Integration |
|---|---|---|
| `ciphertrust_` | Metrics for specific CipherTrust resource applications. For example, size of the server audit log or key cache hits. | N/A |
| `dummy_` | Custom internal metrics that can be disregarded. | N/A |
| `docker_` | Metrics for the Docker containers that underlie CipherTrust Manager microservices, such as state, lifecycle, and resource usage. | docker_exporter |
| `go_` | Metrics gathered by the Go runtime. Most useful for debugging purposes with Thales engineers. | go-metrics |
| `http_` | Metrics about HTTP traffic to and from the CipherTrust Manager REST API endpoints and entities external to CipherTrust Manager. For example, response time and number of requests to an endpoint. | N/A |
| `httpclient_` | Metrics about HTTP traffic between internal CipherTrust Manager microservices. For example, response time and number of requests. | N/A |
| `node_` | Metrics for the CipherTrust Manager host, such as CPU and disk details. | Node exporter |
| `process_` | Metrics for microservices written in Go, including CPU, memory and file descriptor usage as well as the process start time. | Process collector of the Go Prometheus package |
| `promhttp_` | Measures number of times individual microservices are called divided by HTTP code. | promhttp Go package |
| `sql_` | CipherTrust-specific metrics collected to analyze performance issues. Most useful for debugging purposes with Thales engineers. | N/A |
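To see which of these categories a given CipherTrust Manager actually returns, the scraped text can be tallied by prefix. The sketch below is a minimal Python example that assumes the endpoint returns the standard Prometheus text exposition format (comment lines starting with `#`, then one sample per line). The function name `count_by_prefix` and the sample input are illustrative only; in particular, `ciphertrust_example_total` is a made-up metric name, not one the product is known to expose.

```python
from collections import Counter

# Prefixes listed in the table above.
KNOWN_PREFIXES = (
    "ciphertrust_", "dummy_", "docker_", "go_", "http_",
    "httpclient_", "node_", "process_", "promhttp_", "sql_",
)

# Illustrative input in Prometheus text exposition format (hypothetical values).
SAMPLE = """\
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 42
process_cpu_seconds_total 12.5
node_cpu_seconds_total{cpu="0",mode="idle"} 1000.0
ciphertrust_example_total{kind="demo"} 3
go_threads 9
"""


def count_by_prefix(metrics_text: str) -> Counter:
    """Count sample lines per known prefix; unmatched names are counted as 'other'."""
    counts = Counter()
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and HELP/TYPE comments
            continue
        # The metric name ends at the first "{" (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        prefix = next((p for p in KNOWN_PREFIXES if name.startswith(p)), "other")
        counts[prefix] += 1
    return counts


if __name__ == "__main__":
    for prefix, count in sorted(count_by_prefix(SAMPLE).items()):
        print(f"{prefix:<14}{count}")
```

The counts are per sample line, so a metric with many label combinations contributes one count for each combination.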
Available Metrics Dashboards
The following dashboards are displayed in Grafana for CipherTrust Manager:
- CipherTrust Manager Developer - Metrics relevant to internal CipherTrust Manager developers to debug problems. This includes:
  - Average JWT processing time
  - Applications and Accounts Totals
  - Key Encryption Key (KEK)s Count
  - Authorization Policies Cache Hits per Minute
  - Average Prometheus Metrics Scraping Response Time
- CipherTrust Manager Host - Metrics about the health of the CipherTrust Manager host, including CPU details, memory details, network details, network connections, and disk details.
- CipherTrust Manager HTTP Traffic - Metrics about HTTP traffic to the CipherTrust Manager. This includes:
  - Average HTTP Response Time Per Minute
  - HTTP Requests in the Last Minute
  - Average Network Latency Per Minute
  - Average CM HTTP Client Response Time Per Minute
  - HTTP 500 Errors in the Last Minute
- CipherTrust Manager NAE - Basic metrics about the performance of the NAE-XML cryptographic interface. This includes XML response time and XML processing time.
- CipherTrust Manager NAE Developer - More detailed metrics about operations and performance on the NAE-XML interface, intended for debugging. This includes:
  - Key Info Cache Misses Time Per Minute
  - Key Info Cache Hits Time Per Minute
  - XML Total Processing time
  - XML Parsing Time
  - XML Transmit Time
  - XML Receive Time
  - XML Execution Time
- CipherTrust Manager Resources - Metrics about creation and use of objects on CipherTrust Manager, such as audit records, keys, licenses, backup, and users. This includes:
  - Audit Records Created Per Second Over The Last Minute
  - Audit Records Created In Last Five Minutes
  - Total Number of Audit Records
  - Total Number of Keys By Algorithm
  - Crypto Operations Per Second Over The Last Minute
  - Total Number Of Connector Licenses Deployed
  - Number of License Units Consumed
  - License Unit Consumption by Percentage
  - Total Number Of Group Users in the System
  - Total Number Of Key Rotations
  - Key Rotations In Last Five Minutes
  - Time Taken To Create Backup
  - Number of Backups taken
  - Total number of Keys by States and Algorithm
- CipherTrust Manager Services - Metrics about the performance of individual microservices within CipherTrust Manager, intended for debug purposes. This includes:
  - CPU percentage
  - Memory usage
  - Network I/O (transmitting and receiving)
  - Disk I/O (reading and writing)
- CipherTrust Manager Node Metrics - Metrics of the nodes in a clustered system showing the node connection information. This includes:
  - Write, flush, and replay lags
  - Sent, write, flush, and replay lag sizes
  - Whether replication is blocked
  - Whether node is connected
  - Connect time for a node
  - Apply rate
  - Catchup interval
  - Uptime for a connection
- CipherTrust Manager CTE Resources - More detailed metrics about operations and performance on the CTE resources, intended for debugging. This includes:
  - CTE Clients
  - CTE Clients Health Status
  - CTE Groups
  - CTE Guard Points State
- CipherTrust Manager Secrets - Metrics for the various secrets (Akeyless) gateway operations. This includes:
  - CPU utilization metrics
  - Disk I/O metrics
  - CPU load metrics
  - Memory utilization metrics
  - Network interface I/O metrics & TCP connection metrics
  - Current transaction number
  - Total transaction by an admin client
  - Total transaction limit per hour
  - Status of HTTP response
  - Total number of requests