Use Aidbox Metrics Server

Set up the environment variable

Define the BOX_METRICS_PORT environment variable with the monitoring server port number.

Start metrics server

Aidbox exposes metrics on a separate port, the one set in BOX_METRICS_PORT.

To check that the monitoring server works, make a GET request to <AIDBOX_BASE_URL>:<BOX_METRICS_PORT>. The output should be the string "aidbox metrics".
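The check can also be scripted. The following is a minimal sketch, assuming the metrics server is reachable over plain HTTP and answers the root path with the string quoted above. The helper names (`metrics_base_url`, `check_metrics_server`) and the example host are hypothetical and not part of Aidbox.

```python
import urllib.request

EXPECTED_BODY = "aidbox metrics"  # string quoted in the text above


def metrics_base_url(host: str, port: int) -> str:
    """Build the metrics server base URL (plain HTTP is an assumption)."""
    return f"http://{host}:{port}"


def fetch_text(url: str, timeout: float = 5.0) -> str:
    """GET a URL and return the decoded response body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def check_metrics_server(base_url: str, fetch=fetch_text) -> bool:
    """Return True if the root endpoint answers with the expected string.

    Surrounding whitespace and double quotes are ignored, since the exact
    framing of the response body is an assumption here.
    """
    body = fetch(base_url).strip().strip('"')
    return body == EXPECTED_BODY
```

For example, `check_metrics_server(metrics_base_url("aidbox.example.com", 9999))` would probe a hypothetical deployment whose BOX_METRICS_PORT is 9999. The `fetch` parameter exists only so the check can be exercised without a network.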

Metrics server endpoints

Aidbox collects and exposes three types of metrics, grouped by update frequency. All endpoints are served on the separate metrics port, e.g. GET <AIDBOX_BASE_URL>:<BOX_METRICS_PORT>/metrics.

| Endpoint | Update frequency |
|---|---|
| GET /metrics | continuous |
| GET /metrics/minutes | every minute |
| GET /metrics/hours | every hour |

The /metrics/hours response can take some time because it collects a lot of information from the database. Make sure your metrics scraper timeout is long enough.
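If you pull the endpoints with your own script rather than a scraper, the same advice applies. The sketch below pairs each endpoint with its own timeout. The timeout values are illustrative assumptions, not Aidbox recommendations, and the helper names are hypothetical.

```python
import urllib.request

# Paths are the three endpoints listed above; the timeouts (in seconds)
# are assumed values chosen only for illustration.
ENDPOINT_TIMEOUTS = {
    "/metrics": 10.0,
    "/metrics/minutes": 10.0,
    "/metrics/hours": 30.0,  # the slow endpoint gets a longer timeout
}


def scrape_plan(base_url: str) -> list:
    """Return (url, timeout) pairs for every metrics endpoint."""
    root = base_url.rstrip("/")
    return [(root + path, timeout) for path, timeout in ENDPOINT_TIMEOUTS.items()]


def fetch_metrics(url: str, timeout: float) -> str:
    """GET one endpoint, giving up after `timeout` seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `scrape_plan("http://aidbox.example.com:9999")` yields the three URLs with their timeouts, which can then be passed to `fetch_metrics` one by one.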

Example Prometheus scrape configuration

```yaml
global:
  # omitted global configuration values
  external_labels:
    monitor: 'aidbox'
scrape_configs:
  # omitted default scrape configuration

  - job_name: aidbox
    honor_labels: true
    scrape_interval: 10s
    metrics_path: /metrics
    static_configs:
      - targets: [ 'aidbox.example.com:9999' ]  # should be <AIDBOX_BASE_URL>:<BOX_METRICS_PORT>

  - job_name: aidbox-minutes
    honor_labels: true
    scrape_interval: 1m
    metrics_path: /metrics/minutes
    static_configs:
      - targets: [ 'aidbox.example.com:9999' ]  # should be <AIDBOX_BASE_URL>:<BOX_METRICS_PORT>

  - job_name: aidbox-hours
    honor_labels: true
    scrape_interval: 10m
    scrape_timeout: 30s                         # increased timeout
    metrics_path: /metrics/hours
    static_configs:
      - targets: [ 'aidbox.example.com:9999' ]  # should be <AIDBOX_BASE_URL>:<BOX_METRICS_PORT>
```
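When adjusting the intervals and timeouts in this configuration, keep in mind that Prometheus rejects a job whose scrape_timeout is greater than its scrape_interval. The sketch below checks the three jobs against that rule. It assumes the Prometheus default scrape_timeout of 10s for jobs that do not set one (the omitted global section could override it), supports only single-unit durations such as 10s or 1m, and uses hypothetical helper names.

```python
import re

_UNIT_SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600}


def parse_duration(text: str) -> float:
    """Parse a single-unit duration such as '10s' or '1m' into seconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h)", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def timeout_fits(job: dict, default_timeout: str = "10s") -> bool:
    """Return True if the job's timeout does not exceed its interval."""
    interval = parse_duration(job["scrape_interval"])
    timeout = parse_duration(job.get("scrape_timeout", default_timeout))
    return timeout <= interval


# The three jobs from the configuration above, reduced to the timing fields.
JOBS = [
    {"job_name": "aidbox", "scrape_interval": "10s"},
    {"job_name": "aidbox-minutes", "scrape_interval": "1m"},
    {"job_name": "aidbox-hours", "scrape_interval": "10m", "scrape_timeout": "30s"},
]
```

Under these assumptions all three jobs pass, while raising the timeout of the 10s job to 30s would not.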

Collected metrics

HTTP

| Metric | Update frequency | Description |
|---|---|---|
| aidbox_http_request_duration_seconds_bucket | continuous | request duration as cumulative counters for buckets |
| aidbox_http_request_duration_seconds_count | continuous | request duration events count |
| aidbox_http_request_duration_seconds_sum | continuous | sum of request duration events value |
| aidbox_http_request_wait_seconds_bucket | continuous | queue waiting time as cumulative counters for buckets |
| aidbox_http_request_wait_seconds_count | continuous | queue waiting time events count |
| aidbox_http_request_wait_seconds_sum | continuous | sum of queue waiting time events value |
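The _sum and _count series can be combined into a mean request duration. The sketch below parses a response in the Prometheus text exposition format and divides the summed duration by the event count. The sample payload and its `path` label are hypothetical, the parser assumes samples carry no trailing timestamp, and the function names are illustrative.

```python
from collections import defaultdict

# Hypothetical sample; real Aidbox output may use different labels.
SAMPLE = """\
# TYPE aidbox_http_request_duration_seconds histogram
aidbox_http_request_duration_seconds_sum{path="/a"} 1.5
aidbox_http_request_duration_seconds_count{path="/a"} 10
aidbox_http_request_duration_seconds_sum{path="/b"} 0.5
aidbox_http_request_duration_seconds_count{path="/b"} 10
"""


def sum_by_name(exposition_text: str) -> dict:
    """Sum sample values per metric name, ignoring labels and comments."""
    totals = defaultdict(float)
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, value = line.rsplit(None, 1)  # value is the last token
        name = series.split("{", 1)[0]        # drop the label set, if any
        totals[name] += float(value)
    return dict(totals)


def mean_request_duration(totals: dict):
    """Return sum / count in seconds, or None when no requests were seen."""
    count = totals.get("aidbox_http_request_duration_seconds_count", 0.0)
    if count == 0:
        return None
    return totals["aidbox_http_request_duration_seconds_sum"] / count
```

With the sample above, the summed duration is 2.0 seconds over 20 requests, so `mean_request_duration(sum_by_name(SAMPLE))` gives 0.1 seconds. Because the series are cumulative, a mean over a recent window would use the difference between two scrapes instead of the raw totals.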

Postgres

| Metric | Update frequency | Description |
|---|---|---|
| pg_requests_total | continuous | number of executed select requests |
| pg_inserts_total | continuous | number of executed insert statements |
| pg_updates_total | continuous | number of executed update statements |
| pg_deletes_total | continuous | number of executed delete statements |
| pg_blks_hit | continuous | number of shared block cache hits |
| pg_blks_read | continuous | number of shared blocks read |
| pg_tup_fetched | continuous | number of fetched tuples |
| pg_tup_returned | continuous | number of returned tuples |
| pg_errors_total | continuous | number of errors |
| pg_activity_count | continuous | number of PG workers |
| pg_idx_scan | every minute | number of index scans |
| pg_seq_scan | every minute | number of sequential scans |
| pg_stat_statements_total_calls | every minute | number of times a statement was executed |
| pg_stat_statements_stddev_execution_time | every minute | standard deviation of statement execution time |
| pg_stat_statements_mean_execution_time | every minute | mean statement execution time |
| pg_table_size | every hour | table size |
| pg_database_size | every hour | database size |
| pg_activity_max | every hour | maximum number of connections |
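A common derived value from pg_blks_hit and pg_blks_read is the cache hit ratio. The sketch below assumes these two metrics behave like the blks_hit and blks_read columns of PostgreSQL's pg_stat_database view, that is, cumulative counters where pg_blks_read counts blocks that were not found in the shared buffer cache. The source does not state this explicitly, and the function names are hypothetical.

```python
def cache_hit_ratio(blks_hit: float, blks_read: float):
    """Return hits / (hits + reads), or None when there was no block access."""
    total = blks_hit + blks_read
    if total == 0:
        return None
    return blks_hit / total


def hit_ratio_between(previous: dict, current: dict):
    """Cache hit ratio for the interval between two scrapes.

    Both arguments map metric names to values; the counters are assumed
    to be cumulative, so only their differences matter.
    """
    hits = current["pg_blks_hit"] - previous["pg_blks_hit"]
    reads = current["pg_blks_read"] - previous["pg_blks_read"]
    return cache_hit_ratio(hits, reads)
```

For instance, 990 hits and 10 reads in an interval give a ratio of 0.99.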

Hikari (Postgres connection pool)

| Metric | Update frequency | Description |
|---|---|---|
| hikari_active_count | continuous | number of active connections |
| hikari_idle_count | continuous | number of idle connections |
| hikari_acquire_created_seconds_bucket | continuous | time taken to create an actual physical connection |
| hikari_acquire_created_seconds_count | continuous | number of created physical connections |
| hikari_acquire_created_seconds_sum | continuous | total amount of time to create all physical connections |
| hikari_acquired_total | continuous | number of obtained connections |
| hikari_acquire_wait_seconds_bucket | continuous | time taken to obtain a connection |
| hikari_acquire_wait_seconds_sum | continuous | total amount of time to obtain all connections |
| hikari_acquire_used_seconds_bucket | continuous | time consumed by a connection |
| hikari_max_size | every hour | maximum number of connections |
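One way to use these values is to relate hikari_active_count to hikari_max_size as a pool utilization figure. The sketch below is illustrative only: the 0.8 warning threshold is an assumption rather than an Aidbox or HikariCP recommendation, and the function names are hypothetical.

```python
def pool_utilization(active_count: float, max_size: float):
    """Return the share of the pool in use, or None if max_size is unknown."""
    if max_size <= 0:
        return None
    return active_count / max_size


def is_pool_nearly_exhausted(active_count: float, max_size: float,
                             threshold: float = 0.8) -> bool:
    """Flag the pool when utilization reaches the (assumed) threshold."""
    utilization = pool_utilization(active_count, max_size)
    return utilization is not None and utilization >= threshold
```

Note that hikari_max_size is refreshed hourly while hikari_active_count is continuous, so the two values come from different endpoints.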

JVM

| Metric | Update frequency | Description |
|---|---|---|
| jvm_gc_time | continuous | garbage collector execution time |
| jvm_gc_count | continuous | number of garbage collector runs |
| jvm_heap_memory | continuous | heap memory usage |
| jvm_non_heap_memory | continuous | non-heap memory usage |
| jvm_thread_count | continuous | number of live threads, including both daemon and non-daemon threads |
| jvm_thread_peak_count | continuous | peak live thread count |
| jvm_thread_daemon_count | continuous | number of daemon threads |
| jvm_available_processors_size | every hour | number of processors available to the JVM |
| jvm_max_memory_size | every hour | maximum amount of memory that the JVM will attempt to use |
| jvm_total_memory_size | every hour | total amount of memory in the JVM |

Disable PostgreSQL metrics

If you already run a separate PostgreSQL exporter, you can disable the Aidbox PostgreSQL metrics to avoid duplicate metrics.

To do so, set BOX_METRICS_POSTGRES_ON to false.