Prometheus
Prometheus's main features are:
- a multi-dimensional data model with time series data identified by metric name and key/value pairs
- a flexible query language (PromQL) to leverage this dimensionality
- no reliance on distributed storage; single server nodes are autonomous
- time series collection happens via a pull model over HTTP
- pushing time series is supported via an intermediary gateway
- targets are discovered via service discovery or static configuration
- multiple modes of graphing and dashboarding support
Metric Types:
Counter: a monotonically increasing counter
Gauge: a value that can go up and down
Histogram: samples observations and counts them in configurable buckets
Summary: samples observations and provides a running count and sum
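A minimal sketch of the four types with the official Python client (the metric names here are made up for illustration):

    from prometheus_client import Counter, Gauge, Histogram, Summary

    c = Counter('jobs_processed_total', 'Jobs processed')  # can only go up
    c.inc()

    g = Gauge('queue_depth', 'Items currently queued')  # can go up and down
    g.set(5)
    g.dec()

    h = Histogram('job_seconds', 'Job duration', buckets=[0.1, 1, 10])
    h.observe(0.25)  # counted into every bucket whose upper bound is >= 0.25

    s = Summary('job_latency_seconds', 'Job latency')
    s.observe(0.25)  # adds to the summary's _count and _sum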
Instrument either services or libraries
Service Instrumentation
Three types of services:
Online-serving systems: RED (requests, errors, duration)
Offline-serving systems: USE (utilization, saturation, errors)
Batch jobs: see Pushgateway below
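For example, the request-rate part of RED for an online-serving system can be read from a request counter such as nginx_http_requests_total (defined later in this post); the 5-minute window is an arbitrary choice:
rate(nginx_http_requests_total[5m])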
Library Instrumentation
Services are what you care about at a high level. Within each of your services there are libraries that you can think of as mini services.
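As a sketch of library instrumentation, a small in-process cache "library" might expose its own hit/miss counters (the Cache class and metric names are hypothetical):

    from prometheus_client import Counter

    CACHE_HITS = Counter('cache_hits_total', 'Total cache hits')
    CACHE_MISSES = Counter('cache_misses_total', 'Total cache misses')

    class Cache:
        """A toy cache that instruments itself like a mini service."""
        def __init__(self):
            self._data = {}

        def get(self, key):
            if key in self._data:
                CACHE_HITS.inc()
                return self._data[key]
            CACHE_MISSES.inc()
            return None

        def set(self, key, value):
            self._data[key] = value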
Exposition:
The process of making metrics available to Prometheus is known as exposition.
Pushgateway:
A metrics cache for batch jobs. It remembers only the last push for each batch job; Prometheus scrapes these metrics from it.
Download it from the Prometheus download page. It is exposed like an exporter and listens on port 9091 by default.
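A minimal push from a batch job with the Python client (the job name and gauge are placeholders; assumes a Pushgateway on localhost:9091):

    from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

    registry = CollectorRegistry()
    g = Gauge('job_last_success_unixtime', 'Last time the batch job succeeded',
              registry=registry)
    g.set_to_current_time()
    # Replaces the previous push for this job name.
    push_to_gateway('localhost:9091', job='my_batch_job', registry=registry)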
Graphite bridge:
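A bridge in the Python client that pushes its metrics to a Graphite server. A minimal sketch (the Graphite host and port are placeholders):

    from prometheus_client.bridge.graphite import GraphiteBridge

    gb = GraphiteBridge(('graphite.example.org', 2003))
    gb.push()        # push once
    gb.start(10.0)   # or push every 10 seconds from a daemon thread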
Sample Python code to expose Prometheus metrics:

    import http.server
    from prometheus_client import start_http_server, Counter, Gauge, Summary, Histogram

    REQUESTS = Counter('request_total', 'total HTTP requests')
    g = Gauge('my_inprogress_requests', 'description of my gauge')
    g.set(1.1)
    HISTOGRAM = Histogram('request_latency_histogram', 'histogram for the request time',
                          buckets=[0.0001, 0.0005, 0.001, 0.005, 0.01,
                                   0.05, 0.1, 0.5, 1, 5, 10, 50])
    LATENCY = Summary('request_latency', 'Time for a request')

    class MyHandler(http.server.BaseHTTPRequestHandler):
        @LATENCY.time()      # records request duration into the summary
        @HISTOGRAM.time()    # records the same duration into the histogram buckets
        def do_GET(self):
            REQUESTS.inc()
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"Hello World")
            g.inc()

    if __name__ == "__main__":
        start_http_server(18000)   # metrics endpoint on port 18000
        server = http.server.HTTPServer(('localhost', 18001), MyHandler)
        server.serve_forever()
http://localhost:18001 returns "Hello World"
http://localhost:18000/metrics shows the metrics:
…
# HELP request_total total HTTP requests
# TYPE request_total counter
request_total 14.0
# TYPE request_created gauge
request_created 1.544661662956961e+09
# HELP my_inprogress_requests description of my gauge
# TYPE my_inprogress_requests gauge
my_inprogress_requests 15.1
# HELP request_latency_histogram histogram for the request time
# TYPE request_latency_histogram histogram
request_latency_histogram_bucket{le="0.0001"} 0.0
request_latency_histogram_bucket{le="0.0005"} 14.0
request_latency_histogram_bucket{le="0.001"} 14.0
request_latency_histogram_bucket{le="0.005"} 14.0
request_latency_histogram_bucket{le="0.01"} 14.0
request_latency_histogram_bucket{le="0.05"} 14.0
request_latency_histogram_bucket{le="0.1"} 14.0
request_latency_histogram_bucket{le="0.5"} 14.0
request_latency_histogram_bucket{le="1.0"} 14.0
request_latency_histogram_bucket{le="5.0"} 14.0
request_latency_histogram_bucket{le="10.0"} 14.0
request_latency_histogram_bucket{le="50.0"} 14.0
request_latency_histogram_bucket{le="+Inf"} 14.0
request_latency_histogram_count 14.0
request_latency_histogram_sum 0.0022318799999991867
# TYPE request_latency_histogram_created gauge
request_latency_histogram_created 1.544661662957037e+09
# HELP request_latency Time for a request
# TYPE request_latency summary
request_latency_count 14.0
request_latency_sum 0.0024406689999976194
# TYPE request_latency_created gauge
request_latency_created 1.544661662957119e+09
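The _bucket series above are what PromQL's histogram_quantile() consumes; for example, an estimated 95th-percentile request latency over the last 5 minutes (window chosen arbitrarily):
histogram_quantile(0.95, rate(request_latency_histogram_bucket[5m]))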
Prometheus metric library for Nginx written in Lua:
A Lua library that can be used with Nginx to keep track of metrics and expose them on a separate web page to be pulled by Prometheus.
Installation:
Install the nginx package with Lua support (libnginx-mod-http-lua on newer Debian versions, or nginx-extras on older ones). (Note: I did not do this, since I use OpenResty.)
The library file, prometheus.lua, needs to be available in LUA_PATH. If this is the only Lua library you use, you can just point lua_package_path to the directory with this git repo checked out (see example below).
OpenResty users will find this library in opm. It is also available via luarocks. (Note: I did not do this for OpenResty.)
nginx-lua-prometheus source code: https://github.com/knyar/nginx-lua-prometheus
Prometheus nginx monitoring sample config:
Enable a Prometheus counter, gauge, and histogram in the nginx.conf file:
    lua_package_path "site/lualib/?.lua;/etc/nginx/ssl/?.lua;;";

    # prometheus exporter settings
    lua_shared_dict prometheus_metrics 10M;

    init_by_lua_block {
      prometheus = require("prometheus").init("prometheus_metrics");
      metric_requests = prometheus:counter("nginx_http_requests_total",
        "Number of HTTP requests", {"nginx_port", "method", "endpoint", "status"});
      metric_latency = prometheus:histogram("nginx_http_request_duration_seconds",
        "HTTP request latency", {"nginx_port", "method", "endpoint", "status"});
      metric_connections = prometheus:gauge("nginx_http_connections",
        "Number of HTTP connections", {"nginx_port", "state"});
    }

    log_by_lua_block {
      metric_requests:inc(1, {ngx.var.server_port, ngx.var.request_method,
        ngx.var.uri, ngx.var.status});
      metric_latency:observe(tonumber(ngx.var.request_time),
        {ngx.var.server_port, ngx.var.request_method, ngx.var.uri, ngx.var.status});
    }

    server {
      server_name http_metrics;
      listen 9000;
      access_log /var/log/pan/directory-sync-service/nginx.access.log main;
      location /metrics {
        content_by_lua '
          metric_connections:set(ngx.var.connections_reading, {"reading"})
          metric_connections:set(ngx.var.connections_waiting, {"waiting"})
          metric_connections:set(ngx.var.connections_writing, {"writing"})
          prometheus:collect()
        ';
      }
    }
The config above will generate errors until the following change is made: the gauge was declared with two labels ({"nginx_port", "state"}), so each set() call must supply both label values:

    metric_connections:set(ngx.var.connections_reading, {ngx.var.server_port, "reading"});
    metric_connections:set(ngx.var.connections_waiting, {ngx.var.server_port, "waiting"});
    metric_connections:set(ngx.var.connections_writing, {ngx.var.server_port, "writing"});
Prometheus docker container:
SJCMACJ15JHTD8:docker jzeng$ docker pull prom/prometheus
SJCMACJ15JHTD8:docker jzeng$ docker run --rm --add-host sv3-dsappweb1-devr1.ds.pan.local:10.105.50.23 -p 9090:9090 -d --name prometheus bc2b9d813555
(Note: 'add-host' may not be needed if an IP address is used in prometheus.yml.)
SJCMACJ15JHTD8:prometheus jzeng$ docker exec -it 3cd8e7a2ddc6 /bin/sh
Add another job to prometheus.yml (the entry goes under scrape_configs):
/etc/prometheus $ vi prometheus.yml

      - job_name: 'ds_metrics'
        metrics_path: "/metrics"
        static_configs:
          - targets: ['10.105.50.23:9000']

10.105.50.23 is the IP of 'sv3-dsappweb1-devr1.ds.pan.local'.
Reload the changes (Prometheus re-reads its configuration on SIGHUP):
/bin $ kill -HUP {pid_of_prometheus}
Check Prometheus logs:
SJCMACJ15JHTD8:~ jzeng$ docker logs dbee6bb15ed2
Access the Prometheus web UI:
Check 'Status/Targets' to make sure 'ds_metrics' is UP.
Access the Prometheus metrics:
Installation of Grafana:
The default login is admin/admin.
Heatmap:
The Heatmap format is suitable for displaying histogram-type metrics on a Heatmap panel. Under the hood, it converts the cumulative histogram to a regular one and sorts the series by bucket bound.
The query for displaying a histogram on a Heatmap:
sum(rate(nginx_http_request_duration_seconds_bucket{instance=~"$INSTANCE"}[10m])) by (le)
Format: Time series
Legend format: {{le}}
Data format: Time series
Sample query:
Get the count of server errors (5xx) for certain REST APIs, grouped by endpoint:
sum(nginx_http_requests_total{instance=~"$INSTANCE",endpoint!="service/directory/v1/health",endpoint!="c",endpoint=~"/suscription|/agent/status|/service/directory/v1/.*|/directory-sync-service/v1/.*",status=~"5.."}) by (endpoint)
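A related sketch: the ratio of 5xx responses to all responses per endpoint over the last 5 minutes (label matchers trimmed here for brevity):
sum(rate(nginx_http_requests_total{status=~"5.."}[5m])) by (endpoint)
  / sum(rate(nginx_http_requests_total[5m])) by (endpoint)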
Google Cloud Stackdriver:
Stackdriver Kubernetes Monitoring integrates metrics, logs, events, and metadata from your Kubernetes environment and from your Prometheus instrumentation, to help you understand, in real time, your application's behavior in production, no matter your role and where your Kubernetes deployments run.