[Grafana] OpenTelemetry: Setting up spanmetrics

Previous posts
https://blog.naver.com/sssang97/223764113482
https://blog.naver.com/sssang97/223761188760

If all you want is a record of individual API calls, recording traces with something like Tempo is enough.
But what if you want statistical data, such as how many times each API was called or what the error rate is?

One of the features built for that is spanmetrics. As the name suggests, it builds metrics from individual spans.

This post briefly covers how to use otel-collector to push those statistics into Prometheus and put them to use.




otel-collector configuration

spanmetrics are metrics generated from the actual span/trace data.

The connectors option defines how that data gets handed over.

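connectors:
  spanmetrics:
    # Prefix for the generated metric names
    namespace: http_server_request
    # Span (or resource) attributes copied onto the metrics as extra dimensions
    dimensions:
      - name: http.method
      - name: http.status_code
      - name: http.route
      - name: service.namespace
    histogram:
      # Duration unit: ms (default) or s
      unit: s
    exemplars:
      enabled: true
    # Also count span events, using these event attributes as dimensions
    events:
      enabled: true
      dimensions:
        - name: exception.type
        - name: exception.message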

This makes the metrics accumulate under the http_server_request prefix, and carries span attributes such as http.method over onto them as labels.
The recording unit is seconds. The default is milliseconds, but I don't need that much precision, so I changed it.

This is one of the especially messy parts when it comes to Grafana dashboards.
Normally you'd want to just grab a dashboard template from the library and use it as-is.

Logs and traces are fairly standardized, linear data, so you can shove them in, open pretty much any dashboard, and the visualization just works. Spanmetrics don't work that way.
For one thing, otel-collector doesn't guarantee backward compatibility and is still racing along on 0.x versions. For another, which attributes get carried over from spans is filled in however each setup pleases. As a result, the OpenTelemetry templates in the Grafana library all assume different data shapes and metric names.

To use this properly, you need some understanding of how the data is collected and how to configure it.

The service section also needs the spanmetrics-related entries filled in.

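service:
  pipelines:
    # Metrics sent by the applications over OTLP
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
    # Metrics generated by the spanmetrics connector
    metrics/spanmetrics:
      receivers: [spanmetrics]
      exporters: [prometheus]
    # Traces go to the connector and on to the trace backend
    traces:
      receivers: [otlp]
      exporters: [spanmetrics, otlp]
  # The collector's own internal metrics
  telemetry:
    metrics:
      address: 0.0.0.0:8888
      level: detailed

That's about all it takes.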

The span-to-metric conversion happens entirely inside otel-collector, so if metrics and traces are already set up properly, nothing more needs to change in the application-level agent.
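The pipelines above refer to an otlp receiver plus prometheus and otlp exporters, so those components have to be defined somewhere in the same file. A minimal sketch of what that could look like follows. The ports and the tempo:4317 address are placeholders chosen for illustration, so swap in your own.

```yaml
# Sketch only: every endpoint below is an assumed placeholder.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # conventional OTLP/gRPC port
      http:
        endpoint: 0.0.0.0:4318   # conventional OTLP/HTTP port

exporters:
  # Exposes a scrape endpoint that Prometheus pulls from
  prometheus:
    endpoint: 0.0.0.0:8889       # assumed port
  # Forwards traces to a trace backend such as Tempo
  otlp:
    endpoint: tempo:4317         # hypothetical host name
    tls:
      insecure: true             # assumes plaintext inside a private network
```

With something like this in place, Prometheus still needs a scrape job pointed at the exporter port (8889 in this sketch) before any of the generated metrics show up.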




Using the Grafana dashboard

Now let's grab a dashboard from the library, bring it up, and try it out.

This is the one I used.
https://grafana.com/grafana/dashboards/22784-lightweight-apm-for-opentelemetry/

A backup of the dashboard JSON, in case the original ever gets deleted:
Attachment: 22784_rev3.zip

Anyway, import it and bring up the dashboard, and with very high probability the panels will show no data, even though the data is all there when you query it as a plain time series.

Let's open a panel, see what it does, and fix things as we go.

Reading the query: it looks up the metric http_server_request_duration_seconds_count, filters it by job, aggregates it, and displays the result.

In my case, because I set the namespace to http_server_request and switched the histogram unit to seconds above, the metric is generated under exactly that name.
The job filter is the odd part, though. The template seems to assume Kubernetes and expects that label to be present, whereas in my environment and configuration service_namespace and service_name exist as separate labels.

Let's adjust the filter.

With this change the filter is applied to the labels that actually exist, one by one, and the query starts returning data.
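As a rough illustration of that change, here is the kind of before/after I mean. The template's exact query is not reproduced here. The expressions are a reconstruction of its general shape, and my-namespace and my-service are made-up label values. They are written as Prometheus recording-rule entries only to keep the example in YAML; the expr strings are what would go into the panel's query field.

```yaml
# Hypothetical sketch: rule names, label values and the 5m window are assumptions.
groups:
  - name: spanmetrics_filter_sketch
    rules:
      # Shape the template expects: a single job label
      - record: sketch:request_rate:by_job
        expr: 'sum(rate(http_server_request_duration_seconds_count{job="my-service"}[5m]))'
      # Adjusted for this setup: filter on the two labels that actually exist
      - record: sketch:request_rate:by_service
        expr: 'sum(rate(http_server_request_duration_seconds_count{service_namespace="my-namespace", service_name="my-service"}[5m]))'
```

The service_namespace label is only there because service.namespace was listed under dimensions in the connector config; service_name comes along by default.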

Looking at another panel that doesn't work:

It filters on http_response_status_code to show errors.
But I carried the span attribute over as http.status_code above, which by default is mapped to the label http_status_code, hence the mismatch.

Change the query to match, run it again, and this panel should come up too.
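The same kind of sketch for the status-code mismatch. Again, the real panel query may look different; the 5.. matcher and the rule names are assumptions for illustration. The only point is the label name: dots in the attribute name become underscores on the Prometheus side, so http.status_code ends up as http_status_code, while the template is looking for the label that the newer attribute name http.response.status_code would produce.

```yaml
# Hypothetical sketch: only the label name differs between the two rules.
groups:
  - name: spanmetrics_error_sketch
    rules:
      # What the template filters on
      - record: sketch:http_errors:template
        expr: 'sum(rate(http_server_request_duration_seconds_count{http_response_status_code=~"5.."}[5m]))'
      # What matches the http.status_code dimension configured above
      - record: sketch:http_errors:adjusted
        expr: 'sum(rate(http_server_request_duration_seconds_count{http_status_code=~"5.."}[5m]))'
```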

Keep adjusting things here and there in the same way, and you'll end up with something reasonably presentable.



References
https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/connector/spanmetricsconnector/README.md