How We Monitor N K8s Clusters with a Centralized Monitoring System Built on Thanos and Prometheus…
Before getting into the architecture details, I would like to explain what Thanos is and why we adopted it.
The first time I heard about Thanos, I thought 💭 it would be something related to the Marvel Cinematic Universe 🐶, but after exploring I found that it’s a CNCF-adopted monitoring tool that gives wings to our Prometheus server.
In a nutshell, Prometheus stores its metrics on SSD, but when we integrate the Thanos sidecar 🚗 with the Prometheus cluster, the data is also stored as objects in object storage. We used an S3 bucket for storing metrics.
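To make the S3 part a bit more concrete, here is a minimal sketch of the kind of object storage config that the Thanos sidecar reads through its `--objstore.config-file` flag. The helper name `render_objstore_config` and the bucket, endpoint, and region values are placeholders I made up for illustration, not our real setup, and a real config would also need credentials or an IAM role.

```python
def render_objstore_config(bucket: str, endpoint: str, region: str) -> str:
    """Return the text of a Thanos object storage config for an S3 bucket.

    Hypothetical helper for illustration only. The keys (type, config,
    bucket, endpoint, region) follow the Thanos S3 object store format;
    the values passed in below are assumed placeholders.
    """
    lines = [
        "type: S3",
        "config:",
        f'  bucket: "{bucket}"',
        f'  endpoint: "{endpoint}"',
        f'  region: "{region}"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values, assumed for this example.
    print(
        render_objstore_config(
            bucket="example-thanos-metrics",
            endpoint="s3.us-east-1.amazonaws.com",
            region="us-east-1",
        )
    )
```

The printed text can be saved to a file and handed to the sidecar, which then ships Prometheus data to that bucket.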
After adopting Thanos with a centralized Prometheus server, we achieved a lot of things:
- A centralized place to monitor all of our servers.
- Data loss prevention for the Prometheus server (previously we had a common worker group for the Prometheus server, but we sometimes lost metrics when pods were scheduled to a new AZ).
- Data availability for a longer time.
- Resource sharing for all of our K8s…