Rich Freeman via plug on 13 Jun 2023 06:42:22 -0700
[PLUG] Collecting k8s events
I'm running k8s at home, and occasionally I find pods getting restarted with little idea of why. The reason for a pod restart is captured as a k8s event, but by default k8s only retains events for one hour and has no built-in alerting, so odds are that by the time I notice something has restarted, the events are gone.

The typical solution seems to be some kind of log aggregator (though not all of them support events; fluentd doesn't appear to, for example). The usual approach is to use an agent (many exist) to collect the data and then dump it into some kind of search/visualization tool. Elasticsearch and Grafana appear to be the leading options.

What would make the most sense? Keep in mind this is for the home, so we're talking 10-20 containers, not thousands. I'd prefer that the monitoring solution not be the most complex part of the entire cluster. That said, if it is easy to deploy in k8s, that mitigates the complexity (though I'd still prefer that it not eat up gigabytes of RAM 24x7).

I am not concerned with pretty charts of CPU use and so on. This is about logs, not metrics. Of course, being alerted when a host is running out of disk is still useful, but I don't have so many hosts that it is hard to stay aware of things like that.

Oh, for those who aren't familiar with k8s: I'm talking about k8s events, not application logs. Logs are created by applications and are what you're already familiar with. Events are basically logs at the cluster level. If an application has an error and terminates, it might helpfully write to its log before it dies. If the cluster sends the application a SIGTERM for whatever reason, the application log is just going to say that it died because it got a SIGTERM, while the events would capture why the cluster sent that SIGTERM. So it is important to capture both. Events would also tell you about hosts crashing and so on, though as with most OS-level logs they might not say much about why.
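To make the "agent" idea concrete, here is a minimal sketch in Python of the smallest thing that could work: poll `kubectl get events` more often than the retention window and append anything new to a plain file. It is not one of the existing agents, just an illustration. The function names, the log path, and the five-minute interval are all made up for this example, and it assumes `kubectl` is on the PATH and already pointed at the cluster.

```python
import json
import subprocess
import time


def format_event(event):
    """Render one Event object (as a dict) as a single text line."""
    obj = event.get("involvedObject") or {}
    meta = event.get("metadata") or {}
    # lastTimestamp may be null, so fall back to other time fields.
    when = (event.get("lastTimestamp") or event.get("eventTime")
            or meta.get("creationTimestamp") or "-")
    return "{} {} {} {} {}/{}: {}".format(
        when,
        event.get("type", "-"),
        event.get("reason", "-"),
        obj.get("kind", "-"),
        obj.get("namespace", "-"),
        obj.get("name", "-"),
        (event.get("message") or "").strip(),
    )


def new_event_lines(event_list, seen):
    """Return lines for events not yet in `seen`, and record them there."""
    lines = []
    for event in event_list.get("items", []):
        # A repeated event keeps its uid but bumps count, so key on both.
        key = ((event.get("metadata") or {}).get("uid"), event.get("count"))
        if key in seen:
            continue
        seen.add(key)
        lines.append(format_event(event))
    return lines


def poll_forever(path="k8s-events.log", interval=300):
    """Append unseen events to `path` every `interval` seconds (hypothetical defaults)."""
    seen = set()
    while True:
        out = subprocess.run(
            ["kubectl", "get", "events", "--all-namespaces", "-o", "json"],
            check=True, capture_output=True, text=True,
        ).stdout
        lines = new_event_lines(json.loads(out), seen)
        with open(path, "a") as log:
            log.writelines(line + "\n" for line in lines)
        time.sleep(interval)
```

Calling `poll_forever()` starts the loop. The `seen` set lives only in memory, so a restart will log the events from the last hour a second time, and there is no alerting or search here beyond grep on the resulting file.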
--
Rich
___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug