- 01 Introduction
- 02 Installing kube-prometheus
- 2.1 step1: Classify the yaml files
- 2.2 step2: Configure persistent storage
- 2.2.1 Modify the Prometheus persistence
- 2.2.2 Modify the Grafana persistence
- 2.3 step3: Modify the Service port settings
- 2.3.1 Modify the Prometheus Service
- 2.3.2 Modify the Grafana Service
- 2.4 step4: Install prometheus-operator
- 2.5 step5: Verify
- 03 Closing
01 Introduction
This article explains how to install kube-prometheus on k8s (Kubernetes).
kube-prometheus on GitHub: https://github.com/prometheus-operator/kube-prometheus
kube-prometheus is essentially a bundle of the following components:
- Prometheus Operator
- Prometheus
- Alertmanager
- node-exporter
- Prometheus Adapter for Kubernetes Metrics APIs
- kube-state-metrics
- Grafana
Note the version compatibility between kube-prometheus and Kubernetes:

| kube-prometheus stack | Kubernetes 1.19 | Kubernetes 1.20 | Kubernetes 1.21 | Kubernetes 1.22 | Kubernetes 1.23 |
|---|---|---|---|---|---|
| release-0.7 | ✔ | ✔ | ✗ | ✗ | ✗ |
| release-0.8 | ✗ | ✔ | ✔ | ✗ | ✗ |
| release-0.9 | ✗ | ✗ | ✔ | ✔ | ✗ |
| release-0.10 | ✗ | ✗ | ✗ | ✔ | ✔ |
| main | ✗ | ✗ | ✗ | ✔ | ✔ |
The cluster used here runs Kubernetes v1.16.9, which does not appear in the table above, so I consulted other sources and found that an older release does support it:
From that version mapping, release-0.4 is the one to use. Download: https://github.com/prometheus-operator/kube-prometheus/releases/tag/v0.4.0
02 Installing kube-prometheus
First, upload the kube-prometheus archive to the server with an SSH tool.
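Alternatively, if the server has direct internet access, you can fetch the source archive instead of uploading it manually (this uses GitHub's standard tag-archive URL, which unpacks to a kube-prometheus-0.4.0 directory):
wget https://github.com/prometheus-operator/kube-prometheus/archive/v0.4.0.tar.gz -O kube-prometheus-0.4.0.tar.gz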
Extract it:
# tar -zxvf kube-prometheus-0.4.0.tar.gz
2.1 step1: Classify the yaml files
Group the yaml files by component:
cd kube-prometheus-0.4.0/manifests
# Create a directory for each component
mkdir -p node-exporter alertmanager grafana kube-state-metrics prometheus serviceMonitor adapter
# Move the yaml files into their matching directories
mv *-serviceMonitor* serviceMonitor/
mv grafana-* grafana/
mv kube-state-metrics-* kube-state-metrics/
mv alertmanager-* alertmanager/
mv node-exporter-* node-exporter/
mv prometheus-adapter* adapter/
mv prometheus-* prometheus/
The final structure looks like this:
➜ manifests tree .
.
├── adapter
│ ├── prometheus-adapter-apiService.yaml
│ ├── prometheus-adapter-clusterRole.yaml
│ ├── prometheus-adapter-clusterRoleAggregatedMetricsReader.yaml
│ ├── prometheus-adapter-clusterRoleBinding.yaml
│ ├── prometheus-adapter-clusterRoleBindingDelegator.yaml
│ ├── prometheus-adapter-clusterRoleServerResources.yaml
│ ├── prometheus-adapter-configMap.yaml
│ ├── prometheus-adapter-deployment.yaml
│ ├── prometheus-adapter-roleBindingAuthReader.yaml
│ ├── prometheus-adapter-service.yaml
│ └── prometheus-adapter-serviceAccount.yaml
├── alertmanager
│ ├── alertmanager-alertmanager.yaml
│ ├── alertmanager-secret.yaml
│ ├── alertmanager-service.yaml
│ └── alertmanager-serviceAccount.yaml
├── grafana
│ ├── grafana-dashboardDatasources.yaml
│ ├── grafana-dashboardDefinitions.yaml
│ ├── grafana-dashboardSources.yaml
│ ├── grafana-deployment.yaml
│ ├── grafana-service.yaml
│ └── grafana-serviceAccount.yaml
├── kube-state-metrics
│ ├── kube-state-metrics-clusterRole.yaml
│ ├── kube-state-metrics-clusterRoleBinding.yaml
│ ├── kube-state-metrics-deployment.yaml
│ ├── kube-state-metrics-service.yaml
│ └── kube-state-metrics-serviceAccount.yaml
├── node-exporter
│ ├── node-exporter-clusterRole.yaml
│ ├── node-exporter-clusterRoleBinding.yaml
│ ├── node-exporter-daemonset.yaml
│ ├── node-exporter-service.yaml
│ └── node-exporter-serviceAccount.yaml
├── operator
│ ├── 0namespace-namespace.yaml
│ ├── prometheus-operator-0alertmanagerConfigCustomResourceDefinition.yaml
│ ├── prometheus-operator-0alertmanagerCustomResourceDefinition.yaml
│ ├── prometheus-operator-0podmonitorCustomResourceDefinition.yaml
│ ├── prometheus-operator-0probeCustomResourceDefinition.yaml
│ ├── prometheus-operator-0prometheusCustomResourceDefinition.yaml
│ ├── prometheus-operator-0prometheusruleCustomResourceDefinition.yaml
│ ├── prometheus-operator-0servicemonitorCustomResourceDefinition.yaml
│ ├── prometheus-operator-0thanosrulerCustomResourceDefinition.yaml
│ ├── prometheus-operator-clusterRole.yaml
│ ├── prometheus-operator-clusterRoleBinding.yaml
│ ├── prometheus-operator-deployment.yaml
│ ├── prometheus-operator-service.yaml
│ └── prometheus-operator-serviceAccount.yaml
├── other
├── prometheus
│ ├── prometheus-clusterRole.yaml
│ ├── prometheus-clusterRoleBinding.yaml
│ ├── prometheus-prometheus.yaml
│ ├── prometheus-roleBindingConfig.yaml
│ ├── prometheus-roleBindingSpecificNamespaces.yaml
│ ├── prometheus-roleConfig.yaml
│ ├── prometheus-roleSpecificNamespaces.yaml
│ ├── prometheus-rules.yaml
│ ├── prometheus-service.yaml
│ └── prometheus-serviceAccount.yaml
└── serviceMonitor
├── alertmanager-serviceMonitor.yaml
├── grafana-serviceMonitor.yaml
├── kube-state-metrics-serviceMonitor.yaml
├── node-exporter-serviceMonitor.yaml
├── prometheus-adapter-serviceMonitor.yaml
├── prometheus-operator-serviceMonitor.yaml
├── prometheus-serviceMonitor.yaml
├── prometheus-serviceMonitorApiserver.yaml
├── prometheus-serviceMonitorCoreDNS.yaml
├── prometheus-serviceMonitorKubeControllerManager.yaml
├── prometheus-serviceMonitorKubeScheduler.yaml
└── prometheus-serviceMonitorKubelet.yaml
9 directories, 67 files
2.2 step2: Configure persistent storage
2.2.1 Modify the Prometheus persistence
By default Prometheus mounts its data on an emptyDir volume, and the lifecycle of emptyDir data is tied to the Pod: if the Pod dies, the data is lost. That is why all previous data disappears whenever the Pod is rebuilt, so we change its persistence configuration here.
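For reference, the data volume in the operator-generated StatefulSet looks roughly like this (the name prometheus-k8s-db follows the operator's prometheus-<name>-db naming convention; shown purely for illustration):
volumes:
  - name: prometheus-k8s-db
    emptyDir: {}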
This article assumes OpenEBS is already installed; setting it up is not covered here, so please look it up yourself.
Query the name of the current StorageClass (OpenEBS must be installed):
## Query the current StorageClass name
kubectl get sc
The StorageClass name is openebs-hostpath, so we can now modify the configuration files.
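For reference, the output looks roughly like this (names and ages will differ in your cluster):
NAME               PROVISIONER        AGE
openebs-hostpath   openebs.io/local   10d
openebs-device     openebs.io/local   10d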
Prometheus is deployed as a StatefulSet, so the StorageClass can be configured directly in its manifest. Add the persistence settings at the bottom of the following yaml:
File: manifests/prometheus/prometheus-prometheus.yaml
Append at the end of the file:
...
  serviceMonitorSelector: {}
  version: v2.11.0
  retention: 3d
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: openebs-hostpath
        resources:
          requests:
            storage: 5Gi
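Once everything is applied in step 4, you can verify that the volume claims were created and bound:
kubectl get pvc -n monitoring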
2.2.2 Modify the Grafana persistence
Since Grafana is deployed as a Deployment, we create a grafana-pvc.yaml file for it in advance with the PVC configuration below.
File: manifests/grafana/grafana-pvc.yaml
Full content:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: grafana
  namespace: monitoring # specify the namespace as monitoring
spec:
  storageClassName: openebs-hostpath # specify the StorageClass
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
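As a quick sanity check, you can validate the manifest without creating anything (a client-side dry run; the flag syntax varies slightly across kubectl versions):
kubectl apply --dry-run -f manifests/grafana/grafana-pvc.yaml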
Next, modify the grafana-deployment.yaml file to apply the PVC created above (file: manifests/grafana/grafana-deployment.yaml).
Change the volumes section as follows:
serviceAccountName: grafana
volumes:
  - name: grafana-storage # new persistence configuration
    persistentVolumeClaim:
      claimName: grafana # the name of the PVC created above
  # - emptyDir: {} # the old emptyDir config, commented out
  #   name: grafana-storage
  - name: grafana-datasources
    secret:
      secretName: grafana-datasources
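After the deployment is applied (step 4), you can confirm the pod actually mounts the PVC (the app=grafana label matches the deployment's pods):
kubectl -n monitoring describe pod -l app=grafana | grep -A 3 grafana-storage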
2.3 step3: Modify the Service port settings
2.3.1 Modify the Prometheus Service
Change the Prometheus Service type to NodePort and set the NodePort to 32101:
File: manifests/prometheus/prometheus-service.yaml
Modify the prometheus-service.yaml file:
apiVersion: v1
kind: Service
metadata:
  labels:
    prometheus: k8s
  name: prometheus-k8s
  namespace: monitoring
spec:
  type: NodePort
  ports:
    - name: web
      port: 9090
      targetPort: web
      nodePort: 32101
  selector:
    app: prometheus
    prometheus: k8s
  sessionAffinity: ClientIP
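After it is applied, you can confirm the service type and port with a quick check:
kubectl -n monitoring get svc prometheus-k8s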
2.3.2 Modify the Grafana Service
Change the Grafana Service type to NodePort and set the NodePort to 32102.
File: manifests/grafana/grafana-service.yaml
Modify the grafana-service.yaml file:
apiVersion: v1
kind: Service
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  type: NodePort
  ports:
    - name: http
      port: 3000
      targetPort: http
      nodePort: 32102
  selector:
    app: grafana
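A quick reachability test from any machine that can reach the cluster (replace the placeholder <node-ip> with one of your node IPs):
curl -I http://<node-ip>:32102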
2.4 step4: Install prometheus-operator
Note: all of the following commands are run from the manifests directory!
cd kube-prometheus-0.4.0/manifests
Install the Operator first:
kubectl apply -f setup/
Check the Pods and wait until they are up before moving on:
kubectl get pods -n monitoring
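As a convenience, you can also block until the operator deployment reports Available (this assumes the default deployment name prometheus-operator):
kubectl -n monitoring wait --for=condition=Available deployment/prometheus-operator --timeout=120s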
Next, install the other components:
kubectl apply -f adapter/
kubectl apply -f alertmanager/
kubectl apply -f node-exporter/
kubectl apply -f kube-state-metrics/
kubectl apply -f grafana/
kubectl apply -f prometheus/
kubectl apply -f serviceMonitor/
Check the Pod status and wait until everything is Running:
kubectl get pods -n monitoring
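The output should eventually look something like this (abridged; pod hashes, ready counts, and ages will differ):
NAME                                  READY   STATUS    RESTARTS   AGE
alertmanager-main-0                   2/2     Running   0          2m
grafana-689d64c6db-xxxxx              1/1     Running   0          2m
...
prometheus-k8s-0                      3/3     Running   0          2m
prometheus-k8s-1                      3/3     Running   0          2m
prometheus-operator-xxxxxxxxx-xxxxx   1/1     Running   0          5m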
2.5 step5: Verify
Open http://10.194.188.101:32101/targets and check whether all the service targets look healthy:
You can see that plenty of metrics are already being scraped. Note that Prometheus runs as two replicas here, and we access it through a Service, so you might expect requests to be load-balanced across the two instances. In practice, requests always land on the same instance, because the Service was created with sessionAffinity: ClientIP, which pins sessions by client IP. So there is no need to worry about requests bouncing between different replicas.
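You can confirm the affinity setting directly on the Service:
kubectl -n monitoring get svc prometheus-k8s -o jsonpath='{.spec.sessionAffinity}'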
Grafana may fail to start, with an error like the one below. If so, delete the pod and it will be recreated automatically:
kubectl delete pod grafana-689d64c6db-wcvq2 -n monitoring
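Since the pod hash changes on every rollout, deleting by label is more convenient (app=grafana is the label used by the Grafana deployment):
kubectl -n monitoring delete pod -l app=grafana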
Open http://10.194.188.101:32102; the default username and password are both admin. You can see that all the components were installed successfully!
03 Closing
At this point, kube-prometheus has been successfully installed on k8s!
- Reference: https://bbs.huaweicloud.com/blogs/303137