大家好,我是博哥爱运维。这节课我们继续prometheus相关的内容。
访问prometheus后台,点击上方菜单栏Status
— Targets
,我们发现kube-controller-manager和kube-scheduler未发现
接下来我们解决下这一个碰到的问题吧
# 这里我们发现这两服务监听的IP是0.0.0.0 正常
# ss -tlnp|egrep 'controller|schedule'
LISTEN 0 32768 *:10257 *:* users:(("kube-controller",pid=3528,fd=3))
LISTEN 0 32768 *:10259 *:* users:(("kube-scheduler",pid=837,fd=3))
然后因为K8s的这两上核心组件我们是以二进制形式部署的,为了能让K8s上的prometheus能发现,我们需要来创建相应的service和endpoints来将其关联起来
注意:我们需要将endpoints里面的NODE IP换成我们实际情况的
apiVersion: v1
kind: Service
metadata:
namespace: kube-system
name: kube-controller-manager
labels:
app.kubernetes.io/name: kube-controller-manager
spec:
type: ClusterIP
clusterIP: None
ports:
- name: https-metrics
port: 10257
targetPort: 10257
protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
labels:
app.kubernetes.io/name: kube-controller-manager
name: kube-controller-manager
namespace: kube-system
subsets:
- addresses:
- ip: 10.0.1.201
- ip: 10.0.1.202
ports:
- name: https-metrics
port: 10257
protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
namespace: kube-system
name: kube-scheduler
labels:
app.kubernetes.io/name: kube-scheduler
spec:
type: ClusterIP
clusterIP: None
ports:
- name: https-metrics
port: 10259
targetPort: 10259
protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
labels:
app.kubernetes.io/name: kube-scheduler
name: kube-scheduler
namespace: kube-system
subsets:
- addresses:
- ip: 10.0.1.201
- ip: 10.0.1.202
ports:
- name: https-metrics
port: 10259
protocol: TCP
将上面的yaml配置保存为repair-prometheus.yaml,然后创建它
kubectl apply -f repair-prometheus.yaml
创建完成后确认下
# kubectl -n kube-system get svc |egrep 'controller|scheduler'
kube-controller-manager ClusterIP None <none> 10252/TCP 58s
kube-scheduler ClusterIP None <none> 10251/TCP 58s
然后再返回prometheus UI处,耐心等待一会,就能看到已经被发现了
serviceMonitor/monitoring/kube-controller-manager/0 (2/2 up)
serviceMonitor/monitoring/kube-scheduler/0 (2/2 up)