Redhat Openshift 의 오픈소스 버전
인 OKD Cluster 를 생성하고 사용해본다. ( OKD 4.12 버전 기준 : k8s 1.25 )
OKD 설명 참고 : https://velog.io/@_gyullbb/OKD-%EA%B0%9C%EC%9A%94
-
환경 정보
- OKD Console : https://console-openshift-console.apps.okd4.ktdemo.duckdns.org/
- OKD API : https://api.okd4.ktdemo.duckdns.org:6443
- Minio Object Storage : https://minino.apps.okd4.ktdemo.duckdns.org/
- ArgoCD : https://argocd.apps.okd4.ktdemo.duckdns.org/
- Harbor Private Docker Registry : https://myharbor.apps.okd4.ktdemo.duckdns.org/
-
도메인 생성
-
설치 환경 인프라 구성
-
Bastion 서버 설치 및 설정
-
Cluster 생성 준비
-
Cluster 생성
-
Cluster 생성 후 할 일
-
Cloud shell 설치 및 core os 설정
-
ArgoCD 설치
-
Dynamic Provisioning 설치
-
Minio Object Stroage 설치 ( w/ NFS )
-
Harbor ( Private Registry ) 설치 및 설정
-
Compute ( Worker Node ) Join 하기
-
etcd 백업하기
OKD 설치를 위해서는 Static IP 가 필요하지만 Static IP 가 없는 경우 도메인이 필요하고 아래 방법으로 무료 도메인 ( duckdns )을 생성한다.
https://www.duckdns.org/ 로 접속하여 가입을 하고 본인의 공유기의 WAN IP 와 도메인을 매칭 시킨다. WAN IP가 바뀌어도 주기적으로 update 된다.
ktdemo.duckdns.org 로 생성 을 한다. ip 를 변경하고 싶으면 ip를 수동으로 입력하고 update 한다.
위의 도메인이 우리가 설치하는 base 도메인이 된다.
설치에 필요한 Node는 총 3 개이고 boostrap 서버는 master 노드 설치 이후 제거 가능하다.
서버구분 | Hypervisor | IP | hostname | 용도 | OS | Spec | 기타 |
---|---|---|---|---|---|---|---|
VM | proxmox | 192.168.1.1.247 | bastion.okd4.ktdemo.duckdns.org | Bastion(LB,DNS) | Centos 8 Stream | 2 core / 4 G / 30G | |
VM | proxmox | 192.168.1.1.128 | bootstrap.okd4.ktdemo.duckdns.org | Bootstrap | Fedora Core OS 37 | 2 core / 4 G / 40G | |
VM | vmware | 192.168.1.1.146 | okd-1.okd4.ktdemo.duckdns.org | Master/Worker | Fedora Core OS 37 | 8 core / 20 G / 200G | Base OS 윈도우 11 |
VM | proxmox | 192.168.1.1.148 | okd-2.okd4.ktdemo.duckdns.org | Worker | Fedora Core OS 37 | 2 core / 16 G / 300G | 워커 노드 |
VM | proxmox | 192.168.1.1.149 | okd-3.okd4.ktdemo.duckdns.org | Worker | Fedora Core OS 37 | 4 core / 8 G / 100G | 워커 노드 추가 |
coreos 버전은 OKD 버전과 일치해야 합니다.
- 버전이 일치 하지않으면 설치시 pivot 에러 발생
https://github.com/okd-project/okd/releases?page=1 에서 확인 합니다.
fedora download
Release Metadata:
Version: 4.12.0-0.okd-2023-03-18-084815
Upgrades: <none>
Metadata:
Component Versions:
kubernetes 1.25.7
machine-os 37.20230218.3 Fedora CoreOS
bastion 서버는 centos 8 stream 으로 proxmox 서버에 설치 한다.
설치 과정은 생략한다.
설치 이후에 centos에서 네트웍 활성화를 해야 IP 받아온다.
vi 에디터 와 tar, wget 라이브러리 설치
[root@localhost shclub]# dnf install -y vim bash-completion tcpdump tar wget
[root@localhost shclub]# hostnamectl set-hostname bastion.okd4.ktdemo.duckdns.org
[root@localhost shclub]# vi /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
방화벽을 disable 한다.
[root@localhost shclub]# systemctl disable firewalld --now
Removed /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
bastion server는 resolv.conf 의 nameserver 설정 을 확인해야 하는 데 외부 라이브러리 설치를 하기 위해서는 공유기의 IP로 설정을 한다. ( search는 상관 없음 )
[root@bastion shclub]# vi /etc/resolv.conf
# Generated by NetworkManager
search okd4.ktdemo.duckdns.org
nameserver 192.168.1.1
HA Proxy ( L7 ) 를 설치한다.
[root@localhost shclub]# yum install -y haproxy
configuration 을 설정한다.
[root@bastion shclub]# vi /etc/haproxy/haproxy.cfg
# Global settings
#---------------------------------------------------------------------
global
maxconn 20000
log /dev/log local0 info
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
user haproxy
group haproxy
daemon
# turn on stats unix socket
stats socket /var/lib/haproxy/stats
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
log global
mode http
option httplog
option dontlognull
option http-server-close
option redispatch
option forwardfor except 127.0.0.0/8
retries 3
maxconn 20000
timeout http-request 10000ms
timeout http-keep-alive 10000ms
timeout check 10000ms
timeout connect 40000ms
timeout client 300000ms
timeout server 300000ms
timeout queue 50000ms
# Enable HAProxy stats
listen stats
bind :9000
mode http
stats enable
stats uri /
stats refresh 5s
#---------------------------------------------------------------------
# static backend for serving up images, stylesheets and such
#---------------------------------------------------------------------
backend static
balance roundrobin
server static 127.0.0.1:4331 check
# OKD API Server
frontend openshift_api_frontend
bind *:6443
default_backend openshift_api_backend
mode tcp
option tcplog
backend openshift_api_backend
mode tcp
balance source
server bootstrap 192.168.1.128:6443 check # bootstrap 서버
server okd-1 192.168.1.146:6443 check # okd master/worker 설정
server okd-2 192.168.1.148:6443 check # 추가 서버 있다면 설정
# OKD Machine Config Server
frontend okd_machine_config_server_frontend
mode tcp
bind *:22623
default_backend okd_machine_config_server_backend
backend okd_machine_config_server_backend
mode tcp
balance source
server bootstrap 192.168.1.128:22623 check # bootstrap 서버
server okd-1 192.168.1.146:22623 check # okd master/worker 설정
server okd-2 192.168.1.148:22623 check # 추가 서버 있다 면 설정
# OKD Ingress - layer 4 tcp mode for each. Ingress Controller will handle layer 7.
frontend okd_http_ingress_frontend
bind *:80
default_backend okd_http_ingress_backend
mode tcp
backend okd_http_ingress_backend
balance source
mode tcp
server okd-1 192.168.1.146:80 check # okd master/worker 설정
server okd-2 192.168.1.148:80 check # 추가 서버 있다 면 설정
frontend okd_https_ingress_frontend
bind *:443
default_backend okd_https_ingress_backend
mode tcp
backend okd_https_ingress_backend
mode tcp
balance source
server okd-1 192.168.1.146:443 check
server okd-2 192.168.1.148:443 check # 추가 서버 있다 면 설정
haproxy를 enable 해서 활성화 한다.
[root@localhost shclub]# systemctl enable haproxy --now
서비스 활성화시에 애러가 발생하면 status를 확인한다.
[root@bastion shclub]# systemctl status haproxy
bind 에러가 나는 경우 아래와 같이 설정한다.
[root@bastion shclub]# setsebool -P haproxy_connect_any=1
[root@bastion shclub]# systemctl restart haproxy
HAproxy 로드발란서 상태 및 전송량 등 통계를 확인하려면 위에서 설정한대로 아래 URL 및 계정을 이용하여 로그인하면 됩니다.
아래는 haproxy.cfg 설정이다.
# Enable HAProxy stats
listen stats
bind :9000
stats auth admin:az1@
stats uri /stats
브라우저에서 http://192.168.1.247:9000/stats 로 로그인 하고 권한 을 설정했다면 위에 설정한 값으로 로그인 한다.
- 계정 : admin / az1@
bootstrap , master , worker 노드를 생성하기 위해서는 bastion에 web 서버를 구성하여
ignition 화일을 다운 받아 설치를 한다. ( 여기서는 Apache를 설치한다. )
HTTP 설치 후 기본 80 포트 구성을 8080으로 변경한 후 서비스를 시작합니다.
[root@localhost ~]# dnf install -y httpd
[root@localhost ~]# vi /etc/httpd/conf/httpd.conf
[root@localhost ~]# cat /etc/httpd/conf/httpd.conf | grep Listen
# Listen: Allows you to bind Apache to specific IP addresses and/or
# Change this to Listen on specific IP addresses as shown below to
#Listen 12.34.56.78:80
Listen 8080
Apache web server를 활성화 하고 재기동 한다.
[root@localhost ~]# systemctl enable httpd --now
Created symlink /etc/systemd/system/multi-user.target.wants/httpd.service → /usr/lib/systemd/system/httpd.service.
[root@localhost ~]# systemctl restart httpd
bind 를 설치하고 DNS 서버를 구성하자.
[root@localhost ~]# dnf install -y bind bind-utils
[root@localhost ~]# systemctl enable named --now
Created symlink /etc/systemd/system/multi-user.target.wants/named.service → /usr/lib/systemd/system/named.service.
/etc/named.conf 화일을 수정한다.
[root@bastion shclub]# vi /etc/named.conf
options {
listen-on port 53 { any; };
listen-on-v6 port 53 { none; };
directory "/var/named";
dump-file "/var/named/data/cache_dump.db";
statistics-file "/var/named/data/named_stats.txt";
memstatistics-file "/var/named/data/named_mem_stats.txt";
secroots-file "/var/named/data/named.secroots";
recursing-file "/var/named/data/named.recursing";
allow-query { any; };
/*
- If you are building an AUTHORITATIVE DNS server, do NOT enable recursion.
- If you are building a RECURSIVE (caching) DNS server, you need to enable
recursion.
- If your recursive DNS server has a public IP address, you MUST enable access
control to limit queries to your legitimate users. Failing to do so will
cause your server to become part of large scale DNS amplification
attacks. Implementing BCP38 within your network would greatly
reduce such attack surface
*/
recursion yes;
dnssec-enable yes;
dnssec-validation yes;
managed-keys-directory "/var/named/dynamic";
pid-file "/run/named/named.pid";
session-keyfile "/run/named/session.key";
/* https://fedoraproject.org/wiki/Changes/CryptoPolicy */
include "/etc/crypto-policies/back-ends/bind.config";
};
logging {
channel default_debug {
file "data/named.run";
severity dynamic;
};
};
zone "." IN {
type hint;
file "named.ca";
};
include "/etc/named.rfc1912.zones";
include "/etc/named.root.key";
/etc/named.rfc1912.zones 화일을 수정한다.
아래 2개의 존을 설정해야 한다.
- ktdemo.duckdns.org : DNS 정방향
- 1.168.192.arpa : DNS 역방향 ( 192.168.1 의 반대로 설정 )
zone "ktdemo.duckdns.org" IN {
type master;
file "/var/named/okd4.ktdemo.duckdns.org.zone";
allow-update { none; };
};
zone "1.168.192.arpa" IN {
type master;
file "/var/named/1.168.192.in-addr.rev";
allow-update { none; };
};
[root@bastion named]# vi /etc/named.rfc1912.zones
zone "localhost.localdomain" IN {
type master;
file "named.localhost";
allow-update { none; };
};
zone "localhost" IN {
type master;
file "named.localhost";
allow-update { none; };
};
zone "1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa" IN {
type master;
file "named.loopback";
allow-update { none; };
};
zone "1.0.0.127.in-addr.arpa" IN {
type master;
file "named.loopback";
allow-update { none; };
};
zone "0.in-addr.arpa" IN {
type master;
file "named.empty";
allow-update { none; };
};
zone "ktdemo.duckdns.org" IN {
type master;
file "/var/named/okd4.ktdemo.duckdns.org.zone";
allow-update { none; };
};
zone "1.168.192.arpa" IN {
type master;
file "/var/named/1.168.192.in-addr.rev";
allow-update { none; };
};
/var/named 폴더로 이동한다.
[root@bastion shclub]# cd /var/named
okd4.ktdemo.duckdns.org.zone 파일 설정 ( DNS 정방향 )
- ip와 hostname을 잘 수정한다.
[root@bastion named]# ls
data dynamic named.ca named.empty named.localhost named.loopback slaves
[root@bastion named]# vi okd4.ktdemo.duckdns.org.zone
$TTL 1D
@ IN SOA @ ns.ktdemo.duckdns.org. (
0 ; serial
1D ; refresh
1H ; retry
1W ; expire
3H ) ; minimum
@ IN NS ns.ktdemo.duckdns.org.
@ IN A 192.168.1.247 ;
; Ancillary services
lb.okd4 IN A 192.168.1.247
; Bastion or Jumphost
ns IN A 192.168.1.247 ;
; OKD Cluster
bastion.okd4 IN A 192.168.1.247
bootstrap.okd4 IN A 192.168.1.128
okd-1.okd4 IN A 192.168.1.146
okd-1.okd4 IN A 192.168.1.148
api.okd4 IN A 192.168.1.247
api-int.okd4 IN A 192.168.1.247
*.apps.okd4 IN A 192.168.1.247
1.168.192.in-addr.rev 파일 설정 ( DNS 역방향 )
[root@bastion named]# vi 1.168.192.in-addr.rev
$TTL 1D
@ IN SOA ktdemo.duckdns.org. ns.ktdemo.duckdns.org. (
0 ; serial
1D ; refresh
1H ; retry
1W ; expire
3H ) ; minimum
@ IN NS ns.
247 IN PTR ns.
247 IN PTR bastion.okd4.ktdemo.duckdns.org.
128 IN PTR bootstrap.okd4.ktdemo.duckdns.org.
146 IN PTR okd-1.okd4.ktdemo.duckdns.org.
148 IN PTR okd-1.okd4.ktdemo.duckdns.org.
247 IN PTR api.okd4.ktdemo.duckdns.org.
247 IN PTR api-int.okd4.ktdemo.duckdns.org.
zone 파일 권한 설정을 하고 named 서비스를 재기동한다.
[root@bastion named]# chown root:named okd4.ktdemo.duckdns.org.zone
[root@bastion named]# chown root:named 1.168.192.in-addr.rev
[root@bastion named]# systemctl restart named
이제 bastion 서버의 기본 설정을 완료를 하였다.
bastion 서버의 root 폴더로 이동한다.
[root@bastion named]# cd ~/
rsa key를 생성하고 엔터를 계속 치면 2개의 화일이 .ssh 폴더에 생성이 된다.
- id_rsa : private key
- id_rsa.pub : public key ( bootstrap , master/worker 에 설치 될 key )
[root@localhost ~]# ssh-keygen -t rsa -b 4096 -N ''
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:QDIN4njCh9DzWBhgeUu3gye3VldDPb2*****jsXr**l4o [email protected]
The key's randomart image is:
+---[RSA 4096]----+
|o++o+o. ... . |
|++=+.=. o o . |
|oo=*+ o . . . .|
| oo+.= o . . o |
| + + S + .|
| o o .|
| . . .=|
| .....*|
| E.o++.|
+----[SHA256]-----+
oc 실행 바이너리 와 openshift-install 바이너리 다운로드 하고 압축을 푼다.
[root@localhost ~]# wget https://github.com/okd-project/okd/releases/download/4.12.0-0.okd-2023-03-18-084815/openshift-client-linux-4.12.0-0.okd-2023-03-18-084815.tar.gz
[root@localhost ~]# wget https://github.com/okd-project/okd/releases/download/4.12.0-0.okd-2023-03-18-084815/openshift-install-linux-4.12.0-0.okd-2023-03-18-084815.tar.gz
[root@bastion ~]# tar xvfz openshift-client-linux-4.12.0-0.okd-2023-03-18-084815.tar.gz
README.md
oc
kubectl
[root@bastion ~]# tar xvfz openshift-install-linux-4.12.0-0.okd-2023-03-18-084815.tar.gz
README.md
openshift-install
/usr/local/bin/ 폴더에 실행화일을 이동하고 실행 권한을 준다.
[root@bastion ~]# mv oc kubectl openshift-install /usr/local/bin/
[root@bastion ~]# chmod 755 /usr/local/bin/{oc,kubectl,openshift-install}
[root@bastion ~]# /usr/local/bin/oc version
Client Version: 4.12.0-0.okd-2023-03-18-084815
Kustomize Version: v4.5.7
OKD를 설치 하는 과정에서 redhat 의 private registry 에서 이미지를 다운을 받는다.
private registry 에 접속하기 위해서는 pull secret이 필요하고 아래 redhat 사이트에 접속을 하여 가입을 하고 pull secret를 다운 받는다.
pull secret 을 받은 후 24시간 안에 master node 설치를 완료 해야 한다.
접속하기 : https://cloud.redhat.com/openshift/create/local
pull secret의 포맷은 아래와 같다.
{"auths":{"cloud.openshift.com":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K29jbV9hY2Nlc3NfODA3Yjc0MDgzODBmNDg4NmE-------zSDdNSTFGRzFPN1hBODRSQjZONTFYSw==","email":"[email protected]"},"quay.io":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K29jbV9hY2Nlc3NfODA3Yjc0MDgzODBmNDg4NmExYTE4YWVjMzZjZDc3ZTE6WEhHTVhYWlAzMjEyR0tJUFRaN0Y3MUNSWVRHUEVMM1BBRThQUExWSlEzSDdNSTFGRzFPN1hBODRSQjZONTFYSw==","email":"[email protected]"},"registry.connect.redhat.com":{"auth":"fHVoYy1wb29sLTYxMTBkMjQyLTQ3MjgtNDBhYS05Zjc5LTdjZTMyNDUyNzJlYzpleUpoYkdjaU9pSlNVelV4TWlKOS5leUp6ZFdJaU9pSXdaamMyTkRRMU4yVXdNREUwT0dJek9EZGpNVGMyTW1GaE9ERTBORGcwTVNKOS5aNXZrTnNTb3NlQ1NfTDZOQ1NiN0I5NTVkUkR4NmVsUWpkZGMwRl9-------wSlRiX0hrUVVoUHE5dEthOVZDOWtsS2tCNVViSEF3OXByNTdnR25QSzFRNzJycDI4NA==","email":"[email protected]"},"registry.redhat.io":{"auth":"-------Da3JnS0xUMEJqYks5Y0FoR0JfRjBZMjZEa3lCOHF2SkdRSGE2VklOQ1Y3dnpRTU1GU3lHeWdZQ2VkWjFSWk9PRUQwSlRiX0hrUVVoUHE5dEthOVZDOWtsS2tCNVViSEF3OXByNTdnR25QSzFRNzJycDI4NA==","email":"[email protected]"}}}
manifest 와 inition 화일을 생성하기 위하여 install-config.yaml를 만든다.
pull secret 항목은 위에서 다운받은 redhat pull secret을 복사하고 ssh 키는 bastion에서 생성한 public key를 가져와서 붙여 넣는다.
[root@bastion ~]# mkdir -p okd4
[root@bastion ~]# vi ./okd4/install-config.yaml
apiVersion: v1
baseDomain: ktdemo.duckdns.org # 베이스 도메인. 본인의 공유기 도메인
compute:
- hyperthreading: Enabled
name: worker
replicas: 1 # 워커 노드 수
controlPlane:
hyperthreading: Enabled
name: master
replicas: 1 # 마스터 노드 수 ( master 노드는 기본은 worker node 혼용으로 설정 )
metadata:
name: okd4 # OKD Cluster 이름이며 base domain 앞에 추가가 된다.
networking:
clusterNetwork:
- cidr: 10.128.0.0/14
hostPrefix: 23
networkType: OpenShiftSDN
serviceNetwork:
- 172.30.0.0/16
platform:
none: {}
pullSecret: '{"auths":{"cloud.openshift.com":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K29jbV9hY2Nlc3NfODA3Yjc0MDgzODBmNDg4NmExYTE4YWVjMzZjZDc3ZTE6WEhHTVhYWlAzMjEyR0tJUFRaN0Y3MUNSWVRHUEVMM1BBRThQUExWSlE----CNVViSEF3OXByNTdnR25QSzFRNzJycDI4NA==","email":"[email protected]"}}}'
sshKey: 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQCWJkGLkamR8mtMhNPUC7fY5lzXZFzGpEFftZwkFoXCBWmF------R8chyf60CkHOTFHVqsUHNs3JdkvmJBPWrE3FN3w== [email protected]'
install-config.yaml 파일은 manifest 와 ignition 생성후 삭제가 되기때문에 백업 폴더를 생성하여 저장한다.
[root@bastion ~]# mkdir backup
[root@bastion ~]# cp ./okd4/install-config.yaml ./backup/install-config.yaml
manifest 화일을 생성한다.
openshift 폴더가 생성이되고 master/worker 노드 설정 화일들이 있어 여기 값을 수정하여 role 을 할당 할 수 있다.
[root@bastion ~]# /usr/local/bin/openshift-install create manifests --dir=okd4
INFO Consuming Install Config from target directory
INFO Manifests created in: okd4/manifests and okd4/openshift
[root@bastion ~]# ls ./okd4
manifests openshift
[root@bastion ~]# ls ./okd4/openshift
99_kubeadmin-password-secret.yaml 99_openshift-machineconfig_99-master-ssh.yaml
99_openshift-cluster-api_master-user-data-secret.yaml 99_openshift-machineconfig_99-worker-ssh.yaml
99_openshift-cluster-api_worker-user-data-secret.yaml openshift-install-manifests.yaml
master 노드 기능의 구성을 하려면 manifest/cluster-scheduler-02-config.yml의 mastersSchedulable 파라미터를 false
로 변경한다.
[root@bastion manifests]# vi cluster-scheduler-02-config.yml
[root@bastion manifests]# cat cluster-scheduler-02-config.yml
apiVersion: config.openshift.io/v1
kind: Scheduler
metadata:
creationTimestamp: null
name: cluster
spec:
mastersSchedulable: false
policy:
name: ""
status: {}
향후 수정할수 있고 master 노드에서 worker role을 삭제하려면 아래 명령어를 통해 삭제 할 수 있습니다.
[root@bastion ]# oc patch schedulers.config.openshift.io/cluster --type merge -p '{"spec":{"mastersSchedulable":false}}'
scheduler.config.openshift.io/cluster patched
만일, master 노드에서 다시 worker role을 추가하려면 아래 명령어를 통해 롤을 부여 할 수 있습니다.
[root@bastion ]# oc patch schedulers.config.openshift.io/cluster --type merge -p '{"spec":{"mastersSchedulable":true}}'
scheduler.config.openshift.io/cluster patched
coreos 설정을 위한 ignition 화일을 생성한다.
역할 별로 ign 파일이 생성이 되고 auth 폴더 안에는 연결 정보 (kubeconfig) 가 저장이 되어 있다.
[root@bastion ~]# /usr/local/bin/openshift-install create ignition-configs --dir=okd4
INFO Consuming OpenShift Install (Manifests) from target directory
INFO Consuming Openshift Manifests from target directory
INFO Consuming Worker Machines from target directory
INFO Consuming Common Manifests from target directory
INFO Consuming Master Machines from target directory
INFO Ignition-Configs created in: okd4 and okd4/auth
[root@bastion ~]# ls ./okd4/
auth bootstrap.ign master.ign metadata.json worker.ign
bastion web 서버에 ign 화일을 복사하고 apache web server를 재 기동한다.
[root@bastion ~]# mkdir /var/www/html/ign
[root@bastion ~]# cp ./okd4/*.ign /var/www/html/ign/
[root@bastion ~]# chmod 777 /var/www/html/ign/*.ign
[root@bastion ~]# systemctl restart httpd
proxmox 서버에 coreos 기반의 bootstrap 용 서버를 생성한다. ( 생성 과정은 생략 )
<br.>
- 다운로드 위치 : https://builds.coreos.fedoraproject.org/browser?stream=stable&arch=x86_64
- Version : fedora-coreos-37.20230218.3.0-live.x86_64.iso
처음 기동인 되면 자동 로그인 이 되고 proxmox 에서 OS 콘솔로 접속이 가능하다.
- proxmox 콘솔은 웹이기 때문애 붙여 넣기가 안된다.
먼저 네트웍을 설정을 하기 위해서 network device 이름을 확인한다.
nmcli 대신 nmtui를 사용하면 gui 모드로 설정 가능하다.
[root@localhost core]# nmcli device
ens160 ethernet connected Wired connection 1
lo loopback unmanaged --
connection 이름을 ens18로 생성한다.
[root@localhost core]# nmcli connection add type ethernet autoconnect yes con-name ens160 ifname ens160
네트웍 설정을 한다.
- ip : bootstrap 서버는 192.168.1.128/24 로 설정한다.
- dns : bastion 서버는 192.168.1.247 로 설정한다.
- gateway : 공유기 ip 인 192.168.1.1 로 설정한다. ( bastion 서버 ip로 해도 상관 없음 )
- dns-search : okd4.ktdemo.duckdns.org 로 설정 ( cluster 이름 + . + base Domain)
[root@localhost core]# nmcli connection modify ens160 ipv4.addresses 192.168.1.128/24 ipv4.method manual
[root@localhost core]# nmcli connection modify ens160 ipv4.dns 192.168.1.247
[root@localhost core]# nmcli connection modify ens160 ipv4.gateway 192.168.1.1
[root@localhost core]# nmcli connection modify ens160 ipv4.dns-search okd4.ktdemo.duckdns.org
설치를 시작하기 전에 bastion 서버의 /etc/resolv.conf 화일에서 nameserver 설정을 bastion 서버의 ip인 192.168.1.247 로 변경한다.
- 변경하지 않으면 EOF 에러등 다양한 에러가 발생한다.
- 이제 부터는 내부 네트웍만 필요하다.
아래 명령어로 bootstrap 서버 설치를 시작한다.
- --copy-network 의미는 설치 될때 위에서 설정한 네트웍 정보로 설치가 된다.
disk 종류를 조회 한다.
[root@localhost core]# fdisk -l
Disk /dev/loop0: 386.14 MiB, 404901888 bytes, 790824 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/nvme0n1: 50 GiB, 53687091200 bytes, 104857600 sectors
Disk model: VMware Virtual NVMe Disk
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/loop1: 680.51 MiB, 713562624 bytes, 1393677 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
[root@localhost core]# coreos-installer install --copy-network /dev/sda -I http://192.168.1.247:8080/ign/bootstrap.ign --insecure-ignition
Installing Fedora CoreOS 37.20230218.3.0 x86_64 (512-byte sectors)
> Read disk 2.5 GiB/2.5 GiB (100%)
Writing Ignition config
Copying networking configuration from /etc/NetworkManager/system-connections/
Copying /etc/NetworkManager/system-connections/ens18.nmconnection to installed system
Copying /etc/NetworkManager/system-connections/ens18-37d95251-8740-4053-a3ee-99ef2a2063c2.nmconnection to installed system
Install complete.
설치가 완료 되면 재기동 한다.
[root@localhost core]# reboot now
bastion 서버에서 bootsrap 서버로 로그인을 해본다.
[root@bastion config]# ssh [email protected]
The authenticity of host '192.168.1.128 (192.168.1.128)' can't be established.
ECDSA key fingerprint is SHA256:7+MOJdsnC548GUrGZxYKTnvhG94F+2kyGa2bpSH6eA8.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.1.128' (ECDSA) to the list of known hosts.
Red Hat Enterprise Linux CoreOS 48.84.202109241901-0
Part of OpenShift 4.8, RHCOS is a Kubernetes native operating system
managed by the Machine Config Operator (`clusteroperator/machine-config`).
WARNING: Direct SSH access to machines is not recommended; instead,
make configuration changes via `machineconfig` objects:
https://docs.openshift.com/container-platform/4.8/architecture/architecture-rhcos.html
---
This is the bootstrap node; it will be destroyed when the master is fully up.
The primary services are release-image.service followed by bootkube.service. To watch their status, run e.g.
journalctl -b -f -u release-image.service -u bootkube.service
아래 명령어를 사용하여 설치 로그를 확인한다.
시간이 지나면 한번 재기동을 진행한다.
[core@localhost ~]$ journalctl -b -f -u release-image.service -u bootkube.service
Aug 31 05:03:07 localhost.localdomain systemd[1]: Starting release-image.service - Download the OpenShift Release Image...
Aug 31 05:03:07 localhost.localdomain release-image-download.sh[1552]: Pulling quay.io/openshift/okd@sha256:7153ed89133eeaca94b5fda702c5709b9ad199ce4ff9ad1a0f01678d6ecc720f...
Aug 31 05:03:07 localhost.localdomain podman[1627]: 2023-08-31 05:03:07.966165638 +0000 UTC m=+0.261238989 system refresh
Aug 31 05:03:24 localhost.localdomain release-image-download.sh[1627]: 976927c0e02baf3b3ef5c78bfed8b856bac8a7123f8e0f28cd15108db602202a
Aug 31 05:03:24 localhost.localdomain podman[1627]: 2023-08-31 05:03:07.967191803 +0000 UTC m=+0.262265164 image pull quay.io/openshift/okd@sha256:7153ed89133eeaca94b5fda702c5709b9ad199ce4ff9ad1a0f01678d6ecc720f
Aug 31 05:03:24 localhost.localdomain systemd[1]: Finished release-image.service - Download the OpenShift Release Image
[root@localhost core]# sudo podman images
REPOSITORY TAG IMAGE ID CREATED SIZE
quay.io/openshift/okd <none> 976927c0e02b 5 months ago 422 MB
quay.io/openshift/okd-content <none> 809d1bb470e0 5 months ago 2.33 GB
quay.io/openshift/okd-content <none> 7409643ff5f3 5 months ago 448 MB
quay.io/openshift/okd-content <none> 20487d50fb08 8 months ago 419 MB
[core@localhost ~]$
Broadcast message from root@localhost (Fri 2023-09-01 03:05:03 UTC):
The system is going down for reboot NOW!
Connection to 192.168.1.128 closed by remote host.
Connection to 192.168.1.128 closed.
부팅 완료 후 다시 로그인 해보면 설치가 진행이 된다.
[core@localhost ~]$ journalctl -b -f -u release-image.service -u bootkube.service
Aug 31 05:06:57 localhost.localdomain bootkube.sh[2122]: Writing asset: /assets/config-bootstrap/manifests/0000_10_config-operator_01_infrastructure.crd.yaml
Aug 31 05:06:57 localhost.localdomain podman[2122]: 2023-08-31 05:06:57.175032705 +0000 UTC m=+10.142264168 container died 9e9f18af07f6c6ae0a1d344ff09b8ab388bcacb946575aa2ee5bd637067aa79b (image=quay.io/openshift/okd-content@sha256:fbc0cf0b8d2d8c1986c51b7dce88f36b33d6ff4a742d42e91edbae0b48b6991f, name=config-render, name=openshift/ose-base, version=v4.12.0, com.redhat.build-host=cpt-1001.osbs.prod.upshift.rdu2.redhat.com, io.openshift.tags=openshift,base, io.k8s.display-name=4.12-base, vcs-type=git, vendor=Red Hat, Inc., io.k8s.description=ART equivalent image openshift-4.12-openshift-enterprise-base - rhel-8/base-repos, release=202212060046.p0.g7e8a010.assembly.stream, distribution-scope=public, io.openshift.maintainer.product=OpenShift Container Platform, io.openshift.build.name=, maintainer=Red Hat, Inc., io.openshift.build.commit.date=, summary=Provides the latest release of the Red Hat Extended Life Base Image., url=https://access.redhat.com/containers/#/registry.access.redhat.com/openshift/ose-base/images/v4.12.0-202212060046.p0.g7e8a010.assembly.stream, io.openshift.build.commit.ref=release-4.12, com.redhat.license_terms=https://www.redhat.com/agreements, io.openshift.build.commit.message=, io.openshift.build.source-context-dir=, vcs-ref=4c6e171d26cc3c302c6d6193060344456bc381a1, build-date=2022-12-06T08:50:37, io.openshift.build.commit.id=4c6e171d26cc3c302c6d6193060344456bc381a1, io.buildah.version=1.26.4, vcs-url=https://github.com/openshift/cluster-config-operator, io.openshift.build.source-location=https://github.com/openshift/cluster-config-operator, io.openshift.release.operator=true, io.openshift.build.namespace=, io.openshift.build.commit.url=https://github.com/openshift/images/commit/7e8a0105eb7369f3f92ad7b2581a2efffab5b28e, description=This is the base image from which all OpenShift Container Platform images inherit., io.openshift.maintainer.component=Release, io.openshift.ci.from.base=sha256:0a36c595ecd06f505df6f5f140bf4d851f112db1120f447ee116b80e2ab47f0f, io.openshift.build.commit.author=, com.redhat.component=openshift-enterprise-base-container, architecture=x86_64, io.openshift.expose-services=, License=GPLv2+)
nameserver 가 변경 된것을 확인 할 수 있다.
[core@localhost ~]$ sudo cat /etc/resolv.conf
nameserver 192.168.1.247
search okd4.ktdemo.duckdns.org
bootstrap 서버에서 exit 하여 bastion 서버로 돌아오고 아래 명령어를 사용하여 모니터링 한다.
아래 API 버전까지 나와야 정상이고 대부분 the Kubernetes API 서버 접속시 에러가 많이 발생하는데 DNS 서버인 Bastion 서버의 ip로 설정이 안되어 있는 경우가 많다.
bootstrap 노드가 재기동 된 후 부터 아래 명령어를 사용해야 API 서버 EOF 에러가 발생 하지 않는다.
[root@bastion ~]# openshift-install --dir=/root/okd4 wait-for bootstrap-complete --log-level=debug
DEBUG OpenShift Installer 4.12.0-0.okd-2023-03-18-084815
DEBUG Built from commit 4688870d3a709eea34fe2bb5d1c62dea2cfd7e91
INFO Waiting up to 20m0s (until 10:26PM) for the Kubernetes API at https://api.okd4.ktdemo.duckdns.org:6443...
DEBUG Loading Agent Config...
INFO API v1.25.0-2786+eab9cc98fe4c00-dirty up
DEBUG Loading Install Config...
DEBUG Loading SSH Key...
DEBUG Loading Base Domain...
DEBUG Loading Platform...
DEBUG Loading Cluster Name...
DEBUG Loading Base Domain...
DEBUG Loading Platform...
DEBUG Loading Networking...
DEBUG Loading Platform...
DEBUG Loading Pull Secret...
DEBUG Loading Platform...
DEBUG Using Install Config loaded from state file
INFO Waiting up to 30m0s (until 10:36PM) for bootstrapping to complete...
INFO Waiting up to 30m0s (until 10:36PM) for bootstrapping to complete...
위의 메시지가 나오면 바로 master node를 생성한다.
bootstrap 서버에 아래와 같이 pivot 에러가 발생하면 okd 버전에 맞는 coreos 가 달라서 발생한다. okd github에서 해당 coreos를 찾아서 설치 해아한다.
Aug 30 21:59:22 localhost.localdomain bootstrap-pivot.sh[19646]: + error_line='\''1 source ./pre-pivot.sh'\''
Aug 30 21:59:22 localhost.localdomain bootstrap-pivot.sh[19646]: ++ journalctl --unit=release-image-pivot.service --lines=3 --output=cat' '. += [
Aug 30 21:59:22 localhost.localdomain bootstrap-pivot.sh[19646]: {$timestamp,$preCommand,$postCommand,$stage,$phase,$result,$errorLine,$errorMessage} |
Aug 30 21:59:22 localhost.localdomain bootstrap-pivot.sh[19646]: reduce keys[] as $k (.; if .[$k] == "" then del(.[$k]) else . end)
Aug 30 21:59:22 localhost.localdomain bootstrap-pivot.sh[19646]: ]'
Aug 30 21:59:22 localhost.localdomain bootstrap-pivot.sh[18854]: + mv /var/log/openshift//release-image-pivot.json.tmp /var/log/openshift//release-image-pivot.json
Aug 30 21:59:22 localhost.localdomain systemd[1]: release-image-pivot.service: Main process exited, code=exited, status=1/FAILURE
아래 처럼 os 이미지 버전이 안 맞아서 rebase 해야 한다. coreos 재설치가 답이다.
rpm-ostree rebase --experimental "ostree-unverified-registry:${MACHINE_OS_IMAGE}"
설치 시 bootstrap 서버에 들어가서 oc 명령어를 사용하여 상태를 볼 수도 있다.
[root@localhost core]# cd /etc/kubernetes
[root@localhost kubernetes]# oc --kubeconfig kubeconfig get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
openshift-operator-lifecycle-manager collect-profiles-28224240-dvr6d 0/1 Pending 0 14m
[root@localhost kubernetes]# oc --kubeconfig kubeconfig get events -n openshift-operator-lifecycle-manager
LAST SEEN TYPE REASON OBJECT MESSAGE
37m Warning FailedCreate replicaset/catalog-operator-c5d66646d Error creating: pods "catalog-operator-c5d66646d-" is forbidden: autoscaling.openshift.io/ManagementCPUsOverride the cluster does not have any nodes
18m Warning FailedCreate replicaset/catalog-operator-c5d66646d Error creating: pods "catalog-operator-c5d66646d-" is forbidden: autoscaling.openshift.io/ManagementCPUsOverride the cluster does not have any nodes
3m25s Warning FailedCreate replicaset/catalog-operator-c5d66646d Error creating: pods "catalog-operator-c5d66646d-" is forbidden: autoscaling.openshift.io/ManagementCPUsOverride the cluster does not have any nodes
48m Normal ScalingReplicaSet deployment/catalog-operator Scaled up replica set catalog-operator-c5d66646d to 1
33m Warning FailedScheduling pod/collect-profiles-28224315-hh5n8 no nodes available to schedule pods
29m Warning FailedScheduling pod/collect-profiles-28224315-hh5n8 no nodes available to schedule pods
28m Warning FailedScheduling pod/collect-profiles-28224315-hh5n8 skip schedule deleting pod: openshift-operator-lifecycle-manager/collect-profiles-28224315-hh5n8
43m Normal SuccessfulCreate job/collect-profiles-28224315 Created pod: collect-profiles-28224315-hh5n8
28m Warning FailedScheduling pod/collect-profiles-28224330-vb69j no nodes available to schedule pods
23m Warning FailedScheduling pod/collect-profiles-28224330-vb69j no nodes available to schedule pods
28m Normal SuccessfulCreate job/collect-profiles-28224330 Created pod: collect-profiles-28224330-vb69j
13m Warning FailedScheduling pod/collect-profiles-28224345-6hbjn no nodes available to schedule pods
9m8s Warning FailedScheduling pod/collect-profiles-28224345-6hbjn no nodes available to schedule pods
13m Normal SuccessfulCreate job/collect-profiles-28224345 Created pod: collect-profiles-28224345-6hbjn
43m Normal SuccessfulCreate cronjob/collect-profiles Created job collect-profiles-28224315
28m Normal SuccessfulDelete cronjob/collect-profiles Deleted job collect-profiles-28224315
28m Normal SuccessfulCreate cronjob/collect-profiles Created job collect-profiles-28224330
13m Normal SuccessfulDelete cronjob/collect-profiles Deleted job collect-profiles-28224330
13m Normal SuccessfulCreate cronjob/collect-profiles Created job collect-profiles-28224345
37m Warning FailedCreate replicaset/olm-operator-5fb85799cb Error creating: pods "olm-operator-5fb85799cb-" is forbidden: autoscaling.openshift.io/ManagementCPUsOverride the cluster does not have any nodes
18m Warning FailedCreate replicaset/olm-operator-5fb85799cb Error creating: pods "olm-operator-5fb85799cb-" is forbidden: autoscaling.openshift.io/ManagementCPUsOverride the cluster does not have any nodes
3m24s Warning FailedCreate replicaset/olm-operator-5fb85799cb Error creating: pods "olm-operator-5fb85799cb-" is forbidden: autoscaling.openshift.io/ManagementCPUsOverride the cluster does not have any nodes
48m Normal ScalingReplicaSet deployment/olm-operator Scaled up replica set olm-operator-5fb85799cb to 1
37m Warning FailedCreate replicaset/package-server-manager-9dfd89b5c Error creating: pods "package-server-manager-9dfd89b5c-" is forbidden: autoscaling.openshift.io/ManagementCPUsOverride the cluster does not have any nodes
18m Warning FailedCreate replicaset/package-server-manager-9dfd89b5c Error creating: pods "package-server-manager-9dfd89b5c-" is forbidden: autoscaling.openshift.io/ManagementCPUsOverride the cluster does not have any nodes
3m24s Warning FailedCreate replicaset/package-server-manager-9dfd89b5c Error creating: pods "package-server-manager-9dfd89b5c-" is forbidden: autoscaling.openshift.io/ManagementCPUsOverride the cluster does not have any nodes
48m Normal ScalingReplicaSet deployment/package-server-manager Scaled up replica set package-server-manager-9dfd89b5c to 1
39m Normal NoPods poddisruptionbudget/packageserver-pdb No matching pods found
19m Normal NoPods poddisruptionbudget/packageserver-pdb No matching pods found
9m7s Normal NoPods poddisruptionbudget/packageserver-pdb No matching pods found
[root@localhost kubernetes]# oc --kubeconfig kubeconfig get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version False True 53m Unable to apply 4.12.0-0.okd-2023-03-18-084815: an unknown error has occurred: MultipleErrors
위의 에러는 bootstrap 서버가 종료 되기 전에 master node를 설치 하지 않아서 발생하는 에러이다.
bootstrap 서버 기동 후 바로 master node 설치를 해야 한다.
정상적으로 master node 설치가 진행이 되면 아래와 같이 보인다.
[root@localhost core]# cd /etc/kubernetes
[root@localhost kubernetes]# oc --kubeconfig kubeconfig get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
openshift-apiserver-operator openshift-apiserver-operator-8f8cb9ff4-nmkd9 0/1 Pending 0 70s
openshift-authentication-operator authentication-operator-5f86f86574-fgljl 0/1 Pending 0 70s
openshift-cloud-controller-manager-operator cluster-cloud-controller-manager-operator-754cb759cb-cpb2w 2/2 Running 0 70s
openshift-cloud-credential-operator cloud-credential-operator-8995bf858-ch255 0/2 Pending 0 70s
openshift-cluster-machine-approver machine-approver-565564ff6c-r8fjv 0/2 Pending 0 70s
openshift-cluster-node-tuning-operator cluster-node-tuning-operator-8485fcdd45-2mgvd 0/1 ContainerCreating 0 70s
openshift-cluster-storage-operator cluster-storage-operator-749b9d89fc-bc2r6 0/1 Pending 0 70s
openshift-cluster-storage-operator csi-snapshot-controller-operator-7c66dbf5fb-dmtv7 0/1 ContainerCreating 0 70s
openshift-cluster-version cluster-version-operator-5d4474d7b8-xwjww 0/1 ContainerCreating 0 70s
openshift-config-operator openshift-config-operator-6d66bf8658-6zr4t 0/1 Pending 0 70s
openshift-controller-manager-operator openshift-controller-manager-operator-77879f6b5c-v6j9n 0/1 Pending 0 70s
openshift-dns-operator dns-operator-5f6df86d78-ztw7w 0/2 Pending 0 70s
openshift-etcd-operator etcd-operator-f8fc7cfb4-wxd65 0/1 Pending 0 70s
openshift-image-registry cluster-image-registry-operator-656b8989db-v6sb2 0/1 Pending 0 70s
openshift-ingress-operator ingress-operator-6d99dfbd-r4fdx 0/2 Pending 0 70s
openshift-insights insights-operator-75d976fcf7-2lhbw 0/1 Pending 0 70s
openshift-kube-apiserver-operator kube-apiserver-operator-54fcf47c74-grwkh 0/1 Pending 0 70s
openshift-kube-controller-manager-operator kube-controller-manager-operator-6d5b4bbc9b-7hsc7 0/1 ContainerCreating 0 70s
openshift-kube-scheduler-operator openshift-kube-scheduler-operator-7cf4676c48-xv4k7 0/1 Pending 0 70s
openshift-kube-storage-version-migrator-operator kube-storage-version-migrator-operator-6c4dbb7f98-s2t4q 0/1 Pending 0 70s
openshift-machine-api cluster-autoscaler-operator-768f7bc45-kkx4p 0/2 ContainerCreating 0 70s
openshift-machine-api cluster-baremetal-operator-69f64d77dc-fw4xz 0/2 ContainerCreating 0 70s
openshift-machine-api control-plane-machine-set-operator-6bb58cc97c-w792d 0/1 Pending 0 70s
openshift-machine-api machine-api-operator-d658c8c8f-8tqnq 0/2 Pending 0 70s
openshift-machine-config-operator machine-config-operator-6c567dcc74-cpdtd 0/1 Pending 0 70s
openshift-marketplace marketplace-operator-5dfccbb4f5-vnqzb 0/1 Pending 0 70s
openshift-monitoring cluster-monitoring-operator-585c6bf574-pzqth 0/2 Pending 0 70s
openshift-multus multus-additional-cni-plugins-rt5tr 0/1 Init:2/6 0 40s
openshift-multus multus-admission-controller-754f6f745f-6n786 0/2 Pending 0 37s
openshift-multus multus-sgv6q 1/1 Running 0 40s
openshift-multus network-metrics-daemon-9fq6h 0/2 ContainerCreating 0 39s
openshift-network-diagnostics network-check-source-f6dc45665-bcv5d 0/1 Pending 0 29s
openshift-network-diagnostics network-check-target-mhmvs 0/1 ContainerCreating 0 28s
openshift-network-operator network-operator-557ff659f8-b9s85 1/1 Running 0 70s
openshift-operator-lifecycle-manager catalog-operator-c5d66646d-k55px 0/1 Pending 0 70s
openshift-operator-lifecycle-manager collect-profiles-28225635-nm4zx 0/1 Pending 0 3m46s
openshift-operator-lifecycle-manager olm-operator-5fb85799cb-cd96x 0/1 Pending 0 70s
openshift-operator-lifecycle-manager package-server-manager-9dfd89b5c-pkd76 0/1 Pending 0 70s
openshift-sdn sdn-controller-xhfnk 0/2 ContainerCreating 0 33s
openshift-sdn sdn-fjh45 0/2 ContainerCreating 0 31s
openshift-service-ca-operator service-ca-operator-b65957b7f-dbprl 0/1 Pending 0 70s
에러 가 발생하지 않고 Bootstrap status: complete
메시지가 나오면 Bootstrap 서버가 정상 설치가 되고 master 노드 생성을 시작합니다.
[root@bastion ~]# /usr/local/bin/openshift-install --dir=/root/okd4 wait-for bootstrap-complete --log-level=debug
DEBUG OpenShift Installer 4.10.0-0.okd-2022-03-07-131213
DEBUG Built from commit 3b701903d96b6375f6c3852a02b4b70fea01d694
INFO Waiting up to 20m0s (until 9:58AM) for the Kubernetes API at https://api.okd4.ktdemo.duckdns.org:6443...
INFO API v1.23.3-2003+e419edff267ffa-dirty up
INFO Waiting up to 30m0s (until 10:08AM) for bootstrapping to complete...
DEBUG Bootstrap status: complete
DEBUG Bootstrap status: complete INFO It is now safe to remove the bootstrap resources DEBUG Time elapsed per stage: DEBUG Bootstrap Complete: 20m51s DEBUG API: 2s INFO Time elapsed: 20m51s
vmware 에 coreos 기반의 master/worker 겸용 서버를 생성한다. ( 생성 과정은 생략 )
<br.>
- 다운로드 위치 : https://builds.coreos.fedoraproject.org/browser?stream=stable&arch=x86_64
- Version : fedora-coreos-37.20230218.3.0-live.x86_64.iso
처음 기동인 되면 자동 로그인 이 되고 vmware 에서 OS 콘솔로 접속이 가능하다.
- vmware 콘솔은 붙여 넣기가 가능하다.
먼저 네트웍을 설정을 하기 위해서 network device 이름을 확인한다.
[root@localhost core]# nmcli device
DEVICE TYPE STATE CONNECTION
ens160 ethernet connected Wired connection 1
lo loopback unmanaged --
connection 이름을 ens160으로 생성한다.
[root@localhost core]# nmcli connection add type ethernet autoconnect yes con-name ens160 ifname ens160
네트웍 설정을 한다.
- ip : okd-1 서버는 192.168.1.146/24 로 설정한다.
- dns : bastion 서버는 192.168.1.247 로 설정한다.
- gateway : 공유기 ip 인 192.168.1.1 로 설정한다. ( bastion 서버 ip로 해도 상관 없음 )
- dns-search : okd4.ktdemo.duckdns.org 로 설정 ( cluster 이름 + . + base Domain)
[root@localhost core]# nmcli connection modify ens160 ipv4.addresses 192.168.1.146/24 ipv4.method manual
[root@localhost core]# nmcli connection modify ens160 ipv4.dns 192.168.1.247
[root@localhost core]# nmcli connection modify ens160 ipv4.gateway 192.168.1.1
[root@localhost core]# nmcli connection modify ens160 ipv4.dns-search okd4.ktdemo.duckdns.org
master 노드 ( okd-1 ) 설치를 한다.
[root@localhost core]# coreos-installer install /dev/sda -I http://192.168.1.247:8080/ign/master.ign --insecure-ignition --copy-network
Installing Fedora CoreOS 37.20230218.3.0 x86_64 (512-byte sectors)
> Read disk 2.5 GiB/2.5 GiB (100%)
Writing Ignition config
Copying networking configuration from /etc/NetworkManager/system-connections/
Copying /etc/NetworkManager/system-connections/ens160.nmconnection to installed system
Install complete.
hostname을 설정 하고 재기동 한다.
[root@localhost core]# hostnamectl set-hostname okd-1.okd4.ktdemo.duckdns.org
[root@localhost core]# reboot now
bastion 서버에서 아래 명령어로 모니터링을 하고 It is now safe to remove the bootstrap resources
가 나오면 정상적으로 master 노드가 설치가 완료 됩니다.
[root@bastion ~]# /usr/local/bin/openshift-install --dir=/root/okd4 wait-for bootstrap-complete --log-level=debug
It is now safe to remove the bootstrap resources
proxmox 에 coreos 기반의 worker 노드를 생성한다.
<br.>
- 다운로드 위치 : https://builds.coreos.fedoraproject.org/browser?stream=stable&arch=x86_64
- Version : fedora-coreos-37.20230218.3.0-live.x86_64.iso
먼저 네트웍을 설정을 하기 위해서 network device 이름을 확인한다.
[root@localhost core]# nmcli device
DEVICE TYPE STATE CONNECTION
ens18 ethernet connected Wired connection 1
lo loopback unmanaged --
connection 이름을 ens160으로 생성한다.
[root@localhost core]# nmcli connection add type ethernet autoconnect yes con-name ens18 ifname ens18
네트웍 설정을 한다.
- ip : okd-2 서버는 192.168.1.148/24 로 설정한다.
- dns : bastion 서버는 192.168.1.40 로 설정한다.
- gateway : 공유기 ip 인 192.168.1.1 로 설정한다. ( bastion 서버 ip로 해도 상관 없음 )
- dns-search : okd4.ktdemo.duckdns.org 로 설정 ( cluster 이름 + . + base Domain)
[root@localhost core]# nmcli connection modify ens18 ipv4.addresses 192.168.1.148/24 ipv4.method manual
[root@localhost core]# nmcli connection modify ens18 ipv4.dns 192.168.1.40
[root@localhost core]# nmcli connection modify ens18 ipv4.gateway 192.168.1.1
[root@localhost core]# nmcli connection modify ens18 ipv4.dns-search okd4.ktdemo.duckdns.org
worker 노드 ( okd-2 ) 설치를 한다.
[root@localhost core]# coreos-installer install /dev/sda -I http://192.168.1.40:8080/ign/worker.ign --insecure-ignition --copy-network
Installing Fedora CoreOS 37.20230218.3.0 x86_64 (512-byte sectors)
> Read disk 2.5 GiB/2.5 GiB (100%)
Writing Ignition config
Copying networking configuration from /etc/NetworkManager/system-connections/
Copying /etc/NetworkManager/system-connections/ens18.nmconnection to installed system
Install complete.
hostname을 설정 하고 재기동 한다.
[root@localhost core]# hostnamectl set-hostname okd-2.okd4.ktdemo.duckdns.org
[root@localhost core]# reboot now
worker node 설치 모니터링을 한다.
[root@bastion ~]# openshift-install --dir=/root/okd4 wait-for install-complete
INFO Waiting up to 40m0s (until 2:15PM) for the cluster at https://api.okd4.ktdemo.duckdns.org:6443 to initialize...
INFO Checking to see if there is a route at openshift-console/console...
INFO Install complete!
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/root/okd4/auth/kubeconfig'
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.okd4.ktdemo.duckdns.org
INFO Login to the console with user: "kubeadmin", and password: "**********"
INFO Time elapsed: 42s
설치가 완료가 되면 먼저 node 를 확인해 본다. 아직 worker node 가 조인 되지 않았다.
[root@bastion ~]# oc get nodes
NAME STATUS ROLES AGE VERSION
okd-1.okd4.ktdemo.duckdns.org Ready master,worker 17d v1.23.3+759c22b
csr를 조회하고 승인을 한다.
[root@bastion ~]# oc get csr
NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION
csr-5lgxl 17s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper <none> Pending
csr-vhvjn 3m28s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper <none>
[root@bastion ~]# oc adm certificate approve csr-5lgxl
certificatesigningrequest.certificates.k8s.io/csr-5lgxl approved
[root@bastion ~]# oc adm certificate approve csr-vhvjn
certificatesigningrequest.certificates.k8s.io/csr-vhvjn approved
승인 대상이 너무 많은 경우는 아래처럼 일괄 승인한다.
[root@bastion ]# oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
다시 node를 조회하면 join 된 것을 확인 할수 있고 status 는 NotReady
이다.
[root@bastion ~]# oc get nodes
NAME STATUS ROLES AGE VERSION
okd-1.okd4.ktdemo.duckdns.org Ready master,worker 17d v1.23.3+759c22b
okd-2.okd4.ktdemo.duckdns.org NotReady worker 21s v1.23.3+759c22b
다시 csr를 조회해 보면 승인 대기 중인 csr 이 있는데 okd 는 machine config 기능이 있어 master node 설정값이 worker node에 적용되는 시간이 필요하다.
[root@bastion ~]# oc get csr
NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION
csr-5lgxl 2m29s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper <none> Approved,Issued
csr-j2m7b 17s kubernetes.io/kubelet-serving system:node:okd-2.okd4.ktdemo.duckdns.org <none> Pending
csr-vhvjn 5m40s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper <none>
시간이 좀 더 지나면 아래와 같이 Ready로 바뀐 것을 확인 할 수 있다.
[root@bastion ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
okd-1.okd4.ktdemo.duckdns.org Ready control-plane,master,worker 107m v1.25.7+eab9cc9
okd-2.okd4.ktdemo.duckdns.org Ready worker 109s v1.25.7+eab9cc9
향후 수정할수 있고 master 노드에서 worker role을 삭제하려면 아래 명령어를 통해 삭제 할 수 있습니다.
[root@bastion ~]# oc patch schedulers.config.openshift.io/cluster --type merge -p '{"spec":{"mastersSchedulable":false}}'
scheduler.config.openshift.io/cluster patched
[root@bastion ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
okd-1.okd4.ktdemo.duckdns.org Ready control-plane,master 109m v1.25.7+eab9cc9
okd-2.okd4.ktdemo.duckdns.org Ready worker 3m47s v1.25.7+eab9cc9
OKD 클러스터가 정상적으로 구성되었기 때문에 bastion 서버에서 HAProxy가 bootstrap으로 LB (Load Balancing) 되지 않도록 수정 후 서비스를 재시작합니다.
[root@bastion config]# vi /etc/haproxy/haproxy.cfg
backend openshift_api_backend
mode tcp
balance source
#server bootstrap 192.168.1.128:6443 check
server okd-1 192.168.1.146:6443 check
server okd-2 192.168.1.148:6443 check
backend ocp_machine_config_server_backend
mode tcp
balance source
#server bootstrap 192.168.1.128:22623 check
server okd-1 192.168.1.146:22623 check
server okd-2 192.168.1.148:22623 check
haproxy 를 재기동 한다.
[root@bastion config]# systemctl restart haproxy
.bash_profile 에 연결 정보를 설정합니다.
[root@bastion ~]# vi ~/.bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
# /usr/local/bin 추가
PATH=$PATH:$HOME/bin:/usr/local/bin
export PATH
# Added for okd4 : 아래 구문 추가
export KUBECONFIG=/root/okd4/auth/kubeconfig
source 명령어로 profile 를 적용하고 node를 조회해 봅니다.
정상적으로 node가 조회가 됩니다.
[root@bastion ~]# source ~/.bash_profile
[root@bastion ~]# oc get nodes
NAME STATUS ROLES AGE VERSION
okd-1.okd4.ktdemo.duckdns.org Ready control-plane,master 17m v1.25.7+eab9cc9
cluster componet 조회를 해봅니다.
모든 Cluster Operator가 True / False / False 여야 정상입니다.
[root@bastion ~]# oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.10.0-0.okd-2022-03-07-131213 True True False 64s OAuthServerDeploymentProgressing: deployment/oauth-openshift.openshift-authentication: observed generation is 2, desired generation is 3.
baremetal 4.10.0-0.okd-2022-03-07-131213 True False False 14m
cloud-controller-manager 4.10.0-0.okd-2022-03-07-131213 True False False 18m
cloud-credential 4.10.0-0.okd-2022-03-07-131213 True False False 17m
cluster-autoscaler 4.10.0-0.okd-2022-03-07-131213 True False False 14m
config-operator 4.10.0-0.okd-2022-03-07-131213 True False False 16m
console 4.10.0-0.okd-2022-03-07-131213 True False False 64s
csi-snapshot-controller 4.10.0-0.okd-2022-03-07-131213 True False False 16m
dns 4.10.0-0.okd-2022-03-07-131213 True False False 14m
etcd 4.10.0-0.okd-2022-03-07-131213 True False False 14m
image-registry 4.10.0-0.okd-2022-03-07-131213 True False False 7m6s
ingress 4.10.0-0.okd-2022-03-07-131213 True False False 13m
insights 4.10.0-0.okd-2022-03-07-131213 True False False 9m58s
kube-apiserver 4.10.0-0.okd-2022-03-07-131213 True False False 9m29s
kube-controller-manager 4.10.0-0.okd-2022-03-07-131213 True False False 13m
kube-scheduler 4.10.0-0.okd-2022-03-07-131213 True False False 9m38s
kube-storage-version-migrator 4.10.0-0.okd-2022-03-07-131213 True False False 16m
machine-api 4.10.0-0.okd-2022-03-07-131213 True False False 15m
machine-approver 4.10.0-0.okd-2022-03-07-131213 True False False 16m
machine-config 4.10.0-0.okd-2022-03-07-131213 True False False 14m
marketplace 4.10.0-0.okd-2022-03-07-131213 True False False 14m
monitoring 4.10.0-0.okd-2022-03-07-131213 True False False 52s
network 4.10.0-0.okd-2022-03-07-131213 True False False 16m
node-tuning 4.10.0-0.okd-2022-03-07-131213 True False False 15m
openshift-apiserver 4.10.0-0.okd-2022-03-07-131213 True False False 5m27s
openshift-controller-manager 4.10.0-0.okd-2022-03-07-131213 True False False 15m
openshift-samples 4.10.0-0.okd-2022-03-07-131213 True False False 6m53s
operator-lifecycle-manager 4.10.0-0.okd-2022-03-07-131213 True False False 15m
operator-lifecycle-manager-catalog 4.10.0-0.okd-2022-03-07-131213 True False False 15m
operator-lifecycle-manager-packageserver 4.10.0-0.okd-2022-03-07-131213 True False False 8m48s
service-ca 4.10.0-0.okd-2022-03-07-131213 True False False 16m
storage 4.10.0-0.okd-2022-03-07-131213 True False False 16m
[root@bastion ~]# oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.12.0-0.okd-2023-03-18-084815 False False True 14m OAuthServerServiceEndpointAccessibleControllerAvailable: Get "https://172.30.64.244:443/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)...
baremetal 4.12.0-0.okd-2023-03-18-084815 True False False 12m
cloud-controller-manager 4.12.0-0.okd-2023-03-18-084815 True False False 16m
cloud-credential True False False 24m
cluster-autoscaler 4.12.0-0.okd-2023-03-18-084815 True False False 11m
config-operator 4.12.0-0.okd-2023-03-18-084815 True False False 14m
console 4.12.0-0.okd-2023-03-18-084815 Unknown False False 81s
control-plane-machine-set 4.12.0-0.okd-2023-03-18-084815 True False False 12m
csi-snapshot-controller 4.12.0-0.okd-2023-03-18-084815 True False False 13m
dns 4.12.0-0.okd-2023-03-18-084815 True False False 8m11s
etcd 4.12.0-0.okd-2023-03-18-084815 True False False 9m1s
image-registry 4.12.0-0.okd-2023-03-18-084815 True False False 2m45s
ingress 4.12.0-0.okd-2023-03-18-084815 True True True 11m The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: PodsScheduled=False (PodsNotScheduled: Some pods are not scheduled: Pod "router-default-fb8b9b7f9-5zkz6" cannot be scheduled: 0/1 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling. Make sure you have sufficient worker nodes.), CanaryChecksSucceeding=Unknown (CanaryRouteNotAdmitted: Canary route is not admitted by the default ingress controller)
insights 4.12.0-0.okd-2023-03-18-084815 True False False 7m44s
kube-apiserver 4.12.0-0.okd-2023-03-18-084815 True False False 7m26s
kube-controller-manager 4.12.0-0.okd-2023-03-18-084815 True True True 7m55s GarbageCollectorDegraded: error fetching rules: Get "https://thanos-querier.openshift-monitoring.svc:9091/api/v1/rules": dial tcp: lookup thanos-querier.openshift-monitoring.svc on 172.30.0.10:53: no such host
kube-scheduler 4.12.0-0.okd-2023-03-18-084815 True False False 7m
kube-storage-version-migrator 4.12.0-0.okd-2023-03-18-084815 True False False 13m
machine-api 4.12.0-0.okd-2023-03-18-084815 True False False 12m
machine-approver 4.12.0-0.okd-2023-03-18-084815 True False False 12m
machine-config 4.12.0-0.okd-2023-03-18-084815 True False False 7m28s
marketplace 4.12.0-0.okd-2023-03-18-084815 True False False 11m
monitoring Unknown True Unknown 12m Rolling out the stack.
network 4.12.0-0.okd-2023-03-18-084815 True True False 13m Deployment "/openshift-network-diagnostics/network-check-source" is waiting for other operators to become ready
node-tuning 4.12.0-0.okd-2023-03-18-084815 True False False 11m
openshift-apiserver 4.12.0-0.okd-2023-03-18-084815 True False False 101s
openshift-controller-manager 4.12.0-0.okd-2023-03-18-084815 True False False 7m14s
openshift-samples 4.12.0-0.okd-2023-03-18-084815 True False False 5m56s
operator-lifecycle-manager 4.12.0-0.okd-2023-03-18-084815 True False False 12m
operator-lifecycle-manager-catalog 4.12.0-0.okd-2023-03-18-084815 True False False 12m
operator-lifecycle-manager-packageserver 4.12.0-0.okd-2023-03-18-084815 True False False 6m11s
service-ca 4.12.0-0.okd-2023-03-18-084815 True False False 14m
storage 4.12.0-0.okd-2023-03-18-084815 True False False 14m
bastion 서버에서 아래 명령어로 설치 확인을 하며 kubeadmin 비밀번호를 알수 있습니다.
향후 계정을 신규로 생성하여 cluster admin 권한을 준후 kubeadmin은 삭제합니다.
- cluster admin 계정을 먼저 생성하지 않고 kubeadmin 삭제하면 cluster 재생성 해야합니다.
[root@bastion ~]# openshift-install --dir=/root/okd4 wait-for install-complete
INFO Waiting up to 40m0s (until 10:35AM) for the cluster at https://api.okd4.ktdemo.duckdns.org:6443 to initialize...
INFO Waiting up to 10m0s (until 10:05AM) for the openshift-console route to be created...
INFO Install complete!
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/root/okd4/auth/kubeconfig'
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.okd4.ktdemo.duckdns.org
INFO Login to the console with user: "kubeadmin", and password: "HeJDB-***-****b****-4hb4q"
INFO Time elapsed: 0s
admin으로 먼저 접속을 해봅니다.
[root@bastion ~]# oc login -u system:admin
Logged into "https://api.okd4.ktdemo.duckdns.org:6443" as "system:admin" using existing credentials.
You have access to 65 projects, the list has been suppressed. You can list all projects with 'oc projects'
Using project "default".
okd 계정을 생성을 하고 비밀번호를 htpasswd 방식으로 설정하기 위해 httpd-tools 라이브러리를 설치합니다.
[root@bastion ~]# dnf install -y httpd-tools
Last metadata expiration check: 1:51:05 ago on Mon 07 Aug 2023 02:28:18 AM EDT.
Package httpd-tools-2.4.37-54.module_el8.8.0+1256+e1598b50.x86_64 is already installed.
Dependencies resolved.
Nothing to do.
Complete!
shclub 라는 이름으로 계정을 만들고 비밀번호도 같이 입력합니다.
[root@bastion ~]# touch htpasswd
[root@bastion ~]# htpasswd -Bb htpasswd shclub 'S#123************'
Adding password for user shclub
[root@bastion ~]# cat htpasswd
shclub:$2y$05$kjWLoagesIMy0.**************
htpasswd 라는 이름으로 kubernetes secret 을 생성합니다.
[root@bastion ~]# oc --user=admin create secret generic htpasswd --from-file=htpasswd -n openshift-config
[root@bastion ~]# oc get secret -n openshift-config
NAME TYPE DATA AGE
builder-dockercfg-lkfwf kubernetes.io/dockercfg 1 20m
builder-token-n4j9g kubernetes.io/service-account-token 4 20m
builder-token-sng5g kubernetes.io/service-account-token 4 20m
default-dockercfg-cdtkb kubernetes.io/dockercfg 1 20m
default-token-mxlks kubernetes.io/service-account-token 4 20m
default-token-s2w2r kubernetes.io/service-account-token 4 27m
deployer-dockercfg-s2wbh kubernetes.io/dockercfg 1 20m
deployer-token-skjsq kubernetes.io/service-account-token 4 20m
deployer-token-wf8wn kubernetes.io/service-account-token 4 20m
etcd-client kubernetes.io/tls 2 27m
etcd-metric-client kubernetes.io/tls 2 27m
etcd-metric-signer kubernetes.io/tls 2 27m
etcd-signer kubernetes.io/tls 2 27m
htpasswd Opaque 0 8s
initial-service-account-private-key Opaque 1 27m
pull-secret kubernetes.io/dockerconfigjson 1 27m
webhook-authentication-integrated-oauth Opaque 1 24m
OKD Cluster 에 생성하기 위해 Local Password
라는 이름으로 identityProviders 를 생성하고 replace 명령어를 사용하여 적용합니다.
[root@bastion ~]# vi oauth-config.yaml
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
name: cluster
spec:
identityProviders:
- name: Local Password
mappingMethod: claim
type: HTPasswd
htpasswd:
fileData:
name: htpasswd
[root@bastion ~]# oc replace -f oauth-config.yaml
oauth.config.openshift.io/cluster replaced
브라우저를 통해 okd web console 에 접속하기 위해 url을 확인합니다.
[root@bastion ~]# oc whoami --show-console
https://console-openshift-console.apps.okd4.ktdemo.duckdns.org
웹브라우저와 외부에서 접속하기 위해서는 공유기에서 포트포워딩을 해주어 합니다. ( 6443,443 포트)
또한, bastion 서버의 nameserver 를 공유기의 ip로 아래과 같이 변경한다.
[root@bastion ~]# vi /etc/resolv.conf
# Generated by NetworkManager
search okd4.ktdemo.duckdns.org
nameserver 192.168.1.1
웹 브라우저에서 접속해 보면 정상적인 okd 로그인 화면이 나옵니다.
bastion 서버에서 해당 유저로 로그인 합니다.
[root@bastion ~]# oc login https://api.okd4.ktdemo.duckdns.org:6443 -u shclub -p N********9876! --insecure-skip-tls-verify
Login successful.
You don't have any projects. You can try to create a new project, by running
oc new-project <projectname>
계정을 추가하는 경우에는 먼저 htpasswd secret에서 기존 정보를 가져온다.
[root@bastion ]# oc get secret htpasswd -ojsonpath={.data.htpasswd} -n openshift-config | base64 --decode > htpasswd
htpasswd 화일에 기존 계정의 값이 있고 아래와 같이 신규 계정을 추가한다.
[root@bastion ~]# htpasswd -Bb htpasswd edu1 'S#123************'
Adding password for user edu1
이제 적용하고 web console에서 다시 접속해 본다.
[root@bastion argocd]# oc --user=admin create secret generic htpasswd --from-file=htpasswd -n openshift-config --dry-run=client -o yaml | oc replace -f -
secret/htpasswd replaced
shcub 라는 namespace를 생성합니다.
[root@bastion ~]# oc new-project shclub
해당 namespace 에 pod를 생성하면 시작이된 후 권한이 없어서 곧 에러가 발생합니다.
admin 으로 로그인 한 후 anyuid 권한을 할당합니다.
아래와 같은 로그 발생시에는 context를 변경해야 합니다.
[root@bastion ~]# oc login -u system:admin
error: username system:admin is invalid for basic auth
context를 조회를 하면 * 표시가 현재 사용하는 context 입니다.
[root@bastion ~]# oc config get-contexts
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
* /api-okd4-ktdemoduckdns-org:6443/root api-okd4-ktdemoduckdns-org:6443 root/api-okd4-ktdemoduckdns-org:6443
/api-okd4-ktdemoduckdns-org:6443/shclub api-okd4-ktdemoduckdns-org:6443 shclub/api-okd4-ktdemoduckdns-org:6443
admin okd4 admin
default/api-okd4-ktdemo-duckdns-org:6443/system:admin api-okd4-ktdemo-duckdns-org:6443 system:admin/api-okd4-ktdemo-duckdns-org:6443 default
use-context 구문을 사용하여 현재 사용하는 context 로 변경합니다.
[root@bastion ~]# kubectl config use-context default/api-okd4-ktdemo-duckdns-org:6443/system:admin
Switched to context "default/api-okd4-ktdemo-duckdns-org:6443/system:admin".
[root@bastion ~]# oc login -u system:admin
Logged into "https://api.okd4.ktdemo.duckdns.org:6443" as "system:admin" using existing credentials.
You have access to 65 projects, the list has been suppressed. You can list all projects with 'oc projects'
Using project "default".
shclub namespace로 접속하기 위해 default service account 에 anyuid 권한을 할당합니다.
oc adm policy add-scc-to-user anyuid system:serviceaccount:<NAMESPACE>:default
[root@bastion ~]# oc adm policy add-scc-to-user anyuid system:serviceaccount:shclub:default
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:anyuid added: "default"
admin 권한을 추가로 부여합니다.
oc adm policy add-role-to-user admin <계정> -n <NAMESPACE>
[root@bastion ~]# oc adm policy add-role-to-user admin shclub -n shclub
clusterrole.rbac.authorization.k8s.io/admin added: "shclub"
권한 제거는 아래와 같다.
oc adm policy remove-role-from-user <role> <username>
Pod를 하나 생성해 봅니다.
[root@bastion ~]# kubectl run nginx --image=nginx
pod/nginx created
[root@bastion ~]# kubectl get po
NAME READY STATUS RESTARTS AGE
nginx 0/1 ContainerCreating 0 4s
[root@bastion ~]# kubectl get po
NAME READY STATUS RESTARTS AGE
nginx 1/1 Running 0 8s
정상적으로 Running 하는 것을 확인 할 수 있습니다.
kubeadmin 대신 cluster-admin 생성은 root 라는 계정을 만들고 cluster-admin권한을 할당합니다.
[root@bastion ~]# oc adm policy add-cluster-role-to-user cluster-admin root
clusterrole.rbac.authorization.k8s.io/cluster-admin added: "root"
coreos 를 ssh 대신 패스워드 방식으로 접속하기 위해서는 먼저 ssh로 로그인을 하고 super user 권한을 확득합니다.
[core@localhost ~]$ sudo su
core 계정에 비밀번호를 생성합니다.
[root@localhost core]# passwd core
Changing password for user core.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
/etc/ssh/sshd_config.d 폴더로 이동합니다.
[root@localhost core]# cd /etc/ssh/sshd_config.d
[root@localhost ssh]# ls
moduli ssh_config.d ssh_host_ecdsa_key.pub ssh_host_ed25519_key.pub ssh_host_rsa_key.pub
ssh_config ssh_host_ecdsa_key ssh_host_ed25519_key ssh_host_rsa_key sshd_config
20-enable-passwords.conf 를 생성한다.
[root@okd-1 sshd_config.d]# ls
10-insecure-rsa-keysig.conf 40-disable-passwords.conf 40-ssh-key-dir.conf 50-redhat.conf
[root@localhost ssh]# vi 20-enable-passwords.conf
아래와 같이 값을 생성하고 저장한다.
PasswordAuthentication yes
sshd 데몬을 재기동한다.
[root@localhost ssh]# systemctl restart sshd
이제 어느 곳에서든 id/password로 접속 가능하다.
설치를 위해서 cluster admin 권한이 있어야 한다.
참고 : https://github.com/shclub/cloudtty
jakelee@jake-MacBookAir cloudshell % oc login https://api.okd4.ktdemo.duckdns.org:6443 -u root -p Sh********0 --insecure-skip-tls-verify
WARNING: Using insecure TLS client config. Setting this option is not supported!
Login successful.
You have access to 68 projects, the list has been suppressed. You can list all projects with 'oc projects'
Using project "shclub".
jakelee@jake-MacBookAir ~ % helm install cloudtty-operator --version 0.5.0 cloudtty/cloudtty
NAME: cloudtty-operator
LAST DEPLOYED: Fri Aug 11 13:19:49 2023
NAMESPACE: shclub
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Thank you for installing cloudtty.
Your release is named cloudtty-operator.
To learn more about the release, try:
$ helm status cloudtty-operator
$ helm get all cloudtty-operator
Documention: https://github.com/cloudtty/cloudtty/-/blob/main/README.md
jakelee@jake-MacBookAir ~ % kubectl wait deployment cloudtty-operator-controller-manager --for=condition=Available=True
deployment.apps/cloudtty-operator-controller-manager condition met
root/.kube 폴더에 kubeconfig 를 생성한다.
[root@bastion .kube]# vi kubeconfig
apiVersion: v1
clusters:
- cluster:
insecure-skip-tls-verify: true
server: https://api.okd4.ktdemoduckdns.org:6443
name: api-okd4-ktdemoduckdns-org:6443
contexts:
- context:
cluster: api-okd4-ktdemoduckdns-org:6443
user: shclub/api-okd4-ktdemoduckdns-org:6443
name: /api-okd4-ktdemoduckdns-org:6443/shclub
current-context: /api-okd4-ktdemoduckdns-org:6443/root
kind: Config
preferences: {}
users:
- name: shclub/api-okd4-ktdemoduckdns-org:6443
user:
token: sha256~********cnLskhIKbwmYwoKAuZ9sowsvSZTsiU
[root@bastion .kube]# ls -al
total 32
drwxr-x--- 3 root root 37 Aug 9 10:44 .
dr-xr-x---. 11 root root 4096 Aug 20 21:09 ..
drwxr-x--- 4 root root 35 Aug 9 09:53 cache
-rw-r----- 1 root root 25107 Aug 9 10:44 kubeconfig
my-kubeconfig 라는 이름으로 secret을 생성한다.
- 접속 정보를 제한.
[root@bastion ~]# kubectl create secret generic my-kubeconfig --from-file=/root/.kube/config
secret/my-kubeconfig created
Dockerfile 을 생성하고 custom 이미지를 생성한다.
https://github.com/shclub/cloudshell 의 Dockerfile를 사용하여 GitHub Action으로 생성한다.
- openshift client 는 Alpine Docker image 에서는 동작하지 않기 때문에 apk add gcompat 을 추가한다. ( 이미 추가됨 )
cloudshell 을 생성하기 위한 yaml 화일을 생성하고 적용한다.
[root@bastion cloudshell]# cat cloud_shell.yaml
apiVersion: cloudshell.cloudtty.io/v1alpha1
kind: CloudShell
metadata:
name: okd-shell
spec:
secretRef:
name: "my-kubeconfig"
image: shclub/cloudshell:master
# commandAction: "kubectl -n shclub get po && bash"
commandAction: "bash"
exposureMode: "ClusterIP"
# exposureMode: "NodePort"
ttl: 55555555 # ttl 설정된 시간 만큼 pod 유지
once: false
[root@bastion cloudshell]# kubectl apply -f cloud_shell.yaml -n shclub
cloudshell.cloudshell.cloudtty.io/okd-shell created
아래 명령어로 모니터링을 하고 Ready
상태가 되면 정상적으로 생성이 된것 입니다.
jakelee@jake-MacBookAir ~ % kubectl get cloudshell -w
NAME USER COMMAND TYPE URL PHASE AGE
okd-shell bash ClusterIP 0s
okd-shell bash ClusterIP 0s
okd-shell bash ClusterIP CreatedJob 0s
okd-shell bash ClusterIP 172.30.180.191:7681 CreatedRouteRule 0s
okd-shell bash ClusterIP 172.30.180.191:7681 Ready 3s
서비스 이름을 확인한다.
[root@bastion cloudshell]# kubectl get po -n shclub
NAME READY STATUS RESTARTS AGE
cloudshell-okd-shell-llr7q 1/1 Running 0 5s
cloudtty-operator-controller-manager-574c45b9df-zg7cf 1/1 Running 0 4m30s
route 생성시 주의 사항은 tls option 설정을 아래와 같이 해야함. (Allow
, edge
)
[root@bastion cloudshell]# cat cloudshell_route.yaml
apiVersion: route.openshift.io/v1
kind: Route
metadata:
labels:
app : okd-shell
name: okd-shell
spec:
host: okd-shell-shclub.apps.okd4.ktdemo.duckdns.org
port:
targetPort: ttyd
tls:
insecureEdgeTerminationPolicy: Allow
termination: edge
# tls:
# insecureEdgeTerminationPolicy: Redirect
# termination: reencrypt
to:
kind: Service
name: cloudshell-okd-shell
weight: 100
wildcardPolicy: None
route를 생성한다.
[root@bastion cloudshell]# kubectl apply -f cloudshell_route.yaml -n shclub
[root@bastion cloudshell]# kubectl get route -n shclub
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
console okd-shell-shclub.apps.okd4.ktdemo.duckdns.org cloudshell-okd-shell https edge/Allow None
브라우저에서 route 인 https://okd-shell-shclub.apps.okd4.ktdemo.duckdns.org 로 접속해 보면 아래와 같이 shell이 생성 된것을 알수 있다.
[root@okd-1 core]# timedatectl set-timezone Asia/Seoul
[root@okd-1 core]# date
Mon Aug 21 09:35:38 KST 2023
linux는 jounal 로그파일 크기가 많이 차지하면 시간이 엄청 걸립니다.
systemd-journal로그 서비스는 커널의 로그, 초기 시스템 시작 단계, 시작 및 실행 중 시스템 데몬의 표준 출력 및 오류 메시지, syslog에서 로그를 수집하는 향상된 로그 관리 서비스입니다.
여러가지 로그를 수집하므로 부피가 엄청 늘어날수 있어 조정이 필요.
journal 로그 파일 사이즈 확인
[root@okd-1 core]# journalctl --disk-usage
Archived and active journals take up 3.9G in the file system.
500M 의 로그만 유지
[root@okd-1 core]# journalctl --vacuum-size=500M
Vacuuming done, freed 0B of archived journals from /var/log/journal.
Vacuuming done, freed 0B of archived journals from /run/log/journal.
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-00000000012c8827-0006035fa10b7629.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-00000000012e44c0-0006035fc78d6328.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-0000000001300140-0006035fed1a8a7e.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-000000000131bdbe-0006036012a6a507.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-0000000001337a32-0006036038a6c081.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-00000000013536d8-000603605e0b49c9.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-000000000136f355-0006036083a417f6.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-000000000138afcf-00060360a8f02351.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-00000000013a6c6c-00060360ce699352.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-00000000013c28e6-00060360f3f491ba.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-00000000013de56f-00060361198158c4.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-00000000013fa213-000603613ea5b7d6.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-0000000001415e89-0006036163819ecc.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-0000000001431afd-0006036188605326.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-000000000144d776-00060361ad322dd9.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-000000000146941a-00060361d1b0f781.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-000000000148509e-00060361f6d10816.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-00000000014a0d35-000603621b013193.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-00000000014bc9cb-000603623f5a18fa.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-00000000014d8636-00060362642d0d3c.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-00000000014f42b0-00060362881420b1.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-000000000150ff51-00060362abff7e08.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-000000000152bbc5-00060362cfac3926.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-0000000001547830-00060362f3ef5bf9.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-00000000015634c7-0006036317d2f23e.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-000000000157f14c-000603633bc0af8e.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-000000000159ade2-000603635f64323d.journal (128.0M).
Deleted archived journal /var/log/journal/126376a91cbf47ffab943ee1bddd8398/system@26be5bf4a08e49c2beb6cbdceed653fc-00000000015b6a85-00060363827b4591.journal (128.0M).
Vacuuming done, freed 3.5G of archived journals from /var/log/journal/126376a91cbf47ffab943ee1bddd8398.
Vacuuming done, freed 0B of archived journals from /run/log/journal/cf886e957b874fa6b0133b1043b8a2c4.
에러 발생시 아래 처럼 로그를 다 삭제하고 재기동한다.
[root@okd-1 core]# rm -rf /var/log/journal/*
[root@okd-1 core]# systemctl restart systemd-journald.service
먼저 namespace를 2개를 생성한다.
oc new-project argocd
oc new-project argo-rollouts
권한을 설정한다.
oc adm policy add-scc-to-user anyuid -z default -n argocd
oc adm policy add-scc-to-user privileged -z default -n argocd
oc adm policy add-scc-to-user privileged -z argocd-redis -n argocd
oc adm policy add-scc-to-user privileged -z argocd-repo-server -n argocd
oc adm policy add-scc-to-user privileged -z argocd-dex-server -n argocd
oc adm policy add-scc-to-user privileged -z argocd-server -n argocd
oc adm policy add-scc-to-user privileged -z argocd-applicationset-controller -n argocd
oc adm policy add-scc-to-user privileged -z argocd-notifications-controller -n argocd
ArgoCD Manifest 화일을 다운 받는다. argo-cd.yaml 화일이 다운로드 된 것을 확인 할 수 있다.
curl https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml -o argo-cd.yaml
이번엔 Argo CD CLI 툴을 다운로드하고, PATH 경로에 추가한다.
VERSION=$(curl --silent "https://api.github.com/repos/argoproj/argo-cd/releases/latest" | grep '"tag_name"' | sed -E 's/.*"([^"]+)".*/\1/')
curl -sSL -o /usr/local/bin/argocd https://github.com/argoproj/argo-cd/releases/download/$VERSION/argocd-linux-amd64
chmod +x /usr/local/bin/argocd
k8s에 ArgoCD를 설치 합니다.
kubectl apply -n argocd -f argo-cd.yaml
k8s에 argo-rollouts을 설치 합니다.
argo-rollouts 은 blue/green 과 canary 배포 방식을 지원합니다.
kubectl apply -f https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml -n argo-rollouts
redis pod 가 안올라가는 경우는 event를 확인 하는데 아래와 같이 에러가 발생한다.
deployment에서 runAsUser 값을 ranges 값으로 변경한다.
spec.containers[0].securityContext.runAsUser: Invalid value: 999: must be in the ranges: [1000660000, 1000669999],
route 를 생성하기 위해 서비스 이름을 확인합니다.
[root@bastion argocd]# kubectl get svc -n argocd
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
argocd-applicationset-controller ClusterIP 172.30.169.175 <none> 7000/TCP,8080/TCP 28m
argocd-dex-server ClusterIP 172.30.253.118 <none> 5556/TCP,5557/TCP,5558/TCP 28m
argocd-metrics ClusterIP 172.30.215.134 <none> 8082/TCP 28m
argocd-notifications-controller-metrics ClusterIP 172.30.70.168 <none> 9001/TCP 28m
argocd-redis ClusterIP 172.30.211.26 <none> 6379/TCP 28m
argocd-repo-server ClusterIP 172.30.217.222 <none> 8081/TCP,8084/TCP 28m
argocd-server ClusterIP 172.30.172.143 <none> 80/TCP,443/TCP 28m
웹 브라우저에서 접속하기 위해 route 를 생성합니다.
[root@bastion argocd]# vi argocd_route.yaml
apiVersion: route.openshift.io/v1
kind: Route
metadata:
labels:
app : argocd
name: argocd
spec:
host: argocd.apps.okd4.ktdemo.duckdns.org
port:
targetPort: http
tls:
insecureEdgeTerminationPolicy: Allow
termination: edge
to:
kind: Service
name: argocd-server
weight: 100
wildcardPolicy: None
적용하고 생성된 route를 확인합니다.
[root@bastion argocd]# kubectl apply -f argocd_route.yaml -n argocd
[root@bastion argocd]# kubectl get route -n argocd
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
argocd argocd.apps.okd4.ktdemo.duckdns.org argocd-server http edge/Allow None
route 에서 https 인증을 처리하는 경우, argocd 의 https 강제실행을 중지시켜야 할 필요가 있다.
아래 명령으로 강제 실행을 중지한다.
[root@bastion argocd]# kubectl edit cm argocd-cmd-params-cm -n argocd
apiVersion: v1
data:
server.insecure: "true"
kind: ConfigMap
아래 내용 추가
data:
server.insecure: "true"
최신 버전부터는 OKD에 설치시 속도가 느려지고 argocd-redis 서비스 연결 오류가 발생하는 데 이런 경우에는 network policy를 삭제 하면 된다.
[root@bastion argocd]# kubectl get networkpolicy -n argocd
NAME POD-SELECTOR AGE
argocd-application-controller-network-policy app.kubernetes.io/name=argocd-application-controller 114m
argocd-applicationset-controller-network-policy app.kubernetes.io/name=argocd-applicationset-controller 114m
argocd-dex-server-network-policy app.kubernetes.io/name=argocd-dex-server 114m
argocd-notifications-controller-network-policy app.kubernetes.io/name=argocd-notifications-controller 114m
argocd-redis-network-policy app.kubernetes.io/name=argocd-redis 114m
argocd-repo-server-network-policy app.kubernetes.io/name=argocd-repo-server 114m
argocd-server-network-policy app.kubernetes.io/name=argocd-server
아래와 같이 삭제한다.
[root@bastion argocd]# kubectl delete networkpolicy argocd-server-79bc95c4-kkhwv argocd-repo-server-network-policy argocd-redis-network-policy argocd-dex-server-network-policy -n argocd
networkpolicy.networking.k8s.io "argocd-repo-server-network-policy" deleted
networkpolicy.networking.k8s.io "argocd-redis-network-policy" deleted
networkpolicy.networking.k8s.io "argocd-dex-server-network-policy" deleted
deployment 전체 재기동은 아래와 같이 실행한다.
[root@bastion argocd]# kubectl rollout restart deployment -n argocd
deployment.apps/argocd-applicationset-controller restarted
deployment.apps/argocd-dex-server restarted
deployment.apps/argocd-notifications-controller restarted
deployment.apps/argocd-redis restarted
deployment.apps/argocd-repo-server restarted
deployment.apps/argocd-server restarted
초기 비밀번호 확인
[root@bastion argocd]# kubectl get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" -n argocd | base64 -d && echo
1jBqpaCukWy58RzT
웹브라우저에서 http://argocd-argocd.apps.okd4.ktdemo.duckdns.org 로 접속하고 admin 계정/초기비밀번호로 로그인 하고 비밀번호를 변경합니다.
계정 생성을 위해서는 argocd-server를 NodePort 로 expose하고 argocd cli 로 접속한다.
[root@bastion argocd]# kubectl get svc -n argocd
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
argocd-applicationset-controller ClusterIP 172.30.157.78 <none> 7000/TCP,8080/TCP 162m
argocd-dex-server ClusterIP 172.30.96.19 <none> 5556/TCP,5557/TCP,5558/TCP 162m
argocd-metrics ClusterIP 172.30.216.229 <none> 8082/TCP 162m
argocd-notifications-controller-metrics ClusterIP 172.30.155.71 <none> 9001/TCP 162m
argocd-redis ClusterIP 172.30.45.66 <none> 6379/TCP 162m
argocd-repo-server ClusterIP 172.30.178.138 <none> 8081/TCP,8084/TCP 162m
argocd-server NodePort 172.30.154.250 <none> 80:30866/TCP,443:31928/TCP 162m
argocd-server-metrics ClusterIP 172.30.55.105 <none> 8083/TCP 162m
[root@bastion argocd]# argocd login 192.168.1.40:30866
FATA[0000] dial tcp 192.168.1.247:30866: connect: connection refused
Master 와 Workor node 의 ip로 접속한다.
[root@bastion argocd]# argocd login 192.168.1.146:30866
WARNING: server is not configured with TLS. Proceed (y/n)? y
Username: admin
Password:
'admin:login' logged in successfully
Context '192.168.1.146:30866' updated
argocd-cm
configmap에 data 를 생성하고 계정을 아래와 같이 추가합니다.
[root@bastion argocd]# kubectl -n argocd edit configmap argocd-cm -o yaml
apiVersion: v1
data:
accounts.edu1: apiKey,login
accounts.edu2: apiKey,login
accounts.edu3: apiKey,login
accounts.edu4: apiKey,login
accounts.edu5: apiKey,login
accounts.haerin: apiKey,login
accounts.shclub: apiKey,login
accounts.rorty: apiKey,login
accounts.hans: apiKey,login
exec.enabled: "true"
kind: ConfigMap
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"v1","kind":"ConfigMap","metadata":{"annotations":{},"labels":{"app.kubernetes.io/name":"argocd-cm","app.kubernetes.io/part-of":"argocd"},"name":"argocd-cm","namespace":"argocd"}}
creationTimestamp: "2023-08-22T06:32:36Z"
labels:
app.kubernetes.io/name: argocd-cm
app.kubernetes.io/part-of: argocd
name: argocd-cm
namespace: argocd
resourceVersion: "5057239"
uid: 2b5d0dbe-eb9c-41f7-bc26-e93355aa6e48
계정 별로 비밀번호를 생성합니다.
[root@bastion argocd]# argocd account update-password --account shclub
*** Enter password of currently logged in user (admin):
*** Enter new password for user shclub:
*** Confirm new password for user shclub:
ERRO[0011] Passwords do not match
*** Enter new password for user shclub:
*** Confirm new password for user shclub:
Password updated
아래처럼 해도 됨
argocd account update-password --account developer --new-password Developer123
argocd-rbac-cm
configmap에 data 를 생성하고 계정별로 권한을 추가합니다.
- exec는 terminal 을 사용할수 있는 권한 입니다.
[root@bastion argocd]# kubectl -n argocd edit configmap argocd-rbac-cm -o yaml
data:
policy.csv: |
p, role:manager, applications, *, */*, allow
p, role:manager, clusters, get, *, allow
p, role:manager, repositories, *, *, allow
p, role:manager, projects, *, *, allow
p, role:manager, exec, * , */*, allow
p, role:edu1, clusters, get, *, allow
p, role:edu1, repositories, get, *, allow
p, role:edu1, projects, get, *, allow
p, role:edu1, applications, *, edu1/*, allow
p, role:edu1, exec, create, edu1/*, allow
p, role:edu2, clusters, get, *, allow
p, role:edu2, repositories, get, *, allow
p, role:edu2, projects, get, *, allow
p, role:edu2, applications, *, edu2/*, allow
p, role:edu2, exec, create, edu2/*, allow
p, role:edu3, clusters, get, *, allow
p, role:edu3, repositories, get, *, allow
p, role:edu3, projects, get, *, allow
p, role:edu3, applications, *, edu3/*, allow
p, role:edu3, exec, create, edu3/*, allow
p, role:edu4, clusters, get, *, allow
p, role:edu4, repositories, get, *, allow
p, role:edu4, projects, get, *, allow
p, role:edu4, applications, *, edu4/*, allow
p, role:edu4, exec, create, edu4/*, allow
p, role:edu5, clusters, get, *, allow
p, role:edu5, repositories, get, *, allow
p, role:edu5, projects, get, *, allow
p, role:edu5, applications, *, edu5/*, allow
p, role:edu5, exec, create, edu5/*, allow
g, edu1, role:edu1
g, edu2, role:edu2
g, edu3, role:edu3
g, edu4, role:edu4
g, edu5, role:edu5
g, shclub, role:manager
g, haerin, role:manager
g, hans, role:manager
g, rorty, role:manager
policy.default: role:''
argocd 2.4 버전 부터 terminal 기능을 사용 하려면 터미널 실행 기능을 설정한다.
[root@bastion argocd]# kubectl edit cm argocd-cm -n argocd
apiVersion: v1
data:
accounts.edu1: apiKey,login
accounts.edu2: apiKey,login
accounts.edu3: apiKey,login
accounts.edu4: apiKey,login
accounts.edu5: apiKey,login
accounts.haerin: apiKey,login
accounts.hans: apiKey,login
accounts.rorty: apiKey,login
accounts.shclub: apiKey,login
exec.enabled: "true"
kind: ConfigMap
아래 내용 추가
data:
exec.enabled: "true"
이제 argocd-server
role 을 변경합니다.
[root@bastion argocd]# kubectl edit role argocd-server -n argocd
맨 아래에 아래 내용을 추가 한다.
- apiGroups:
- ""
resources:
- pods/exec
verbs:
- create
argocd-rbac-cm 를 수정하여 exec, create
를 추가한다.
위에서 추가 해서 여기서는 여기서는 확인만 한다.
[root@bastion argocd]# kubectl -n argocd edit configmap argocd-rbac-cm -o yaml
p, role:<유저명>, exec, create, <프로젝트명>/*, allow
argocd-server pod를 삭제하고 재기동 한다.
[root@bastion argocd]# kubectl get pod -n argocd
NAME READY STATUS RESTARTS AGE
argocd-application-controller-0 1/1 Running 0 118m
argocd-applicationset-controller-6765c74d68-t5jdv 1/1 Running 0 118m
argocd-dex-server-7d974fc66b-ntz2d 1/1 Running 0 113m
argocd-notifications-controller-5df7fddfb7-gqs2w 1/1 Running 0 118m
argocd-redis-74b8bcc46b-hg22b 1/1 Running 0 110m
argocd-repo-server-6db877f7d9-d78qf 1/1 Running 0 112m
argocd-server-6c7df9df6b-85vn9 1/1 Running 0 115m
[root@bastion argocd]#
[root@bastion argocd]# kubectl delete po argocd-server-6c7df9df6b-85vn9 -n argocd
pod "argocd-server-6c7df9df6b-85vn9" deleted
아래와 같이 터미널이 활성화가 된다.
참고
설치 전에 git과 helm 을 먼저 설치한다.
[root@bastion ~]# yum install git
[root@bastion ~]# curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
[root@bastion ~]# chmod 700 get_helm.sh
[root@bastion ~]# ./get_helm.sh
Downloading https://get.helm.sh/helm-v3.12.3-linux-amd64.tar.gz
Verifying checksum... Done.
Preparing to install helm into /usr/local/bin
helm installed into /usr/local/bin/helm
PV/PVC 를 자동으로 생성하기 위해서 Dynamic Provisioning 을 사용한다.
nfs-subdir-external-provisioner
를 사용하기로 한다.
helm repository를 추가한다.
[root@bastion dynamic_provisioning]# helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/okd4/auth/kubeconfig
"nfs-subdir-external-provisioner" has been added to your repositories
[root@bastion dynamic_provisioning]# helm repo list
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/okd4/auth/kubeconfig
NAME URL
bitnami https://charts.bitnami.com/bitnami
harbor https://helm.goharbor.io
nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner
chart를 검색한다.
[root@bastion dynamic_provisioning]# helm search repo nfs-subdir-external-provisioner
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/okd4/auth/kubeconfig
NAME CHART VERSION APP VERSION DESCRIPTION
nfs-subdir-external-provisioner/nfs-subdir-exte... 4.0.18 4.0.2 nfs-subdir-external-provisioner is an automatic...
k8s 1.25 부터는 podSecurityPolicy 가 deprecated 가 되어 podSecurityStandard 로 변경이 됨.
현재 namespace 구성을 보면 pod는 privileged 권한을 사용할 수 없음
[root@bastion ]# kubectl describe namespace shclub
Name: shclub
Labels: kubernetes.io/metadata.name=shclub
pod-security.kubernetes.io/audit=privileged
pod-security.kubernetes.io/audit-version=v1.24
pod-security.kubernetes.io/warn=privileged
pod-security.kubernetes.io/warn-version=v1.24
Annotations: openshift.io/description:
openshift.io/display-name:
openshift.io/node-selector: devops=true
openshift.io/requester: system:admin
openshift.io/sa.scc.mcs: s0:c26,c15
openshift.io/sa.scc.supplemental-groups: 1000680000/10000
openshift.io/sa.scc.uid-range: 1000680000/10000
Status: Active
No resource quota.
No LimitRange resource.
아래와 같이 현재 namespace 의 pod security standard를 변경해야 다이나믹 프로비져닝 생성이 가능함.
[root@bastion ]# kubectl label --overwrite ns shclub \
pod-security.kubernetes.io/enforce=baseline \
pod-security.kubernetes.io/enforce-version=v1.24
[root@bastion ]# kubectl describe namespace shclub
Name: shclub
Labels: kubernetes.io/metadata.name=shclub
pod-security.kubernetes.io/audit=privileged
pod-security.kubernetes.io/audit-version=v1.24
pod-security.kubernetes.io/enforce=baseline
pod-security.kubernetes.io/enforce-version=v1.24
pod-security.kubernetes.io/warn=privileged
pod-security.kubernetes.io/warn-version=v1.24
Annotations: openshift.io/description:
openshift.io/display-name:
openshift.io/node-selector: devops=true
openshift.io/requester: system:admin
openshift.io/sa.scc.mcs: s0:c26,c15
openshift.io/sa.scc.supplemental-groups: 1000680000/10000
openshift.io/sa.scc.uid-range: 1000680000/10000
Status: Active
No resource quota.
No LimitRange resource.
helm 으로 values.yaml 화일을 생성한다.
[root@bastion dynamic_provisioning]# helm show values nfs-subdir-external-provisioner/nfs-subdir-external-provisioner > values.yaml
values 에서는 아래와 같이 변경한다.
1 replicaCount: 1
2 strategyType: Recreate
3
4 image:
5 repository: registry.k8s.io/sig-storage/nfs-subdir-external-provisioner
6 tag: v4.0.2
7 pullPolicy: IfNotPresent
8 imagePullSecrets: []
9
10 nfs:
11 server: 192.168.1.79 # NAS IP
12 path: /volume3/okd/dynamic # path
13 mountOptions:
14 volumeName: nfs-subdir-external-provisioner-root
15 # Reclaim policy for the main nfs volume
16 reclaimPolicy: Retain
17
18 # For creating the StorageClass automatically:
19 storageClass:
20 create: true
21
22 # Set a provisioner name. If unset, a name will be generated.
23 # provisionerName:
24
25 # Set StorageClass as the default StorageClass
26 # Ignored if storageClass.create is false
27 defaultClass: false
28
29 # Set a StorageClass name
30 # Ignored if storageClass.create is false
31 name: nfs-client # storage class 이름
32
33 # Allow volume to be expanded dynamically
34 allowVolumeExpansion: true
35
36 # Method used to reclaim an obsoleted volume
37 reclaimPolicy: Delete
38
39 # When set to false your PVs will not be archived by the provisioner upon deletion of the PVC.
40 archiveOnDelete: false # archive 안함
41
42 # If it exists and has 'delete' value, delete the directory. If it exists and has 'retain' value, save the directory.
43 # Overrides archiveOnDelete.
44 # Ignored if value not set.
...
49 pathPattern:
50
51 # Set access mode - ReadWriteOnce, ReadOnlyMany or ReadWriteMany
52 accessModes: ReadWriteOnce
53
54 # Set volume bindinng mode - Immediate or WaitForFirstConsumer
55 volumeBindingMode: Immediate
56
57 # Storage class annotations
58 annotations: {}
59
60 leaderElection:
61 # When set to false leader election will be disabled
62 enabled: true
63
64 ## For RBAC support:
65 rbac:
66 # Specifies whether RBAC resources should be created
67 create: true
68
...
71 podSecurityPolicy:
72 enabled: false # okd (k8s 1.24 이하는 설정. 1.25 이상에서는 false
73
74 # Deployment pod annotations
75 podAnnotations: {}
76
80 #podSecurityContext: {}
81
82 podSecurityContext: {}
이제 helm 으로 설치를 한다.
[root@bastion dynamic_provisioning]# helm install nfs-subdir-external-provisioner -f values.yaml nfs-subdir-external-provisioner/nfs-subdir-external-provisioner -n shclub
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/okd4/auth/kubeconfig
NAME: nfs-subdir-external-provisioner
LAST DEPLOYED: Fri Sep 1 22:59:26 2023
NAMESPACE: shclub
STATUS: deployed
REVISION: 1
TEST SUITE: None
storageclass 가 nfs-client 이름으로 생성이 된 것을 확인 할 수 있다.
[root@bastion dynamic_provisioning]# kubectl get storageclass
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
nfs-client cluster.local/nfs-subdir-external-provisioner Delete Immediate true 15s
PVC 를 생성하여 테스트 해본다.
[root@bastion dynamic_provisioning]# vi test_claim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: nfs-pvc-test
spec:
storageClassName: nfs-client # SAME NAME AS THE STORAGECLASS
accessModes:
- ReadWriteMany # must be the same as PersistentVolume
resources:
requests:
storage: 1Gi
정상적으로 PVC 가 자동 생성 된 것을 확인 할 수 있다.
[root@bastion dynamic_provisioning]# kubectl get pvc -n shclub
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
nfs-pvc-test Bound pvc-3b872a76-4fe5-4a6d-9c19-9ebdb4c4e58f 1Gi RWX nfs-client 6m36s
minio 이름으로 namespace를 생성하고 uid-range
를 확인한다.
[root@bastion minio]# oc new-project minio
[root@bastion minio]# kubectl describe namespace minio
Name: minio
Labels: kubernetes.io/metadata.name=minio
Annotations: openshift.io/description:
openshift.io/display-name:
openshift.io/requester: system:admin
openshift.io/sa.scc.mcs: s0:c27,c4
openshift.io/sa.scc.supplemental-groups: 1000710000/10000
openshift.io/sa.scc.uid-range: 1000710000/10000
Status: Active
No resource quota.
No LimitRange resource.
minio namespace 에 권한을 할당한다.
[root@bastion minio]# oc adm policy add-scc-to-user anyuid system:serviceaccount:minio:default
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:anyuid added: "default"
[root@bastion minio]# oc adm policy add-scc-to-user privileged system:serviceaccount:minio:default
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:privileged added: "default"
helm 설정을 한다. 우리는 bitnami 에서 제공하는 chart를 사용한다.
[root@bastion minio]# helm repo add bitnami https://charts.bitnami.com/bitnami
[root@bastion minio]# helm search repo bitnami | grep minio
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/okd4/auth/kubeconfig
bitnami/minio 12.7.0 2023.7.18 MinIO(R) is an object storage server, compatibl...
우리는 다이나믹 프로지저닝으로 설치 할 예정임 으로 아래 pv/pvc 생성은 skip 해도 된다.
minio 에서 사용한 pv 와 pvc를 생성한다. nfs는 synology nas 에서 생성하였고 과정은 skip.
[root@bastion minio]# vi minio_pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
name: minio-pv
spec:
accessModes:
- ReadWriteMany
capacity:
storage: 100Gi
nfs:
path: /volume3/okd/minio
server: 192.168.1.79
persistentVolumeReclaimPolicy: Retain
pvc 생성
[root@bastion minio]# vi minio_pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: minio-pvc
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 100Gi
volumeName: minio-pv
Openshift 에 맞게 values를 수정하기 위해 values.yaml 로 다운 받는다.
[root@bastion minio]# helm show values bitnami/minio > values.yaml
values.yaml 화일에서 아래 부분을 수정한다.
100 rootUser: admin
101 ## @param auth.rootPassword Password for MinIO® root user
102 ##
103 rootPassword: "hahahaha" # 본인의 비밀번호 설정
...
377 ## config:
378 ## - name: region
379 ## options:
380 ## name: us-east-1
381 config:
382 - name: region
383 options:
384 name: ap-northeast-2 # S3 호환으로 사용하기 위해 region을 설정한다.
...
389 ##
390 podSecurityContext:
391 enabled: true
392 fsGroup: 1000680000 #1001 # namespace 의 uid range 값으로 변경한다.
...
399 containerSecurityContext:
400 enabled: true
401 runAsUser: 1000680000 #1001
...
426 podSecurityContext:
427 enabled: true
428 fsGroup: 1000680000 # 1001
...
434 ##
435 containerSecurityContext:
436 enabled: true
437 runAsUser: 1000680000 #1001
438 runAsNonRoot: true
...
899 persistence:
900 ## @param persistence.enabled Enable MinIO® data persistence using PVC. If false, use emptyDir
901 ##
902 enabled: true
909 ##
910 storageClass: "nfs-client" # storage class 이름
911 ## @param persistence.mountPath Data volume mount path
912 ##
913 mountPath: /data
914 ## @param persistence.accessModes PVC Access Modes for MinIO® data volume
915 ##
916 accessModes:
917 - ReadWriteOnce
918 ## @param persistence.size PVC Storage Request for MinIO® data volume
919 ##
920 size: 100Gi # 위에서 설정한 pvc 사이즈
925 ##
926 existingClaim: "" # 다이나믹 으로 설치함으로 아무 것도 넣지 않는다.
이제 위에서 수정한 values.yaml 화일로 설치를 한다.
[root@bastion minio]# helm install my-minio -f values.yaml bitnami/minio -n minio
NAME: my-minio
LAST DEPLOYED: Wed Aug 23 10:04:29 2023
NAMESPACE: minio
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
CHART NAME: minio
CHART VERSION: 12.7.0
APP VERSION: 2023.7.18
** Please be patient while the chart is being deployed **
MinIO® can be accessed via port on the following DNS name from within your cluster:
my-minio.minio.svc.cluster.local
To get your credentials run:
export ROOT_USER=$(kubectl get secret --namespace minio my-minio -o jsonpath="{.data.root-user}" | base64 -d)
export ROOT_PASSWORD=$(kubectl get secret --namespace minio my-minio -o jsonpath="{.data.root-password}" | base64 -d)
To connect to your MinIO® server using a client:
- Run a MinIO® Client pod and append the desired command (e.g. 'admin info'):
kubectl run --namespace minio my-minio-client \
--rm --tty -i --restart='Never' \
--env MINIO_SERVER_ROOT_USER=$ROOT_USER \
--env MINIO_SERVER_ROOT_PASSWORD=$ROOT_PASSWORD \
--env MINIO_SERVER_HOST=my-minio \
--image docker.io/bitnami/minio-client:2023.7.18-debian-11-r0 -- admin info minio
To access the MinIO® web UI:
- Get the MinIO® URL:
echo "MinIO® web URL: http://127.0.0.1:9001/minio"
kubectl port-forward --namespace minio svc/my-minio 9001:9001
외부에서 접속하기 위해서 route를 생성한다.
[root@bastion minio]# kubectl get svc -n minio
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
my-minio ClusterIP 172.30.88.228 <none> 9000/TCP,9001/TCP 84m
[root@bastion minio]# kubectl describe svc my-minio -n minio
Name: my-minio
Namespace: minio
Labels: app.kubernetes.io/instance=my-minio
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=minio
helm.sh/chart=minio-12.7.0
Annotations: meta.helm.sh/release-name: my-minio
meta.helm.sh/release-namespace: minio
Selector: app.kubernetes.io/instance=my-minio,app.kubernetes.io/name=minio
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 172.30.88.228
IPs: 172.30.88.228
Port: minio-api 9000/TCP
TargetPort: minio-api/TCP
Endpoints: 10.128.0.48:9000
Port: minio-console 9001/TCP
TargetPort: minio-console/TCP
Endpoints: 10.128.0.48:9001
Session Affinity: None
Events: <none>
minio-console
포트로 route와 서비스 포트 매핑이 필요하다.
[root@bastion minio]# vi minio_route.yaml
apiVersion: route.openshift.io/v1
kind: Route
metadata:
labels:
app : minio
name: minio
spec:
host: minio.apps.okd4.ktdemo.duckdns.org
port:
targetPort: minio-console
tls:
insecureEdgeTerminationPolicy: Allow
termination: edge
to:
kind: Service
name: my-minio
weight: 100
wildcardPolicy: None
웹브라우저에서 http://minio.apps.okd4.ktdemo.duckdns.org 로 접속하고 admin 계정/values.yaml에 설정한 비밀번호로 로그인 한다.
admin으로 로그인을 한 후 Administrator -> Identity -> Users 메뉴에서 계정을 하나 생성하고 권한을 부여한다.
admin 계정은 로그 아웃 하고 신규 생성한 계정으로 로그인 하고 Object Browser 메뉴로 이동하면 bucket을 생성하라는 화면이 나온다.
harbor를 docker private registry 로 설치할 예정이고 스토리지는 minio를 사용할 예정이기 때문에 이름을 harbor-registry
라는 이름으로 할당하고 create bucket 버튼을 클릭하여 생성한다.
- versioning 만 체크 한다.
생성된 된 bucket은 다음 과 같다. read/write 권한이 할당 되어 있다.
bucket 이름을 클릭하고 access 를 private 으로 설정한다.
위에서 생성한 bucket 에 접근하기 위한 access key를 생성한다.
create acces key 버튼 클릭.
create 버튼 을 누르면 Access key 와 secret 이 생성이 되고 Download 버튼을 눌러 키를 다운 받을 수 있다.
생성된 key 를 확인 할 수 있다.
bastion 서버에 harbor 폴더를 생성하고 okd에 harbor namespace를 생성한다.
[root@bastion harbor]# oc new-project harbor
Now using project "harbor" on server "https://api.okd4.ktdemo.duckdns.org:6443".
You can add applications to this project with the 'new-app' command. For example, try:
oc new-app rails-postgresql-example
to build a new example application in Ruby. Or use kubectl to deploy a simple Kubernetes application:
kubectl create deployment hello-node --image=k8s.gcr.io/e2e-test-images/agnhost:2.33 -- /agnhost serve-hostname
해당 namespace 에 권한을 부여한다.
[root@bastion harbor]# oc adm policy add-scc-to-user anyuid system:serviceaccount:harbor:default
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:anyuid added: "default"
[root@bastion harbor]# oc adm policy add-scc-to-user privileged system:serviceaccount:harbor:default
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:privileged added: "default"
harbor repository 를 추가한다. bitnami 를 사용 시 불 필요.
아래 과정은 harbor를 사용 하는 예제이다.
[root@bastion harbor]# helm repo add harbor https://helm.goharbor.io
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/okd4/auth/kubeconfig
"harbor" has been added to your repositories
[root@bastion harbor]# helm repo list
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/okd4/auth/kubeconfig
NAME URL
bitnami https://charts.bitnami.com/bitnami
harbor https://helm.goharbor.io
[root@bastion harbor]# helm search repo harbor
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/okd4/auth/kubeconfig
NAME CHART VERSION APP VERSION DESCRIPTION
bitnami/harbor 16.7.4 2.8.3 Harbor is an open source trusted cloud-native r...
harbor/harbor 1.12.4 2.8.4 An open source trusted cloud native registry th...
Openshift 에 맞게 values를 수정하기 위해 harbor_values.yaml 로 다운 받는다.
[root@bastion harbor]# helm show values harbor/harbor > harbor_values.yaml
values.yaml 화일에서 아래 부분을 수정한다.
1 expose:
2
4 type: ingress # ingress 로 설정
5 tls:
6 # Enable TLS or not.
7 # Delete the "ssl-redirect" annotations in "expose.ingress.annotations" when TLS is disabled and "expose.type" is "ingress"
8 # Note: if the "expose.type" is "ingress" and TLS is disabled,
9 # the port must be included in the command when pulling/pushing images.
10 # Refer to https://github.com/goharbor/harbor/issues/5291 for details.
11 enabled: true # true 로 설정해야 route/ingress에서 tls 설정이 생성됨. 아주 중요
...
34 ingress:
35 hosts:
36 core: myharbor.apps.okd4.ktdemo.duckdns.org # 나의 URL 로 설정
37 notary: notray-harbor.apps.okd4.ktdemo.duckdns.org #notary.harbor.domain
...
126 # If Harbor is deployed behind the proxy, set it as the URL of proxy
127 externalURL: https://myharbor.apps.okd4.ktdemo.duckdns.org # url 이 정상적으로 설정되어야 docker push가 되고 push command가 정상적으로 나옴
...
203 resourcePolicy: "keep" # keep이면 helm 지워도 pvc 삭제 안됨
204 persistentVolumeClaim:
205 registry:
208 existingClaim: ""
...
212 storageClass: "nfs-client" # storage class 설정
213 subPath: ""
214 accessMode: ReadWriteOnce
215 size: 5Gi
216 annotations: {}
217 jobservice:
218 jobLog:
219 existingClaim: ""
220 storageClass: "nfs-client" # storage class 설정
221 subPath: ""
222 accessMode: ReadWriteOnce
223 size: 1Gi
224 annotations: {}
225 # If external database is used, the following settings for database will
226 # be ignored
227 database:
228 existingClaim: ""
229 storageClass: "nfs-client" # storage class 설정
230 subPath: ""
231 accessMode: ReadWriteOnce
232 size: 1Gi
233 annotations: {}
234 # If external Redis is used, the following settings for Redis will
...
236 redis:
237 existingClaim: ""
238 storageClass: "nfs-client" # storage class 설정
239 subPath: ""
240 accessMode: ReadWriteOnce
241 size: 1Gi
242 annotations: {}
243 trivy:
244 existingClaim: ""
245 storageClass: "nfs-client" # storage class 설정
246 subPath: ""
247 accessMode: ReadWriteOnce
248 size: 5Gi
249 annotations: {}
...
267 # Specify the type of storage: "filesystem", "azure", "gcs", "s3", "swift",
268 # "oss" and fill the information needed in the corresponding section. The type
269 # must be "filesystem" if you want to use persistent volumes for registry
270 type: s3 # 스토로지 타입을 선택한다. 우리는 minio object storage를 사용하기 때문에 s3로 설정
...
290 s3:
291 # Set an existing secret for S3 accesskey and secretkey
292 # keys in the secret should be REGISTRY_STORAGE_S3_ACCESSKEY and REGISTRY_STORAGE_S3_SECRETKEY for registry
293 region: ap-northeast-2 # minio 에 설정한 region
294 bucket: harbor-registry
295 accesskey: "xs2Bg88****"
296 secretkey: "7A2HdVf*******K"
297 regionendpoint: "http://my-minio.minio.svc.cluster.local:9000" # k8s 서비스 url
298 #existingSecret: ""
299 #region: us-west-1
300 #bucket: bucketname
301 #accesskey: awsaccesskey
302 #secretkey: awssecretkey
303 #regionendpoint: http://myobjects.local
304 #encrypt: false
305 #keyid: mykeyid
306 #secure: true
307 #skipverify: false
308 #v4auth: true
309 #chunksize: "5242880"
310 #rootdirectory: /s3/object/name/prefix
311 #storageclass: STANDARD
312 #multipartcopychunksize: "33554432"
313 #multipartcopymaxconcurrency: 100
314 #multipartcopythresholdsize: "33554432"
이제 위에서 수정한 harbor_values.yaml 화일로 설치를 한다.
service 와 pod 그리고 pvc 생성을 확인한다.
[root@bastion harbor]# helm install my-harbor -f harbor_values.yaml harbor/harbor -n harbor
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/okd4/auth/kubeconfig
NAME: my-harbor
LAST DEPLOYED: Thu Aug 24 09:07:21 2023
NAMESPACE: harbor
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Please wait for several minutes for Harbor deployment to complete.
Then you should be able to visit the Harbor portal at https://core.harbor.domain
For more details, please visit https://github.com/goharbor/harbor
[root@bastion harbor]# kubectl get svc -n harbor
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
harbor ClusterIP 172.30.232.108 <none> 80/TCP,443/TCP,4443/TCP 9s
my-harbor-core ClusterIP 172.30.73.246 <none> 80/TCP 10s
my-harbor-database ClusterIP 172.30.134.241 <none> 5432/TCP 9s
my-harbor-jobservice ClusterIP 172.30.149.250 <none> 80/TCP 9s
my-harbor-notary-server ClusterIP 172.30.112.41 <none> 4443/TCP 9s
my-harbor-notary-signer ClusterIP 172.30.46.231 <none> 7899/TCP 9s
my-harbor-portal ClusterIP 172.30.77.199 <none> 80/TCP 9s
my-harbor-redis ClusterIP 172.30.253.146 <none> 6379/TCP 9s
my-harbor-registry ClusterIP 172.30.83.190 <none> 5000/TCP,8080/TCP 9s
my-harbor-trivy ClusterIP 172.30.76.51 <none> 8080/TCP 9s
[root@bastion harbor]# kubectl get po -n harbor
NAME READY STATUS RESTARTS AGE
my-harbor-core-67587bcbc4-x6whs 1/1 Running 0 6m35s
my-harbor-database-0 1/1 Running 0 6m34s
my-harbor-jobservice-599859fd57-4qvkk 1/1 Running 2 (6m19s ago) 6m35s
my-harbor-notary-server-bf5f9b94f-pmxd4 1/1 Running 0 6m35s
my-harbor-notary-signer-55db47f788-srzq8 1/1 Running 0 6m34s
my-harbor-portal-694bf8c545-p56f6 1/1 Running 0 6m34s
my-harbor-redis-0 1/1 Running 0 6m34s
my-harbor-registry-6c57986d5d-84fzb 2/2 Running 0 6m34s
my-harbor-trivy-0 1/1 Running 0 6m34s
[root@bastion harbor]# kubectl get pvc -n harbor
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
data-my-harbor-redis-0 Bound pvc-8c45a518-871e-4c4c-bd15-e91243ba5ade 1Gi RWO nfs-client 12m
data-my-harbor-trivy-0 Bound pvc-88724475-62b8-44d7-93c6-70ec4f2b48f8 5Gi RWO nfs-client 12m
database-data-my-harbor-database-0 Bound pvc-d7c7d316-7d28-4258-b326-925e25c8bb68 1Gi RWO nfs-client 12m
my-harbor-jobservice Bound pvc-e0a9fcf7-d360-4c56-9a4d-0500572cbfc1 1Gi RWO nfs-client 12m
my-harbor-registry Bound pvc-7a0cb3da-87ba-4758-ba4d-aaa78baec07c 5Gi RWO nfs-client 12m
ingress 와 route 도 확인해 본다.
[root@bastion harbor]# kubectl get route -n harbor
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
harbor harbor-harbor.apps.okd4.ktdemo.duckdns.org my-harbor-portal http-web edge/Allow None
my-harbor-ingress-2cggf myharbor.apps.okd4.ktdemo.duckdns.org /service/ my-harbor-core http-web edge/Redirect None
my-harbor-ingress-5xdrs myharbor.apps.okd4.ktdemo.duckdns.org /api/ my-harbor-core http-web edge/Redirect None
my-harbor-ingress-bxkmn myharbor.apps.okd4.ktdemo.duckdns.org /chartrepo/ my-harbor-core http-web edge/Redirect None
my-harbor-ingress-hqspm myharbor.apps.okd4.ktdemo.duckdns.org /v2/ my-harbor-core http-web edge/Redirect None
my-harbor-ingress-jx9x7 myharbor.apps.okd4.ktdemo.duckdns.org /c/ my-harbor-core http-web edge/Redirect None
my-harbor-ingress-notary-6j87b notray-harbor.apps.okd4.ktdemo.duckdns.org / my-harbor-notary-server <all> edge/Redirect None
my-harbor-ingress-tsr84 myharbor.apps.okd4.ktdemo.duckdns.org / my-harbor-portal <all> edge/Redirect None
[root@bastion harbor]# kubectl get ing -n harbor
NAME CLASS HOSTS ADDRESS PORTS AGE
my-harbor-ingress <none> myharbor.apps.okd4.ktdemo.duckdns.org router-default.apps.okd4.ktdemo.duckdns.org 80, 443 10m
my-harbor-ingress-notary <none> notray-harbor.apps.okd4.ktdemo.duckdns.org router-default.apps.okd4.ktdemo.duckdns.org 80, 443 10m
포탈에 접속하기 위해서 portal 서비스의 host를 찾는다.
웹브라우저에서 https://myharbor.apps.okd4.ktdemo.duckdns.org 로 접속하고 admin 계정과 초기 비빌번호인 Harbor12345
로 로그인 하고 변경한다.
설정 참고 : https://github.com/shclub/edu_homework/blob/master/homework.md
계정을 생성을 하고 나서 project 로 이동하여 repository를 생성하고 public 에 체크한다.
생성한 repository 에서 tag / push command 를 확인한다.
생성한 repository 에서 tag / push command 를 확인한다.
command 로 image를 push 하기 위해서는 insecure registry를 설정 해야 하는데 Mac에서는 도커 데스크의 preferences 로 이동한 후 도커 엔진에서 harbor url을 입력하고 docker 를 재기동한다.
docker 를 재기동한다.
linux 기준
docker 의 경우 /etc/docker/daemon.json 화일에 아래와 같이 작성하고 도커를 재시작한다.
{
"insecure-registries" : ["myharbor.apps.okd4.ktdemo.duckdns.org"]
}
반드시 재기동한다.
# flush changes
sudo systemctl daemon-reload
# restart docker
sudo systemctl restart docker
podman 의 경우는 /etc/containers/registries.conf.d
로 이동하여 myregistry.conf
라는 이름으로 화일을 하나 생성한다.
[root@bastion containers]# cd /etc/containers/registries.conf.d
[root@bastion registries.conf.d]# vi myregistry.conf
location
에 harbor 주소를 적어 주고 insecure
옵션은 true
로 설정한다.
[[registry]]
location = "myharbor.apps.okd4.ktdemo.duckdns.org"
insecure = true
podman의 경우 docker 처럼 재기동 필요가 없다.
현재 다운 되어 있는 docker image를 조회해 본다.
[root@bastion registries.conf.d]# podman images
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.io/shclub/edu1 master 26efc9f33ac0 24 hours ago 166 MB
로그인을 하고 tagging 을 하고 push 를 해봅니다.
- 중간에 edu 는 harbor 의 project ( repository ) 이름
[root@bastion registries.conf.d]# podman login myharbor.apps.okd4.ktdemo.duckdns.org
Username: shclub
Password:
Login Succeeded!
[root@bastion registries.conf.d]# podman tag docker.io/shclub/edu1 myharbor.apps.okd4.ktdemo.duckdns.org/edu/shclub/edu1
[root@bastion registries.conf.d]# podman images
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.io/shclub/edu1 master 26efc9f33ac0 24 hours ago 166 MB
myharbor.apps.okd4.ktdemo.duckdns.org/edu/shclub/edu1 latest 26efc9f33ac0 24 hours ago 166 MB
[root@bastion registries.conf.d]# podman push myharbor.apps.okd4.ktdemo.duckdns.org/edu/shclub/edu1
Getting image source signatures
Copying blob bc9e2ff0c9d4 done
Copying blob 870fb40b1d5c done
Copying blob 690a4af2c9f7 done
Copying blob 511780f88f80 done
Copying blob 0048c3fcf4cb done
Copying blob 5b60283f3630 done
Copying blob 93b4d35bad0a done
Copying blob 6c2a89cbf5af done
Copying blob d404f6c08f8b done
Copying config 26efc9f33a done
Writing manifest to image destination
Storing signatures
정상적으로 push 가 되었고 harbor에서 확인해 봅니다.
우리는 repository 를 minio에 설정 했기 때문에 minio 에서도 확인합니다.
Object browser 에 보면 용량을 확인 할 수 있다
harbor-registry
를 클릭해서 들어가면 harbor-registry/docker/registry/v2
이경로로 이동하여 repository 항목을 클릭하면 repository 들을 볼 수 있다.
meta 데이터는 도커이미지 이름의 경로인 harbor-registry/docker/registry/v2/repositories/edu/shclub/edu1
로 이동하여 보면 폴더 3개가 보인다.
데이터는 harbor-registry/docker/registry/v2/blobs/sha256
이경로로 이동하여 보면 숫자 폴더가 나오고 데이터가 저장된 것을 볼 수 있다.
bastion 서버에서 haproxy.cfg 에 신규 노드를 추가한다.
/etc/haproxy/haproxy.cfg 를 아래와 같이 수정한다.
backend openshift_api_backend
mode tcp
balance source
#server bootstrap 192.168.1.128:6443 check # bootstrap 서버
server okd-1 192.168.1.146:6443 check # okd master 설정
server okd-2 192.168.1.148:6443 check # okd worker 설정
server okd-2 192.168.1.149:6443 check # worker node 추가
# OKD Machine Config Server
frontend okd_machine_config_server_frontend
mode tcp
bind *:22623
default_backend okd_machine_config_server_backend
backend okd_machine_config_server_backend
mode tcp
balance source
#server bootstrap 192.168.1.128:22623 check # bootstrap 서버
server okd-1 192.168.1.146:22623 check # okd master 설정
server okd-2 192.168.1.148:22623 check # okd worker 설정
server okd-3 192.168.1.149:22623 check # worker node 추가
# OKD Ingress - layer 4 tcp mode for each. Ingress Controller will handle layer 7.
frontend okd_http_ingress_frontend
bind *:80
default_backend okd_http_ingress_backend
mode tcp
backend okd_http_ingress_backend
balance source
mode tcp
server okd-1 192.168.1.146:80 check # okd master 설정
server okd-2 192.168.1.148:80 check # okd worker 설정
server okd-3 192.168.1.149:80 check # worker node 추가
frontend okd_https_ingress_frontend
bind *:443
default_backend okd_https_ingress_backend
mode tcp
backend okd_https_ingress_backend
mode tcp
balance source
server okd-1 192.168.1.146:443 check
server okd-2 192.168.1.148:443 check
server okd-3 192.168.1.149:443 check # worker node 추가
haproxy를 재기동 한다.
[root@bastion shclub]# systemctl restart haproxy
/var/named 폴더로 이동한다.
[root@bastion shclub]# cd /var/named
okd4.ktdemo.duckdns.org.zone 파일 설정 ( DNS 정방향 )
- ip와 hostname을 잘 수정한다.
[root@bastion named]# ls
data dynamic named.ca named.empty named.localhost named.loopback slaves
[root@bastion named]# vi okd4.ktdemo.duckdns.org.zone
$TTL 1D
@ IN SOA @ ns.ktdemo.duckdns.org. (
0 ; serial
1D ; refresh
1H ; retry
1W ; expire
3H ) ; minimum
@ IN NS ns.ktdemo.duckdns.org.
@ IN A 192.168.1.247 ;
; Ancillary services
lb.okd4 IN A 192.168.1.247
; Bastion or Jumphost
ns IN A 192.168.1.247 ;
; OKD Cluster
bastion.okd4 IN A 192.168.1.247
bootstrap.okd4 IN A 192.168.1.128
okd-1.okd4 IN A 192.168.1.146
okd-2.okd4 IN A 192.168.1.148
okd-3.okd4 IN A 192.168.1.149
api.okd4 IN A 192.168.1.247
api-int.okd4 IN A 192.168.1.247
*.apps.okd4 IN A 192.168.1.247
1.168.192.in-addr.rev 파일 설정 ( DNS 역방향 )
[root@bastion named]# vi 1.168.192.in-addr.rev
$TTL 1D
@ IN SOA ktdemo.duckdns.org. ns.ktdemo.duckdns.org. (
0 ; serial
1D ; refresh
1H ; retry
1W ; expire
3H ) ; minimum
@ IN NS ns.
247 IN PTR ns.
247 IN PTR bastion.okd4.ktdemo.duckdns.org.
128 IN PTR bootstrap.okd4.ktdemo.duckdns.org.
146 IN PTR okd-1.okd4.ktdemo.duckdns.org.
148 IN PTR okd-2.okd4.ktdemo.duckdns.org.
150 IN PTR okd-3.okd4.ktdemo.duckdns.org.
247 IN PTR api.okd4.ktdemo.duckdns.org.
247 IN PTR api-int.okd4.ktdemo.duckdns.org.
zone 파일 권한 설정을 하고 named 서비스를 재기동한다.
[root@bastion named]# systemctl restart named
proxmox 에 coreos 기반의 worker 노드를 생성한다. ( 생성 과정은 생략 )
- 다운로드 위치 : https://builds.coreos.fedoraproject.org/browser?stream=stable&arch=x86_64
- Version : fedora-coreos-37.20230218.3.0-live.x86_64.iso
24 시간이 지난 후에 worker node를 추가하는 경우에는 에러가 발생을 하는데 worker.ign의 정보가 bootstrap 생성시 만들어진 것이기 때문에 인증이 유효 하지 않다.
신규로 worker.ign을 생성해야 한다.
[root@bastion okd4]# export MCS=api-int.okd4.ktdemo.duckdns.org:22623
[root@bastion okd4]# echo "q" | openssl s_client -connect $MCS -showcerts | awk '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/' | base64 --wrap=0 | tee ./api-int.base64 && \
> sed --regexp-extended --in-place=.backup "s%base64,[a-zA-Z0-9+\/=]+%base64,$(cat ./api-int.base64)%" ./worker.ign
depth=0 CN = system:machine-config-server
verify error:num=20:unable to get local issuer certificate
verify return:1
depth=0 CN = system:machine-config-server
verify error:num=21:unable to verify the first certificate
verify return:1
depth=0 CN = system:machine-config-server
verify return:1
DONE
LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURZakN******LS0K
다시 worker.ign 화일을 web server에 update 한다.
[root@bastion ~]# cp okd4/worker.ign /var/www/html/ign/
cp: overwrite '/var/www/html/ign/worker.ign'? y
[root@bastion ~]# systemctl restart httpd
먼저 네트웍을 설정을 하기 위해서 nmtui 라는 gui 환경으로 구성한다.
edit 에 들어가서 수정한다.
[root@localhost core]# nmtui
worker 노드 ( okd-3 ) 설치를 한다.
[root@localhost core]# coreos-installer install /dev/sda -I http://192.168.1.40:8080/ign/worker.ign --insecure-ignition --copy-network
Installing Fedora CoreOS 37.20230218.3.0 x86_64 (512-byte sectors)
> Read disk 2.5 GiB/2.5 GiB (100%)
Writing Ignition config
Copying networking configuration from /etc/NetworkManager/system-connections/
Copying /etc/NetworkManager/system-connections/ens18.nmconnection to installed system
Install complete.
hostname을 설정 하고 재기동 한다.
[root@localhost core]# hostnamectl set-hostname okd-3.okd4.ktdemo.duckdns.org
[root@localhost core]# reboot now
worker node 가 재기동 하기에는 시간이 많이 소요가 되고 완료가 되면 bastion 서버에서 csr를 확인하고 approve를 해야 join이 완료가 된다.
먼저 node를 확인해 본다 아직 worker node 가 조인 되지 않았다.
[root@bastion ~]# oc get nodes
NAME STATUS ROLES AGE VERSION
okd-1.okd4.ktdemo.duckdns.org Ready control-plane,master 7d v1.25.7+eab9cc9
okd-2.okd4.ktdemo.duckdns.org Ready worker 6d22h v1.25.7+eab9cc9
csr를 조회하고 승인을 한다.
[root@bastion ~]# kubectl get csr
NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION
csr-8qg9p 85s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper <none> Pending
csr-dnbcd 74s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper <none> Pending
[root@bastion ~]# oc adm certificate approve csr-8qg9p
certificatesigningrequest.certificates.k8s.io/csr-8qg9p approved
[root@bastion ~]# oc adm certificate approve csr-dnbcd
certificatesigningrequest.certificates.k8s.io/csr-dnbcd approved
다시 node를 조회하면 join 된 것을 확인 할수 있고 status 는 NotReady
이다.
[root@bastion ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
okd-1.okd4.ktdemo.duckdns.org Ready control-plane,master 7d v1.25.7+eab9cc9
okd-2.okd4.ktdemo.duckdns.org Ready worker 6d22h v1.25.7+eab9cc9
okd-3.okd4.ktdemo.duckdns.org NotReady worker 81s v1.25.7+eab9cc9
다시 csr를 조회해 보면 승인 대기 중인 csr 이 있는데 okd 는 machine config 기능이 있어 master node 설정값이 worker node에 적용되는 시간이 필요하다.
[root@bastion ~]# oc get csr
NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION
csr-5lgxl 2m29s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper <none> Approved,Issued
csr-j2m7b 17s kubernetes.io/kubelet-serving system:node:okd-2.okd4.ktdemo.duckdns.org <none> Pending
csr-vhvjn 5m40s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper <none>
시간이 좀 더 지나면 아래와 같이 Ready로 바뀐 것을 확인 할 수 있다.
[root@bastion ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
okd-1.okd4.ktdemo.duckdns.org Ready control-plane,master 7d v1.25.7+eab9cc9
okd-2.okd4.ktdemo.duckdns.org Ready worker 6d22h v1.25.7+eab9cc9
okd-3.okd4.ktdemo.duckdns.org Ready worker 2m54s v1.25.7+eab9cc9
machine config pool 값을 조회해 본다.
UPDATING 가 false
이면 update 완료
[root@bastion ~]# oc get mcp
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
master rendered-master-5fab3f83edfada45fc6c80c4653e299d True False False 1 1 1 0 7d
worker rendered-worker-74e8fd7160efccd8d97f4fe105f4991e True False False 2 2 2 0 7d
기존에 master/worker 겸용 노드에서 생성된 namespace 중 일부는 신규 worker node 로 재기동 시키기 위해서 node selector를 설정한다.
okd-2 , okd-3 node 를 edit 하여 label 을 설정한다.
- okd-2 :
edu: "true"
- okd-3 :
devops: "true"
[root@bastion ~]# kubectl edit node okd-2.okd4.ktdemo.duckdns.org
namespace 에 node selector 를 annotations 에 적용한다.
openshift.io/node-selector: edu=true
[root@bastion ~]# kubectl edit namespace edu3
Pod를 하나 생성해 본다. okd-3 에 pod가 생성된 것을 확인 할 수 있다.
[root@bastion ~]# kubectl run nginx --image=nginx -n edu3
pod/nginx created
[root@bastion ~]# kubectl get po -n edu3 -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx 1/1 Running 0 20s 10.129.0.10 okd-2.okd4.ktdemo.duckdns.org <none> <none>
참고 : https://github.com/shclub/edu/blob/master/chapter9.md
master node 에 로그인하여 cluster-backup.sh 를 수행한다.
[core@okd-1 ~]$ sudo /usr/local/bin/cluster-backup.sh /home/core/assets/backup
Certificate /etc/kubernetes/static-pod-certs/configmaps/etcd-serving-ca/ca-bundle.crt is missing. Checking in different directory
Certificate /etc/kubernetes/static-pod-resources/etcd-certs/configmaps/etcd-serving-ca/ca-bundle.crt found!
found latest kube-apiserver: /etc/kubernetes/static-pod-resources/kube-apiserver-pod-10
found latest kube-controller-manager: /etc/kubernetes/static-pod-resources/kube-controller-manager-pod-7
found latest kube-scheduler: /etc/kubernetes/static-pod-resources/kube-scheduler-pod-7
found latest etcd: /etc/kubernetes/static-pod-resources/etcd-pod-5
8b350855539690bf729d7d708955e5c0732ac956c8be04089bd0cf107f1da531
etcdctl version: 3.5.6
API version: 3.5
{"level":"info","ts":"2023-09-08T14:42:19.542+0900","caller":"snapshot/v3_snapshot.go:65","msg":"created temporary db file","path":"/home/core/assets/backup/snapshot_2023-09-08_144210.db.part"}
{"level":"info","ts":"2023-09-08T14:42:19.548+0900","logger":"client","caller":"[email protected]/maintenance.go:212","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":"2023-09-08T14:42:19.548+0900","caller":"snapshot/v3_snapshot.go:73","msg":"fetching snapshot","endpoint":"https://192.168.1.146:2379"}
{"level":"info","ts":"2023-09-08T14:42:20.404+0900","logger":"client","caller":"[email protected]/maintenance.go:220","msg":"completed snapshot read; closing"}
{"level":"info","ts":"2023-09-08T14:42:20.677+0900","caller":"snapshot/v3_snapshot.go:88","msg":"fetched snapshot","endpoint":"https://192.168.1.146:2379","size":"105 MB","took":"1 second ago"}
{"level":"info","ts":"2023-09-08T14:42:20.677+0900","caller":"snapshot/v3_snapshot.go:97","msg":"saved","path":"/home/core/assets/backup/snapshot_2023-09-08_144210.db"}
Snapshot saved at /home/core/assets/backup/snapshot_2023-09-08_144210.db
Deprecated: Use `etcdutl snapshot status` instead.
{"hash":607384973,"revision":2738905,"totalKey":7435,"totalSize":104706048}
snapshot db and kube resources are successfully saved to /home/core/assets/backup
local 에 저장 하는 것보다는 NAS 저장하는 것이 좋다.
master node에 로그인 하여 mnt 폴더에 nfs 마운트를 한다.
[root@okd-1 ~]# mount -t nfs 192.168.1.79:/volume3/okd/cluster_backup /mnt
[root@okd-1 ~]# ls -al /mnt
lrwxrwxrwx. 2 root root 7 Mar 7 2023 /mnt -> var/mnt
[root@okd-1 ~]# cd /mnt
[root@okd-1 mnt]# ls
먼저 namespace와 권한을 할당한다.
[root@bastion etcd_backup]# oc new-project etcd-backup
Now using project "etcd-backup" on server "https://api.okd4.ktdemo.duckdns.org:6443".
You can add applications to this project with the 'new-app' command. For example, try:
oc new-app rails-postgresql-example
to build a new example application in Ruby. Or use kubectl to deploy a simple Kubernetes application:
kubectl create deployment hello-node --image=k8s.gcr.io/e2e-test-images/agnhost:2.33 -- /agnhost serve-hostname
[root@bastion etcd_backup]# oc adm policy add-scc-to-user anyuid -z default -n etcd-backup
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:anyuid added: "default"
[root@bastion etcd_backup]# oc adm policy add-scc-to-user privileged -z default -n etcd-backup
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:privileged added: "default"
configmap 과 cronjob 화일을 생성한다.
backup-config.yaml
kind: ConfigMap
apiVersion: v1
metadata:
name: backup-config
data:
OCP_BACKUP_SUBDIR: "/"
OCP_BACKUP_DIRNAME: "+etcd-backup-%FT%T%:z"
OCP_BACKUP_EXPIRE_TYPE: "days"
OCP_BACKUP_KEEP_DAYS: "10"
OCP_BACKUP_KEEP_COUNT: "10"
OCP_BACKUP_UMASK: "0027"
[root@bastion etcd_backup]# cat backup-nfs-cronjob.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
name: etcd-backup
spec:
schedule: "1 7 * * *"
startingDeadlineSeconds: 600
jobTemplate:
spec:
backoffLimit: 0
template:
spec:
containers:
- command:
- /bin/sh
- /usr/local/bin/backup.sh
image: ghcr.io/adfinis/openshift-etcd-backup
imagePullPolicy: Always
name: backup-etcd
resources:
requests:
cpu: 500m
memory: 128Mi
limits:
cpu: 1000m
memory: 512Mi
envFrom:
- configMapRef:
name: backup-config
securityContext:
capabilities:
add:
- SYS_CHROOT
runAsUser: 0
#securityContext:
# privileged: false
# runAsUser: 0
volumeMounts:
- mountPath: /host
name: host
- name: etcd-backup
mountPath: /backup
nodeSelector: # node selector
node-role.kubernetes.io/master: ""
tolerations: # 모든 master node에 pod 스케줄링
- effect: NoSchedule # taint 설정과 같은 값 설정
key: node-role.kubernetes.io/master
nodeName : okd-1.okd4.ktdemo.duckdns.org # master node 중 하나 지정
hostNetwork: true
hostPID: true
restartPolicy: Never
dnsPolicy: ClusterFirst
volumes:
- hostPath:
path: /
type: Directory
name: host
- name: etcd-backup
nfs:
server: 192.168.1.79
path: /volume3/okd/cluster_backup
imsi .... 호스트 네트워크 모드로 실행되기 때문입니다. 포드 스펙에 아래의 그림과 같이 hostNetwork를 설정해주면 마치 docker run 명령어의 --net host와 같은 효과를 낼 수 있습니다. ..
configmap 과 cronjob을 생성합니다.
[root@bastion etcd_backup]# kubectl apply -f backup-config.yaml -n etcd-backup
configmap/backup-config created
[root@bastion etcd_backup]# kubectl apply -f backup-nfs-cronjob.yaml -n etcd-backup
cronjob.batch/etcd-backup created
[root@bastion etcd_backup]# kubectl get cronjob -n etcd-backup
NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
etcd-backup 1 7 * * * False 0 <none> 42s
수동으로 cronjob이 잘 되었는지 테스트 하기 위해서는 job을 하나 생성하여 검증합니다.
[root@bastion etcd_backup]# kubectl create job --from=cronjob/etcd-backup test-backup -n etcd-backup
job.batch/test-backup created
[root@bastion etcd_backup]# kubectl get po -n etcd-backup
NAME READY STATUS RESTARTS AGE
test-backup-srw84 0/1 Completed 0 2m28s
nfs 폴더 확인을 해본다.
bastion 서버에 kubecost 폴더를 생성하고 okd에 kubecost namespace를 생성한다.
[root@bastion kubecost]# oc new-project kubecost
Now using project "kubecost" on server "https://api.okd4.ktdemo.duckdns.org:6443".
You can add applications to this project with the 'new-app' command. For example, try:
oc new-app rails-postgresql-example
to build a new example application in Ruby. Or use kubectl to deploy a simple Kubernetes application:
kubectl create deployment hello-node --image=k8s.gcr.io/e2e-test-images/agnhost:2.33 -- /agnhost serve-hostname
해당 namespace 에 권한을 부여한다.
[root@bastion kubecost]# oc adm policy add-scc-to-user anyuid system:serviceaccount:kubecost:default
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:anyuid added: "default"
[root@bastion kubecost]# oc adm policy add-scc-to-user privileged system:serviceaccount:kubecost:default
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:privileged added: "default"
kubecost repository 를 추가한다.
[root@bastion kubecost]# helm repo add kubecost https://kubecost.github.io/cost-analyzer/
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/okd4/auth/kubeconfig
"kubecost" has been added to your repositories
[root@bastion kubecost]# helm search repo kubecost
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/okd4/auth/kubeconfig
NAME CHART VERSION APP VERSION DESCRIPTION
kubecost/cost-analyzer 1.106.0 1.106.0 A Helm chart that sets up Kubecost, Prometheus,...
https://www.kubecost.com/install#show-instructions
[root@bastion kubecost]# kubectl label --overwrite ns kubecost \
> pod-security.kubernetes.io/enforce=baseline \
> pod-security.kubernetes.io/enforce-version=v1.24
Openshift 에 맞게 values를 수정하기 위해 values.yaml 로 다운 받는다.
[root@bastion kubecost]# helm show values kubecost/cost-analyzer > values.yaml
values.yaml 화일에서 아래 부분을 수정한다.
223 # generated at http://kubecost.com/install, used for alerts tracking and free trials
224 kubecostToken: "c2hjbHViQG************f98"
...
565 persistentVolume:
566 size: 32Gi
567 dbSize: 32.0Gi
568 enabled: true # Note that setting this to false means configurations will be wiped out on pod restart.
569 storageClass: "nfs-client" #
...
596 password: ******** # change me
...
873 cpu: 1000m
874 memory: 500Mi
875 ## default storage class
876 storageClass: "nfs-client"
877 databaseVolumeSize: 100Gi
878 configVolumeSize: 1Gi
879
이제 위에서 수정한 values.yaml 화일로 설치를 한다.
service 와 pod 그리고 pvc 생성을 확인한다.
[root@bastion kubecost]# helm install my-kubecost -f values.yaml kubecost/cost-analyzer -n kubecost
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/okd4/auth/kubeconfig
NAME: my-kubecost
LAST DEPLOYED: Tue Sep 5 12:56:13 2023
NAMESPACE: kubecost
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
--------------------------------------------------Kubecost has been successfully installed.
Please allow 5-10 minutes for Kubecost to gather metrics.
If you have configured cloud-integrations, it can take up to 48 hours for cost reconciliation to occur.
When using Durable storage (Enterprise Edition), please allow up to 4 hours for data to be collected and the UI to be healthy.
When pods are Ready, you can enable port-forwarding with the following command:
kubectl port-forward --namespace kubecost deployment/my-kubecost-cost-analyzer 9090
Next, navigate to http://localhost:9090 in a web browser.
Having installation issues? View our Troubleshooting Guide at http://docs.kubecost.com/troubleshoot-install
[root@bastion kubecost]# kubectl get svc -n kubecost
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
harbor ClusterIP 172.30.232.108 <none> 80/TCP,443/TCP,4443/TCP 9s
my-harbor-core ClusterIP 172.30.73.246 <none> 80/TCP 10s
my-harbor-database ClusterIP 172.30.134.241 <none> 5432/TCP 9s
my-harbor-jobservice ClusterIP 172.30.149.250 <none> 80/TCP 9s
my-harbor-notary-server ClusterIP 172.30.112.41 <none> 4443/TCP 9s
my-harbor-notary-signer ClusterIP 172.30.46.231 <none> 7899/TCP 9s
my-harbor-portal ClusterIP 172.30.77.199 <none> 80/TCP 9s
my-harbor-redis ClusterIP 172.30.253.146 <none> 6379/TCP 9s
my-harbor-registry ClusterIP 172.30.83.190 <none> 5000/TCP,8080/TCP 9s
my-harbor-trivy ClusterIP 172.30.76.51 <none> 8080/TCP 9s
[root@bastion harbor]# kubectl get po -n harbor
NAME READY STATUS RESTARTS AGE
my-harbor-core-67587bcbc4-x6whs 1/1 Running 0 6m35s
my-harbor-database-0 1/1 Running 0 6m34s
my-harbor-jobservice-599859fd57-4qvkk 1/1 Running 2 (6m19s ago) 6m35s
my-harbor-notary-server-bf5f9b94f-pmxd4 1/1 Running 0 6m35s
my-harbor-notary-signer-55db47f788-srzq8 1/1 Running 0 6m34s
my-harbor-portal-694bf8c545-p56f6 1/1 Running 0 6m34s
my-harbor-redis-0 1/1 Running 0 6m34s
my-harbor-registry-6c57986d5d-84fzb 2/2 Running 0 6m34s
my-harbor-trivy-0 1/1 Running 0 6m34s
[root@bastion harbor]# kubectl get pvc -n harbor
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
data-my-harbor-redis-0 Bound pvc-8c45a518-871e-4c4c-bd15-e91243ba5ade 1Gi RWO nfs-client 12m
data-my-harbor-trivy-0 Bound pvc-88724475-62b8-44d7-93c6-70ec4f2b48f8 5Gi RWO nfs-client 12m
database-data-my-harbor-database-0 Bound pvc-d7c7d316-7d28-4258-b326-925e25c8bb68 1Gi RWO nfs-client 12m
my-harbor-jobservice Bound pvc-e0a9fcf7-d360-4c56-9a4d-0500572cbfc1 1Gi RWO nfs-client 12m
my-harbor-registry Bound pvc-7a0cb3da-87ba-4758-ba4d-aaa78baec07c 5Gi RWO nfs-client 12m
grafana 계정 알아보기
[root@bastion kubecost]# kubectl get secret -n kubecost
NAME TYPE DATA AGE
builder-dockercfg-dsjcd kubernetes.io/dockercfg 1 6h48m
builder-token-x8b56 kubernetes.io/service-account-token 4 6h48m
default-dockercfg-jtnw6 kubernetes.io/dockercfg 1 6h48m
default-token-567gx kubernetes.io/service-account-token 4 6h48m
deployer-dockercfg-vqhgw kubernetes.io/dockercfg 1 6h48m
deployer-token-hvc44 kubernetes.io/service-account-token 4 6h48m
my-kubecost-cost-analyzer-dockercfg-24jng kubernetes.io/dockercfg 1 64m
my-kubecost-cost-analyzer-token-m4jph kubernetes.io/service-account-token 4 64m
my-kubecost-grafana Opaque 3 64m
my-kubecost-grafana-dockercfg-rb585 kubernetes.io/dockercfg 1 64m
my-kubecost-grafana-token-qbtpx kubernetes.io/service-account-token 4 64m
sh.helm.release.v1.my-kubecost.v1 helm.sh/release.v1 1 64m
[root@bastion kubecost]# kubectl get secret my-kubecost-grafana -o jsonpath='{.data}'
{"admin-password":"c3Ryb25ncGFzc3dvcmQ=","admin-user":"YWRtaW4=","ldap-toml":""}
[root@bastion kubecost]# echo "c3Ryb25ncGFzc3dvcmQ=" | base64 --decode
strongpassword
[root@bastion kubecost]# echo "YWRtaW4=" | base64 --decode
admin
ingress 와 route 도 확인해 본다.
[root@bastion harbor]# kubectl get route -n harbor
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
harbor harbor-harbor.apps.okd4.ktdemo.duckdns.org my-harbor-portal http-web edge/Allow None
my-harbor-ingress-2cggf myharbor.apps.okd4.ktdemo.duckdns.org /service/ my-harbor-core http-web edge/Redirect None
my-harbor-ingress-5xdrs myharbor.apps.okd4.ktdemo.duckdns.org /api/ my-harbor-core http-web edge/Redirect None
my-harbor-ingress-bxkmn myharbor.apps.okd4.ktdemo.duckdns.org /chartrepo/ my-harbor-core http-web edge/Redirect None
my-harbor-ingress-hqspm myharbor.apps.okd4.ktdemo.duckdns.org /v2/ my-harbor-core http-web edge/Redirect None
my-harbor-ingress-jx9x7 myharbor.apps.okd4.ktdemo.duckdns.org /c/ my-harbor-core http-web edge/Redirect None
my-harbor-ingress-notary-6j87b notray-harbor.apps.okd4.ktdemo.duckdns.org / my-harbor-notary-server <all> edge/Redirect None
my-harbor-ingress-tsr84 myharbor.apps.okd4.ktdemo.duckdns.org / my-harbor-portal <all> edge/Redirect None
[root@bastion harbor]# kubectl get ing -n harbor
NAME CLASS HOSTS ADDRESS PORTS AGE
my-harbor-ingress <none> myharbor.apps.okd4.ktdemo.duckdns.org router-default.apps.okd4.ktdemo.duckdns.org 80, 443 10m
my-harbor-ingress-notary <none> notray-harbor.apps.okd4.ktdemo.duckdns.org router-default.apps.okd4.ktdemo.duckdns.org 80, 443 10m
참고 자료
- 소개 : https://velog.io/@_gyullbb/series/OKD
- OKD 설치 : https://www.server-world.info/en/note?os=CentOS_Stream_8&p=okd4&f=1
- Openshfit 설치 ( Main) : https://hkjeon2.tistory.com/104
- OKD 설치 (Sub) : https://www.okd.io/guides/upi-sno/ #post-install
- Openshfit 설치 (Baremetal) : https://gruuuuu.github.io/ocp/ocp4.7-restricted/
- 인증 : https://gruuuuu.github.io/ocp/ocp4-authentication/
- Openshfit 미러 사이트 : https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.8/latest/
- HA Proxy : https://hoing.io/archives/2196
- DNS 서버 구축 : https://it-serial.tistory.com/entry/Linux-CentOS-7-DNS-%EC%84%9C%EB%B2%84-%EA%B5%AC%EC%B6%95-%EB%8F%84%EB%A9%94%EC%9D%B8-%EC%84%A4%EC%A0%95
- Proxmox 내부에 가상 사설망 구축: https://hwanstory.kr/@kim-hwan/posts/Proxmox-Virtual-Private-Network-Configuration
- pull secret 설정 : https://www.ibm.com/docs/ko/mas-cd/continuous-delivery?topic=platform-setting-up-bastion-host
- 오픈쉬프트 사용법 : https://sysdocu.tistory.com/1765 , https://sysdocu.tistory.com/1774
- OKD 아키텍처 : https://daaa0555.tistory.com/479
- ArgoCD : https://www.skyer9.pe.kr/wordpress/?p=6845
- ArgoCD Redis 접속 에러 (networkpolicy) : https://www.xiexianbin.cn/cicd/argo-cd/deploy/index.html
- Harbor 설치 : https://computingforgeeks.com/install-harbor-image-registry-on-kubernetes-openshift-with-helm-chart/?amp
- Openshift 4.12.0 on BareMetal 설치 및 설정 : https://sysdocu.tistory.com/1765
- kubecost : https://devocean.sk.com/blog/techBoardDetail.do?ID=164699&boardType=techBlog
- kubecost : https://judae.tistory.com/45
journalctl -b -f -u crio.service
journalctl -xeu kubelet -f
[root@okd-1 core]# iptables-save | grep 443
-A KUBE-SERVICES -d 172.30.176.17/32 -p tcp -m comment --comment "openshift-config-operator/metrics:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.80.69/32 -p tcp -m comment --comment "openshift-kube-controller-manager-operator/metrics:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.17.22/32 -p tcp -m comment --comment "openshift-cluster-storage-operator/csi-snapshot-webhook:webhook has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.47.60/32 -p tcp -m comment --comment "openshift-service-ca-operator/metrics:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.157.102/32 -p tcp -m comment --comment "openshift-authentication/oauth-openshift:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.4.82/32 -p tcp -m comment --comment "openshift-oauth-apiserver/api:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.61.212/32 -p tcp -m comment --comment "openshift-console/console:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.189.3/32 -p tcp -m comment --comment "openshift-operator-lifecycle-manager/catalog-operator-metrics:https-metrics has no endpoints" -m tcp --dport 8443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.92.217/32 -p tcp -m comment --comment "openshift-machine-api/cluster-baremetal-operator-service:https has no endpoints" -m tcp --dport 8443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.148.240/32 -p tcp -m comment --comment "openshift-operator-lifecycle-manager/packageserver-service:5443 has no endpoints" -m tcp --dport 5443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.253.148/32 -p tcp -m comment --comment "openshift-kube-storage-version-migrator-operator/metrics:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.85.240/32 -p tcp -m comment --comment "openshift-authentication-operator/metrics:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.158.160/32 -p tcp -m comment --comment "openshift-console-operator/metrics:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.179.123/32 -p tcp -m comment --comment "openshift-cloud-credential-operator/cco-metrics:metrics has no endpoints" -m tcp --dport 8443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.27.215/32 -p tcp -m comment --comment "openshift-monitoring/prometheus-adapter:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.19.233/32 -p tcp -m comment --comment "openshift-cluster-storage-operator/csi-snapshot-controller-operator-metrics:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.160.67/32 -p tcp -m comment --comment "openshift-multus/multus-admission-controller:webhook has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.20.56/32 -p tcp -m comment --comment "openshift-cluster-storage-operator/cluster-storage-operator-metrics:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.139.147/32 -p tcp -m comment --comment "openshift-apiserver/api:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.131.213/32 -p tcp -m comment --comment "openshift-insights/metrics:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.231.206/32 -p tcp -m comment --comment "openshift-machine-api/cluster-autoscaler-operator:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.129.139/32 -p tcp -m comment --comment "openshift-kube-apiserver-operator/metrics:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.254.166/32 -p tcp -m comment --comment "openshift-machine-api/machine-api-operator:https has no endpoints" -m tcp --dport 8443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.33.145/32 -p tcp -m comment --comment "openshift-controller-manager/controller-manager:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.78.19/32 -p tcp -m comment --comment "openshift-machine-api/machine-api-operator-webhook:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.20.240/32 -p tcp -m comment --comment "openshift-etcd-operator/metrics:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.173.83/32 -p tcp -m comment --comment "openshift-kube-scheduler-operator/metrics:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.70.103/32 -p tcp -m comment --comment "openshift-controller-manager-operator/metrics:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.160.67/32 -p tcp -m comment --comment "openshift-multus/multus-admission-controller:metrics has no endpoints" -m tcp --dport 8443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.211.88/32 -p tcp -m comment --comment "openshift-operator-lifecycle-manager/olm-operator-metrics:https-metrics has no endpoints" -m tcp --dport 8443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.167.51/32 -p tcp -m comment --comment "openshift-machine-api/cluster-baremetal-webhook-service has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.0.6/32 -p tcp -m comment --comment "openshift-apiserver-operator/metrics:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SEP-O36DLNQIDMAMWMZX -p tcp -m comment --comment "default/kubernetes:https" -m tcp -j DNAT --to-destination 192.168.1.146:6443
-A KUBE-SEP-OJQBONVVMSWJJ73L -p tcp -m comment --comment "harbor/my-harbor-notary-server" -m tcp -j DNAT --to-destination 10.129.0.23:4443
-A KUBE-SEP-QQSOREN4FLMPNLUL -p tcp -m comment --comment "openshift-kube-apiserver/apiserver:https" -m tcp -j DNAT --to-destination 192.168.1.146:6443
-A KUBE-SEP-YO6BHA4K2YLFIUW6 -p tcp -m comment --comment "openshift-ingress/router-internal-default:https" -m tcp -j DNAT --to-destination 192.168.1.146:443
-A KUBE-SERVICES -d 172.30.254.208/32 -p tcp -m comment --comment "harbor/my-harbor-notary-server cluster IP" -m tcp --dport 4443 -j KUBE-SVC-R5E3RRBI3OXL227A
-A KUBE-SERVICES -d 172.30.239.126/32 -p tcp -m comment --comment "openshift-kube-scheduler/scheduler:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-OGQPOTBHHZMRDA43
-A KUBE-SERVICES -d 172.30.100.183/32 -p tcp -m comment --comment "openshift-kube-controller-manager/kube-controller-manager:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-VQFT5ZCKL2KRMQ3Q
-A KUBE-SERVICES -d 172.30.154.250/32 -p tcp -m comment --comment "argocd/argocd-server:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-A32MGCDFPRQGQDBB
-A KUBE-SERVICES -d 172.30.85.98/32 -p tcp -m comment --comment "openshift-ingress/router-internal-default:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-PIUKAOOLWSYDMVAC
-A KUBE-SERVICES -d 172.30.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SERVICES -d 172.30.6.240/32 -p tcp -m comment --comment "openshift-kube-apiserver/apiserver:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-X7YGTN7QRQI2VNWZ
-A KUBE-SVC-A32MGCDFPRQGQDBB -d 172.30.154.250/32 ! -i tun0 -p tcp -m comment --comment "argocd/argocd-server:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SVC-NPX46M4PTMTKRN6Y -d 172.30.0.1/32 ! -i tun0 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SVC-OGQPOTBHHZMRDA43 -d 172.30.239.126/32 ! -i tun0 -p tcp -m comment --comment "openshift-kube-scheduler/scheduler:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SVC-PIUKAOOLWSYDMVAC -d 172.30.85.98/32 ! -i tun0 -p tcp -m comment --comment "openshift-ingress/router-internal-default:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SVC-R5E3RRBI3OXL227A -d 172.30.254.208/32 ! -i tun0 -p tcp -m comment --comment "harbor/my-harbor-notary-server cluster IP" -m tcp --dport 4443 -j KUBE-MARK-MASQ
-A KUBE-SVC-VQFT5ZCKL2KRMQ3Q -d 172.30.100.183/32 ! -i tun0 -p tcp -m comment --comment "openshift-kube-controller-manager/kube-controller-manager:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SVC-X7YGTN7QRQI2VNWZ -d 172.30.6.240/32 ! -i tun0 -p tcp -m comment --comment "openshift-kube-apiserver/apiserver:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
참고 :
[root@okd-1 core]# df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/sda4 120G 46G 75G 38% /
[root@okd-1 core]# journalctl --disk-usage
Archived and active journals take up 4.0G in the file system.
[root@okd-1 core]# journalctl --vacuum-size=3758096384
[root@okd-1 core]# journalctl --vacuum-files=100
Vacuuming done, freed 0B of archived journals from /run/log/journal.
Vacuuming done, freed 0B of archived journals from /var/log/journal.
Vacuuming done, freed 0B of archived journals from /var/log/journal/126376a91cbf47ffab943ee1bddd8398.
Vacuuming done, freed 0B of archived journals from /run/log/journal/c4748c3389184cc596c377b877a5bb0f.
[root@okd-1 core]# journalctl --disk-usage
Archived and active journals take up 3.5G in the file system.
[root@okd-1 core]# journalctl --vacuum-time=1d
[core@okd-1 ~]$ systemctl status [email protected]
× [email protected] - Process Core Dump (PID 128284/UID 0)
Loaded: loaded (/usr/lib/systemd/system/[email protected]; static)
Active: failed (Result: timeout) since Fri 2023-09-08 09:26:06 KST; 5h 18min ago
Duration: 5min 619ms
TriggeredBy: ● systemd-coredump.socket
Docs: man:systemd-coredump(8)
Process: 128307 ExecStart=/usr/lib/systemd/systemd-coredump (code=killed, signal=TERM)
Main PID: 128307 (code=killed, signal=TERM)
CPU: 44.847s
서비스를 재기동 하고 failed 된 것들을 reset 한다.
[root@okd-1 core]# systemctl restart systemd-coredump.socket
[root@okd-1 core]# systemctl reset-failed
[root@okd-1 core]# systemctl status systemd-coredump.socket
● systemd-coredump.socket - Process Core Dump Socket
Loaded: loaded (/usr/lib/systemd/system/systemd-coredump.socket; static)
Active: active (listening) since Fri 2023-09-08 14:46:39 KST; 16s ago
Until: Fri 2023-09-08 14:46:39 KST; 16s ago
Docs: man:systemd-coredump(8)
Listen: /run/systemd/coredump (SequentialPacket)
Accepted: 1; Connected: 0;
CGroup: /system.slice/systemd-coredump.socket
Sep 08 14:46:39 okd-1.okd4.ktdemo.duckdns.org systemd[1]: Listening on systemd-coredump.socket - Process Core Dump Socket.
okd 를 초기 설치 하면 기본적인 cpu , mem, disk 정도만 모니터링이 되고 데이터는 node에 저장이된다.
openshift-monitoring
namespace 에 cluster-monitoring-config
configmap 을 생성한다.
root@bastion monitoring]# cat cluster-monitoring-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: cluster-monitoring-config
namespace: openshift-monitoring
data:
config.yaml: |
#grafana: # grafana 자동 설치
# nodeSelector:
# devops: "true" # node selector 찾아서 설치
prometheusK8s:
retention: 14d # 데이터 보존 기간
volumeClaimTemplate:
spec:
storageClassName: nfs-client # storage는 다이나믹 프로비져닝으로 생성
volumeMode: Filesystem
resources:
requests:
storage: 50Gi
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 3
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 3
- effect: NoSchedule
key: node.kubernetes.io/memory-pressure
operator: Exists
alertmanagerMain:
nodeSelector:
devops: "true"
volumeClaimTemplate:
spec:
storageClassName: nfs-client
volumeMode: Filesystem
resources:
requests:
storage: 50Gi
telemeterClient:
nodeSelector:
devops: "true"
prometheusOperator:
nodeSelector:
devops: "true"
kubeStateMetrics:
nodeSelector:
devops: "true"
openshiftStateMetrics:
nodeSelector:
devops: "true"
thanosQuerier:
nodeSelector:
devops: "true"
k8sPrometheusAdapter:
nodeSelector:
devops: "true"
아래 와 같이 --dry-run=server
명령어를 사용하여 yaml 화일을 검증 한다.
[root@bastion monitoring]# kubectl apply -f cluster-monitoring-config.yaml --dry-run=server -n openshift-monitoring
configmap/cluster-monitoring-config created (server dry run)
에러가 없으면 configmap 을 생성한다.
[root@bastion monitoring]# kubectl apply -f cluster-monitoring-config.yaml -n openshift-monitoring
configmap/cluster-monitoring-config created
openshift-monitoring namespace 의 pod 를 조회해 보면 신규로 alertmanager-main
, openshift-state-metrics
등이 생성이 된다.
[root@bastion monitoring]# kubectl get po -n openshift-monitoring
NAME READY STATUS RESTARTS AGE
alertmanager-main-0 6/6 Running 1 (58s ago) 77s
cluster-monitoring-operator-585c6bf574-pzqth 2/2 Running 4 10d
kube-state-metrics-755448c775-875h2 3/3 Running 0 107s
node-exporter-6z2q6 2/2 Running 0 10d
node-exporter-p6khv 2/2 Running 4 10d
node-exporter-v9jcd 2/2 Running 2 3d6h
openshift-state-metrics-6c75f948d8-qt4cx 3/3 Running 0 107s
prometheus-adapter-5fc64d5b7f-n58sh 1/1 Running 0 106s
prometheus-k8s-0 6/6 Running 0 63s
prometheus-operator-54f7875879-g49qq 2/2 Running 0 2m
prometheus-operator-admission-webhook-7d8d84948d-xqhhg 1/1 Running 0 2m12s
telemeter-client-5fb4cc8d85-lksv2 3/3 Running 0 106s
thanos-querier-798bc997d-trsbq 6/6 Running 0 93s
다이나믹 프로비저닝을 통해 2개의 pvc가 생성이 된 것을 확인할 수 있다.
observer -> metrics 메뉴로 이동하여 아래와 같이 absent_over_time(container_cpu_usage_seconds_total{}[5m])
promQL을 입력하여 데이터를 조회 한다.
참고 : https://velog.io/@sojukang/세상에서-제일-쉬운-Prometheus-Grafana-모니터링-설정
[root@bastion ~]# wget https://github.com/prometheus/node_exporter/releases/download/v1.6.1/node_exporter-1.6.1.linux-amd64.tar.gz
node_exporter-1.6.1.linux-amd64.tar.gz 100%[==============================================================================>] 9.89M 21.8MB/s in 0.5s
2023-09-12 08:11:18 (21.8 MB/s) - 'node_exporter-1.6.1.linux-amd64.tar.gz' saved [10368103/10368103]
[root@bastion ~]# ls
README.md etcd_backup kubecost oclogin
anaconda-ks.cfg harbor minio okd4
argocd htpasswd monitoring openshift-client-linux-4.12.0-0.okd-2023-03-18-084815.tar.gz
cloud_shell initial-setup-ks.cfg node_exporter-1.6.1.linux-amd64.tar.gz openshift-install-linux-4.12.0-0.okd-2023-03-18-084815.tar.gz
dynamic_provisioning install-config.yaml oauth-config.yaml
[root@bastion ~]# tar xvfz node_exporter-1.6.1.linux-amd64.tar.gz
node_exporter-1.6.1.linux-amd64/
node_exporter-1.6.1.linux-amd64/NOTICE
node_exporter-1.6.1.linux-amd64/node_exporter
node_exporter-1.6.1.linux-amd64/LICENSE
[root@bastion ~]# mv node_exporter-1.6.1.linux-amd64 node_exporter
[root@bastion ~]# cd node_exporter
[root@bastion node_exporter]# ls
LICENSE NOTICE node_exporter
[root@bastion node_exporter]# nohup ./node_exporter --web.listen-address=:8081 &
[1] 193400
<kube-state-metric 사용>
모든 쿠버네티스 오브젝트의 메트릭을 가지고 있는 kube state metric
kubelet에 포함된 cAdvisor를 통해 컨테이너의 메트릭 정보를 볼 수 있음
openshift-user-workload-monitoring
을 활성화 하기 위해서는 cluster-monitoring-config
이름의
configmap에 enableUserWorkload: true
를 추가한다.
[root@bastion ~]# oc -n openshift-monitoring edit configmap cluster-monitoring-config
아래와 같이 추가 하고 저장.
6 data:
7 config.yaml: |
8 enableUserWorkload: true # 추가
cluster operator 에서 성공 메시지를 확인은 한다.
openshift-user-workload-monitoring
namespace 에서 3개의 pod가 생성된 것을 확인한다.
[root@bastion ~]# kubectl get po -n openshift-user-workload-monitoring
NAME READY STATUS RESTARTS AGE
prometheus-operator-9d64bcf56-468q5 2/2 Running 0 48s
prometheus-user-workload-0 6/6 Running 0 44s
thanos-ruler-user-workload-0 3/3 Running 0 41s
[root@bastion monitoring]# kubectl apply -f frontend-v1-and-backend-v1-JVM.yaml -n shclub
deployment.apps/frontend-v1 created
service/frontend created
route.route.openshift.io/frontend created
deployment.apps/backend-v1 created
service/backend created
Call frontend app with curl
[root@bastion monitoring]# curl -k https://$(oc get route frontend -o jsonpath='{.spec.host}' -n shclub)
Frontend version: v1 => [Backend: http://backend:8080, Response: 200, Body: Backend version:v1, Response:200, Host:backend-v1-86d9c7747d-96d7s, Status:200, Message: Hello, World]
Check for backend's metrics
- jvm heap size
[root@bastion monitoring]# oc exec -n shclub $(oc get pods -l app=backend \
--no-headers -o custom-columns='Name:.metadata.name' \
-n shclub | head -n 1 ) -- curl -s http://localhost:8080/q/metrics | grep heap
# HELP jvm_gc_memory_allocated_bytes Incremented for an increase in the size of the (young) heap memory pool after one GC to before the next
jvm_memory_max_bytes{area="nonheap",id="CodeHeap 'profiled nmethods'"} 1.22912768E8
jvm_memory_max_bytes{area="heap",id="PS Old Gen"} 1.048576E8
jvm_memory_max_bytes{area="heap",id="PS Survivor Space"} 524288.0
jvm_memory_max_bytes{area="heap",id="PS Eden Space"} 5.1380224E7
jvm_memory_max_bytes{area="nonheap",id="Metaspace"} -1.0
jvm_memory_max_bytes{area="nonheap",id="CodeHeap 'non-nmethods'"} 5828608.0
jvm_memory_max_bytes{area="nonheap",id="Compressed Class Space"} 1.073741824E9
jvm_memory_max_bytes{area="nonheap",id="CodeHeap 'non-profiled nmethods'"} 1.22916864E8
# HELP jvm_memory_usage_after_gc_percent The percentage of long-lived heap pool used after the last GC event, in the range [0..1]
jvm_memory_usage_after_gc_percent{area="heap",pool="long-lived"} 0.09887138366699219
jvm_memory_committed_bytes{area="nonheap",id="CodeHeap 'profiled nmethods'"} 6029312.0
jvm_memory_committed_bytes{area="heap",id="PS Old Gen"} 1.1534336E7
jvm_memory_committed_bytes{area="heap",id="PS Survivor Space"} 524288.0
jvm_memory_committed_bytes{area="heap",id="PS Eden Space"} 1048576.0
jvm_memory_committed_bytes{area="nonheap",id="Metaspace"} 3.2636928E7
jvm_memory_committed_bytes{area="nonheap",id="CodeHeap 'non-nmethods'"} 2555904.0
jvm_memory_committed_bytes{area="nonheap",id="Compressed Class Space"} 4194304.0
jvm_memory_committed_bytes{area="nonheap",id="CodeHeap 'non-profiled nmethods'"} 2555904.0
jvm_memory_used_bytes{area="nonheap",id="CodeHeap 'profiled nmethods'"} 5988992.0
jvm_memory_used_bytes{area="heap",id="PS Old Gen"} 1.0367416E7
jvm_memory_used_bytes{area="heap",id="PS Survivor Space"} 248008.0
jvm_memory_used_bytes{area="heap",id="PS Eden Space"} 787736.0
jvm_memory_used_bytes{area="nonheap",id="Metaspace"} 3.22404E7
jvm_memory_used_bytes{area="nonheap",id="CodeHeap 'non-nmethods'"} 1318656.0
jvm_memory_used_bytes{area="nonheap",id="Compressed Class Space"} 4023512.0
jvm_memory_used_bytes{area="nonheap",id="CodeHeap 'non-profiled nmethods'"} 1021696.0
# HELP jvm_gc_live_data_size_bytes Size of long-lived heap memory pool after reclamation
# HELP jvm_gc_max_data_size_bytes Max size of long-lived heap memory pool
Check for backend application related metrics
[root@bastion monitoring]# oc exec -n shclub $(oc get pods -l app=backend \
> --no-headers -o custom-columns='Name:.metadata.name' \
> -n shclub | head -n 1 ) \
> -- curl -s http://localhost:8080/q/metrics | grep http_server_requests_seconds
# TYPE http_server_requests_seconds summary
# HELP http_server_requests_seconds
http_server_requests_seconds_count{method="GET",outcome="SUCCESS",status="200",uri="root"} 1.0
http_server_requests_seconds_sum{method="GET",outcome="SUCCESS",status="200",uri="root"} 3.911644255
# TYPE http_server_requests_seconds_max gauge
# HELP http_server_requests_seconds_max
http_server_requests_seconds_max{method="GET",outcome="SUCCESS",status="200",uri="root"} 0.0
[root@bastion monitoring]# vi backend-service-monitor.yaml
[root@bastion monitoring]# kubectl apply -f backend-service-monitor.yaml -n shclub
servicemonitor.monitoring.coreos.com/backend-monitor created
Role monitor-edit is required for create ServiceMonitor and PodMonitor resources. Following example is granting role montior-edit to user1 for project1
[root@bastion monitoring]# oc adm policy add-role-to-user monitoring-edit shclub -n shclub
clusterrole.rbac.authorization.k8s.io/monitoring-edit added: "shclub"
siege는 명령어를 사용하여 성능 테스트를 수행합니다.
okd_prometheus_user1.png okd_prometheus_user2.png
okd_topology1.png
Assign the user-workload-monitoring-config-edit role to a user in the openshift-user-workload-monitoring project:
$ oc -n openshift-user-workload-monitoring adm policy add-role-to-user
user-workload-monitoring-config-edit shclub
--role-namespace openshift-user-workload-monitoring
alert manager
apiVersion: v1
kind: ConfigMap
metadata:
name: cluster-monitoring-config
namespace: openshift-monitoring
data:
config.yaml: |
alertmanagerMain:
enableUserAlertmanagerConfig: true
apiVersion: v1
kind: ConfigMap
metadata:
name: user-workload-monitoring-config
namespace: openshift-user-workload-monitoring
data:
config.yaml: |
alertmanager:
enabled: true
enableAlertmanagerConfig: true
# oc -n openshift-user-workload-monitoring get alertmanager
Assign the alert-routing-edit cluster role to a user in the user-defined project:
$ oc -n <namespace> adm policy add-role-to-user alert-routing-edit <user>
2.5. Configurable monitoring components This table shows the monitoring components you can configure and the keys used to specify the components in the cluster-monitoring-config and user-workload-monitoring-config ConfigMap objects:
[root@bastion monitoring]# kubectl get configmap -n openshift-user-workload-monitoring
NAME DATA AGE
kube-root-ca.crt 1 12d
metrics-client-ca 1 26h
openshift-service-ca.crt 1 12d
prometheus-user-workload-rulefiles-0 0 26h
serving-certs-ca-bundle 1 26h
thanos-ruler-trusted-ca-bundle 1 26h
thanos-ruler-trusted-ca-bundle-c7nmestil7q08 1 26h
thanos-ruler-user-workload-rulefiles-0 0 26h
user-workload-monitoring-config 0 26h
[root@bastion monitoring]# kubectl edit configmap user-workload-monitoring-config -n openshift-user-workload-monitoring
configmap/user-workload-monitoring-config edited
apiVersion: v1
kind: ConfigMap
metadata:
name: user-workload-monitoring-config
namespace: openshift-user-workload-monitoring
data:
config.yaml: |
prometheus:
retention: 14d
volumeClaimTemplate:
spec:
storageClassName: nfs-client
resources:
requests:
storage: 10Gi
thanosRuler:
retention: 14d
volumeClaimTemplate:
spec:
storageClassName: nfs-client
resources:
requests:
storage: 10Gi
prometheus-user-workload
와 thanos-ruler-user-workload
이름으로 statefulset 이 생성이 된다.
[root@bastion monitoring]# kubectl get statefulset -n openshift-user-workload-monitoring
NAME READY AGE
prometheus-user-workload 1/1 81m
thanos-ruler-user-workload 1/1 5m17s
[root@bastion monitoring]# kubectl get po -n openshift-user-workload-monitoring
NAME READY STATUS RESTARTS AGE
prometheus-operator-9d64bcf56-468q5 2/2 Running 0 28h
prometheus-user-workload-0 6/6 Running 0 81m
thanos-ruler-user-workload-0 3/3 Running 0 4m55s
tsdb status 확인 방법
[root@bastion ~]# host=$(oc -n openshift-monitoring get route prometheus-k8s -ojsonpath={.spec.host})
[root@bastion ~]# echo $host
prometheus-k8s-openshift-monitoring.apps.okd4.ktdemo.duckdns.org
[root@bastion ~]# token=$(oc whoami -t)
[root@bastion ~]# curl -H "Authorization: Bearer $token" -k "https://$host/api/v1/status/tsdb"
{"status":"success","data":{"headStats":{"numSeries":142428,"numLabelPairs":12717,"chunkCount":369937,"minTime":1694592000485,"maxTime":1694601927016},"seriesCountByMetricName":[{"name":"apiserver_request_duration_seconds_bucket","value":11086},{"name":"apiserver_request_slo_duration_seconds_bucket","value":10802},{"name":"etcd_request_duration_seconds_bucket","value":7848},{"name":"workqueue_queue_duration_seconds_bucket","value":4554},{"name":"workqueue_work_duration_seconds_bucket","value":4554},{"name":"apiserver_response_sizes_bucket","value":3568},{"name":"apiserver_watch_events_sizes_bucket","value":1692},{"name":"prober_probe_duration_seconds_bucket","value":1464},{"name":"apiserver_request_total","value":1402},{"name":"grpc_server_handled_total","value":1326}],"labelValueCountByLabelName":[{"name":"__name__","value":1890},{"name":"name","value":1059},{"name":"secret","value":1035},{"name":"id","value":828},{"name":"resource","value":463},{"name":"configmap","value":413},{"name":"le","value":407},{"name":"type","value":336},{"name":"container_id","value":327},{"name":"uid","value":321}],"memoryInBytesByLabelName":[{"name":"id","value":134699},{"name":"name","value":77553},{"name":"__name__","value":70940},{"name":"secret","value":27612},{"name":"container_id","value":23544},{"name":"mountpoint","value":20798},{"name":"rule_group","value":13094},{"name":"uid","value":11540},{"name":"image_id","value":11385},{"name":"type","value":11119}],"seriesCountByLabelValuePair":[{"name":"endpoint=https","value":75269},{"name":"namespace=default","value":45665},{"name":"service=kubernetes","value":45635},{"name":"job=apiserver","value":45634},{"name":"apiserver=kube-apiserver","value":45633},{"name":"instance=192.168.1.146:6443","value":44948},{"name":"service=kubelet","value":33095},{"name":"endpoint=https-metrics","value":31939},{"name":"job=kubelet","value":31772},{"name":"component=apiserver","value":30400}]}}
외부 서비스 등록 방법
https://gist.github.com/christophlehmann/b1bbf2821a876c7f91d8eec3b6788f24
docker run node_exporter
https://devops.college/prometheus-operator-how-to-monitor-an-external-service-3cb6ac8d5acb
prometheus.yaml vs operator
[root@bastion monitoring]# podman run -d -p 9100:9100 prom/node-exporter
✔ docker.io/prom/node-exporter:latest
Trying to pull docker.io/prom/node-exporter:latest...
Getting image source signatures
Copying blob 2b6642e6c59e done
Copying blob d5c4df21b127 done
Copying blob 2f5f7d8898a1 done
Copying config 458e026e6a done
Writing manifest to image destination
Storing signatures
3b79c5a114c3c6908c33539230ba9926d907651824cd419f8660f6537b0618d5
[root@bastion monitoring]# podman ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
3b79c5a114c3 docker.io/prom/node-exporter:latest 24 seconds ago Up 24 seconds 0.0.0.0:9100->9100/tcp stoic_black
[root@bastion monitoring]# curl http://192.168.1.40:9100/metrics
[root@bastion monitoring]# kubectl apply -f external-node-exporter-svc.yaml -n shclub
service/external-node-exporter created
[root@bastion monitoring]# kubectl apply -f external-node-exporter-ep.yaml -n shclub
endpoints/external-node-exporter created
[root@bastion monitoring]# kubectl apply -f external-service-monitor.yaml -n shclub
servicemonitor.monitoring.coreos.com/external-node-exporter created
[root@bastion monitoring]# kubectl get svc -n shclub | grep external-node-exporter
external-node-exporter ClusterIP 172.30.33.239 <none> 9100/TCP 3m11s
[root@bastion monitoring]# kubectl get servicemonitor -n shclub
NAME AGE
external-node-exporter 11m
metric url
node_cpu_seconds_total
node_filesystem_avail_bytes
prometheus.xml vs operator 구분
promethus rule 역할 ?
4.12에서 grafana 는 ?
https://docs.okd.io/4.12/logging/cluster-logging-loki.html
ibm 자료 grafana 설치
https://loki-operator.dev/docs/howto_connect_grafana.md/
openshift-user-workload-monitoring 에 grafana operator 설치
Install the Grafana operator for the openshift-user-workload-monitoring namespace. Create an instance of Grafana. Select Operator > OperatorHub > Grafana Operator and then select the Grafana tab. Click Create Grafana. On YAML tab, enter the following text:
Operator 에서 Grafana를 생성한다.
apiVersion: integreatly.org/v1alpha1
kind: Grafana
metadata:
name: mas-grafana
namespace: openshift-user-workload-monitoring
spec:
ingress:
enabled: true
dataStorage:
accessModes:
- ReadWriteOnce
size: 10Gi
class: ocs-storagecluster-cephfs
config:
log:
mode: "console"
level: "warn"
security:
admin_user: "root"
admin_password: "secret"
auth:
disable_login_form: False
disable_signout_menu: True
auth.anonymous:
enabled: True
dashboardLabelSelector:
- matchExpressions:
- {key: app, operator: In, values: [grafana]}
권한을 할당한다.
[root@bastion monitoring]# oc adm policy add-cluster-role-to-user cluster-monitoring-view -z grafana-serviceaccount -n openshift-user-workload-monitoring
clusterrole.rbac.authorization.k8s.io/cluster-monitoring-view added: "grafana-serviceaccount"
[root@bastion monitoring]# SECRET=`oc get secret -n openshift-user-workload-monitoring | grep prometheus-user-workload-token | head -n 1 | awk '{print $1 }'`
[root@bastion monitoring]# TOKEN=`echo $(oc get secret $SECRET -n openshift-user-workload-monitoring -o json | jq -r '.data.token') | base64 -d`
이게 맞으려냐 위에께 맞으려나 아래꺼는 에전
[root@bastion monitoring]# token=$(oc create token grafana-serviceaccount -n openshift-user-workload-monitoring)
grafana dashboard 를 생성한다.
apiVersion: integreatly.org/v1alpha1
kind: GrafanaDataSource
metadata:
name: prometheus-grafanadatasource
namespace: openshift-user-workload-monitoring
spec:
datasources:
- access: proxy
editable: true
isDefault: true
jsonData:
httpHeaderName1: 'Authorization'
timeInterval: 5s
tlsSkipVerify: true
name: Prometheus
secureJsonData:
httpHeaderValue1: 'Bearer ${BEARER_TOKEN}'
type: prometheus
url: 'https://thanos-querier.openshift-monitoring.svc.cluster.local:9091'
name: prometheus-grafanadatasource.yaml
https://www.jacobbaek.com/1540
job 을 만든다.
[root@bastion monitoring]# cat prometheus-additional.yaml
- job_name: node
static_configs:
- targets: ['localhost:9100']
secret 화일을 생성한다.
[root@bastion monitoring]# kubectl create secret generic additional-scrape-configs --from-file=prometheus-additional.yaml --dry-run=client -oyaml > additional-scrape-configs.yaml
[root@bastion monitoring]# cat additional-scrape-configs.yaml
apiVersion: v1
data:
prometheus-additional.yaml: LSBqb2JfbmFtZTogbm9kZQogICAgc3RhdGljX2NvbmZpZ3M6CiAgICAgIC0gdGFyZ2V0czogWydsb2NhbGhvc3Q6OTEwMCddCg==
kind: Secret
metadata:
creationTimestamp: null
name: additional-scrape-configs
additional-scrape-configs
라는 이름의 secret을 생성한다.
[root@bastion monitoring]# kubectl apply -f additional-scrape-configs.yaml -n openshift-user-workload-monitoring
secret/additional-scrape-configs created
prometheus crd의 spec내에 아래 라인을 추가한다. 이건 아님..
[root@bastion monitoring]# kubectl edit crd prometheuses.monitoring.coreos.com -n openshift-user-workload-monitoring
spec:
additionalScrapeConfigs:
key: prometheus-additional.yaml
name: additional-scrape-configs
[root@bastion monitoring]# kubectl get prometheus -n openshift-user-workload-monitoring
NAME VERSION DESIRED READY RECONCILED AVAILABLE AGE
user-workload 2.39.1 1 1 False True 35h
https://www.reddit.com/r/openshift/comments/1029ud7/is_it_possible_to_set_additional_scrape_configs/
apiVersion: v1
kind: ConfigMap
metadata:
name: user-workload-monitoring-config
namespace: openshift-user-workload-monitoring
data:
config.yaml: |
prometheus:
retention: 14d
additionalScrapeConfigs:
enabled: true
name: additional-scrape-configs
key: additional_scrape_config.yaml
volumeClaimTemplate:
spec:
storageClassName: nfs-client
resources:
requests:
storage: 10Gi
thanosRuler:
retention: 14d
volumeClaimTemplate:
spec:
storageClassName: nfs-client
resources:
requests:
storage: 10Gi
namespace 별 pod 갯수 세기
https://devocean.sk.com/blog/techBoardDetail.do?ID=164488
minio 사용 하기 : https://devocean.sk.com/blog/techBoardDetail.do?page=&boardType=undefined&query=&ID=164946&searchData=&subIndex=
PromQL: sum(kube_pod_info) by (namespace)
kube_pod_info
는 kubernetes-exporter를 통해 수집되는 metric
[root@bastion monitoring]# SECRET=`oc get secret -n openshift-user-workload-monitoring | grep prometheus-user-workload-token | head -n 1 | awk '{print $1 }'`
[root@bastion monitoring]# TOKEN=`echo $(oc get secret $SECRET -n openshift-user-workload-monitoring -o json | jq -r '.data.token') | base64 -d`
[root@bastion monitoring]# THANOS_QUERIER_HOST=`oc get route thanos-querier -n openshift-monitoring -o json | jq -r '.spec.host'`
query 부분에 위 query를 URL-encoding을 수행한 값으로 넣어주면 된다.
[root@bastion monitoring]# curl -X GET -kG "https://$THANOS_QUERIER_HOST/api/v1/query?" --data-urlencode "query=sum(kube_pod_info) by (namespace)" -H "Authorization: Bearer $TOKEN"
{"status":"success","data":{"resultType":"vector","result":[{"metric":{"namespace":"openshift-etcd"},"value":[1694754323.376,"4"]},{"metric":{"namespace":"openshift-kube-apiserver"},"value":[1694754323.376,"9"]},{"metric":{"namespace":"openshift-kube-controller-manager"},"value":[1694754323.376,"6"]},{"metric":{"namespace":"openshift-kube-scheduler"},"value":[1694754323.376,"6"]},{"metric":{"namespace":"openshift-marketplace"},"value":[1694754323.376,"8"]},{"metric":{"namespace":"edu1"},"value":[1694754323.376,"4"]},{"metric":{"namespace":"edu2"},"value":[1694754323.376,"8"]},{"metric":{"namespace":"minio"},"value":[1694754323.376,"2"]},{"metric":{"namespace":"shclub"},"value":[1694754323.376,"9"]},{"metric":{"namespace":"openshift-dns"},"value":[1694754323.376,"6"]},{"metric":{"namespace":"openshift-ingress-canary"},"value":[1694754323.376,"3"]},{"metric":{"namespace":"openshift-machine-config-operator"},"value":[1694754323.376,"6"]},{"metric":{"namespace":"openshift-multus"},"value":[1694754323.376,"10"]},{"metric":{"namespace":"openshift-network-diagnostics"},"value":[1694754323.376,"4"]},{"metric":{"namespace":"openshift-image-registry"},"value":[1694754323.376,"7"]},{"metric":{"namespace":"openshift-monitoring"},"value":[1694754323.376,"13"]},{"metric":{"namespace":"openshift-sdn"},"value":[1694754323.376,"4"]},{"metric":{"namespace":"openshift-cluster-node-tuning-operator"},"value":[1694754323.376,"4"]},{"metric":{"namespace":"haerin"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-operator-lifecycle-manager"},"value":[1694754323.376,"7"]},{"metric":{"namespace":"etcd-backup"},"value":[1694754323.376,"3"]},{"metric":{"namespace":"openshift-oauth-apiserver"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-apiserver"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"argo-rollouts"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"argocd"},"value":[1694754323.376,"7"]},{"metric":{"namespace":"openshift-authentication-operator"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-cloud-credential-operator"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-machine-api"},"value":[1694754323.376,"4"]},{"metric":{"namespace":"openshift-cloud-controller-manager-operator"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-cluster-samples-operator"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-cluster-storage-operator"},"value":[1694754323.376,"4"]},{"metric":{"namespace":"openshift-cluster-version"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-console"},"value":[1694754323.376,"2"]},{"metric":{"namespace":"openshift-console-operator"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-controller-manager"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-dns-operator"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-etcd-operator"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-user-workload-monitoring"},"value":[1694754323.376,"5"]},{"metric":{"namespace":"openshift-ingress-operator"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-insights"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-kube-apiserver-operator"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-kube-controller-manager-operator"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-kube-storage-version-migrator-operator"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-operators"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-cluster-machine-approver"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-kube-storage-version-migrator"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"harbor"},"value":[1694754323.376,"9"]},{"metric":{"namespace":"edu5"},"value":[1694754323.376,"2"]},{"metric":{"namespace":"openshift-network-operator"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-authentication"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-apiserver-operator"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-config-operator"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-controller-manager-operator"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-kube-scheduler-operator"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-route-controller-manager"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-ingress"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-service-ca"},"value":[1694754323.376,"1"]},{"metric":{"namespace":"openshift-service-ca-operator"},"value":[1694754323.376,"1"]}]}}
추가적으로 보기 편하도록 파이썬 모듈인 json.tool을 사용할 수 있다.
[root@bastion monitoring]# curl -X GET -kG "https://$THANOS_QUERIER_HOST/api/v1/query?" --data-urlencode "query=sum(kube_pod_info) by (namespace)" -H "Authorization: Bearer $TOKEN" | python3 -m json.tool
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 4749 0 4749 0 0 96918 0 --:--:-- --:--:-- --:--:-- 96918
{
"status": "success",
"data": {
"resultType": "vector",
"result": [
{
"metric": {
"namespace": "openshift-etcd"
},
"value": [
1694651859.101,
"4"
]
},
{
"metric": {
"namespace": "openshift-kube-apiserver"
},
"value": [
1694651859.101,
"9"
]
},
{
"metric": {
"namespace": "openshift-kube-controller-manager"
},
"value": [
1694651859.101,
"6"
]
},
{
"metric": {
"namespace": "openshift-kube-scheduler"
},
"value": [
1694651859.101,
"6"
]
},
...
[root@bastion monitoring]# kubectl get servicemonitors.monitoring.coreos.com -n openshift-user-workload-monitoring NAME AGE prometheus-operator 45h prometheus-user-workload 45h thanos-ruler 45h thanos-sidecar 45h [root@bastion monitoring]# kubectl get servicemonitors.monitoring.coreos.com -n shclub NAME AGE backend-monitor 39h external-node-exporter 14h
minio
13502
15306
12563 : standalone 으로 설치 된경우
prometheusK8s: - job_name: 'kubernetes-pods' kubernetes_sd_configs: - role: pod
- job_name: minio-job
bearer_token: my-minio
metrics_path: /minio/v2/metrics/cluster
scheme: http
static_configs:
- targets: ['my-minio.minio.svc.cluster.local:9000']
https://min.io/docs/minio/kubernetes/upstream/operations/monitoring/metrics-and-alerts.html https://min.io/docs/minio/linux/reference/minio-server/minio-server.html#envvar.MINIO_PROMETHEUS_AUTH_TYPE
curl https://play.min.io/minio/v2/metrics/cluster
http://minio.example.net:9000/minio/v2/metrics/cluster
잠시만 백업
apiVersion: v1 data: config.yaml: | prometheus: - job_name: minio-job bearer_token: my-minio metrics_path: /minio/v2/metrics/cluster scheme: http static_configs: - targets: ['my-minio.minio.svc.cluster.local:9000'] retention: 14d additionalScrapeConfigs: enabled: true name: additional-scrape-configs key: additional_scrape_config.yaml
TOKEN=$(oc create token grafana-serviceaccount -n openshift-user-workload-monitoring)
cat manifests/grafana-datasource.yaml | sed 's/Bearer .*/Bearer '"$TOKEN""'"'/'|oc apply -n openshift-user-workload-monitoring -f -
okd_minio_env.png
okd_minio_metric_test.png
okd_cadvisor1.png
api server 에러 로그 [root@okd-1 core]# cat /var/log/kube-apiserver/audit.log
[root@okd-1 core]# swapoff -a [root@okd-1 core]# systemctl restart kubelet [root@okd-1 core]# cat /etc/systemd/system/kubelet.service [Unit] Description=Kubernetes Kubelet Wants=rpc-statd.service network-online.target Requires=crio.service kubelet-auto-node-size.service After=network-online.target crio.service kubelet-auto-node-size.service After=ostree-finalize-staged.service
[Service] Type=notify ExecStartPre=/bin/mkdir --parents /etc/kubernetes/manifests ExecStartPre=/bin/rm -f /var/lib/kubelet/cpu_manager_state ExecStartPre=/bin/rm -f /var/lib/kubelet/memory_manager_state EnvironmentFile=/etc/os-release EnvironmentFile=-/etc/kubernetes/kubelet-workaround EnvironmentFile=-/etc/kubernetes/kubelet-env EnvironmentFile=/etc/node-sizing.env
ExecStart=/usr/local/bin/kubenswrapper
/usr/bin/kubelet
--config=/etc/kubernetes/kubelet.conf
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig
--kubeconfig=/var/lib/kubelet/kubeconfig
--container-runtime=remote
--container-runtime-endpoint=/var/run/crio/crio.sock
--runtime-cgroups=/system.slice/crio.service
--node-labels=node-role.kubernetes.io/control-plane,node-role.kubernetes.io/master,node.openshift.io/os_id=${ID}
--node-ip=${KUBELET_NODE_IP}
--minimum-container-ttl-duration=6m0s
--cloud-provider=
--volume-plugin-dir=/etc/kubernetes/kubelet-plugins/volume/exec
--hostname-override=${KUBELET_NODE_NAME}
--provider-id=${KUBELET_PROVIDERID}
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule
--pod-infra-container-image=quay.io/openshift/okd-content@sha256:5e754620ce9f9c97f8ef12ed089338511d60eaa6d8c9f9cf79b472d3fe8ef988
--system-reserved=cpu=${SYSTEM_RESERVED_CPU},memory=${SYSTEM_RESERVED_MEMORY},ephemeral-storage=${SYSTEM_RESERVED_ES}
--v=${KUBELET_LOG_LEVEL}
Restart=always RestartSec=10
[Install] WantedBy=multi-user.target
.. [root@okd-1 multus.d]# pwd /etc/kubernetes/cni/net.d/multus.d
master node 에 접속 하여
export KUBECONFIG=/etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/node-kubeconfigs/lb-int.kubeconfig
[root@okd-1 kubernetes]# systemctl status kubelet ● kubelet.service - Kubernetes Kubelet Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; preset: disabled) Drop-In: /etc/systemd/system/kubelet.service.d └─01-kubens.conf, 10-mco-default-env.conf, 10-mco-default-madv.conf, 20-logging.conf, 20-nodenet.conf Active: active (running) since Sat 2023-09-16 22:47:25 KST; 6min ago Process: 1992 ExecStartPre=/bin/mkdir --parents /etc/kubernetes/manifests (code=exited, status=0/SUCCESS) Process: 1993 ExecStartPre=/bin/rm -f /var/lib/kubelet/cpu_manager_state (code=exited, status=0/SUCCESS) Process: 1994 ExecStartPre=/bin/rm -f /var/lib/kubelet/memory_manager_state (code=exited, status=0/SUCCESS) Main PID: 1995 (kubelet) Tasks: 24 (limit: 11606) Memory: 46.3M CPU: 7.025s CGroup: /system.slice/kubelet.service └─1995 /usr/bin/kubelet --config=/etc/kubernetes/kubelet.conf --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig --kubeconfig=/var/lib/kubelet/kubeconfig --container-runtime=rem>
Sep 16 22:54:20 okd-1.okd4.ktdemo.duckdns.org kubenswrapper[1995]: I0916 22:54:20.970618 1995 kubelet_node_status.go:590] "Recording event message for node" node="okd-1.okd4.ktdemo.duckd> Sep 16 22:54:20 okd-1.okd4.ktdemo.duckdns.org kubenswrapper[1995]: I0916 22:54:20.970627 1995 kubelet_node_status.go:590] "Recording event message for node" node="okd-1.okd4.ktdemo.duckd> Sep 16 22:54:20 okd-1.okd4.ktdemo.duckdns.org kubenswrapper[1995]: I0916 22:54:20.970647 1995 kubelet_node_status.go:72] "Attempting to register node" node="okd-1.okd4.ktdemo.duckdns.org" Sep 16 22:54:20 okd-1.okd4.ktdemo.duckdns.org kubenswrapper[1995]: E0916 22:54:20.971915 1995 kubelet_node_status.go:94] "Unable to register node with API server" err="nodes is forbidden> Sep 16 22:54:21 okd-1.okd4.ktdemo.duckdns.org kubenswrapper[1995]: E0916 22:54:21.057688 1995 kubelet.go:2471] "Error getting node" err="node "okd-1.okd4.ktdemo.duckdns.org" not found" Sep 16 22:54:21 okd-1.okd4.ktdemo.duckdns.org kubenswrapper[1995]: E0916 22:54:21.158320 1995 kubelet.go:2471] "Error getting node" err="node "okd-1.okd4.ktdemo.duckdns.org" not found" Sep 16 22:54:21 okd-1.okd4.ktdemo.duckdns.org kubenswrapper[1995]: E0916 22:54:21.259128 1995 kubelet.go:2471] "Error getting node" err="node "okd-1.okd4.ktdemo.duckdns.org" not found" Sep 16 22:54:21 okd-1.okd4.ktdemo.duckdns.org kubenswrapper[1995]: E0916 22:54:21.359983 1995 kubelet.go:2471] "Error getting node" err="node "okd-1.okd4.ktdemo.duckdns.org" not found" Sep 16 22:54:21 okd-1.okd4.ktdemo.duckdns.org kubenswrapper[1995]: E0916 22:54:21.460865 1995 kubelet.go:2471] "Error getting node" err="node "okd-1.okd4.ktdemo.duckdns.org" not found" Sep 16 22:54:21 okd-1.okd4.ktdemo.duckdns.org kubenswrapper[1995]: E0916 22:54:21.561750 1995 kubelet.go:2471] "Error getting node" err="node "okd-1.okd4.ktdemo.duckdns.org" not found"
[root@okd-1 kubernetes]# oc whoami Error from server (ServiceUnavailable): the server is currently unable to handle the request (get users.user.openshift.io ~) [root@okd-1 kubernetes]# oc get csr NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION csr-48vzj 77m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-5tscx 113m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-6xnr7 72m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-bjftj 6m30s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-g2fjp 96m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-h2f95 27m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-kh7dr 42m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-pbkq2 106m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-qn87p 99m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-x2k78 92m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-zp94n 57m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending [root@okd-1 kubernetes]# oc get csr -o name | xargs oc adm certificate approve certificatesigningrequest.certificates.k8s.io/csr-48vzj approved certificatesigningrequest.certificates.k8s.io/csr-5tscx approved certificatesigningrequest.certificates.k8s.io/csr-6xnr7 approved certificatesigningrequest.certificates.k8s.io/csr-bjftj approved certificatesigningrequest.certificates.k8s.io/csr-g2fjp approved certificatesigningrequest.certificates.k8s.io/csr-h2f95 approved certificatesigningrequest.certificates.k8s.io/csr-kh7dr approved certificatesigningrequest.certificates.k8s.io/csr-pbkq2 approved certificatesigningrequest.certificates.k8s.io/csr-qn87p approved certificatesigningrequest.certificates.k8s.io/csr-x2k78 approved certificatesigningrequest.certificates.k8s.io/csr-zp94n approved
[root@okd-1 kubernetes]# oc get csr NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION csr-48vzj 79m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-5tscx 115m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-6xnr7 74m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-bjftj 8m36s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-g2fjp 98m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-ghlqp 41s kubernetes.io/kubelet-serving system:node:okd-1.okd4.ktdemo.duckdns.org Pending csr-h2f95 29m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-kh7dr 44m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-pbkq2 108m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-qn87p 101m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-x2k78 94m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-zp94n 59m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued [root@okd-1 kubernetes]# oc get csr -o name | xargs oc adm certificate approve certificatesigningrequest.certificates.k8s.io/csr-48vzj approved
[root@okd-1 kubernetes]# oc adm certificate approve csr-ghlqp
certificatesigningrequest.certificates.k8s.io/csr-ghlqp approved
[root@bastion ~]# oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign
# userData
[root@bastion ~]# cat worker.ign
{"ignition":{"config":{"merge":[{"source":"https://api-int.okd4.ktdemo.duckdns.org:22623/config/worker","verification":{}}],"replace":{"verification":{}}},"proxy":{},"security":{"tls":{"certificateAuthorities":[{"source":"data:text/plain;charset=utf-8;base64,LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURFRENDQWZpZ0F3SUJBZ0lJUFNOQkNySVA4ZjB3RFFZSktvWklodmNOQVFFTEJRQXdKakVTTUJBR0ExVUUKQ3hNSmIzQmxibk5vYVdaME1SQXdEZ1lEVlFRREV3ZHliMjkwTFdOaE1CNFhEVEl6TURrd01UQXdNelkwTVZvWApEVE16TURneU9UQXdNelkwTVZvd0pqRVNNQkFHQTFVRUN4TUpiM0JsYm5Ob2FXWjBNUkF3RGdZRFZRUURFd2R5CmIyOTBMV05oTUlJQklqQU5CZ2txaGtpRzl3MEJBUUVGQUFPQ0FROEFNSUlCQ2dLQ0FRRUEwZ1QyQndyM0pBU28Kd2tKdE5Ca3Y5YkdmZDN5RisyNXBOZGhJbDhuZGJzQjUyeGpvQU1MZFBSekRXOStBWkNEeDFKUDlQR245Qnp5Tgo4K2UxYk1UMDRwWExUWkNyRXNlZkdUcGswYVljbWxLU3gwaThWYUU1cEJtSHZYNDVaUE1YamtRemJsWStGY3BBCmhydkowS0E4eTl5cDJSQllyb1kyZVNLcTB2UkJlRCtaQ0dyZVRKdUdrQ3FKVTFYS3kwRHY0TTVGOWE4Q3dNS1cKcVlUSkIwaThoVnh4K2xqTnBLUWtJcjR2ZXNOQkNsRERVZzR1WVFjdXlmRURLcDlqSDhOSTBXYng5aFB6bGZJOAppMi8wS2NVdnYrVURTVnYvcHA0RVh4QXZnSW93MVhsQnU3b0phVWYwSUJWRFlSeUNZUi9RNkQrQjRXMG1lMTZrClJQVWVQSlZQMVFJREFRQUJvMEl3UURBT0JnTlZIUThCQWY4RUJBTUNBcVF3RHdZRFZSMFRBUUgvQkFVd0F3RUIKL3pBZEJnTlZIUTRFRmdRVXlFRWVkeWM1WFMrS0dQY0ovc3UwU3k3Y2tYa3dEUVlKS29aSWh2Y05BUUVMQlFBRApnZ0VCQUd5YzdXOHJLcThYRWQ1c2VBSElKSld0WHd2T1ZzTVFrVVo5ditEam5pSVRyNkRVU3JpbEVHQzBmM08xCk9Pbk5kWjI5RVRtNmgxbDZOVG11WUtoK2J5V045eEZFTGZxSXFvelFCeG1TNUZWMEYrNXBxRE5qbEo3UWxIZkMKVGZETWROeTQ5WEZBOGhwOGtPdklMcXR4YmxReVR6c1o3M3ZEVlB5a01OSXMwUzRoUmlyOFBUcmt0a2pGakNIMgpUREFITERtdU91TTA2V0RIL0dGRUJsOVpHSWx2bThUYTY0Y28vNE5SaU1Zak5aWnNBd0kxSlI0dDk5NTg5UmpPCno0UmFUOVRZdmFoZlczdUJuY2E3TkJ4d29ZS1NVYlU4YStUazg3bTZaNHpudGdYVnhUVjlRdU5kZkNwdklNTTEKNThERHpCdnM2OVBqTEdOQmRwN3lWMnBlQlE4PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==","verification":{}}]}},"timeouts":{},"version":"3.2.0"},"passwd":{},"storage":{},"systemd":{}}
참고
[root@okd-1 kubernetes]# mount -t nfs 192.168.1.79:/volume3/okd/cluster_backup /mnt
[root@okd-1 kubernetes]# ls /mnt
etcd-backup-2023-09-11T01:53:12+00:00 etcd-backup-2023-09-12T07:01:02+00:00 etcd-backup-2023-09-14T07:01:02+00:00
etcd-backup-2023-09-11T07:01:02+00:00 etcd-backup-2023-09-13T07:01:02+00:00
[root@okd-1 mnt]# cd /home/core
[root@okd-1 core]# mkdir -p backup
[root@okd-1 core]# cd /mnt/etcd-backup-2023-09-14T07:01:02+00:00
[root@okd-1 etcd-backup-2023-09-14T07:01:02+00:00]# ls
snapshot_2023-09-14_160102.db static_kuberesources_2023-09-14_160102.tar.gz
[root@okd-1 etcd-backup-2023-09-14T07:01:02+00:00]# cp * /home/core/backup
[root@okd-1 etcd-backup-2023-09-14T07:01:02+00:00]# cd /home/core/backup
[root@okd-1 backup]# ls
snapshot_2023-09-14_160102.db static_kuberesources_2023-09-14_160102.tar.gz
Move the existing etcd pod file out of the kubelet manifest directory:
[root@okd-1 core]# mv /etc/kubernetes/manifests/etcd-pod.yaml /tmp
Verify that the etcd pods are stopped. 아무것도 안나와야 정상이다.
[root@okd-1 core]# crictl ps | grep etcd | grep -v operator
Move the existing Kubernetes API server pod file out of the kubelet manifest directory:
[root@okd-1 core]# mv /etc/kubernetes/manifests/kube-apiserver-pod.yaml /tmp
Verify that the Kubernetes API server pods are stopped.
[root@okd-1 core]# crictl ps | grep kube-apiserver | grep -v operator
Move the etcd data directory to a different location
[root@okd-1 core]# crictl ps | grep kube-apiserver | grep -v operator
Move the etcd data directory to a different location:
[root@okd-1 core]# mv /var/lib/etcd/ /tmp
[root@okd-1 core]# ls -al /tmp
total 148
drwxrwxrwt. 11 root root 280 Sep 17 16:16 .
drwxr-xr-x. 12 root root 4096 Sep 1 12:15 ..
drwxrwxrwt. 2 root root 40 Sep 17 07:37 .ICE-unix
drwxrwxrwt. 2 root root 40 Sep 17 07:37 .X11-unix
drwxrwxrwt. 2 root root 40 Sep 17 07:37 .XIM-unix
drwxrwxrwt. 2 root root 40 Sep 17 07:37 .font-unix
drwxr-xr-x. 3 root root 60 Sep 17 07:43 etcd
-rw-r--r--. 1 root root 20110 Sep 1 12:28 etcd-pod.yaml
-rw-r--r--. 1 root root 9182 Sep 17 00:32 kube-apiserver-pod.yaml
-rw-r--r--. 1 root root 114610 Sep 17 14:50 multus-lumberjack.log
drwx------. 3 root root 60 Sep 17 07:37 systemd-private-7aa69c824f82418882c1c0e7161675f8-chronyd.service-lITnch
drwx------. 3 root root 60 Sep 17 07:37 systemd-private-7aa69c824f82418882c1c0e7161675f8-dbus-broker.service-eeBP6c
drwx------. 3 root root 60 Sep 17 07:37 systemd-private-7aa69c824f82418882c1c0e7161675f8-systemd-logind.service-dBL8lL
drwx------. 3 root root 60 Sep 17 07:37 systemd-private-7aa69c824f82418882c1c0e7161675f8-systemd-resolved.service-xTQ1Ss
[root@okd-1 core]# /usr/local/bin/cluster-restore.sh /home/core/backup
993c4dcfe4594585c54f17989b10d5ffaa7a5f1d5b3591399e5af5e56a507955
etcdctl version: 3.5.6
API version: 3.5
Deprecated: Use `etcdutl snapshot status` instead.
{"hash":3451402762,"revision":5760427,"totalKey":8772,"totalSize":135467008}
...stopping kube-apiserver-pod.yaml
...stopping kube-controller-manager-pod.yaml
...stopping kube-scheduler-pod.yaml
...stopping etcd-pod.yaml
Waiting for container etcd to stop
complete
Waiting for container etcdctl to stop
complete
Waiting for container etcd-metrics to stop
complete
Waiting for container kube-controller-manager to stop
................complete
Waiting for container kube-apiserver to stop
complete
Waiting for container kube-scheduler to stop
.......................................................complete
starting restore-etcd static pod
starting kube-apiserver-pod.yaml
static-pod-resources/kube-apiserver-pod-10/kube-apiserver-pod.yaml
starting kube-controller-manager-pod.yaml
static-pod-resources/kube-controller-manager-pod-7/kube-controller-manager-pod.yaml
starting kube-scheduler-pod.yaml
static-pod-resources/kube-scheduler-pod-7/kube-scheduler-pod.yaml
[root@okd-1 core]# oc get nodes
NAME STATUS ROLES AGE VERSION
okd-1.okd4.ktdemo.duckdns.org Ready control-plane,master 16d v1.25.7+eab9cc9
okd-2.okd4.ktdemo.duckdns.org Ready worker 16d v1.25.7+eab9cc9
okd-3.okd4.ktdemo.duckdns.org Ready worker 9d v1.25.7+eab9cc9
[root@okd-1 core]# systemctl restart kubelet.service
[root@okd-1 core]# oc get csr
No resources found
[root@okd-1 core]# cd /var/lib/kubelet/pki [root@okd-1 pki]# ls kubelet-client-2023-09-01-03-16-13.pem kubelet-client-current.pem kubelet-server-2023-09-17-08-10-36.pem kubelet.crt kubelet-client-2023-09-16-22-57-51.pem kubelet-server-2023-09-01-03-16-34.pem kubelet-server-current.pem kubelet.key [root@okd-1 pki]# mv * /home/core
[root@okd-1 pki]# openssl x509 -in kubelet-client.pem -text Certificate: Data: Version: 3 (0x2) Serial Number: da:2f:6e:89:7b:8f:6b:56:e5:b9:40:0c:a5:61:d1:4d Signature Algorithm: sha256WithRSAEncryption Issuer: OU = openshift, CN = kubelet-signer Validity Not Before: Sep 1 03:11:13 2023 GMT Not After : Sep 2 00:36:57 2023 GMT Subject: O = system:nodes, CN = system:node:okd-1.okd4.ktdemo.duckdns.org Subject Public Key Info: Public Key Algorithm: id-ecPublicKey Public-Key: (256 bit) pub: 04:eb:ab:91:53:ab:50:73:4d:75:ce:f0:c6:19:a6: 12:50:74:e4:6a:b8:d5:0d:62:5f:00:44:6e:14:91: 19:db:97:a5:95:6e:fa:17:1b:0d:4a:13:23:a6:fb: 24:d6:e8:fd:c0:e0:e1:f2:e9:94:9c:33:6f:8f:f5: f7:57:42:17:77 ASN1 OID: prime256v1 NIST CURVE: P-256 X509v3 extensions: X509v3 Key Usage: critical Digital Signature, Key Encipherment X509v3 Extended Key Usage: TLS Web Client Authentication X509v3 Basic Constraints: critical CA:FALSE X509v3 Authority Key Identifier: 4A:13:BB:F5:96:36:01:5C:BA:FE:26:06:96:81:EA:F5:4B:67:D2:95 Signature Algorithm: sha256WithRSAEncryption Signature Value: 20:7b:f7:92:5b:b8:06:fb:99:18:53:bb:02:39:12:99:dd:9f: 82:a5:f3:a0:18:65:9c:eb:1c:3d:af:bd:81:ec:71:ee:4d:86: e0:4c:dd:38:56:40:d5:2a:c8:b5:ec:a6:bd:09:bf:0f:ef:c1: 80:bf:97:a6:e1:0b:3e:40:fa:fc:38:5a:54:4e:45:5d:26:52: 2d:2e:4a:f4:6c:3d:c0:7d:83:da:02:d1:f3:89:27:ac:8c:53: 55:9d:9f:12:3c:ea:b0:f4:00:12:e2:48:dc:85:e4:04:d5:83: c3:24:65:00:7c:4e:57:93:30:7e:ef:07:ef:e4:83:0d:96:77: b0:b6:bd:22:1c:14:08:36:eb:20:7a:29:0d:52:2b:09:63:12: 12:66:0d:c1:66:90:01:9c:44:0a:4f:95:69:27:87:2e:21:50: fd:b5:83:09:37:58:d5:3b:9f:0b:01:00:17:9a:fa:bd:60:a4: 66:88:36:d1:cc:ae:e4:5f:4a:8c:ee:8c:66:a8:bc:85:3b:79: 8f:98:12:48:14:99:21:15:c8:81:85:3c:fe:49:07:dc:59:9e: 76:ee:9d:ff:74:d5:76:99:1b:e4:e0:ef:a0:a9:2b:2f:29:4a: b9:17:c2:ce:79:57:ec:bb:a1:dd:c8:4c:a7:b5:f1:bc:cb:b2: b0:47:31:4e -----BEGIN CERTIFICATE----- MIICjjCCAXagAwIBAgIRANovbol7j2tW5blADKVh0U0wDQYJKoZIhvcNAQELBQAw LTESMBAGA1UECxMJb3BlbnNoaWZ0MRcwFQYDVQQDEw5rdWJlbGV0LXNpZ25lcjAe Fw0yMzA5MDEwMzExMTNaFw0yMzA5MDIwMDM2NTdaMEsxFTATBgNVBAoTDHN5c3Rl bTpub2RlczEyMDAGA1UEAxMpc3lzdGVtOm5vZGU6b2tkLTEub2tkNC5rdGRlbW8u ZHVja2Rucy5vcmcwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAATrq5FTq1BzTXXO 8MYZphJQdORquNUNYl8ARG4UkRnbl6WVbvoXGw1KEyOm+yTW6P3A4OHy6ZScM2+P 9fdXQhd3o1YwVDAOBgNVHQ8BAf8EBAMCBaAwEwYDVR0lBAwwCgYIKwYBBQUHAwIw DAYDVR0TAQH/BAIwADAfBgNVHSMEGDAWgBRKE7v1ljYBXLr+JgaWger1S2fSlTAN BgkqhkiG9w0BAQsFAAOCAQEAIHv3klu4BvuZGFO7AjkSmd2fgqXzoBhlnOscPa+9 gexx7k2G4EzdOFZA1SrIteymvQm/D+/BgL+XpuELPkD6/DhaVE5FXSZSLS5K9Gw9 wH2D2gLR84knrIxTVZ2fEjzqsPQAEuJI3IXkBNWDwyRlAHxOV5Mwfu8H7+SDDZZ3 sLa9IhwUCDbrIHopDVIrCWMSEmYNwWaQAZxECk+VaSeHLiFQ/bWDCTdY1TufCwEA F5r6vWCkZog20cyu5F9KjO6MZqi8hTt5j5gSSBSZIRXIgYU8/kkH3Fmedu6d/3TV dpkb5ODvoKkrLylKuRfCznlX7Luh3chMp7XxvMuysEcxTg== -----END CERTIFICATE-----
[root@okd-1 pki]# openssl x509 -in kubelet-client.pem -text Certificate: Data: Version: 3 (0x2) Serial Number: da:2f:6e:89:7b:8f:6b:56:e5:b9:40:0c:a5:61:d1:4d Signature Algorithm: sha256WithRSAEncryption Issuer: OU = openshift, CN = kubelet-signer Validity Not Before: Sep 1 03:11:13 2023 GMT Not After : Sep 2 00:36:57 2023 GMT Subject: O = system:nodes, CN = system:node:okd-1.okd4.ktdemo.duckdns.org Subject Public Key Info: Public Key Algorithm: id-ecPublicKey Public-Key: (256 bit) pub: 04:eb:ab:91:53:ab:50:73:4d:75:ce:f0:c6:19:a6: 12:50:74:e4:6a:b8:d5:0d:62:5f:00:44:6e:14:91: 19:db:97:a5:95:6e:fa:17:1b:0d:4a:13:23:a6:fb: 24:d6:e8:fd:c0:e0:e1:f2:e9:94:9c:33:6f:8f:f5: f7:57:42:17:77 ASN1 OID: prime256v1 NIST CURVE: P-256 X509v3 extensions: X509v3 Key Usage: critical Digital Signature, Key Encipherment X509v3 Extended Key Usage: TLS Web Client Authentication X509v3 Basic Constraints: critical CA:FALSE X509v3 Authority Key Identifier: 4A:13:BB:F5:96:36:01:5C:BA:FE:26:06:96:81:EA:F5:4B:67:D2:95 Signature Algorithm: sha256WithRSAEncryption Signature Value: 20:7b:f7:92:5b:b8:06:fb:99:18:53:bb:02:39:12:99:dd:9f: 82:a5:f3:a0:18:65:9c:eb:1c:3d:af:bd:81:ec:71:ee:4d:86: e0:4c:dd:38:56:40:d5:2a:c8:b5:ec:a6:bd:09:bf:0f:ef:c1: 80:bf:97:a6:e1:0b:3e:40:fa:fc:38:5a:54:4e:45:5d:26:52: 2d:2e:4a:f4:6c:3d:c0:7d:83:da:02:d1:f3:89:27:ac:8c:53: 55:9d:9f:12:3c:ea:b0:f4:00:12:e2:48:dc:85:e4:04:d5:83: c3:24:65:00:7c:4e:57:93:30:7e:ef:07:ef:e4:83:0d:96:77: b0:b6:bd:22:1c:14:08:36:eb:20:7a:29:0d:52:2b:09:63:12: 12:66:0d:c1:66:90:01:9c:44:0a:4f:95:69:27:87:2e:21:50: fd:b5:83:09:37:58:d5:3b:9f:0b:01:00:17:9a:fa:bd:60:a4: 66:88:36:d1:cc:ae:e4:5f:4a:8c:ee:8c:66:a8:bc:85:3b:79: 8f:98:12:48:14:99:21:15:c8:81:85:3c:fe:49:07:dc:59:9e: 76:ee:9d:ff:74:d5:76:99:1b:e4:e0:ef:a0:a9:2b:2f:29:4a: b9:17:c2:ce:79:57:ec:bb:a1:dd:c8:4c:a7:b5:f1:bc:cb:b2: b0:47:31:4e -----BEGIN CERTIFICATE----- MIICjjCCAXagAwIBAgIRANovbol7j2tW5blADKVh0U0wDQYJKoZIhvcNAQELBQAw LTESMBAGA1UECxMJb3BlbnNoaWZ0MRcwFQYDVQQDEw5rdWJlbGV0LXNpZ25lcjAe Fw0yMzA5MDEwMzExMTNaFw0yMzA5MDIwMDM2NTdaMEsxFTATBgNVBAoTDHN5c3Rl bTpub2RlczEyMDAGA1UEAxMpc3lzdGVtOm5vZGU6b2tkLTEub2tkNC5rdGRlbW8u ZHVja2Rucy5vcmcwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAATrq5FTq1BzTXXO 8MYZphJQdORquNUNYl8ARG4UkRnbl6WVbvoXGw1KEyOm+yTW6P3A4OHy6ZScM2+P 9fdXQhd3o1YwVDAOBgNVHQ8BAf8EBAMCBaAwEwYDVR0lBAwwCgYIKwYBBQUHAwIw DAYDVR0TAQH/BAIwADAfBgNVHSMEGDAWgBRKE7v1ljYBXLr+JgaWger1S2fSlTAN BgkqhkiG9w0BAQsFAAOCAQEAIHv3klu4BvuZGFO7AjkSmd2fgqXzoBhlnOscPa+9 gexx7k2G4EzdOFZA1SrIteymvQm/D+/BgL+XpuELPkD6/DhaVE5FXSZSLS5K9Gw9 wH2D2gLR84knrIxTVZ2fEjzqsPQAEuJI3IXkBNWDwyRlAHxOV5Mwfu8H7+SDDZZ3 sLa9IhwUCDbrIHopDVIrCWMSEmYNwWaQAZxECk+VaSeHLiFQ/bWDCTdY1TufCwEA F5r6vWCkZog20cyu5F9KjO6MZqi8hTt5j5gSSBSZIRXIgYU8/kkH3Fmedu6d/3TV dpkb5ODvoKkrLylKuRfCznlX7Luh3chMp7XxvMuysEcxTg== -----END CERTIFICATE----- [root@okd-1 pki]# ls kubelet-client.pem kubelet-server.pem kubelet.crt kubelet.key [root@okd-1 pki]# openssl x509 -in kubelet-server.pem -text Certificate: Data: Version: 3 (0x2) Serial Number: 3e:09:e2:5b:a9:a2:08:bb:cd:81:89:39:53:2a:18:f6 Signature Algorithm: sha256WithRSAEncryption Issuer: OU = openshift, CN = kubelet-signer Validity Not Before: Sep 1 03:11:34 2023 GMT Not After : Sep 2 00:36:57 2023 GMT Subject: O = system:nodes, CN = system:node:okd-1.okd4.ktdemo.duckdns.org Subject Public Key Info: Public Key Algorithm: id-ecPublicKey Public-Key: (256 bit) pub: 04:60:0a:eb:a7:80:87:33:55:94:ad:76:af:b7:c3: 67:ba:71:cc:b5:73:20:f3:be:27:b2:b9:3b:2f:b9: 9a:ec:70:88:b7:58:65:ed:1b:16:95:ad:9f:d2:b1: 82:73:11:3c:a2:df:20:79:5d:d6:dd:6d:40:17:3a: 6a:44:c4:0c:a8 ASN1 OID: prime256v1 NIST CURVE: P-256 X509v3 extensions: X509v3 Key Usage: critical Digital Signature, Key Encipherment X509v3 Extended Key Usage: TLS Web Server Authentication X509v3 Basic Constraints: critical CA:FALSE X509v3 Authority Key Identifier: 4A:13:BB:F5:96:36:01:5C:BA:FE:26:06:96:81:EA:F5:4B:67:D2:95 X509v3 Subject Alternative Name: DNS:okd-1.okd4.ktdemo.duckdns.org, IP Address:192.168.1.146 Signature Algorithm: sha256WithRSAEncryption Signature Value: 8a:95:28:9f:f7:6a:6a:69:7b:c2:ad:52:db:05:3a:b8:1d:f2: 26:4d:a4:ae:79:f5:2a:c6:66:35:ac:f3:e9:48:ac:d8:e9:6a: 23:a4:07:9f:be:78:cb:64:7e:45:4a:0f:80:06:e3:df:82:c2: cc:fc:ee:93:01:63:f8:11:0c:28:a5:58:7a:87:ef:c4:3c:33: ea:03:46:df:2d:94:ac:b0:e6:e2:4d:39:a0:07:6c:0c:b8:a8: 1b:17:01:69:58:2b:38:59:dd:d0:83:03:21:a9:c6:b3:d1:73: c9:39:91:6f:71:b2:a1:2e:c7:62:3a:d0:d7:76:12:a4:28:1b: a5:6e:55:d8:e4:20:38:6d:05:22:c8:6e:e1:94:6e:7f:b0:b8: ee:53:06:fb:fe:cd:3b:78:a4:fe:8d:8f:db:6f:fa:0c:18:a0: 4f:cb:ac:37:fe:eb:8d:dd:f0:97:72:1c:54:ca:ac:91:3a:4b: 3e:b0:8c:55:ba:5d:22:5d:43:97:9d:ee:26:e8:7a:be:e7:da: 4c:3d:97:36:44:40:eb:3f:ca:0b:08:18:59:d4:05:e4:76:c5: 1a:0c:59:00:a8:8d:29:93:68:f1:7d:6d:fb:29:b5:00:32:34: 2a:d4:be:3b:dd:63:9a:aa:1e:d4:40:21:d2:d6:c6:c2:7f:20: 2a:6d:1e:71 -----BEGIN CERTIFICATE----- MIICvzCCAaegAwIBAgIQPgniW6miCLvNgYk5UyoY9jANBgkqhkiG9w0BAQsFADAt MRIwEAYDVQQLEwlvcGVuc2hpZnQxFzAVBgNVBAMTDmt1YmVsZXQtc2lnbmVyMB4X DTIzMDkwMTAzMTEzNFoXDTIzMDkwMjAwMzY1N1owSzEVMBMGA1UEChMMc3lzdGVt Om5vZGVzMTIwMAYDVQQDEylzeXN0ZW06bm9kZTpva2QtMS5va2Q0Lmt0ZGVtby5k dWNrZG5zLm9yZzBZMBMGByqGSM49AgEGCCqGSM49AwEHA0IABGAK66eAhzNVlK12 r7fDZ7pxzLVzIPO+J7K5Oy+5muxwiLdYZe0bFpWtn9KxgnMRPKLfIHld1t1tQBc6 akTEDKijgYcwgYQwDgYDVR0PAQH/BAQDAgWgMBMGA1UdJQQMMAoGCCsGAQUFBwMB MAwGA1UdEwEB/wQCMAAwHwYDVR0jBBgwFoAUShO79ZY2AVy6/iYGloHq9Utn0pUw LgYDVR0RBCcwJYIdb2tkLTEub2tkNC5rdGRlbW8uZHVja2Rucy5vcmeHBMCoAZIw DQYJKoZIhvcNAQELBQADggEBAIqVKJ/3amppe8KtUtsFOrgd8iZNpK559SrGZjWs 8+lIrNjpaiOkB5++eMtkfkVKD4AG49+Cwsz87pMBY/gRDCilWHqH78Q8M+oDRt8t lKyw5uJNOaAHbAy4qBsXAWlYKzhZ3dCDAyGpxrPRc8k5kW9xsqEux2I60Nd2EqQo G6VuVdjkIDhtBSLIbuGUbn+wuO5TBvv+zTt4pP6Nj9tv+gwYoE/LrDf+643d8Jdy HFTKrJE6Sz6wjFW6XSJdQ5ed7iboer7n2kw9lzZEQOs/ygsIGFnUBeR2xRoMWQCo jSmTaPF9bfsptQAyNCrUvjvdY5qqHtRAIdLWxsJ/ICptHnE= -----END CERTIFICATE-----
[root@okd-1 core]# oc get csr NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION csr-4zbk8 19s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-5nbgm 16d kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-cck4v 16d kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-dxq7v 16d kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-k8cns 16d kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-lmw6j 16d kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-phl8h 16d kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-qn8pj 16d kubernetes.io/kubelet-serving system:node:okd-1.okd4.ktdemo.duckdns.org Approved,Issued csr-r8554 16d kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued system:openshift:openshift-authenticator-d7v5c 16d kubernetes.io/kube-apiserver-client system:serviceaccount:openshift-authentication-operator:authentication-operator Approved,Issued system:openshift:openshift-monitoring-np6r8 16d kubernetes.io/kube-apiserver-client system:serviceaccount:openshift-monitoring:cluster-monitoring-operator Approved,Issued
3번노드 재설치
[root@okd-1 core]# oc get csr NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION csr-4zbk8 67m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-5nbgm 16d kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-6b8z9 11m kubernetes.io/kubelet-serving system:node:okd-1.okd4.ktdemo.duckdns.org Approved,Issued csr-b2hhb 63m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-cck4v 17d kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-dxq7v 16d kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-dxtxg 32s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-k8cns 16d kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-lmw6j 16d kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-nr9dz 42s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
승인하고 나면
[root@okd-1 core]# oc get csr NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION csr-4zbk8 69m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-5nbgm 16d kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-6b8z9 13m kubernetes.io/kubelet-serving system:node:okd-1.okd4.ktdemo.duckdns.org Approved,Issued csr-6f5jq 14s kubernetes.io/kubelet-serving system:node:okd-3.okd4.ktdemo.duckdns.org Pending
4번 서버
[root@bastion ~]# oc get csr NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION csr-fzpbw 8s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-sphb7 19s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
[root@bastion ~]# oc get csr NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION csr-fzpbw 108s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued csr-nn266 4s kubernetes.io/kubelet-serving system:node:okd-4.okd4.ktdemo.duckdns.org Pending csr-sphb7 119s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued [root@bastion ~]# oc adm certificate approve csr-nn266 certificatesigningrequest.certificates.k8s.io/csr-nn266 approved