Kubernetes Installation Manual

Kubernetes logical deployment

Machine list:

Role    Machines                                         Modules                                                                 Spec
master  172.31.101.175, 172.31.101.176, 172.31.101.177   etcd, kube-apiserver, kube-scheduler, kube-controller-manager, flannel  4 GB RAM, 4-core CPU
node    172.31.101.172, 172.31.101.173, 172.31.101.174   kubelet, kube-proxy, docker, flannel                                    8 GB RAM, 8-core CPU
node    172.31.101.115, 172.31.101.116, 172.31.101.117   kubelet, kube-proxy, docker, flannel                                    32 GB RAM, 8-core CPU
The master nodes can do without docker, but they must have flannel installed; otherwise some forwarded requests will not work.
Machine initialization

The machines run CentOS 7. The repo configuration under /etc/.d/ in the offline environment differs from the online one, so some installed packages are not the latest. Note: the separate data disk /data must be formatted as ext4, otherwise there will be problems. Do the following:

1. Sync 172.31.101.166:/etc/.d/* to the local /etc/.d/ directory.
2. Run echo 'overlay' > /etc/modules-load.d/ to enable overlay module support.
3. Run yum -y update and wait for the system update to finish.
4. Run yum -y install flannel docker to install docker.
5. Run yum -y install lrzsz telnet strace bridge-utils to get tools that make later troubleshooting easier.
6. Run yum -y install ceph-common.
7. Run pip install docker-compose.
8. Run systemctl enable docker so docker starts on boot.
9. Upgrade the kernel to the latest 4.x release, as required by the ceph distributed storage.
10. Run reboot to restart the system.

Upgrade the CentOS 7 kernel to 4.9

rpm --import /
rpm -Uvh /
yum -y install yum-plugin-fastestmirror
yum -y --enablerepo=elrepo-kernel install kernel-ml
grub2-set-default 0

After upgrading to 4.x there were problems in how systemd/kubelet/docker work together, so we temporarily rolled back.

Set up the Aliyun mirror for registry acceleration
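Step 9's 4.x requirement can be sanity-checked before and after the kernel upgrade. A minimal sketch; the helper name is ours, not from the original manual:

```shell
# Check that a kernel release string (as printed by `uname -r`) is at least 4.x,
# which is what the ceph setup here requires.
kernel_is_4x() {
  major=${1%%.*}        # take everything before the first dot
  [ "$major" -ge 4 ]
}

kernel_is_4x "3.10.0-514.el7.x86_64" || echo "stock CentOS 7 kernel, needs upgrade"
kernel_is_4x "4.9.75-1.el7.elrepo.x86_64" && echo "new enough for ceph"
# on a real node: kernel_is_4x "$(uname -r)" || echo "upgrade needed"
```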
Write the following into the daemon config under /etc/docker/:

{
  "storage-driver": "overlay",
  "live-restore": true,
  "registry-mirrors": [""]
}

Move data directories onto the /data disk

To avoid filling up the root filesystem, relocate the docker directory:

# docker state is stored under /var/lib/docker
mv /var/lib/docker /data/
ln -sf /data/docker /var/lib/docker

# pod state is stored under /var/lib/kubelet; create that directory first if it does not exist
mv /var/lib/kubelet /data/
ln -sf /data/kubelet /var/lib/kubelet
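Since the same move-and-symlink steps apply to both directories, they can be wrapped in a small helper. A sketch (the function name is ours; stop docker and kubelet before doing this for real):

```shell
# Move a state directory onto the data disk and leave a symlink behind,
# so services keep using their original paths.
relocate_dir() {
  src=$1; data_root=$2
  name=$(basename "$src")
  mkdir -p "$src"                  # e.g. /var/lib/kubelet may not exist yet
  mv "$src" "$data_root/"
  ln -sf "$data_root/$name" "$src"
}

# real usage (after systemctl stop docker kubelet):
# relocate_dir /var/lib/docker  /data
# relocate_dir /var/lib/kubelet /data
```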
Allow http access to the harbor registry

Edit /etc/sysconfig/docker:

OPTIONS='--selinux-enabled --log-driver=journald --signature-verification=false'
INSECURE_REGISTRY='--insecure-registry '

Then reload the configuration:

1. systemctl daemon-reload
2. systemctl restart docker

Install kubernetes

Disable the firewall

There is heavy network traffic between master and node, and everything runs on the internal network, so disable the firewall to avoid hurting throughput:

1. systemctl disable firewalld to remove it from boot startup
2. systemctl stop firewalld to stop the firewall service
3. setenforce 0 to disable SELinux

Install the etcd cluster

yum -y install etcd (the current version is 3.1.0). Configuration reference:
Item              Value                          Notes
Machines          172.31.101.175 / 176 / 177     at least 3 machines form one cluster
Config location   /etc/etcd/
Data directory    /data/etcd-storage             must be made readable and writable by the etcd account
Service account   etcd                           do not start etcd as root
Unit file         /usr/lib/systemd/system/e      etcd runs again automatically after a reboot

Start on boot:

systemctl enable etcd

etcd configuration

The three nodes are configured identically except for ETCD_NAME and their own address. The config for 172.31.101.175 (vlnx101175); 176 and 177 differ only in name and IP:

ETCD_NAME=vlnx101175
ETCD_DATA_DIR="/data/etcd-storage/"
ETCD_HEARTBEAT_INTERVAL="1000"
ETCD_ELECTION_TIMEOUT="10000"
ETCD_LISTEN_PEER_URLS=""
ETCD_LISTEN_CLIENT_URLS="172.31.101.175:2379,127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS=""
ETCD_INITIAL_CLUSTER="vlnx101175=,vlnx101176=,vlnx101177="
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-k8s"
ETCD_ADVERTISE_CLIENT_URLS=""

Verify etcd is running

Run etcdctl cluster-health to check that the cluster is working.
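Because the three config files differ only in name and IP, they can be rendered from one template. A sketch: the helper name is ours, and since the URL values above came through blank, the plain-http peer/client URLs on ports 2380/2379 here are an assumption, not the original values:

```shell
# Render an /etc/etcd/ config for one cluster member.
# Assumes plain http and the conventional etcd ports 2380 (peer) / 2379 (client).
gen_etcd_conf() {
  name=$1; ip=$2; out=$3
  cat > "$out" <<EOF
ETCD_NAME=$name
ETCD_DATA_DIR="/data/etcd-storage/"
ETCD_HEARTBEAT_INTERVAL="1000"
ETCD_ELECTION_TIMEOUT="10000"
ETCD_LISTEN_PEER_URLS="http://$ip:2380"
ETCD_LISTEN_CLIENT_URLS="http://$ip:2379,http://127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://$ip:2380"
ETCD_INITIAL_CLUSTER="vlnx101175=http://172.31.101.175:2380,vlnx101176=http://172.31.101.176:2380,vlnx101177=http://172.31.101.177:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-k8s"
ETCD_ADVERTISE_CLIENT_URLS="http://$ip:2379"
EOF
}

# real usage on each node, e.g.:
# gen_etcd_conf vlnx101175 172.31.101.175 /etc/etcd/etcd.conf
```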
Install the master

Prepare the TLS files

Copy the certificate files from the old k8s cluster (172.31.101.119:/key1) to /etc/kubernetes/ssl/ on every master node. The files are:

File      Purpose
basic_    basic auth user and password
          Certificate Authority cert
          Client certificate, public key
          Client certificate, private key
          Server certificate, public key
          Server certificate, private key
known_    tokens that entities (e.g. the kubelet) can use to talk to the apiserver

Install via yum

Run yum -y install kubernetes-master. This currently installs 1.5.2; we use it to get all the service units set up, then manually upgrade to the latest 1.6.0. Run rpm -ql kubernetes-master to see which files were installed.

Edit the master configuration

Edit the following files under /etc/kubernetes/:

config:

KUBE_LOGTOSTDERR="--logtostderr=true"
KUBE_LOG_LEVEL="--v=0"
KUBE_ALLOW_PRIV="--allow-privileged=true"
KUBE_MASTER="--master="

apiserver (supports ssl/basic/token authentication):

KUBE_API_ADDRESS="--insecure-bind-address=0.0.0.0"
KUBE_API_PORT="--insecure-port=8080 --secure-port=6443"
KUBE_ETCD_SERVERS="--etcd-servers="
KUBE_SERVICE_ADDRESSES="--service-cluster-ip-range=10.137.0.0/16"
KUBE_ADMISSION_CONTROL="--admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota"
KUBE_API_ARGS="--service-node-port-range=80-50000 --client-ca-file=/etc/kubernetes/ssl/ --tls-cert-file=/etc/kubernetes/ssl/ --tls-private-key-file=/etc/kubernetes/ssl/ --basic-auth-file=/etc/kubernetes/ssl/basic_ --token-auth-file=/etc/kubernetes/ssl/known_"

controller-manager (the authentication flags are required; without them, creating a deploy fails with: No API token found for service account "default", retry after the token is automatically created and added to the service account):

KUBE_CONTROLLER_MANAGER_ARGS="--leader-elect=true --leader-elect-lease-duration=150s --leader-elect-renew-deadline=100s --leader-elect-retry-period=20s --root-ca-file=/etc/kubernetes/ssl/ --service-account-private-key-file=/etc/kubernetes/ssl/"

scheduler:

KUBE_SCHEDULER_ARGS="--leader-elect=true --leader-elect-lease-duration=150s --leader-elect-renew-deadline=100s --leader-elect-retry-period=20s"
Enable on boot:

systemctl daemon-reload
systemctl enable kube-apiserver
systemctl enable kube-scheduler
systemctl enable kube-controller-manager

Start the master services:

systemctl start kube-apiserver
systemctl start kube-scheduler
systemctl start kube-controller-manager
Verify the components

Run kubectl get componentstatuses to check that every component is healthy.

The apiserver is stateless, so with multiple master nodes you can put nginx, haproxy, etc. in front of it for high availability and load balancing. The scheduler and controller-manager, however, operate on the etcd backing store and are stateful; to avoid inconsistencies from concurrent writes to that store, only one of the three nodes' scheduler and controller-manager instances serves at a time. The three nodes elect a leader among themselves, and only the leader serves. It is therefore important to verify that leader election works. To see which of the three scheduler and controller-manager instances is currently the leader:

kubectl get endpoints kube-controller-manager --namespace=kube-system -o yaml
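In that output, the election winner is recorded in the control-plane.alpha.kubernetes.io/leader annotation, whose holderIdentity field names the current leader. A sketch of pulling it out; the annotation value below is made-up sample data:

```shell
# Made-up sample of the leader annotation carried by the
# kube-controller-manager endpoints object in k8s 1.x:
leader_annotation='{"holderIdentity":"vlnx101175","leaseDurationSeconds":150,"renewTime":"2017-04-01T08:00:00Z"}'

# Extract holderIdentity, i.e. which master currently serves
leader=$(printf '%s' "$leader_annotation" | sed -n 's/.*"holderIdentity":"\([^"]*\)".*/\1/p')
echo "current controller-manager leader: $leader"
```

On a live cluster, the same extraction can be applied to the annotation printed by the kubectl command above.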
Upgrade to the new version

Download path: /kubernetes/kubernetes/blob/master/#downloads-for-v160

Download the release tarball from the path above, upload it to the servers, unpack it into /opt/fs/kubernetes, and copy the files in its bin directory into the system /bin/ directory. Caveats:

1. Fix the execute permission: chmod a+x /bin/kube* . This is a nasty gotcha; without it the services fail to start with a permission error.
2. Delete the files under /var/run/kubernetes/apiserver*, otherwise the apiserver will not start.

Then restart the master services.
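The unpack/copy/chmod steps can be scripted. A sketch, under the assumption that the 1.6 server tarball unpacks to kubernetes/server/bin; the helper name is ours:

```shell
# Install kubernetes binaries from a release tarball into a bin directory,
# including the chmod that the caveats above warn about.
install_kube_bins() {
  tarball=$1; extract_dir=$2; bin_dir=$3
  mkdir -p "$extract_dir" "$bin_dir"
  tar -xzf "$tarball" -C "$extract_dir"
  cp "$extract_dir"/kubernetes/server/bin/kube* "$bin_dir"/
  chmod a+x "$bin_dir"/kube*    # without this the services refuse to start
}

# real usage (as root), then clear stale apiserver runtime files:
# install_kube_bins kubernetes-server-linux-amd64.tar.gz /opt/fs/kubernetes /bin
# rm -f /var/run/kubernetes/apiserver*
```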
Install the node

Run yum -y install kubernetes-node. This currently installs 1.5.2; as on the master, we use it to get the service units set up, then manually upgrade to the latest 1.6.0.
Install flanneld

(Note: flannel must also be installed and started on every master node, with the same configuration as below.)

yum -y install flannel
systemctl daemon-reload
systemctl enable flanneld
systemctl start flanneld

Edit the flannel configuration

Edit /etc/sysconfig/flanneld and point it at etcd:

FLANNEL_ETCD_ENDPOINTS=""
FLANNEL_ETCD_PREFIX="/"

On a master machine, write flannel's network range into etcd. (SubnetLen is the mask length of the pod network handed to each node, so it indirectly caps how many pods can run per node; see the reference docs.)

etcdctl set / '{ "Network": "10.132.0.0/16", "SubnetLen": 24 }'

When flanneld starts on each node, it automatically claims a subnet from this range and registers it in etcd.

Adjust the docker configuration

Remove the bip setting: once flannel is running, each node's bip range is assigned automatically and must not be specified in docker's configuration. Also specify systemd as the cgroup driver. /etc/sysconfig/docker:

OPTIONS='--selinux-enabled=false --log-driver=journald --signature-verification=false'
if [ -z "${DOCKER_CERT_PATH}" ]; then
    DOCKER_CERT_PATH=/etc/docker
fi
INSECURE_REGISTRY='--insecure-registry '
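With the values written to etcd above, the SubnetLen arithmetic works out as follows: a /16 network carved into /24 node subnets gives one /24 per node.

```shell
# Pod network sizing from { "Network": "10.132.0.0/16", "SubnetLen": 24 }
network_bits=16
subnet_len=24

subnets=$(( 1 << (subnet_len - network_bits) ))       # how many node subnets fit in the /16
pods_per_node=$(( (1 << (32 - subnet_len)) - 2 ))     # usable addresses in each /24

echo "$subnets node subnets, up to $pods_per_node pod IPs per node"
```

So this cluster can hold up to 256 nodes, each with roughly 254 pod addresses.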
How the flannel network works

Container traffic is forwarded through the host's docker0 virtual bridge to the flannel0 virtual interface, a P2P virtual device with the flanneld service listening on the other end. Flannel maintains the inter-node routing table in etcd.
Edit the node configuration

Edit the following files under /etc/kubernetes/:

config:

KUBE_LOGTOSTDERR="--logtostderr=true"
KUBE_LOG_LEVEL="--v=0"
KUBE_ALLOW_PRIV="--allow-privileged=true"
KUBE_MASTER="--master="

kubelet. Notes: there is no need to configure override-hostname, the machine's own hostname works; cgroup-driver=systemd must be set, otherwise the kubelet will not start; comment out KUBELET_HOSTNAME or change it to the node's real hostname, otherwise kubectl get node on the master will only show a single 127.0.0.1:

KUBELET_ADDRESS="--address=0.0.0.0"
KUBELET_PORT="--port=10250"
#KUBELET_HOSTNAME="--hostname-override=127.0.0.1"
KUBELET_API_SERVER="--kubeconfig=/var/lib/kubelet/kubeconfig"
KUBELET_POD_INFRA_CONTAINER="--pod-infra-container-image="
KUBELET_ARGS="--cgroup-driver=systemd --require-kubeconfig --cluster_dns=10.137.254.254 --cluster_domain="

proxy:

KUBE_PROXY_ARGS=""

/var/lib/kubelet/kubeconfig:

apiVersion: v1
clusters:
- cluster:
    insecure-skip-tls-verify: true
    server:
  name: k8s
contexts:
- context:
    cluster: k8s
    user: ""
  name: firstshare
current-context: firstshare
kind: Config
preferences: {}
users: []

Start-up ordering

1. docker must start after flanneld
2. kubelet must start after flanneld and docker

cat /usr/lib/systemd/system/e
cat /usr/lib/systemd/system/e

Enable on boot:

systemctl daemon-reload
systemctl enable flanneld
systemctl enable docker
systemctl enable kube-proxy
systemctl enable kubelet

Start the node services:

systemctl start flanneld
systemctl start docker
systemctl start kube-proxy
systemctl start kubelet

Upgrade to the new version

Download path:
Download the tarball from the path above, upload it to each server, unpack it into /opt/fs/kubernetes, and copy the files in its bin directory into the system /bin/ directory. Caveats:

1. Fix the execute permission: chmod a+x /bin/kube* . This is a nasty gotcha; without it the services fail to start with a permission error.

Then restart the node services.

Check that the nodes are healthy

On the master node, run kubectl get node. If nodes show NotReady, rebooting all the node machines may be the only way to bring them back to normal.
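A quick way to pick out the NotReady nodes from that command's output; the sample output below is made up for illustration:

```shell
# Illustrative sample of `kubectl get node` output:
nodes='NAME            STATUS     AGE
172.31.101.172  Ready      1d
172.31.101.173  NotReady   1d
172.31.101.174  Ready      1d'

# List the nodes that need attention (skip the header row)
not_ready=$(printf '%s\n' "$nodes" | awk 'NR>1 && $2=="NotReady" {print $1}')
echo "$not_ready"
```

On a live master this would be `kubectl get node | awk 'NR>1 && $2=="NotReady" {print $1}'`.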