2021-01-07

Deploying rook-ceph on K8S

1 Rook Overview

1.1 Ceph Introduction

Ceph is a highly scalable distributed storage solution that provides object, file, and block storage. On each storage node you will find a filesystem holding Ceph's storage objects and a Ceph OSD (Object Storage Daemon) process. A Ceph cluster also runs Ceph MON (monitor) daemons, which keep the cluster highly available.

For a more detailed introduction to Ceph, see: https://www.cnblogs.com/itzgr/category/1382602.html

1.2 Rook Introduction

Rook is an open-source cloud-native storage orchestrator: it provides the platform, framework, and support for a variety of storage solutions to integrate natively with cloud-native environments. It currently focuses on file, block, and object storage services for cloud-native environments, turning them into self-managing, self-scaling, self-healing distributed storage services.

Rook automates deployment, bootstrapping, configuration, provisioning, scaling up/down, upgrades, migration, disaster recovery, monitoring, and resource management. To do all of this, Rook relies on the underlying container orchestration platform, such as Kubernetes.

Rook currently supports deploying Ceph, NFS, Minio Object Store, EdgeFS, Cassandra, and CockroachDB storage.

How Rook works:

  • Rook provides volume plugins that extend the K8S storage system, so that Pods (through the kubelet agent) can mount block devices and filesystems managed by Rook.
  • The Rook Operator starts and monitors the whole underlying storage system, such as the Ceph Pods and Ceph OSDs, and it also manages the CRDs, object stores, and filesystems.
  • The Rook Agent runs as a Pod on every K8S node. Each agent Pod is configured with a Flexvolume driver that integrates with the K8S volume control framework; node-local operations such as adding storage devices, mounting, formatting, and removing storage are carried out by this agent.

For more details, see the official sites: https://rook.io and https://ceph.com/

1.3 Rook Architecture

Rook architecture: (diagram not included)
Rook integrated with Kubernetes: (diagram not included)

2 Rook Deployment

2.1 Planning

Host          IP              Disk   Notes
k8smaster01   192.168.12.88   -      Kubernetes master node
k8smaster02   192.168.12.89   -      Kubernetes master node
k8smaster03   192.168.12.90   -      Kubernetes master node
k8snode01     192.168.12.91   sdb    Kubernetes node / Ceph node
k8snode02     192.168.12.92   sdb    Kubernetes node / Ceph node
k8snode03     192.168.12.93   sdb    Kubernetes node / Ceph node

Raw disk plan

        k8snode01   k8snode02   k8snode03
Disk    sdb         sdb         sdb

2.2 Get the YAML (clone the project)

# Downloads from GitHub can be slow; clone the repository in advance if possible
git clone --single-branch --branch v1.5.1 https://github.com/rook/rook.git

2.3 Deploy the Rook Operator

cd rook/cluster/examples/kubernetes/ceph
kubectl create -f crds.yaml -f common.yaml -f operator.yaml
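
Once the images from section 2.4 are available on the nodes, verify that the operator comes up; a quick check, assuming the default app=rook-ceph-operator label from operator.yaml:

kubectl -n rook-ceph get pod -l app=rook-ceph-operator   # should reach Running
kubectl -n rook-ceph get pod                             # the CSI driver Pods appear here as well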

2.4 Pull the images

The default images are hosted on registries outside China, so applying the YAML directly often fails because the pulls are too slow.
It is recommended to pull the images from the Aliyun mirrors first and then retag them with the official names.
# Pull the images
docker pull ceph/ceph:v15.2.5
docker pull rook/ceph:v1.5.1
docker pull registry.aliyuncs.com/it00021hot/cephcsi:v3.1.2
docker pull registry.aliyuncs.com/it00021hot/csi-node-driver-registrar:v2.0.1
docker pull registry.aliyuncs.com/it00021hot/csi-attacher:v3.0.0
docker pull registry.aliyuncs.com/it00021hot/csi-provisioner:v2.0.0
docker pull registry.aliyuncs.com/it00021hot/csi-snapshotter:v3.0.0
docker pull registry.aliyuncs.com/it00021hot/csi-resizer:v1.0.0

# Retag to the official names
docker tag registry.aliyuncs.com/it00021hot/csi-snapshotter:v3.0.0 k8s.gcr.io/sig-storage/csi-snapshotter:v3.0.0
docker tag registry.aliyuncs.com/it00021hot/csi-resizer:v1.0.0 k8s.gcr.io/sig-storage/csi-resizer:v1.0.0
docker tag registry.aliyuncs.com/it00021hot/cephcsi:v3.1.2 quay.io/cephcsi/cephcsi:v3.1.2
docker tag registry.aliyuncs.com/it00021hot/csi-node-driver-registrar:v2.0.1 k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.0.1
docker tag registry.aliyuncs.com/it00021hot/csi-attacher:v3.0.0 k8s.gcr.io/sig-storage/csi-attacher:v3.0.0
docker tag registry.aliyuncs.com/it00021hot/csi-provisioner:v2.0.0 k8s.gcr.io/sig-storage/csi-provisioner:v2.0.0
# Save the images
docker save \
ceph/ceph:v15.2.5 \
rook/ceph:v1.5.1  \
k8s.gcr.io/sig-storage/csi-snapshotter:v3.0.0 \
k8s.gcr.io/sig-storage/csi-resizer:v1.0.0 \
quay.io/cephcsi/cephcsi:v3.1.2 \
k8s.gcr.io/sig-storage/csi-attacher:v3.0.0 \
k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.0.1 \
k8s.gcr.io/sig-storage/csi-provisioner:v2.0.0 | gzip -1 > rook.tar
# Distribute the image archive to every node
yum install -y sshpass
echo 'StrictHostKeyChecking no'>>/etc/ssh/ssh_config
export SSHPASS='password' # SSH password
export ALL_IPS=(192.168.12.88 192.168.12.89 192.168.12.90 192.168.12.91 192.168.12.92 192.168.12.93)
export TAR_NAME=rook.tar
for NODE in ${ALL_IPS[*]} ; do
    echo ">>>>>${NODE}"
    sshpass -e scp ${TAR_NAME} root@"${NODE}":/root
    sshpass -e ssh root@"${NODE}" "docker load -i ${TAR_NAME} && rm -rf ${TAR_NAME} "
done
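
Optionally, the same loop variables can be reused to confirm that every node has loaded the images; a small sketch reusing SSHPASS and ALL_IPS from above:

for NODE in ${ALL_IPS[*]} ; do
    echo ">>>>>${NODE}"
    # list the Ceph/CSI images that should now be present on the node
    sshpass -e ssh root@"${NODE}" "docker images | grep -E 'ceph|csi'"
done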

2.5 Configure the cluster

vi cluster.yaml
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: ceph/ceph:v15.2.5
    allowUnsupported: false
  dataDirHostPath: /var/lib/rook
  skipUpgradeChecks: false
  mon:
    count: 3
    allowMultiplePerNode: false
  dashboard:
    enabled: true
    ssl: true # enable/disable SSL
  monitoring:
    enabled: false
    rulesNamespace: rook-ceph
  network:
    hostNetwork: false
#  rbdMirroring:  # causes an error in this Rook version, so it is commented out
#    workers: 0
  placement:                           # node affinity to ensure the chosen nodes are used as storage nodes
#    all:
#      nodeAffinity:
#        requiredDuringSchedulingIgnoredDuringExecution:
#          nodeSelectorTerms:
#          - matchExpressions:
#            - key: role
#              operator: In
#              values:
#              - storage-node
#      tolerations:
#      - key: storage-node
#        operator: Exists
    mon:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: ceph-mon
              operator: In
              values:
              - enabled
      tolerations:
      - key: ceph-mon
        operator: Exists
    osd:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: ceph-osd
              operator: In
              values:
              - enabled
      tolerations:
      - key: ceph-osd
        operator: Exists
    mgr:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: ceph-mgr
              operator: In
              values:
              - enabled
      tolerations:
      - key: ceph-mgr
        operator: Exists
  annotations:
  resources:
  removeOSDsIfOutAndSafeToRemove: false
  storage:
    useAllNodes: false                  # do not use all nodes
    useAllDevices: false                # do not use all devices
    deviceFilter: sdb
    config:
        metadataDevice:
        databaseSizeMB: "1024"
        journalSizeMB: "1024"
    nodes:
    - name: "k8snode01"                 #指定存储节点主机
      config:
        storeType: bluestore    #指定类型为裸磁盘
      devices:
      - name: "sdb"                         #指定磁盘为sdb
    - name: "k8snode02"
      config:
        storeType: bluestore
      devices:
      - name: "sdb"
    - name: "k8snode03"
      config:
        storeType: bluestore
      devices:
      - name: "sdb"
  disruptionManagement:
    managePodBudgets: false
    osdMaintenanceTimeout: 30
    manageMachineDisruptionBudgets: false
    machineDisruptionBudgetNamespace: openshift-machine-api
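
Because the placement section above pins daemons to nodes labeled ceph-mon=enabled, ceph-osd=enabled and ceph-mgr=enabled, those labels must be applied before the cluster is created. A minimal sketch, assuming the mon and mgr daemons should also run on the three Ceph nodes:

# label the storage nodes so the nodeAffinity rules in cluster.yaml can match them
kubectl label nodes k8snode01 k8snode02 k8snode03 ceph-osd=enabled
kubectl label nodes k8snode01 k8snode02 k8snode03 ceph-mon=enabled
kubectl label nodes k8snode01 k8snode02 k8snode03 ceph-mgr=enabled
kubectl get nodes --show-labels | grep ceph   # verify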

2.6 Deploy the cluster and the Toolbox

kubectl create -f cluster.yaml
kubectl create -f toolbox.yaml
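
Creating the cluster takes several minutes. Before testing, it is worth waiting until the rook-ceph-osd-prepare jobs have completed and the mon, mgr and osd Pods are Running, for example:

kubectl -n rook-ceph get pod -w                            # watch the cluster come up (Ctrl-C to stop)
kubectl -n rook-ceph get pod -l app=rook-ceph-osd          # one osd Pod per disk should be Running
kubectl -n rook-ceph get pod -l app=rook-ceph-osd-prepare  # the prepare jobs should be Completed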

3 Testing Rook

3.1 Check the Ceph cluster status

The cluster is considered healthy when the following conditions are met:

  • all mons have reached quorum
  • the mgr is active
  • at least one OSD is active

If the status is not HEALTH_OK, check the warning or error messages.

Enter the toolbox container:

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash
ceph status
  cluster:
    id:     be0ad378-ad31-4745-9e08-e72200021f37
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum a,b,c (age 41h)
    mgr: a(active, since 94s)
    mds: myfs:1 {0=myfs-a=up:active} 1 up:standby-replay
    osd: 3 osds: 3 up (since 41h), 3 in (since 41h)
 
  task status:
    scrub status:
        mds.myfs-a: idle
        mds.myfs-b: idle
 
  data:
    pools:   4 pools, 97 pgs
    objects: 31 objects, 4.9 KiB
    usage:   3.0 GiB used, 147 GiB / 150 GiB avail
    pgs:     97 active+clean
 
  io:
    client:   1.3 KiB/s rd, 170 B/s wr, 2 op/s rd, 0 op/s wr
#ceph osd status
#ceph osd df
#ceph osd utilization
#ceph osd pool stats
#ceph osd tree
#ceph pg stat
#ceph df
#rados df
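
The same checks can also be run from the host without opening an interactive shell, by passing a command straight to kubectl exec, e.g.:

NAME=$(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}')
kubectl -n rook-ceph exec ${NAME} -- ceph osd status   # run a single command through the toolbox Pod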

3.2 Ceph cluster dashboard

#vi dashboard-external-https.yaml
apiVersion: v1
kind: Service
metadata:
  name: rook-ceph-mgr-dashboard-external-https
  namespace: rook-ceph
  labels:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph
spec:
  ports:
  - name: dashboard
    port: 8443
    protocol: TCP
    targetPort: 8443
  selector:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph
  sessionAffinity: None
  type: NodePort
kubectl create -f dashboard-external-https.yaml

Logging in to the dashboard requires credentials. Rook creates a default user named admin in the namespace where the Rook Ceph cluster is running and generates a secret named rook-ceph-dashboard-password.

To retrieve the generated password, run the following command:

kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
[root@k8smaster01 ~]# kubectl get svc -n rook-ceph | grep dashboard-external-https
rook-ceph-mgr-dashboard-external-https   NodePort    10.111.29.80    <none>        8443:32477/TCP      41h

https://192.168.12.91:32477/

3.3 Basic inspection of the rook-ceph cluster from the K8S hosts

3.3.1 Copy the keyring and config

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') cat /etc/ceph/ceph.conf > /etc/ceph/ceph.conf
kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') cat /etc/ceph/keyring > /etc/ceph/keyring
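
Both redirects assume that /etc/ceph already exists on the host; if it does not, create it first:

mkdir -p /etc/ceph   # target directory for ceph.conf and keyring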

3.3.2 Configure the Ceph yum repository

[root@k8smaster01 ceph]# tee /etc/yum.repos.d/ceph.repo <<-'EOF'
[Ceph]
name=Ceph packages for $basearch
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/$basearch
enabled=1
gpgcheck=0
type=rpm-md
gpgkey=https://mirrors.aliyun.com/ceph/keys/release.asc
priority=1

[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/noarch
enabled=1
gpgcheck=0
type=rpm-md
gpgkey=https://mirrors.aliyun.com/ceph/keys/release.asc
priority=1

[ceph-source]
name=Ceph source packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/SRPMS
enabled=1
gpgcheck=0
type=rpm-md
gpgkey=https://mirrors.aliyun.com/ceph/keys/release.asc
priority=1
EOF

3.3.3 Install the client

yum -y install ceph-common ceph-fuse
# After this, query commands can be run directly from the k8smaster nodes
[root@k8smaster01 ~]# ceph status
  cluster:
    id:     be0ad378-ad31-4745-9e08-e72200021f37
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum a,b,c (age 42h)
    mgr: a(active, since 100m)
    mds: myfs:1 {0=myfs-a=up:active} 1 up:standby-replay
    osd: 3 osds: 3 up (since 43h), 3 in (since 43h)
 
  task status:
    scrub status:
        mds.myfs-a: idle
        mds.myfs-b: idle
 
  data:
    pools:   4 pools, 97 pgs
    objects: 31 objects, 4.9 KiB
    usage:   3.0 GiB used, 147 GiB / 150 GiB avail
    pgs:     97 active+clean
 
  io:
    client:   1.2 KiB/s rd, 2 op/s rd, 0 op/s wr

3.4 Block storage creation and testing

3.4.1 Create WordPress for testing

cd /tmp/rook/cluster/examples/kubernetes/ceph/csi/rbd
sed -i 's/failureDomain: host/failureDomain: osd/g' storageclass.yaml
kubectl apply -f storageclass.yaml
kubectl get sc -n rook-ceph
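
As a quick sanity check before the WordPress test, a standalone PVC can be bound against the new StorageClass; a minimal sketch, assuming the default rook-ceph-block name from the upstream storageclass.yaml and a hypothetical PVC name:

# request a small RBD-backed volume and confirm that it binds
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc-test
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-ceph-block
EOF
kubectl get pvc rbd-pvc-test        # STATUS should become Bound
kubectl delete pvc rbd-pvc-test     # clean up the test claim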

# Create WordPress for testing
cd /tmp/rook/cluster/examples/kubernetes
sed -i 's|mysql:5.6|registry.cn-hangzhou.aliyuncs.com/vinc-auto/mysql:5.6|g' mysql.yaml
sed -i 's|wordpress:4.6.1-apache|registry.cn-hangzhou.aliyuncs.com/vinc-auto/wordpress:4.6.1-apache|g' wordpress.yaml
sed -i 's/LoadBalancer/NodePort/g' wordpress.yaml
kubectl create -f mysql.yaml
kubectl create -f wordpress.yaml

kubectl get pvc -o wide
kubectl get deploy -o wide
kubectl get pod -o wide
kubectl get service -o wide
kubectl get svc wordpress -o wide
# Open WordPress in a browser to complete the setup

# Check the related data in the Ceph cluster
kubectl -n rook-ceph get pod -l "app=rook-ceph-tools"
NAME=$(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}')
kubectl -n rook-ceph exec -it ${NAME} sh
ceph osd pool stats
rbd ls -p replicapool
rbd info replicapool/'csi-vol-a15dc75d-69a0-11ea-a3b7-2ef116ca54b6'
rbd info replicapool/'csi-vol-a18385ed-69a0-11ea-a3b7-2ef116ca54b6'
exit

# Tear down the test environment
cd /tmp/rook/cluster/examples/kubernetes
kubectl delete -f wordpress.yaml
kubectl delete -f mysql.yaml
kubectl delete -n rook-ceph cephblockpools.ceph.rook.io replicapool
kubectl delete storageclass rook-ceph-block

3.5 CephFS creation and testing

  • CephFS allows users to mount a POSIX-compatible shared directory on multiple hosts; it is similar to NFS shared storage or a CIFS share
# filesystem.yaml: production configuration with 3 replicas, requires at least 3 nodes
# filesystem-ec.yaml: production configuration with erasure coding, requires at least 3 nodes
# filesystem-test.yaml: test configuration with 1 replica, requires only one node
cd /tmp/rook/cluster/examples/kubernetes/ceph
sed -i 's/failureDomain: host/failureDomain: osd/g' filesystem.yaml
kubectl apply -f filesystem.yaml
kubectl -n rook-ceph get pod -l app=rook-ceph-mds

# Quick check
kubectl -n rook-ceph get pod -l "app=rook-ceph-tools"
NAME=$(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}')
kubectl -n rook-ceph exec -it ${NAME} sh
ceph status
ceph osd lspools
ceph mds stat
ceph fs ls
exit
  • To use CephFS, the corresponding StorageClass must be created first
cd /tmp/rook/cluster/examples/kubernetes/ceph/csi/cephfs/
kubectl apply -f storageclass.yaml
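
As with the block StorageClass, a ReadWriteMany PVC is a quick way to confirm that CephFS provisioning works; a minimal sketch, assuming the default rook-cephfs name from the upstream csi/cephfs/storageclass.yaml and a hypothetical PVC name:

# request a shared CephFS-backed volume and confirm that it binds
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-pvc-test
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-cephfs
EOF
kubectl get pvc cephfs-pvc-test     # STATUS should become Bound
kubectl delete pvc cephfs-pvc-test  # clean up the test claim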

Testing

# Deploy a multi-replica private registry sharing the same data directory for testing
docker pull registry:2
kubectl create -f kube-registry.yaml
# This creates a deployment in kube-system that acts as a private registry
# /var/lib/registry is mounted on CephFS and shared by the 3 replicas
kubectl get pod -n kube-system -l k8s-app=kube-registry -o wide
kubectl -n kube-system exec -it kube-registry-65df7d789d-9bwzn sh
df -hP|grep '/var/lib/registry'
cd /var/lib/registry
touch abc
exit
kubectl -n kube-system exec -it kube-registry-65df7d789d-sf55j ls /var/lib/registry

# Clean up
cd /tmp/rook/cluster/examples/kubernetes/ceph/csi/cephfs/
kubectl delete -f kube-registry.yaml
kubectl delete -f storageclass.yaml
cd /tmp/rook/cluster/examples/kubernetes/ceph
kubectl delete -f filesystem.yaml

3.6 Object storage creation and testing

3.6.1 Create a CephObjectStore

Before object storage can be served, the backing resources need to be created first. The default YAML provided upstream, shown below, deploys a CephObjectStore.

#kubectl create -f object.yaml
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: my-store
  namespace: rook-ceph
spec:
  metadataPool:
    failureDomain: host
    replicated:
      size: 3
  dataPool:
    failureDomain: host
    replicated:
      size: 3
  preservePoolsOnDelete: false
  gateway:
    type: s3
    sslCertificateRef:
    port: 80
    securePort:
    instances: 1
    placement:
    annotations:
    resources:

kubectl -n rook-ceph get pod -l app=rook-ceph-rgw # once the deployment completes, an rgw Pod is created

3.6.2 Create the StorageClass

The default YAML provided upstream, shown below, deploys the StorageClass for object storage.

#kubectl create -f storageclass-bucket-delete.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
   name: rook-ceph-delete-bucket
provisioner: ceph.rook.io/bucket
reclaimPolicy: Delete
parameters:
  objectStoreName: my-store
  objectStoreNamespace: rook-ceph
  region: us-east-1

kubectl get sc # check that the StorageClass was created successfully

3.6.3 Create a bucket

# kubectl create -f object-bucket-claim-delete.yaml
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: ceph-delete-bucket
spec:
  generateBucketName: ceph-bkt
  storageClassName: rook-ceph-delete-bucket
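
The claim creates the bucket plus a ConfigMap and a Secret with the same name in the claim's namespace; these are what section 3.6.4 reads, and they can be checked with:

kubectl get obc ceph-delete-bucket                  # the ObjectBucketClaim should become Bound
kubectl -n default get cm ceph-delete-bucket        # holds BUCKET_HOST, BUCKET_NAME, etc.
kubectl -n default get secret ceph-delete-bucket    # holds the S3 access key pair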

3.6.4 Configure access to the object store

kubectl -n default get cm ceph-delete-bucket -o yaml | grep BUCKET_HOST | awk '{print $2}'
rook-ceph-rgw-my-store.rook-ceph
kubectl -n rook-ceph get svc rook-ceph-rgw-my-store
export AWS_HOST=$(kubectl -n default get cm ceph-delete-bucket -o yaml | grep BUCKET_HOST | awk '{print $2}')
export AWS_ACCESS_KEY_ID=$(kubectl -n default get secret ceph-delete-bucket -o yaml | grep AWS_ACCESS_KEY_ID | awk '{print $2}' | base64 --decode)
export AWS_SECRET_ACCESS_KEY=$(kubectl -n default get secret ceph-delete-bucket -o yaml | grep AWS_SECRET_ACCESS_KEY | awk '{print $2}' | base64 --decode)
export AWS_ENDPOINT='10.102.165.187' # ClusterIP of the rook-ceph-rgw-my-store service
echo '10.102.165.187 rook-ceph-rgw-my-store.rook-ceph' >> /etc/hosts # map the rgw hostname to that IP
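
The hard-coded 10.102.165.187 above is the ClusterIP printed by the rook-ceph-rgw-my-store service query; if preferred, it can be captured directly, for example:

export AWS_ENDPOINT=$(kubectl -n rook-ceph get svc rook-ceph-rgw-my-store -o jsonpath='{.spec.clusterIP}')
echo ${AWS_ENDPOINT}    # should match the IP added to /etc/hosts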

3.6.5 Test access

radosgw-admin bucket list               # list buckets
yum --assumeyes install s3cmd           # install the S3 client
echo "Hello Rook" > /tmp/rookObj        # create a test file
s3cmd put /tmp/rookObj --no-ssl --host=${AWS_HOST} --host-bucket= s3://ceph-bkt-377bf96f-aea8-4838-82bc-2cb2c16cccfb/test.txt                   # upload it to the bucket as a test
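
To confirm the round trip, the object can be downloaded back and inspected (the bucket name comes from the radosgw-admin output above and will differ per cluster):

s3cmd get s3://ceph-bkt-377bf96f-aea8-4838-82bc-2cb2c16cccfb/test.txt /tmp/rookObj-download --no-ssl --host=${AWS_HOST} --host-bucket=    # download the object
cat /tmp/rookObj-download               # should print "Hello Rook"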