Test Environment

Node IP         Roles
192.168.1.10    mon, osd, rgw
192.168.1.11    mon, osd, rgw
192.168.1.12    mon, osd, rgw

Preparation

1. Configure the yum repository for the Luminous upgrade
  # cat ceph-luminous.repo
  [ceph]
  name=x86_64
  baseurl=https://mirrors.aliyun.com/ceph/rpm-luminous/el7/x86_64/
  gpgcheck=0
  [ceph-noarch]
  name=noarch
  baseurl=https://mirrors.aliyun.com/ceph/rpm-luminous/el7/noarch/
  gpgcheck=0
  [ceph-aarch64]
  name=aarch64
  baseurl=https://mirrors.aliyun.com/ceph/rpm-luminous/el7/aarch64/
  gpgcheck=0
  [ceph-SRPMS]
  name=SRPMS
  baseurl=https://mirrors.aliyun.com/ceph/rpm-luminous/el7/SRPMS/
  gpgcheck=0

Copy the generated repo file to every node and remove the original Jewel repo file:

  # ansible node -m copy -a 'src=ceph-luminous.repo dest=/etc/yum.repos.d/ceph-luminous.repo'
  # ansible node -m file -a 'name=/etc/yum.repos.d/ceph-jewel.repo state=absent'
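
Before going further, it may be worth refreshing the yum metadata on every node and confirming that 12.2.x packages are now visible from the new repo (a quick sanity check, not part of the original procedure):

  # ansible node -m shell -a 'yum clean all && yum makecache fast'
  # ansible node -m shell -a 'yum list ceph --showduplicates | tail -n 3'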
2. Set sortbitwise

If this flag is not set, data loss can occur during the upgrade:

  # ceph osd set sortbitwise
3. Set noout

This prevents data from being rebalanced while the upgrade is in progress; simply unset it once the upgrade is finished:

  # ceph osd set noout

With the flags set, the cluster status looks like this:

  # ceph -s
  cluster 0d5eced9-8baa-48be-83ef-64a7ef3a8301
   health HEALTH_WARN
          noout flag(s) set
   monmap e1: 3 mons at {node1=192.168.1.10:6789/0,node2=192.168.1.11:6789/0,node3=192.168.1.12:6789/0}
          election epoch 26, quorum 0,1,2 node1,node2,node3
   osdmap e87: 9 osds: 9 up, 9 in
          flags noout,sortbitwise,require_jewel_osds
    pgmap v267: 112 pgs, 7 pools, 3084 bytes data, 173 objects
          983 MB used, 133 GB / 134 GB avail
          112 active+clean
4. Luminous requires an explicit setting before pools can be deleted; add "mon allow pool delete = true" to the Ceph configuration file on every mon node:
  # ansible node -m shell -a 'echo "mon allow pool delete = true" >> /etc/ceph/ceph.conf'
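
Appending to the end of ceph.conf assumes that [global] is the last (or only) section in the file. If that is not guaranteed, Ansible's ini_file module can place the option in the [global] section explicitly (an alternative sketch, not part of the original procedure). Either way, the option only takes effect once the mons are restarted, which happens later in the upgrade anyway:

  # ansible node -m ini_file -a 'dest=/etc/ceph/ceph.conf section=global option="mon allow pool delete" value=true'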

Performing the Upgrade

1. Confirm the Ceph package versions currently installed on the cluster nodes:
  # ansible node -m shell -a 'rpm -qa | grep ceph'
  [WARNING]: Consider using yum, dnf or zypper module rather than running rpm
  node1 | SUCCESS | rc=0 >>
  ceph-selinux-10.2.11-0.el7.x86_64
  ceph-10.2.11-0.el7.x86_64
  ceph-deploy-1.5.39-0.noarch
  libcephfs1-10.2.11-0.el7.x86_64
  python-cephfs-10.2.11-0.el7.x86_64
  ceph-base-10.2.11-0.el7.x86_64
  ceph-mon-10.2.11-0.el7.x86_64
  ceph-osd-10.2.11-0.el7.x86_64
  ceph-radosgw-10.2.11-0.el7.x86_64
  ceph-common-10.2.11-0.el7.x86_64
  ceph-mds-10.2.11-0.el7.x86_64

  node3 | SUCCESS | rc=0 >>
  ceph-mon-10.2.11-0.el7.x86_64
  ceph-radosgw-10.2.11-0.el7.x86_64
  ceph-common-10.2.11-0.el7.x86_64
  libcephfs1-10.2.11-0.el7.x86_64
  python-cephfs-10.2.11-0.el7.x86_64
  ceph-selinux-10.2.11-0.el7.x86_64
  ceph-mds-10.2.11-0.el7.x86_64
  ceph-10.2.11-0.el7.x86_64
  ceph-base-10.2.11-0.el7.x86_64
  ceph-osd-10.2.11-0.el7.x86_64

  node2 | SUCCESS | rc=0 >>
  ceph-mds-10.2.11-0.el7.x86_64
  python-cephfs-10.2.11-0.el7.x86_64
  ceph-base-10.2.11-0.el7.x86_64
  ceph-mon-10.2.11-0.el7.x86_64
  ceph-osd-10.2.11-0.el7.x86_64
  ceph-radosgw-10.2.11-0.el7.x86_64
  ceph-common-10.2.11-0.el7.x86_64
  ceph-selinux-10.2.11-0.el7.x86_64
  ceph-10.2.11-0.el7.x86_64
  libcephfs1-10.2.11-0.el7.x86_64
2. Confirm the Ceph version the cluster is currently running:
  # ansible node -m shell -a 'for i in `ls /var/run/ceph/ | grep "ceph-mon.*asok"` ; do ceph --admin-daemon /var/run/ceph/$i --version ; done'
  node1 | SUCCESS | rc=0 >>
  ceph version 10.2.11 (e4b061b47f07f583c92a050d9e84b1813a35671e)
  node2 | SUCCESS | rc=0 >>
  ceph version 10.2.11 (e4b061b47f07f583c92a050d9e84b1813a35671e)
  node3 | SUCCESS | rc=0 >>
  ceph version 10.2.11 (e4b061b47f07f583c92a050d9e84b1813a35671e)
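
The running OSD daemons can be checked the same way through their admin sockets; the admin-socket "version" command reports the version of the daemon that is actually running (a sketch assuming the default socket paths under /var/run/ceph):

  # ansible node -m shell -a 'for i in /var/run/ceph/ceph-osd.*.asok ; do ceph --admin-daemon $i version ; done'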
3. Upgrade the packages:
  # ansible node -m yum -a 'name=ceph state=latest'
4. After the upgrade completes, check the package versions now installed on the cluster nodes (note the new ceph-mgr and libcephfs2 packages):
  # ansible node -m shell -a 'rpm -qa | grep ceph'
  [WARNING]: Consider using yum, dnf or zypper module rather than running rpm
  node2 | SUCCESS | rc=0 >>
  ceph-base-12.2.10-0.el7.x86_64
  ceph-osd-12.2.10-0.el7.x86_64
  python-cephfs-12.2.10-0.el7.x86_64
  ceph-common-12.2.10-0.el7.x86_64
  ceph-selinux-12.2.10-0.el7.x86_64
  ceph-mon-12.2.10-0.el7.x86_64
  ceph-mds-12.2.10-0.el7.x86_64
  ceph-radosgw-12.2.10-0.el7.x86_64
  libcephfs2-12.2.10-0.el7.x86_64
  ceph-mgr-12.2.10-0.el7.x86_64
  ceph-12.2.10-0.el7.x86_64

  node1 | SUCCESS | rc=0 >>
  ceph-base-12.2.10-0.el7.x86_64
  ceph-osd-12.2.10-0.el7.x86_64
  ceph-deploy-1.5.39-0.noarch
  python-cephfs-12.2.10-0.el7.x86_64
  ceph-common-12.2.10-0.el7.x86_64
  ceph-selinux-12.2.10-0.el7.x86_64
  ceph-mon-12.2.10-0.el7.x86_64
  ceph-mds-12.2.10-0.el7.x86_64
  ceph-radosgw-12.2.10-0.el7.x86_64
  libcephfs2-12.2.10-0.el7.x86_64
  ceph-mgr-12.2.10-0.el7.x86_64
  ceph-12.2.10-0.el7.x86_64

  node3 | SUCCESS | rc=0 >>
  python-cephfs-12.2.10-0.el7.x86_64
  ceph-common-12.2.10-0.el7.x86_64
  ceph-mon-12.2.10-0.el7.x86_64
  ceph-radosgw-12.2.10-0.el7.x86_64
  libcephfs2-12.2.10-0.el7.x86_64
  ceph-base-12.2.10-0.el7.x86_64
  ceph-mgr-12.2.10-0.el7.x86_64
  ceph-osd-12.2.10-0.el7.x86_64
  ceph-12.2.10-0.el7.x86_64
  ceph-selinux-12.2.10-0.el7.x86_64
  ceph-mds-12.2.10-0.el7.x86_64
5. Restart the mon, osd, and rgw daemons on each node.

On node1:

  # systemctl restart ceph-mon@node1
  # systemctl restart ceph-osd@{0,1,2}
  # systemctl restart ceph-radosgw@rgw.node1

On node2:

  # systemctl restart ceph-mon@node2
  # systemctl restart ceph-osd@{3,4,5}
  # systemctl restart ceph-radosgw@rgw.node2

On node3:

  # systemctl restart ceph-mon@node3
  # systemctl restart ceph-osd@{6,7,8}
  # systemctl restart ceph-radosgw@rgw.node3
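
After all daemons have been restarted, the versions they are running can be confirmed with the ceph versions command, which becomes available once the mons are on Luminous; every daemon should report 12.2.10 before moving on:

  # ceph versions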
6. Adjust require_osd_release

At this point the cluster status looks like this:

  # ceph -s
  cluster:
    id:     0d5eced9-8baa-48be-83ef-64a7ef3a8301
    health: HEALTH_WARN
            noout flag(s) set
            all OSDs are running luminous or later but require_osd_release < luminous
            no active mgr
  services:
    mon: 3 daemons, quorum node1,node2,node3
    mgr: no daemons active
    osd: 9 osds: 9 up, 9 in
         flags noout
  data:
    pools:   7 pools, 112 pgs
    objects: 189 objects, 3.01KiB
    usage:   986MiB used, 134GiB / 135GiB avail
    pgs:     112 active+clean

require_osd_release has to be bumped manually:

  # ceph osd require-osd-release luminous
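
The change can be verified in the OSD map, which should now contain a "require_osd_release luminous" line (a quick check, not in the original write-up):

  # ceph osd dump | grep require_osd_release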
7. Unset noout
  # ceph osd unset noout

Check the cluster status again. Note that with no active mgr, Luminous cannot report pool, object, or usage statistics, which is why the data section below shows zeros:

  # ceph -s
  cluster:
    id:     0d5eced9-8baa-48be-83ef-64a7ef3a8301
    health: HEALTH_WARN
            no active mgr
  services:
    mon: 3 daemons, quorum node1,node2,node3
    mgr: no daemons active
    osd: 9 osds: 9 up, 9 in
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0B
    usage:   0B used, 0B / 0B avail
    pgs:
8. Configure mgr

1) Create a key for the mgr daemon:

  # ceph auth get-or-create mgr.node1 mon 'allow *' osd 'allow *'
  [mgr.node1]
      key = AQC0IA9c9X31IhAAdQRm3zR5r/nl3b7+WOwZjQ==

2) Create the data directory:

  # mkdir /var/lib/ceph/mgr/ceph-node1/

3) Write the keyring into the data directory:

  # ceph auth get mgr.node1 -o /var/lib/ceph/mgr/ceph-node1/keyring
  exported keyring for mgr.node1
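
On package-based installs the ceph-mgr daemon runs as the ceph user, so if the directory and keyring above were created as root, it is safest to hand ownership to ceph before starting the service:

  # chown -R ceph:ceph /var/lib/ceph/mgr/ceph-node1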

4) Enable the service to start on boot:

  # systemctl enable ceph-mgr@node1
  Created symlink from /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@node1.service to /usr/lib/systemd/system/ceph-mgr@.service.

5) Start the mgr:

  # systemctl start ceph-mgr@node1
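
The daemon should register as the active mgr within a few seconds; an optional quick check before moving on to the other nodes:

  # systemctl status ceph-mgr@node1 | grep Active
  # ceph -s | grep mgr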

6) Configure mgr on the other mon nodes in the same way (see the sketch for node2 below), then check the cluster status again:
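
A rough sketch of the same steps on node2 (node3 is analogous; the key is whatever ceph auth get-or-create returns on your cluster):

  # ceph auth get-or-create mgr.node2 mon 'allow *' osd 'allow *'
  # mkdir /var/lib/ceph/mgr/ceph-node2/
  # ceph auth get mgr.node2 -o /var/lib/ceph/mgr/ceph-node2/keyring
  # chown -R ceph:ceph /var/lib/ceph/mgr/ceph-node2
  # systemctl enable ceph-mgr@node2
  # systemctl start ceph-mgr@node2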

  # ceph -s
  cluster:
    id:     0d5eced9-8baa-48be-83ef-64a7ef3a8301
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum node1,node2,node3
    mgr: node1(active), standbys: node2, node3
    osd: 9 osds: 9 up, 9 in
    rgw: 3 daemons active
  data:
    pools:   7 pools, 112 pgs
    objects: 189 objects, 3.01KiB
    usage:   986MiB used, 134GiB / 135GiB avail
    pgs:     112 active+clean

7) Enable the mgr dashboard module. The dashboard provides a web interface for monitoring the cluster:

  # ceph mgr module enable dashboard
  # ceph mgr module ls
  {
      "enabled_modules": [
          "balancer",
          "dashboard",
          "restful",
          "status"
      ],
      "disabled_modules": [
          "influx",
          "localpool",
          "prometheus",
          "selftest",
          "zabbix"
      ]
  }
  # ceph mgr services
  {
      "dashboard": "http://node1:7000/"
  }
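
By default the Luminous dashboard listens on port 7000 on all addresses. If a specific bind address or port is needed, they can be set through config-key values and are picked up after the active mgr (or the dashboard module) is restarted (a hedged sketch; the IP below is just this test cluster's node1):

  # ceph config-key set mgr/dashboard/server_addr 192.168.1.10
  # ceph config-key set mgr/dashboard/server_port 7000
  # systemctl restart ceph-mgr@node1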

8) Access the dashboard

[Screenshot: Ceph mgr dashboard web UI (1.png)]

Upgrading the Cluster with ceph-deploy

If the cluster was originally deployed with ceph-deploy, the upgrade can also be driven through ceph-deploy. The package upgrade commands are shown below; the remaining steps are the same as above and are not repeated here.

  # ceph-deploy install --release luminous node1 node2 node3
  # ceph-deploy --overwrite-conf mgr create node1 node2 node3