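ceph-deploy is a reasonable fit for production environments; this guide does not use cephadm. It takes a little more work, but it is not hard: there are simply more commands to run, and the details need care.
(Omitted…)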
Host | Public network IP (client-facing) | Cluster network IP (intra-cluster traffic) | Role |
---|---|---|---|
ceph-deploy | 192.168.2.120 | - | Deploys and manages the cluster |
ceph-node1 | 192.168.2.121 | 192.168.6.135 | ceph-mon, ceph-mgr, ceph-osd |
ceph-node2 | 192.168.2.122 | 192.168.6.136 | ceph-mon, ceph-mgr, ceph-osd |
ceph-node3 | 192.168.2.123 | 192.168.6.137 | ceph-mon, ceph-osd |
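ceph-osd nodes:
1. Bare-metal deployment is generally recommended.
2. Sizing: 10 or 12 CPU cores with 32 GB of RAM; 64 GB is better.
ceph-mgr nodes:
Two nodes are enough for high availability, though more can be used.
ceph-mon needs at least 3 nodes.
ceph-mon can run on more modest hardware, for example in virtual machines.
4 cores with 8 GB is enough; 4 cores with 16 GB is better.
While building and using the cluster, the rgw, mds, and similar roles will share ceph-node1 through ceph-node3 and the ceph-deploy node, because I do not have more machines.
In production, put ceph-mgr and ceph-mon on dedicated nodes if you can; if not, ceph-mgr and ceph-mon can share nodes.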
systemctl disable firewalld
systemctl stop firewalld
setenforce 0
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
hostnamectl set-hostname ceph-node1
hostnamectl set-hostname ceph-node2
hostnamectl set-hostname ceph-node3
hostnamectl set-hostname ceph-deploy
192.168.2.120 ceph-deploy
192.168.2.121 ceph-node1
192.168.2.122 ceph-node2
192.168.2.123 ceph-node3
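Adding these entries by hand on four machines makes duplicates likely. The following is a minimal sketch of an idempotent helper. The function name `add_host_entry` and the `HOSTS_FILE` variable are illustrative and not part of the original setup, and the script writes to a scratch file by default so it can be tried without touching /etc/hosts.

```shell
#!/bin/sh
# Hypothetical helper: append "IP NAME" to a hosts file only if NAME is not already there.
# HOSTS_FILE defaults to a scratch file; set HOSTS_FILE=/etc/hosts (as root) for real use.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

add_host_entry() {
    ip="$1"; name="$2"
    touch "$HOSTS_FILE"
    # -w matches whole words, so "ceph-node1" does not match "ceph-node10"; -F disables regex
    if ! grep -qwF -- "$name" "$HOSTS_FILE"; then
        printf '%s %s\n' "$ip" "$name" >> "$HOSTS_FILE"
    fi
}

add_host_entry 192.168.2.120 ceph-deploy
add_host_entry 192.168.2.121 ceph-node1
add_host_entry 192.168.2.122 ceph-node2
add_host_entry 192.168.2.123 ceph-node3
cat "$HOSTS_FILE"
```

Running it a second time against the same file leaves the file unchanged, which makes it safe to rerun on every node.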
[epel]
name=Extra Packages for Enterprise Linux 7 - $basearch
baseurl=http://mirrors.tuna.tsinghua.edu.cn/epel/7/$basearch/
#mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-7&arch=$basearch
failovermethod=priority
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
[epel-debuginfo]
name=Extra Packages for Enterprise Linux 7 - $basearch - Debug
baseurl=http://mirrors.tuna.tsinghua.edu.cn/epel/7/$basearch/debug
#mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-debug-7&arch=$basearch
failovermethod=priority
enabled=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
gpgcheck=1
[epel-source]
name=Extra Packages for Enterprise Linux 7 - $basearch - Source
baseurl=http://mirrors.tuna.tsinghua.edu.cn/epel/7/SRPMS
#mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-source-7&arch=$basearch
failovermethod=priority
enabled=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
gpgcheck=1
[Ceph]
name=Ceph packages for $basearch
baseurl=http://mirrors.tuna.tsinghua.edu.cn/ceph/rpm-mimic/el7/$basearch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://mirrors.tuna.tsinghua.edu.cn/ceph/keys/release.asc
[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://mirrors.tuna.tsinghua.edu.cn/ceph/rpm-mimic/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://mirrors.tuna.tsinghua.edu.cn/ceph/keys/release.asc
[ceph-source]
name=Ceph source packages
baseurl=http://mirrors.tuna.tsinghua.edu.cn/ceph/rpm-mimic/el7/SRPMS
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://mirrors.tuna.tsinghua.edu.cn/ceph/keys/release.asc
groupadd ceph -g 3333
useradd -u 3333 -g 3333 ceph
echo "cephadmin888" | passwd --stdin ceph
echo "ceph ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
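# Switch to the ceph user first. If you skip this, passwordless login is set up for whichever user you are logged in as, but the deployment below runs as ceph.
su - ceph
# Generate an SSH key pair
ssh-keygen
# Copy the public key to each node (run as ceph, without sudo, so that the ceph user's key is the one copied)
ssh-copy-id ceph@192.168.2.121
ssh-copy-id ceph@192.168.2.122
ssh-copy-id ceph@192.168.2.123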
su - ceph
[ceph@ceph-deploy ~]$ mkdir ceph-cluster-deploy
[ceph@ceph-deploy ~]$ cd ceph-cluster-deploy/
[ceph@ceph-deploy ceph-cluster-deploy]$
[ceph@ceph-deploy ceph-cluster-deploy]$ sudo yum install ceph-deploy python-setuptools python2-subprocess3
After installation, check that the ceph-deploy command works:
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy
usage: ceph-deploy [-h] [-v | -q] [--version] [--username USERNAME]
[--overwrite-conf] [--ceph-conf CEPH_CONF]
COMMAND ...
ceph-deploy 2.0.1 installs the Mimic release of Ceph (13.2.10) by default. To install a different release, specify it with --release.
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy --version
2.0.1
On the ceph-deploy node, run the install command to install the Ceph packages on the cluster's OSD nodes.
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy install --help
usage: ceph-deploy install [-h] [--stable [CODENAME] | --release [CODENAME] |
--testing | --dev [BRANCH_OR_TAG]]
[--dev-commit [COMMIT]] [--mon] [--mgr] [--mds]
[--rgw] [--osd] [--tests] [--cli] [--all]
[--adjust-repos | --no-adjust-repos | --repo]
[--local-mirror [LOCAL_MIRROR]]
[--repo-url [REPO_URL]] [--gpg-url [GPG_URL]]
[--nogpgcheck]
HOST [HOST ...]
Install Ceph packages on remote hosts.
positional arguments:
HOST hosts to install on
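... (remaining options omitted)
# Two options matter most here:
--no-adjust-repos install packages without modifying source repos # Leave the Ceph repo files alone: they already point at the Tsinghua mirror, and letting ceph-deploy rewrite them makes downloads very slow
--nogpgcheck install packages without gpgcheck # Skip GPG verification
Run the command:
# Note: {1..3} in ceph-node{1..3} is bash brace expansion; the shell expands it into three arguments before ceph-deploy runs
# The command that actually executes: ceph-deploy install --no-adjust-repos --nogpgcheck ceph-node1 ceph-node2 ceph-node3
# Run the installation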
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy install --no-adjust-repos --nogpgcheck ceph-node{1..3}
The full output is omitted. On success it ends with lines like these:
[ceph-node3][DEBUG ] 完毕!
[ceph-node3][INFO ] Running command: sudo ceph --version
[ceph-node3][DEBUG ] ceph version 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable)
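The `ceph-node{1..3}` shorthand used in the install command is bash brace expansion. A quick way to see what bash actually passes to a command is to `echo` the same pattern. This is only an illustration and needs bash; a plain POSIX `sh` prints the braces literally.

```shell
#!/usr/bin/env bash
# Brace expansion is performed by bash itself, before the command ever runs.
echo ceph-node{1..3}     # numeric range: ceph-node1 ceph-node2 ceph-node3
echo /dev/sd{b,c,d}      # comma-separated list: /dev/sdb /dev/sdc /dev/sdd
```

Both forms appear in this walkthrough: the range for host names, the list for disk devices.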
# Show the help for the ceph-deploy new subcommand
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy new --help
usage: ceph-deploy new [-h] [--no-ssh-copykey] [--fsid FSID]
[--cluster-network CLUSTER_NETWORK]
[--public-network PUBLIC_NETWORK]
MON [MON ...]
Start deploying a new cluster, and write a CLUSTER.conf and keyring for it.
positional arguments:
MON initial monitor hostname, fqdn, or hostname:fqdn pair
optional arguments:
-h, --help show this help message and exit
--no-ssh-copykey do not attempt to copy SSH keys
--fsid FSID provide an alternate FSID for ceph.conf generation
--cluster-network CLUSTER_NETWORK
specify the (internal) cluster network
--public-network PUBLIC_NETWORK
specify the public network for a cluster
Run the command:
# The mons share the OSD nodes here, so the targets are ceph-node1, ceph-node2, and ceph-node3
# In production, dedicated mon servers are recommended.
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy new --cluster-network 192.168.6.0/24 --public-network 192.168.2.0/24 ceph-node1 ceph-node2 ceph-node3
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /bin/ceph-deploy new --cluster-network 192.168.6.0/24 --public-network 192.168.2.0/24 ceph-node1 ceph-node2
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] func : <function new at 0x7fa768c08de8>
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7fa76837f8c0>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] ssh_copykey : True
[ceph_deploy.cli][INFO ] mon : ['ceph-node1', 'ceph-node2']
[ceph_deploy.cli][INFO ] public_network : 192.168.2.0/24
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] cluster_network : 192.168.6.0/24
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] fsid : None
[ceph_deploy.new][DEBUG ] Creating new cluster named ceph
[ceph_deploy.new][INFO ] making sure passwordless SSH succeeds
[ceph-node1][DEBUG ] connected to host: ceph-deploy
[ceph-node1][INFO ] Running command: ssh -CT -o BatchMode=yes ceph-node1
[ceph-node1][DEBUG ] connection detected need for sudo
[ceph-node1][DEBUG ] connected to host: ceph-node1
[ceph-node1][DEBUG ] detect platform information from remote host
[ceph-node1][DEBUG ] detect machine type
[ceph-node1][DEBUG ] find the location of an executable
[ceph-node1][INFO ] Running command: sudo /usr/sbin/ip link show
[ceph-node1][INFO ] Running command: sudo /usr/sbin/ip addr show
[ceph-node1][DEBUG ] IP addresses found: [u'192.168.2.121', u'192.168.6.135']
[ceph_deploy.new][DEBUG ] Resolving host ceph-node1
[ceph_deploy.new][DEBUG ] Monitor ceph-node1 at 192.168.2.121
[ceph_deploy.new][INFO ] making sure passwordless SSH succeeds
[ceph-node2][DEBUG ] connected to host: ceph-deploy
[ceph-node2][INFO ] Running command: ssh -CT -o BatchMode=yes ceph-node2
[ceph_deploy.new][WARNIN] could not connect via SSH
[ceph_deploy.new][INFO ] will connect again with password prompt
The authenticity of host 'ceph-node2 (192.168.2.122)' can't be established.
ECDSA key fingerprint is SHA256:bFB9FzJjKEKMP2W5kW+orMbo9mD+tr8fLOPRsYaXhj8.
ECDSA key fingerprint is MD5:b7:e5:bd:6a:56:10:42:3d:34:3a:54:ac:79:a2:3c:5b.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ceph-node2' (ECDSA) to the list of known hosts.
[ceph-node2][DEBUG ] connected to host: ceph-node2
[ceph-node2][DEBUG ] detect platform information from remote host
[ceph-node2][DEBUG ] detect machine type
[ceph_deploy.new][INFO ] adding public keys to authorized_keys
[ceph-node2][DEBUG ] append contents to file
[ceph-node2][DEBUG ] connection detected need for sudo
[ceph-node2][DEBUG ] connected to host: ceph-node2
[ceph-node2][DEBUG ] detect platform information from remote host
[ceph-node2][DEBUG ] detect machine type
[ceph-node2][DEBUG ] find the location of an executable
[ceph-node2][INFO ] Running command: sudo /usr/sbin/ip link show
[ceph-node2][INFO ] Running command: sudo /usr/sbin/ip addr show
[ceph-node2][DEBUG ] IP addresses found: [u'192.168.6.136', u'192.168.2.122']
[ceph_deploy.new][DEBUG ] Resolving host ceph-node2
[ceph_deploy.new][DEBUG ] Monitor ceph-node2 at 192.168.2.122
[ceph_deploy.new][DEBUG ] Monitor initial members are ['ceph-node1', 'ceph-node2']
[ceph_deploy.new][DEBUG ] Monitor addrs are [u'192.168.2.121', u'192.168.2.122']
[ceph_deploy.new][DEBUG ] Creating a random mon key...
[ceph_deploy.new][DEBUG ] Writing monitor keyring to ceph.mon.keyring...
[ceph_deploy.new][DEBUG ] Writing initial config to ceph.conf...
# List the current directory; several files have been generated
[ceph@ceph-deploy ceph-cluster-deploy]$ ll
总用量 16
-rw-rw-r-- 1 ceph ceph 292 12月 22 12:10 ceph.conf # the cluster configuration file
-rw-rw-r-- 1 ceph ceph 5083 12月 22 12:10 ceph-deploy-ceph.log # the ceph-deploy log
-rw------- 1 ceph ceph 73 12月 22 12:10 ceph.mon.keyring # the monitor keyring
# View ceph.conf
[ceph@ceph-deploy ceph-cluster-deploy]$ cat ceph.conf
[global]
fsid = f1da3a2e-b8df-46ba-9c6b-0030da25c73e
public_network = 192.168.2.0/24
cluster_network = 192.168.6.0/24
mon_initial_members = ceph-node1, ceph-node2
mon_host = 192.168.2.121,192.168.2.122
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
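Because monitor quorum works best with an odd number of members, it can be worth checking how many monitors ended up in `mon_initial_members` before going further. The sketch below counts them with awk. The function name `count_initial_mons` is illustrative, and the sample file only mirrors the relevant lines of the ceph.conf printed above.

```shell
#!/bin/sh
# Count the monitors listed in mon_initial_members of a ceph.conf-style file.
count_initial_mons() {
    awk -F= '/^[[:space:]]*mon_initial_members[[:space:]]*=/ {
        n = split($2, members, ",")
        print n
    }' "$1"
}

# Sample input mirroring the generated ceph.conf shown above.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
[global]
mon_initial_members = ceph-node1, ceph-node2
mon_host = 192.168.2.121,192.168.2.122
EOF

n="$(count_initial_mons "$conf")"
if [ $((n % 2)) -eq 0 ]; then
    echo "warning: $n initial monitors; an odd number (3 or more) is recommended"
else
    echo "ok: $n initial monitors"
fi
```

Against the real file, the call would be `count_initial_mons ceph.conf` from the deployment directory.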
If the mon nodes are dedicated machines, check that the ceph-mon package is installed on each of them:
yum install -y ceph-mon
Back on the ceph-deploy node:
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy mon create-initial
Afterwards several more files appear. These keyrings are sensitive, much like a kubeconfig in Kubernetes; do not leak them.
[ceph@ceph-deploy ceph-cluster-deploy]$ ll
总用量 476
-rw------- 1 ceph ceph 113 12月 22 13:11 ceph.bootstrap-mds.keyring
-rw------- 1 ceph ceph 113 12月 22 13:11 ceph.bootstrap-mgr.keyring
-rw------- 1 ceph ceph 113 12月 22 13:11 ceph.bootstrap-osd.keyring
-rw------- 1 ceph ceph 113 12月 22 13:11 ceph.bootstrap-rgw.keyring
-rw------- 1 ceph ceph 151 12月 22 13:11 ceph.client.admin.keyring
-rw-rw-r-- 1 ceph ceph 292 12月 22 12:11 ceph.conf
-rw-rw-r-- 1 ceph ceph 207826 12月 22 13:17 ceph-deploy-ceph.log
-rw------- 1 ceph ceph 73 12月 22 12:11 ceph.mon.keyring
Each mon node now runs the mon service
ceph-mon@.service
which is linked from:
/etc/systemd/system/ceph-mon.target.wants/ceph-mon@<mon-node-hostname>.service
and the matching process is running:
[root@ceph-node3 ~]# ps axu | grep mon
ceph 2614 0.5 2.1 470596 39944 ? Ssl 13:17 0:00 /usr/bin/ceph-mon -f --cluster ceph --id ceph-node3 --setuser ceph --setgroup ceph
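Push the admin keyring to the OSD nodes and to any node you want to manage the cluster from. Without it you have to specify the keyring on every command, which is tedious.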
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy admin ceph-node{1..3}
# Push it to this node as well, since the same server both deploys and manages the cluster
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy admin ceph-deploy
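On each node, set a file ACL on the pushed keyring: it is owned by root (user and group) by default, but the ceph user created earlier is the one that manages the cluster.
# Run this as root, or use sudo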
# ceph-node1
[root@ceph-node1 ~]# setfacl -m u:ceph:rw /etc/ceph/ceph.client.admin.keyring
[root@ceph-node1 ~]# getfacl /etc/ceph/ceph.client.admin.keyring
getfacl: Removing leading '/' from absolute path names
# file: etc/ceph/ceph.client.admin.keyring
# owner: root
# group: root
user::rw-
user:ceph:rw-
group::---
mask::rw-
other::---
# ceph-node2 and ceph-node3: same as above
# The deploy node doubles as the admin node here, so set the ACL on the deploy node too
[root@ceph-deploy ~]# setfacl -m u:ceph:rw /etc/ceph/ceph.client.admin.keyring
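The mgr daemon exists only in Ceph Luminous and later; older releases have none, so there is nothing to deploy.
We are deploying Mimic (13.2.10), so it is required.
If the mgr nodes are dedicated servers, check that the ceph-mgr package is installed:
yum install -y ceph-mgr
Options of ceph-deploy mgr: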
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy mgr --help
usage: ceph-deploy mgr [-h] {create} ...
Ceph MGR daemon management
positional arguments:
{create}
create Deploy Ceph MGR on remote host(s)
optional arguments:
-h, --help show this help message and exit
Run the command to initialize the mgr nodes
# My servers are shared by osd, mon, and mgr, so ceph-node1 and ceph-node2 are used here.
ceph-deploy mgr create ceph-node1 ceph-node2
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph -s
cluster:
id: f1da3a2e-b8df-46ba-9c6b-0030da25c73e
health: HEALTH_WARN
OSD count 0 < osd_pool_default_size 3
services:
mon: 3 daemons, quorum ceph-node1,ceph-node2,ceph-node3
mgr: ceph-node1(active), standbys: ceph-node2
osd: 0 osds: 0 up, 0 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
Each server gets three 5 GB disks for testing.
# 1. Wipe the disks on the OSD nodes that are about to be added
ceph-deploy disk zap ceph-node1 /dev/sd{b,c,d}
ceph-deploy disk zap ceph-node2 /dev/sd{b,c,d}
ceph-deploy disk zap ceph-node3 /dev/sd{b,c,d}
# 2. Add the disks on ceph-node1 as OSDs
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy osd create ceph-node1 --data /dev/sdb
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy osd create ceph-node1 --data /dev/sdc
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy osd create ceph-node1 --data /dev/sdd
# 3. Add the disks on ceph-node2 as OSDs
ceph-deploy osd create ceph-node2 --data /dev/sdb
ceph-deploy osd create ceph-node2 --data /dev/sdc
ceph-deploy osd create ceph-node2 --data /dev/sdd
# 4. Add the disks on ceph-node3 as OSDs
ceph-deploy osd create ceph-node3 --data /dev/sdb
ceph-deploy osd create ceph-node3 --data /dev/sdc
ceph-deploy osd create ceph-node3 --data /dev/sdd
# 5. After adding, an OSD service is registered on the corresponding node (runtime only for now; it must be made persistent)
For example: /run/systemd/system/ceph-osd.target.wants/ceph-osd@7.service # 7 is the OSD id; ids start at 0.
# Check with ceph-deploy
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy osd list ceph-node{1,2,3}
# Check with ceph osd stat
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph osd stat
9 osds: 9 up, 9 in; epoch: e37
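# Check with ceph osd status
- `id`: unique identifier of the OSD.
- `host`: host the OSD runs on.
- `used`: storage used on the OSD.
- `avail`: storage still available on the OSD.
- `wr ops`: write operations per second.
- `wr data`: amount of data written per second.
- `rd ops`: read operations per second.
- `rd data`: amount of data read per second.
- `state`: OSD state; "exists" means the OSD is known to the cluster, "up" means it is running normally.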
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph osd status
+----+------------+-------+-------+--------+---------+--------+---------+-----------+
| id | host | used | avail | wr ops | wr data | rd ops | rd data | state |
+----+------------+-------+-------+--------+---------+--------+---------+-----------+
| 0 | ceph-node1 | 1028M | 4087M | 0 | 0 | 0 | 0 | exists,up |
| 1 | ceph-node1 | 1028M | 4087M | 0 | 0 | 0 | 0 | exists,up |
| 2 | ceph-node1 | 1028M | 4087M | 0 | 0 | 0 | 0 | exists,up |
| 3 | ceph-node2 | 1028M | 4087M | 0 | 0 | 0 | 0 | exists,up |
| 4 | ceph-node2 | 1028M | 4087M | 0 | 0 | 0 | 0 | exists,up |
| 5 | ceph-node2 | 1028M | 4087M | 0 | 0 | 0 | 0 | exists,up |
| 6 | ceph-node3 | 1028M | 4087M | 0 | 0 | 0 | 0 | exists,up |
| 7 | ceph-node3 | 1028M | 4087M | 0 | 0 | 0 | 0 | exists,up |
| 8 | ceph-node3 | 1028M | 4087M | 0 | 0 | 0 | 0 | exists,up |
+----+------------+-------+-------+--------+---------+--------+---------+-----------+
# ceph osd tree works too
[root@ceph-node1 ~]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.04408 root default
-3 0.01469 host ceph-node1
0 hdd 0.00490 osd.0 up 1.00000 1.00000
1 hdd 0.00490 osd.1 up 1.00000 1.00000
2 hdd 0.00490 osd.2 up 1.00000 1.00000
-5 0.01469 host ceph-node2
3 hdd 0.00490 osd.3 up 1.00000 1.00000
4 hdd 0.00490 osd.4 up 1.00000 1.00000
5 hdd 0.00490 osd.5 up 1.00000 1.00000
-7 0.01469 host ceph-node3
6 hdd 0.00490 osd.6 up 1.00000 1.00000
7 hdd 0.00490 osd.7 up 1.00000 1.00000
8 hdd 0.00490 osd.8 up 1.00000 1.00000
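# ceph osd df shows per-OSD disk usage, similar to df on Linux
- `ID`: unique identifier of the OSD.
- `CLASS`: device class of the OSD (for example hdd or ssd).
- `WEIGHT`: CRUSH weight of the OSD.
- `REWEIGHT`: reweight override factor of the OSD.
- `SIZE`: total capacity of the OSD.
- `RAW USE`: raw capacity currently in use (shown as `USE` in the Mimic output below).
- `DATA`: space used by object data.
- `OMAP`: space used by OMAP (object map) data.
- `META`: space used by metadata.
- `AVAIL`: capacity still available.
- `%USE`: utilization as a percentage.
- `VAR`: ratio of this OSD's utilization to the cluster average (1.00 means exactly average).
- `PGS`: number of placement groups (PGs) on the OSD.
- `STATUS`: OSD state; "up" means it is running normally (this column does not appear in the Mimic output below).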
[root@ceph-node1 ~]# ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE USE DATA OMAP META AVAIL %USE VAR PGS
0 hdd 0.00490 1.00000 5.0 GiB 1.0 GiB 4.7 MiB 0 B 1 GiB 4.0 GiB 20.11 1.00 0
1 hdd 0.00490 1.00000 5.0 GiB 1.0 GiB 4.7 MiB 0 B 1 GiB 4.0 GiB 20.11 1.00 0
2 hdd 0.00490 1.00000 5.0 GiB 1.0 GiB 4.7 MiB 0 B 1 GiB 4.0 GiB 20.11 1.00 0
3 hdd 0.00490 1.00000 5.0 GiB 1.0 GiB 4.7 MiB 0 B 1 GiB 4.0 GiB 20.11 1.00 0
4 hdd 0.00490 1.00000 5.0 GiB 1.0 GiB 4.7 MiB 0 B 1 GiB 4.0 GiB 20.11 1.00 0
5 hdd 0.00490 1.00000 5.0 GiB 1.0 GiB 4.7 MiB 0 B 1 GiB 4.0 GiB 20.11 1.00 0
6 hdd 0.00490 1.00000 5.0 GiB 1.0 GiB 4.7 MiB 0 B 1 GiB 4.0 GiB 20.11 1.00 0
7 hdd 0.00490 1.00000 5.0 GiB 1.0 GiB 4.7 MiB 0 B 1 GiB 4.0 GiB 20.11 1.00 0
8 hdd 0.00490 1.00000 5.0 GiB 1.0 GiB 4.7 MiB 0 B 1 GiB 4.0 GiB 20.11 1.00 0
TOTAL 45 GiB 9.0 GiB 42 MiB 0 B 9 GiB 36 GiB 20.11
MIN/MAX VAR: 1.00/1.00 STDDEV: 0
Enable the OSD services at boot, according to which node each OSD lives on
# ceph-node1
systemctl enable ceph-osd@{0,1,2}
# ceph-node2
systemctl enable ceph-osd@{3,4,5}
# ceph-node3
systemctl enable ceph-osd@{6,7,8}
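The OSD ids in the commands above were read off the `ceph osd tree` output by hand. As a sketch, the same mapping can be derived with awk. The function name `enable_cmds_from_tree` is illustrative, it assumes the Mimic-style column layout shown earlier, and it only prints the commands rather than running them.

```shell
#!/bin/sh
# Turn `ceph osd tree` text (read from stdin) into one `systemctl enable` line per host.
enable_cmds_from_tree() {
    awk '
        $3 == "host" { host = $4; next }
        $4 ~ /^osd\.[0-9]+$/ {
            ids[host] = ids[host] " ceph-osd@" $1
            if (!(host in seen)) { seen[host] = 1; order[++n] = host }
        }
        END {
            for (i = 1; i <= n; i++)
                printf "# %s\nsystemctl enable%s\n", order[i], ids[order[i]]
        }
    '
}

# Sample input: a shortened copy of the tree output shown earlier.
enable_cmds_from_tree <<'EOF'
ID CLASS WEIGHT  TYPE NAME           STATUS REWEIGHT PRI-AFF
-1       0.04408 root default
-3       0.01469     host ceph-node1
 0   hdd 0.00490         osd.0           up  1.00000 1.00000
 1   hdd 0.00490         osd.1           up  1.00000 1.00000
-5       0.01469     host ceph-node2
 3   hdd 0.00490         osd.3           up  1.00000 1.00000
EOF
```

On a live cluster the input would come from `ceph osd tree | enable_cmds_from_tree`, and each printed line is then run on the host named in the comment above it.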
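Remove OSDs one at a time. Each removal makes Ceph promote replicas on other OSDs to primary and re-replicate the data, so removing many OSDs at once can cause performance problems.
# Mark the OSD out
ceph osd out <osd-id>
# Stop the OSD service
systemctl stop ceph-osd@<osd-id>
# Remove the OSD
ceph osd purge <osd-id> --yes-i-really-mean-it
# Check whether ceph.conf still contains a section for that OSD; if so, delete it by hand.
###### For releases before Luminous, the removal steps are:
ceph osd crush remove <name>
ceph auth del osd.<osd-id>
ceph osd rm <osd-id>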
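Since the order of the removal steps matters, one cautious approach is to generate the commands for a given OSD id and review them before running anything. The helper below is a hypothetical dry-run sketch for the Luminous-and-later sequence; `print_osd_removal` is not a Ceph command, and it prints the steps without executing them.

```shell
#!/bin/sh
# Print (do not run) the Luminous-and-later removal sequence for one OSD id.
print_osd_removal() {
    id="$1"
    case "$id" in
        ''|*[!0-9]*) echo "usage: print_osd_removal <numeric-osd-id>" >&2; return 1 ;;
    esac
    echo "ceph osd out $id"
    echo "systemctl stop ceph-osd@$id"
    echo "ceph osd purge $id --yes-i-really-mean-it"
}

print_osd_removal 7
```

The `systemctl stop` line has to be run on the host that owns the OSD; the two `ceph` lines can run from any node holding the admin keyring.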
# Create a pool with rados
rados mkpool <pool-name> [123[ 4]]
# usage text: create pool <pool-name> [with auid 123 [and using crush rule 4]]
# Create a pool with the ceph command
ceph osd pool create <poolname> <int[0-]> {<int[0-]>} {replicated|erasure} {<erasure_code_profile>} {<rule>} {<int>}
# usage text: create pool
ceph osd pool create <pool-name> <pg_num> <pgp_num>
# Upload a file to the given pool
[ceph@ceph-deploy ceph-cluster-deploy]$ rados put myfile /etc/fstab -p swq-test
# List the objects in the given pool
[ceph@ceph-deploy ceph-cluster-deploy]$ rados ls -p swq-test
myfile
# Download the file
[ceph@ceph-deploy ceph-cluster-deploy]$ rados get myfile -p swq-test /tmp/my.txt
[ceph@ceph-deploy ceph-cluster-deploy]$ cat /tmp/my.txt
#
# /etc/fstab
# Created by anaconda on Thu Dec 21 23:51:13 2023
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/centos-root / xfs defaults 0 0
UUID=4b1bb372-7f34-48f6-8852-036ee6dfd125 /boot
# Show where the object is mapped
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph osd map swq-test myfile
osdmap e43 pool 'swq-test' (2) object 'myfile' -> pg 2.423e92f7 (2.17) -> up ([5,6,2], p5) acting ([5,6,2], p5)
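# Which PG is it in?
# Pool id 2, object hash 423e92f7, which lands in PG 2.17.
# -> pg 2.423e92f7 (2.17)
# Which OSDs hold it?
# OSDs 5, 6, and 2; the primary is osd.5 (p5)
# acting is the set of OSDs currently serving the PG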
# -> up ([5,6,2], p5) acting ([5,6,2], p5)
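The step from the hash `423e92f7` to PG `2.17` can be reproduced with shell arithmetic. This sketch assumes the pool has 32 PGs, which this walkthrough never states; it is simply the value consistent with the output above. The bit-mask shortcut is only valid when pg_num is a power of two.

```shell
#!/bin/sh
# Reproduce the PG number from the object hash printed by `ceph osd map`.
# Assumption: pg_num=32 for the swq-test pool (not shown in the walkthrough).
pg_for_hash() {
    hash="$1"; pg_num="$2"
    # Keep only the low bits of the hash; valid for power-of-two pg_num.
    printf '%x\n' $(( $hash & ($pg_num - 1) ))
}

pg_for_hash 0x423e92f7 32   # prints 17, i.e. PG 2.17 in pool id 2
```

The low byte of the hash is 0xf7; masking it with 0x1f (32 - 1) leaves 0x17.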
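The number of mon nodes should be odd, and at least 3.
# Suppose there is a new host named ceph-mon4
ceph-mon4
public network ip: 192.168.2.124
cluster ip: 192.168.6.137
# Note: 192.168.6.137 is already ceph-node3's cluster IP in the table above; use an unused address on a real host
# Install the packages on the server that will become a mon node
yum install -y ceph-common ceph-mon
# Add the hosts entry on every node
192.168.2.124 ceph-mon4
# SSH to the ceph-deploy node; all commands below run there
# Switch to the ceph user
su - ceph
# Change to the deployment directory used earlier
cd ceph-cluster-deploy
# Add ceph-mon4 to the mon cluster
ceph-deploy mon add ceph-mon4
# Verify the mon status
ceph quorum_status --format json-pretty
# Or run ceph -s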
ceph -s
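# Install the package on the server that will become an mgr node
yum install ceph-mgr
##### Switch to the ceph-deploy node; all commands below run there
# Log in as the ceph user
su - ceph
# Change to the deployment directory used earlier
cd ceph-cluster-deploy
# Add the mgr node (my OSD nodes double as mgr nodes, so ceph-node3 is used directly)
# With dedicated nodes in production, the hosts entries, passwordless SSH, and so on are needed as well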
ceph-deploy mgr create ceph-node3
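RBD provides block devices that Kubernetes, OpenStack, or Linux hosts attach directly, much like iSCSI block storage.
# 1. Create a pool
# Syntax: ceph osd pool create <pool-name> <PG> [<PGP>] [{replicated|erasure}]
# PG: number of placement groups for the pool
# PGP: number of placement groups used for placement; normally the same as PG, and defaults to PG when omitted [optional]
# replicated: replicated pool (the default) [optional]
# erasure: erasure-coded pool (MinIO uses erasure coding as well) [optional]
# 1.1 Create a pool named django-web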
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph osd pool create django-web 16
pool 'django-web' created
# 1.2 List the pools
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph osd lspools
... # omitted: pools I created earlier
3 django-web
# 2. Enable the RBD application on the pool
Syntax: ceph osd pool application enable <poolname> <app> {--yes-i-really-mean-it}
# 2.1 Enable the RBD application on the django-web pool
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph osd pool application enable django-web rbd
enabled application 'rbd' on pool 'django-web'
# 2.2 Initialize the pool with the rbd command
[ceph@ceph-deploy ceph-cluster-deploy]$ rbd pool init -p django-web
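# 3. The pool cannot be mapped directly yet; an image (img) has to be created first.
# A pool can hold multiple images, and clients map and mount individual images.
Syntax: rbd create [--pool <pool>] [--image <image>] [--image-feature <image-feature>] --size <size> [--no-progress]
# --image-feature selects the image features. Distributions with older kernels (such as CentOS 7) do not support some of them, and mapping such an image fails with an error.
# Features enabled by default: --image-feature arg image features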
[layering(+), exclusive-lock(+*), object-map(+*),
fast-diff(+*), deep-flatten(+-), journaling(*)]
Image Features:
(*) supports enabling/disabling on existing images
(-) supports disabling-only on existing images
(+) enabled by default for new images if features not specified
# 3.1 Create an image
[ceph@ceph-deploy ceph-cluster-deploy]$ rbd create --pool django-web --image img001 --size 1G
# Features can be changed later, but I deliberately create a second image, img002, with only layering enabled, because I will map it from CentOS 7 shortly.
[ceph@ceph-deploy ceph-cluster-deploy]$ rbd create --pool django-web --image img002 --size 1G --image-feature layering
# 3.2 List the images in the django-web pool
[ceph@ceph-deploy ceph-cluster-deploy]$ rbd ls -p django-web
img001
img002
# 3.3 Show an image's details
[ceph@ceph-deploy ceph-cluster-deploy]$ rbd -p django-web --image img001 info
rbd image 'img001':
size 1 GiB in 256 objects # 256 objects in total
order 22 (4 MiB objects) # 4 MiB per object; 4 x 256 = 1024 MiB = 1 GiB
id: 282f76b8b4567
block_name_prefix: rbd_data.282f76b8b4567
format: 2
features: layering, exclusive-lock, object-map, fast-diff, deep-flatten # the features enabled on this image
op_features:
flags:
create_timestamp: Mon Dec 25 18:56:14 2023
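The `order 22` and `256 objects` figures in the info output follow from simple arithmetic: the order is the base-2 exponent of the object size. A small sketch with shell arithmetic, using only the values shown above:

```shell
#!/bin/sh
# order N means each backing object is 2^N bytes; object count = image size / object size.
order=22
image_bytes=$(( 1 << 30 ))        # 1 GiB, as passed to --size 1G
object_bytes=$(( 1 << order ))    # 4194304 bytes
echo "object size: $(( object_bytes >> 20 )) MiB"
echo "object count: $(( image_bytes / object_bytes ))"
```

This prints an object size of 4 MiB and an object count of 256, matching the `rbd info` output.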
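# 4. Install the ceph-common package on the client (mapping does not work without it)
# Configure the EPEL and Ceph repos first, then install ceph-common
yum install -y ceph-common
# 4.1 Copy the admin keyring to the client (admin is used for now; a regular user will replace it once cephx authentication is covered)
# Production setups normally use a restricted user, never the admin key
scp ceph.client.admin.keyring root@xxxxx:/etc/ceph
# 4.2 Map the RBD image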
usage: rbd map [--device-type <device-type>] [--pool <pool>] [--image <image>]
[--snap <snap>] [--read-only] [--exclusive]
[--options <options>]
<image-or-snap-spec>
Map an image to a block device.
# 4.2.1 Map the image
# The kernel is too old to support some of the features, so this fails
[ceph@ceph-deploy ceph-cluster-deploy]$ sudo rbd map -p django-web --image img001
rbd: sysfs write failed
RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable django-web/img001 object-map fast-diff deep-flatten".
In some cases useful info is found in syslog - try "dmesg | tail".
rbd: map failed: (6) No such device or address
# The error message also says how to fix it: disable the unsupported image features
# rbd feature disable django-web/img001 object-map fast-diff deep-flatten
# I will not disable them here and simply map img002, created earlier
[ceph@ceph-deploy ceph-cluster-deploy]$ sudo rbd map -p django-web --image img002
/dev/rbd0 # the output shows that the image is now mapped to the device /dev/rbd0
# 4.2.2 Check the block devices with lsblk
[ceph@ceph-deploy ceph-cluster-deploy]$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 20G 0 disk
├─sda1 8:1 0 512M 0 part /boot
└─sda2 8:2 0 19.5G 0 part
└─centos-root 253:0 0 19.5G 0 lvm /
sr0 11:0 1 1024M 0 rom
rbd0 252:0 0 1G 0 disk
# 4.3 Format /dev/rbd0
[ceph@ceph-deploy ceph-cluster-deploy]$ sudo mkfs.ext4 /dev/rbd0
mke2fs 1.42.9 (28-Dec-2013)
Discarding device blocks: 完成
文件系统标签=
OS type: Linux
块大小=4096 (log=2)
分块大小=4096 (log=2)
Stride=1024 blocks, Stripe width=1024 blocks
65536 inodes, 262144 blocks
13107 blocks (5.00%) reserved for the super user
第一个数据块=0
Maximum filesystem blocks=268435456
8 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376
Allocating group tables: 完成
正在写入inode表: 完成
Creating journal (8192 blocks): 完成
Writing superblocks and filesystem accounting information: 完成
# 4.4 Mount it on a directory
[ceph@ceph-deploy ceph-cluster-deploy]$ sudo mkdir /mnt/ceph-rbd-dir
[ceph@ceph-deploy ceph-cluster-deploy]$ sudo mount /dev/rbd0 /mnt/ceph-rbd-dir/
# 4.5 Switch to root and write data into the mounted directory to test it
[root@ceph-deploy ~]# echo hahaha > /mnt/ceph-rbd-dir/1.txt
[root@ceph-deploy ~]# cat /mnt/ceph-rbd-dir/1.txt
hahaha
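# Map and mount at boot
systemctl enable rc-local
# On CentOS 7, /etc/rc.d/rc.local must also be executable: chmod +x /etc/rc.d/rc.local
vim /etc/rc.local
# Add the following two lines (img002 is the image that maps successfully on this kernel)
rbd map -p django-web --image img002 [--id <ceph user name, without the client. prefix>]
mount /dev/rbd0 /mnt/ceph-rbd-dir/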
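RGW exposes a RESTful interface; clients create, read, update, and delete data by calling its API.
This is typically what application code talks to.
A node has to be added as an rgw node.
Object storage is reached on port 7480, and requests are authenticated with access keys.
# Example: add ceph-node2 as an rgw node
# 1. Install the ceph-radosgw package on the server that will become an rgw node
yum install -y ceph-radosgw
# 2. On the ceph-deploy node, add the node as an rgw node
# This creates a service named ceph-radosgw@rgw.<node-name> on the rgw node and enables it at boot
# The rgw node then listens on port 7480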
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy rgw create ceph-node2
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /bin/ceph-deploy rgw create ceph-node2
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] rgw : [('ceph-node2', 'rgw.ceph-node2')]
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : create
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f56cd7c2cf8>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] func : <function rgw at 0x7f56cde08050>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.rgw][DEBUG ] Deploying rgw, cluster ceph hosts ceph-node2:rgw.ceph-node2
[ceph-node2][DEBUG ] connection detected need for sudo
[ceph-node2][DEBUG ] connected to host: ceph-node2
[ceph-node2][DEBUG ] detect platform information from remote host
[ceph-node2][DEBUG ] detect machine type
[ceph_deploy.rgw][INFO ] Distro info: CentOS Linux 7.9.2009 Core
[ceph_deploy.rgw][DEBUG ] remote host will use systemd
[ceph_deploy.rgw][DEBUG ] deploying rgw bootstrap to ceph-node2
[ceph-node2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-node2][WARNIN] rgw keyring does not exist yet, creating one
[ceph-node2][DEBUG ] create a keyring file
[ceph-node2][DEBUG ] create path recursively if it doesn't exist
[ceph-node2][INFO ] Running command: sudo ceph --cluster ceph --name client.bootstrap-rgw --keyring /var/lib/ceph/bootstrap-rgw/ceph.keyring auth get-or-create client.rgw.ceph-node2 osd allow rwx mon allow rw -o /var/lib/ceph/radosgw/ceph-rgw.ceph-node2/keyring
[ceph-node2][INFO ] Running command: sudo systemctl enable ceph-radosgw@rgw.ceph-node2
[ceph-node2][WARNIN] Created symlink from /etc/systemd/system/ceph-radosgw.target.wants/ceph-radosgw@rgw.ceph-node2.service to /usr/lib/systemd/system/ceph-radosgw@.service.
[ceph-node2][INFO ] Running command: sudo systemctl start ceph-radosgw@rgw.ceph-node2
[ceph-node2][INFO ] Running command: sudo systemctl enable ceph.target
[ceph_deploy.rgw][INFO ] The Ceph Object Gateway (RGW) is now running on host ceph-node2 and default port 7480
# 3. Starting rgw also creates its pools automatically
[root@ceph-node2 ~]# ceph osd lspools
... # other pools omitted
3 django-web
4 .rgw.root
5 default.rgw.control
6 default.rgw.meta
7 default.rgw.log
# 4. ceph -s now shows one rgw daemon as well
[root@ceph-node2 ~]# ceph -s
cluster:
id: f1da3a2e-b8df-46ba-9c6b-0030da25c73e
health: HEALTH_WARN
application not enabled on 1 pool(s)
too few PGs per OSD (29 < min 30)
services:
mon: 3 daemons, quorum ceph-node1,ceph-node2,ceph-node3
mgr: ceph-node2(active), standbys: ceph-node3, ceph-node1
osd: 9 osds: 9 up, 9 in
rgw: 1 daemon active
data:
pools: 7 pools, 88 pgs
objects: 211 objects, 37 MiB
usage: 9.2 GiB used, 36 GiB / 45 GiB avail
pgs: 88 active+clean
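The `too few PGs per OSD (29 < min 30)` warning can be checked by hand: the figure is roughly total PGs times replica count divided by the number of OSDs. The sketch below assumes every pool uses 3 replicas, which matches `osd_pool_default_size 3` seen earlier but is not confirmed per pool in this walkthrough.

```shell
#!/bin/sh
# Approximate PG replicas per OSD = total PGs * replica count / OSD count.
# Assumption: every pool has 3 replicas.
pgs_per_osd() {
    total_pgs="$1"; replicas="$2"; osds="$3"
    echo $(( $total_pgs * $replicas / $osds ))
}

pgs_per_osd 88 3 9     # 29, the figure in the warning
pgs_per_osd 120 3 9    # 40, above the minimum of 30
```

The second call uses the 120 PGs reported in the later status output, where the warning no longer appears.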
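So far only ceph-node2 is an rgw node: a single node, with no high availability.
In production, several nodes are usually added as rgw nodes, with nginx reverse-proxying port 7480 on all of them.
# Switch to the ceph-deploy node and add ceph-node1 as an rgw node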
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy rgw create ceph-node1
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /bin/ceph-deploy rgw create ceph-node1
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] rgw : [('ceph-node1', 'rgw.ceph-node1')]
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : create
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7fb193e27cf8>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] func : <function rgw at 0x7fb19446d050>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.rgw][DEBUG ] Deploying rgw, cluster ceph hosts ceph-node1:rgw.ceph-node1
[ceph-node1][DEBUG ] connection detected need for sudo
[ceph-node1][DEBUG ] connected to host: ceph-node1
[ceph-node1][DEBUG ] detect platform information from remote host
[ceph-node1][DEBUG ] detect machine type
[ceph_deploy.rgw][INFO ] Distro info: CentOS Linux 7.9.2009 Core
[ceph_deploy.rgw][DEBUG ] remote host will use systemd
[ceph_deploy.rgw][DEBUG ] deploying rgw bootstrap to ceph-node1
[ceph-node1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-node1][WARNIN] rgw keyring does not exist yet, creating one
[ceph-node1][DEBUG ] create a keyring file
[ceph-node1][DEBUG ] create path recursively if it doesn't exist
[ceph-node1][INFO ] Running command: sudo ceph --cluster ceph --name client.bootstrap-rgw --keyring /var/lib/ceph/bootstrap-rgw/ceph.keyring auth get-or-create client.rgw.ceph-node1 osd allow rwx mon allow rw -o /var/lib/ceph/radosgw/ceph-rgw.ceph-node1/keyring
[ceph-node1][INFO ] Running command: sudo systemctl enable ceph-radosgw@rgw.ceph-node1
[ceph-node1][WARNIN] Created symlink from /etc/systemd/system/ceph-radosgw.target.wants/ceph-radosgw@rgw.ceph-node1.service to /usr/lib/systemd/system/ceph-radosgw@.service.
[ceph-node1][INFO ] Running command: sudo systemctl start ceph-radosgw@rgw.ceph-node1
[ceph-node1][INFO ] Running command: sudo systemctl enable ceph.target
[ceph_deploy.rgw][INFO ] The Ceph Object Gateway (RGW) is now running on host ceph-node1 and default port 7480
# Check the cluster status; there are now 2 rgw daemons
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph -s
cluster:
id: f1da3a2e-b8df-46ba-9c6b-0030da25c73e
health: HEALTH_WARN
application not enabled on 1 pool(s)
services:
mon: 3 daemons, quorum ceph-node1,ceph-node2,ceph-node3
mgr: ceph-node2(active), standbys: ceph-node1, ceph-node3
mds: cephfs-test-1/1/1 up {0=ceph-node3=up:active}, 1 up:standby
osd: 9 osds: 9 up, 9 in
rgw: 2 daemons active
data:
pools: 9 pools, 120 pgs
objects: 234 objects, 37 MiB
usage: 9.2 GiB used, 36 GiB / 45 GiB avail
pgs: 120 active+clean
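# vim /etc/ceph/ceph.conf
[client.rgw.<node-name>]
rgw_host=<node name or node IP address>
rgw_frontends="civetweb port=8880" # this changes only the HTTP port; to add an HTTPS port, use:
rgw_frontends="civetweb port=8880+8443s"
# Add the SSL certificate
rgw_frontends="civetweb port=8880+8443s ssl_certificate=<path to the .pem certificate>"
# The civetweb options below go inside the same rgw_frontends string
num_threads=50 # number of civetweb request-handling threads (50 is civetweb's own default)
access_log_file="log path" # where the access log is written
error_log_file="log path" # where the error log is written
# Restart the rgw service on the node
systemctl restart ceph-radosgw@rgw.<node-name>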
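Because the `rgw_frontends` value is a single quoted string with its own small syntax, it is easy to get the quoting or the `+...s` suffix wrong. As a sketch, the section can be generated instead of typed. The helper name `rgw_section` and the certificate path are hypothetical; the output format simply follows the fragment above.

```shell
#!/bin/sh
# Hypothetical helper: print a [client.rgw.<node>] section for ceph.conf.
# Usage: rgw_section <node> <http-port> [<https-port> <pem-path>]
rgw_section() {
    node="$1"; port="$2"; ssl_port="$3"; pem="$4"
    frontends="civetweb port=$port"
    if [ -n "$ssl_port" ] && [ -n "$pem" ]; then
        # civetweb marks an SSL port with a trailing "s" and joins ports with "+"
        frontends="civetweb port=$port+${ssl_port}s ssl_certificate=$pem"
    fi
    printf '[client.rgw.%s]\nrgw_host = %s\nrgw_frontends = "%s"\n' "$node" "$node" "$frontends"
}

rgw_section ceph-node2 8880
rgw_section ceph-node2 8880 8443 /etc/ceph/rgw.pem   # the .pem path is a placeholder
```

The printed section still has to be added to ceph.conf on the rgw node by hand, followed by the service restart shown above.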
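CephFS performs worse than block storage; where block storage fits, prefer it.
It is similar to NFS, except that clients mount the storage over the Ceph protocol. It generally performs better than NFS.
CephFS requires the mds service (ceph-mds, the metadata server).
Also: clients mount CephFS through the mon nodes, which listen on port 6789. A mount can point at port 6789 on any single mon, go through haproxy or nginx, or list several mon nodes at once.
The mds in this example is a single point of failure, which has to be addressed later.
# 1. Install the ceph-mds package on the server that will become an mds node
yum install -y ceph-mds
# 2. On the ceph-deploy node, add the node as a ceph-mds node
# This creates the service ceph-mds@<node-name> on the mds node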
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy mds create ceph-node3
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /bin/ceph-deploy mds create ceph-node3
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : create
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f93208594d0>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] func : <function mds at 0x7f9320aa7ed8>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] mds : [('ceph-node3', 'ceph-node3')]
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.mds][DEBUG ] Deploying mds, cluster ceph hosts ceph-node3:ceph-node3
[ceph-node3][DEBUG ] connection detected need for sudo
[ceph-node3][DEBUG ] connected to host: ceph-node3
[ceph-node3][DEBUG ] detect platform information from remote host
[ceph-node3][DEBUG ] detect machine type
[ceph_deploy.mds][INFO ] Distro info: CentOS Linux 7.9.2009 Core
[ceph_deploy.mds][DEBUG ] remote host will use systemd
[ceph_deploy.mds][DEBUG ] deploying mds bootstrap to ceph-node3
[ceph-node3][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-node3][WARNIN] mds keyring does not exist yet, creating one
[ceph-node3][DEBUG ] create a keyring file
[ceph-node3][DEBUG ] create path if it doesn't exist
[ceph-node3][INFO ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mds --keyring /var/lib/ceph/bootstrap-mds/ceph.keyring auth get-or-create mds.ceph-node3 osd allow rwx mds allow mon allow profile mds -o /var/lib/ceph/mds/ceph-ceph-node3/keyring
[ceph-node3][INFO ] Running command: sudo systemctl enable ceph-mds@ceph-node3
[ceph-node3][WARNIN] Created symlink from /etc/systemd/system/ceph-mds.target.wants/ceph-mds@ceph-node3.service to /usr/lib/systemd/system/ceph-mds@.service.
[ceph-node3][INFO ] Running command: sudo systemctl start ceph-mds@ceph-node3
[ceph-node3][INFO ] Running command: sudo systemctl enable ceph.target
# 3. The mds is not usable yet; create the pools that will hold the CephFS metadata and data
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph mds stat
, 1 up:standby
# 3.1 Create the pool dedicated to metadata
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph osd pool create cephfs-metadata-pool 16
pool 'cephfs-metadata-pool' created
# 3.2 Create the pool that stores the actual data
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph osd pool create cephfs-data-pool 16
pool 'cephfs-data-pool' created
# 4. Create the CephFS filesystem
# make new filesystem using named pools <metadata> and <data>
Syntax: ceph fs new <fs_name> <metadata> <data> {--force} {--allow-dangerous-metadata-overlay}
<metadata>: the metadata pool
<data>: the data pool
# 4.1 Create the filesystem
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph fs new cephfs-test cephfs-metadata-pool cephfs-data-pool
new fs with metadata pool 8 and data pool 9
# 5. View the CephFS
# With ceph fs status
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph fs status
cephfs-test - 0 clients
===========
+------+--------+------------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+--------+------------+---------------+-------+-------+
| 0 | active | ceph-node3 | Reqs: 0 /s | 10 | 13 |
+------+--------+------------+---------------+-------+-------+
+----------------------+----------+-------+-------+
| Pool | type | used | avail |
+----------------------+----------+-------+-------+
| cephfs-metadata-pool | metadata | 2286 | 11.1G |
| cephfs-data-pool | data | 0 | 11.1G |
+----------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
+-------------+
MDS version: ceph version 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable)
# 5.1 With ceph fs ls
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph fs ls
name: cephfs-test, metadata pool: cephfs-metadata-pool, data pools: [cephfs-data-pool ]
# 5.2 With ceph mds stat
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph mds stat
cephfs-test-1/1/1 up {0=ceph-node3=up:active}
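Steps 3 to 5 can be summarized as a short command plan. The sketch below only prints the commands (a dry run) so the order can be reviewed before anything touches the cluster; cephfs_create_plan is a hypothetical helper name, and the pool names and the PG count of 16 are the values used in this walkthrough.

```shell
#!/bin/sh
# Print (do not run) the commands that create a CephFS from two new pools.
# Arguments: fs name, metadata pool, data pool, PG count per pool.
# cephfs_create_plan is a hypothetical helper, not part of Ceph.
cephfs_create_plan() (
    fs=$1; meta=$2; data=$3; pgs=$4
    echo "ceph osd pool create ${meta} ${pgs}"
    echo "ceph osd pool create ${data} ${pgs}"
    # Argument order matters: metadata pool first, then data pool.
    echo "ceph fs new ${fs} ${meta} ${data}"
    echo "ceph fs status ${fs}"
)

cephfs_create_plan cephfs-test cephfs-metadata-pool cephfs-data-pool 16
```

Piping the output to `sh` would execute the plan on a node that has the admin keyring.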
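# 6. Mount CephFS on a client
# 6.1 The client needs ceph-common installed
yum install -y ceph-common
# 6.2 It also needs a credential file. Here the admin one is copied; do not use admin in production (same as with rbd and radosgw above).
scp
# 6.3 Mount CephFS with mount.ceph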
[ceph@ceph-deploy ceph-cluster-deploy]$ mount.ceph -h
mount.ceph monaddr1[,monaddr2,...]:/[subdir] dir [-o options ]
src specifies the server and port; the port defaults to 6789 when omitted
1. hostname or hostname:port
2. ip or ip:port
options:
-h: Print this help
-n: Do not update /etc/mtab
-v: Verbose
ceph-options: refer to mount.ceph(8)
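Commonly used ceph-options:
name=xxx # the cephx user name; defaults to guest when omitted
secret=xxx # the cephx user's key
secretfile=xx # path to a file holding the key, so secret is not needed; this is also safer. (The file is not a keyring: it holds only the base64 string that follows "key =" in the keyring. Copy it by hand, or print it with ceph auth print-key <entity>.)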
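When only a keyring file is at hand, the key for secretfile can be cut out of it with awk. This is a minimal sketch: keyring_to_key is a hypothetical helper name, and the sample keyring below uses a placeholder entity and key, not real credentials.

```shell
#!/bin/sh
# Read a keyring on stdin and print the base64 key of its first entry.
# keyring_to_key is a hypothetical helper, not part of Ceph.
keyring_to_key() {
    awk '$1 == "key" && $2 == "=" { print $3; exit }'
}

# Placeholder keyring for demonstration only.
sample='[client.demo]
	key = QUJDREVGR0hJSktMTU5PUA==
	caps mon = "allow r"'

printf '%s\n' "$sample" | keyring_to_key
```

On a real client this would be used as `keyring_to_key < /etc/ceph/ceph.client.admin.keyring > /etc/ceph/admin.key`, followed by `chmod 600` on the key file; the target path is an assumption.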
[ceph@ceph-deploy ceph-cluster-deploy]$ sudo mount.ceph 192.168.2.121:6789:/ /mnt/ceph-fs-dir -o name=admin,secret=AQCAGoVlpj0zERAA5dhEHlg/a5TyQhPPlTigUg==
# 6.4 Write test data
[root@ceph-deploy ceph-cluster-deploy]$ echo 123456 > /mnt/ceph-fs-dir/123.txt
# 6.5 Check the mount:
[root@ceph-deploy ~]# df -Th
Filesystem Type Size Used Avail Use% Mounted on
devtmpfs devtmpfs 898M 0 898M 0% /dev
tmpfs tmpfs 910M 0 910M 0% /dev/shm
tmpfs tmpfs 910M 9.6M 901M 2% /run
tmpfs tmpfs 910M 0 910M 0% /sys/fs/cgroup
/dev/mapper/centos-root xfs 20G 2.4G 18G 12% /
/dev/sda1 xfs 509M 144M 366M 29% /boot
tmpfs tmpfs 182M 0 182M 0% /run/user/0
/dev/rbd0 ext4 976M 2.6M 907M 1% /mnt/ceph-rbd-dir
192.168.2.121:6789:/ ceph 12G 0 12G 0% /mnt/ceph-fs-dir
# 7. Extra: add the mount to /etc/fstab so that it is mounted at boot.
# Mounting by hand every time is tedious; either put the command in rc.local or add an fstab entry.
[ceph@ceph-deploy ceph-cluster-deploy]$ sudo mount.ceph 192.168.2.121,192.168.2.122,192.168.2.123:/ /mnt/ceph-fs-dir/ -o name=admin,secret=AQCAGoVlpj0zERAA5dhEHlg/a5TyQhPPlTigUg==
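# Add the following to fstab:
# _netdev: required for network filesystems (NFS too). Without it the system tries to mount before the network is up, cannot reach the servers, and hangs during boot.
# noatime: do not update a file's atime on every access. Frequent atime updates waste performance under high concurrency; use as needed.
# defaults: the default mount options: rw, suid, dev, exec, auto, nouser, and async.
# [Format]:
# <mon:port>[,<mon:port>,...]:/ <mount-point> ceph defaults,_netdev,noatime,name=<ceph user>,secretfile=<user's key file>|secret=<user's key> 0 0
# Back up fstab first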
[root@ceph-deploy ~]# cp /etc/fstab{,.bak}
# Add the entry at the end of fstab
#
# /etc/fstab
# Created by anaconda on Thu Dec 21 23:51:13 2023
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/centos-root / xfs defaults 0 0
UUID=4b1bb372-7f34-48f6-8852-036ee6dfd125 /boot xfs defaults 0 0
# Mount CephFS (added by swq)
192.168.2.121,192.168.2.122,192.168.2.123:/ /mnt/ceph-fs-dir ceph defaults,_netdev,noatime,name=admin,secret=AQCAGoVlpj0zERAA5dhEHlg/a5TyQhPPlTigUg== 0 0
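# Unmount the CephFS that was mounted by hand earlier
[root@ceph-deploy ~]# umount /mnt/ceph-fs-dir/
# Test by hand that the fstab entry mounts correctly
[root@ceph-deploy ~]# mount -a
# No errors; now check whether it is mounted
[root@ceph-deploy ~]# df -Th
Filesystem Type Size Used Avail Use% Mounted on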
devtmpfs devtmpfs 898M 0 898M 0% /dev
tmpfs tmpfs 910M 0 910M 0% /dev/shm
tmpfs tmpfs 910M 9.6M 901M 2% /run
tmpfs tmpfs 910M 0 910M 0% /sys/fs/cgroup
/dev/mapper/centos-root xfs 20G 2.4G 18G 12% /
/dev/sda1 xfs 509M 144M 366M 29% /boot
tmpfs tmpfs 182M 0 182M 0% /run/user/0
192.168.2.121,192.168.2.122,192.168.2.123:/ ceph 12G 0 12G 0% /mnt/ceph-fs-dir
[root@ceph-deploy ~]# ll /mnt/ceph-fs-dir/
total 1
-rw-r--r-- 1 root root 7 Dec 25 21:17 123.txt
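The fstab entry follows a fixed pattern, so it can be generated rather than typed. The sketch below builds the line with secretfile instead of an inline secret, which keeps the key out of a world-readable file; cephfs_fstab_line is a hypothetical helper name and /etc/ceph/admin.key is an assumed path for a file that holds only the key.

```shell
#!/bin/sh
# Build one /etc/fstab line for a kernel CephFS mount.
# Arguments: comma-separated mon addresses, mount point, cephx user, secret file.
# cephfs_fstab_line is a hypothetical helper, not part of Ceph.
cephfs_fstab_line() (
    mons=$1; mountpoint=$2; user=$3; secretfile=$4
    opts="defaults,_netdev,noatime,name=${user},secretfile=${secretfile}"
    printf '%s:/ %s ceph %s 0 0\n' "$mons" "$mountpoint" "$opts"
)

cephfs_fstab_line 192.168.2.121,192.168.2.122,192.168.2.123 \
    /mnt/ceph-fs-dir admin /etc/ceph/admin.key
```

Appending the printed line to /etc/fstab and running `mount -a` gives the same check as the manual procedure above.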
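CephFS is accessed through the mon nodes on port 6789. The mon nodes are highly available, but so far only one node provides the mds service.
So we need to add more mds daemons for high availability, or even run two active/standby pairs.
High availability: one active, multiple standbys.
- Simple to set up: just add more mds nodes. One active with multiple standbys is the default behavior.
High availability + high performance: two active/standby mds pairs serving at the same time.
- Requires setting some parameters and configuration options by hand.
# View the mds service currently in the cluster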
[root@ceph-deploy ~]# ceph -s
cluster:
id: f1da3a2e-b8df-46ba-9c6b-0030da25c73e
health: HEALTH_WARN
application not enabled on 1 pool(s)
services:
mon: 3 daemons, quorum ceph-node1,ceph-node2,ceph-node3
mgr: ceph-node2(active), standbys: ceph-node1, ceph-node3
# Only ceph-node3 provides the mds service.
mds: cephfs-test-1/1/1 up {0=ceph-node3=up:active}
osd: 9 osds: 9 up, 9 in
rgw: 1 daemon active
data:
pools: 9 pools, 120 pgs
objects: 234 objects, 37 MiB
usage: 9.2 GiB used, 36 GiB / 45 GiB avail
pgs: 120 active+clean
# ceph fs status shows the same thing.
[root@ceph-deploy ~]# ceph fs status
cephfs-test - 0 clients
===========
+------+--------+------------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+--------+------------+---------------+-------+-------+
| 0 | active | ceph-node3 | Reqs: 0 /s | 11 | 14 |
+------+--------+------------+---------------+-------+-------+
+----------------------+----------+-------+-------+
| Pool | type | used | avail |
+----------------------+----------+-------+-------+
| cephfs-metadata-pool | metadata | 7179 | 11.1G |
| cephfs-data-pool | data | 7 | 11.1G |
+----------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
+-------------+
# Log in to the ceph-deploy node and switch to the ceph user.
# Add ceph-node2 as another mds
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy mds create ceph-node2
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /bin/ceph-deploy mds create ceph-node2
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : create
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f9c6190e4d0>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] func : <function mds at 0x7f9c61b5ced8>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] mds : [('ceph-node2', 'ceph-node2')]
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.mds][DEBUG ] Deploying mds, cluster ceph hosts ceph-node2:ceph-node2
[ceph-node2][DEBUG ] connection detected need for sudo
[ceph-node2][DEBUG ] connected to host: ceph-node2
[ceph-node2][DEBUG ] detect platform information from remote host
[ceph-node2][DEBUG ] detect machine type
[ceph_deploy.mds][INFO ] Distro info: CentOS Linux 7.9.2009 Core
[ceph_deploy.mds][DEBUG ] remote host will use systemd
[ceph_deploy.mds][DEBUG ] deploying mds bootstrap to ceph-node2
[ceph-node2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-node2][WARNIN] mds keyring does not exist yet, creating one
[ceph-node2][DEBUG ] create a keyring file
[ceph-node2][DEBUG ] create path if it doesn't exist
[ceph-node2][INFO ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mds --keyring /var/lib/ceph/bootstrap-mds/ceph.keyring auth get-or-create mds.ceph-node2 osd allow rwx mds allow mon allow profile mds -o /var/lib/ceph/mds/ceph-ceph-node2/keyring
[ceph-node2][INFO ] Running command: sudo systemctl enable ceph-mds@ceph-node2
[ceph-node2][WARNIN] Created symlink from /etc/systemd/system/ceph-mds.target.wants/ceph-mds@ceph-node2.service to /usr/lib/systemd/system/ceph-mds@.service.
[ceph-node2][INFO ] Running command: sudo systemctl start ceph-mds@ceph-node2
[ceph-node2][INFO ] Running command: sudo systemctl enable ceph.target
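Current environment:
ceph-node2 and ceph-node3 are both mds nodes.
Add two more nodes as mds nodes:
ceph-node1 and ceph-deploy (I have no spare machines, so ceph-deploy is reused; do not mix roles like this in production).
# Run the commands to add each of the two nodes as an mds node.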
ceph-deploy mds create ceph-node1
ceph-deploy mds create ceph-deploy
# Check the CephFS status
# It is still one active mds with multiple standbys.
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph fs status
cephfs-test - 0 clients
===========
+------+--------+------------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+--------+------------+---------------+-------+-------+
| 0 | active | ceph-node3 | Reqs: 0 /s | 11 | 14 |
+------+--------+------------+---------------+-------+-------+
+----------------------+----------+-------+-------+
| Pool | type | used | avail |
+----------------------+----------+-------+-------+
| cephfs-metadata-pool | metadata | 7179 | 11.1G |
| cephfs-data-pool | data | 7 | 11.1G |
+----------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
| ceph-node2 |
| ceph-node1 |
| ceph-deploy |
+-------------+
# Raise the maximum number of active mds daemons so that several serve at once
Syntax: ceph fs set <ceph-fs-name> max_mds <NUM>
ceph fs set cephfs-test max_mds 2
# Check the CephFS status again
# There are now 2 ranks serving requests.
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph fs status
cephfs-test - 0 clients
===========
+------+--------+-------------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+--------+-------------+---------------+-------+-------+
| 0 | active | ceph-node3 | Reqs: 0 /s | 11 | 14 |
| 1 | active | ceph-deploy | Reqs: 0 /s | 0 | 0 |
+------+--------+-------------+---------------+-------+-------+
+----------------------+----------+-------+-------+
| Pool | type | used | avail |
+----------------------+----------+-------+-------+
| cephfs-metadata-pool | metadata | 7835 | 11.1G |
| cephfs-data-pool | data | 7 | 11.1G |
+----------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
| ceph-node2 |
| ceph-node1 |
+-------------+
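By default the mon nodes decide which mds daemons are active and which are standbys; a standby takes over when an active mds fails.
Suppose we want ceph-node2 to be the standby for ceph-node3, and ceph-node1 to be the standby for ceph-deploy.
Ceph provides options that control how a standby mds takes over from an active one.
We can declare this pairing in ceph.conf.
# vim ceph.conf (the copy in the ceph-deploy working directory; config push below distributes it to /etc/ceph/ceph.conf on each node)
# Add the following
[mds.ceph-node2] # settings for the mds daemon on ceph-node2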
mds_standby_replay=true
mds_standby_for_name=ceph-node3
[mds.ceph-node1]
mds_standby_replay=true
mds_standby_for_name=ceph-deploy
# Push the updated configuration file to the nodes
ceph-deploy --overwrite-conf config push ceph-node2
ceph-deploy --overwrite-conf config push ceph-node1
ceph-deploy --overwrite-conf config push ceph-node3
ceph-deploy --overwrite-conf config push ceph-deploy
# Restart the mds service on the standby mds nodes
systemctl restart ceph-mds@<node-name>
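With more than a couple of pairs, the [mds.*] sections can be generated from a list of "standby active" pairs. This is a sketch under assumptions: render_mds_standby is a hypothetical helper name, and the per-daemon options mds_standby_replay and mds_standby_for_name are the ones used above on Mimic; as far as I know they were dropped in later releases in favor of a per-filesystem setting, so check the docs for your version.

```shell
#!/bin/sh
# Print a [mds.<standby>] section that pins a standby to one active MDS.
# render_mds_standby is a hypothetical helper, not part of Ceph.
render_mds_standby() (
    standby=$1; active=$2
    printf '[mds.%s]\n' "$standby"
    printf 'mds_standby_replay = true\n'
    printf 'mds_standby_for_name = %s\n' "$active"
)

# Read "standby active" pairs, one per line, and print all sections.
while read -r standby active; do
    render_mds_standby "$standby" "$active"
done <<'EOF'
ceph-node2 ceph-node3
ceph-node1 ceph-deploy
EOF
```

The output is meant to be appended to the ceph.conf that is pushed with ceph-deploy.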
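In some situations, such as a planned power outage in the server room, the ceph services must be shut down in advance.
To keep the cluster from treating the stopped OSDs as failed and marking them out, set the noout flag before shutting down.
ceph osd set noout
After the cluster is started again, clear the flag so that failed OSDs are detected and marked out automatically as usual.
ceph osd unset noout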
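The set/unset pair is easy to forget halfway through maintenance, so it can be wrapped around the work itself. The sketch below is a dry run by default: the CEPH variable and the with_noout function are hypothetical names introduced here, and CEPH must be set to `ceph` to act on a real cluster.

```shell
#!/bin/sh
# Run a command with the noout flag set, and always clear the flag afterwards.
# CEPH defaults to a dry-run "echo ceph"; set CEPH=ceph to act on a cluster.
CEPH=${CEPH:-echo ceph}

# with_noout is a hypothetical helper, not part of Ceph.
with_noout() {
    $CEPH osd set noout || return 1
    "$@"
    status=$?
    # Clear the flag even when the wrapped command failed.
    $CEPH osd unset noout
    return "$status"
}

with_noout echo "maintenance work goes here"
```

The wrapper returns the exit status of the wrapped command, so a failed maintenance step is still visible to the caller.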
The startup order is exactly the reverse of the shutdown order.
View PGs
[root@ceph-node3 ~]# ceph pg ls
PG OBJECTS DEGRADED MISPLACED UNFOUND BYTES OMAP_BYTES* OMAP_KEYS* LOG STATE STATE_STAMP VERSION REPORTED UP ACTING SCRUB_STAMP DEEP_SCRUB_STAMP
1.0 0 0 0 0 0 0 0 0 active+clean 2023-12-25 17:02:12.477859 0'0 104:105 [3,7,2]p3 [3,7,2]p3 2023-12-24 17:52:07.729007 2023-12-22 14:51:21.195350
1.1 0 0 0 0 0 0 0 0 active+clean 2023-12-25 19:10:29.506390 0'0 104:115 [7,0,5]p7 [7,0,5]p7 2023-12-25 19:10:29.506220 2023-12-24 17:52:11.632761
1.2 0 0 0 0 0 0 0 0 active+clean 2023-12-25 20:03:07.345014 0'0 104:120 [4,0,8]p4 [4,0,8]p4 2023-12-25 20:03:07.344982 2023-12-22 14:51:21.195350
1.3 0 0 0 0 0 0 0 0 active+clean 2023-12-25 17:13:35.967744 0'0 104:87 [1,5,8]p1 [1,5,8]p1 2023-12-24 17:51:58.414104 2023-12-22 14:51:21.195350
1.4 0 0 0 0 0 0 0 0 active+clean 2023-12-25 21:31:42.912662 0'0 105:123 [7,3,1]p7 [7,3,1]p7 2023-12-25 21:31:42.912604 2023-12-22 14:51:21.195350
1.5 0 0 0 0 0 0 0 0 active+clean 2023-12-25 17:07:55.059959 0'0 104:99 [8,2,5]p8 [8,2,5]p8 2023-12-24 17:52:04.692830 2023-12-22 14:51:21.195350
1.6 0 0 0 0 0 0 0 0 active+clean 2023-12-25 17:13:45.965625 0'0 104:85 [6,4,1]p6 [6,4,1]p6 2023-12-24 17:51:58.879476 2023-12-22 14:51:21.195350
# Too many columns; keep only a few
[root@ceph-node3 ~]# ceph pg ls | awk '{print $1, $2, $10, $15, $16}' | head -n 3
PG OBJECTS STATE UP ACTING
1.0 0 active+clean [3,7,2]p3 [3,7,2]p3
1.1 0 active+clean [7,0,5]p7 [7,0,5]p7
.... omitted ....
The normal state is "active+clean".
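Since "active+clean" is the healthy state, a quick check is to list every PG that is in any other state. The sketch below filters `ceph pg ls` output on the STATE column (field 10 in the output shown above); pg_not_clean is a hypothetical helper name, and the second sample row is made up to show a non-clean PG.

```shell
#!/bin/sh
# Read `ceph pg ls` output on stdin; print the ID and state of every PG
# whose STATE column (field 10) is not exactly "active+clean".
# pg_not_clean is a hypothetical helper, not part of Ceph.
pg_not_clean() {
    awk 'NR > 1 && NF >= 10 && $10 != "active+clean" { print $1, $10 }'
}

# Sample input: row 1.1 is invented for illustration.
pg_not_clean <<'EOF'
PG OBJECTS DEGRADED MISPLACED UNFOUND BYTES OMAP_BYTES* OMAP_KEYS* LOG STATE STATE_STAMP
1.0 0 0 0 0 0 0 0 0 active+clean 2023-12-25 17:02:12.477859
1.1 0 0 0 0 0 0 0 0 active+undersized+degraded 2023-12-25 19:10:29.506390
EOF
```

On a cluster node the function would be used as `ceph pg ls | pg_not_clean`; empty output means every PG is active+clean.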
Common states
Create, delete, modify, and query pools…
Syntax:
# This form creates a replicated pool
ceph osd pool create <poolname> <pg-num> [<pgp-num>]
# Create an erasure-coded pool (used less often)
ceph osd pool create <pool-name> <pg-num> <pgp-num> erasure [erasure-code-profile] [crush-rule-name] [expected-num-objects]
# List all pools
ceph osd pool ls [detail]
detail: show more information, such as the pool id and the pool attributes.
# List all pools together with their ids
ceph osd lspools
# Show statistics for all pools or for one pool
ceph osd pool stats [pool-name]
# Show pool capacity and usage
rados df
ceph df [detail]
ceph osd pool rename <poolname> <new-poolname>
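To prevent accidental deletion, Ceph has 2 safeguards that control whether a pool can be deleted.
1. The pool attribute nodelete, false by default. While it is true, the pool cannot be deleted.
2. Deletion goes through the mon nodes, which refuse to delete pools by default (mon_allow_pool_delete is false).
Set mon_allow_pool_delete to true right before deleting, and back to false once the deletion is done.
Temporary setting:
ceph tell 'mon.*' injectargs '--mon-allow-pool-delete={true|false}'
Pool deletion syntax:
ceph osd pool rm <pool-name> <pool-name> --yes-i-really-really-mean-it
Yes, the pool name has to be typed twice, just like confirming a password.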
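The three steps (open the mon switch, delete, close the switch) can be written down as one plan so the switch is never left open. The sketch below only prints the commands; pool_delete_plan is a hypothetical helper name and scratch-pool is a made-up pool name.

```shell
#!/bin/sh
# Print (do not run) the commands for deleting one pool with the mon
# safety switch opened only for the duration of the delete.
# pool_delete_plan is a hypothetical helper, not part of Ceph.
pool_delete_plan() (
    pool=$1
    echo "ceph tell 'mon.*' injectargs '--mon-allow-pool-delete=true'"
    echo "ceph osd pool rm ${pool} ${pool} --yes-i-really-really-mean-it"
    echo "ceph tell 'mon.*' injectargs '--mon-allow-pool-delete=false'"
)

pool_delete_plan scratch-pool
```

Printing first and executing later is a deliberate choice here, because pool deletion cannot be undone.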
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph osd pool get django-web all
size: 3
min_size: 2
pg_num: 16
pgp_num: 16
crush_rule: replicated_rule # CRUSH rule of the pool; replicated pools are the default
hashpspool: true
nodelete: false
nopgchange: false
nosizechange: false
write_fadvise_dontneed: false
noscrub: false
nodeep-scrub: false
use_gmt_hitset: 1
auid: 0
fast_read: 0
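Syntax: ceph osd pool set <pool name> <key> <value>
Commonly used attributes:
size: number of replicas kept for each object in the pool
min_size: minimum number of replicas that must be available for the pool to serve I/O
pg_num: number of PGs in the pool
pgp_num: effective number of PGs used when computing data placement
nodelete: the pool cannot be deleted; false by default
nopgchange: pg_num and pgp_num cannot be changed; false by default
nosizechange: the pool's size (replica count) cannot be changed; false by default
noscrub: disable light scrubbing of the pool; false by default.
By default Ceph runs a light scrub once a day (mainly to check OSD data consistency); it uses little I/O.
nodeep-scrub: disable deep scrubbing of the pool; false by default.
By default Ceph runs a deep scrub once a week; it checks OSD data consistency more thoroughly but is I/O intensive.
scrub_min_interval: minimum interval between scrubs of the pool when cluster load is low; the default 0 means the value comes from osd_scrub_min_interval in the configuration file;
scrub_max_interval: maximum interval between scrubs of the pool; the default 0 means the value comes from osd_scrub_max_interval in the configuration file;
deep_scrub_interval: interval between deep scrubs of the pool; the default 0 means the value comes from osd_deep_scrub_interval in the configuration file;
Pool quotas can be set in two ways:
Number of objects (max_objects)
Available space (max_bytes)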
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph osd pool get-quota django-web
quotas for pool 'django-web':
max objects: N/A
max bytes : N/A
[root@ceph-deploy ~]# ceph osd pool set-quota django-web max_bytes 2G
set-quota max_bytes = 2147483648 for pool django-web
[root@ceph-deploy ~]# ceph osd pool get-quota django-web
quotas for pool 'django-web':
max objects: N/A
max bytes : 2 GiB
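The set-quota output shows that 2G is stored as 2147483648 bytes, which is 2 × 1024³. The sketch below reproduces that conversion for planning quotas; to_bytes is a hypothetical helper name and it assumes binary multiples with a single-letter K/M/G/T suffix.

```shell
#!/bin/sh
# Convert a size such as 512M, 2G or 1T to bytes using binary multiples.
# to_bytes is a hypothetical helper, not part of Ceph.
to_bytes() (
    size=$1
    number=${size%[KMGT]}
    case $size in
        *K) factor=1024 ;;
        *M) factor=$((1024 * 1024)) ;;
        *G) factor=$((1024 * 1024 * 1024)) ;;
        *T) factor=$((1024 * 1024 * 1024 * 1024)) ;;
        *)  factor=1 ;;
    esac
    echo $((number * factor))
)

to_bytes 2G
```

The printed value can be passed directly to `ceph osd pool set-quota <pool> max_bytes`.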
A pool snapshot covers an entire pool: backup, management, and similar operations apply to the pool as a whole.
Snapshots take up a fair amount of space, so use them as needed.
# Create a snapshot
ceph osd pool mksnap <poolname> <snap>
rados -p <poolname> mksnap <snap>
# Delete a snapshot
ceph osd pool rmsnap <poolname> <snap>
rados -p <poolname> rmsnap <snap>
# List the snapshots in a pool
rados -p <poolname> lssnap
# List the snapshots of one object
rados -p <poolname> listsnaps <obj-name>
# Restore from a snapshot: roll back object to snap <snap-name>
rados -p <poolname> rollback <obj-name> <snap-name>
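RBD is itself a client of the RADOS cluster. It abstracts the storage provided by a pool into one or more images (exposed as block devices) and offers block-level storage to clients.
rbd feature enable|disable <image-spec> <features>...
For example:
rbd feature disable django-web/img001 object-map fast-diff deep-flatten
Need more space? Grow the image.
But do not shrink it casually; the same caution applies as with LVM.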
usage:
rbd resize [--pool <pool>] [--image <image>] --size <size>
[--allow-shrink] [--no-progress]
<image-spec>
Resize (expand or shrink) image.
Positional arguments
<image-spec> image specification
(example: [<pool-name>/]<image-name>)
Optional arguments
-p [ --pool ] arg pool name
--image arg image name
-s [ --size ] arg image size (in M/G/T) [default: M]
--allow-shrink permit shrinking # required whenever the image is shrunk
--no-progress disable progress output
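P.S. I map the rbd directly on the ceph-deploy node; this is for testing only.
# 1. Check the currently mounted rbd: it has 1G of space.
[root@ceph-deploy ~]# df -Th
Filesystem Type Size Used Avail Use% Mounted on
... # omitted
/dev/rbd0 ext4 976M 2.6M 907M 1% /mnt/ceph-rbd-dir
# 2. View the size information of image img002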
[root@ceph-deploy ~]# rbd info -p django-web --image img002
rbd image 'img002':
size 1 GiB in 256 objects
order 22 (4 MiB objects)
id: 25c446b8b4567
block_name_prefix: rbd_data.25c446b8b4567
format: 2
features: layering
op_features:
flags:
create_timestamp: Mon Dec 25 20:27:10 2023
# 3. Unmount the filesystem and unmap the rbd first
[root@ceph-deploy ~]# umount /mnt/ceph-rbd-dir/
[root@ceph-deploy ~]# rbd unmap /dev/rbd0
# 4. Grow the image
[root@ceph-deploy ~]# rbd resize -p django-web --image img002 --size 2G
Resizing image: 100% complete...done.
# 4.1 Map the rbd again on the client
[root@ceph-deploy ~]# rbd map -p django-web --image img002
/dev/rbd0
# 4.2 Check the size of the block device: it is now 2G
[root@ceph-deploy ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 20G 0 disk
├─sda1 8:1 0 512M 0 part /boot
└─sda2 8:2 0 19.5G 0 part
└─centos-root 253:0 0 19.5G 0 lvm /
sr0 11:0 1 1024M 0 rom
rbd0 252:0 0 2G 0 disk
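# 4.3 Mount the device again. The filesystem itself has not grown yet, so it still shows the old size and must be extended separately
[root@ceph-deploy ~]# mount /dev/rbd0 /mnt/ceph-rbd-dir/
[root@ceph-deploy ~]# df -Th
Filesystem Type Size Used Avail Use% Mounted on
... # other filesystems omitted
# Still 1G of space
/dev/rbd0 ext4 976M 2.6M 907M 1% /mnt/ceph-rbd-dir
# 4.4 Grow the filesystem
# Mine is ext4, so resize2fs does the job
# For xfs, use xfs_growfs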
[root@ceph-deploy ~]# resize2fs /dev/rbd0
resize2fs 1.42.9 (28-Dec-2013)
Filesystem at /dev/rbd0 is mounted on /mnt/ceph-rbd-dir; on-line resizing required
old_desc_blocks = 1, new_desc_blocks = 1
The filesystem on /dev/rbd0 is now 524288 blocks long.
# 4.5 Check the filesystem size again: it now has 2G of space
[root@ceph-deploy ~]# df -Th
Filesystem Type Size Used Avail Use% Mounted on
... other filesystems omitted
/dev/rbd0 ext4 2.0G 3.1M 1.9G 1% /mnt/ceph-rbd-dir
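The rbd info output earlier reports "size 1 GiB in 256 objects" with "order 22 (4 MiB objects)": the object size is 2^order bytes, and the object count is the image size divided by that, rounded up. The sketch below redoes the arithmetic for the image before and after the resize; rbd_object_count is a hypothetical helper name.

```shell
#!/bin/sh
# Number of RADOS objects backing an image: ceil(size / 2^order).
# rbd_object_count is a hypothetical helper, not part of the rbd CLI.
rbd_object_count() (
    size_bytes=$1; order=$2
    object_size=$((1 << order))
    echo $(( (size_bytes + object_size - 1) / object_size ))
)

rbd_object_count $((1024 * 1024 * 1024)) 22       # 1 GiB image, prints 256
rbd_object_count $((2 * 1024 * 1024 * 1024)) 22   # after resizing to 2 GiB, prints 512
```

The objects are allocated lazily, so this is the maximum number of objects rather than the number that exist right after a resize.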
When using rbd block storage we work with images in a pool. Each image can be snapshotted individually.
An image snapshot targets one image in a pool: create, delete, manage, and so on.
rbd
snap create (snap add) Create a snapshot.
snap limit clear Remove snapshot limit.
snap limit set Limit the number of snapshots.
snap list (snap ls) Dump list of image snapshots.
snap protect Prevent a snapshot from being deleted.
snap purge Delete all unprotected snapshots.
snap remove (snap rm) Delete a snapshot.
snap rename Rename a snapshot.
snap rollback (snap revert) Rollback image to snapshot.
snap unprotect Allow a snapshot to be deleted.
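All the earlier examples use the admin account, which has very broad permissions and is not suitable for production.
A keyring file can hold several users, told apart by their [client.<user-name>] sections, but this is rarely done in practice.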
# /etc/ceph/ceph.client.admin.keyring
[client.admin]
# the secret key
key = AQCAGoVlpj0zERAA5dhEHlg/a5TyQhPPlTigUg==
# caps means capabilities, i.e. permissions.
# caps <daemon-type> = "allow <caps>"
caps mds = "allow *"
caps mgr = "allow *"
caps mon = "allow *"
caps osd = "allow *"
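How are permissions defined? Meaning of the caps format in a keyring file:
Format: allow <caps>
caps:
r: read access
w: write access
x: can call class methods (both read and write) and perform auth operations on the mon nodes.
class-read: subset of x; lets the user call class read methods
class-write: subset of x; lets the user call class write methods
* : all permissions
profile osd:
Lets the user connect as an OSD to other OSDs or to the mons.
Lets the user obtain OSD status information.
profile mds
Lets the user connect as an MDS to other MDSs or to the mons.
Lets the user obtain MDS status information.
profile bootstrap-osd
Allows bootstrapping OSDs. Usually granted to the deploy node or to users on nodes that are allowed to deploy OSDs.
profile bootstrap-mds
Allows bootstrapping MDSs. Usually granted to the deploy node or to users on nodes that are allowed to deploy MDSs.
pool=<poolname> : restricts the caps to the given pool
A regular user needs at least read access on mon. On osd it needs read/write, restricted to specific pools; otherwise it can read and write every pool.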
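The rule of thumb above (read on mon, read/write on osd limited to one pool) can be turned into a small command builder. The sketch below only prints the command; client_auth_cmd is a hypothetical helper name, and the user and pool names are the ones that appear elsewhere in these notes.

```shell
#!/bin/sh
# Print a `ceph auth get-or-create` command for a client limited to one pool.
# Arguments: user id (without "client."), pool name, osd permissions (default rw).
# client_auth_cmd is a hypothetical helper, not part of Ceph.
client_auth_cmd() (
    user=$1; pool=$2; perms=${3:-rw}
    printf "ceph auth get-or-create client.%s mon 'allow r' osd 'allow %s pool=%s'\n" \
        "$user" "$perms" "$pool"
)

client_auth_cmd lisi cephfs-data-pool
client_auth_cmd zhangsan django-web rwx
```

A CephFS client additionally needs an mds cap, which this sketch leaves out on purpose to keep the pattern visible.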
# Add a user; caps can be given at creation time
ceph auth add <entity> {<caps> [<caps>...]}
# Change a user's caps (the given caps replace the existing ones)
ceph auth caps <entity> <caps> [<caps>...]
# Export a user's keyring; without an entity, export the keyrings of all users.
ceph auth export {<entity>}
# Print a user's keyring
ceph auth get <entity>
# Print a user's key
ceph auth get-key <entity>
# Get or add a user: if the user does not exist, it is created.
# When creating: caps can be given at the same time.
# When getting: returns the existing user.
ceph auth get-or-create <entity> {<caps> [<caps>...]}
# Same as above, but prints only the user's key, without the caps.
ceph auth get-or-create-key <entity> {<caps> [<caps>...]}
# Import users from a keyring file
ceph auth import -i <keyring-file>
# List all users and their caps
ceph auth ls
# Print a user's key
ceph auth print-key <entity>
# Delete a user together with all of its caps
ceph auth rm <entity>
[root@ceph-deploy ~]# ceph auth ls
installed auth entries:
# The format is TYPE.ID
# Here the type is mds and ceph-deploy is the node name
mds.ceph-deploy
key: AQD4l4plEMeOFRAAJzbIdjWftm4fjcSiClsVZg==
caps: [mds] allow
caps: [mon] allow profile mds
caps: [osd] allow rwx
mds.ceph-node1
key: AQDxl4pljT5CHhAAi+cTCp+yliT7kjYjsHo6gA==
caps: [mds] allow
caps: [mon] allow profile mds
caps: [osd] allow rwx
... # too many entries, omitted.
# Here the type is client and admin is the user name
client.admin
key: AQCAGoVlpj0zERAA5dhEHlg/a5TyQhPPlTigUg==
caps: [mds] allow *
caps: [mgr] allow *
caps: [mon] allow *
caps: [osd] allow *
... # too many entries, omitted.
# ceph auth add <entity> {<caps> [<caps>...]}
[root@ceph-deploy ~]# ceph auth add client.swq
added key for client.swq
# View client.swq
# It has no caps yet
[root@ceph-deploy ~]# ceph auth get client.swq
exported keyring for client.swq
[client.swq]
key = AQAiwotlCQXNIRAARAuIcks1CA+D1bvdo7E7lQ==
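# There are two ways to grant caps
# 1. Pass them to the add subcommand.
e.g.: ceph auth add client.zhangsan mds 'allow r' osd 'allow rw'
# 2. Grant them afterwards with the caps subcommand
See the next subsection for details.
# ceph auth caps <entity> <caps> [<caps>...]
# The caps format here is: daemon 'permissions'
# e.g.: mds 'allow *'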
[root@ceph-deploy ~]# ceph auth caps client.swq mds 'allow *'
updated caps for client.swq
[root@ceph-deploy ~]# ceph auth get client.swq
exported keyring for client.swq
[client.swq]
key = AQAiwotlCQXNIRAARAuIcks1CA+D1bvdo7E7lQ==
caps mds = "allow *"
ceph auth del <TYPE.ID>
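keyring file naming rule: <cluster-name>.<TYPE.ID>.keyring
# Generate the keyring file for client.zhangsan
Syntax:
ceph auth get <TYPE.ID> > /PATH/<cluster-name>.<TYPE.ID>.keyring
# Example: print the keyring and redirect it to a file named by the rule above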
[root@ceph-deploy ~]# ceph auth get client.zhangsan > /tmp/ceph.client.zhangsan.keyring
exported keyring for client.zhangsan
# View the keyring
[root@ceph-deploy ~]# cat /tmp/ceph.client.zhangsan.keyring
[client.zhangsan]
key = AQAJw4tlDAeyNhAASg05b/bGMCM4kU+R3L3UmA==
caps mds = "allow r"
caps osd = "allow rw"
# Generate a key file
Syntax: ceph auth get-key <entity>
# Example: generate the key file for client.zhangsan
# Either write to a file with -o or redirect with >.
[root@ceph-deploy ~]# ceph auth get-key client.zhangsan -o /tmp/zhangsan.key
[root@ceph-deploy ~]# cat /tmp/zhangsan.key
AQAJw4tlDAeyNhAASg05b/bGMCM4kU+R3L3UmA==
Prerequisite: the user's keyring file is in one of the locations that ceph searches automatically:
/etc/ceph/ceph.client.lisi.keyring
/etc/ceph/ceph.keyring
/etc/ceph/keyring
/etc/ceph/keyring.bin
Selecting the user on the ceph command line:
ceph
--id, --user: the client ID
The ID is given without the TYPE. For client.lisi, write just lisi.
-n, --name: the full user name
This takes TYPE.ID, e.g. client.lisi
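The relation between --id, --name, and the keyring search path can be made concrete with two tiny functions. This is a sketch under the assumption that the cluster name is the default "ceph"; id_to_name and keyring_candidates are hypothetical helper names, and the path list simply mirrors the four locations given above.

```shell
#!/bin/sh
# Turn a --id value into the full entity name used by -n/--name.
# Both functions are hypothetical helpers, not part of Ceph.
id_to_name() { printf 'client.%s\n' "$1"; }

# List the keyring paths searched for that user (cluster name "ceph").
keyring_candidates() (
    name=$(id_to_name "$1")
    printf '/etc/ceph/ceph.%s.keyring\n' "$name"
    printf '%s\n' /etc/ceph/ceph.keyring /etc/ceph/keyring /etc/ceph/keyring.bin
)

keyring_candidates lisi
```

So `ceph --id lisi -s` and `ceph -n client.lisi -s` refer to the same user, and both expect one of the printed files to exist.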
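Until now everything was mounted with admin permissions; now switch to a regular account.
CephFS involves mds; read/write there is usually enough.
mon: read permission
osd: read/write permission, restricted to a pool.
The metadata pool of the fs usually needs no read/write grant, because only the mds service operates on it.
We only need to grant read/write on the data pool.
The earlier example uses cephfs-metadata-pool for the mds metadata and cephfs-data-pool for the actual data.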
[root@ceph-deploy ~]# ceph fs status
cephfs-test - 1 clients
===========
+------+--------+------------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+--------+------------+---------------+-------+-------+
| 0 | active | ceph-node2 | Reqs: 0 /s | 11 | 14 |
| 1 | active | ceph-node1 | Reqs: 0 /s | 10 | 13 |
+------+--------+------------+---------------+-------+-------+
+----------------------+----------+-------+-------+
| Pool | type | used | avail |
+----------------------+----------+-------+-------+
| cephfs-metadata-pool | metadata | 10.6k | 11.1G |
| cephfs-data-pool | data | 7 | 11.1G |
+----------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
| ceph-node3 |
| ceph-deploy |
Add user lisi together with its caps
[root@ceph-deploy ~]# ceph auth add client.lisi mon 'allow r' mds 'allow rw' osd 'allow rw pool=cephfs-data-pool'
added key for client.lisi
# View the caps of client.lisi
[root@ceph-deploy ~]# ceph auth get client.lisi
exported keyring for client.lisi
[client.lisi]
key = AQCTzYtlN5UmCxAAWJdO+AZ1XUHwapkkAoVp/w==
caps mds = "allow rw"
caps mon = "allow r"
caps osd = "allow rw pool=cephfs-data-pool"
Export lisi's key
ceph auth get-key client.lisi > /etc/ceph/lisi.key
Copy the user's key to the client
# I mount on ceph-deploy itself, so nothing needs to be copied
scp xxxx xxxxx
Change the mount options in fstab
Before:
192.168.2.121,192.168.2.122,192.168.2.123:/ /mnt/ceph-fs-dir ceph defaults,_netdev,noatime,name=admin,secret=AQCAGoVlpj0zERAA5dhEHlg/a5TyQhPPlTigUg== 0 0
After: change name and replace secret with secretfile so that the entry points at the new user.
192.168.2.121,192.168.2.122,192.168.2.123:/ /mnt/ceph-fs-dir ceph defaults,_netdev,noatime,name=lisi,secretfile=/etc/ceph/lisi.key 0 0
Unmount the existing mount
umount /mnt/ceph-fs-dir
Test whether the mount works
# Remount everything listed in /etc/fstab
[root@ceph-deploy ~]# mount -a
# df shows that the mount succeeded
[root@ceph-deploy ~]# df -Th
Filesystem Type Size Used Avail Use% Mounted on
... other filesystems omitted
192.168.2.121,192.168.2.122,192.168.2.123:/ ceph 12G 0 12G 0% /mnt/ceph-fs-dir
# List the mounted directory
[root@ceph-deploy ~]# ll /mnt/ceph-fs-dir/
total 1
-rw-r--r-- 1 root root 7 Dec 25 21:17 123.txt
# Test that reads and writes work in the mounted directory
[root@ceph-deploy ~]# touch /mnt/ceph-fs-dir/456.txt
[root@ceph-deploy ~]# echo 666 > /mnt/ceph-fs-dir/456.txt
[root@ceph-deploy ~]# cat /mnt/ceph-fs-dir/456.txt
666
# Give the regular user zhangsan access to the rbd pool
# rwx is required; with less the image cannot be mapped
[root@ceph-deploy ~]# ceph auth caps client.zhangsan osd 'allow rwx pool=django-web' mon 'allow r'
updated caps for client.zhangsan
[root@ceph-deploy ~]# ceph auth get client.zhangsan
exported keyring for client.zhangsan
[client.zhangsan]
key = AQCH1Itl7AosBRAAwAiPExAxrBflWA8ZlqU09w==
caps mon = "allow r"
caps osd = "allow rwx pool=django-web"
# Export zhangsan's keyring to /etc/ceph
[root@ceph-deploy ~]# ceph auth get client.zhangsan > /etc/ceph/ceph.client.zhangsan.keyring
exported keyring for client.zhangsan
# Use --id to select the user when mapping
[root@ceph-deploy ~]# rbd map -p django-web --image img002 --id zhangsan
/dev/rbd0
# Mount the block device
mount /dev/rbd0 /mnt/ceph-rbd-dir/
# Test reading and writing data
[root@ceph-deploy ~]# touch /mnt/ceph-rbd-dir/2.txt
[root@ceph-deploy ~]# echo 666 > /mnt/ceph-rbd-dir/2.txt
[root@ceph-deploy ~]# cat /mnt/ceph-rbd-dir/2.txt
666
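# Enable the dashboard module on the mgr
ceph mgr module enable dashboard
# The dashboard is not reachable right after enabling it; first either disable SSL or set up SSL, and set the listen address
# Option 1: disable SSL
ceph config set mgr mgr/dashboard/ssl false
# Option 2: generate and install a self-signed certificate with the built-in command: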
[root@ceph-deploy]# ceph dashboard create-self-signed-cert
Self-signed certificate created
# Create the dashboard user:
[root@node1 my-cluster]# ceph dashboard set-login-credentials swq 123456
Username and password updated
# View the ceph-mgr services:
[root@ceph-deploy ~]# ceph mgr services
{
"dashboard": "http://ceph-node1:8080/" # with SSL enabled the default port is 8443
}
# To change the port and IP address
ceph config set mgr mgr/dashboard/ceph-node1/server_addr 192.168.2.121
ceph config set mgr mgr/dashboard/ceph-node1/server_port 9977
# Restart the mgr service on the mgr node
[root@ceph-node1 ~]# systemctl restart ceph-mgr@ceph-node1
# View the current dashboard status
[root@ceph-node1 ~]# ceph mgr services
{
"dashboard": "https://192.168.2.121:9977/"
}
mgr ships with a prometheus module. Once enabled, it listens on port 9283 of every mgr node and exposes the collected metrics for prometheus to scrape.
[root@ceph-deploy ~]# ceph mgr module enable prometheus
[root@ceph-deploy~]# ceph mgr services
{
"dashboard": "https://192.168.2.121:9977/",
"prometheus": "http://ceph-node1:9283/"
}
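# In fact every mgr node listens on port 9283
Omitted
Omitted
# vim ceph.conf (the copy in the ceph-deploy working directory; config push below distributes it to /etc/ceph/ceph.conf on each node)
# Allowed clock drift between monitors (the clock offset between nodes)
mon clock drift allowed=<seconds> # default 0.05 s
# After a drift beyond the value above is detected, how many consecutive occurrences it takes before a warning is raised
mon clock drift warn backoff=<NUM>
# After editing, the configuration file normally has to be pushed to every node in the cluster
ceph-deploy --overwrite-conf config push <node-name>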
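Before raising the allowed drift it is worth checking which nodes actually exceed it. The sketch below compares measured offsets against the threshold; clock_drift_offenders is a hypothetical helper name, the input format ("host offset_seconds" per line) is an assumption, and the sample offsets are invented.

```shell
#!/bin/sh
# Read "host offset_seconds" lines on stdin and print the hosts whose absolute
# offset exceeds the allowed drift (first argument, default 0.05 s).
# clock_drift_offenders is a hypothetical helper, not part of Ceph.
clock_drift_offenders() {
    awk -v allowed="${1:-0.05}" '
        { off = ($2 < 0) ? -$2 : $2 }
        off > allowed { print $1 }
    '
}

# Invented sample offsets for illustration.
clock_drift_offenders 0.05 <<'EOF'
ceph-node1 0.000
ceph-node2 -0.012
ceph-node3 0.081
EOF
```

If a node shows up here, fixing time synchronization (chrony or ntpd) is usually a better first step than loosening the monitor setting.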