Drive Groups
This is a new way of defining OSD layouts, included in DeepSea versions >= v0.9.14. Older versions still follow the proposal runner approach.
There is a single file (/srv/salt/ceph/configuration/files/drive_groups.yml) which acts as the single source of information.
In order to create an OSD layout, you only have to edit this one file. As opposed to the previous approach, where several yaml files were generated based on the parameters given to the proposal runner, the layout is now computed on the fly. This gives us more flexibility and avoids the need to refresh the pillar after each layout change. There are more advantages (link to the nautilus osd-deployment wikis).
- Open the file mentioned above
- Fill it with a Drive Group spec according to your needs (see the Examples below)
  If you are not sure about the properties of your disks, you can inspect them with:
  salt-run disks.details
A very basic example would be:
# /srv/salt/ceph/configuration/files/drive_groups.yml
default_drive_group_name:  # <- this is the name of the drive_group (name can be custom)
  target: '*'              # <- the target - salt notation can be used
  data_devices:            # <- the type of devices you are applying specs to
    all: true              # <- a specification, check below for a full list
Remember to use spaces instead of tabs (YAML).
That would simply use all drives available (to Ceph) as OSDs.
- Verify the results
There are new runners and modules which do the computation. The new user-facing runner is called disks.
It exposes two main functions that can be used to verify your Drive Group.
salt-run disks.list
This returns a structure of matching disks based on your Drive Group.
If you're not satisfied with the output, just go back to step 2, edit the file and run salt-run disks.list again.
Repeat until you are happy with the result.
- Deploy
On the next invocation of stage 3, those disks will be deployed accordingly.
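Assuming the standard DeepSea stage invocation via the orchestrate runner, this typically means running:
salt-run state.orch ceph.stage.3
The full set of options a Drive Group spec accepts is outlined below.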
drive_group_default_name:
  target: '*'
  data_devices:
    device_spec: arg
  db_devices:
    device_spec: arg
  wal_devices:
    device_spec: arg
  block_wal_size: '5G'   # (optional)
  block_db_size: '5G'    # (optional)
  osds_per_device: 1     # number of osd daemons per device. To fully utilize nvme devices multiple osds are required.
  format: bluestore      # bluestore or filestore (defaults to bluestore)
  encrypted: True        # True or False (defaults to False)
  db_slots: 5            # (optional)
  wal_slots: 1           # (optional)
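As an illustrative sketch (the group name is made up, and the rotational filter is described further below), osds_per_device could be used to split fast NVMe drives into two OSDs each:
drive_group_nvme_split:
  target: '*'
  data_devices:
    rotational: 0      # only non-rotating devices (e.g. NVMe)
  osds_per_device: 2   # deploy two OSD daemons per matched device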
When using filestore, the spec would look something like this:
drive_group_default_name:
  target: '*'
  data_devices:
    device_spec: arg
  journal_devices:
    device_spec: arg
  format: filestore
  encrypted: True        # True or False (optional, defaults to False)
  journal_size: '500M'   # (optional)
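A concrete filestore group, sketched here under the assumption that rotating drives hold the data and solid state drives hold the journals (the group name is made up; the rotational filter is described below):
drive_group_filestore_example:
  target: '*'
  data_devices:
    rotational: 1      # rotating drives hold the data
  journal_devices:
    rotational: 0      # non-rotating drives hold the journals
  format: filestore
  journal_size: '500M'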
In the templates above, {device_spec} can be one of the following filters:
# Substring match on the ID_MODEL property of the drive.
model: disk_model_name
# Substring match on the VENDOR property of the drive.
vendor: disk_vendor_name
# Size specification of the format LOW:HIGH. Can also take
# the form :HIGH, LOW: or an exact value (as ceph-volume inventory reports).
# Please note the quotes around the values when using delimiter notation,
# as YAML would otherwise interpret the ':' as a new hash.
size: '10G'      # - Includes disks of an exact size.
size: '10G:40G'  # - Includes disks whose size is within the range.
size: ':10G'     # - Includes disks less than or equal to 10G in size.
size: '40G:'     # - Includes disks equal to or greater than 40G in size.
# Sizes don't have to be exclusively in Gigabytes (G).
# Supported units are Megabytes (M), Gigabytes (G) and Terabytes (T).
# Appending B for bytes is also supported: MB, GB, TB.
# Is the drive rotating or not (SSDs and NVMes don't rotate)?
rotational: 0
The device_spec for data_devices may also simply be all instead of a yaml structure. This offers a convenient way to deploy a node using all available drives as standalone OSDs.
# This is exclusive to the data_devices section.
all: true
If you have specified valid filters but want to limit the number of matching disks, you can use the 'limit' directive.
# if this is present, limit the number of matching drives to this number.
limit: 10
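As a hedged sketch (the group name is made up, and placing limit inside the device section alongside the other filters is an assumption), this could look like:
drive_group_limited:
  target: '*'
  data_devices:
    rotational: 1   # match rotating drives...
    limit: 10       # ...but use at most 10 of them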
This new structure is proposed to serve as a declarative way to specify OSD deployments. On a per-host basis, OSD deployments are defined by the list of devices and their intended use (data, wal, db or journal) and a list of flags for the deployment tools (ceph-volume in this case).
The Drive Group specification (dg) is intended to be created manually by a user and specifies a group of OSDs that are interrelated (hybrid OSDs that are deployed on solid state drives and spinners) or share the same deployment options (identical, i.e. same objectstore, same encryption option, ... standalone OSDs).
To avoid explicitly listing devices, we rely on a list of filter items. These correspond to a few selected fields of ceph-volume inventory reports. In the simplest case this could be the rotational flag (all solid-state drives are to be db_devices, all rotating ones data_devices) or something more involved, like model strings, sizes or others.
DeepSea will provide code that translates these drive groups into actual device lists for inspection by the user.
2 Nodes with the same setup:
- 20 HDDs
  - Vendor: Intel
  - Model: SSD-123-foo
  - Size: 4TB
- 2 SSDs
  - Vendor: Micron
  - Model: MC-55-44-ZX
  - Size: 512GB
This is a common setup and can be described quite easily:
drive_group_default:
  target: '*'
  data_devices:
    model: SSD-123-foo
  db_devices:
    model: MC-55-44-ZX
This is a simple and valid configuration, but it may not be future-proof. The user may add disks of different vendors in the future, which wouldn't be included with this configuration.
We can improve it by reducing the filters to core properties of the drives:
drive_group_default:
  target: '*'
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0
Now we enforce that all rotating devices are declared as 'data devices', and all non-rotating devices will be used as shared devices (wal, db).
If you know that drives with more than 2TB will always be the slower data devices, you can also filter by size:
drive_group_default:
  target: '*'
  data_devices:
    size: '2TB:'
  db_devices:
    size: ':2TB'
Forcing encryption on your OSDs is as simple as appending 'encrypted: True' to the Drive Group specification.
drive_group_default:
  target: '*'
  data_devices:
    size: '2TB:'
  db_devices:
    size: ':2TB'
  encrypted: True
This was a rather simple setup. Following this approach you can also describe more sophisticated setups.
- 20 HDDs
  - Vendor: Intel
  - Model: SSD-123-foo
  - Size: 4TB
- 12 SSDs
  - Vendor: Micron
  - Model: MC-55-44-ZX
  - Size: 512GB
- 2 NVMEs
  - Vendor: Samsung
  - Model: NVME-QQQQ-987
  - Size: 256GB
Here we have two distinct setups: 20 HDDs should share 2 SSDs, and 10 SSDs should share 2 NVMes.
This can be described with two layouts:
drive_group:
  target: '*'
  data_devices:
    rotational: 1
  db_devices:
    model: MC-55-44-ZX
  db_slots: 10 # How many OSDs per DB device
Setting db_slots: 10 ensures that only two of the SSDs will be used (20 HDDs / 10 slots per DB device = 2 DB devices), leaving 10 SSDs for the second group,
followed by:
drive_group_default:
  target: '*'
  data_devices:
    model: MC-55-44-ZX
  db_devices:
    vendor: samsung
    size: 256GB
  db_slots: 5 # How many OSDs per DB device
The examples above assumed that all nodes have the same drives. That's however not always the case. Example:
Node1-5:
- 20 HDDs
  - Vendor: Intel
  - Model: SSD-123-foo
  - Size: 4TB
- 2 SSDs
  - Vendor: Micron
  - Model: MC-55-44-ZX
  - Size: 512GB
Node6-10:
- 5 NVMEs
  - Vendor: Intel
  - Model: SSD-123-foo
  - Size: 4TB
- 20 SSDs
  - Vendor: Micron
  - Model: MC-55-44-ZX
  - Size: 512GB
You can use the 'target' key in the layout to target certain nodes. Salt target notation helps to keep things easy.
drive_group_node_one_to_five:
  target: 'node[1-5]'
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0
followed by:
drive_group_the_rest:
  target: 'node[6-10]'
  data_devices:
    model: MC-55-44-ZX
  db_devices:
    model: SSD-123-foo
All previous cases co-located the WALs with the DBs. It's however possible to deploy the WAL on a dedicated device as well (if it makes sense).
- 20 HDDs
  - Vendor: Intel
  - Model: SSD-123-foo
  - Size: 4TB
- 2 SSDs
  - Vendor: Micron
  - Model: MC-55-44-ZX
  - Size: 512GB
- 2 NVMEs
  - Vendor: Samsung
  - Model: NVME-QQQQ-987
  - Size: 256GB
drive_group_default:
  target: '*'
  data_devices:
    model: SSD-123-foo
  db_devices:
    model: MC-55-44-ZX
  wal_devices:
    model: NVME-QQQQ-987
  db_slots: 10
  wal_slots: 10
Neither Ceph, DeepSea nor ceph-volume prevents you from making questionable decisions.
- 23 HDDs
  - Vendor: Intel
  - Model: SSD-123-foo
  - Size: 4TB
- 10 SSDs
  - Vendor: Micron
  - Model: MC-55-44-ZX
  - Size: 512GB
- 1 NVMe
  - Vendor: Samsung
  - Model: NVME-QQQQ-987
  - Size: 256GB
Here we are trying to define:
- 20 HDDs backed by 1 NVMe
- 2 HDDs backed by 1 SSD (db) and 1 NVMe (wal)
- 8 SSDs backed by 1 NVMe
- 2 SSDs standalone (encrypted)
- 1 HDD is spare and should not be deployed
drive_group_hdd_nvme:
  target: '*'
  data_devices:
    rotational: 1
  db_devices:
    model: NVME-QQQQ-987
  db_slots: 20
drive_group_hdd_ssd_nvme:
  target: '*'
  data_devices:
    rotational: 1
  db_devices:
    model: MC-55-44-ZX
  wal_devices:
    model: NVME-QQQQ-987
  db_slots: 2
  wal_slots: 2
drive_group_ssd_nvme:
  target: '*'
  data_devices:
    model: MC-55-44-ZX
  db_devices:
    model: NVME-QQQQ-987
  db_slots: 8
drive_group_ssd_standalone_encrypted:
  target: '*'
  data_devices:
    model: MC-55-44-ZX
  encrypted: True
One HDD will remain spare, since the file is parsed from top to bottom and the db_slots (formerly called ratios) are strictly defined.