[[TOC]]
Creating
New volume group
A new volume group is a rare occurrence. You probably want to create a logical volume instead (see below). Or you might add disks to an existing volume group, see the extend volume group instructions.
But if you're setting up a new machine, this might be right.
First you need to partition the disk and set up full disk encryption
(see RAID). Then you can create the volume group, assuming the
encrypted RAID array is available at /dev/mapper/crypt_dev_md2:
pvcreate /dev/mapper/crypt_dev_md2 &&
vgcreate vg_ganeti_nvme /dev/mapper/crypt_dev_md2
New logical volume with bind mount
Assuming a situation where the root LV is full and we want to add a new volume and move part of the existing filesystem into it. We also assume that there is free space available in the existing volume group.
Create the new LV; we'll call it srv since it will be mounted at /srv:
lvcreate -n srv -L50G vg_ganeti
mkfs -t ext4 /dev/vg_ganeti/srv
Mount the new volume at a temporary location:
mkdir /mnt/srv
mount /dev/vg_ganeti/srv /mnt/srv
Then stop any services that might be using the data that we're about to move to its new home:
systemctl stop gitlab-runner docker
Move data around, recreate the mount point, and unmount the temporary location:
mv /var/lib/docker /mnt/srv
mkdir /var/lib/docker
umount /mnt/srv
Adjust /etc/fstab:
echo "UUID=$(blkid /dev/vg_ganeti/srv -o value -s UUID) /srv ext4 defaults 1 2" >> /etc/fstab
echo "/srv/docker /var/lib/docker none bind 0 0" >> /etc/fstab
Reload systemd to regenerate the .mount units from fstab, then restart local-fs.target to mount them:
systemctl daemon-reload
systemctl restart local-fs.target
Verify that the new volume is mounted correctly and restart services:
findmnt
systemctl start docker gitlab-runner
Resizing
Assume we want to grow this partition to take the available free space in the PV:
root@vineale:/srv# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
srv vg_vineale -wi-ao---- 35,00g
root@vineale:/srv# pvs
PV VG Fmt Attr PSize PFree
/dev/sdb vg_vineale lvm2 a-- 40,00g 5,00g
root@vineale:~# pvdisplay
--- Physical volume ---
PV Name /dev/sdb
VG Name vg_vineale
PV Size 40,00 GiB / not usable 4,00 MiB
Allocatable yes
PE Size 4,00 MiB
Total PE 10239
Free PE 1279
Allocated PE 8960
PV UUID CXKO15-Wze1-xY6y-rOO6-Tfzj-cDSs-V41mwe
Extend the volume group
The procedures below assume there is free space in the volume group for the operation. If there isn't, you will need to add disks to the volume group or grow the physical volume. For example, to add a new disk:
pvcreate /dev/md123
vgextend vg_vineale /dev/md123
If the underlying disk was grown magically without your intervention, which happens in virtual hosting environments, you can also just extend the physical volume:
pvresize /dev/sdb
Note that if there's an underlying crypto layer, it needs to be resized as well:
cryptsetup resize $DEVICE_LABEL
In this case, $DEVICE_LABEL is the mapping name from /etc/crypttab
(the device under /dev/mapper), not the underlying device name. For
example, it would be /dev/mapper/crypt_sdb, not /dev/sdb.
Note that striping occurs at the logical volume level, not at the volume group level, see those instructions from RedHat and this definition.
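For reference, striping is requested at lvcreate time; a minimal sketch, using a hypothetical vg_example spanning two physical volumes (names and sizes are illustrative):
lvcreate -i 2 -I 64 -L 100G -n striped vg_example
Here -i is the number of stripes and -I the stripe size in KiB.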
Also note that you cannot mix physical volumes with different block sizes in the same volume group. This can happen when mixing older and newer drives, and will yield a warning like:
Devices have inconsistent logical block sizes (512 and 4096).
This can, technically, be worked around with
allow_mixed_block_sizes=1 in /etc/lvm/lvm.conf, but this can lead
to data loss. It's possible to reformat the underlying LUKS volume
with the --sector-size argument, see this answer as well.
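To see which logical block size each drive reports before adding it to a volume group (device paths are examples):
blockdev --getss /dev/sdb
blockdev --getss /dev/nvme0n1
If you do decide to use the workaround despite the data loss risk, the setting goes in the devices section of /etc/lvm/lvm.conf:
devices {
    allow_mixed_block_sizes = 1
}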
See also the upstream documentation.
online procedure (ext3 and later)
Online resize has been possible ever since ext3 came out and it is considered reliable enough for use. If you are unsure you can trust that procedure, or if you have an ext2 filesystem, do not use this procedure and see the ext2 procedure below instead.
To resize the partition to take up all available free space, you should do the following:
-
extend the partition, in case of a logical volume:
lvextend vg_vineale/srv -L +5G
This might miss some extents, however. You can use the extent notation to take up all free space instead:
lvextend vg_vineale/srv -l +1279
If the partition sits directly on disk, use parted's resizepart command or fdisk to resize that first.
To resize to take all available free space:
lvextend vg_vineale/srv -l '+100%FREE'
-
resize the filesystem:
resize2fs /dev/mapper/vg_vineale-srv
That's it! The resize2fs program automatically determines the size
of the underlying "partition" (the logical volume, in most cases) and
fixes the filesystem to fill the space.
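To confirm the new size took effect, a quick check along these lines works (volume and mount point taken from the example above):
lvs vg_vineale/srv
df -h /srv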
Note that the resize process can take a while. Growing an active 20TB
partition to 30TB took about 5 minutes, for example. The -p flag,
which would show progress, only works in the "offline" procedure (below).
If the above fails because of the following error:
Unable to resize logical volumes of cache type.
It's because the logical volume has a cache attached. Follow the procedure below to "uncache" the logical volume and then re-enable the cache.
WARNING: Make sure you remove the cache and its fast physical volume from the volume group before you resize, otherwise the logical volume will be extended onto that disk as well and re-enabling the cache won't be possible! A typical, incorrect session looks like:
root@materculae:~# lvextend -l '+100%FREE' vg_materculae/srv
Unable to resize logical volumes of cache type.
root@materculae:~# lvconvert --uncache vg_materculae/srv
Logical volume "srv_cache" successfully removed
Logical volume vg_materculae/srv is not cached.
root@materculae:~# lvextend -l '+100%FREE' vg_materculae/srv
Size of logical volume vg_materculae/srv changed from <150.00 GiB (38399 extents) to 309.99 GiB (79358 extents).
Logical volume vg_materculae/srv successfully resized.
root@materculae:~# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
srv vg_materculae -wi-ao---- 309.99g
root@materculae:~# vgs
VG #PV #LV #SN Attr VSize VFree
vg_materculae 2 1 0 wz--n- 309.99g 0
root@materculae:~# pvs
PV VG Fmt Attr PSize PFree
/dev/sdc vg_materculae lvm2 a-- <10.00g 0
/dev/sdd vg_materculae lvm2 a-- <300.00g 0
A proper procedure is:
VG=vg_$(hostname)
FAST=/dev/sdc
lvconvert --uncache $VG/srv
vgreduce $VG $FAST # remove the fast disk from the volume group
lvextend -l '+100%FREE' $VG/srv # resize the volume
vgextend $VG $FAST # re-add the cache volume
lvcreate -n cache -l '100%FREE' $VG $FAST
lvconvert --type cache --cachevol cache $VG/srv
And here's a successful run:
root@materculae:~# VG=vg_$(hostname)
root@materculae:~# FAST=/dev/sdc
root@materculae:~# vgreduce $VG $FAST
Removed "/dev/sdc" from volume group "vg_materculae"
root@materculae:~# vgs
VG #PV #LV #SN Attr VSize VFree
vg_materculae 1 1 0 wz--n- <300.00g <10.00g
root@materculae:~# lvextend -l '+100%FREE' $VG
Size of logical volume vg_materculae/srv changed from 150.00 GiB (38400 extents) to <300.00 GiB (76799 extents).
Logical volume vg_materculae/srv successfully resized.
root@materculae:~# vgextend $VG $FAST
Volume group "vg_materculae" successfully extended
root@materculae:~# lvcreate -n cache -l '100%FREE' $VG $FAST
Logical volume "cache" created.
root@materculae:~# lvconvert --type cache --cachevol cache vg_materculae
Erase all existing data on vg_materculae/cache? [y/n]: y
Logical volume vg_materculae/srv is now cached.
Command on LV vg_materculae/cache_cvol requires LV with properties: lv_is_visible .
Note that the above output was edited for correctness: the actual run was much bumpier and involved shrinking the logical volume, since the "incorrect" run above really happened; see tpo/tpa/team#41258.
offline procedure (ext2)
To resize the partition to take up all available free space, you should do the following:
-
stop services and processes using the partition (will obviously vary):
service apache2 stop
-
unmount the filesystem:
umount /srv
-
check the filesystem:
fsck -y -f /dev/mapper/vg_vineale-srv
-
extend the filesystem using the extent notation to take up all available space:
lvextend vg_vineale/srv -l +1279
-
grow the filesystem (-p is for "show progress"):
resize2fs -p /dev/mapper/vg_vineale-srv
-
recheck the filesystem:
fsck -f -y /dev/mapper/vg_vineale-srv
-
remount the filesystem and start processes:
mount /srv
service apache2 start
Shrinking
Shrinking the filesystem is also possible, but is more risky. Making an error in the commands in this section could incur data corruption or, more likely, data loss.
It is very important to reduce the size of the filesystem before resizing the size of the logical volume, so the order of the steps is critical. In the procedure below, we're enforcing this order by using lvm's ability to also resize ext4 filesystems to the requested size automatically.
-
First, identify which volume needs to be worked on.
WARNING: this step is the most crucial one in the procedure. Make sure to verify what you've typed 3 times to be very certain you'll be launching commands on the correct volume before moving on (i.e. "measure twice, cut once")
VG_NAME=vg_name
LV_NAME=lv_name
DEV_NAME=/dev/${VG_NAME}/${LV_NAME}
-
Unmount the filesystem:
umount "$DEV_NAME"If the above command is not failing because the filesystem is in use, you'll need to stop processes using it. If that's impossible (for example when resizing
/), you'll need to reboot in a separate operating system first, or shutdown the VM and work from the physical node below. -
Forcibly check the filesystem:
e2fsck -fy "$DEV_NAME" -
Shrink both the filesystem and the logical volume at once:
WARNING: make sure you get the size right here before launching the command
Here we reduce to 5G (new absolute size for the volume):
lvreduce -L 5G --resizefs "${VG_NAME}/${LV_NAME}"
To reduce by 5G instead:
lvreduce -L -5G --resizefs "${VG_NAME}/${LV_NAME}"
TIP: You might want to ask a coworker to check your command right here, because this is a really risky command!
-
check the filesystem again:
e2fsck -fy "$DEV_NAME" -
If you want to resize the underlying device (for example, if this is a LVM inside a virtual machine on top of another LVM), you can also shrink the parent logical volume, physical volume, and crypto device (if relevant) at this point.
lvreduce -L 5G vg/hostname
pvresize /dev/sdY
cryptsetup resize DEVICE_LABEL
WARNING: this last step has not been tested.
Renaming
Rename volume group containing root
Assuming a situation where a machine was deployed successfully but the volume
group name is not adequate and should be changed. In this example, we'll change
vg_ganeti to vg_tbbuild05.
This operation requires at least one reboot, and a live rescue system if the root filesystem is encrypted.
First, rename the LVM volume group:
vgrename vg_ganeti vg_tbbuild05
Then adjust some configuration files and regenerate the initramfs to replace the old name:
sed -i 's/vg_ganeti/vg_tbbuild05/g' /etc/fstab
sed -i 's/vg_ganeti/vg_tbbuild05/g' /boot/grub/grub.cfg
update-initramfs -u -k all
The next step depends on whether the root volume is encrypted or not. If it's encrypted, the last command will output an error like:
update-initramfs: Generating /boot/initrd.img-5.10.0-14-amd64
cryptsetup: ERROR: Couldn't resolve device /dev/mapper/vg_ganeti-root
cryptsetup: WARNING: Couldn't determine root device
If this happens, boot the live rescue system and follow the remount
procedure to chroot into the root
filesystem of the machine. Then, inside the chroot, execute these two commands
to ensure GRUB and the initramfs use the new root LV path/name:
update-grub
update-initramfs -u -k all
Then exit the chroot, clean up, and reboot back into the normal system.
If the root volume is not encrypted, the last steps should be enough to
ensure the system boots. To ensure everything works as expected, run the
update-grub command after rebooting and ensure grub.cfg retains the new
volume group name.
Snapshots
This creates a snapshot for the "root" logical volume, with a 1G capacity:
lvcreate -s -L1G vg/root -n root-snapshot
Note that the "size" here needs to take into account not just the data written to the snapshot, but also data written to the parent logical volume. You can also specify the size as a percentage of the parent volume, for example this assumes you'll only rewrite 10% of the parent:
lvcreate -s -l 10%ORIGIN vg/root -n root-snapshot
If you're performing, for example, a major upgrade, you might want the snapshot to be a full replica of the parent volume:
lvcreate -s -l 100%ORIGIN vg/root -n root-snapshot
Make sure you destroy the snapshot when you're done with it, as keeping a snapshot around has an impact on performance and will cause issues when full:
lvremove vg/root-snapshot
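To keep an eye on how full a snapshot is getting, the Data% column of lvs shows snapshot usage; for example (volume group name as in the examples above):
lvs -o lv_name,origin,data_percent vg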
You can also roll back to a previous snapshot:
lvconvert --merge vg/root-snapshot
Caching
WARNING: those instructions are deprecated. There's a newer, simpler way of setting up the cache that doesn't require two logical volumes; see the rebuild instructions below, which will need to be adapted here. See also the lvmcache(7) manual page for further instructions.
Create the VG consisting of 2 block devices (a slow and a fast)
apt install lvm2 &&
vg="vg_$(hostname)_cache" &&
lsblk &&
echo -n 'slow disk: ' && read slow &&
echo -n 'fast disk: ' && read fast &&
vgcreate "$vg" "$slow" "$fast"
Create the srv LV, but leave a few (like 50?) extents empty on the slow disk. (lvconvert needs this extra free space later. That's probably a bug.)
pvdisplay &&
echo -n "#extents: " && read extents &&
lvcreate -l "$extents" -n srv "$vg" "$slow"
The srv-cache-meta LV should be 1/1000 the size of the srv-cache LV. (If it is slightly larger, that also shouldn't hurt.)
lvcreate -L 100MB -n srv-cache-meta "$vg" "$fast" &&
lvcreate -l '100%FREE' -n srv-cache "$vg" "$fast"
Set up caching
lvconvert --type cache-pool --cachemode writethrough --poolmetadata "$vg"/srv-cache-meta "$vg"/srv-cache
lvconvert --type cache --cachepool "$vg"/srv-cache "$vg"/srv
Disabling / Recovering from a cache failure
If for some reason the cache LV is destroyed or lost (typically by naive operator error), it might be possible to restore the original LV functionality with:
lvconvert --uncache vg_colchicifolium/srv
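To confirm whether a logical volume actually has a cache attached before (or after) running this, the segment type and pool columns of lvs are a reasonable indicator; a sketch using the same volume group:
lvs -o lv_name,segtype,pool_lv vg_colchicifolium
A cached LV will show a segtype of cache and reference the cache volume in the pool column.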
Rebuilding the cache after removal
If you've just --uncached a volume, for example to resize it, you
might want to re-establish the cache. For this, you can't follow the
same procedure above, as that requires recreating a VG from
scratch. Instead, you need to extend the VG and then create new
volumes for the cache. It should look something like this:
-
extend the VG with the fast storage:
VG=vg_$(hostname)
FAST=/dev/sdc
vgextend $VG $FAST
-
create a LV for the cache:
lvcreate -n cache -l '100%FREE' $VG $FAST
-
add the cache to the existing LV to be cached:
lvconvert --type cache --cachevol cache $VG/srv
Example run:
root@colchicifolium:~# vgextend vg_colchicifolium /dev/sdc
Volume group "vg_colchicifolium" successfully extended
root@colchicifolium:~# lvcreate -n cache -l '100%FREE' vg_colchicifolium /dev/sdc
Logical volume "cache" created.
root@colchicifolium:~# lvconvert --type cache --cachevol cache vg_colchicifolium
Erase all existing data on vg_colchicifolium/cache? [y/n]: y
Logical volume vg_colchicifolium/srv is now cached.
Command on LV vg_colchicifolium/cache_cvol requires LV with properties: lv_is_visible .
You can see the cache in action with the lvs command:
root@colchicifolium:~# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
srv vg_colchicifolium Cwi-aoC--- <1.68t [cache_cvol] [srv_corig] 0.01 13.03 0.00
You might get a modprobe error on the last command:
root@colchicifolium:~# lvconvert --type cache --cachevol cache vg_colchicifolium
Erase all existing data on vg_colchicifolium/cache? [y/n]: y
modprobe: ERROR: could not insert 'dm_cache_smq': Operation not permitted
/sbin/modprobe failed: 1
modprobe: ERROR: could not insert 'dm_cache_smq': Operation not permitted
/sbin/modprobe failed: 1
device-mapper: reload ioctl on (254:0) failed: Invalid argument
Failed to suspend logical volume vg_colchicifolium/srv.
Command on LV vg_colchicifolium/cache_cvol requires LV with properties: lv_is_visible .
That's because the kernel module can't be loaded. Reboot and try again.
See also the lvmcache(7) manual page for further instructions.
Troubleshooting
Recover previous lv configuration after wrong operation
You've just made a mistake and resized the wrong LV, or maybe resized the LV without resizing the filesystem first. Here's what you can do:
- Stop all processes reading and writing from the volume that was mistakenly resized as soon as possible
- Note that you might need to forcibly kill the processes. However, forcibly killing a database is generally not a good idea.
- Look into /etc/lvm/archive and find the latest archive. Inspect the file in that latest archive to confirm that the sizes and names of all LVs are correct and match the state prior to the modification (see the listing example at the end of this section).
- Unmount all volumes from all LVs in the volume group if that's possible. Don't forget bind mounts as well.
- If your "/" partition is in one of the LVs you might need to reboot into a rescue system to perform the recovery.
-
Deactivate all volumes in the group:
vgchange -a n vg_name
-
Restore the lvm config archive:
vgcfgrestore -f /etc/lvm/archive/vg_name_00007-745337126.vg vg_name
-
Re-enable the LVs:
vgchange -a y vg_name
-
You'll probably want to run a filesystem check on the volume that was wrongly resized. Watch out for what errors happen during the fsck: if it's encountering many issues, especially with unknown or erroneous files, you might want to consider restoring data from backup.
fsck /dev/vg_name/lv-that-was-mistakenly-resized
-
Once that's done, if the state of all things seems ok, you can mount all of the volumes back up:
mount -a
-
Finally, you can now start the processes that use the LVs.
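As a complement to the archive inspection step above: instead of picking through the files in /etc/lvm/archive by hand, vgcfgrestore can list the available archives with their timestamps and the command that created them, which makes it easier to choose the right one:
vgcfgrestore --list vg_name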