This article also has an alternative title:
How I Learned to Stop Worrying and Love my Team
This is a story of troubleshooting cloud disk volumes (long post).
Cloud Disk Volume
Working with data disk volumes in the cloud has a few benefits. One of them is that when the volume runs out of space, you can just increase it! No need to replace the disk, no need to buy a new one, no need to transfer 1TB of data from one disk to another. It is a very simple matter.
Partitions Vs Disks
My personal opinion is not to use partitions. Cloud data disks on EVS (Elastic Volume Service), or cloud volumes for short, do not need a partition table. You can use the entire disk for data.
Use: /dev/vdb
instead of /dev/vdb1
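To show why I prefer the whole disk, here is a minimal sketch that only prints the resize steps for an ext4 filesystem sitting directly on the device. The function name is hypothetical, and /dev/vdb and /mnt are just example values. Notice there is no partition table step at all.

```shell
# Print (do NOT run) the resize steps for an ext4 filesystem that lives
# directly on the whole disk, with no partition table.
# print_whole_disk_resize is a hypothetical helper name.
print_whole_disk_resize() (
    dev=$1   # whole-disk device, e.g. /dev/vdb
    mnt=$2   # mount point, e.g. /mnt
    cat <<EOF
umount $mnt
# ... increase the volume in the cloud dashboard ...
e2fsck -f $dev
resize2fs $dev
mount $dev $mnt
EOF
)

print_whole_disk_resize /dev/vdb /mnt
```

Without a partition there is no start sector to get wrong, which is the whole point of this story.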
Filesystem
You have to choose your filesystem carefully. You can use XFS, which supports online resizing via xfs_growfs, but you cannot shrink it. But I understand that most of us are used to working with the extended filesystem ext4, and to be honest I also feel more comfortable with ext4.
You can read the extensive Wikipedia article Comparison of file systems for more info, and you can search online regarding performance between xfs and ext4. They are really close to each other nowadays.
Increase Disk
Today, working on a simple operational task (increasing a cloud disk volume), I followed the official documentation. This is something that I have done in the past like a million times. To document it properly I will use Red Hat's examples:
In a nutshell
- Unmount data disk
- Increase disk volume within the cloud dashboard
- Extend (change) the geometry
- Check filesystem
- Resize ext4 filesystem
- Mount data disk
Commands
Let’s present the commands for reference:
# umount /dev/vdb1
[increase cloud disk volume]
# partprobe
# fdisk /dev/vdb
[delete partition]
[create partition]
# partprobe
# e2fsck /dev/vdb1
# e2fsck -f /dev/vdb1
# resize2fs /dev/vdb1
# mount /dev/vdb1
And here is fdisk in more detail:
Fdisk
# fdisk /dev/vdb
Welcome to fdisk (util-linux 2.27.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
Command (m for help): p
Disk /dev/vdb: 1.4 TiB, 1503238553600 bytes, 2936012800 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x0004e2c8
Device Boot Start End Sectors Size Id Type
/dev/vdb1 1 2097151999 2097151999 1000G 83 Linux
Delete
Command (m for help): d
Selected partition 1
Partition 1 has been deleted.
Create
Command (m for help): n
Partition type
p primary (0 primary, 0 extended, 4 free)
e extended (container for logical partitions)
Select (default p): p
Partition number (1-4, default 1):
First sector (2048-2936012799, default 2048):
Last sector, +sectors or +size{K,M,G,T,P} (2048-2936012799, default 2936012799):
Created a new partition 1 of type 'Linux' and of size 1.4 TiB.
Command (m for help): p
Disk /dev/vdb: 1.4 TiB, 1503238553600 bytes, 2936012800 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x0004e2c8
Device Boot Start End Sectors Size Id Type
/dev/vdb1 2048 2936012799 2936010752 1.4T 83 Linux
Write
Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
File system consistency check
An interesting error occurred, something that I had never seen before when using e2fsck:
# e2fsck /dev/vdb1
e2fsck 1.42.13 (17-May-2015)
ext2fs_open2: Bad magic number in super-block
e2fsck: Superblock invalid, trying backup blocks...
e2fsck: Bad magic number in super-block while trying to open /dev/vdb1
The superblock could not be read or does not describe a valid ext2/ext3/ext4
filesystem. If the device is valid and it really contains an ext2/ext3/ext4
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>
or
e2fsck -b 32768 <device>
Superblock invalid, trying backup blocks
Panic
I think I lost 1 TB of files!
At that point, I informed my team to raise awareness.
Yes I know, I was a bit sad at that moment. I’ve done this work a million times before, and the Impostor Syndrome kicked in!
Snapshot
I was lucky because I could create a snapshot, detach the disk from the VM, create a new disk from the snapshot and work on the new (test) disk to try recovering 1TB of lost files!
Make File System
mke2fs has a dry-run option (-n) that shows where the superblock backups would be stored, without actually creating a filesystem:
mke2fs 1.42.13 (17-May-2015)
Creating filesystem with 367001344 4k blocks and 91750400 inodes
Filesystem UUID: f130f422-2ad7-4f36-a6cb-6984da34ead1
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
102400000, 214990848
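That list is not random. With the sparse_super feature, ext filesystems keep superblock backups in block groups 1 and in the powers of 3, 5 and 7. Here is a rough sketch that reproduces the list, assuming a 4k block size and 32768 blocks per group (which matches the output above); the function name is mine.

```shell
# List the backup superblock locations for a sparse_super ext filesystem.
# Assumptions: 4k blocks, 32768 blocks per group, first data block is 0.
# backup_superblocks is a hypothetical helper name.
backup_superblocks() (
    total_blocks=$1
    blocks_per_group=${2:-32768}
    # number of block groups, rounded up
    groups=$(( (total_blocks + blocks_per_group - 1) / blocks_per_group ))
    {
        if [ "$groups" -gt 1 ]; then echo 1; fi
        for base in 3 5 7; do
            g=$base
            while [ "$g" -lt "$groups" ]; do
                echo "$g"
                g=$(( g * base ))
            done
        done
    } | sort -n | while read -r g; do
        # the backup sits in the first block of its group
        echo $(( g * blocks_per_group ))
    done
)

backup_superblocks 367001344
```

For 367001344 blocks this prints the same 18 locations, from 32768 up to 214990848.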
Testing super blocks
So I created a small script to test every superblock against /dev/vdb1:
e2fsck -b 32768 /dev/vdb1
e2fsck -b 98304 /dev/vdb1
e2fsck -b 163840 /dev/vdb1
e2fsck -b 229376 /dev/vdb1
e2fsck -b 294912 /dev/vdb1
e2fsck -b 819200 /dev/vdb1
e2fsck -b 884736 /dev/vdb1
e2fsck -b 1605632 /dev/vdb1
e2fsck -b 2654208 /dev/vdb1
e2fsck -b 4096000 /dev/vdb1
e2fsck -b 7962624 /dev/vdb1
e2fsck -b 11239424 /dev/vdb1
e2fsck -b 20480000 /dev/vdb1
e2fsck -b 23887872 /dev/vdb1
e2fsck -b 71663616 /dev/vdb1
e2fsck -b 78675968 /dev/vdb1
e2fsck -b 102400000 /dev/vdb1
e2fsck -b 214990848 /dev/vdb1
Unfortunately none of the above commands worked!
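In hindsight, the list of commands above could be generated with a loop. The sketch below only prints the commands instead of running them, and it adds e2fsck's -n flag so the filesystem is opened read-only while probing. That flag is my addition here, not what I ran at the time, and the function name is hypothetical.

```shell
# Print one read-only e2fsck probe per backup superblock.
# print_superblock_checks is a hypothetical helper name.
print_superblock_checks() (
    dev=$1
    shift
    for sb in "$@"; do
        # -n: open read-only and answer "no" to all questions
        # -b: use the given backup superblock
        echo "e2fsck -n -b $sb $dev"
    done
)

print_superblock_checks /dev/vdb1 32768 98304 163840
```

Pipe the output to sh only once you are happy with what it is about to run, and preferably on a disk created from a snapshot.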
last-ditch recovery method
There is a nuclear option. DO NOT DO IT!
mke2fs -S /dev/vdb1
Write superblock and group descriptors only. This is useful if all of the superblock and backup superblocks are corrupted, and a last-ditch recovery method is desired. It causes mke2fs to reinitialize the superblock and group descriptors, while not touching the inode table and the block and inode bitmaps.
Then e2fsck -y -f /dev/vdb1 moved 1TB of files under lost+found, with the inode number as the name of every file.
I cannot stress this enough: DO NOT DO IT !
Misalignment
So what is the issue?
Compare the fdisk output for the 1TB and the 1.4TB partition:
Device Boot Start End Sectors Size Id Type
/dev/vdb1 1 2097151999 2097151999 1000G 83 Linux
Device Boot Start End Sectors Size Id Type
/dev/vdb1 2048 2936012799 2936010752 1.4T 83 Linux
The First sector is now at 2048 instead of 1.
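To make the damage concrete: the ext superblock lives 1024 bytes after the start of the partition, and its magic number (0xEF53) is at offset 56 inside it. A quick sketch of where e2fsck looks for that magic on the raw disk, assuming 512-byte sectors as in the fdisk output; the function name is hypothetical.

```shell
# Byte offset on the raw disk where e2fsck expects the ext magic number.
# Assumption: 512-byte sectors. ext_magic_offset is a hypothetical name.
SECTOR_SIZE=512

ext_magic_offset() {
    # partition start in bytes + superblock offset (1024) + s_magic offset (56)
    echo $(( $1 * SECTOR_SIZE + 1024 + 56 ))
}

echo "old layout (start=1):    $(ext_magic_offset 1)"
echo "new layout (start=2048): $(ext_magic_offset 2048)"
```

With the start sector moved to 2048, e2fsck was looking almost 1 MiB too far into the disk, at plain file data, for every superblock and every backup. The data itself was never touched.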
Okay, delete the disk, create a new one from the snapshot and try again.
Fdisk Part Two
Now it is time to manually set the first sector to 1.
# fdisk /dev/vdb
Welcome to fdisk (util-linux 2.27.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
Command (m for help): p
Disk /dev/vdb: 1.4 TiB, 1503238553600 bytes, 2936012800 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x0004e2c8
Device Boot Start End Sectors Size Id Type
/dev/vdb1 2048 2936012799 2936010752 1.4T 83 Linux
Command (m for help): d
Selected partition 1
Partition 1 has been deleted.
Command (m for help): n
Partition type
p primary (0 primary, 0 extended, 4 free)
e extended (container for logical partitions)
Select (default p): p
Partition number (1-4, default 1): 1
First sector (2048-2936012799, default 2048): 1
Value out of range.
Value out of range.
damn it!
sfdisk
In our SRE team, we use something like a Bat-Signal to ask for all hands on a problem, and that was what we were doing. A colleague made the point that fdisk is not the best tool for the job and that we should use sfdisk instead. I actually use sfdisk to back up and restore partition tables, but I was trying not to deviate from the documentation and I was not sure that everybody knew how to use sfdisk.
Then another colleague suggested using a similar 1TB disk from another VM.
I could hear the gears in my mind working…
sfdisk export partition table
sfdisk -d /dev/vdb > vdb.out
# fdisk -l /dev/vdb
Disk /dev/vdb: 1000 GiB, 1073741824000 bytes, 2097152000 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x0009e732
Device Boot Start End Sectors Size Id Type
/dev/vdb1 1 2097151999 2097151999 1000G 83 Linux
# sfdisk -d /dev/vdb > vdb.out
# cat vdb.out
label: dos
label-id: 0x0009e732
device: /dev/vdb
unit: sectors
/dev/vdb1 : start= 1, size= 2097151999, type=83
Okay, we have something here to work with: the start sector is 1 and the geometry is 1TB for an ext filesystem. It is identical to the initial partition table (before using fdisk).
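Lesson learned: check the start sector in a dump like this before touching the partition. A minimal sketch that pulls the start sector out of sfdisk -d output read from stdin; the function name is hypothetical and it assumes a single partition line.

```shell
# Extract the start sector from an sfdisk dump read on stdin.
# partition_start is a hypothetical helper name.
partition_start() {
    sed -n 's/.*start= *\([0-9][0-9]*\),.*/\1/p'
}

printf '%s\n' \
    'label: dos' \
    'unit: sectors' \
    '/dev/vdb1 : start= 1, size= 2097151999, type=83' | partition_start
```

On a real disk that would be something like sfdisk -d /dev/vdb | partition_start, run before deleting anything.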
sfdisk restore partition table
sfdisk /dev/vdb < vdb.out
# sfdisk /dev/vdb < vdb.out
Checking that no-one is using this disk right now ... OK
Disk /dev/vdb: 1.4 TiB, 1503238553600 bytes, 2936012800 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x0004e2c8
Old situation:
Device Boot Start End Sectors Size Id Type
/dev/vdb1 2048 2936012799 2936010752 1.4T 83 Linux
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Created a new DOS disklabel with disk identifier 0x0009e732.
Created a new partition 1 of type 'Linux' and of size 1000 GiB.
/dev/vdb2:
New situation:
Device Boot Start End Sectors Size Id Type
/dev/vdb1 1 2097151999 2097151999 1000G 83 Linux
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
# fdisk -l /dev/vdb
Disk /dev/vdb: 1.4 TiB, 1503238553600 bytes, 2936012800 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x0009e732
Device Boot Start End Sectors Size Id Type
/dev/vdb1 1 2097151999 2097151999 1000G 83 Linux
Filesystem Check ?
# e2fsck -f /dev/vdb1
e2fsck 1.42.13 (17-May-2015)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
SATADISK: 766227/65536000 files (1.9% non-contiguous), 200102796/262143999 blocks
f#ck YES
Mount ?
# mount /dev/vdb1 /mnt
# df -h /mnt
Filesystem Size Used Avail Use% Mounted on
/dev/vdb1 985G 748G 187G 81% /mnt
f3ck Yeah !!
Extend geometry
It is time to extend the partition geometry to 1.4TB with sfdisk.
If you remember from the fdisk output:
Device Boot Start End Sectors Size Id Type
/dev/vdb1 1 2097151999 2097151999 1000G 83 Linux
/dev/vdb1 2048 2936012799 2936010752 1.4T 83 Linux
The 1.4T partition has 2936010752 sectors in total.
The End sector of 1.4T is 2936012799.
Simple math problem: End Sector - Sectors = 2936012799 - 2936010752 = 2047
The previous fdisk command had the Start Sector at 2048,
so 2048 - 2047 = 1,
the preferable Start Sector!
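To double-check these numbers, here is a small sketch in shell arithmetic. It assumes 512-byte sectors and a 4k filesystem block size, as in the outputs in this post; the function names are mine.

```shell
# Sanity-check the partition numbers before writing them to disk.
# Assumptions: 512-byte sectors, 4k filesystem blocks.
# last_sector and fs_blocks are hypothetical helper names.

last_sector() {
    # last sector = start + number of sectors - 1
    echo $(( $1 + $2 - 1 ))
}

fs_blocks() {
    # number of 4k filesystem blocks that fit in that many 512-byte sectors
    echo $(( $1 * 512 / 4096 ))
}

echo "fdisk layout,  start=2048: ends at $(last_sector 2048 2936010752)"
echo "sfdisk layout, start=1:    ends at $(last_sector 1 2936010752)"
echo "4k blocks after resize:    $(fs_blocks 2936010752)"
```

With the same size and a start sector of 1, the partition ends at sector 2936010752, which still fits inside the 2936012800 sectors of the disk.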
New sfdisk
Edit the vdb.out text file to represent our new situation:
# diff vdb.out vdb.out.14
6c6
< /dev/vdb1 : start= 1, size= 2097151999, type=83
---
> /dev/vdb1 : start= 1, size= 2936010752, type=83
1.4TB
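I edited the file by hand, but the same one-line change could be scripted. A minimal sketch with sed, reading the dump on stdin; the function name is hypothetical and it assumes the dump has a single partition line.

```shell
# Rewrite the size= field of an sfdisk dump read on stdin.
# set_partition_size is a hypothetical helper name.
set_partition_size() {
    sed "s/size= *[0-9][0-9]*/size= $1/"
}

printf '%s\n' '/dev/vdb1 : start= 1, size= 2097151999, type=83' |
    set_partition_size 2936010752
```

With the real file that would be set_partition_size 2936010752 < vdb.out > vdb.out.14, and the diff against vdb.out should show only the size change.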
Let’s put everything together
# sfdisk /dev/vdb < vdb.out.14
Checking that no-one is using this disk right now ... OK
Disk /dev/vdb: 1.4 TiB, 1503238553600 bytes, 2936012800 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x0009e732
Old situation:
Device Boot Start End Sectors Size Id Type
/dev/vdb1 1 2097151999 2097151999 1000G 83 Linux
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Created a new DOS disklabel with disk identifier 0x0009e732.
Created a new partition 1 of type 'Linux' and of size 1.4 TiB.
/dev/vdb2:
New situation:
Device Boot Start End Sectors Size Id Type
/dev/vdb1 1 2936010752 2936010752 1.4T 83 Linux
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
# e2fsck /dev/vdb1
e2fsck 1.42.13 (17-May-2015)
SATADISK: clean, 766227/65536000 files, 200102796/262143999 blocks
# e2fsck -f /dev/vdb1
e2fsck 1.42.13 (17-May-2015)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
SATADISK: 766227/65536000 files (1.9% non-contiguous), 200102796/262143999 blocks
# resize2fs /dev/vdb1
resize2fs 1.42.13 (17-May-2015)
Resizing the filesystem on /dev/vdb1 to 367001344 (4k) blocks.
The filesystem on /dev/vdb1 is now 367001344 (4k) blocks long.
# mount /dev/vdb1 /mnt
# df -h /mnt
Filesystem Size Used Avail Use% Mounted on
/dev/vdb1 1.4T 748G 561G 58% /mnt
Finally!!
Partition Alignment
By the way, you can read this amazing article to fully understand why this happened: