Integrating LVM with Hadoop and providing Elasticity to DataNode Storage

In this blog, I am going to discuss how we can integrate LVM with Hadoop to provide elasticity to the DataNode storage. Let me first explain the task which I am going to perform.

Task Description :

We are going to integrate LVM with a Hadoop DataNode so that the storage the DataNode contributes to the cluster can be increased on the fly, without stopping the DataNode. For this task, there are some prerequisites.

Prerequisite :

  • Knowledge of Linux Partition and LVM

Hadoop :

Apache Hadoop is an open-source framework for distributed storage and processing of large datasets. Its storage layer, HDFS, consists of a NameNode that manages metadata and DataNodes that contribute their local storage to the cluster; it is this DataNode storage that we will make elastic.

Logical Volume Management :

Logical Volume Management (LVM) is a storage-virtualization layer in Linux. Disks become physical volumes (PVs), PVs are pooled into volume groups (VGs), and logical volumes (LVs) are carved out of a VG. An LV can be resized on the fly, which is exactly the property we need for elasticity.

Now let’s begin the practical with a simple Hadoop cluster that has only one DataNode.

Step 1: Start the NameNode Service

hadoop-daemon.sh start namenode
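
We can verify that the NameNode process is actually running with the jps command (it ships with the JDK and lists the Java processes on the machine; NameNode should show up in its output):

jps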

We can also check that no DataNode is connected to the NameNode.

hadoop dfsadmin -report

Here we can clearly see that no DataNode is available.

Step 2: Add Hard Disk to the DataNode

Now we can check the hard disk with the fdisk command.

fdisk -l

Here we can see that a new hard disk of 50GiB, named /dev/sdb, has been added.
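
If you prefer a tree view of the block devices and their partitions, the lsblk command shows the same information:

lsblk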

Step 3: Create Physical Volume from that hard disk.

pvcreate /dev/sdb

Also, we can display the physical volume with the pvdisplay command.

pvdisplay /dev/sdb

Here a physical volume of size 50GiB is created. Also, we can see that the physical volume is not allocatable yet, so we have to allocate it to some volume group.
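
LVM also provides pvs, which prints a compact one-line summary per physical volume and is handy for quick checks:

pvs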

Step 4: Create the Volume Group

To create the volume group, we have to use the vgcreate command.

vgcreate dnvg /dev/sdb

Also, we can display the volume group with the vgdisplay command.

vgdisplay dnvg

Now we can check whether the physical volume is allocated or not by running pvdisplay again.
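
pvdisplay /dev/sdb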

We can clearly see that Allocatable is now yes and that the physical volume is allocated to the volume group dnvg.

Step 5: Create Logical Volume of Size 30GiB

To create the logical volume, we have to use the lvcreate command.

lvcreate --size 30G --name dnlv dnvg

We can display the logical volume with the lvdisplay command.

lvdisplay dnvg/dnlv

We can clearly see that one logical volume of size 30GiB is created.
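
As an aside, lvcreate can also size a volume relative to the volume group instead of in absolute units, via the -l/--extents option. A hypothetical example (dnlv2 is just an illustrative name) that would take 60% of dnvg:

lvcreate --extents 60%VG --name dnlv2 dnvg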

Step 6: Format the Logical Volume

To format the logical volume with an ext4 filesystem, we have to use the mkfs command. ext4 is a good choice here because an ext4 filesystem can be grown online later, which is exactly what we need for elasticity.

mkfs.ext4 /dev/dnvg/dnlv

Step 7: Mount the Logical Volume on the DataNode Directory
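
Before mounting, the mount point must exist and should be the same directory that the DataNode is configured to use. Here I assume that directory is /dn, set in hdfs-site.xml (the relevant property is dfs.data.dir on Hadoop 1.x and dfs.datanode.data.dir on Hadoop 2 and later):

mkdir /dn

<property>
    <name>dfs.data.dir</name>
    <value>/dn</value>
</property>

Now we can mount the logical volume on /dn.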

mount /dev/dnvg/dnlv /dn

We can also check whether the logical volume is mounted or not with the df command.

df -h
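
Another quick confirmation is findmnt, which prints the source device backing a given mount point:

findmnt /dn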

Step 8: Start the DataNode Service

hadoop-daemon.sh start datanode

Now we can see the report of the Hadoop cluster to check how much storage is shared.

hadoop dfsadmin -report

Here we can clearly see that 30GiB of storage is shared. Now we have to increase this storage online, i.e., elastically: the storage will grow without stopping the DataNode.

Step 9: Increase the Logical Volume Size

lvextend --size +10G /dev/dnvg/dnlv
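
We can confirm the new size of the logical volume with the lvdisplay command:

lvdisplay dnvg/dnlv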

Now we can see that the size of the logical volume has been increased from 30GiB to 40GiB.

But if we check the size of the filesystem mounted on the /dn directory, it is still 30GiB:
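
df -h /dn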

It is still 30GiB because the filesystem on the logical volume has not been resized yet; extending the LV only adds raw space, and the filesystem metadata (including the inode table) still describes a 30GiB filesystem.

Step 10: Resize the Filesystem on the Extended Logical Volume

To grow the ext4 filesystem so that it covers the extended logical volume, without reformatting or losing data, we have to use the resize2fs command.

resize2fs /dev/dnvg/dnlv
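
As an aside, steps 9 and 10 can be combined: lvextend has a -r/--resizefs option that resizes the filesystem in the same command. A minimal sketch, assuming the same volume as above:

lvextend --resizefs --size +10G /dev/dnvg/dnlv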

We can also check whether the size of the filesystem has increased with the df command.

df -h

Initially, the size of the filesystem was 30GiB, but after resizing it increased to 40GiB.

We can also check the report of the Hadoop cluster to confirm that the shared storage has increased:
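
hadoop dfsadmin -report

The configured capacity of the DataNode should now show the increased size, and at no point did we have to stop the DataNode service.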

Thank You !!

Hope you like it !!

I'm a passionate learner diving into the concepts of computing 💻