Data Backups with Bacula: Creating the Storage Partition
Introduction
In this endeavor I've chosen to use the Bacula backup software for the reasons outlined above. First we must talk about the Bacula server, although the concept outlined here applies to any backup server. The main question that needs addressing is this: in a virtualized environment, should we back up our data to a virtual machine with the proper backup software installed, or to a standalone physical machine?
In my humble opinion, each solution has advantages and disadvantages. Backing up to a standalone physical machine requires additional hardware, which means spending additional money. In virtual environments, where virtual images are usually stored on a NAS/SAN solution, it's best to mount the hard drive over the network in order to use it in a virtual machine or a hypervisor. If a hypervisor (like ESX) malfunctions, we would somehow need to work with that image directly; alternatively, we could install another ESX instance, copy the image there, and import it into the hypervisor normally. Multiple solutions can therefore be used:
- Bacula on NAS: the best way to back up the data is to store it on the NAS/SAN appliance itself, like a Synology box. In this case, we can use the NAS/SAN directly without having to worry about the virtualized environment; in other words, the virtual environment can cease working and the backup server will still function. It won't back up the data from virtual machines, since the hypervisor is down, but the other physical machines will still be backed up regularly. The problem with such a design decision is multi-fold:
- Bacula on Hypervisor: hypervisors (like ESX) are usually built upon certain versions of Linux, but installing third-party packages is not supported, so as not to break anything. The problems are similar to those encountered when installing the backup software on a NAS.
- Bacula on a VM: we can have a virtual machine which uses a shared network drive to store backup data. This is what we'll be using in our setup, where we'll create an iSCSI target on a NAS/SAN solution and connect it to the ESXi hypervisor. Then we'll add a hard drive to a separate virtual machine, let's call it backupVM, where the backup software will store the backup data on the newly attached hard drive. Below are the steps needed to create such an environment.
First we need to create an iSCSI LUN by using the LUN configuration wizard as shown below. Note that in this case a Synology NAS was used, but any other solution that supports LUN/iSCSI can be used as well.
Once the iSCSI LUN has been created, we must also create an iSCSI Target, which is a fairly simple process. The end result should look something like what is presented below, where 1.5TB of storage has been assigned to the LUN.
Once the iSCSI LUN and iSCSI target have been created on the Synology NAS, we should go to ESX and connect to the configured LUN. We can do that by going to Configuration > Storage Adapters, selecting the iSCSI Software Adapter, right-clicking it, and choosing Properties as presented below.
In the dialog that opens, we have to enable iSCSI so that the status is shown as Enabled.
Click on 'Dynamic Discovery' and add the IP address of the Synology SAN/NAS and close the dialog. Afterwards, rescan the host bus adapter, which should detect the newly added LUN.
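For readers who prefer the ESXi command line over the dialogs, the enable, discovery, and rescan steps can be sketched with esxcli. This is only a rough equivalent under assumptions: the adapter name vmhba33 and the address 192.168.1.10:3260 are hypothetical placeholders, not values from this setup, and the helper below merely prints the commands so they can be reviewed before being run on the host.

```shell
#!/bin/sh
# Print (do not run) esxcli commands that roughly mirror the GUI steps:
# enable software iSCSI, add a dynamic discovery target, rescan the adapter.
# Both arguments are hypothetical placeholders; substitute your own values.
iscsi_setup_commands() {
    adapter="$1"    # name of the software iSCSI adapter, e.g. vmhba33
    nas_addr="$2"   # NAS IP address and iSCSI port, e.g. 192.168.1.10:3260
    echo "esxcli iscsi software set --enabled=true"
    echo "esxcli iscsi adapter discovery sendtarget add --adapter=$adapter --address=$nas_addr"
    echo "esxcli storage core adapter rescan --adapter=$adapter"
}

iscsi_setup_commands vmhba33 192.168.1.10:3260
```

The actual adapter name can be looked up on the host with "esxcli iscsi adapter list" before running the printed commands.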
Next, we shouldn't add the found LUN as a VMDK datastore to ESX by selecting Storage > Add Storage. If we do, we won't be able to add a Raw Device Mapping (RDM) to the virtual machine; the option will be grayed out. Instead, we have to add a separate hard disk to the virtual machine, which will be used exclusively for backups. In order to do that, edit the virtual machine settings and add a new hard disk as presented below.
We have to add a Raw Device Mapping as shown below; if you've added the iSCSI LUN as a VMDK datastore to ESX, this option will be disabled and you won't be able to add it to the virtual machine. You may be wondering how that option is better than using a virtual disk. The main reason is speed: an RDM is faster than a virtual disk, as the basic speed comparisons in [6] show.
Under 'Compatibility Mode' we can choose between physical and virtual compatibility mode. In physical compatibility mode, the SCSI commands are passed directly to the device, while in virtual compatibility mode only READ/WRITE commands are passed to the device. Physical compatibility mode exposes the underlying hardware, while virtual compatibility mode hides the hardware features.
After completing the wizard and adding the new hard disk to the virtual machine, we can start the virtual machine normally. Once the virtual machine has booted, the additional hard drive will be present as one of the /dev/sd* devices. If we execute the "fdisk -l" command, we can see a hard drive of the same size as the one created on the Synology NAS.
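The two compatibility modes also exist on the ESXi command line, where vmkfstools creates the RDM pointer file with -z for physical (pass-through) mode and -r for virtual mode. The sketch below only prints the command for review; the device path and the pointer file path in the usage line are hypothetical placeholders, and the wizard described in this article remains the method actually used here.

```shell
#!/bin/sh
# Print (do not run) the vmkfstools command that creates an RDM pointer file.
#   physical -> -z (pass-through RDM, SCSI commands go to the device)
#   virtual  -> -r (virtual compatibility mode RDM)
rdm_create_command() {
    mode="$1"      # "physical" or "virtual"
    device="$2"    # raw device, e.g. /vmfs/devices/disks/naa.EXAMPLE (placeholder)
    pointer="$3"   # pointer .vmdk on a VMFS datastore (placeholder path)
    case "$mode" in
        physical) flag="-z" ;;
        virtual)  flag="-r" ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    echo "vmkfstools $flag $device $pointer"
}

rdm_create_command physical /vmfs/devices/disks/naa.EXAMPLE \
    /vmfs/volumes/datastore1/backupVM/backup-rdm.vmdk
```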
Next, we need to create a partition on the device by using fdisk commands as seen below. The p command inside fdisk prints all partitions of the current disk, the n command creates a new partition, and the w command saves all the changes to the partition table.
[plain]
# fdisk /dev/sdb
Command (m for help): p
Command (m for help): n
Partition type:
p primary (0 primary, 0 extended, 4 free)
e extended
Select (default p): p
Partition number (1-4, default 1):
Using default value 1
First sector (2048-1572863999, default 2048):
Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-1572863999, default 1572863999):
Using default value 1572863999
Command (m for help): p
Command (m for help): w
The partition table has been altered!
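As a quick sanity check, the sector range printed by fdisk can be converted into a disk size. The helper below is a minimal sketch that assumes 512-byte logical sectors (the fdisk default on most disks; the real value is shown by "fdisk -l"). Sector numbering starts at 0, so a last sector of 1572863999 means 1572864000 sectors in total.

```shell
#!/bin/sh
# Convert a sector count into whole GiB.
# The second argument is the sector size in bytes; 512 is assumed if omitted.
sectors_to_gib() {
    sectors="$1"
    sector_size="${2:-512}"
    echo $(( sectors * sector_size / 1024 / 1024 / 1024 ))
}

# Total sectors on the disk from the fdisk session above.
sectors_to_gib 1572864000    # prints 750
```

The slightly smaller figure that df reports once the partition is formatted is expected, since the ext4 filesystem keeps part of the space for its own metadata.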
We also have to create the filesystem of our choosing; in this case we'll use ext4 filesystem, which we can create by using mkfs.ext4 command.
[plain]
# mkfs.ext4 /dev/sdb1
To test whether mounting the newly created and formatted partition works, the mount command is used.
[plain]
# mount /dev/sdb1 /mnt/
# ls /mnt/
We also need to add the partition entry to /etc/fstab, so that it will be mounted automatically when Linux boots. Since we're using the Debian Linux distribution, UUIDs are used in /etc/fstab, which is why we must first get the UUID of the new /dev/sdb1 partition. The UUID can be displayed by using the blkid command as root.
[plain]
# blkid | grep sdb1
/dev/sdb1: UUID="9b6da96b-ebfd-4fa9-a585-484adb7c3a02" TYPE="ext4"
In order to automatically mount the partition at /backup/ location, we have to put the following entry into the /etc/fstab configuration file.
[plain]
UUID=9b6da96b-ebfd-4fa9-a585-484adb7c3a02 /backup/ ext4 defaults 0 0
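To avoid copying the UUID by hand, the fstab entry can be generated from the blkid output. The two helper functions below are illustrative names of my own, not part of any tool; the sketch parses the sample line shown earlier instead of calling blkid, so it can be tried without root privileges.

```shell
#!/bin/sh
# Extract the UUID value from one line of blkid output.
# The leading whitespace in the pattern keeps PARTUUID= from matching.
uuid_from_blkid_line() {
    printf '%s\n' "$1" | sed -n 's/.*[[:space:]]UUID="\([^"]*\)".*/\1/p'
}

# Build an fstab entry from a UUID, a mount point, and a filesystem type,
# using the same options and dump/pass fields as the entry in this article.
fstab_entry() {
    printf 'UUID=%s %s %s defaults 0 0\n' "$1" "$2" "$3"
}

line='/dev/sdb1: UUID="9b6da96b-ebfd-4fa9-a585-484adb7c3a02" TYPE="ext4"'
fstab_entry "$(uuid_from_blkid_line "$line")" /backup/ ext4
```

On a live system, "blkid -s UUID -o value /dev/sdb1" prints just the UUID and could replace the sed step.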
To test whether the new entry works, we can issue the "mount -a" command, which mounts all filesystems listed in /etc/fstab. I received an error about the missing /backup/ folder, so I created it with the mkdir command and reran "mount -a" to mount the partition. At this point the df command should successfully see the newly mounted filesystem and report its usage statistics.
[plain]
# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sdb1 739G 197M 701G 1% /backup
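The error about the missing /backup/ folder can be avoided by creating the mount point before running "mount -a". The helper below is a small sketch with a name of my own choosing; it is demonstrated on a scratch directory so that it can be run without touching the real system, where the argument would be /backup.

```shell
#!/bin/sh
# Create the mount point if it does not exist yet, so that "mount -a"
# does not fail on a missing directory.
ensure_mount_dir() {
    dir="$1"
    [ -d "$dir" ] || mkdir -p "$dir"
}

# Demonstration on a throwaway directory instead of the real /backup.
scratch="${TMPDIR:-/tmp}/backup-demo.$$"
ensure_mount_dir "$scratch/backup"
[ -d "$scratch/backup" ] && echo "mount point ready: $scratch/backup"
rm -rf "$scratch"
```

On the backup server itself, running "ensure_mount_dir /backup" followed by "mount -a" as root would cover both steps.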
After everything has been done, we need to install the bacula-server package in the current virtual machine, which will serve as the Bacula backup server used to back up data throughout the internal network. The data will be stored on the previously attached and partitioned hard drive, which stores the data directly on the NAS LUN.
Conclusion
In this article we've presented how we can go about creating the storage partition that will be used by Bacula in our virtualized environment. First we created an iSCSI LUN on our NAS storage and configured ESXi hypervisor to use that LUN for data storage. Then we connected the data storage to a virtual machine as a raw device, where we had to create the partitions as well as the filesystem in order to be able to save information on the partition.
Having created the data storage and shared it with a virtual machine, we can continue by installing Bacula into that virtual machine and configuring it to use that data storage. The Bacula storage daemon is the one that will actually store information on the data partition, so the daemon must be configured properly in order to do that.
References
[1] Solid-state drive, https://en.wikipedia.org/wiki/Solid-state_drive.
[2] Tape drive, https://en.wikipedia.org/wiki/Tape_drive.
[3] List of backup software, https://en.wikipedia.org/wiki/List_of_backup_software.
[4] Bacula-Web, http://www.bacula-web.org/.
[5] The Bootstrap File, http://www.bacula.org/5.2.x-manuals/en/main/main/Bootstrap_File.html.
[6] ESXi 5.1: Using Raw Device Mappings (RDM) on an HP Microserver, http://forza-it.co.uk/esxi-5-1-using-raw-device-mappings-rdm-on-an-hp-microserver/.
[7] Bacula Installation and Configuration Guide, https://access.redhat.com/site/sites/default/files/attachments/install_1.pdf.
[8] Overview on modifying the Synology Server, bootstrap, ipkg etc, http://forum.synology.com/wiki/index.php/Overview_on_modifying_the_Synology_Server,_bootstrap,_ipkg_etc.
[9] Data Encryption, http://www.bacula.org/5.2.x-manuals/en/main/main/Data_Encryption.html.
[10] Messages Resource, http://www.bacula.org/5.2.x-manuals/en/main/main/Messages_Resource.html.