Orchestrating Your Storage: libStorageMgmt

NOTE: Updated 4/2/2015 to reflect new project links and updated command line syntax.

Abstract

This paper discusses some of the advanced features that can be used in modern storage subsystems to improve IT workflows. Being able to manage storage, whether it is direct-attached, a storage area network (SAN), or a networked file system, is vital. The ability to manage different vendor solutions consistently using the same tools opens a new range of storage-related solutions. LibStorageMgmt meets this need.

Introduction

Many of today’s storage subsystems have a range of features. For block storage, examples include creating, deleting, re-sizing, and copying volumes, as well as making space-efficient copies and mirrors. Networked file systems can offload copies of files or even entire file systems quickly while using little to no additional storage, or keep numerous read-only point-in-time copies of file system state. This allows users to quickly provision new virtual machines, take instantaneous copies of databases, and back up other files. For example, a user could quiesce a database, call the array to make a point-in-time copy of the data, and then resume database operations within seconds. The user could then take as much time as necessary to replicate the copy to a remote location or removable media. Many other valuable features are available through the array management interface. In fact, in many cases it’s necessary to use this out-of-band management interface to enable features that are then used in-band, across the data interface.

Problem

To use these advanced features, users must install proprietary tools and libraries for each array vendor. This allows users to fully exploit their hardware, but at the cost of learning new command line and graphical user interfaces and programming to new application programming interfaces (APIs) for each vendor. Open-source solutions frequently cannot use proprietary libraries to manage storage because of incompatible licensing. In other cases, the open-source developer cannot redistribute the vendor libraries, so end users must manually install all of the required pieces themselves. The Storage Networking Industry Association (SNIA) and the associated Storage Management Initiative Specification (SMI-S) have an ongoing effort to address this need with a well-defined and established storage standard. The standard is quite large, which prevents administrators and developers from leveraging it easily. With the scope and complexity of such a large standard, it is difficult for vendors to implement it without variations in behavior. The SMI-S members’ focus is on being providers of the API rather than consumers of it, so the emphasis is on the array provider’s perspective. Because the SMI-S standard must define an API for each new feature, the specification always trails vendor-defined APIs.

The LibStorageMgmt solution

The libStorageMgmt project’s goal is to provide an open-source, vendor-agnostic library and command line tool that lets administrators and developers leverage storage subsystem features in a consistent and unified manner. When a developer chooses to use the library, their users benefit from being able to use any of the supported arrays, as well as future arrays as they are added. The library is licensed under the LGPL, which allows use of the library in open-source and commercial applications. The command-line interface (lsmcli) has been designed with scriptability in mind, with configurable output to ease parsing. The library API has language bindings for C and Python. The library architecture uses plug-ins for easy integration with different arrays. The plug-ins execute in their own address space, allowing the plug-in developer to choose whichever license is most appropriate for their specific requirements. The separate address space also provides fault isolation in the event of a plug-in crash, which is especially helpful if the plug-in is provided in binary form only.

LibStorageMgmt currently has plug-in support for:

  • NetApp filer (ontap)
  • Linux LIO (targetd)
  • Nexentastor (nstor)
  • SMI-S (smispy) Note: feature support varies by provider
  • Array simulator (sim) Allows testing of client code/scripts without requiring an array

Support for additional arrays is in development and will be released as they become available.

Example: Live database backup

An administrator wants to take a live “hot” backup of a MySQL database to minimize disruption to end users. They use NetApp filers for their storage and would like to leverage the hardware features the filers provide for point-in-time, space-efficient copies. The database is located on an iSCSI logical disk provided by the filer. (These are referred to as volumes in libStorageMgmt.)

The overall flow of operations:

  • Craft a uniform resource identifier (URI) for the array for use with libStorageMgmt
  • Identify the appropriate disk and obtain its libStorageMgmt ID
  • Quiesce the database
  • Use libStorageMgmt to issue a command to the array to replicate the disk
  • Release the database to continue
  • Use libStorageMgmt to grant access to the newly created disk to an initiator so that it can be mounted and backed up

Crafting the URI

As the admin is using NetApp, they select the ontap plug-in by crafting a URI such as “ontap+ssl://root@filer_host_name_or_ip/”. The beginning of the URI names the plug-in, with an optional “+ssl” indicating that the user would like to use SSL for communication. The user “root” is used for authentication, and the filer can be addressed by hostname or IP address. This example uses the command line interface. We can either specify the URI on the command line with “-u” or set the environment variable LSMCLI_URI to avoid typing it for every command. lsmcli can prompt for the password with “-P”, or the password can be supplied in the environment variable LSMCLI_PASSWORD.
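
As a minimal sketch of the environment variable approach, the URI can be assembled once and exported. The helper name build_lsm_uri and the host filer.example.com are placeholders of ours, not part of lsmcli; only LSMCLI_URI comes from the text above.

```shell
# build_lsm_uri is a hypothetical helper (not part of lsmcli).
# It prints a URI of the form plugin[+ssl]://user@host/
build_lsm_uri() {
    plugin=$1
    user=$2
    host=$3
    use_ssl=${4:-yes}            # SSL unless the 4th argument says otherwise
    if [ "$use_ssl" = "yes" ]; then
        plugin="${plugin}+ssl"
    fi
    printf '%s://%s@%s/\n' "$plugin" "$user" "$host"
}

# Export once so later lsmcli calls do not need -u.
# filer.example.com is a placeholder host name.
LSMCLI_URI=$(build_lsm_uri ontap root filer.example.com)
export LSMCLI_URI
```

LSMCLI_PASSWORD is deliberately left unset here so that “-P” can prompt for the password instead of storing it in the shell environment.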

Identify the disk to replicate

The administrator queries the array to identify the volume that the database is located on. To correctly identify the disk, the admin first finds the device that holds the file system by looking up the file system’s UUID, then looks in /dev/disk/by-id to identify the specific disk.

# lsblk -f | grep cd15fc03-749e-4d5b-9960-b3936ff25a62
sdb ext4 cd15fc03-749e-4d5b-9960-b3936ff25a62 /mnt/db

$ ls -gG /dev/disk/by-id/ | grep sdb
lrwxrwxrwx. 1 9 Apr 30 12:24 scsi-360a98000696457714a346c4f5851304f -> ../../sdb
lrwxrwxrwx. 1 9 Apr 30 12:24 wwn-0x60a98000696457714a346c4f5851304f -> ../../sdb
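
The lookup above can be scripted. This is only a sketch: wwn_for_device is a helper name of ours, and it assumes the listing has the same layout as the output shown (a wwn-0x… link name and a ../../device target as the last field).

```shell
# wwn_for_device is a hypothetical helper. It reads an
# `ls -gG /dev/disk/by-id/` style listing on stdin and prints the WWN
# (without the "wwn-0x" prefix) whose symlink points at the given device.
wwn_for_device() {
    awk -v dev="$1" '
        # The last field is the link target, e.g. ../../sdb
        $NF == ("../../" dev) {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^wwn-0x/) {
                    id = $i
                    sub(/^wwn-0x/, "", id)
                    print id
                }
            }
        }'
}

# Usage (on a host with the disk attached):
#   ls -gG /dev/disk/by-id/ | wwn_for_device sdb
```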

We can now use the SCSI disk id to identify the disk on the array.

$ lsmcli list --type volumes -t" " | grep 60a98000696457714a346c4f5851304f
idWqJ4lOXQ0O /vol/lsm_lun_container_lsm_test_aggr/tony_vol 60a98000696457714a346c4f5851304f 512 102400 OK 52428800 987654-32-0 e284bcf0-68e5-11e1-ad9b-000c29659817

This command displays all the available volumes for the array and outputs a number of fields for each one. The fields are separated by a space (using -t" ") and appear in this order: ID, Name, vpd83, block size, #blocks, status, size bytes, system ID and pool ID.

Definitions of each:

  • ID – Array unique identifier for the Volume (virtual disk)
  • Name – Human readable name
  • vpd83 – SCSI Inquiry data for page 0x83
  • block size – Number of bytes in each disk block (512 is common)
  • #blocks – Number of blocks on disk
  • status – Current status of disk
  • size bytes – Current size of disk in bytes
  • system ID – Unique identifier for this array
  • pool ID – Unique storage pool that virtual disk resides on

So, the array ID for the volume we are interested in is idWqJ4lOXQ0O.
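
Rather than reading the ID off by eye, the field order above can be used to select it. A small sketch, assuming the space-separated output shown earlier and volume names without spaces; volume_id_for_vpd83 is our own helper name.

```shell
# volume_id_for_vpd83 is a hypothetical helper. It reads the output of
#   lsmcli list --type volumes -t" "
# on stdin and prints the ID (field 1) of the volume whose vpd83
# (field 3) equals the argument. Assumes names contain no spaces.
volume_id_for_vpd83() {
    # Appending "" forces a string comparison, never a numeric one.
    awk -v wwn="$1" '($3 "") == (wwn "") { print $1 }'
}

# Usage:
#   lsmcli list --type volumes -t" " | \
#       volume_id_for_vpd83 60a98000696457714a346c4f5851304f
```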

 

Quiesce the database

Before issuing the replicate command, quiesce the database. For MySQL this can be done by establishing a connection, running “FLUSH TABLES WITH READ LOCK”, and leaving the connection open.

Replicate the disk

To replicate the disk the user can issue the command (just outputting result ID for brevity):

$ lsmcli volume-replicate --vol idWqJ4lOXQ0O --rep-type CLONE --name "db_copy" -t" " | awk '{print $1;}'
idWqJ4qtb1f1

This command creates a clone (space-efficient copy) of a disk. The “--vol” argument specifies which volume ID to replicate, “--rep-type” is the type of replication to perform, and “--name” is the human readable name of the copy. For more information about the available options, type “lsmcli --help” or “man lsmcli”. The command returns the details of the newly created disk, in the same format as the volume listing shown above. In this example we just grabbed the volume ID, as that is all we need to grant access to it in the following steps.

Release the database

Once the replicate command has returned, you can issue “UNLOCK TABLES” or close the connection to the database.
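
Because the read lock only lasts while the locking connection stays open, one way to script the quiesce, replicate, and release steps is to run lsmcli from inside that same MySQL session. The sketch below only generates the SQL batch. It assumes the mysql client’s “system” command is available (Unix builds; some versions may restrict it) and that LSMCLI_URI and LSMCLI_PASSWORD are already set. snapshot_sql is a helper name of ours.

```shell
# snapshot_sql is a hypothetical helper that prints a SQL batch for the
# mysql client. The client's "system" command runs lsmcli while the
# session holding the read lock is still open.
snapshot_sql() {
    vol_id=$1
    copy_name=$2
    cat <<EOF
FLUSH TABLES WITH READ LOCK;
system lsmcli volume-replicate --vol $vol_id --rep-type CLONE --name "$copy_name"
UNLOCK TABLES;
EOF
}

# Usage (not run here; needs a MySQL server and an array):
#   snapshot_sql idWqJ4lOXQ0O db_copy | mysql -u root -p
```

Printing the batch first also makes it easy to review what will run before pointing it at a production database.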

 

Grant access to newly created disk

To access the newly created disk for backup we need to grant an initiator access to it. There are two different ways to do this. Some arrays support groups of initiators, which are referred to as access groups. For other arrays you specify individual mappings from initiator to volume. To determine which mechanism the array supports, we take a look at the capabilities listed for the array.

To find out what capabilities an array has, we need to find the system ID:

$ lsmcli list --type systems
ID          | Name        | Status | Info
-----------------------------------------
987654-32-0 | netappdevel | OK

Then issue the command to query the capabilities by passing the system id:

$ lsmcli capabilities --sys 987654-32-0 | grep ACCESS_GROUP

ACCESS_GROUP_GRANT:SUPPORTED
ACCESS_GROUP_REVOKE:SUPPORTED
ACCESS_GROUP_LIST:SUPPORTED
ACCESS_GROUP_CREATE:SUPPORTED
ACCESS_GROUP_DELETE:SUPPORTED
ACCESS_GROUP_ADD_INITIATOR:SUPPORTED
ACCESS_GROUP_DEL_INITIATOR:SUPPORTED
VOLUMES_ACCESSIBLE_BY_ACCESS_GROUP:SUPPORTED
ACCESS_GROUPS_GRANTED_TO_VOLUME:SUPPORTED
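
A script can branch on this output instead of relying on a visual check. A minimal sketch, assuming the NAME:SUPPORTED line format shown above; capability_supported is our own helper name.

```shell
# capability_supported is a hypothetical helper. It reads NAME:STATE
# lines (as in the capabilities output above) on stdin and succeeds
# only if the named capability is listed exactly as SUPPORTED.
capability_supported() {
    grep -qxF "$1:SUPPORTED"
}

# Usage: feed it the output of the capabilities query shown above, e.g.
#   if <capabilities query> | capability_supported ACCESS_GROUP_GRANT; then
#       echo "array supports access groups"
#   fi
```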

 

The ontap plug-in supports access groups. In this example, we know the initiator we want to use has iSCSI IQN iqn.1994-05.com.domain:01.89bd03. We will look up the access group that contains this IQN.

List the access groups, looking for the IQN of the host we want to back up to.

$ lsmcli list --type ACCESS_GROUPS

ID                               | Name    | Initiator IDs                    | System ID
-------------------------------------------------------------------------------------------
e11c718b99e26b1ca8b45f2df455c70b | fedora  | iqn.1994-05.com.domain:01.5d8644 | 987654-32-0
e11c718b99e26b1ca8b45f2df455c70b | fedora  | iqn.1994-05.com.domain:01.b7885f | 987654-32-0
0a9a917c8cf4183f4646534f5597eb02 | Tony_AG | iqn.1994-05.com.domain:01.89bd01 | 987654-32-0
0a9a917c8cf4183f4646534f5597eb02 | Tony_AG | iqn.1994-05.com.domain:01.89bd03 | 987654-32-0
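
The same lookup can be done with awk on the table output. This sketch assumes the default “|”-separated layout shown above, with the initiator in the third column; access_group_for_iqn is a helper name of ours.

```shell
# access_group_for_iqn is a hypothetical helper. It reads the table
# printed by `lsmcli list --type ACCESS_GROUPS` on stdin and prints the
# ID of each access group row whose initiator column equals the IQN.
access_group_for_iqn() {
    awk -F'|' -v iqn="$1" '
        {
            id = $1
            init = $3
            gsub(/^ +| +$/, "", id)      # trim padding around the ID
            gsub(/^ +| +$/, "", init)    # trim padding around the initiator
            if (init == iqn) print id
        }'
}

# Usage:
#   lsmcli list --type ACCESS_GROUPS | \
#       access_group_for_iqn iqn.1994-05.com.domain:01.89bd03
```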

 

The one we are interested in has ID 0a9a917c8cf4183f4646534f5597eb02. So at this point we can grant access for the new volume by issuing:

 

$ lsmcli volume-mask --ag 0a9a917c8cf4183f4646534f5597eb02 --vol idWqJ4qtb1f1

 

If the IQN of interest is not available it can be added to an existing access group or added to a new access group. An example of adding to an existing access group:

 

$ lsmcli access-group-add --ag 0a9a917c8cf4183f4646534f5597eb02 --init iqn.1994-05.com.domain:01.89bd04

 

To see what volumes are visible and accessible to an initiator we can issue:

 

$ lsmcli access-group-volumes --ag 0a9a917c8cf4183f4646534f5597eb02 -t" " -H
idWqJ4lOXQ0O /vol/lsm_lun_container_lsm_test_aggr/tony_vol 60a98000696457714a346c4f5851304f 512 102400 OK 50.00 MiB 987654-32-0 e284bcf0-68e5-11e1-ad9b-000c29659817
idWqJ4qtb1f1 /vol/lsm_lun_container_lsm_test_aggr/db_copy 60a98000696457714a34717462316631 512 102400 OK 50.00 MiB 987654-32-0 e284bcf0-68e5-11e1-ad9b-000c29659817
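
With -H the size column is shown in human readable units. As a quick check that “50.00 MiB” agrees with the block size and block count columns, the arithmetic can be reproduced (volume_size_mib is a helper name of ours):

```shell
# volume_size_mib is a hypothetical helper: block size in bytes times
# the number of blocks, converted to MiB (1 MiB = 1048576 bytes).
volume_size_mib() {
    awk -v bs="$1" -v n="$2" 'BEGIN { printf "%.2f\n", bs * n / 1048576 }'
}

volume_size_mib 512 102400   # prints 50.00 (52428800 bytes)
```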

 

At this point you need to re-scan for targets on the host. Please check documentation appropriate for your distribution. Once the disk is visible to the host it can then be mounted and then backed up as usual.

 

This sequence of steps would be the same regardless of vendor; only the URI would be different. Other operations that are currently available for volumes include: delete, re-size, replicate a range of logical blocks, access group creation and modification, and a number of ways to interrogate relationships between initiators and volumes. This, coupled with a stable API, gives developers and administrators a consistent way to leverage these valuable features.

Summary

Having a consistent and reliable way to manage storage allows for the creation of new applications that can benefit from such features. Quickly provisioning a new virtual machine by replicating a disk template with very little additional disk space is one such example. Having an open source project that can be improved, developed, and molded by a community of users will ensure the best possible solution. LibStorageMgmt is looking for contributors in all areas (e.g., users, developers, reviewers, array documentation, testing).

References

Documentation

Project: https://github.com/libstorage/libstoragemgmt/

Project documentation: http://libstorage.github.io/libstoragemgmt-doc/

Assistance

Mailing lists:

https://lists.fedorahosted.org/mailman/listinfo/libstoragemgmt-devel

https://lists.fedorahosted.org/mailman/listinfo/libstoragemgmt-users

IRC at #libStorageMgmt http://freenode.net

There is supposed to be hot water in the water heater, right?

A local contractor installed a new furnace and water heater for me on 1/21/2013. The install appeared to go well. The furnace is keeping our house warm and the water heater runs without making crazy noise and it is producing hot water. All is perfect in the world, well not quite…

While checking out the water heater (AO Smith GDHE-50) I noticed that the lower side connector was quite cold, and the brass drain was very cold too. My three other water heaters never exhibited anything like this when they had hot water in them. The valve and the side connectors were quite warm when the unit was at standby.

To quantify how much cold water is in the heater I did the following experiment. Immediately after the unit completed a heating cycle (120F, 8F differential) I turned the water heater off. I then closed the cold water valve to the water heater and opened a hot water faucet to allow air into the system. Then I systematically drained a gallon of water at a time from the water heater drain and took its temperature with a digital thermometer, repeating until I hit water that was 120F. My best guess at the start was that there were at least 16 gallons of fairly cold water in the heater, as each vertical inch is just under a gallon of water and the side connector is 16″ off the floor.

The water heater had 11 gallons of water < 60F. The results of this experiment indicate to me that something is wrong. With past water heaters, I was able to get very hot water instantly out of the drain.

OK, so what’s the problem as long as we have hot water coming out the top? Bacterial growth in the tank. See http://www.treehugger.com/green-food/is-it-safe-to-turn-down-your-water-heater-temperature.html

This looks like the ultimate petri dish.

Fool me once, shame on you, fool me twice …

American Water Heater Company, the maker of the water heater I installed, issued me a return authorization number for the water heater that would not run when installed per the instructions. I installed the new one (1/5) and this one works better (it will start), but still not great. It makes noise when starting, and the flame is quite yellow and badly shaped. I have posted videos of the start and flame for technical support to look at.

This is the replacement unit starting : http://youtu.be/_Bb3IgamFdo

I contacted technical support again via email and sent them video footage of the poor flame. After a few days technical support contacted me again and FedEx’d me a smaller orifice to try, a #30. The heater comes standard with a #29. I got this smaller orifice and installed it and the unit ran very poorly http://youtu.be/ml-uqXvCrp8

At this point I gave up. I contacted a local contractor and scheduled an install of a new furnace and water heater. I was done trying to make this water heater work.

The replacement water heater was returned to Lowe’s for a refund. Lowe’s was very helpful throughout this very frustrating experience.

I didn’t do anything wrong

Ben, a plumber from K&S came by and spent 3 hours going over my install and inspecting the water heater. He found nothing wrong with my install, whew! After talking to technical support we got the unit running by removing the intake and restricting the exhaust. We had hot water, but the install wasn’t going to pass code as it deviated from the installation instructions.

Water heaters go virtually unnoticed, that is until they don’t work!

You shall not heat!

As I was unable to correct the negative pressure causing my atmospherically vented natural gas water heater to backdraft at start, I decided to replace it. The 15 year old AO Smith 40 gallon, 40K BTU water heater had absolutely no issues, except that it left our household yearning for more hot water.

Obviously a new atmospheric vented water heater was out as it would exhibit the same problem as the old one. What other options are there?

  • Atmospheric (what we had)

These are the old type that have existed since the beginning of water heating. Burner on the bottom with a flue up the center which has a draft hood and vents out a chimney. Exhaust is very hot because of low efficiency and it naturally rises and vents out the chimney.

  • Direct vent

Sealed combustion, uses outside air for combustion. Non-powered (no electricity), horizontal and vertical venting with a special pipe-within-a-pipe that has very stringent requirements. Not used very often, as the water heater basically needs to be right against the wall. This is basically an atmospheric water heater with sealed venting and is typically not very energy efficient.

  • Powered direct vent

Sealed combustion, requires electricity, horizontal and vertical venting using plastic pipe. These have a wide range of efficiency ratings.

  • Power vent

Open combustion which uses inside air, requires electricity, horizontal and vertical venting using plastic pipe. These have a wide range of efficiency ratings.

Since 2003 water heaters have had Flammable Vapor Ignition Resistant (FVIR) technology. This prevents home owners from blowing up their houses due to heavier-than-air explosive vapors. It has caused water heaters to rise dramatically in complexity and price, and it has also been the cause of class action lawsuits.

As I was trying to mitigate a negative air pressure issue, I went with a powered direct vent. I purchased and installed a 50Gal 60K BTU Powerflex Direct (PDVG62-50T60-NV) from Lowe’s (ref.). My dad and son helped with the install and after about 6 hours it was ready to go. I filled it up, went over the checklist and then plugged it in. The gas was lit by the hot surface igniter, then there was what sounded like a rapid sequence of explosions, and then the unit turned off. This was repeated twice more and the unit got stuck in lock out. I called technical support and after talking for a while we decided to have a plumber come out and take a look, but that would be the next day!

We need some air in here!

In a previous posting I mentioned running into an issue with our water heater failing to draft when all the exhaust fans and the clothes dryer were running. I did some research, and through testing I thought the entire issue could be resolved by adding more “make-up air”. Thus I added a 6″ duct into our mechanical room. This was in addition to our existing 5″ duct.

You would think that having ~48 sq. in. of open hole to the outside would mitigate any possible negative pressure, but you would be wrong. The water heater still failed to draft in worst case CAZ testing.
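
For the curious, the ~48 figure is just the combined cross-sectional area of the two ducts, assuming both are round and the 6″ and 5″ are diameters (duct_area_sq_in is a helper name of mine):

```shell
# duct_area_sq_in is a hypothetical helper: total cross-sectional area,
# in square inches, of round ducts given their diameters in inches.
duct_area_sq_in() {
    awk 'BEGIN {
        pi = atan2(0, -1)
        for (i = 1; i < ARGC; i++)
            total += pi * (ARGV[i] / 2) ^ 2   # area = pi * r^2
        printf "%.1f\n", total
    }' "$@"
}

duct_area_sq_in 6 5   # prints 47.9 (28.3 for the 6" plus 19.6 for the 5")
```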

I called a local HVAC contractor to evaluate.  They spent 3 hours trying different things, including adding 3′ to our chimney, but no joy!

The pitfalls of making your home energy efficient

Over Thanksgiving I discovered a torrent of cold air pouring in behind our shower in the basement and tracked it to a large opening in our attic. That same day I went and picked up some additional insulation and proceeded to close the opening. Problem solved, right? Nope. Instead I found out that by sealing this rather large hole I had made our home too tight. The problem is that in worst case testing (all exhaust fans and the clothes dryer running) enough negative pressure was created to prevent our water heater from drafting correctly.

If you have one or more atmospheric vented gas appliances, make sure to test after home improvements!