Using Azure Files with Azure Container Service and Docker volumes

As a continuation of my last post about setting up an Azure Container Service, I thought it might be a good idea to have a look at persistent storage for the containers. Even though I prefer “outsourcing” my storage to external services, having persistent storage can be really useful in some cases.

Docker handles storage using volumes, which are basically just some form of storage that is mounted to a path inside the container. By default, a volume is simply a directory on the host that is mounted into the container. However, this has a couple of drawbacks.

If the container is taken down and brought up on a new host, the data in the volume will be gone. Or rather, it will still be there, but on the old host. So, in the eyes of the container, it’s gone. The only way to make sure that the data follows the container is to set up affinity between the container and the host. But this is a REALLY crappy solution, as it breaks the idea that a container should be able to run on any agent in the cluster.
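Just to illustrate what that workaround would look like: the standalone Swarm scheduler lets you pin a container to a specific node using a constraint filter passed as an environment variable. A minimal sketch, where the node name, the my_data volume and the nginx image are all just made-up examples:

# pin the container to one specific agent (hypothetical node name) so its host-local volume stays put
docker run -d -e constraint:node==swarm-agent-ABC123-0 -v my_data:/data nginx

Which is exactly the kind of coupling between container and host that we want to avoid.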

On top of that, if a host is replaced, the data disappears. And in a cluster where machines should be cattle, and load should be handled by expanding and contracting the cluster, hosts come and go. So tying the storage to a specific host is just not a good idea.

The solution is to map our volumes to something other than the host, and in the case of Azure, that would preferably be Azure Storage. The way this is done is by setting up an Azure File share in a Storage Account and mounting it as a volume over SMB.

But wait…this sounds really complicated! Is this really the only way? Can’t I just click some buttons and have it done for me? Well…no, yes, no. It isn’t that complicated, to be honest. Yes, there are a few steps involved, but they are all pretty basic. And yes, you currently have to set it up manually. Unfortunately there is no button you can click in the Azure portal to get it up and running. Not yet at least.

So, without a button in the Portal, how do we go about doing it?

Step 1 - Creating a Docker Swarm and Storage Account

The first step is to set up a new Docker Swarm in Azure. Luckily this is a piece of cake to do using the Azure Portal, and the Azure Container Service. If you haven’t done that before, I suggest having a look at my previous post. It covers how you set up a swarm, and what Azure actually provisions for you when you do it.

In my case, I created a Docker Swarm with a single master and a single agent. There is no need for a bigger cluster to test this. More nodes just mean that you have to repeat the driver set-up more times.

Besides the Container Service, we will need a storage account. So go ahead and set that up as well while you wait for the ACS to provision all of its resources.

Step 2 - Connect to the Swarm

Once the swarm is up and running, you need to connect to the swarm master using SSH. This is sort of covered in the previous post, but there I showed how to set up a tunnel and have the Docker client work against the master. In this case, I want to connect straight to the master and execute commands on it. To do that, I opened a terminal and executed

ssh -p 2200 zerokoll@chrisacstestmgmt.westeurope.cloudapp.azure.com

This connects to an Azure Container Service called chrisacstest using a user called zerokoll. If your service isn’t called chrisacstest, the command will look like this

ssh -p 2200 [USERNAME]@[CONTAINER SERVICE NAME]mgmt.[AZURE REGION].cloudapp.azure.com

It might also be worth noting that I use port 2200 instead of the SSH default 22. Port 2200 is then forwarded to 22 by the load balancer…

Once connected, you can run

docker info

to see what the current state of the Docker Swarm is.

At least you might think so… However, this doesn’t give you the response you would expect. It contains no information about a swarm…but there is a good reason for this. You are currently connected to the master host, so running docker info gives you information about the Docker set-up on this node. And this machine isn’t part of a swarm.

Wait…what!? What do I mean that the master node isn’t part of a swarm? Well…it is…but it’s not. It runs a Docker container that is configured as a master in the swarm.

So, if you run

docker ps

you can see that the host you are connected to is running a container based on an image called swarm:1.1.0, which is the actual swarm master. That container has port 2375 mapped to the host. So, in the previous post, when we set up the SSH tunnel and bound port 2375 on the host to port 2375 on the local machine, we were actually binding to a port that is in turn bound to port 2375 on the swarm master container. So we were issuing commands to that container, not to the host…
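If you want to see that port mapping for yourself, you can ask Docker for the published ports of the swarm container, something like this:

# show the name and published ports of the container based on the swarm:1.1.0 image
docker ps --filter ancestor=swarm:1.1.0 --format "{{.Names}}: {{.Ports}}"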

To get the info from the actual swarm master, we need to run the following command

docker -H 127.0.0.1:2375 info

This tells the Docker client on the host machine to issue its commands against port 2375 on the local machine, which is then bound to the Docker container that we really want to query. So, this should give us some information about the swarm.

Anyhow…let’s not go further down that rabbit hole! I just thought that I needed to give that information as it might cause some confusion…

The swarm master host isn’t actually the machine that will be running the containers that use Azure File backed volumes, but I’ll go ahead and add the driver here anyway, mainly because it is easier to see what is being done at this level than it is one level down on the agents. So I’ll start out by setting it up here, just to show how it’s done, and then I’ll go on and set it up on the agent node.

Step 3 - Setting up the Azure File Volume Driver

The first step is to get hold of the required volume driver, which is hosted on GitHub, where all the releases are available. At the time of writing, the latest version is 0.5.1, so I’ll be using that… And since we’ll be doing things that require root access, I’ll go ahead and run

sudo -s

This starts an interactive shell running as root.

Note: If you are a Windows user with very little experience with Linux, like me, you’ll notice the change in the prompt from [user]@swarm-master-XXXX-0:~$ to root@swarm-master-XXXX-0:~#

Next, we want to download the driver and place it in /usr/bin. And to do that, I’ll use wget

wget -qO /usr/bin/azurefile-dockervolumedriver https://github.com/Azure/azurefile-dockervolumedriver/releases/download/v0.5.1/azurefile-dockervolumedriver

and since that file needs to be executable, I’ll run chmod to add the execute permission

chmod +x /usr/bin/azurefile-dockervolumedriver

Once the driver is in place, we need to get the upstart init file for it. This is the configuration that upstart uses to start and manage the service.

Note: The master node currently runs Ubuntu 14.04.4, which uses upstart rather than the systemd used by later versions. If you are running a later build, the service set-up is a little different.
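For reference, on a systemd-based node the equivalent steps would look roughly like this. This is only a sketch, assuming a matching unit file called azurefile-dockervolumedriver.service has been placed in /etc/systemd/system:

# reload systemd so it sees the new unit, then enable and start the service
systemctl daemon-reload
systemctl enable azurefile-dockervolumedriver
systemctl start azurefile-dockervolumedriver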

The init config file is located on GitHub as well, so all I have to do is call

wget -qO /etc/init/azurefile-dockervolumedriver.conf https://raw.githubusercontent.com/Azure/azurefile-dockervolumedriver/master/contrib/init/upstart/azurefile-dockervolumedriver.conf

The last thing required is to configure the volume driver. We need to tell it which Storage Account and access key to use. To do this, we need to create a file called azurefile-dockervolumedriver in /etc/default. So I’ll run the following commands

touch /etc/default/azurefile-dockervolumedriver
echo "AF_ACCOUNT_NAME=chrisacsteststorage" >> /etc/default/azurefile-dockervolumedriver
echo "AF_ACCOUNT_KEY=D+rYUTUC14ALS13gxprCsBJMEu0..." >> /etc/default/azurefile-dockervolumedriver

First I use touch to create the file, and then I just use echo to write the values I want in there. As you can see, I set the AF_ACCOUNT_NAME to the name of my storage account, and the AF_ACCOUNT_KEY to the access key for that account.
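If you have the Azure CLI on your own machine, you can also grab the access key from the command line instead of copying it from the portal. A quick sketch using the current az CLI, where the resource group name is a placeholder you need to fill in:

# list the access keys for the storage account (replace the resource group with your own)
az storage account keys list --resource-group [RESOURCE GROUP] --account-name chrisacsteststorage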

Once all of that is in place, I can reload the service configuration to include the new stuff, and then start the Azure File Volume Driver service

initctl reload-configuration
initctl start azurefile-dockervolumedriver

This should output azurefile-dockervolumedriver start/running, process XXX to let you know that everything has worked as expected.
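If it doesn’t, you can check the state of the service, and have a look at the upstart log, which normally ends up under /var/log/upstart:

# check that the upstart job is running, and tail its log if something looks off
initctl status azurefile-dockervolumedriver
tail /var/log/upstart/azurefile-dockervolumedriver.log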

Now that the service is up and running, we can test the driver by creating a container that writes some data to it.

Step 4 - Testing the driver

The easiest way to verify that everything is working as it should is to create a volume and write some data to it. And yes, I’m still on the “wrong” host to try this on. This should be done on the agents, but once again, it is just easier to try it here before we move on to them.

So let’s try running

docker volume create --name my_volume -d azurefile -o share=myshare

to create a new Docker volume called my_volume, using the driver called azurefile, backed by an Azure File share called myshare.
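You can verify that the volume was created with the right driver by listing and inspecting it:

# my_volume should show up with azurefile in the DRIVER column
docker volume ls
docker volume inspect my_volume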

Next I’ll go ahead and start an interactive alpine-based container, with the newly created volume mapped to /data/

docker run -it -v my_volume:/data --rm alpine

and inside that container, I’ll go ahead and run

touch /data/test.txt
echo "Hello World" >> /data/test.txt
exit

If everything went well, you should now be able to open up the Azure Portal, browse to the Storage Account you are using, click Files and see a new file share called myshare. Inside it, you’ll find the file you just created in the container.
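If you prefer the command line to the portal, the Azure CLI should be able to list the contents of the share as well. A sketch using the current az CLI, with the access key as a placeholder:

# list the files in the share straight from the storage account
az storage file list --share-name myshare --account-name chrisacsteststorage --account-key [ACCESS KEY] --output table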

Ok, so that’s kind of cool, but how does that help us? Well…that is persistent storage. So if we were to map that share as a volume on the agents, and then use that volume from the containers in the swarm, we would have a great place to store persistent data. So let’s try that!

Step 5 - Installing the driver on the agent(s)

Now that we know that it works, we need to set up the driver on each one of the agent nodes. And yes, you need to do it on each one. And yes, if you ever add more agents, you need to set it up on those as well…

Note: This process could probably be simplified quite a bit by using some smart scripts, but I only have a single agent, so I’ll just go ahead and do it manually in this case.

The first thing we need to figure out is how to connect to the agent nodes. They aren’t reachable over SSH from the internet. However, they do accept SSH connections from the master node, so that’s what I’ll use. But…to be able to do that, the master needs to have the private SSH key. So I’ll start out by disconnecting from the master by running

exit
exit

The first one exits out of the root shell, and the second exits from the SSH connection.

Next I’ll run

scp C:\Users\chris\.ssh\id_rsa zerokoll@chrisacstestmgmt.westeurope.cloudapp.azure.com:~/.ssh/id_rsa

This uses secure copy to copy my private key, id_rsa, from my local machine to the master node, placing it in the ~/.ssh/ directory. I can then use that key when connecting from the master to the agents.
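As a side note, if you would rather not copy your private key to the master at all, SSH agent forwarding should work as an alternative, assuming the key is loaded into an ssh-agent on your local machine:

# forward the local ssh-agent to the master, so the key never has to leave your machine
ssh -A -p 2200 zerokoll@chrisacstestmgmt.westeurope.cloudapp.azure.com

With the agent forwarded, the ssh and scp commands towards the agents further down should be able to authenticate without an id_rsa file on the master.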

Next, I re-open the SSH connection to the master node

ssh -p 2200 zerokoll@chrisacstestmgmt.westeurope.cloudapp.azure.com

and once I’m connected, I need to set the permissions on the id_rsa file by calling

chmod 600 ~/.ssh/id_rsa

This restricts the permissions so that only the owner of the file has read and write access, which is what SSH requires for private key files.

With that in place, I need to figure out where my agents are located. This can be done by calling

docker -H 127.0.0.1:2375 info | grep -oP '(?:[0-9]{1,3}\.){3}[0-9]{1,3}'

This asks the swarm master container for its information, and then uses grep to pull out the IP addresses with a regex. In my case, I get a single address, 10.0.0.5.

Now that I know where my agent is, I can start setting up the Azure File Volume Driver on it. So I’ll go ahead and run

scp /usr/bin/azurefile-dockervolumedriver zerokoll@10.0.0.5:~/
ssh zerokoll@10.0.0.5 sudo mv azurefile-dockervolumedriver /usr/bin/
scp /etc/default/azurefile-dockervolumedriver /etc/init/azurefile-dockervolumedriver.conf zerokoll@10.0.0.5:~/

to copy all the required files to the agent node. Unfortunately two of the files share the same name, so I start out by copying the driver and moving it to its final location before copying the rest of the files. I can then connect to the agent and run the following commands

ssh zerokoll@10.0.0.5
sudo -s
mv azurefile-dockervolumedriver.conf /etc/init/
mv azurefile-dockervolumedriver /etc/default/
chmod +x /usr/bin/azurefile-dockervolumedriver
initctl reload-configuration
initctl start azurefile-dockervolumedriver
exit

As you can see, I start out by connecting to the agent node using SSH. Then I elevate my privileges to root before moving the config and init files to their final locations and setting the execute permission on the driver executable. Finally, I reload the configuration and start the azurefile-dockervolumedriver service before exiting the elevated shell.
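As noted earlier, if you have more than one agent, the whole thing could be wrapped up in a small script run from the master. This is just a rough sketch, assuming the files are already in place on the master as above, that you have collected the agent IPs, and that sudo works without a password prompt on the agents (as it seems to do above):

# copy the driver, config and init files to each agent, then install and start the service there
for ip in 10.0.0.5; do    # add all your agent IPs here
  scp /usr/bin/azurefile-dockervolumedriver zerokoll@$ip:/tmp/
  scp /etc/default/azurefile-dockervolumedriver zerokoll@$ip:/tmp/azurefile-default
  scp /etc/init/azurefile-dockervolumedriver.conf zerokoll@$ip:/tmp/
  ssh zerokoll@$ip sudo mv /tmp/azurefile-dockervolumedriver /usr/bin/
  ssh zerokoll@$ip sudo mv /tmp/azurefile-default /etc/default/azurefile-dockervolumedriver
  ssh zerokoll@$ip sudo mv /tmp/azurefile-dockervolumedriver.conf /etc/init/
  ssh zerokoll@$ip sudo chmod +x /usr/bin/azurefile-dockervolumedriver
  ssh zerokoll@$ip sudo initctl reload-configuration
  ssh zerokoll@$ip sudo initctl start azurefile-dockervolumedriver
done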

Now that the driver is up and running, we can verify that the driver works by running

docker volume create --name my_volume -d azurefile -o share=myshare

to create a volume connected to the Azure File we used earlier, and then

docker run -it -v my_volume:/data --rm alpine

to start a new interactive alpine-based container. Once inside the container, we can run

cat /data/test.txt

If everything works as it should, this should output “Hello World”, the content we put into that file when we tried out the driver from the master host.

From here, you should now be able to deploy containers to the cluster, using the Azure File Volume driver to connect the volumes to persistent storage. Just remember to repeat the process for all the agents if you have more than one. Otherwise you will run into issues when a container ends up on a host that doesn’t have the Azure File Volume driver installed and configured.
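As a final sanity check, you can schedule a container through the swarm endpoint itself and hand it the Azure File backed volume. From the master (or through the tunnel from the previous post) that could look roughly like this; nginx is just a stand-in for whatever image you actually want to run, and the --volume-driver flag makes sure the volume is created with the azurefile driver on whichever agent the container lands on:

# schedule a container through the swarm manager, mounting the Azure File backed volume
docker -H 127.0.0.1:2375 run -d --volume-driver azurefile -v my_volume:/data --name my_app nginx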

That’s pretty much it for this time! The whole set-up is a bit convoluted, and could probably be simplified using some smart scripts, but as a demo of how it all fits together, I think it does the job…

zerokoll

Chris

Developer-Badass-as-a-Service at your service