5 December 2022

Running a Windows container with gMSA on a non-domain joined host

Using gMSA to let Windows containers run under a domain context can be really useful.

However, it can also be a bit complicated to set up, especially if you intend to run the container on a non-domain joined host.

Having just set this up, I thought it might be worth a blog post, as the documentation for it is not the most amazing to be honest. And the people doing it seem to be few, so there is not a lot of help to get on the old interwebs.

But first…what is gMSA, and what does it do?

Introduction to gMSA

gMSA stands for group Managed Service Account. It is a Windows feature that allows a domain admin to create a non-personal domain account that can provide a security context for non-interactive services. On top of that, gMSAs provide automatic password management, which makes them more secure than having to manage the credentials for these services manually.

Basically, this allows a service (or container) to have an identity that is completely managed by the system after it has been set up. This makes for a secure way to manage the authentication, with very little load on the administrator.

Note: I am not a server admin, I’m a developer. So, if you want to know more about gMSA, I suggest that you read the docs instead of listening to me, to be honest.

You can read more about it here: Group Managed Service Accounts Overview

In my case, I needed this feature so that my Windows containers could run under a gMSA account and access domain resources.

Setting up gMSA

Once again, I’m not an admin guy, so I know very little about this part, unfortunately. However, setting up the actual gMSA account seems to be fairly well documented here: Getting Started with Group Managed Service Accounts.

And since I can’t tell you anything more useful than that link, I will just move on to the container part of setting it up.

Running containers in a gMSA context

Once you have a gMSA account set up, you need to tell Docker that you want to run your container under this context. This in itself is fairly easy to do.

Docker has a parameter called --security-opt, which can be provided when executing docker run. It allows you to, among other things, pass a path to something called a “credential spec”.

A “credential spec” is a JSON file that contains the information required to set up the gMSA context for the container. It can be generated by running a PowerShell command called New-CredentialSpec, at least if you are running on a domain joined machine.

The file comes out looking something like this

{
  "CmsPlugins": [ "ActiveDirectory" ],
  "DomainJoinConfig": {
    "Sid": "S-1-5-21-XXXXXXXXXX-XXXXXXXXXX-XXXXXXXXXX",
    "MachineAccountName": "<gMSA ACCOUNT NAME>",
    "Guid": "<DOMAIN GUID>",
    "DnsTreeName": "<DOMAIN DNS NAME>",
    "DnsName": "<DOMAIN DNS NAME>",
    "NetBiosName": "<DOMAIN NETBIOS NAME>"
  },
  "ActiveDirectoryConfig": {
    "GroupManagedServiceAccounts": [
      {
        "Name": "<gMSA ACCOUNT NAME>",
        "Scope": "<DOMAIN DNS NAME>"
      },
      {
        "Name": "<gMSA ACCOUNT NAME>",
        "Scope": "<DOMAIN NETBIOS NAME>"
      }
    ]
  }
}

When you spin up the container, Docker will use this file to figure out how to retrieve the credentials for the principal under which the container should run.

Note: This requires the host to be domain joined, and have permission to retrieve the credentials. I’ll get back to what you need to do if you are running a non-domain joined host in a little while.
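If you can’t run New-CredentialSpec and end up writing the file by hand, a small script can help keep the repeated values consistent. Below is a minimal Python sketch that builds the same structure as the sample above. The function name build_credential_spec and all the sample values are made up for illustration, and DnsTreeName is simply set to the same value as DnsName, as in the sample. The real SID and GUID have to come from your own domain.

```python
import json


def build_credential_spec(sid, account_name, domain_guid, dns_name, netbios_name):
    """Build a credential spec dict with the same shape as the sample above."""
    return {
        "CmsPlugins": ["ActiveDirectory"],
        "DomainJoinConfig": {
            "Sid": sid,
            "MachineAccountName": account_name,
            "Guid": domain_guid,
            # Assumption: same value as DnsName, as in the sample file
            "DnsTreeName": dns_name,
            "DnsName": dns_name,
            "NetBiosName": netbios_name,
        },
        "ActiveDirectoryConfig": {
            "GroupManagedServiceAccounts": [
                {"Name": account_name, "Scope": dns_name},
                {"Name": account_name, "Scope": netbios_name},
            ]
        },
    }


# Placeholder values only; replace them with the values for your domain
spec = build_credential_spec(
    sid="S-1-5-21-0000000000-0000000000-0000000000",
    account_name="webapp01",
    domain_guid="00000000-0000-0000-0000-000000000000",
    dns_name="contoso.local",
    netbios_name="CONTOSO",
)
print(json.dumps(spec, indent=2))
```

The printed JSON can then be saved as the credential spec file.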

The --security-opt parameter takes a path to a credentials file. However, it is definitely worth noting that this path is relative to a CredentialSpecs directory in the Docker root folder. By default, the path to that folder is C:\ProgramData\docker\CredentialSpecs.

This means that if your file is called my_cred_spec_file.json, it should be located at C:\ProgramData\docker\CredentialSpecs\my_cred_spec_file.json, and the Docker command should look like this

> docker run --security-opt "credentialspec=file://my_cred_spec_file.json" -it mcr.microsoft.com/windows/servercore:ltsc2022 powershell

Once the container is up and running, you should be able to verify that it runs in a domain context by running

> nltest /sc_verify:<DOMAIN NAME>

> nltest /query

Both of these commands should return proper responses if everything is working as it should…

As there is obviously network communication going on here, you need to make sure that the following ports are open from the host to the domain controller:

Protocol	Port	Purpose
TCP/UDP	53	DNS
TCP/UDP	88	Kerberos
TCP	135	RPC
TCP	139	NetBIOS Session Service
TCP/UDP	389	LDAP
TCP	445	SMB (NET LOGON)
TCP	636	LDAP SSL
TCP	1024-5000	Dynamic Ports
TCP	9389	AD Web Services
TCP	49152-65535	Dynamic Ports
ICMP	-	ICMP

Not all of these ports seem completely well documented on the interwebs. However, through some trial and error, these seem to be the ones needed.
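Since a missing port is hard to diagnose, it can be worth probing the TCP ports from the host before starting any containers. The following is a rough Python sketch, not something from the original setup. The names DC_TCP_PORTS, tcp_port_open and check_ports are hypothetical, the dynamic port ranges are left out, and UDP and ICMP are not covered at all.

```python
import socket

# Fixed TCP ports from the table above (dynamic ranges left out)
DC_TCP_PORTS = [53, 88, 135, 139, 389, 445, 636, 9389]


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures
        return False


def check_ports(host, ports=DC_TCP_PORTS):
    """Map each port to whether it accepted a TCP connection."""
    return {port: tcp_port_open(host, port) for port in ports}
```

Calling check_ports with the name or IP address of your domain controller returns a dictionary of port numbers and True/False values. A True only says that something accepted the connection; it does not prove that the service behind the port is working.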

Note: Interestingly, if some of the ports are missing, the VM will actually crash and reboot instead of giving you an error…

Ok…so this is all fine and dandy! And it definitely doesn’t look that hard at all. And…it is pretty well documented to be honest. However, if you want to do this without a domain joined host, it becomes a bit more complicated. And…a lot less well documented…

Running containers in a gMSA context with a non-domain joined host

The problem with running this on a non-domain joined host is that the host cannot retrieve the gMSA credentials for the container, as this requires access to the domain. To fix this, Windows has a feature called Container Credential Guard (CCG). This is a service that is built specifically to retrieve the gMSA credentials for your containers. However, out of the box, it doesn’t actually do that much…

The CCG knows how to retrieve the gMSA credentials from the domain. And it knows how to use them to set up a gMSA context for a container. But…it doesn’t know how to retrieve the domain credentials needed to retrieve the gMSA credentials.

It obviously can’t use the host, as it has no knowledge of the domain. So instead, it relies on a COM+-based plug-in architecture that allows developers to write plug-ins that can retrieve the credentials to use when communicating with the AD.

This is actually quite smart, as it allows companies to store the credentials to access the AD in whatever store they want, and still rely on the CCG to make everything else work. However, on the other hand, it also requires us developers to build integrations. And that doesn’t seem to be the most common thing to do…

For me, since the containers were going to run on VMs in Azure, the choice of credential store was pretty obvious…Azure Key Vault! Not only does Key Vault store the credentials safely, it also allows me to use a managed identity to access it.

Trying to get a CCG plug-in for Key Vault

And what do you do when you need a CCG plug-in for Azure Key Vault? Well, you google “ccg plugin key vault”. And lo and behold, I found a repo called Azure-Key-Vault-Plugin-gMSA. And on the box, it looked exactly like what I needed. Not to mention that it was an official Microsoft repo. Bingo! I’m using that!

Unfortunately, there are no releases for the project. So there was no little DLL that I could simply download and use. Instead, I had to clone the project and try to build it.

The problem with that, is that I’m not a C++ developer. So, trying to get it compiled was a bit complicated to be honest. But that wasn’t just because I’m not a C++ developer. It was mostly because it uses an outdated vcpkg version that causes the build to fail… And since I don’t know the C++ space, I ended up abandoning the idea of building it myself.

Note: If I had looked at the GitHub issues, I might have noticed that there is an issue with the title “vcpkg version 2021.05.12 in restore.cmd out of date”, which might actually explain how to get it working. But I didn’t…

However, I figured out that this feature was actually used by Azure Kubernetes Service (AKS). So, I spun up an AKS cluster, added a Windows node to it, ran some weird PowerShell commands to enable gMSA, and then downloaded the DLL from that node. But once again, I fell short.

I’m not quite sure why, but I simply couldn’t get that plug-in to work for some reason. And the debugging experience is horrible. Yes, there are some entries in the event log, but they aren’t very helpful to be honest.

I think it might have been some encoding issue when talking to Key Vault, but I’m not sure. Anyhow, I couldn’t get it to work…so, I gave up on that idea as well.

Last resort, building my own CCG plug-in

However, while scouring the web for any clues about what I was doing wrong (and there are very few of those available on the interwebs, unfortunately), I found a GitHub project called macsux/gmsa-ccg-plugin. It is basically a proof of concept for creating a C#-based CCG plug-in.

So, with that code as a baseline, I decided to try and build my own plug-in. This would allow me to write it in a language I know, and also let me tweak it and play around with it until it worked.

Note: I’m normally not a fan of building things like this myself. It is not my core competence, and I definitely don’t need more code to maintain. However, in this case, I think it made sense. Since I couldn’t get the pre-compiled plug-in to work, or the unmaintained C++ version to build, I didn’t have much choice. At least this would hopefully give me something that worked…

Building the plug-in is actually fairly simple. The only problem is the COM+ interop stuff. But luckily, the GitHub repo I had found, had already sorted most of that out. The only thing I needed to do, was to implement a method called GetPasswordCredentials. It’s the method responsible for retrieving the credentials that the CCG should use while retrieving the gMSA credentials. The method takes one input parameter, and 3 output parameters. Like this

public void GetPasswordCredentials(
            [MarshalAs(UnmanagedType.LPWStr), In] string pluginInput,
            [MarshalAs(UnmanagedType.LPWStr)] out string domainName,
            [MarshalAs(UnmanagedType.LPWStr)] out string username,
            [MarshalAs(UnmanagedType.LPWStr)] out string password)
{
  // Implementation
}

The input is a simple string that allows us to “configure” the plug-in.

In my case, as I wanted to retrieve the credentials from Azure Key Vault using a managed identity, the input would have to contain the name of the Key Vault, the client ID of the managed identity to use, and the name of the Key Vault Secret that contained the credentials to retrieve. So, I decided to format the input as a semicolon-separated list of entries, like this: keyVaultName=<KEY VAULT NAME>;clientId=<CLIENT ID>;keyVaultSecret=<SECRET NAME>[;logFile=<PATH TO LOG FILE>]

Note: I added a logFile entry to enable logging to a file for debugging purposes. If it is left out, no logging happens. But if you add in a path to a file, it will log pretty much everything that happens to that file. And yes…it appends to the file, so don’t leave logging on…

The first part of the implementation is just parsing this input. It looks like this

public Config ParseInput(string input)
{
    // Split the input into "key=value" entries, with upper-cased keys for case-insensitive lookups.
    // Each entry is split on the first '=' only, so a value may itself contain '='.
    var entries = input.Split(';')
                       .Select(entry => entry.Split(new[] { '=' }, 2))
                       .ToDictionary(parts => parts[0].ToUpperInvariant(), parts => parts[1]);

    // Three required settings plus the optional logFile
    if (entries.Count > 4)
    {
        throw new Exception("Invalid configuration");
    }

    // Required settings throw immediately if they are missing; logFile falls back to null
    var config = new Config
    {
        KeyVaultName = entries.ContainsKey("KEYVAULTNAME") ? entries["KEYVAULTNAME"] : throw new Exception("Missing keyVaultName config"),
        KeyVaultSecretName = entries.ContainsKey("KEYVAULTSECRET") ? entries["KEYVAULTSECRET"] : throw new Exception("Missing keyVaultSecret config"),
        ClientId = entries.ContainsKey("CLIENTID") ? entries["CLIENTID"] : throw new Exception("Missing clientId config"),
        LogFile = entries.ContainsKey("LOGFILE") ? entries["LOGFILE"] : null
    };

    return config;
}

I wanted to make sure that any missing parameter threw an exception right at the beginning, instead of later on when the config was being used. That’s why there are so many conditionals in the code above.

Next, I needed to get an access token that would allow me to access the Key Vault. This is actually fairly easy when you use managed identities. As soon as you add a managed identity to your Azure service, you can simply retrieve an access token by making an HTTP call to the IP address 169.254.169.254.

In this case, the call looks like this

GET /metadata/identity/oauth2/token?api-version=2018-02-01&resource=https://vault.azure.net&client_id=<CLIENT ID> HTTP/1.1
Host: 169.254.169.254
Metadata: true

This will return an access token for the defined client ID and resource (in this case https://vault.azure.net).

Note: This assumes that there is a user-assigned managed identity with the provided client ID for the current VM.

Comment: And don’t miss the Metadata header in the call. Otherwise, it won’t work…
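Before involving the CCG at all, it can be useful to check from the VM that the managed identity can actually get a token. The following is a minimal Python sketch of the same call, meant as a standalone check script and not as part of the plug-in. The function names are hypothetical, and get_access_token only works when run inside an Azure VM that has the identity assigned.

```python
import json
import urllib.parse
import urllib.request

IMDS_TOKEN_ENDPOINT = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_token_url(client_id, resource="https://vault.azure.net"):
    """Build the token URL with the same query parameters as the request above."""
    query = urllib.parse.urlencode(
        {"api-version": "2018-02-01", "resource": resource, "client_id": client_id}
    )
    return f"{IMDS_TOKEN_ENDPOINT}?{query}"


def parse_access_token(response_body):
    """Pull the access_token property out of the JSON response."""
    return json.loads(response_body)["access_token"]


def get_access_token(client_id):
    """Request a token; this only works from inside an Azure VM."""
    request = urllib.request.Request(
        build_token_url(client_id),
        headers={"Metadata": "true"},  # the call is rejected without this header
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return parse_access_token(response.read().decode("utf-8"))
```

If get_access_token raises an HTTP error, the identity or its client ID is the first thing to check, long before looking at the plug-in itself.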

However, as I couldn’t get my plug-in to load 3rd party assemblies for some reason, I didn’t have the luxury of using JSON.NET. So, I had to resort to some fairly crude, but working, string manipulation to get the access token.

Comment: I’m not sure why the assembly loading didn’t work. I assume it has to do with it being a COM component, and I haven’t done COM interop for the last 15-20 years. But the JSON parsing I needed was very simple, so I didn’t spend too much time looking into it either…

Lacking JSON.NET, I ended up with this

response = httpClient.GetStringAsync(tokenEndpointUri).Result;
var responseValues = response.Trim('{').TrimEnd('}').Split(',');
var tokenValues = responseValues[0].Split(':');
return tokenValues[1].Trim().Trim('"');

Remove the surrounding curly braces, and split it on comma. Then split the first value on :, and use the second value, after trimming off whitespace and ". It’s ugly, but it works!

Once I had the access token, it was just a matter of getting the secret from the Key Vault by doing another HTTP request. In this case

GET /secrets/<SECRET NAME>?api-version=7.3 HTTP/1.1
Host: <KEY VAULT NAME>.vault.azure.net
Authorization: Bearer <TOKEN FROM PREVIOUS CALL>

And once again, a bit of string manipulation gave me what I wanted.

var responseValues = response.Trim('{').TrimEnd('}').Split(',');
var secretValueString = responseValues.First(x => x.Trim().StartsWith("\"value\":"));
var secret = secretValueString.Substring(secretValueString.IndexOf(":") + 1).Trim().Trim('"').Replace("\\\\", "\\");

It might be worth noting the Replace("\\\\", "\\"). This is needed as the JSON that comes back from Key Vault has \ escaped as \\. So, the double \ needs to be turned into a single one. And with C# escaping that character as well, the code ends up looking a bit funny…
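One thing to watch out for with this approach: because the response is split on commas, a secret whose value contains a comma will be cut off at that comma. The Python sketch below mirrors the string manipulation from the C# snippet above and compares it with a real JSON parser. The sample response is made up and trimmed down to two properties, and both function names are hypothetical.

```python
import json


def naive_extract(response):
    """Mirror the string manipulation from the C# snippet above."""
    parts = response.strip("{").rstrip("}").split(",")
    value_part = next(p for p in parts if p.strip().startswith('"value":'))
    raw = value_part[value_part.index(":") + 1:]
    return raw.strip().strip('"').replace("\\\\", "\\")


def json_extract(response):
    """Let a JSON parser handle commas and escape sequences."""
    return json.loads(response)["value"]


# Made-up response where the password part contains a comma
sample = r'{"value":"CONTOSO\\svc-ccg:pa,ss","id":"https://example.vault.azure.net/secrets/ccg"}'

print(naive_extract(sample))  # CONTOSO\svc-ccg:pa
print(json_extract(sample))   # CONTOSO\svc-ccg:pa,ss
```

For secrets without commas or escaped quotes, the two approaches return the same value, so the simple version is fine as long as the password is chosen with that limitation in mind.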

The contents of the secret should conform to the following format: <DOMAIN>\<USERNAME>:<PASSWORD>. So, a “simple” bit of string splitting is all it takes to retrieve the values as we need them

var separatorIndex = secret.IndexOf(':');
var usernameParts = secret.Substring(0, separatorIndex).Split('\\');

domainName = usernameParts[0];
username = usernameParts[1];
password = secret.Substring(separatorIndex + 1);
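To make the expected secret format a bit more concrete, here is the same split as a small Python sketch with a worked example. The function name parse_secret and the sample account are hypothetical, and the validation is an addition that the C# code above does not have.

```python
def parse_secret(secret):
    """Split '<DOMAIN>\\<USERNAME>:<PASSWORD>' into its three parts."""
    # Split on the first colon only, so the password may contain colons
    account, separator, password = secret.partition(":")
    domain, backslash, username = account.partition("\\")
    if not separator or not backslash:
        raise ValueError(r"Expected <DOMAIN>\<USERNAME>:<PASSWORD>")
    return domain, username, password


print(parse_secret(r"CONTOSO\svc-ccg:p@ss:word"))
# ('CONTOSO', 'svc-ccg', 'p@ss:word')
```

Since only the first colon is used as the separator, a colon in the password is harmless, while a colon in the domain or username would break the parsing.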

The full code is available on GitHub - FiftyNine.CCG.KeyVault!

Registering the plug-in

The next step is to register the assembly as a COM+ component, and as a CCG plug-in.

The first part is fairly simple. You just need to copy the DLL-file to a directory on the server, and then use regsvcs.exe to register it.

regsvcs.exe FiftyNine.CCG.KeyVault.dll

Note: Remember that registering a COM component with regsvcs.exe requires the assembly to be strong named. Because of this, the repo contains a signing key that is used to strong name the assembly during builds.

However, there is a little caveat. When you register an assembly as a COM component, it defaults to using an interactive user account. This means that the component will only be able to initialize if there is a user logged into the machine. And in my case, as it was going to be an automated build server, this wasn’t going to work. Instead, the COM registration needs to be updated to use for example the built-in NETWORK SERVICE account instead.

You can do this by using the Component Services MMC snap-in. Just navigate to the COM+ Applications folder. Right-click on the plug-in assembly, and select Properties. In the Identity tab, select the account you wish to use, and press OK to save the changes.

Or, you can simply use a bit of PowerShell to accomplish the same thing

$comAdmin = New-Object -comobject COMAdmin.COMAdminCatalog
$apps = $comAdmin.GetCollection("Applications")
$apps.Populate()
$app = $apps | Where-Object {$_.Name -eq "FiftyNine.CCG.KeyVault"}
$app.Value("Identity") = "NT AUTHORITY\NetworkService"
$apps.SaveChanges()

Unfortunately, it isn’t enough to just register the assembly as a COM component. You also need to register it as a CCG plug-in, which is done by adding a couple of keys to the registry. The catch is that the parent keys are set up in a way that makes it impossible for us to edit them straight away. Because of this, we first need to update the ownership, then add the required keys, and finally remember to reset the ownership when we are done.

To simplify all of this, the FiftyNine.CCG.KeyVault repo contains an install-plugin.ps1 script that does everything from copying the DLL and registering it as a COM component, to adding the required registry entries. Feel free to look through it if you are curious about what it does.

Note: It might be worth mentioning that the PowerShell script is just a slightly modified version of the one used by the Azure-Key-Vault-Plugin-gMSA project.

Configuring the use of the CCG plug-in

Once the plug-in has been properly registered, and all the required Azure resources have been set up, it is just a matter of telling the CCG that you want to use this plug-in when setting up the gMSA context. This is done by adding another JSON block, called HostAccountConfig, to the credential spec file.

{
  ...
  "ActiveDirectoryConfig": {
    ...,
    "HostAccountConfig": {
      "PluginGUID": "{f919de1a-efc4-4902-b7e5-56a314a87262}",
      "PluginInput": "keyVaultName=<KEY VAULT NAME>;clientId=<CLIENT ID>;keyVaultSecret=<SECRET NAME>[;logFile=<PATH TO LOG FILE>]",
      "PortableCcgVersion": "1"
    }
  }
}

As you can see, it is just a matter of adding the extra HostAccountConfig entry to the ActiveDirectoryConfig, and defining 3 properties.

The first property is the GUID of the plug-in to use, which is actually defined in the C# code like this

[Guid("f919de1a-efc4-4902-b7e5-56a314a87262")]
[ProgId("CcgCredentialsProviderProvider")]
public class CcgCredentialsProviderProvider: ServicedComponent, ICcgDomainAuthCredentials
{
  ...
}

Next up, is the input to the plug-in. This obviously differs depending on what the plug-in expects, but for this plug-in, it should look like it does in the sample above.

And finally, you need to set the PortableCcgVersion to 1.
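If you already have a credential spec file and just want to add the extra block, that can be scripted as well. The following is a minimal Python sketch of that step. The function name add_host_account_config and the sample values in the plug-in input are made up for illustration; only the three property names and the GUID come from the sample above.

```python
import json


def add_host_account_config(spec, plugin_guid, plugin_input):
    """Return a copy of the credential spec with HostAccountConfig added."""
    updated = json.loads(json.dumps(spec))  # cheap deep copy of JSON data
    updated.setdefault("ActiveDirectoryConfig", {})["HostAccountConfig"] = {
        "PluginGUID": plugin_guid,
        "PluginInput": plugin_input,
        "PortableCcgVersion": "1",
    }
    return updated


# Placeholder spec and plug-in input, for illustration only
spec = {"ActiveDirectoryConfig": {"GroupManagedServiceAccounts": []}}
updated = add_host_account_config(
    spec,
    plugin_guid="{f919de1a-efc4-4902-b7e5-56a314a87262}",
    plugin_input="keyVaultName=my-vault;clientId=00000000-0000-0000-0000-000000000000;keyVaultSecret=ccg-credentials",
)
print(json.dumps(updated, indent=2))
```

The original spec is left untouched, which makes it easy to keep one file for domain joined hosts and generate a second one for the non-domain joined ones.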

That’s it! Once that is in place, you should be able to run

> docker run --security-opt "credentialspec=file://my_cred_spec_file.json" -it mcr.microsoft.com/windows/servercore:ltsc2022 powershell

And if you have any problems, you can just add the logFile parameter to the PluginInput to get some logs that allow you to see what is happening.

Conclusion

This feature is obviously very specific, and won’t be needed by everyone and their uncle. But for those of us who need it, it is extremely useful. And with there being very little documentation about it, I hope that this post can be useful for someone at some point. If nothing else, it will work as a reminder to myself about how everything works when I need to update something in 6 months’ time.

If you have any questions, feel free to reach out! I’m available @ZeroKoll as usual!

And the source code for the plug-in is available on GitHub - FiftyNine.CCG.KeyVault!

Finally, I want to give a big shout out to Andrew Stakhov (@andrewstakhov - stakhov.pro) for the PoC code that got me through this whole thing! It had everything I needed to get going, and some good comments about what he had tried in the documentation as well. So, thank you so much Andrew for letting me use the code as the base for my plug-in!