December 20, 2021

Accessing the macOS GUI in Automation Contexts

Although macOS always comes with a GUI, it is not always (fully) accessible. This can especially trip you up when trying to get any automation to work that needs to access the GUI, like automated GUI tests in CI pipelines.

People often try to apply their Linux or BSD knowledge to macOS, thinking that macOS is a BSD derivative after all. Some even try to use Xvfb. Please forget about all that. The window server of macOS is based on Quartz and does not use the X Window System. Furthermore, Mach bootstrap namespaces restrict the capabilities of a process in addition to the classical UID/GID mechanism.

This article is based on macOS 12.0.1. All techniques presented here should work with Mac OS X 10.3 and newer.

A Little Background On Mach Bootstrap Namespaces

As it turns out, those Mach bootstrap namespaces are kind of important when it comes to GUI access. Unfortunately, information about Mach bootstrap namespaces seems to be scarce and outdated. The best resource I could find is Technical Note TN2083, which dates back to Mac OS X 10.5, released in 2007. The following is a summary of the relevant parts. I tried my best to verify and update it.

As already mentioned, Mach bootstrap namespaces are a layer separate from UID/GID that can further curtail the capabilities of a process on macOS. There are three levels of bootstrap namespaces:

  • the (single) system bootstrap namespace
  • a bootstrap namespace for each user
  • a number of per-session bootstrap namespaces of each user.

Their hierarchy looks like this:

System bootstrap namespace
|
+----- Per-user bootstrap namespace
       |
       +----- Per-session bootstrap namespace

Each namespace can “see” and interact with the upper namespaces but not with adjacent or lower namespaces. Interaction with lower or adjacent namespaces is only possible if they advertise their services to upper namespaces.
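To make the visibility rule more tangible, here is a toy model in Python. It is only a sketch of the rule as summarized above, not of how launchd implements it. The class Namespace, the function can_see, and the flag advertises_to_parent are names made up for this illustration, and the model assumes that an advertised service becomes reachable from the parent namespace and everything below that parent.

```python
class Namespace:
    """Toy stand-in for a Mach bootstrap namespace (illustration only)."""

    def __init__(self, name, parent=None, advertises_to_parent=False):
        self.name = name
        self.parent = parent
        # Assumption of this model: advertising makes the namespace's
        # services reachable from its parent namespace.
        self.advertises_to_parent = advertises_to_parent

    def chain(self):
        """Yield this namespace and all namespaces above it."""
        node = self
        while node is not None:
            yield node
            node = node.parent


def can_see(source, target):
    """Return True if a process in `source` can use services in `target`."""
    reachable = list(source.chain())
    # A namespace sees itself and everything above it.
    if any(ns is target for ns in reachable):
        return True
    # Lower or adjacent namespaces are only visible if they advertise upward.
    if target.advertises_to_parent:
        return any(ns is target.parent for ns in reachable)
    return False


system = Namespace("system")
jane_user = Namespace("user/jane", parent=system)
joe_user = Namespace("user/joe", parent=system)
jane_gui = Namespace("gui/jane", parent=jane_user)
jane_gui_advertised = Namespace("gui/jane", parent=jane_user,
                                advertises_to_parent=True)

print(can_see(jane_gui, system))               # upward: visible
print(can_see(jane_user, jane_gui))            # downward: hidden
print(can_see(jane_user, jane_gui_advertised)) # downward, advertised: visible
print(can_see(joe_user, jane_gui_advertised))  # other user: hidden
```

In this model, a different user never reaches jane's session namespace, which mirrors why UID/GID tools like sudo do not help here.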

There is a single system bootstrap namespace that is created during the boot process. It lives until the machine is rebooted or shut off.

A per-user bootstrap namespace is created for each user as soon as the respective user logs in, either through SSH or by using the GUI. It exists until the user logs out. If the user is logged in both through SSH and by using the GUI, a single per-user bootstrap namespace exists until the user has logged out both on the GUI and on the SSH terminal. There are exceptions, but those are irrelevant for this use case.

Then there are the per-session bootstrap namespaces. macOS distinguishes two types of per-session bootstrap namespaces: GUI per-session bootstrap namespaces and non-GUI per-session bootstrap namespaces. GUI per-session bootstrap namespaces are created by loginwindow. They are also called Aqua sessions. Only Aqua sessions allow unfettered access to the GUI. Non-GUI per-session bootstrap namespaces are created by launchd if the daemon or agent has the SessionCreate property set.

When you start a program in a specific namespace, that program is bound to the same namespace. You cannot switch to a different namespace using sudo because sudo is operating on UID/GID and not on the Mach layer. As mentioned before, it is possible to do things in another namespace if a service from that namespace advertises itself to other namespaces. Launch Services are such an example. They allow the command line utility open to start programs in another namespace, which is incredibly useful, as we will see.

Getting Session and Namespace Information

There are a few tools that help you determine what kind of sessions and namespaces are available and whether you have access to a GUI or not.

The most comprehensive information can be gleaned from launchctl print. launchctl print system prints all the information about the system bootstrap namespace. launchctl print user/501 does the same for the per-user bootstrap namespace of UID 501, including the type of session (session = Background), the security context, and available subdomains. If there is a GUI, it will display an entry like:

subdomains = {
	gui/501
}

You can get additional information about that GUI session with launchctl print gui/501. It will also print session = Aqua.
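If you want to check this from a script, the relevant fields can be picked out of the text with a few lines of Python. The following is a minimal sketch: parse_domain_info is a hypothetical helper, and it assumes the output contains lines of the form session = ... and a subdomains = { ... } block as shown above. The output of launchctl print is meant for humans and may change between macOS releases, so treat this as a diagnostic aid rather than a stable interface.

```python
import re


def parse_domain_info(text):
    """Extract the session type and subdomains from `launchctl print` output.

    Returns (session, subdomains); session is None if no such line exists.
    """
    session = None
    subdomains = []
    in_subdomains = False
    for raw in text.splitlines():
        line = raw.strip()
        if in_subdomains:
            if line == "}":
                in_subdomains = False
            elif line:
                subdomains.append(line)
            continue
        match = re.fullmatch(r"session = (\S+)", line)
        if match:
            session = match.group(1)
        elif line == "subdomains = {":
            in_subdomains = True
    return session, subdomains


# Abbreviated sample in the format described above (not a full dump).
sample = """\
session = Background
subdomains = {
\tgui/501
}
"""

session, subdomains = parse_domain_info(sample)
print(session, subdomains)
```

On a Mac, the text could come from subprocess.run(["launchctl", "print", "user/501"], capture_output=True, text=True).stdout.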

Probably more helpful for day-to-day use, debugging and querying the capabilities available to a running program are SessionGetInfo from the Security framework and CGSessionCopyCurrentDictionary() from Quartz Display Services. They are the tools recommended by Apple to get session information.

To query the Security framework, save the following code as sessioninfo.c and compile with clang -framework Security -o sessioninfo sessioninfo.c:

#include <stdio.h>
#include <Security/Security.h>

int main(void) {
	OSStatus error;
	SecuritySessionId mySession;
	SessionAttributeBits sessionInfo;

	/* Query the security session the calling process belongs to. */
	error = SessionGetInfo(callerSecuritySession, &mySession, &sessionInfo);
	if (error != noErr) {
		printf("Could not get session info\n");
		return 1;
	}

	printf("Session ID: %u\n", mySession);
	/* Each attribute is a single bit in sessionInfo. */
	printf("sessionIsRoot: %s\n", sessionInfo & sessionIsRoot ? "true" : "false");
	printf("sessionHasGraphicAccess: %s\n", sessionInfo & sessionHasGraphicAccess ? "true" : "false");
	printf("sessionHasTTY: %s\n", sessionInfo & sessionHasTTY ? "true" : "false");
	printf("sessionIsRemote: %s\n", sessionInfo & sessionIsRemote ? "true" : "false");

	return 0;
}

When the bit sessionHasGraphicAccess is set, you are in an Aqua session. There is also a Gist with a Python version.

Querying Quartz Display Services is even possible with a one-liner, as long as macOS comes with Python 2.7 and PyObjC preinstalled (a C version exists as well):

$ python -c "import Quartz; print(Quartz.CGSessionCopyCurrentDictionary())"

CGSessionCopyCurrentDictionary() merely tells you whether there is a running Aqua session for the current user, not whether your process is inside that Aqua session. If the current user has no Aqua session, it prints None.
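For use inside a larger script, the same query can be wrapped into two small functions. This is a sketch under assumptions: it expects a Python 3 interpreter with PyObjC installed, the function names are made up for this illustration, and the reading of kCGSSessionOnConsoleKey as "this session currently owns the console" is my interpretation of the key name.

```python
def current_gui_session():
    """Return the session dictionary of the current user, or None.

    Requires PyObjC (assumed to be installed); raises ImportError otherwise.
    """
    import Quartz  # imported lazily so the helper below works without it
    return Quartz.CGSessionCopyCurrentDictionary()


def describe_session(session):
    """Turn the dictionary (or None) into a one-line summary."""
    if session is None:
        return "no Aqua session for the current user"
    user = session.get("kCGSSessionUserNameKey", "?")
    # Assumption: a truthy value means the session owns the console.
    on_console = bool(session.get("kCGSSessionOnConsoleKey", 0))
    state = "on console" if on_console else "not on console"
    return f"Aqua session of {user} ({state})"
```

On a Mac, print(describe_session(current_gui_session())) prints the summary. Keep in mind that a positive answer still does not mean that the calling process itself is inside the Aqua session.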

When There Is a GUI, and When There Is Not

If no user has logged into the machine by using the GUI, there is no GUI. You cannot do anything about it except log in by using the GUI, either manually or by enabling automatic login.

If jane has previously logged into the machine by using the GUI, only jane can access the GUI and nobody else, because jane’s GUI lives in jane’s per-session bootstrap namespace, which is only accessible to jane. No amount of sudo trickery will allow anyone else to access jane’s GUI.

If joe needs a GUI, joe needs to log in by using the GUI, too. Of course, joe can only access joe’s GUI because joe’s GUI lives in joe’s per-session bootstrap namespace.

If you want to see it for yourself, reproduce the various situations and run the diagnostic utilities I provided above.

Launch Daemons and Launch Agents

When it comes to background processes, macOS distinguishes between Launch Daemons and Launch Agents.

  • Launch Daemons cannot access the GUI because they are started before any user has logged in and, thus, before any GUI has been started.
  • Launch Agents can access the GUI, provided that they are configured to require an Aqua session. They are started after the user has logged into the GUI, either manually or through automatic login.

SSH

With SSH, it gets a little murky. As per the theory about namespaces from above, the GUI should not be accessible from SSH. Remember the bootstrap namespace hierarchy:

System bootstrap namespace
|
+----- Per-user bootstrap namespace (SSH without SessionCreate)
       |
       +----- Non-GUI per-session bootstrap namespace (SSH with SessionCreate)
       |
       +----- GUI per-session bootstrap namespace

When you SSH into macOS, you either end up in the per-user bootstrap namespace or in a non-GUI per-session bootstrap namespace if the SessionCreate property is set (it is not in recent versions of macOS). But to access the GUI, you would have to enter the GUI per-session bootstrap namespace. Luckily, you can, with the help of Launch Services and the utility open:

jane@test ~ % open -a TextEdit.app

Provided that jane previously logged into the GUI to create an Aqua session, TextEdit will open. If you run my small sessioninfo program from above, it will even report that you have full access to an Aqua session:

jane@test ~ % open -a Terminal.app ./sessioninfo

The output displayed by Terminal on the screen:

jane@test ~ % /Users/jane/sessioninfo ; exit;
Session ID: 100004
sessionIsRoot: false
sessionHasGraphicAccess: true
sessionHasTTY: true
sessionIsRemote: false

It does not seem to be possible to launch arbitrary binaries with open. You need an actual macOS application that acts as a helper. Terminal is a good choice because it can run entire shell scripts, too. However, the output is not printed to stdout of the open SSH session but displayed on the GUI. That may be inconvenient.

It is also possible to access the GUI directly from SSH, provided that the SSH login user previously logged into the machine using the GUI. According to Technical Note TN2083, this is possible because the window server advertises its service to upper namespaces. And, indeed, CGSessionCopyCurrentDictionary() reports that a GUI session is available:

jane@test ~ % python -c "import Quartz; print(Quartz.CGSessionCopyCurrentDictionary())"
{
    kCGSSessionAuditIDKey = 100047;
    kCGSSessionGroupIDKey = 20;
    kCGSSessionLoginwindowSafeLogin = 0;
    kCGSSessionOnConsoleKey = 1;
    kCGSSessionSystemSafeBoot = 0;
    kCGSSessionUserIDKey = 502;
    kCGSSessionUserNameKey = jane;
    kCGSessionLoginDoneKey = 1;
    kCGSessionLongUserNameKey = "Jane Appleseed";
    kSCSecuritySessionID = 100047;
}

However, this approach is not without drawbacks. The biggest: You do not get an Aqua session.

jane@test ~ % ./sessioninfo
Session ID: 958
sessionIsRoot: false
sessionHasGraphicAccess: false
sessionHasTTY: true
sessionIsRemote: true

What it means not to have an Aqua session is not entirely clear to me. The newest information I could find on this topic is this message from 2013 by Apple’s Mike Swingler to the OpenJDK mailing list. In short, expect the GUI to exhibit funky behaviours as showcased by Technical Note TN2083.

Fast User Switching and Screen Lock

A GUI session remains available even while the screen is locked. The same is true if another user logs into the machine through the GUI using Fast User Switching. Only a full logout ends the GUI session.

How Screen Sharing Fits into the Picture

Whether you use Screen Sharing (VNC) or not does not matter. Screen Sharing merely televises the remote display to your local screen.

Strategies For GUI Interaction in Automation

Finally, it is time to apply our newly learnt knowledge!

Remote Management Using SSH, Including Ansible

Some remote management tasks require a GUI session. For example, I have occasionally seen the Xcode installation stall without a GUI. If you have such tasks, proceed as follows:

  1. Enable automatic login for the SSH user that is used to remotely log into the machine. There can only be one.
  2. Directly perform the desired task. Only resort to using open and Launch Services if you really need an Aqua session.

If you connect to the machine over SSH as jane, you must enable automatic login for jane. You cannot sudo into jane’s GUI from another user account.

If you feel uneasy about enabling automatic login, please see the section Security Considerations.

Continuous Integration with Local Agents

This is the recommended approach for all CI servers supporting locally installable agents. This includes Jenkins, TeamCity, and AppVeyor Server. If you can choose between locally installable agents and SSH, like with Jenkins, use the locally installable agent.

  1. Enable automatic login for the user that is used to run the CI agent, typically jenkins.
  2. Create a Launch Agent for the user jenkins that starts the Jenkins agent. Ensure that the Launch Agent requires the session type Aqua.

The following plist file defines a Launch Agent for Jenkins’ agent:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
	<dict>
		<key>Label</key>
		<string>io.jenkins.agent</string>
		<key>OnDemand</key>
		<false/>
		<key>ProgramArguments</key>
		<array>
			<string>/usr/bin/java</string>
			<string>-jar</string>
			<string>/Users/jenkins/agent/agent.jar</string>
			<string>-jnlpUrl</string>
			<string>http://192.168.64.1:8080/computer/macos%2D12/jenkins-agent.jnlp</string>
			<string>-secret</string>
			<string>4c75a38aed67643d8ebe7e3a943aa93425340c65f0d3dd610019ffef1b82b8df</string>
			<string>-workDir</string>
			<string>/Users/jenkins/workspace</string>
		</array>
		<key>StandardErrorPath</key>
		<string>/Users/jenkins/agent/logs/jenkins.err.log</string>
		<key>StandardOutPath</key>
		<string>/Users/jenkins/agent/logs/jenkins.out.log</string>
		<key>KeepAlive</key>
		<true/>
		<key>RunAtLoad</key>
		<true/>
		<key>LimitLoadToSessionType</key>
		<array>
			<string>Aqua</string>
		</array>
	</dict>
</plist>

Adjust the configuration as necessary (especially the IP addresses, paths, and the secret). See Apple’s Daemons and Services Programming Guide for more information about the configuration file format. Save this file to ~/Library/LaunchAgents/io.jenkins.agent.plist. Then load it with:

jenkins@test ~ % launchctl load ~/Library/LaunchAgents/io.jenkins.agent.plist

In the future, the Jenkins agent will be automatically started after the user jenkins has logged into the GUI.
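If you provision several machines, you may prefer to generate the plist instead of editing XML by hand. The following is a minimal sketch using plistlib from the Python standard library. The function build_agent_plist and its parameters are made up for this illustration, the paths are the example values from above, and only a subset of the keys is included (the JNLP URL and the secret are left out on purpose).

```python
import plistlib


def build_agent_plist(label, program_arguments, log_dir):
    """Return the XML bytes of a Launch Agent that only loads in Aqua sessions."""
    job = {
        "Label": label,
        "ProgramArguments": list(program_arguments),
        "StandardOutPath": f"{log_dir}/jenkins.out.log",
        "StandardErrorPath": f"{log_dir}/jenkins.err.log",
        "KeepAlive": True,
        "RunAtLoad": True,
        # Restrict the job to GUI login sessions.
        "LimitLoadToSessionType": ["Aqua"],
    }
    return plistlib.dumps(job)


data = build_agent_plist(
    "io.jenkins.agent",
    ["/usr/bin/java", "-jar", "/Users/jenkins/agent/agent.jar",
     "-workDir", "/Users/jenkins/workspace"],
    "/Users/jenkins/agent/logs",
)
print(data.decode("utf-8"))
```

Write the returned bytes to ~/Library/LaunchAgents/io.jenkins.agent.plist and load the file as described above.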

When I run the sessioninfo program from above as part of a Jenkins job, it confirms that we have access to a full Aqua session:

Started by user admin
Running as SYSTEM
Building remotely on macos-12 in workspace /Users/jenkins/workspace/workspace/test
[WS-CLEANUP] Deleting project workspace...
[WS-CLEANUP] Deferred wipeout is used...
[WS-CLEANUP] Done
[test] $ /bin/sh -xe /var/folders/nx/7lyh9gmn1dn2qdvrt4_m1tfm0000gq/T/jenkins7522466743682245704.sh
+ /Users/jenkins/sessioninfo
Session ID: 100005
sessionIsRoot: false
sessionHasGraphicAccess: true
sessionHasTTY: true
sessionIsRemote: false
Finished: SUCCESS

If you are using TeamCity, this Gist provides a ready-made agent configuration.

Continuous Integration with SSH

Only use this method if your CI software does not support locally installable agents.

The first step is to enable automatic login for the SSH user that your CI system uses to connect.

The second step depends on your needs. If you do not need an Aqua session, go ahead and run your CI jobs as you typically do. According to Apple, it is good enough for xcodebuild and all kinds of UI tests.

If you need an Aqua session, for example, for Java, some mind-boggling trickery is needed.

  1. As part of the build job, write a shell script on disk (for example, to a temporary file, tmp.VM0IvH0f in the example) that invokes the build step that requires a GUI with an Aqua session. As an example, I run sessioninfo from above:

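    #! /usr/bin/env bash
    
    # The build step that needs an Aqua session.
    /Users/jenkins/sessioninfo
    
    # Walk up the process tree from this script's parent to the Terminal
    # instance started by open and kill it, so that open --wait-apps returns.
    kill -9 $(ps -p $(ps -p $PPID -o ppid=) -o ppid=)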
    
  2. Make the script executable.

  3. Start this script with the help of open and Terminal.app in your build step:

    open --wait-apps --new --fresh -a Terminal.app /var/folders/nx/7lyh9gmn1dn2qdvrt4_m1tfm0000gq/T/tmp.VM0IvH0f
    

Thanks to Launch Services, the Bash script will be run within an Aqua session.

--new --fresh give us a clean new Terminal. --wait-apps is necessary to wait for the script to complete; otherwise, open returns immediately. Unfortunately, open does not wait for the script but for Terminal to quit, and Terminal does not quit when the script exits. At best, it can close its windows. That is what the complex kill command is for, which I found in this excellent answer on StackOverflow. Contrary to all other solutions I have found, this approach works reliably with multiple parallel invocations of open.

The only remaining question you have to answer for yourself is how to regain access to the script’s stdout and stderr because those are printed to the screen. One possible option is to write them out to temporary files, for example, with tee from within the script, and then collect them at the end of the build. The --stdout and --stderr options of open are not suitable for this purpose because they only capture Terminal’s stdout and stderr.
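The tee idea could look roughly like the following Python sketch, which generates the wrapper script and the matching open invocation. Everything here is illustrative: build_wrapper_script and build_open_command are hypothetical helpers, and the extra status file is my own addition for getting the exit code back to the build, not something open or Terminal provides.

```python
import shlex


def build_wrapper_script(command, log_path, status_path):
    """Return the text of a Bash wrapper that logs output and quits Terminal."""
    cmd = " ".join(shlex.quote(part) for part in command)
    lines = [
        "#!/usr/bin/env bash",
        "# Run the build step; copy stdout and stderr to a log file.",
        cmd + " 2>&1 | tee " + shlex.quote(log_path),
        "# Save the exit code of the build step (not the one of tee).",
        'echo "${PIPESTATUS[0]}" > ' + shlex.quote(status_path),
        "# Kill the Terminal instance so that open --wait-apps returns.",
        "kill -9 $(ps -p $(ps -p $PPID -o ppid=) -o ppid=)",
    ]
    return "\n".join(lines) + "\n"


def build_open_command(script_path):
    """Return the argument list that runs the script in a fresh Terminal."""
    return ["open", "--wait-apps", "--new", "--fresh",
            "-a", "Terminal.app", script_path]


script = build_wrapper_script(
    ["/Users/jenkins/sessioninfo"],
    "/tmp/build step.log",   # hypothetical paths
    "/tmp/build.status",
)
print(script)
print(build_open_command("/tmp/wrapper.sh"))
```

After open returns, the build job can print the log file and exit with the code stored in the status file. Remember to make the generated script executable before handing it to open.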

Security Considerations

As you might have noticed, I always recommend enabling automatic login. This has some security implications:

  • Everybody with physical access to the machine can hook up a screen, mouse and keyboard and interact with the GUI. Actions that require elevated permissions like changing system preferences still prompt for the user’s password, however.
  • The password of the user is stored in /etc/kcpassword in recoverable form.
  • Screen Sharing (VNC) still prompts for the user’s password upon establishing the connection.
  • SSH authentication is unaffected.

If this is unacceptable for you, either due to security concerns or compliance regulations, you still have options:

Enabling Automatic Login on macOS

Apple explains how to enable automatic login on macOS by using the GUI. This works great, but might not fit the bill, for example, when using automation software like Ansible.

On the command line, the following steps have to be taken:

  1. Write the name of the user for which auto login should be enabled to the global user defaults by running:

    sudo /usr/bin/defaults write /Library/Preferences/com.apple.loginwindow autoLoginUser <user>
    
  2. Save the user’s password to /etc/kcpassword in a special format.

There are various scripts and tools on the internet that encode the password into the right format for /etc/kcpassword. This article offers a version written in pure Bash. kcpassword is an all-in-one solution that both writes the password and changes the user defaults.
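To give an idea of what such tools do, here is a rough Python sketch of the encoding. It rests on two assumptions: the password is XORed with the repeating 11-byte key that the publicly available kcpassword tools use, and it is padded with zero bytes to a multiple of 12 bytes. The padding rule in particular is an assumption that may differ between macOS versions, so compare the result with a file written by the GUI before relying on it. The function names are made up for this illustration.

```python
# Repeating XOR key as used by the publicly available kcpassword tools.
KEY = bytes([0x7D, 0x89, 0x52, 0x23, 0xD2, 0xBC, 0xDD, 0xEA, 0xA3, 0xB9, 0x1F])
BLOCK_SIZE = 12  # assumption: the file length is a multiple of 12 bytes


def _xor_with_key(data):
    """XOR every byte with the key, repeating the key as needed."""
    return bytes(b ^ KEY[i % len(KEY)] for i, b in enumerate(data))


def encode_kcpassword(password):
    """Return the obfuscated bytes for /etc/kcpassword (sketch)."""
    raw = password.encode("utf-8")
    # Always append at least one zero byte as a terminator (assumption).
    padding = BLOCK_SIZE - len(raw) % BLOCK_SIZE
    return _xor_with_key(raw + b"\0" * padding)


def decode_kcpassword(blob):
    """Reverse the obfuscation; shows that the password is recoverable."""
    return _xor_with_key(blob).split(b"\0", 1)[0].decode("utf-8")


blob = encode_kcpassword("s3cret")
print(blob.hex(), decode_kcpassword(blob))
```

The round trip also illustrates the point from Security Considerations: this is obfuscation, not encryption, so keep the file readable by root only.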

Further Reading