Tom Bell
JavaScript, Go, Swift, and Ruby Developer.

Building a Window Manager for macOS

When you talk about window managers, people immediately think of Linux and the vast number of window managers available for it. A traditional Linux window manager takes care of window positioning and sizing, along with things like window borders and workspaces. They’re typically built using the Xlib or XCB libraries. A window manager for macOS will typically only manage window sizes and positions.

Why?

Why would you want to build a window manager for macOS? Ever since I started using macOS 5 years ago I have used some sort of application that lets me use shortcut keys to reposition and resize windows. About a year ago I was switching between Moom and Slate as window managers. Now I don’t think I could live without such an application.

I wanted to build my own window manager for macOS, and I also wanted to learn Swift, which was a new language at the time. I never found myself liking Objective-C, so having something as nice and simple as Swift was motivating. I looked at existing applications like Slate, Phoenix, and Amethyst. Ultimately I wanted to build something that sat in the status menu and had a single configuration file.

Originally my friend Martin Rue suggested using Lua as the configuration language. I liked Lua: it was a nice, simple language that was easy to embed in projects. I eventually stumbled across JavaScriptCore and thought it would be a nicer solution for a number of reasons:

  1. Everyone can write JavaScript
  2. Part of the Cocoa libraries
  3. Simpler to include in a Swift project

I looked at existing projects for ideas about what was possible for managing windows and started planning out what I wanted my window manager to do for my needs.

How?

I’d already experimented with status bar based applications, so I had a bit of experience setting up the type of project I wanted.

Going a bit off topic for a moment: back when I was thinking up names for the project I had all sorts of crap names, and for whatever reason Martin and I got on to the topic of Iron Man. I liked the idea that the Arc reactor was what powered Iron Man, and I liked the name Stark. It had the metaphorical justification that made me come up with the tag line “Power your window management with an Arc reactor”, and the dictionary meaning of stark could relate to the simplicity of the application, which includes only the bare minimum feature set.

Once I had a name I started coming up with ideas for the status item image, browsing Dribbble and Google for inspiration. I settled on the simple “Arc Reactor” imagery, then opened Sketch and started learning how to make what I wanted.

I finally had an application running that sat in the status menu and had a Quit Stark menu item. This was it, the start of my first Cocoa macOS application.

let statusItem = NSStatusBar.system().statusItem(withLength: NSVariableStatusItemLength)

func setupStatusItem() {
  let image = NSImage(named: "StatusItemIcon")
  image?.isTemplate = true

  statusItem.highlightMode = true
  statusItem.image = image

  let menu = NSMenu()
  menu.addItem(withTitle: "Quit Stark", action: #selector(AppDelegate.quit), keyEquivalent: "")

  statusItem.menu = menu
}

The above code is responsible for setting up the status item and then adding the menu to it. I chose to define the menu items in code rather than use Interface Builder because it felt a lot simpler for a small menu like this one.

The next step for the application was to define the protocols for the JavaScript API. I wrote the function definitions for the public API that I wanted to expose to JavaScript, with classes for windows and applications. I then needed to let the user define key combinations. The problem is that this relies on low-level Carbon APIs, and while there were lots of examples in Objective-C, not many people had examples of doing it in Swift. I think I spent somewhere between one and three months of trial and error finding the best solution for shortcut keys in my application. In the end I had a way to define shortcuts in JavaScript and log when one was pressed.

private static var __once: () = {
  let callback: EventHandlerUPP = { (handler, event, data) -> OSStatus in
    autoreleasepool {
      var identifier = EventHotKeyID()

      let status = GetEventParameter(event, EventParamName(kEventParamDirectObject), EventParamType(typeEventHotKeyID), nil, MemoryLayout<EventHotKeyID>.size, nil, &identifier)

      if status != noErr {
        return
      }

      NotificationCenter.default.post(
        name: Notification.Name(rawValue: starkHotKeyKeyDownNotification),
        object: nil,
        userInfo: [starkHotKeyIdentifier: UInt(identifier.id)]
      )
    }

    return noErr
  }

  var keyDown = EventTypeSpec(eventClass: OSType(kEventClassKeyboard), eventKind: UInt32(kEventHotKeyPressed))

  InstallEventHandler(GetEventDispatcherTarget(), callback, 1, &keyDown, nil, nil)
}()

The above code installs an event handler for global hot key presses; the system invokes the callback function whenever a registered hot key is pressed. The callback reads the hot key identifier from the event and then posts a notification so it can be handled by the observers listening for that key press notification. Each instance of Bind sets up an observer that checks whether the identifier sent by the callback is the same identifier handled by that bind instance.
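The dispatch flow described above can be illustrated in plain JavaScript. This is only a rough sketch of the pattern, not Stark's Swift implementation: `HotKeyCenter`, `register`, and `keyDown` are hypothetical names standing in for the notification centre, the Bind observers, and the event handler callback.

```javascript
// Hypothetical sketch: one global handler posts an identifier, and every
// registered bind checks whether that identifier is its own.
class HotKeyCenter {
  constructor() {
    this.observers = [];
    this.nextIdentifier = 1;
  }

  // Register a callback and return the identifier assigned to it.
  register(callback) {
    const identifier = this.nextIdentifier++;
    this.observers.push((postedIdentifier) => {
      // Each observer ignores notifications meant for other binds.
      if (postedIdentifier === identifier) callback();
    });
    return identifier;
  }

  // Stand-in for the event handler callback: post the identifier to everyone.
  keyDown(identifier) {
    this.observers.forEach((observer) => observer(identifier));
  }
}

const center = new HotKeyCenter();
const pressed = [];
const left = center.register(() => pressed.push("left"));
const right = center.register(() => pressed.push("right"));

center.keyDown(right);
center.keyDown(left);
center.keyDown(99); // unknown identifier: no bind reacts
```

Broadcasting to every observer keeps the global handler unaware of individual binds, at the cost of one identifier comparison per bind on each key press.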

Next I had to implement the actual managing of windows and applications. Once again, macOS doesn’t give you a simple way to do this: you have to use the Accessibility APIs, and not many people had examples of using them with Swift. It took about another month of trial and error to get things working the way I wanted. Eventually I was able to move, resize, and focus windows using JavaScript.

open func windows() -> [Window] {
  var values: CFArray?
  let result = AXUIElementCopyAttributeValues(element, kAXWindowsAttribute as CFString, 0, 100, &values)

  if result != .success {
    return []
  }

  let elements = values! as [AnyObject]

  return elements.map { Window(element: $0 as! AXUIElement) }
}

The above code gets all the windows for the application specified by the property element. It will then return an array of Window instances.

open func setTopLeft(_ topLeft: CGPoint) {
  var val = topLeft
  let value = AXValueCreate(AXValueType(rawValue: kAXValueCGPointType)!, &val)!
  AXUIElementSetAttributeValue(element, kAXPositionAttribute as CFString, value)
}

open func setSize(_ size: CGSize) {
  var val = size
  let value = AXValueCreate(AXValueType(rawValue: kAXValueCGSizeType)!, &val)!
  AXUIElementSetAttributeValue(element, kAXSizeAttribute as CFString, value)
}

The former function lets the user set the top left point of a window with the given X and Y coordinates. The latter function lets the user set the size of a window with the given width and height. These are the main two functions used for positioning and sizing, and a setFrame function is available to position and size at the same time passing a CGRect argument.

I was stoked: I had built enough to replace Moom and Slate. I always feel nervous when I start running my own software, because it never feels ready. I jumped in at the deep end and started running Stark. I replicated the basic window management I used with Moom and was flying. I started doing more complicated things, like adding a 10 pixel margin around every window when positioning and sizing them. I also wanted windows of varying sizes centered on the screen, so I created 3 shortcut keys for large, medium, and small centered windows.
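A margin and a set of centered sizes can be expressed as two small helpers that work on plain `{ x, y, width, height }` frames. This is a minimal sketch under assumptions: `inset`, `centered`, `MARGIN`, and the fractions in `SIZES` are made-up names and values for illustration, and the screen frame is an example rather than real output from Stark.

```javascript
const MARGIN = 10; // pixels removed from every side of a frame

// Illustrative fractions of the screen for each centered size (assumed values).
const SIZES = { large: 0.9, medium: 0.7, small: 0.5 };

// Shrink a frame by `margin` pixels on every side.
function inset(frame, margin) {
  return {
    x: frame.x + margin,
    y: frame.y + margin,
    width: frame.width - margin * 2,
    height: frame.height - margin * 2,
  };
}

// Build a frame covering `fraction` of the screen, centered within it.
function centered(screen, fraction) {
  const width = screen.width * fraction;
  const height = screen.height * fraction;
  return {
    x: screen.x + (screen.width - width) / 2,
    y: screen.y + (screen.height - height) / 2,
    width,
    height,
  };
}

// Example usable screen area below a 23 pixel menu bar.
const screen = { x: 0, y: 23, width: 1440, height: 800 };
const small = inset(centered(screen, SIZES.small), MARGIN);
```

In a configuration, the result would presumably be passed to `win.setFrame(...)` inside a `Bind.on` callback, with `screen` taken from `win.screen.frameWithoutDockOrMenu`.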

Bind.on("h", MODIFIERS, function() {
  const win = Window.focused();
  if (!win) return;

  const r = win.screen.frameWithoutDockOrMenu;

  const x = r.x;
  const y = r.y;
  const width = r.width / 2;
  const height = r.height;

  win.setFrame({ x, y, width, height });
});

This binds a callback to MODIFIERS (which are ctrl and shift for me) and h. The callback will reposition and resize the window to take up the left half of the current screen the window is on. It utilises the frameWithoutDockOrMenu so that the repositioning and resizing takes into account the Dock and the menu bar.
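The same arithmetic extends to the other half of the screen. The helper below is a hedged sketch: `rightHalf` is a hypothetical name, and the frame passed in is an example value rather than something read from a real screen.

```javascript
// Compute the right half of a usable screen frame ({ x, y, width, height }).
function rightHalf(r) {
  return {
    x: r.x + r.width / 2, // start at the horizontal midpoint
    y: r.y,
    width: r.width / 2,
    height: r.height,
  };
}

// Example values only; in a Stark configuration `r` would come from
// win.screen.frameWithoutDockOrMenu.
const frame = rightHalf({ x: 0, y: 23, width: 1440, height: 877 });
```

Adding `r.x` matters when the usable area does not start at zero, for example with the Dock on the left or on a second display.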

Bind.on("z", MODIFIERS, function() {
  const win = Window.focused();
  if (!win) return;

  const r = win.screen.frameWithoutDockOrMenu;

  const x = r.x + (r.width / GRID_WIDTH) * 2;
  const y = r.y + (r.height / GRID_HEIGHT) * 2;
  const width = (r.width / GRID_WIDTH) * 8;
  const height = (r.height / GRID_HEIGHT) * 6;

  win.setFrame({ x, y, width, height });
});

This binds a callback similarly to the one above. This callback repositions and resizes the window to be 8 by 6 on a 12 by 10 grid and centered.
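The grid arithmetic can be pulled into a reusable helper. This is a sketch, assuming `GRID_WIDTH` is 12 and `GRID_HEIGHT` is 10 as the description of the grid suggests; `gridFrame` and the example screen frame are hypothetical.

```javascript
const GRID_WIDTH = 12;  // columns, assumed from the 12 by 10 grid described
const GRID_HEIGHT = 10; // rows

// Convert a position and size in grid cells into a pixel frame within `r`.
function gridFrame(r, column, row, columns, rows) {
  const cellWidth = r.width / GRID_WIDTH;
  const cellHeight = r.height / GRID_HEIGHT;
  return {
    x: r.x + cellWidth * column,
    y: r.y + cellHeight * row,
    width: cellWidth * columns,
    height: cellHeight * rows,
  };
}

// Example usable screen area; 8 by 6 cells starting at column 2, row 2.
const r = { x: 0, y: 23, width: 1440, height: 800 };
const medium = gridFrame(r, 2, 2, 8, 6);
```

Starting at cell (2, 2) leaves two columns free on each side of the 8 column window and two rows free above and below the 6 row window, which is what centers it.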

I had also bought some Bluetooth speakers for my setup. The problem I encountered was that my MacBook Pro would always connect to them, even if I was in a different room and wanted to use the internal speakers or headphones. I eventually found a simple command line program to swap output devices, and naturally I wanted to control it via Stark. I ended up adding a function to the JavaScript API to run a command with the given arguments. Now I had shortcut keys to switch between all my output devices.

Bind.on("1", MODIFIERS, function() {
  Stark.run("/usr/local/bin/soundsource", ["-o", "Internal Speakers"])
});

Bind.on("2", MODIFIERS, function() {
  Stark.run("/usr/local/bin/soundsource", ["-o", "Creative T15 Wireless"])
});
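With more than a couple of devices, the binds could be generated from a list instead of written out by hand. The sketch below is an assumption-laden illustration: `bindOutputDevices` is a hypothetical helper, and the `bindOn` and `run` arguments are stand-ins so the example runs outside Stark.

```javascript
// Bind the number keys 1, 2, 3, ... to the given output devices, in order.
function bindOutputDevices(devices, bindOn, run) {
  devices.forEach((device, index) => {
    const key = String(index + 1);
    bindOn(key, () => run("/usr/local/bin/soundsource", ["-o", device]));
  });
}

// Stand-ins that record what would have been bound and run.
const bound = {};
const ran = [];

bindOutputDevices(
  ["Internal Speakers", "Creative T15 Wireless"],
  (key, callback) => { bound[key] = callback; },
  (command, args) => { ran.push([command, ...args]); }
);

bound["2"](); // simulate pressing MODIFIERS and 2
```

In a real configuration the stand-ins would presumably be replaced with `(key, callback) => Bind.on(key, MODIFIERS, callback)` and `(command, args) => Stark.run(command, args)`.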

A couple of friends offered to help test the alpha and try finding any bugs with the application. I gave them both the binary and an example configuration to get started. Martin found a bug or two. One was an integer overflow when computing the hash for an object.

var hash = key.hashValue
modifiers.forEach { hash += $0.hashValue }
return hash

This was eventually replaced with a simpler and less error-prone version:

let identifier = String(format: "%@[%@]", key, modifiers.joined(separator: "|"))
return identifier.hashValue
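The idea of combining the key and modifiers into one string, rather than adding hash values together, carries over to other languages. Here is a small JavaScript sketch of the same key format; `bindKey` is a hypothetical name, and sorting the modifiers is an added assumption so that their order does not change the result.

```javascript
// Build a single string identifying a key combination, e.g. "h[ctrl|shift]".
function bindKey(key, modifiers) {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...modifiers].sort();
  return `${key}[${sorted.join("|")}]`;
}

// The string can be used directly as a lookup key; no arithmetic is involved,
// so there is nothing that can overflow.
const binds = new Map();
binds.set(bindKey("h", ["ctrl", "shift"]), "left half");

const same = binds.get(bindKey("h", ["shift", "ctrl"]));
```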

I think the other bug Martin found was to do with reading the configuration file when it was a symlink, because I wasn’t resolving symlinks properly.

Martin and I have probably been running the latest Stark alpha build for over 6 months now without any major issues. Since running Stark full time, I’ve been improving, removing, and fixing minor issues I’ve come across.

I originally thought I would release Stark via the Mac App Store, but this would mean I couldn’t add support for things like Spaces, because that relies on a private API. I had never used Spaces before, but I ended up using them for different contexts, such as social, games, personal projects, and work projects.

Now I was thinking to myself that I had to add support for Spaces. I wanted to be able to easily move focused windows to other spaces without using complicated touchpad gestures. I started experimenting with the private Spaces API. Eventually I worked out how it worked, and I have a plan for how I want to add it to Stark in the future.

Future?

I’ve started thinking about the future of Stark before I’ve even technically released it publicly. I cannot help thinking about improvements and neat things I could add to Stark.

I created a new project board on GitHub to help users follow the progress of Stark publicly. I initially started using Trello, but then GitHub announced their project boards, and I was interested in giving those a shot as I already wanted to use GitHub Issues. Now I had a place to motivate myself to fix issues and implement features, and a place for the public to see that things are moving along with Stark.

I have 3 remaining issues open before I feel comfortable publicly releasing Stark for everyone. The first is making 100% certain I’m happy with the initial public JavaScript API, as I don’t want to make major changes once it’s released. The next issue is about the internal JavaScript library that provides some simple things best done in JavaScript rather than Swift: I’m looking for a nice way to minify and concatenate the ES5 JavaScript. I am currently using uglify-js, but it doesn’t support ES6, which I will be switching to as soon as JavaScriptCore fully supports it. The final issue is splitting out the task running function into its own object.

Once these have been finished, I’ll feel comfortable with releasing Stark to the public for feedback. For a follow up release I would like to include support for Spaces.

Another future idea I would like to try out is a client and server model, whereby a server listens on a socket for incoming commands and executes them. The client would be a separate application that sends commands to the server, and it could be driven by JavaScript, Lua, or a text based configuration. This is similar to how bspwm and kwm work. I think this is also great because it would let other people build a client to work with Stark even if they don’t want to use JavaScript themselves for configuration.
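To make the client and server idea a little more concrete, here is a rough sketch of the command handling side only, with the socket left out. Everything in it is hypothetical: the line-based `<command> <args...>` format, the `frame` command, and the `parseCommand` and `createServer` names are invented for illustration and are not a planned Stark protocol.

```javascript
// Hypothetical line-based protocol: "<command> <arg> <arg> ...".
function parseCommand(line) {
  const [name, ...args] = line.trim().split(/\s+/);
  return { name, args };
}

// The server owns the handlers; clients only ever send lines of text.
function createServer(handlers) {
  return {
    handle(line) {
      const { name, args } = parseCommand(line);
      const handler = handlers.get(name);
      if (!handler) return `error unknown command: ${name}`;
      return handler(...args);
    },
  };
}

// Record requested frames instead of moving real windows.
const frames = [];
const server = createServer(new Map([
  ["frame", (x, y, width, height) => {
    frames.push({ x: Number(x), y: Number(y), width: Number(width), height: Number(height) });
    return "ok";
  }],
]));

const reply = server.handle("frame 0 23 720 877");
const unknown = server.handle("spaces next");
```

Because the server only sees text, a client written in Lua, a shell script, or anything else that can write to a socket could send the same lines.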