Open-sourcing our dashboard for Apache Oozie

At Foursquare, we use Apache Oozie to manage the scheduling and control of our offline data processing workflows. We’ve had great success with the project, and we run upwards of 1000 Oozie workflows per day.

Despite the quality of Oozie’s core workflow engine, the web UI is a little clunky and is frankly unusable in a lot of circumstances, especially when you’re using it at a moderate scale.

Thankfully, every few months, Foursquare hosts internal hack days. The idea is that every engineer has 1-2 days to build something cool that they otherwise wouldn’t have a chance to work on.

This time, we decided to build a bunch of tooling for Oozie. One particular tool was a new web dashboard. It’s built using Scalatra and Twitter Bootstrap, and we unimaginatively called it “Oozie Web.”

The dashboard behaves like a normal website (unlike the default dashboard), so we could integrate a bunch of features that were unavailable to us in the bundled dashboard. Specifically:

  • Unique URLs for coordinators and workflows
  • Proper ordering of coordinator / workflow actions
  • Syntax highlighting of job definition and configuration files
  • Coordinator actions link to their corresponding workflows
  • Workflow actions link to their corresponding Hadoop jobs
  • Re-run failed coordinator actions with a single click
  • A better search implementation that matches substrings in workflow names

We’ve been using Oozie Web internally for a couple of months now, so we figured it was about time to make the project open source and give back to the community. We’re releasing the project under the Apache 2.0 license, and it’s available right now on GitHub: http://github.com/foursquare/oozie-web

- Joe Crobak (@joecrobak), Joe Ennever (@TDJoe), and Matthew Rathbone (@rathboma)