Replacing cron with scheduled tasks in Airplane
Cron is one of the most common ways developers run tasks on a recurring basis. For example, scheduling a backup every day or running a monitoring script every 10 minutes are common use cases for cron.
Unfortunately, users of cron tend to run into a few problems:
- Logging: If you want a record of every time a script has run through cron, along with the log output of each run, you need to add that logic yourself.
- Handling failures: If a run fails, you need to add logic yourself that notifies you of the failure through Slack or email.
- Lack of UI: There's no central place to see all the cron jobs you have running. This becomes especially cumbersome in a larger engineering team.
- Manual runs: It's difficult to run a cron job manually as a one-off (outside its schedule), which often comes up when you want to re-run a failed task. Running the script from your own shell is not necessarily the same as running it through cron, because the two environments can differ.
- Maintenance: It's your responsibility to keep the cron job alive. If the server running the job goes down, the job simply stops running, and it's up to you to have monitoring and alerting in place to notice.
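To make the logging and failure-handling points concrete, here is a minimal sketch of the kind of wrapper you might end up writing around a cron job. The function name `run_with_logging`, the log path, and the `notify` callback are all hypothetical; in practice `notify` would post to Slack or send an email.

```python
import logging
import subprocess
from typing import Callable, Sequence


def run_with_logging(
    command: Sequence[str],
    log_path: str,
    notify: Callable[[str], None],
) -> int:
    """Run a command, append its output to a log file, and report failures."""
    logger = logging.getLogger("cron_wrapper")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    try:
        # Capture stdout and stderr so each run leaves a record behind.
        result = subprocess.run(command, capture_output=True, text=True)
        logger.info("command=%s exit_code=%d", list(command), result.returncode)
        if result.stdout:
            logger.info("stdout: %s", result.stdout.strip())
        if result.returncode != 0:
            logger.error("stderr: %s", result.stderr.strip())
            # Hypothetical hook: swap in a Slack or email call here.
            notify(
                f"Job {' '.join(command)} failed "
                f"with exit code {result.returncode}"
            )
        return result.returncode
    finally:
        # Detach the handler so repeated calls don't duplicate log lines.
        logger.removeHandler(handler)
        handler.close()
```

Every job that needs logs or alerts has to go through a wrapper like this, and the wrapper itself becomes one more thing to maintain.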
There are other systems like Airflow or Celery that support scheduled tasks, solve some problems that cron has, and also have a host of other useful features. Here is what each of these job systems offers, and where it falls short for simple scheduling:
- Airflow is more than a simple scheduler. Its primary value-add is being able to model relationships between tasks in a DAG (directed acyclic graph). Airflow also comes with a fleshed-out UI. The main drawback to Airflow is that it requires significant work to set up and maintain, so it is primarily used by data teams with complex ETL pipelines.
- Luigi, built by Spotify, is another platform for data teams to manage complex pipelines. It offers many of the same features as Airflow, though pipelines are expressed as tasks that declare their dependencies rather than as explicitly defined DAGs. Like Airflow, it's a good fit for complex data use cases but too heavyweight for simple scheduled tasks.
- There are several other task orchestration platforms similar to Airflow and Luigi, such as Argo, MLflow, and Kubeflow. These are also powerful but heavyweight.
- Celery is a distributed task queue and isn't strictly a scheduler like cron. While it can be used for scheduled tasks, its primary use cases are around sending out asynchronous tasks to multiple servers.
All of these systems are built for a lot more than simple scheduling and aren't well suited to many simple scheduled-task use cases. As a result, cron continues to be the default option.
However, at Airplane, we've built a platform for scheduled and non-scheduled operations that is optimized for ease of use across the board: ease of setting up new jobs, ease of debugging failures, ease of re-running things, and ease of managing jobs in a larger organization.
Airplane vs cron and other systems
Airplane has advantages over cron and other schedulers:
Automatic notifications and logs
Without any additional work, you can configure an Airplane task to notify you in Slack so you can watch out for failures. Airplane also automatically logs every task run and preserves the outputs for you or anyone on your team (with the right permissions) to be able to search, filter, and view.
Manual as well as automated runs
Any task in Airplane can be triggered manually as a one-off, and when run this way, it runs with the exact same configuration as if it had been run through a schedule.
Permissions and other safety features
Anyone who creates an Airplane task can set explicit permissions around who is allowed to run the task, create schedules, or view log outputs. Airplane offers granular group-based access controls to all tasks.
Native support for SQL and REST
If you want to create a cron job that runs a database query regularly and notifies you each time, or hits an API endpoint on a schedule, you would need to write a script in Python, JS, or some other language to execute that logic. Airplane makes these use cases significantly easier. You can connect Airplane directly to your database and provide the query you want to run on a schedule, or set up a REST resource to create scheduled tasks that hit an internal or third-party REST API.
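For comparison, the cron-side script described above might look like the following sketch. It uses Python's built-in `sqlite3` module as a stand-in for your real database driver, and both `run_scheduled_query` and the `notify` callback are hypothetical names.

```python
import sqlite3
from typing import Callable


def run_scheduled_query(
    db_path: str,
    query: str,
    notify: Callable[[str], None],
) -> list:
    """Run one query and pass a short summary of the result to notify()."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(query).fetchall()
    finally:
        # Always release the connection, even if the query raises.
        conn.close()
    # Hypothetical hook: replace with a Slack or email call.
    notify(f"Query returned {len(rows)} row(s)")
    return rows
```

Even this small version leaves out credentials handling, retries, and logging, which is the boilerplate a direct database connection is meant to remove.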
Usable by non-engineers
Airplane's support for SQL and REST makes it more accessible to people who are technical but not full-fledged engineers. Even for code-based tasks written in a scripting language, an engineer can deploy the task to Airplane, and then a non-engineer can set up a schedule through the Airplane UI.
How Airplane works
Here's how to set up a scheduled task on Airplane in just 5 minutes:
1) Create a new Task
After signing up for a free Airplane account, click "Create Task". Then choose how the task should be executed. You can use a Python, JS, or Shell script, or have Airplane run a SQL query or hit a REST endpoint on a recurring basis.
The in-app guide will walk through setting input parameters for the task, adding in environment variables, setting permissions for who has access to this task, and configuring run constraints like timeout limits. For more detail, read our docs on how to create tasks.
2) Write your task and deploy to Airplane
After specifying the task metadata, you can initialize the task locally using Airplane's command-line utility:
$ airplane init --slug=run_db_backup
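The task file (`run_db_backup.py` in this walkthrough) is where the task logic goes. As a rough illustration only, a backup task might look something like the sketch below. It assumes the entry point is a `main` function that receives the task's parameters as a dict, which may not match what `airplane init` generates for you. The parameter names `source_path` and `backup_dir` are hypothetical, and SQLite stands in for whatever database you actually back up.

```python
import sqlite3
from datetime import datetime, timezone
from pathlib import Path


def main(params):
    """Copy a SQLite database into a timestamped backup file."""
    # Assumption: params is a dict of the task's input parameters.
    source_path = params["source_path"]
    backup_dir = Path(params["backup_dir"])
    backup_dir.mkdir(parents=True, exist_ok=True)

    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target_path = backup_dir / f"backup-{stamp}.sqlite3"

    source = sqlite3.connect(source_path)
    target = sqlite3.connect(str(target_path))
    try:
        # Connection.backup copies the whole database to the target.
        source.backup(target)
    finally:
        target.close()
        source.close()

    print(f"Wrote backup to {target_path}")
    return {"backup_path": str(target_path)}
```

Anything the script prints ends up in the run's output, so a short status line like the one above is usually enough to confirm what happened.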
When you're ready, deploy the task to Airplane:
$ airplane deploy run_db_backup.py
3) Create a schedule
Click "New Schedule" on the Task page and use cron syntax to specify when it should run.
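If cron syntax is unfamiliar, a standard expression has five space-separated fields: minute, hour, day of month, month, and day of week. The small helper below (a hypothetical `describe_cron` function, not part of Airplane) labels each field so you can sanity-check an expression before saving it. Whether Airplane accepts every cron extension, such as step values like `*/10`, is an assumption to verify in the docs.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> dict:
    """Label the five fields of a standard cron expression."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))
```

For the two examples from the start of this post, `describe_cron("0 2 * * *")` labels a daily backup at 02:00 (minute `0`, hour `2`, everything else `*`), and `describe_cron("*/10 * * * *")` labels a monitoring script that runs every 10 minutes.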
That's it! After that, you'll have a scheduled task that will run consistently without any maintenance. You'll get logging, Slack notifications, manual re-runs, and more with no further effort.
Airplane isn't just valuable for scheduled tasks. People have automated a whole host of other operational workflows through Airplane. Check out our docs to learn more.