It’s no secret that a developer’s day-to-day work often involves creating and maintaining small utility scripts. These scripts are the glue that connects various aspects of your system or build environment. While these Python scripts may not be complex, maintaining them can become a tedious chore that could cost you time and money.
One way to lighten the maintenance load is to use script automation. Rather than spending your time running scripts (often manually), script automation allows you to schedule these tasks to run on a particular timetable or be triggered in response to certain events.
This article will cover twelve Python scripts that were selected for their general utility, ease of use, and positive impact on your workload. They range in complexity from easy to intermediate and focus on text processing and file management. Specifically, we'll walk through the following use cases:
- Create strong random passwords
- Extract text from a PDF
- Text processing with Pandoc
- Manipulate audio with Pydub
- Filter text
- Locate addresses
- Convert a CSV to Excel
- Pattern match with regular expressions
- Convert images to JPG
- Compress images
- Get content from Wikipedia
- Create and manage Heroku apps
These scripts can be dropped into just about any workflow or run as part of an automated playbook using a tool like Airplane.
Using Airplane to manage and execute Python scripts
Airplane is a developer platform to transform APIs, SQL queries, and scripts into internal applications in minutes. The platform provides a central location to store your Python scripts as well as to run, manage, and share them securely.
Before we jump into some useful Python scripts, let's quickly walk through how you can use Airplane to manage and execute them.
There are a number of different options for creating a task. Let's select Python. Note that you can create your task from the UI as described below but you can also create and define your task fully in code from the CLI.
Add a task name and description and click Continue.
We'll first create our task and then deploy our code. Click Create task to finish adding the task to Airplane.
Next, we'll use the Airplane CLI tool to write a script for this task. This tool will allow you to create and manage scripts locally on your machine.
Be sure to wrap your code in a function called main. Dependencies can be included in a requirements.txt file saved in your script’s parent directory.
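A minimal sketch of that layout is shown here. It assumes the task's parameters arrive as a dictionary passed to main, and the parameter name text is hypothetical; yours would come from your own task definition. Returning a value is also an assumption made so the function is easy to test locally.

```python
def main(params):
    """Entry point for the task; `params` is assumed to be a dict of task parameters."""
    # "text" is a hypothetical parameter name used only for illustration.
    text = params.get("text", "")
    word_count = len(text.split())
    print(f"Received {word_count} words")
    return {"word_count": word_count}


if __name__ == "__main__":
    # Local smoke test before deploying.
    main({"text": "hello from a local run"})
```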
Before deploying our Python script to Airplane, we'll create a yaml file containing the name of the script so it's easily identifiable.
For example, if you are filtering text with regular expressions, you might title this file regex_filter.task.yaml. An explanation of the task definition can be found in the Airplane task definition docs.
The yaml file should look similar to the example below:
Once your Python script and yaml task definition file have been created, you can deploy to Airplane using the CLI:
airplane deploy regex_filter.task.yaml
Your Python task is now ready to run!
Now that we know how to use Airplane to manage and execute Python scripts, let's walk through some of the scripts themselves.
Python scripts for developers to implement
Let's dive into twelve Python utility scripts that we can leverage to make our lives easier. The code samples from this article can be found in this GitHub repository.
1. Create strong random passwords
There are many reasons to create strong random passwords, from onboarding new users to providing a password-reset workflow to creating a new password when rotating credentials. You can easily use a dependency-free Python script to automate this process:
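One way to do this with only the standard library is sketched below. It uses the secrets module, which is designed for security-sensitive randomness; the function name, the default length of 16, and the rule that every character class must appear are assumptions you can adjust to your own password policy.

```python
import secrets
import string


def generate_password(length=16):
    """Return a random password with lowercase, uppercase, digit, and symbol characters."""
    if length < 4:
        raise ValueError("length must be at least 4 to include every character class")
    alphabet = string.ascii_letters + string.digits + string.punctuation
    while True:
        password = "".join(secrets.choice(alphabet) for _ in range(length))
        # Retry until every character class is represented at least once.
        if (any(c.islower() for c in password)
                and any(c.isupper() for c in password)
                and any(c.isdigit() for c in password)
                and any(c in string.punctuation for c in password)):
            return password


if __name__ == "__main__":
    print(generate_password(20))
```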
2. Extract text from a PDF
Python can also be used to easily extract text from PDFs using the PyPDF2 package. Getting text from a PDF file proves useful for data mining, invoice reconciliation, or report generation, and the extraction process can be automated in just a few lines of code. You can run pip install PyPDF2 in your terminal to install the package. Below are a few examples of what you can achieve using PyPDF2:
Say you receive a multipage PDF file but you only need the first page. The script below allows you to extract text from the first page in a PDF with just a few lines of Python code:
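A minimal sketch of first-page extraction follows. It assumes a recent PyPDF2 release (the 2.x or 3.x series) that provides PdfReader; the function name is hypothetical.

```python
def extract_first_page_text(pdf_path):
    """Return the text of the first page of the PDF at `pdf_path`."""
    # Imported inside the function so the dependency is only needed when it runs.
    from PyPDF2 import PdfReader

    reader = PdfReader(pdf_path)
    if len(reader.pages) == 0:
        return ""
    # Scanned (image-only) pages usually yield no text at all.
    return reader.pages[0].extract_text() or ""
```

You could call it as extract_first_page_text("invoice.pdf"), where invoice.pdf stands in for your own file.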
Maybe you’d like to copy text from two PDF files and merge the text into a new PDF. You can do this using the code below:
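One possible approach is sketched below. Note that it copies whole pages (text and layout together) into the new file rather than re-typesetting extracted text, which is usually the simpler route. It assumes a PyPDF2 release that provides PdfReader and PdfWriter, and the function name is hypothetical.

```python
def merge_pdfs(input_paths, output_path):
    """Append every page of each input PDF, in order, to a new PDF at `output_path`."""
    # Imported inside the function so the dependency is only needed when it runs.
    from PyPDF2 import PdfReader, PdfWriter

    writer = PdfWriter()
    for path in input_paths:
        for page in PdfReader(path).pages:
            writer.add_page(page)
    with open(output_path, "wb") as output_file:
        writer.write(output_file)
```

For example, merge_pdfs(["first.pdf", "second.pdf"], "merged.pdf") would combine two placeholder files into one.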
These examples, along with several others, can be found here.
3. Text processing with Pandoc
Pandoc is a fully featured command-line tool that allows you to convert markup between different formats. This means you can use Pandoc to convert Markdown text directly to docx or MediaWiki markup to DocBook. Markup-format conversion allows you to process external content or user-submitted information without restricting the data to a single format. You can install the pandoc package with pip. Following are a few examples of what you can do with pandoc:
First, say you receive a markdown formatted document but need to convert it to PDF. pandoc makes this easy:
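The sketch below is one way to do it. It calls the pandoc command-line tool through subprocess instead of the pip package, so it relies only on documented CLI options; it assumes pandoc is on your PATH, and PDF output additionally needs a PDF engine such as a LaTeX distribution. The function names are hypothetical.

```python
import subprocess


def pandoc_command(source, target):
    """Build the pandoc invocation; the output format is inferred from the target's extension."""
    return ["pandoc", str(source), "-o", str(target)]


def convert(source, target):
    """Run pandoc and raise CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_command(source, target), check=True)
```

Calling convert("notes.md", "notes.pdf") would then write the PDF next to the placeholder source file.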
Or maybe you'd like to convert the markdown file to a json object. You can use the following script to do so:
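Here is a minimal sketch that again shells out to the pandoc CLI (assumed to be installed and on your PATH) and parses its JSON output into a Python dictionary. The function name is hypothetical.

```python
import json
import subprocess


def markdown_file_to_json(path):
    """Return pandoc's JSON representation of the Markdown file at `path` as a dict."""
    result = subprocess.run(
        ["pandoc", "--from", "markdown", "--to", "json", str(path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

The returned dictionary describes the document's structure (for example, its list of blocks), which you can inspect or transform before converting it to another format.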
4. Manipulate audio with Pydub
Pydub is a Python package that allows you to manipulate audio, including converting audio to various file formats like mp3. Additionally, Pydub can segment an audio file into millisecond samples, which may be particularly useful for machine learning tasks. Pydub can be installed by entering pip install pydub in your terminal.
Say you’re working with audio and need to ensure each file has the proper volume. You can use this script to automate the task:
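A sketch of one way to level out volume is shown below. It shifts each clip's average loudness to a target value; the target of -20 dBFS and the function names are assumptions, and reading or writing mp3 files with Pydub requires ffmpeg to be installed.

```python
def gain_to_target(current_dbfs, target_dbfs):
    """Return the gain in dB that moves a clip from its current loudness to the target."""
    return target_dbfs - current_dbfs


def normalize_file(source, target, target_dbfs=-20.0):
    """Load `source`, shift its average loudness to `target_dbfs`, and export it as mp3."""
    # Imported inside the function so the dependency is only needed when it runs.
    from pydub import AudioSegment

    sound = AudioSegment.from_file(source)
    # dBFS is the clip's average loudness relative to digital full scale.
    # A completely silent clip reports -inf, so it is exported unchanged.
    if sound.dBFS != float("-inf"):
        sound = sound.apply_gain(gain_to_target(sound.dBFS, target_dbfs))
    sound.export(target, format="mp3")
```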
Pydub has many additional features not covered in this example. You can find more of these in the Pydub GitHub repository.
5. Filter text
Matching and filtering text with regular expressions in Python is simple, and the benefits can be enormous. Say you have a system for batch processing sales-confirmation messages, and you need to extract a credit card number from the text of an email message. The script below can quickly find any credit card number that matches the pattern, allowing you to easily filter this information from any textual content:
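As a rough sketch, the pattern below matches 16-digit numbers written as four groups of four, optionally separated by spaces or hyphens. That format is an assumption: it will miss 15-digit cards and does not validate checksums, so treat it as a starting point rather than a complete detector.

```python
import re

# Four groups of four digits, optionally separated by a single space or hyphen.
CARD_PATTERN = re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b")


def find_card_numbers(text):
    """Return every substring of `text` that looks like a 16-digit card number."""
    return CARD_PATTERN.findall(text)


def redact_card_numbers(text, mask="[REDACTED]"):
    """Return `text` with every matching card number replaced by `mask`."""
    return CARD_PATTERN.sub(mask, text)


if __name__ == "__main__":
    message = "Order 1042 was paid with card 4111 1111 1111 1111. Thank you!"
    print(find_card_numbers(message))
    print(redact_card_numbers(message))
```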
6. Locate addresses
Locating an address can be useful if you're dealing with shipping or delivery logistics or for simple user-profiling tasks. To get started, install geocoder by running pip install geocoder in your terminal. The script below allows you to easily find the latitude and longitude coordinates for any address or to find an address from any set of coordinates:
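A minimal sketch of both lookups follows. It assumes the OpenStreetMap provider exposed as geocoder.osm, which needs no API key; other providers are called in a similar way but may need credentials. The function names are hypothetical, and both calls require network access.

```python
def address_to_coordinates(address):
    """Return [latitude, longitude] for `address`, or None if nothing was found."""
    # Imported inside the function so the dependency is only needed when it runs.
    import geocoder

    result = geocoder.osm(address)
    return result.latlng if result.ok else None


def coordinates_to_address(latitude, longitude):
    """Return the address nearest to the coordinates, or None if nothing was found."""
    import geocoder

    result = geocoder.osm([latitude, longitude], method="reverse")
    return result.address if result.ok else None
```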
7. Convert a CSV to Excel
You may find yourself frequently managing CSV file outputs from an analytics platform or a dataset. Opening the CSV file in Excel is relatively simple, but Python allows you to skip this manual step by automating the conversion. This also allows you to manipulate the CSV data before conversion into Excel, saving additional time and effort.
Start by downloading the openpyxl package using pip install openpyxl. Once openpyxl is installed, you can use the script below to convert a CSV file to an Excel spreadsheet:
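The conversion can be sketched as follows, using the standard csv module to read rows and openpyxl to write them. The function names are hypothetical, and every value is written as text; if you need numeric cells, convert the values before appending them.

```python
import csv


def read_csv_rows(csv_path):
    """Return the CSV file's rows as a list of lists of strings."""
    with open(csv_path, newline="", encoding="utf-8") as csv_file:
        return list(csv.reader(csv_file))


def csv_to_excel(csv_path, xlsx_path):
    """Copy every row of the CSV into the first worksheet of a new workbook."""
    # Imported inside the function so the dependency is only needed when it runs.
    from openpyxl import Workbook

    workbook = Workbook()
    sheet = workbook.active
    for row in read_csv_rows(csv_path):
        sheet.append(row)
    workbook.save(xlsx_path)
```

Keeping the read step separate gives you a natural place to filter or reshape rows before they reach the spreadsheet.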
8. Pattern match with regular expressions
Collecting data from unstructured sources can be a very tedious process. Similar to the filtering example above, Python allows for more detailed pattern matching using regular expressions. This is useful for categorizing textual information as part of a data-processing workflow or searching for specific keywords in user-submitted content. The built-in regular expression library is called re, and once you get the hang of the regular expression syntax, you can automate almost any pattern-matching script.
For example, maybe you’d like to match any email addresses found in the text you're processing. You can use this script to do so:
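Here is a minimal sketch. The pattern is a deliberately simple approximation of an email address (local part, @, domain, and a top-level domain of two or more letters); it is an assumption that this is strict enough for your data, since fully standards-compliant matching is far more involved.

```python
import re

# Local part, "@", domain labels, then a top-level domain of two or more letters.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(text):
    """Return every substring of `text` that looks like an email address."""
    return EMAIL_PATTERN.findall(text)


if __name__ == "__main__":
    sample = "Contact ada@example.com or support@mail.example.org."
    print(find_emails(sample))
```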
You can use this script if you need to match phone numbers in your text:
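A sketch for phone numbers is shown below. It assumes North American formatting (an optional +1 prefix, a three-digit area code, then seven digits, separated by spaces, dots, or hyphens); numbers from other regions would need a different pattern.

```python
import re

# Optional "+1" prefix, optional parentheses around the area code,
# and spaces, dots, or hyphens between the digit groups.
PHONE_PATTERN = re.compile(
    r"(?<!\d)(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}(?!\d)"
)


def find_phone_numbers(text):
    """Return every substring of `text` that looks like a North American phone number."""
    return PHONE_PATTERN.findall(text)


if __name__ == "__main__":
    sample = "Call (555) 123-4567 or 555.987.6543 before 5pm."
    print(find_phone_numbers(sample))
```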
9. Convert images to JPG
The .jpg format is perhaps the most popular image format currently in use. You may find yourself needing to convert images from other formats to generate project assets or to prepare input for image recognition. The pillow package for Python makes converting images to jpg a simple process:
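A minimal sketch of the conversion follows. The function names are hypothetical, and the output file is assumed to sit next to the source with a .jpg extension. Because JPEG has no alpha channel, any transparency in the source image is discarded.

```python
from pathlib import Path


def jpg_path_for(source):
    """Return the source path with its extension replaced by .jpg."""
    return Path(source).with_suffix(".jpg")


def convert_to_jpg(source):
    """Save a JPEG copy of the image at `source` and return the new file's path."""
    # Imported inside the function so the dependency is only needed when it runs.
    from PIL import Image

    target = jpg_path_for(source)
    with Image.open(source) as image:
        # JPEG cannot store transparency, so convert to plain RGB first.
        image.convert("RGB").save(target, "JPEG")
    return target
```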
10. Compress images
Sometimes you need to compress an image as part of the asset-creation pipeline for a new site or temporary landing page, and you may not want to do it manually or send the task to an external image-processing service. Using the pillow package, you can easily compress JPG images to reduce the file size while largely retaining image quality. Install it with pip install pillow.
The example below will reduce a 2.5 MB image to 293 KB:
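One way to do this is sketched below. The quality setting of 70 and the function names are assumptions rather than the settings behind the figures quoted above, so your results will vary with the image and the quality you choose.

```python
import os


def percent_saved(original_bytes, compressed_bytes):
    """Return the size reduction as a percentage, rounded to one decimal place."""
    return round(100 * (1 - compressed_bytes / original_bytes), 1)


def compress_jpg(source, target, quality=70):
    """Re-save the JPEG at `source` with a lower quality and report the savings."""
    # Imported inside the function so the dependency is only needed when it runs.
    from PIL import Image

    with Image.open(source) as image:
        # optimize=True asks the encoder to spend extra effort on a smaller file.
        image.save(target, "JPEG", optimize=True, quality=quality)
    return percent_saved(os.path.getsize(source), os.path.getsize(target))
```

Lower quality values produce smaller files with more visible artifacts, so it is worth checking a few outputs by eye before running this across a whole directory.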
11. Get content from Wikipedia
Wikipedia provides a fantastic general overview of many topics. This information can be used to add additional information to transactional emails, track changes on a particular set of articles, or to make training documentation or reports. Thankfully, it’s also extremely easy to gather information using the Wikipedia package for Python.
You can install the Wikipedia package using pip install wikipedia. When the installation is complete, you are ready to begin.
If you already know the specific page content you would like to pull, you can do so directly from that page:
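A minimal sketch of a direct page lookup is shown here. The function name is hypothetical, and auto_suggest is turned off on the assumption that you are passing an exact page title. The call needs network access and raises an error from the package if the title is missing or ambiguous.

```python
def fetch_page(title):
    """Return the title, URL, and summary of the Wikipedia page named `title`."""
    # Imported inside the function so the dependency is only needed when it runs.
    import wikipedia

    # auto_suggest=False looks up the title exactly as given.
    page = wikipedia.page(title, auto_suggest=False)
    return {"title": page.title, "url": page.url, "summary": page.summary}
```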
This package also allows you to search for pages matching specified text:
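Searching can be sketched in much the same way. The function name and the default of five results are assumptions, and the call needs network access.

```python
def search_titles(query, limit=5):
    """Return up to `limit` Wikipedia page titles that match `query`."""
    # Imported inside the function so the dependency is only needed when it runs.
    import wikipedia

    return wikipedia.search(query, results=limit)
```

Each returned title can then be used to fetch the full page content.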
12. Create and manage Heroku apps
Heroku is a popular platform for deploying and hosting web applications. As a managed service, it allows developers to easily set up, configure, maintain, and even delete applications through the Heroku API. You can also easily create or manage Heroku applications using Airplane runbooks since Airplane makes it extremely easy to hit APIs and trigger events.
The example below relies on the heroku3 package, which you can install using pip install heroku3. Note that you will need a Heroku API key to access the platform.
Connect to Heroku using Python:
Once you are connected to Heroku, you can list your available applications and select an application to manage directly:
Then, the following script will allow you to create an application as a part of an Airplane runbook:
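The three steps above (connecting, listing applications, and creating one) might look like the sketch below. The function names are hypothetical, and reading the key from an environment variable named HEROKU_API_KEY is an assumption; store the key wherever your secrets normally live.

```python
import os


def connect(api_key=None):
    """Return a heroku3 connection, reading the key from HEROKU_API_KEY by default."""
    # Imported inside the function so the dependency is only needed when it runs.
    import heroku3

    key = api_key or os.environ["HEROKU_API_KEY"]
    return heroku3.from_key(key)


def list_app_names(connection):
    """Return the names of every application visible to this API key."""
    return [app.name for app in connection.apps()]


def create_app(connection, name):
    """Create a new application called `name` and return it."""
    return connection.create_app(name=name)
```

With a connection in hand, list_app_names(connect()) shows what you can manage, and connect().apps()["my-app"] selects one application, where my-app is a placeholder name.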
In this article we outlined twelve simple Python scripts you can use to automate various manual tasks. We selected these scripts not only because of their simplicity and utility but also because of their impact compared to their relative size.
Best of all, you can leverage Airplane to manage and run these scripts seamlessly and safely. Airplane runbooks allow you to easily manage and deploy Python scripts and compose multi-step workflows. Airplane also lets you manage your scripts on your local machine, which allows you to use your dev environment to write and test scripts before deployment.
While we covered Python in this article, you can also build workflows in Airplane using Node.js or Docker to run shell scripts. Airplane also allows you to work with SQL and REST, and automates alerting via Slack and email. You can sign up for a free account to try it out.
Author: Alvin Charity