12 useful Python scripts for developers

Jun 8, 2022
Madhura Kumar

It’s no secret that a developer’s day-to-day work often involves creating and maintaining small utility scripts. These scripts are the glue that connects various aspects of your system or build environment. While these Python scripts may not be complex, maintaining them can become a tedious chore that could cost you time and money.

One way to lighten the maintenance load is to use script automation. Rather than spending your time running scripts (often manually), script automation allows you to schedule these tasks to run on a particular timetable or be triggered in response to certain events.

This article will cover twelve Python scripts that were selected for their general utility, ease of use, and positive impact on your workload. They range in complexity from easy to intermediate and focus on text processing and file management. Specifically, we'll walk through the following use cases:

  1. Create strong random passwords
  2. Extract text from a PDF
  3. Text processing with Pandoc
  4. Manipulate audio with Pydub
  5. Filter text
  6. Locate addresses
  7. Convert a CSV to Excel
  8. Pattern match with regular expressions
  9. Convert images to JPG
  10. Compress images
  11. Get content from Wikipedia
  12. Create and manage Heroku apps

These scripts can be dropped into just about any workflow or run as part of an automated playbook using a tool like Airplane runbooks.

Using Airplane to manage and execute Python scripts

Airplane is a developer platform to transform APIs, SQL queries, and scripts into internal applications in minutes. The platform provides a central location to store your Python scripts as well as to run, manage, and share them securely.

You can also use Airplane schedules as a substitute for cron and other job schedulers. Airplane provides permissions, audit logs, approval flows, and much more out of the box.

Before we jump into some useful Python scripts, let's quickly walk through how you can use Airplane to manage and execute them.

You can get started by signing up for a free Airplane account and heading to: Library > New task. A task represents a single step or operation such as 'hit X endpoint' or 'run X script'.

There are a number of different options for creating a task. Let's select Python. Note that you can create your task from the UI as described below but you can also create and define your task fully in code from the CLI.

Add a task name and description and click Continue.

We'll first create our task and then deploy our code. Click Create task to finish adding the task to Airplane.

Next, we'll use the Airplane CLI tool to write a script for this task. This tool will allow you to create and manage scripts locally on your machine.

Be sure to wrap your code in a function called main. Dependencies can be included in a requirements.txt file saved in your script’s parent directory.
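As a rough sketch of that layout, a script for the regex example might look like the following. It assumes the task runner calls main with a dictionary of task parameters; the file name regex_filter.py matches the example used in this walkthrough, while the parameter name text and the returned dictionary are hypothetical choices made for illustration.

```python
# regex_filter.py
import re

# Four groups of four digits, each optionally followed by one whitespace character
CARD_PATTERN = re.compile(r"(?:\d{4}\s?){4}")


def main(params):
    # Assumption: parameters arrive as a dictionary, and "text" is a
    # parameter name invented for this sketch.
    text = params.get("text", "")
    found = CARD_PATTERN.search(text) is not None
    if found:
        print("Found a credit card number!")
    else:
        print("No credit card numbers present in input")
    return {"found": found}
```

Keeping the logic inside main also makes the script easy to exercise locally by calling main with a plain dictionary.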

Before deploying our Python script to Airplane, we'll create a yaml file containing the name of the script so it's easily identifiable.

For example, if you are filtering text with regular expressions, you might title this file regex_filter.task.yaml. An explanation of the task definition can be found in the Airplane task definition docs.

Your completed yaml file should look similar to the example below:

slug: regex_filter
name: "Regex Filter"
description: Filter text using Regular Expressions
python:
  entrypoint: regex_filter.py

Once your Python script and yaml task definition file have been created, you can deploy to Airplane using the CLI: airplane deploy regex_filter.task.yaml

Your Python task is now ready to run!

You can find more details on getting started in the Airplane developer docs for Python, quickstart guide, and guide to getting started with runbooks.

Now that we know how to use Airplane to manage and execute Python scripts, let's walk through some of the scripts themselves.

Python scripts for developers to implement

Let's dive into twelve Python utility scripts that we can leverage to make our lives easier. The code samples from this article can be found in this GitHub repository.

1. Create strong random passwords

There are many reasons to create strong random passwords, from onboarding new users to providing a password-reset workflow to creating a new password when rotating credentials. You can easily use a dependency-free Python script to automate this process:

# Generate Strong Random Passwords
import secrets
import string
# This script will generate an 18 character password
word_length = 18
# Build the pool of letters, digits, and some punctuation
components = [string.ascii_letters, string.digits, "!@#$%&"]
# flatten the components into a single string of characters
chars = "".join(components)
def generate_password():
  # Choose 'word_length' characters from 'chars' using the secrets module,
  # which draws from a cryptographically secure source (unlike random)
  password = [secrets.choice(chars) for _ in range(word_length)]
  # Return the composed password as a string
  return "".join(password)
# Output generated password
print(generate_password())
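A common follow-up requirement is that every password contains at least one letter, one digit, and one symbol. The sketch below is one way to enforce that with the standard library's secrets module, which is intended for security-sensitive randomness. The function name generate_policy_password and its length parameter are illustrative and not part of the script above.

```python
import secrets
import string

SYMBOLS = "!@#$%&"
GROUPS = [string.ascii_letters, string.digits, SYMBOLS]


def generate_policy_password(length=18):
    """Return a password with at least one character from each group."""
    if length < len(GROUPS):
        raise ValueError("length is too short to include every group")
    # Start with one guaranteed character per group
    chars = [secrets.choice(group) for group in GROUPS]
    # Fill the remaining positions from the combined alphabet
    alphabet = "".join(GROUPS)
    chars += [secrets.choice(alphabet) for _ in range(length - len(chars))]
    # Shuffle so the guaranteed characters are not always at the front
    secrets.SystemRandom().shuffle(chars)
    return "".join(chars)


print(generate_policy_password())
```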

2. Extract text from a PDF

Python can also be used to easily extract text from PDFs using the PyPDF2 package. Getting text from a PDF file proves useful for data mining, invoice reconciliation, or report generation, and the extraction process can be automated in just a few lines of code. You can run pip install PyPDF2 in your terminal to install the package. Below are a few examples of what you can achieve using PyPDF2:

Say you receive a multipage PDF file but you only need the first page. The script below allows you to extract text from the first page in a PDF with just a few lines of Python code:

# import the PyPDF2 module
import PyPDF2
# put 'example.pdf' in the working directory
# and open it in read-binary mode
with open('example.pdf', 'rb') as pdf_file:
    # PdfReader is the current name for the deprecated
    # PdfFileReader class (PyPDF2 2.0 and later)
    pdf_reader = PyPDF2.PdfReader(pdf_file)
    # to print the total number of pages in the PDF
    # print(len(pdf_reader.pages))
    # pages are stored in a list, so index 0 is the first page
    page = pdf_reader.pages[0]
    # extract the text of the page
    text = page.extract_text()
# print the extracted text
print(text)

Maybe you’d like to combine the pages of two PDF files into a new PDF. You can do this using the code below:

import PyPDF2
# open the two source PDFs and the output file
with open('example.pdf', 'rb') as pdf1_file, \
     open('example2.pdf', 'rb') as pdf2_file, \
     open('example3.pdf', 'wb') as output_file:
    # PdfWriter and PdfReader are the current names for the deprecated
    # PdfFileWriter and PdfFileReader classes (PyPDF2 2.0 and later)
    pdf_writer = PyPDF2.PdfWriter()
    # copy every page from each source, in order
    for source in (pdf1_file, pdf2_file):
        for page in PyPDF2.PdfReader(source).pages:
            pdf_writer.add_page(page)
    # write the combined pages to 'example3.pdf'
    pdf_writer.write(output_file)

These examples, along with several others, can be found here.

3. Text processing with Pandoc

Pandoc is a fully featured command-line tool that allows you to convert markup between different formats. This means you can use Pandoc to convert Markdown text directly to docx or MediaWiki markup to DocBook. Markup-format conversion allows you to process external content or user-submitted information without restricting the data to a single format. You can install the pandoc package with pip; it wraps the Pandoc command-line tool, which needs to be installed separately. The following are a few examples of what you can do with pandoc.

First, say you receive a markdown formatted document but need to convert it to PDF. pandoc makes this easy:

import pandoc

# read the Markdown file into a pandoc document before writing it out
doc = pandoc.read(file="example.md")
pandoc.write(doc, file="example.pdf", format="pdf")

Or maybe you'd like to convert a Markdown string to a JSON file. You can use the following script to do so:

import pandoc
md_string = """
# Hello from Markdown

**This is a markdown string**
"""
input_string = pandoc.read(md_string)
pandoc.write(input_string, format="json", file="md.json")

You can find many other example functions here, or check out the pandoc package documentation for more information.

4. Manipulate audio with Pydub

Pydub is a Python package that allows you to manipulate audio, including converting audio to various file formats like wav or mp3. Additionally, Pydub can segment an audio file into millisecond samples, which may be particularly useful for machine learning tasks. Pydub can be installed by entering pip install pydub in your terminal.

Say you’re working with audio and need to ensure each file has the proper volume. The script below shows the basic operation: it raises the volume of a file by 18 dB and exports the result as a new file:

from pydub import AudioSegment

audio_file = AudioSegment.from_mp3("example.mp3")
louder_audio_file = audio_file + 18
louder_audio_file.export("example_louder.mp3", format="mp3")
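Pydub treats the + 18 above as a gain in decibels, which is a logarithmic unit. As a quick arithmetic check, assuming the usual 20·log10 convention for amplitude, the helper below converts a decibel gain into the factor by which sample amplitudes are multiplied. The function name db_to_amplitude_ratio is made up for this sketch and is not part of Pydub.

```python
def db_to_amplitude_ratio(gain_db):
    """Convert a gain in decibels to a linear amplitude multiplier."""
    return 10 ** (gain_db / 20)


for gain in (6, 18, -6):
    print(f"{gain:+d} dB -> x{db_to_amplitude_ratio(gain):.2f}")
# +6 dB -> x2.00
# +18 dB -> x7.94
# -6 dB -> x0.50
```

An 18 dB boost multiplies amplitudes by roughly eight, so recordings that are already loud may clip after this step.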

Pydub has many additional features not covered in this example. You can find more of these in the Pydub GitHub repository.

5. Filter text

Matching and filtering text with regular expressions in Python is simple, and the benefits can be enormous. Say you have a system for batch processing sales-confirmation messages, and you need to detect credit card numbers in the text of an email message. The script below finds the first run of sixteen digits that matches the pattern, which lets you flag any textual content that contains this information:

# Filter Text
# Import re module
import re
# Take any string data
string = """a string we are using to filter specific items.
perhaps we would like to match credit card numbers
mistakenly entered into the user input. 4444 3232 1010 8989
and perhaps another? 9191 0232 9999 1111"""

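# Define the searching pattern: sixteen digits, each optionally followed by
# whitespace (a raw string keeps \s from being read as a string escape)
pattern = r'(([0-9](\s+)?){4}){4}'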

# match the pattern with input value
found = re.search(pattern, string)
print(found)
# Print message based on the return value
if found:
  print("Found a credit card number!")
else:
  print("No credit card numbers present in input")
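Note that re.search stops at the first match. If the goal is to remove every candidate number from the text, re.sub can mask them all in one pass. This is a minimal sketch: the pattern is a simplified 16-digit matcher, and the mask_card_numbers name and placeholder text are invented for illustration.

```python
import re

# Four groups of four digits, with an optional single space or hyphen between
# groups; \b keeps a match from starting or ending inside a longer number
CARD_PATTERN = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")


def mask_card_numbers(text, placeholder="[REDACTED]"):
    """Replace every 16-digit card-like number in text with a placeholder."""
    return CARD_PATTERN.sub(placeholder, text)


sample = "first 4444 3232 1010 8989 and another 9191-0232-9999-1111"
print(mask_card_numbers(sample))
# first [REDACTED] and another [REDACTED]
```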

6. Locate addresses

Locating an address can be useful if you're dealing with shipping or delivery logistics or for simple user-profiling tasks. To get started, install geocoder by running pip install geocoder in your terminal. The script below allows you to easily find the latitude and longitude coordinates for any address or to find an address from any set of coordinates:

import geocoder
address = "1600 Pennsylvania Ave NW, Washington DC USA"
geo = geocoder.arcgis(address)
print(geo.latlng)
# output: [38.89767510765125, -77.03654699820865]

# If we want to retrieve the location from a set of coordinates
# perform a reverse query.
geocoder.arcgis([38.89767510765125, -77.03654699820865], method="reverse")

# output: <[OK] Arcgis - Reverse [White House]>

7. Convert a CSV to Excel

You may find yourself frequently managing CSV file outputs from an analytics platform or a dataset. Opening the CSV file in Excel is relatively simple, but Python allows you to skip this manual step by automating the conversion. This also allows you to manipulate the CSV data before conversion into Excel, saving additional time and effort.

Start by installing the openpyxl package using pip install openpyxl. Once openpyxl is installed, you can use the script below to write the contents of a CSV file into a sheet of an existing Excel workbook:

#!python3
# -*- coding: utf-8 -*-

import openpyxl
import sys

#inputs
print("This programme writes the data in any Comma-separated value file (such as: .csv or .data) to an Excel file.")
print("The input and output files must be in the same directory as the python file for the programme to work.\n")

csv_name = input("Name of the CSV file for input (with the extension): ")
sep = input("Separator of the CSV file: ")
excel_name = input("Name of the excel file for output (with the extension): ")
sheet_name = input("Name of the excel sheet for output: ")

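#opening the files (the workbook and sheet must already exist)
try:
    wb = openpyxl.load_workbook(excel_name)
    # wb[...] replaces the deprecated get_sheet_by_name()
    sheet = wb[sheet_name]

    file = open(csv_name,"r",encoding = "utf-8")
except Exception as err:
    print("File Error!", err)
    sys.exit(1)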

#rows and columns
row = 1
column = 1

#for each line in the file
for line in file:
    #remove the trailing \n (if any) from the line and make it a list with the separator
    line = line.rstrip("\n")
    line = line.split(sep)

    #for each data in the line
    for data in line:
        #write the data to the cell
        sheet.cell(row,column).value = data
        #after each data column number increases by 1
        column += 1

    #to write the next line column number is set to 1 and row number is increased by 1
    column = 1
    row += 1

#saving the excel file and closing the csv file
wb.save(excel_name)
file.close()

The script above is a part of the Awesome Python Scripts GitHub repository.
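Splitting each line on the separator works for simple files, but it breaks when a quoted field contains the separator itself. Python's built-in csv module handles that case. The sketch below covers only the parsing step and uses an in-memory string as sample data; parse_rows is a hypothetical helper whose output could feed the cell-writing loop in the script above.

```python
import csv
import io


def parse_rows(csv_text, sep=","):
    """Parse CSV text into a list of rows, honoring quoted fields."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=sep)
    return [row for row in reader]


sample = 'name,city\n"Smith, Jane",Washington\n'
for row in parse_rows(sample):
    print(row)
# ['name', 'city']
# ['Smith, Jane', 'Washington']
```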

8. Pattern match with regular expressions

Collecting data from unstructured sources can be a very tedious process. Similar to the filtering example above, Python allows for more detailed pattern matching using regular expressions. This is useful for categorizing textual information as part of a data-processing workflow or searching for specific keywords in user-submitted content. The built-in regular expression library is called re, and once you get the hang of the regular expression syntax, you can automate almost any pattern-matching script.

For example, maybe you’d like to match any email addresses found in the text you're processing. You can use this script to do so:

import re
emailRegex = re.compile(r'''(
    [a-zA-Z0-9._%+-]+      # username
    @                      # @ symbol
    [a-zA-Z0-9.-]+         # domain name
    (\.[a-zA-Z]{2,4})      # dot-something
    )''', re.VERBOSE)

# store matched addresses in an array called "matches"
matches = []
text = """
An example text containing an email address, such as user@example.com or something like hello@example.com
"""

# search the text and append matched addresses to the "matches" array
for groups in emailRegex.findall(text):
    matches.append(groups[0])

# matches => ['user@example.com', 'hello@example.com']

You can use this script if you need to match phone numbers in your text:

import re

text = """
Here is an example string containing various numbers, some 
of which are not phone numbers.

Business Address
4553-A First Street
Washington, DC 20001

202-555-6473
301-555-8118
"""

phoneRegex = re.compile(r'''(
    (\d{3}|\(\d{3}\))?                # area code
    (\s|-|\.)?                        # separator
    (\d{3})                           # first 3 digits
    (\s|-|\.)                         # separator
    (\d{4})                           # last 4 digits
    (\s*(ext|x|ext\.)\s*(\d{2,5}))?   # extension
    )''', re.VERBOSE)

matches = []
for numbers in phoneRegex.findall(text):
  matches.append(numbers[0])

# matches => ['202-555-6473', '301-555-8118']
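When you need the individual parts of a match rather than the whole string, named groups combined with finditer keep the code readable. The following is a sketch with a deliberately simplified pattern (area code required, hyphen separators only); the group names area, prefix, and line are chosen for this example.

```python
import re

# Simplified pattern: three digits, three digits, four digits, hyphen-separated
PHONE = re.compile(r"\b(?P<area>\d{3})-(?P<prefix>\d{3})-(?P<line>\d{4})\b")

text = "Call 202-555-6473 or 301-555-8118, not 4553-A."

# Each match object exposes its parts by name
for match in PHONE.finditer(text):
    print(match.group("area"), match.group("prefix"), match.group("line"))

area_codes = [m.group("area") for m in PHONE.finditer(text)]
# area_codes => ['202', '301']
```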

Automate the Boring Stuff has a great chapter about setting up and using regular expressions in Python.

9. Convert images to JPG

The .jpg format is perhaps the most popular image format currently in use. You may need to convert images from other formats when generating project assets or preparing input for image recognition. The Pillow package for Python makes converting images to JPG a simple process:

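# requires the Pillow module used as `PIL` below
from PIL import Image
import os
file = "toJPG.png"
# split off the extension so names containing extra dots still work
base_name, _ = os.path.splitext(file)
img = Image.open(file)
new_name = base_name + ".jpg"
# JPG has no alpha channel, so convert to RGB before saving
converted_img = img.convert('RGB')
converted_img.save(new_name)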
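To apply the same conversion to a whole folder, one option is to build the list of output names first and convert afterwards. The sketch below assumes a flat directory of PNG files; jpg_targets and convert_all are hypothetical helper names, and Pillow is imported inside convert_all so the path logic can be checked without it.

```python
from pathlib import Path


def jpg_targets(directory, pattern="*.png"):
    """Map each matching source image to the .jpg path it would be saved as."""
    sources = sorted(Path(directory).glob(pattern))
    return {src: src.with_suffix(".jpg") for src in sources}


def convert_all(directory):
    """Convert every PNG in directory to a JPG saved next to the original."""
    # Imported here so jpg_targets works even when Pillow is not installed
    from PIL import Image

    for src, dst in jpg_targets(directory).items():
        with Image.open(src) as img:
            img.convert("RGB").save(dst)
```

Calling jpg_targets on its own is a cheap way to preview which files would be written before running the conversion.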

10. Compress images

Sometimes you need to compress images as part of the asset-creation pipeline for a new site or temporary landing page, and you may not want to do so manually or send the task to an external image-processing service. Using the pillow package, you can compress JPG images to reduce the file size; the quality setting controls the tradeoff between size and visual fidelity. Install pillow using pip install pillow.

The example below will reduce a 2.5 MB image to 293 KB:

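# the pillow package can be imported as PIL
from PIL import Image
file_path = "image_uncompressed.jpg"
img = Image.open(file_path)
# img.size is a (width, height) tuple
width, height = img.size
# Image.LANCZOS is the current name for the former Image.ANTIALIAS filter
compressed = img.resize((width, height), Image.LANCZOS)
# a low quality setting trades visual fidelity for a much smaller file
compressed.save("image_compressed.jpg", optimize=True, quality=9)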

11. Get content from Wikipedia

Wikipedia provides a fantastic general overview of many topics. This information can be used to enrich transactional emails, track changes on a particular set of articles, or build training documentation or reports. Thankfully, it’s also extremely easy to gather information using the Wikipedia package for Python.

You can install the Wikipedia package using pip install wikipedia. When the installation is complete, you are ready to begin.

If you already know the specific page content you would like to pull, you can do so directly from that page:

import wikipedia
page_content = wikipedia.page("parsec").content
# outputs the text content of the "Parsec" page on wikipedia
print(page_content)

This package also allows you to search for pages matching specified text:

import wikipedia
search_results = wikipedia.search("arc second")
# outputs an array of pages matching the search term
print(search_results)

12. Create and manage Heroku apps

Heroku is a popular platform for deploying and hosting web applications. As a managed service, it allows developers to easily set up, configure, maintain, and even delete applications through the Heroku API. You can also easily create or manage Heroku applications using Airplane runbooks since Airplane makes it extremely easy to hit APIs and trigger events.

The example below relies on the heroku3 package, which you can install using pip install heroku3. Note that you will need a Heroku API key to access the platform.

Connect to Heroku using Python:

import heroku3

# Be sure to update the api_key variable with your key
api_key = "12345-ABCDE-67890-FGHIJ"
client = heroku3.from_key(api_key)
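Hardcoding a key is fine for a quick experiment, but keys committed to source control tend to leak. One common alternative is to read the key from an environment variable. The variable name HEROKU_API_KEY and the load_api_key helper below are assumptions made for this sketch; the returned value would then be passed to heroku3.from_key as in the example above.

```python
import os


def load_api_key(var_name="HEROKU_API_KEY"):
    """Read the API key from the environment, failing loudly if it is missing."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key
```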

Once you are connected to Heroku, you can list your available applications and select an application to manage directly:

import heroku3
api_key = "12345-ABCDE-67890-FGHIJ"
client = heroku3.from_key(api_key)

print(client.apps())

# the above command prints a list of available applications
# [<app 'airplanedev-heroku-example - ed544e41-601d-4d1b-a327-9a1945b743cb'>, <app 'notes-app - 5b3d6aab-cde2-4527-9ecc-62bdee08ed4a'>, …] 

# use the following command to connect to a specific application
app = client.apps()["airplanedev-heroku-example"]

# add a config variable for your application
config = app.config()
config["test_var"] = "value"

# enable or disable maintenance mode
# enable
app.enable_maintenance_mode()

# disable
app.disable_maintenance_mode()

# restarting your application is simple
app.restart()

Then, the following script will allow you to create an application as a part of an Airplane runbook:

import heroku3
api_key = "12345-ABCDE-67890-FABCD"
client = heroku3.from_key(api_key)

client.create_app("app-created-with-airplane")

After creating the application, you can manage it directly with the heroku3 package. Head to the Heroku3.py GitHub repository for a full list of options.

Get started

In this article we outlined twelve simple Python scripts you can use to automate various manual tasks. We selected these scripts not only because of their simplicity and utility but also because of their impact compared to their relative size.

Best of all, you can leverage Airplane to manage and run these scripts seamlessly and safely. Airplane runbooks allow you to easily manage and deploy Python scripts and compose multi-step workflows. Airplane also lets you manage your scripts on your local machine, which allows you to use your dev environment to write and test scripts before deployment.

While we covered Python in this article, you can also build workflows in Airplane using Node.js or Docker to run shell scripts. Airplane also allows you to work with SQL and REST, and automates alerting via Slack and email. You can sign up for a free account to try it out.


Author: Alvin Charity

Alvin Charity is a writer, musician, sound artist, audio / video editor, and self-taught Javascript programmer based in Washington, DC.
