The best automation is the kind that runs quietly in the background while you focus on actual work. Python is uniquely well-suited to this: the standard library covers most of what you need, the ecosystem fills in the rest, and the language is readable enough that you can understand a script you wrote six months ago without annotation.
This post covers the patterns and tools I reach for most when automating daily workflows.
File and folder management
The pathlib module (Python 3.4+) replaced os.path for a reason — it’s dramatically more readable:
from pathlib import Path
from datetime import datetime

downloads = Path.home() / "Downloads"
for file in downloads.iterdir():
    if file.is_file():
        # Move files older than 30 days to an archive folder
        age = datetime.now().timestamp() - file.stat().st_mtime
        if age > 30 * 86400:
            archive = downloads / "archive" / str(datetime.now().year)
            archive.mkdir(parents=True, exist_ok=True)
            file.rename(archive / file.name)
Pair this with watchdog for file system events and you can build real-time watchers — auto-organising a downloads folder, triggering a build when a config changes, syncing files between directories.
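As a rough sketch of the watcher idea, assuming the third-party watchdog package is installed: the handler below files each newly created file into a subfolder chosen by its extension. The `RULES` table, the `destination_for` helper, and the folder names are illustrative choices of mine, not anything watchdog prescribes.

```python
import time
from pathlib import Path

# Hypothetical routing table: file extension -> subfolder name.
RULES = {".pdf": "documents", ".png": "images", ".jpg": "images", ".zip": "archives"}


def destination_for(path: Path, root: Path) -> Path:
    """Return where `path` should live under `root` (pure, no I/O)."""
    subfolder = RULES.get(path.suffix.lower(), "misc")
    return root / subfolder / path.name


def watch(root: Path) -> None:
    """Block and move new files in `root` as they appear (needs watchdog)."""
    # Imported here so the routing logic above stays usable without watchdog.
    from watchdog.events import FileSystemEventHandler
    from watchdog.observers import Observer

    class Organiser(FileSystemEventHandler):
        def on_created(self, event):
            src = Path(event.src_path)
            if event.is_directory or not src.exists():
                return
            target = destination_for(src, root)
            target.parent.mkdir(parents=True, exist_ok=True)
            src.rename(target)

    observer = Observer()
    # recursive=False: files moved into subfolders do not trigger new events.
    observer.schedule(Organiser(), str(root), recursive=False)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
```

Calling `watch(Path.home() / "Downloads")` blocks until interrupted with Ctrl+C. One caveat worth testing on your own machine: the created event can fire while a download is still being written, so a real version would probably wait for the file to settle before moving it.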
Scheduling with APScheduler
cron is fine, but Python’s APScheduler library gives you scheduling inside a script with more flexibility:
from apscheduler.schedulers.blocking import BlockingScheduler

scheduler = BlockingScheduler()

@scheduler.scheduled_job("cron", hour=8, minute=0)
def morning_report():
    # Fetch your data, send yourself a summary email
    ...

@scheduler.scheduled_job("interval", minutes=30)
def sync_files():
    ...

scheduler.start()
This is useful for scripts that need to run at different cadences without the overhead of multiple cron entries.
Web scraping and browser automation
httpx (or Requests, which works the same way here) plus BeautifulSoup handles the majority of scraping needs (static pages, APIs that return HTML, anything that doesn’t require JavaScript execution):
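import httpx
from bs4 import BeautifulSoup

resp = httpx.get("https://news.ycombinator.com/")
resp.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
soup = BeautifulSoup(resp.text, "html.parser")
for item in soup.select(".athing"):
    title = item.select_one(".titleline a")
    if title is not None:  # skip rows without a title link
        print(title.text, title["href"])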
For JavaScript-heavy sites, Playwright is my current pick (in my experience, faster and less flaky than Selenium):
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://example.com")
    page.fill("#search", "query")
    page.press("#search", "Enter")
    page.wait_for_selector(".results")
    print(page.inner_text(".results"))
    browser.close()
Email automation
Python’s smtplib works but is verbose. The yagmail library wraps Gmail sending into a few lines:
import yagmail

yag = yagmail.SMTP("you@gmail.com")
yag.send(
    to="recipient@example.com",
    subject="Daily summary",
    contents="Here's your report.",
    attachments=["report.pdf"],
)
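For comparison, here is roughly what the standard-library route looks like. This is a minimal sketch, not a drop-in replacement for yagmail: `build_message` and `send` are hypothetical helper names, and `GMAIL_APP_PASSWORD` is an assumed environment variable holding a Gmail app password.

```python
import mimetypes
import os
import smtplib
from email.message import EmailMessage
from pathlib import Path


def build_message(sender, to, subject, body, attachments=()):
    """Assemble an email with optional file attachments (no network I/O)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = to
    msg["Subject"] = subject
    msg.set_content(body)
    for name in attachments:
        path = Path(name)
        # Guess the MIME type from the filename; fall back to generic binary.
        ctype, _ = mimetypes.guess_type(path.name)
        maintype, subtype = (ctype or "application/octet-stream").split("/", 1)
        msg.add_attachment(
            path.read_bytes(), maintype=maintype, subtype=subtype, filename=path.name
        )
    return msg


def send(msg: EmailMessage) -> None:
    """Send via Gmail's SMTP-over-SSL endpoint using an app password."""
    # GMAIL_APP_PASSWORD is an assumed variable name; set it in your shell.
    password = os.environ["GMAIL_APP_PASSWORD"]
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(msg["From"], password)
        server.send_message(msg)
```

Splitting message construction from sending keeps the part with no network access easy to check on its own, and makes the verbosity that yagmail hides fairly visible.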
For reading email (to trigger workflows from incoming messages), imaplib with the imap-tools wrapper is clean:
from pathlib import Path
from imap_tools import MailBox

invoices = Path("invoices")
invoices.mkdir(exist_ok=True)  # write_bytes fails if the folder is missing
with MailBox("imap.gmail.com").login("you@gmail.com", "password") as mb:
    for msg in mb.fetch(limit=10, reverse=True):
        if "invoice" in msg.subject.lower():
            for att in msg.attachments:
                (invoices / att.filename).write_bytes(att.payload)
Working with spreadsheets
openpyxl for Excel, pandas for anything involving data manipulation:
import pandas as pd
# Read a CSV, drop non-positive amounts, write per-category totals
df = pd.read_csv("expenses.csv")
df["date"] = pd.to_datetime(df["date"])
df = df[df["amount"] > 0]
df.groupby("category")["amount"].sum().to_csv("summary.csv")
For Google Sheets, gspread with a service account gives you read/write access without OAuth flows for server-side scripts.
API integration patterns
Most productivity automation eventually involves calling an API. A few patterns worth standardising:
import httpx
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(stop=stop_after_attempt(3), wait=wait_exponential(min=1, max=10))
def fetch_data(url: str) -> dict:
    resp = httpx.get(url, timeout=10)
    resp.raise_for_status()
    return resp.json()
tenacity for retries with exponential backoff is essential for any script hitting external APIs. Networks fail; build that assumption in from the start.
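To make the backoff behaviour concrete, here is a hand-rolled, standard-library sketch of the same idea. It is not tenacity's implementation and its exact wait schedule may differ from `wait_exponential`; the `retry_with_backoff` name and its parameters are hypothetical, and the `sleep` argument exists only so the delays can be observed without actually waiting.

```python
import functools
import time


def retry_with_backoff(attempts=3, base=1.0, cap=10.0, sleep=time.sleep):
    """Retry the wrapped function, doubling the wait after each failure."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == attempts:
                        raise  # out of attempts: surface the last error
                    # Wait base, 2*base, 4*base, ... capped at `cap` seconds.
                    delay = min(cap, base * 2 ** (attempt - 1))
                    sleep(delay)

        return wrapper

    return decorator
```

In practice I would still reach for tenacity, which also covers jitter, retrying only on specific exception types, and logging between attempts. The sketch is mainly useful for seeing what the decorator is doing on your behalf.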
Building a personal CLI
Once you have a collection of scripts, typer turns them into a proper CLI with minimal boilerplate:
import typer

app = typer.Typer()

@app.command()
def organise(
    folder: str = typer.Argument(..., help="Folder to organise"),
    dry_run: bool = typer.Option(False, "--dry-run"),
):
    """Organise files in FOLDER by date."""
    ...

if __name__ == "__main__":
    app()
Pair with pipx install -e . to make your personal tools available system-wide without polluting your global Python environment.
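The body of `organise` is elided above. One plausible shape for it, purely as an assumption on my part, is to plan the moves first and apply them only when `--dry-run` is off. The helper names `plan_moves` and `organise_folder`, and the choice of YYYY-MM subfolders, are hypothetical.

```python
from datetime import datetime
from pathlib import Path


def plan_moves(folder: Path) -> list[tuple[Path, Path]]:
    """Pair each file in `folder` with a YYYY-MM subfolder based on its mtime."""
    moves = []
    for file in sorted(folder.iterdir()):
        if file.is_file():
            month = datetime.fromtimestamp(file.stat().st_mtime).strftime("%Y-%m")
            moves.append((file, folder / month / file.name))
    return moves


def organise_folder(folder: Path, dry_run: bool = False) -> list[tuple[Path, Path]]:
    """Print the planned moves and, unless dry_run is set, carry them out."""
    moves = plan_moves(folder)
    for src, dst in moves:
        print(f"{src.name} -> {dst.relative_to(folder)}")
        if not dry_run:
            dst.parent.mkdir(parents=True, exist_ok=True)
            src.rename(dst)
    return moves
```

The typer command would then reduce to a one-line call such as `organise_folder(Path(folder), dry_run=dry_run)`, which keeps the CLI layer thin and the logic importable from other scripts.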
Putting it together
The workflow I’ve settled on for new automation scripts:
- Prototype in a notebook: fast iteration, easy to inspect intermediate state
- Extract to a module: once the logic is stable, move it out of the notebook
- Add a CLI with typer: makes it usable from other scripts and cron
- Schedule with APScheduler or cron, depending on whether it needs to stay resident
- Log to a file: logging with a rotating file handler so you can debug failures after the fact
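The logging step in the list above can be sketched with the standard library alone. The `get_logger` and `run_job` helpers, the 1 MB size limit, and the three backups are illustrative assumptions; tune them to how chatty your scripts are.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path


def get_logger(name: str, log_file: Path) -> logging.Logger:
    """Return a logger that writes to `log_file`, rotating at about 1 MB."""
    logger = logging.getLogger(name)
    if logger.handlers:  # already configured; avoid duplicate log lines
        return logger
    log_file.parent.mkdir(parents=True, exist_ok=True)
    handler = RotatingFileHandler(
        log_file, maxBytes=1_000_000, backupCount=3, encoding="utf-8"
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


def run_job(logger: logging.Logger, job) -> bool:
    """Run `job`, logging success or the full traceback on failure."""
    try:
        job()
    except Exception:
        logger.exception("job %s failed", job.__name__)
        return False
    logger.info("job %s finished", job.__name__)
    return True
```

Wrapping each scheduled function in something like `run_job` means a failure at 3 a.m. leaves a traceback in the log file rather than disappearing with the process output.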
The scripts that have saved me the most time are the boring ones: the one that moves files, the one that sends a summary email, the one that checks whether a service is up and pages me if not. None of them are clever. All of them are running right now.