Python for Productivity and Automation: Building Efficient Daily Workflows

The best automation is the kind that runs quietly in the background while you focus on actual work. Python is uniquely well-suited to this: the standard library covers most of what you need, the ecosystem fills in the rest, and the language is readable enough that you can understand a script you wrote six months ago without annotation.

This post covers the patterns and tools I reach for most when automating daily workflows.

File and folder management

The pathlib module (Python 3.4+) replaced os.path for a reason — it’s dramatically more readable:

from pathlib import Path
from datetime import datetime

downloads = Path.home() / "Downloads"
# Archive folder for the current year, e.g. ~/Downloads/archive/<year>
archive = downloads / "archive" / str(datetime.now().year)
cutoff = datetime.now().timestamp() - 30 * 86400  # 30 days ago, in seconds

# list() snapshots the folder so renames don't disturb the iteration
for file in list(downloads.iterdir()):
    # Move files last modified more than 30 days ago into the archive
    if file.is_file() and file.stat().st_mtime < cutoff:
        archive.mkdir(parents=True, exist_ok=True)
        file.rename(archive / file.name)

Pair this with watchdog for file system events and you can build real-time watchers — auto-organising a downloads folder, triggering a build when a config changes, syncing files between directories.
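
One caveat for the archive loop above: on POSIX systems Path.rename silently replaces an existing file at the destination (on Windows it raises FileExistsError), so two downloads with the same name can clobber each other. Here is a minimal sketch of a collision-safe move with a dry-run flag. The helper names unique_target and safe_move are hypothetical, not part of pathlib:

```python
from pathlib import Path


def unique_target(dest_dir: Path, name: str) -> Path:
    """Return a path in dest_dir that does not exist yet.

    If dest_dir/name is taken, try name-1, name-2, ... before the suffix.
    """
    candidate = dest_dir / name
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    while candidate.exists():
        candidate = dest_dir / f"{stem}-{counter}{suffix}"
        counter += 1
    return candidate


def safe_move(src: Path, dest_dir: Path, dry_run: bool = False) -> Path:
    """Move src into dest_dir without overwriting anything.

    With dry_run=True nothing is touched; the would-be target is returned.
    """
    target = unique_target(dest_dir, src.name)
    if not dry_run:
        dest_dir.mkdir(parents=True, exist_ok=True)
        src.rename(target)
    return target
```

Running with dry_run=True first and printing the returned paths is a cheap way to check what a script would do before it touches a real folder.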

Scheduling with APScheduler

cron is fine, but Python’s APScheduler library gives you scheduling inside a script with more flexibility. The example below uses the APScheduler 3.x API:

from apscheduler.schedulers.blocking import BlockingScheduler

scheduler = BlockingScheduler()

@scheduler.scheduled_job("cron", hour=8, minute=0)
def morning_report():
    # Fetch your data, send yourself a summary email
    ...

@scheduler.scheduled_job("interval", minutes=30)
def sync_files():
    ...

scheduler.start()

This is useful when one script needs to run several jobs at different cadences without maintaining multiple cron entries. The trade-off is that scheduler.start() blocks, so the process has to stay running.

Web scraping and browser automation

httpx (or Requests) plus BeautifulSoup handles the majority of scraping needs: static pages, endpoints that return HTML, anything that doesn’t require JavaScript execution. The example uses httpx:

import httpx
from bs4 import BeautifulSoup

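resp = httpx.get("https://news.ycombinator.com/", timeout=10)
resp.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
soup = BeautifulSoup(resp.text, "html.parser")

for item in soup.select(".athing"):
    title = item.select_one(".titleline a")
    if title is not None:  # skip rows without a title link
        print(title.text, title["href"])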

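For JavaScript-heavy sites, Playwright is my current choice. Its built-in auto-waiting tends to make scripts less flaky than the equivalent Selenium code. After pip install playwright, run playwright install once to download the browsers: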

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://example.com")
    page.fill("#search", "query")
    page.press("#search", "Enter")
    page.wait_for_selector(".results")
    print(page.inner_text(".results"))
    browser.close()

Email automation

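Python’s smtplib works but is verbose. The yagmail library wraps Gmail sending into a few lines. With no password argument it looks the password up in the system keyring (yagmail.register stores it there once); for Gmail that should be an app password rather than your account password: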

import yagmail

yag = yagmail.SMTP("you@gmail.com")
yag.send(
    to="recipient@example.com",
    subject="Daily summary",
    contents="Here's your report.",
    attachments=["report.pdf"],
)
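
For comparison, here is roughly what the more verbose standard-library route looks like, which is handy when you’d rather not add a dependency. It is a sketch, not a drop-in replacement for yagmail: build_message and send are hypothetical helper names, and the host and port in the usage note are assumptions you should check against your provider:

```python
import mimetypes
import smtplib
from email.message import EmailMessage
from pathlib import Path


def build_message(sender: str, to: str, subject: str, body: str,
                  attachments=()) -> EmailMessage:
    """Build a plain-text email with optional file attachments."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = to
    msg["Subject"] = subject
    msg.set_content(body)
    for path in map(Path, attachments):
        # Guess the MIME type from the file name; fall back to generic binary
        ctype, _ = mimetypes.guess_type(path.name)
        maintype, subtype = (ctype or "application/octet-stream").split("/", 1)
        msg.add_attachment(path.read_bytes(), maintype=maintype,
                           subtype=subtype, filename=path.name)
    return msg


def send(msg: EmailMessage, host: str, port: int, user: str, password: str) -> None:
    """Send msg over an SSL-wrapped SMTP connection."""
    with smtplib.SMTP_SSL(host, port) as smtp:
        smtp.login(user, password)
        smtp.send_message(msg)
```

For Gmail, the call would look something like send(msg, "smtp.gmail.com", 465, "you@gmail.com", app_password), with app_password read from the environment or a keyring. Keeping message construction separate from sending also means you can inspect or test the message without touching the network.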

For reading email (to trigger workflows from incoming messages), the imap-tools library, a wrapper around the standard library’s imaplib, keeps things clean:

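from pathlib import Path

from imap_tools import MailBox

invoices = Path("invoices")
invoices.mkdir(exist_ok=True)  # write_bytes fails if the folder is missing

# Use an app password, loaded from the environment or a keyring rather than hard-coded
with MailBox("imap.gmail.com").login("you@gmail.com", "password") as mb:
    for msg in mb.fetch(limit=10, reverse=True):
        if "invoice" in msg.subject.lower():
            for att in msg.attachments:
                (invoices / att.filename).write_bytes(att.payload)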
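
One thing to watch when saving attachments: the filename comes straight from whoever sent the email, so a name like ../../something could land outside the target folder. Here is a small sketch of a guard. safe_attachment_path and the attachment.bin fallback are hypothetical names chosen for illustration:

```python
from pathlib import Path


def safe_attachment_path(directory: Path, filename: str) -> Path:
    """Return a path inside directory for an untrusted attachment filename."""
    # Treat backslashes as separators too, then keep only the last component
    name = Path(filename.replace("\\", "/")).name
    # Empty or dot-only names carry no usable file name; use a fallback
    if name in ("", ".", ".."):
        name = "attachment.bin"
    return directory / name
```

In the loop above, that would mean writing to safe_attachment_path(Path("invoices"), att.filename) instead of building the path directly from the filename.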

Working with spreadsheets

openpyxl for Excel, pandas for anything involving data manipulation:

import pandas as pd

# Read a CSV, drop non-positive amounts, write per-category totals
df = pd.read_csv("expenses.csv")
df["date"] = pd.to_datetime(df["date"])
df = df[df["amount"] > 0]
df.groupby("category")["amount"].sum().to_csv("summary.csv")

For Google Sheets, gspread with a service account gives you read/write access without OAuth flows for server-side scripts.

API integration patterns

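Most productivity automation eventually involves calling an API. A few patterns worth standardising: an explicit timeout, raise_for_status() so HTTP errors are not silently ignored, and automatic retries: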

import httpx
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(stop=stop_after_attempt(3), wait=wait_exponential(min=1, max=10))
def fetch_data(url: str) -> dict:
    resp = httpx.get(url, timeout=10)
    resp.raise_for_status()
    return resp.json()

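tenacity for retries with exponential backoff is essential for any script hitting external APIs. Networks fail; build that assumption in from the start. Note that by default the decorator retries on any exception, so a 404 surfaced by raise_for_status() gets retried as well; tenacity’s retry_if_exception_type can narrow that.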
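
To make the mechanics concrete, here is a hand-rolled sketch of exponential backoff using only the standard library. It illustrates the idea and is not how tenacity is implemented. retry_with_backoff and its parameters are hypothetical, and the sleep argument exists only so the delays can be observed without actually waiting:

```python
import time
from functools import wraps


def retry_with_backoff(attempts=3, base=1.0, cap=10.0, sleep=time.sleep):
    """Retry the wrapped function, doubling the wait after each failure.

    Waits base, 2*base, 4*base, ... seconds (never more than cap) between
    attempts, and re-raises the last exception once attempts run out.
    """
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == attempts:
                        raise  # out of attempts: surface the original error
                    sleep(min(cap, base * 2 ** (attempt - 1)))
        return wrapper
    return decorator
```

In real scripts I’d still reach for tenacity; the point of the sketch is only that the wait grows with each failure, which gives a struggling service room to recover instead of being hit again immediately.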

Building a personal CLI

Once you have a collection of scripts, typer turns them into a proper CLI with minimal boilerplate:

import typer

app = typer.Typer()

@app.command()
def organise(
    folder: str = typer.Argument(..., help="Folder to organise"),
    dry_run: bool = typer.Option(False, "--dry-run"),
):
    """Organise files in FOLDER by date."""
    ...

if __name__ == "__main__":
    app()

Pair with pipx install -e . to make your personal tools available system-wide without polluting your global Python environment.
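
pipx installs packages rather than loose scripts, so the folder needs a pyproject.toml that declares a console entry point. Here is a minimal sketch, assuming a hypothetical package named mytools whose cli.py holds the typer app, and setuptools as the build backend (any other backend works too):

```toml
# pyproject.toml (all names here are placeholders for illustration)
[build-system]
requires = ["setuptools>=64"]
build-backend = "setuptools.build_meta"

[project]
name = "mytools"
version = "0.1.0"
dependencies = ["typer"]

[project.scripts]
# Installs a `mytools` command that calls the `app` object in mytools/cli.py
mytools = "mytools.cli:app"
```

With that in place, pipx install -e . puts a mytools command on your PATH, and because the install is editable, changes to the source take effect without reinstalling.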

Putting it together

The workflow I’ve settled on for new automation scripts:

1. Prototype in a notebook: fast iteration, easy to inspect intermediate state
2. Extract to a module: once the logic is stable, move it out of the notebook
3. Add a CLI with typer: makes it usable from other scripts and cron
4. Schedule with APScheduler or cron, depending on whether it needs to stay resident
5. Log to a file: logging with a rotating file handler so you can debug failures after the fact
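
The logging step can be sketched with the standard library alone. The logger name "automation", the size limits, and the run_job wrapper are assumptions picked for illustration, not a fixed recipe:

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path


def setup_logging(log_path, max_bytes=1_000_000, backups=3) -> logging.Logger:
    """Log to log_path, rotating at max_bytes and keeping `backups` old files."""
    log_path = Path(log_path)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    handler = RotatingFileHandler(
        log_path, maxBytes=max_bytes, backupCount=backups, encoding="utf-8"
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger = logging.getLogger("automation")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate lines if setup runs twice
    logger.addHandler(handler)
    return logger


def run_job(logger: logging.Logger, job) -> bool:
    """Run job(), logging success or the full traceback on failure."""
    try:
        job()
    except Exception:
        logger.exception("job %s failed", job.__name__)
        return False
    logger.info("job %s finished", job.__name__)
    return True
```

Wrapping each scheduled function in something like run_job means a failure at 3 a.m. leaves a traceback in the log file instead of disappearing with the terminal session.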

The scripts that have saved me the most time are the boring ones: the one that moves files, the one that sends a summary email, the one that checks whether a service is up and pages me if not. None of them are clever. All of them are running right now.