Python for Productivity and Automation: Building Efficient Daily Workflows

Practical Python scripts and tools for automating repetitive daily tasks — from file management and email to browser automation and scheduled jobs.

The best automation is the kind that runs quietly in the background while you focus on actual work. Python is uniquely well-suited to this: the standard library covers most of what you need, the ecosystem fills in the rest, and the language is readable enough that you can understand a script you wrote six months ago without annotation.

This post covers the patterns and tools I reach for most when automating daily workflows.

File and folder management

The pathlib module (Python 3.4+) has largely displaced os.path in new code because it is dramatically more readable:

from pathlib import Path
from datetime import datetime

downloads = Path.home() / "Downloads"

for file in downloads.iterdir():
    if file.is_file():
        # Move files older than 30 days to an archive folder
        age = datetime.now().timestamp() - file.stat().st_mtime
        if age > 30 * 86400:
            archive = downloads / "archive" / str(datetime.now().year)
            archive.mkdir(parents=True, exist_ok=True)
            file.rename(archive / file.name)

Pair this with watchdog for file system events and you can build real-time watchers — auto-organising a downloads folder, triggering a build when a config changes, syncing files between directories.
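
One caveat with the loop above: on POSIX systems file.rename silently replaces a file of the same name that is already in the archive. Below is a minimal sketch of a more defensive variant with a dry-run mode. The function name archive_old_files, its parameters and the numeric-suffix naming scheme are my own illustration, not part of pathlib.

```python
from __future__ import annotations

from datetime import datetime
from pathlib import Path


def archive_old_files(
    folder: Path, max_age_days: int = 30, dry_run: bool = False
) -> list[tuple[Path, Path]]:
    """Move files older than max_age_days into folder/archive/<year>.

    Returns the (source, destination) pairs that were, or would be, moved.
    """
    now = datetime.now()
    cutoff = now.timestamp() - max_age_days * 86400
    archive = folder / "archive" / str(now.year)
    moves = []

    # sorted() takes a snapshot of the listing, so creating the archive
    # folder inside the loop cannot affect iteration.
    for file in sorted(folder.iterdir()):
        if not file.is_file() or file.stat().st_mtime >= cutoff:
            continue

        # Pick a destination that does not clobber an existing file:
        # report.pdf, report-1.pdf, report-2.pdf, ...
        target = archive / file.name
        counter = 1
        while target.exists():
            target = archive / f"{file.stem}-{counter}{file.suffix}"
            counter += 1

        if not dry_run:
            archive.mkdir(parents=True, exist_ok=True)
            file.rename(target)
        moves.append((file, target))

    return moves
```

Calling archive_old_files(Path.home() / "Downloads", dry_run=True) first and printing the result is a cheap way to check what would move before anything actually does.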

Scheduling with APScheduler

cron is fine, but Python’s APScheduler library gives you scheduling inside a script with more flexibility:

from apscheduler.schedulers.blocking import BlockingScheduler

scheduler = BlockingScheduler()

@scheduler.scheduled_job("cron", hour=8, minute=0)
def morning_report():
    # Fetch your data, send yourself a summary email
    ...

@scheduler.scheduled_job("interval", minutes=30)
def sync_files():
    ...

scheduler.start()

This is useful for scripts that need to run at different cadences without the overhead of multiple cron entries.

Web scraping and browser automation

httpx (or Requests) plus BeautifulSoup handles the majority of scraping needs, meaning static pages, endpoints that return HTML, and anything else that doesn’t require JavaScript execution:

import httpx
from bs4 import BeautifulSoup

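resp = httpx.get("https://news.ycombinator.com/", timeout=10)
resp.raise_for_status()  # fail loudly on a 4xx/5xx instead of parsing an error page
soup = BeautifulSoup(resp.text, "html.parser")

for item in soup.select(".athing"):
    title = item.select_one(".titleline a")
    if title:  # skip rows whose markup doesn't match the selector
        print(title.text, title["href"])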

For JavaScript-heavy sites, Playwright is my current choice. Its built-in auto-waiting tends to make scripts faster and less flaky than their Selenium equivalents:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://example.com")
    page.fill("#search", "query")
    page.press("#search", "Enter")
    page.wait_for_selector(".results")
    print(page.inner_text(".results"))
    browser.close()

Email automation

Python’s smtplib works but is verbose. The yagmail library wraps Gmail sending into a few lines:

import yagmail

yag = yagmail.SMTP("you@gmail.com")
yag.send(
    to="recipient@example.com",
    subject="Daily summary",
    contents="Here's your report.",
    attachments=["report.pdf"],
)
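
For comparison, here is roughly what the same message looks like with only the standard library. This is a sketch rather than a drop-in replacement: build_message and send are names I made up, and the choice of SMTP_SSL (implicit TLS, usually port 465) is an assumption to check against your provider.

```python
import smtplib
from email.message import EmailMessage
from pathlib import Path


def build_message(sender, recipient, subject, body, attachments=()):
    """Assemble a plain-text email with optional file attachments."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)

    for path in map(Path, attachments):
        # Generic binary type; mimetypes.guess_type can supply a precise one.
        msg.add_attachment(
            path.read_bytes(),
            maintype="application",
            subtype="octet-stream",
            filename=path.name,
        )
    return msg


def send(msg, host, port, username, password):
    """Send msg over an implicit-TLS connection (assumed; see lead-in)."""
    with smtplib.SMTP_SSL(host, port) as smtp:
        smtp.login(username, password)
        smtp.send_message(msg)
```

Keeping message construction separate from sending means the first half can be exercised in tests without touching a mail server.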

For reading email (to trigger workflows from incoming messages), the imap-tools library, a wrapper around imaplib, is clean:

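import os
from pathlib import Path

from imap_tools import MailBox

invoices = Path("invoices")
invoices.mkdir(exist_ok=True)  # write_bytes fails if the folder is missing

# IMAP_PASSWORD is an assumed variable name; Gmail typically needs an app password here
with MailBox("imap.gmail.com").login("you@gmail.com", os.environ["IMAP_PASSWORD"]) as mb:
    for msg in mb.fetch(limit=10, reverse=True):
        if "invoice" in msg.subject.lower():
            for att in msg.attachments:
                (invoices / att.filename).write_bytes(att.payload)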

Working with spreadsheets

openpyxl for Excel, pandas for anything involving data manipulation:

import pandas as pd

# Read a CSV, drop non-positive amounts, write per-category totals
df = pd.read_csv("expenses.csv")
df["date"] = pd.to_datetime(df["date"])
df = df[df["amount"] > 0]
df.groupby("category")["amount"].sum().to_csv("summary.csv")

For Google Sheets, gspread with a service account gives you read/write access without OAuth flows for server-side scripts.

API integration patterns

Most productivity automation eventually involves calling an API. A few patterns worth standardising:

import httpx
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(stop=stop_after_attempt(3), wait=wait_exponential(min=1, max=10))
def fetch_data(url: str) -> dict:
    resp = httpx.get(url, timeout=10)
    resp.raise_for_status()
    return resp.json()

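Retries with exponential backoff, here via tenacity, are essential for any script hitting external APIs. Networks fail; build that assumption in from the start. Note that, as written, the decorator retries on any exception, including the HTTPStatusError that raise_for_status raises for a 4xx response; pass retry=retry_if_exception_type(...) to limit retries to transient failures.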
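
If the mechanics behind the decorator above feel opaque, a hand-rolled version makes them concrete. To be clear, this is not how tenacity is implemented; retry_with_backoff and all of its parameters are hypothetical names for illustration, and it omits jitter for the sake of readability.

```python
import time
from functools import wraps


def retry_with_backoff(
    attempts=3,
    base_delay=1.0,
    max_delay=10.0,
    retry_on=(ConnectionError, TimeoutError),
    sleep=time.sleep,
):
    """Retry the wrapped function on the given exception types.

    The wait doubles after each failure (base_delay, 2x, 4x, ...) and is
    capped at max_delay. `sleep` is injectable so tests need not wait.
    """

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except retry_on:
                    if attempt == attempts:
                        raise  # out of attempts: surface the last error
                    sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))

        return wrapper

    return decorator
```

Exceptions outside retry_on propagate immediately, which is usually what you want: retrying a malformed request only delays the failure.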

Building a personal CLI

Once you have a collection of scripts, typer turns them into a proper CLI with minimal boilerplate:

import typer

app = typer.Typer()

@app.command()
def organise(
    folder: str = typer.Argument(..., help="Folder to organise"),
    dry_run: bool = typer.Option(False, "--dry-run"),
):
    """Organise files in FOLDER by date."""
    ...

if __name__ == "__main__":
    app()

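Pair with pipx install -e . to make your personal tools available system-wide without polluting your global Python environment. This assumes the project declares a console-script entry point in its packaging metadata, since that is what pipx exposes on your PATH.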

Putting it together

The workflow I’ve settled on for new automation scripts:

  1. Prototype in a notebook — fast iteration, easy to inspect intermediate state
  2. Extract to a module — once the logic is stable, move it out of the notebook
  3. Add a CLI with typer — makes it usable from other scripts and cron
  4. Schedule with APScheduler or cron depending on whether it needs to stay resident
  5. Log to a file: use logging with a rotating file handler so you can debug failures after the fact
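
For the last step, a minimal sketch of the rotating-file setup using only the standard library. The helper name get_logger, the size limit and the backup count are illustrative defaults of mine, not recommendations from any particular source.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path


def get_logger(name, log_path, max_bytes=1_000_000, backup_count=3):
    """Return a logger that writes to log_path and rotates it by size."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)

    # Avoid stacking duplicate handlers if this is called more than once.
    if not logger.handlers:
        Path(log_path).parent.mkdir(parents=True, exist_ok=True)
        handler = RotatingFileHandler(
            log_path,
            maxBytes=max_bytes,
            backupCount=backup_count,
            encoding="utf-8",
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Inside a scheduled job, wrapping the body in try/except and calling logger.exception("job failed") in the handler records the full traceback, which is usually what you need when something breaks overnight.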

The scripts that have saved me the most time are the boring ones: the one that moves files, the one that sends a summary email, the one that checks whether a service is up and pages me if not. None of them are clever. All of them are running right now.