AI is well-suited to format conversions (CSV ↔ JSON ↔ SQL) and migration scripts because the transformation rules are mechanical. The trick is giving it enough context to be correct and safe: schemas, a sample row, and explicit safety requirements.
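As a rough illustration of what "enough context" can look like, the sketch below assembles a prompt from the three ingredients named above. The helper `build_conversion_prompt`, its parameters, and the sample row are hypothetical and exist only for this example; the exact wording of a good prompt will depend on the model and the task.

```python
def build_conversion_prompt(source_schema: str, target_schema: str,
                            sample_row: str, safety_rules: list[str]) -> str:
    """Assemble a prompt carrying schemas, a sample row, and safety rules.

    Hypothetical helper for illustration; not part of any library.
    """
    rules = "\n".join(f"- {rule}" for rule in safety_rules)
    return (
        "Write a Python script that converts the CSV below into SQL INSERT statements.\n\n"
        f"Source CSV columns: {source_schema}\n"
        f"Target table: {target_schema}\n"
        f"Sample row: {sample_row}\n\n"
        f"Safety requirements:\n{rules}\n"
    )


prompt = build_conversion_prompt(
    source_schema="id,email,signup_date (signup_date may be blank)",
    target_schema="users(id INT, email TEXT NOT NULL, signup_date DATE NULL)",
    sample_row="42,ada@example.com,2024-01-15",  # made-up sample data
    safety_rules=[
        "Skip rows with an empty email",
        "Escape single quotes in string values",
        "Make the script idempotent (safe to re-run)",
    ],
)
print(prompt)
```

The script that follows is the kind of result such a prompt is aiming for.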
import csv
from datetime import date

# Source CSV: id,email,signup_date (signup_date may be blank)
# Target table: users(id INT, email TEXT NOT NULL, signup_date DATE NULL)
def csv_to_sql(path: str) -> list[str]:
    statements = []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            email = row["email"].strip()
            if not email:  # validate: skip invalid rows, don't insert garbage
                continue
            raw_date = row["signup_date"].strip()
            # validate: int() and fromisoformat() raise ValueError on malformed input,
            # so a bad id or date stops the run instead of reaching the SQL text
            user_id = int(row["id"])
            signup_date = date.fromisoformat(raw_date) if raw_date else None
            email_sql = email.replace("'", "''")  # escape quotes to avoid broken SQL / injection
            date_sql = f"'{signup_date.isoformat()}'" if signup_date else "NULL"
            # ON CONFLICT makes it idempotent: re-running won't create duplicates
            statements.append(
                f"INSERT INTO users (id, email, signup_date) "
                f"VALUES ({user_id}, '{email_sql}', {date_sql}) "
                f"ON CONFLICT (id) DO NOTHING;"
            )
    return statements
The comments mark the parts that matter: validation (skip empty emails), escaping (quotes), and idempotency (ON CONFLICT DO NOTHING). Ask AI to include all three — they're the things a naive script forgets.
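When the script can talk to the database directly instead of emitting SQL text, bound parameters are a common alternative to manual quote escaping. The sketch below swaps in that technique and uses an in-memory SQLite database as a stand-in for the real target; it assumes SQLite 3.24 or newer for `ON CONFLICT ... DO NOTHING`. The `load_users` helper and the sample data are made up for illustration.

```python
import csv
import io
import sqlite3

# Made-up sample data: a quote in an email, a blank date, a blank email, a duplicate id.
SAMPLE_CSV = """id,email,signup_date
1,o'brien@example.com,2024-01-15
2,ada@example.com,
3,,2024-02-01
1,duplicate@example.com,2024-03-01
"""


def load_users(conn: sqlite3.Connection, csv_text: str) -> int:
    """Insert valid rows using bound parameters; return how many rows were added."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        email = row["email"].strip()
        if not email:  # validation: skip rows without an email
            continue
        signup_date = row["signup_date"].strip() or None  # blank becomes NULL
        rows.append((int(row["id"]), email, signup_date))

    def count() -> int:
        return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

    before = count()
    with conn:  # commits on success, rolls back if an insert raises
        conn.executemany(
            "INSERT INTO users (id, email, signup_date) VALUES (?, ?, ?) "
            "ON CONFLICT (id) DO NOTHING",  # idempotency: existing ids are left alone
            rows,
        )
    return count() - before


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL, signup_date TEXT)"
)
first = load_users(conn, SAMPLE_CSV)   # first run inserts the valid, non-duplicate rows
second = load_users(conn, SAMPLE_CSV)  # second run should add nothing
print(first, second)  # prints "2 0"
```

The driver passes each value separately from the SQL text, so the apostrophe in the first email needs no escaping, and the second run shows the idempotency property in action.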
Data migrations are high-stakes and often one-shot: a script that double-inserts or drops rows can be expensive to undo. AI accelerates writing the conversion, but the safety properties — idempotency, validation, and a dry-run on a copy — are non-negotiable. Treat the generated script as a draft you must read and test, never as something to run blind against real data.
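One way to rehearse a dry-run on a copy is sketched below, again with SQLite standing in for the real database. The `dry_run` helper is hypothetical: it copies the database into memory with `Connection.backup`, applies the statements twice, and reports row counts so that both the effect and the idempotency can be checked before anything touches real data. The table contents and statements are made up for the example.

```python
import sqlite3


def dry_run(source: sqlite3.Connection, statements: list[str]) -> dict[str, int]:
    """Apply statements twice to an in-memory copy and report row counts."""
    scratch = sqlite3.connect(":memory:")
    source.backup(scratch)  # copies schema and data; the source is only read
    try:
        def count() -> int:
            return scratch.execute("SELECT COUNT(*) FROM users").fetchone()[0]

        before = count()
        for stmt in statements:
            scratch.execute(stmt)
        after_first = count()
        for stmt in statements:  # second pass checks idempotency
            scratch.execute(stmt)
        after_second = count()
    finally:
        scratch.close()  # the copy is discarded, nothing is written back
    return {"before": before, "after_first": after_first, "after_second": after_second}


real = sqlite3.connect(":memory:")  # stand-in for the real database
real.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL, signup_date TEXT)"
)
real.execute("INSERT INTO users VALUES (1, 'existing@example.com', NULL)")
real.commit()

statements = [
    "INSERT INTO users (id, email, signup_date) "
    "VALUES (1, 'existing@example.com', NULL) ON CONFLICT (id) DO NOTHING;",
    "INSERT INTO users (id, email, signup_date) "
    "VALUES (2, 'new@example.com', '2024-01-15') ON CONFLICT (id) DO NOTHING;",
]
report = dry_run(real, statements)
print(report)
```

If `after_second` differs from `after_first`, the script is not idempotent; if `after_first` is not what the source file implies, rows are being dropped or duplicated. Either result is a reason to stop before running against the real database.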