Normalization is the process of organizing data to reduce redundancy and improve integrity by splitting data into related tables, following a series of "normal forms." The goal: each piece of data is stored once, avoiding duplication and the anomalies it causes.
The problem: an unnormalized (redundant) table
❌ orders table with everything in one place — data is DUPLICATED:
order_id | customer_name | customer_email   | product | price
1        | Ann           | [email protected] | Phone   | 999
2        | Ann           | [email protected] | Case    | 20     ← Ann's info repeated!
Problems (anomalies):
✗ UPDATE anomaly — change Ann's email → must update EVERY one of her orders
✗ INSERT anomaly — can't add a customer without an order
✗ DELETE anomaly — deleting Ann's last order loses her info entirely
✗ Wasted storage and inconsistency risk
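To see these anomalies happen, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The email addresses (ann@example.com and the others) are placeholders invented for illustration, and the NOT NULL constraints are an assumption about how such a table would typically be declared; the insert anomaly only shows up as an error because of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The single wide table from above. NOT NULL on every column is an assumption.
conn.execute("""
    CREATE TABLE orders (
        order_id       INTEGER PRIMARY KEY,
        customer_name  TEXT    NOT NULL,
        customer_email TEXT    NOT NULL,
        product        TEXT    NOT NULL,
        price          INTEGER NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Ann", "ann@example.com", "Phone", 999),  # placeholder email
        (2, "Ann", "ann@example.com", "Case", 20),    # same customer data again
    ],
)

# UPDATE anomaly: changing the email on only one row leaves Ann with two emails.
conn.execute(
    "UPDATE orders SET customer_email = ? WHERE order_id = ?",
    ("ann@new.example", 1),
)
emails = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT customer_email FROM orders WHERE customer_name = 'Ann'"
    )
)

# INSERT anomaly: a customer with no order has no product or price to store.
try:
    conn.execute(
        "INSERT INTO orders (customer_name, customer_email) VALUES (?, ?)",
        ("Bob", "bob@example.com"),  # hypothetical customer
    )
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

# DELETE anomaly: removing Ann's orders removes every trace of Ann.
conn.execute("DELETE FROM orders WHERE customer_name = 'Ann'")
remaining = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE customer_name = 'Ann'"
).fetchone()[0]

print(emails)           # two different emails for the same person
print(insert_rejected)  # the order-less customer could not be stored
print(remaining)        # no rows left that mention Ann
```

Each step corresponds to one bullet in the list: the partial update leaves contradictory data, the order-less customer has nowhere to go, and deleting orders silently deletes the customer too.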
