Training is the process of teaching an ML model from data (learning patterns, adjusting its parameters), while inference is the use of the trained model to make predictions on new data. They are distinct phases with different characteristics and costs.
Training vs inference
TRAINING → teaching the model (the LEARNING phase):
→ feed lots of DATA → the model adjusts its parameters to learn patterns
→ computationally EXPENSIVE (lots of data, compute, and time; e.g. training an LLM
takes huge compute resources); done once (or periodically, to update the model)
→ produces a trained MODEL
INFERENCE → using the trained model (the PREDICTION phase):
→ give the trained model NEW input → it produces an output (prediction/generation)
→ much CHEAPER/faster per request than training (a single forward pass, no parameter
updates); done MANY times (every time you use the model)
→ train once (expensive), infer many times (cheaper, in production)
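
The split above can be sketched with a toy example. This is only an illustration under stated assumptions, not how any real framework does it: the function names `train` and `predict`, the one-feature linear model, the learning rate, and the data are all hypothetical. `train` loops over the data many times and updates the parameters; `predict` is a single forward pass with the parameters held fixed.

```python
# Toy sketch (hypothetical names and data): fit y = w*x + b, then reuse the result.

def train(xs, ys, lr=0.05, epochs=2000):
    """TRAINING: repeatedly adjust the parameters (w, b) to reduce error on the data."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Forward pass over the whole dataset, then gradients of the mean squared error.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Parameter update: this step exists only in training.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b  # the trained "model" is just its learned parameters


def predict(model, x):
    """INFERENCE: one forward pass with fixed parameters, no updates."""
    w, b = model
    return w * x + b


# Train once (the expensive part: 2000 passes over the data).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
model = train(xs, ys)

# Infer many times (the cheap part: one multiply and one add per input).
predictions = [predict(model, x) for x in (5.0, 10.0)]
```

With this data the learned parameters should land very close to w = 2 and b = 1, so the two predictions come out near 11 and 21. The cost asymmetry is visible in the structure: `train` touches every example on every epoch, while `predict` does constant work per call.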
