What are multi-stage builds, and why should you use them?

Question

Accepted Answer

**Multi-stage builds** use several `FROM` stages in a single Dockerfile: one stage builds the application (with all the build tools), and only the final artifacts are copied into a clean, minimal final stage. The result is a **much smaller, more secure** production image.

## The problem: build tools bloat the image

```text
Building an app needs build tools (compilers, dev dependencies, SDKs), but the
FINAL image shouldn't include them:
  → they bloat the image (larger size, slower deploys)
  → they increase the attack surface (more software = more vulnerabilities)
→ You want only the built artifact + its runtime in the final image.
```

## The multi-stage build

```dockerfile
# STAGE 1: "build" — has all the build tools (compile/bundle the app)
FROM node:20 AS build
WORKDIR /app
COPY package*.json ./
RUN npm install              # includes dev dependencies, build tools
COPY . .
RUN npm run build            # produces /app/dist

# STAGE 2: final, a clean, minimal image with ONLY the built artifact
FROM nginx:alpine
# Copy ONLY the build output. This comment sits on its own line because
# Dockerfile comments are only recognized at the start of a line.
COPY --from=build /app/dist /usr/share/nginx/html
# → the final image has NO build tools, no dev dependencies, just nginx + the built files
```

`COPY --from=build` copies artifacts from the build stage into the final image. The final image **leaves out everything from the build stage except what you explicitly copy**, so build tools, dev dependencies, and source code never reach production.
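When the final stage needs a runtime of its own, the same "copy only what you name" rule applies. The following is a minimal sketch for a hypothetical Node server, not a definitive setup: it assumes a `package-lock.json` exists (required by `npm ci`), that `npm run build` writes to `/app/dist`, and that the entry point is `dist/index.js`. All three are illustrative assumptions.

```dockerfile
# STAGE 1: build with dev dependencies and build tooling
FROM node:20 AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# STAGE 2: runtime image with production dependencies only
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
# Skip devDependencies in the final image
RUN npm ci --omit=dev
# Bring over only the compiled output from the build stage
COPY --from=build /app/dist ./dist
# Assumed entry point; adjust to the project's actual output file
CMD ["node", "dist/index.js"]
```

Nothing from the build stage (source files, dev dependencies, caches) ends up in the final image unless a `COPY --from=build` line names it.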

## Benefits

```text
✓ SMALLER images — only the runtime + artifacts (often 10x+ smaller)
✓ More SECURE — fewer packages = smaller attack surface (no compilers/build tools)
✓ Faster deploys/pulls (smaller images transfer faster)
✓ One Dockerfile — build and final image together (no separate build scripts)
→ Common for compiled languages (Go, Rust, Java) and Node/frontend builds.
```

## Why this matters

Understanding multi-stage builds is practical knowledge for anyone producing production-quality Docker images, because they are the standard way to build **small, secure production images**.

The problem they solve is real and common: building an application requires **build tools** (compilers, SDKs, dev dependencies) that **should not be in the final production image**. Including them bloats the image (larger size, slower deployments and pulls) and widens the **attack surface** (more installed software means more potential vulnerabilities). **Multi-stage builds** solve this cleanly with multiple `FROM` stages: build the application in a stage that has all the tools, then copy **only the final artifacts** (`COPY --from=build`) into a clean, minimal final stage that leaves everything else behind.

This produces **dramatically smaller** images (often 10x or more, since only the runtime and artifacts remain) that are also **more secure** (no build tools or unnecessary packages to exploit), while keeping everything in a single Dockerfile (no separate build scripts).

The pattern is standard for compiled languages (Go, Rust, Java, where the compiler is not needed at runtime) and for frontend/Node builds (where build tooling and source are not needed in production).
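For the compiled-language case, a sketch along these lines shows the compiler staying behind in the build stage. It assumes a hypothetical Go module with `go.mod` and `go.sum` at the repository root and a `main` package in that same directory; the binary name `server` is also illustrative.

```dockerfile
# STAGE 1: compile with the full Go toolchain
FROM golang:1.22 AS build
WORKDIR /src
# Copy module files first so dependency downloads are cached between builds
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# CGO_ENABLED=0 yields a statically linked binary that does not need libc
RUN CGO_ENABLED=0 go build -o /out/server .

# STAGE 2: empty base image containing only the compiled binary
FROM scratch
COPY --from=build /out/server /server
ENTRYPOINT ["/server"]
```

`scratch` is an empty base, so the final image holds the binary and nothing else. A service that makes outbound TLS calls would likely also need CA certificates copied in, or a small base such as `alpine` instead.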

Small, secure production images matter for deployment speed, storage, and security, and multi-stage builds are the standard, effective technique for getting them: they separate the build environment from the lean runtime image. Knowing which bloat and security problems they solve, how they work, and what they gain you is a frequently applied skill, and the practice is widely used in real-world Dockerfiles.