What are streams in Node.js, and when should you use them?

Question

Accepted Answer

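**Streams** process data incrementally, **in chunks**, instead of loading everything into memory at once. This matters for large data (big files, network transfers), where buffering the whole payload can exhaust memory.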

## The problem streams solve

```js
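import fs from "fs";

// ❌ buffers the ENTIRE file in memory; a 10GB file will fail or exhaust memory
const data = await fs.promises.readFile("huge.csv");
handleRows(data); // hypothetical function that consumes the whole buffer

// ✅ stream it: memory stays bounded, data is processed chunk by chunk
// (`transform` is a placeholder for any Transform stream you supply)
fs.createReadStream("huge.csv")
  .pipe(transform)
  .pipe(fs.createWriteStream("out.csv"));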
```

With streams, memory usage does not depend on file size. It stays roughly flat, because you only hold one chunk at a time.
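To make "one chunk at a time" concrete, here is a minimal sketch that counts the bytes in a file by iterating over a read stream. The `countBytes` helper and the temp file name are made up for illustration, and the snippet assumes it runs as an ES module (it uses top-level `await`).

```js
import { createReadStream, writeFileSync } from "fs";
import { tmpdir } from "os";
import { join } from "path";

// Hypothetical helper: counts bytes and chunks without holding the whole file.
async function countBytes(path) {
  let bytes = 0;
  let chunks = 0;
  // Readable streams are async iterables; each iteration yields one Buffer.
  for await (const chunk of createReadStream(path)) {
    bytes += chunk.length;
    chunks += 1;
  }
  return { bytes, chunks };
}

// Demo input: a 1 MiB temp file (name chosen only for this example).
const demoPath = join(tmpdir(), "stream-count-demo.bin");
writeFileSync(demoPath, Buffer.alloc(1024 * 1024));

const result = await countBytes(demoPath);
console.log(`${result.bytes} bytes read in ${result.chunks} chunks`);
```

Only the current chunk is referenced inside the loop, so peak memory is governed by the stream's chunk size rather than by the size of the file.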

## The four stream types

```text
Readable  → source you read FROM (fs.createReadStream, http request)
Writable  → destination you write TO (fs.createWriteStream, http response)
Duplex    → both readable & writable (TCP socket)
Transform → modify data as it passes through (zlib gzip, encryption)
```
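As a rough illustration of how the types fit together, the sketch below builds a custom Transform and feeds it from a Readable. The `upperCase` stream and the sample strings are invented for this example, and calling `toString()` on each chunk is only safe here because the input is plain ASCII.

```js
import { Readable, Transform } from "stream";

// Transform: receives chunks, hands modified chunks to the next stage.
const upperCase = new Transform({
  transform(chunk, encoding, callback) {
    // chunk arrives as a Buffer by default; convert, modify, pass along.
    callback(null, chunk.toString().toUpperCase());
  },
});

// Readable.from turns an iterable into a Readable source.
const source = Readable.from(["alpha ", "beta ", "gamma"]);

let output = "";
for await (const chunk of source.pipe(upperCase)) {
  output += chunk; // Buffers are stringified as UTF-8 when concatenated
}
console.log(output);
```

A Duplex stream follows the same idea, except that its readable and writable sides are independent (as with a TCP socket) instead of one being derived from the other.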

## pipe: connecting streams

```js
import { createReadStream, createWriteStream } from "fs";
import { createGzip } from "zlib";

// read → compress → write, all streaming, low memory
createReadStream("file.txt")
  .pipe(createGzip())            // Transform: compress on the fly
  .pipe(createWriteStream("file.txt.gz"));
```

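`pipe` connects a readable stream to a writable stream (with optional transforms in between) and manages flow control automatically: when the destination cannot keep up, the source is paused until it can (backpressure). It is the classic way to compose streams.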
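To show what that automatic flow control saves you from writing, here is a hand-rolled sketch of the same idea: stop writing when `write()` returns `false` and resume on `'drain'`. The `slowSink` stream, its tiny `highWaterMark`, and the 5 ms delay are arbitrary values picked for the demo.

```js
import { Writable } from "stream";
import { once } from "events";
import { finished } from "stream/promises";

// A deliberately slow destination with a very small internal buffer.
const received = [];
const slowSink = new Writable({
  highWaterMark: 4, // bytes buffered before write() starts returning false
  write(chunk, encoding, callback) {
    received.push(chunk.toString());
    setTimeout(callback, 5); // simulate slow I/O
  },
});

let pauses = 0;
for (const piece of ["aaaa", "bbbb", "cccc"]) {
  // write() returns false when the buffer is full: wait for 'drain'.
  if (!slowSink.write(piece)) {
    pauses += 1;
    await once(slowSink, "drain");
  }
}
slowSink.end();
await finished(slowSink); // resolves once everything has been flushed

console.log(received.join(","), `paused ${pauses} time(s)`);
```

Ignoring the return value of `write()` still works for small inputs, but with a fast source and a slow destination the unwritten data piles up in memory, which is exactly the problem streams are meant to avoid.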

## pipeline: pipe with proper error handling

```js
import { createReadStream, createWriteStream } from "fs";
import { createGzip } from "zlib";
import { pipeline } from "stream/promises";

await pipeline(
  createReadStream("in.txt"),
  createGzip(),
  createWriteStream("out.gz")
); // ✅ cleans up all streams on error/completion (pipe alone leaks on errors)
```

Prefer `pipeline` over chained `.pipe()` calls: it propagates an error from any stage and destroys every stream in the chain when one of them fails.
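A small sketch of what that error propagation looks like in practice: a failure in any stage rejects the single promise returned by `pipeline`, so one `try`/`catch` covers the whole chain. The `gzipFile` helper and both file names are hypothetical, and the input file is assumed not to exist.

```js
import { createReadStream, createWriteStream } from "fs";
import { createGzip } from "zlib";
import { pipeline } from "stream/promises";
import { tmpdir } from "os";
import { join } from "path";

// Hypothetical helper: returns null on success, or the error that stopped the chain.
async function gzipFile(inputPath, outputPath) {
  try {
    await pipeline(
      createReadStream(inputPath),
      createGzip(),
      createWriteStream(outputPath)
    );
    return null;
  } catch (err) {
    // An error from ANY stage lands here; the other streams are destroyed.
    return err;
  }
}

// The input path is assumed to be missing, so the read stage fails.
const failure = await gzipFile(
  join(tmpdir(), "pipeline-demo-missing-input-7f3a.txt"),
  join(tmpdir(), "pipeline-demo-output.gz")
);
console.log(failure ? `pipeline failed: ${failure.code}` : "pipeline succeeded");
```

With chained `.pipe()` calls, the same failure would need an `'error'` listener on each stream, plus manual cleanup of the ones that did not fail.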

## Where streams show up

```text
✓ HTTP request/response bodies (req and res ARE streams)
✓ File reading/writing, file uploads/downloads
✓ Compression (zlib), encryption (crypto)
✓ Database cursors, large query results
✓ Real-time data processing, video streaming
```
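To illustrate the first item, here is a sketch of a tiny HTTP server that pipes the request body through a Transform straight into the response. It assumes Node 18.2 or later (for the global `fetch` and `server.closeAllConnections()`); the uppercase transform and the request body are invented for the example.

```js
import { createServer } from "http";
import { Transform } from "stream";
import { pipeline } from "stream/promises";
import { once } from "events";

// req is a Readable and res is a Writable, so they slot into pipeline directly.
const server = createServer(async (req, res) => {
  const upper = new Transform({
    transform(chunk, encoding, callback) {
      callback(null, chunk.toString().toUpperCase()); // ASCII-only demo
    },
  });
  try {
    await pipeline(req, upper, res); // request body -> transform -> response body
  } catch {
    res.destroy(); // client disconnected or a stage failed
  }
});

server.listen(0, "127.0.0.1"); // port 0 lets the OS pick a free port
await once(server, "listening");
const { port } = server.address();

const response = await fetch(`http://127.0.0.1:${port}/`, {
  method: "POST",
  body: "hello streams",
});
const echoed = await response.text();
console.log(echoed);

server.close();
server.closeAllConnections(); // drop keep-alive sockets so the process can exit
```

The server never holds the full request body: each chunk is transformed and written to the response as it arrives, which is the same pattern a streaming proxy or upload handler relies on.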

## When not to bother

```text
Small data that fits comfortably in memory → readFile/simple buffering is simpler.
Streams add complexity; use them for LARGE or continuous data.
```

## Why this matters

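Streams are Node's **memory-efficient** mechanism for handling large or continuous data. They are the difference between an application that crashes on a large upload and one that processes input of any size with flat memory usage.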

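Understanding the four stream types, composition with `pipe`/`pipeline` (and why `pipeline` is safer), and the fact that HTTP requests and responses are themselves streams is essential for building scalable file-handling, data-processing, and proxy services in Node.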