Igor Dobryn

Node.js stream to read data from an API

Node.js has had streams almost since the beginning. They went through several iterations and are now a powerful tool used widely across Node.js. Streams process data in chunks, so they are memory efficient.

Let’s imagine that we need to read data from an API which responds with user objects. For the purposes of this article, we can build such an API using mockapi.io.
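
The exact data doesn’t matter for the technique; the only assumption is that GET /users?page=N&limit=M responds with a JSON array of user objects, for example (using the demo records shown in the output later):

[
  { "id": "1", "name": "Ernie5" },
  { "id": "2", "name": "Naomie.Gutkowski" },
  { "id": "3", "name": "Carley_Hackett" }
]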

With the API in place, a function that reads a single page of users might look as follows:

const http = require('http');

// API_URL should point to the demo API, e.g. your mockapi.io project URL
const API_URL = process.env.API_URL;

// Fetches a single page of users and resolves with the parsed JSON array
async function getUsers({ page, perPage }) {
  return new Promise((resolve, reject) => {
    http.get(`${API_URL}/users?page=${page}&limit=${perPage}`, (resp) => {
      let data = '';
      resp.on('data', chunk => data += chunk);
      resp.on('end', () => resolve(JSON.parse(data)));
    }).on('error', error => reject(error));
  });
}
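
The function can be tried on its own before any streams are involved. A minimal sketch, assuming API_URL is set as above; the values simply mirror the demo parameters (page 1, three users per page):

// quick sanity check: fetch the first page directly
getUsers({ page: 1, perPage: 3 })
  .then(users => console.log(`received ${users.length} users`))
  .catch(error => console.error('request failed:', error));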

 
To build a readable stream, let’s outline the requirements:

  • it should operate on objects; this is worth noting because by default readable streams use Buffer as the chunk data type
  • it should iterate over all pages
  • the finish condition is receiving a page of users smaller than the requested page size
  • it should pass the users further downstream
  • in an unexpected case it should emit an error
  • for the sake of the demo, 10 users with 3 users per page is enough

With the requirements in place, we can implement the stream:

const { Readable } = require('stream');

class ApiReadStream extends Readable {
  constructor(options) {
    // objectMode lets the stream push arrays of users instead of Buffers
    super({ ...options, objectMode: true });

    this.doneWithReading = false;
    this.page = 1;
    this.perPage = 3;
  }

  async _read() {
    try {
      const data = await getUsers({ page: this.page, perPage: this.perPage });

      this.page += 1;
      // an incomplete page means there is nothing left to fetch
      this.doneWithReading = data.length < this.perPage;
      this.push(data);

      if (this.doneWithReading) {
        // pushing null signals the end of the stream to consumers
        this.push(null);
      }
    } catch (error) {
      process.nextTick(() => this.emit('error', new Error(error)));
    }
  }
}
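
Before composing it with anything else, the stream can be consumed directly through its events. A small sketch, which also shows the 'error' event from the last requirement:

const users = new ApiReadStream();

// each 'data' chunk is one page of users, because the stream pushes a whole
// array per _read() call in object mode
users.on('data', page => console.log(`got a page of ${page.length} users`));
users.on('end', () => console.log('all pages read'));
users.on('error', error => console.error('reading failed:', error));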

 
Just for demo purposes, we are going to output the received data. For compatibility with the stdout stream, we need to stringify the objects coming out of the API stream, so we will use streaming-json-stringify. Four lines are enough to compose the streams:

const Stringify = require('streaming-json-stringify');

const readStream = new ApiReadStream();
const stringifyStream = Stringify();
const stdoutStream = process.stdout;

readStream.pipe(stringifyStream).pipe(stdoutStream);

 
This produces the following output:

[
[{"id":"1","name":"Ernie5"},{"id":"2","name":"Naomie.Gutkowski"}]
,
[{"id":"3","name":"Carley_Hackett"},{"id":"4","name":"Robb_Goyette"}]
,
[{"id":"5","name":"Nathan_Emard21"}]
]
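
Note that .pipe() does not forward errors between the composed streams, so the 'error' our ApiReadStream may emit would go unhandled. A sketch of the same composition using stream.pipeline (available since Node 10), which reports a failure from any of the streams through a single callback:

const { pipeline } = require('stream');
const Stringify = require('streaming-json-stringify');

pipeline(new ApiReadStream(), Stringify(), process.stdout, (error) => {
  if (error) {
    console.error('streaming failed:', error);
  } else {
    // logged to stderr so it doesn't mix with the JSON written to stdout
    console.error('streaming finished');
  }
});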

 
 

Conclusions

Node.js streams are a powerful tool. They are used across different parts of Node.js:

  • stdin/stdout/stderr
  • file reading/writing
  • networking
  • compression libraries

Streams are memory-efficient and composable structures. This article demonstrates how easily streams can be used to read from an API with a minimum of resources, and how they can be composed.

As this isn't production code, several refinements are preferable before going live:

  • more logging in the stream
  • every network request takes a noticeable amount of time, so fetching pages in parallel would help
  • saving the current read position, so that processing can resume after a failure
  • a better consumer of the ApiReadStream, e.g. a stream that saves the data into a database (a sketch follows below)
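
As an illustration of the last point, the consumer could itself be an object-mode writable stream. The sketch below assumes a hypothetical db client with a users.insert(users) method returning a promise; it is not a real library API:

const { Writable } = require('stream');

class DbWriteStream extends Writable {
  constructor(db, options) {
    super({ ...options, objectMode: true });
    this.db = db; // hypothetical database client
  }

  _write(users, _encoding, callback) {
    // persist one page of users, then signal readiness for the next chunk
    this.db.users.insert(users)
      .then(() => callback())
      .catch(error => callback(error));
  }
}

// new ApiReadStream().pipe(new DbWriteStream(db));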
Further reading: the Node.js streams API documentation.