
Bytes and Text

The most basic parsers are the byte and text parsers. There are two kinds of byte parser: one that collects all bytes in a byte stream into a Buffer, and one that just passes each chunk through as it arrives [1]. The text parser interprets the byte stream according to a character encoding and converts it to a JavaScript string.

Format       Main Media Type                        Parser
Buffer       application/octet-stream               BufferParser
Byte stream  application/vnd.esxx.octet-stream [2]  PassThroughParser
Text         text/plain                             StringParser
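The difference between the two byte parsers can be sketched with plain Node.js code. This is only an illustration under stated assumptions: `exampleStream`, `collectBytes`, `passThrough` and `countBytes` are hypothetical helpers, not part of `@divine/uri`, and the real parsers may work differently internally.

```typescript
// Hypothetical stand-in for a byte stream: yields a few chunks asynchronously.
async function* exampleStream(): AsyncIterable<Buffer> {
    yield Buffer.from('Hello, ');
    yield Buffer.from('byte ');
    yield Buffer.from('stream!');
}

// Buffer-style parsing: drain the whole stream and concatenate it in memory.
async function collectBytes(stream: AsyncIterable<Buffer>): Promise<Buffer> {
    const chunks: Buffer[] = [];

    for await (const chunk of stream) {
        chunks.push(chunk);
    }

    return Buffer.concat(chunks);
}

// Pass-through-style parsing: hand each chunk on as it arrives,
// so the consumer never holds more than one chunk at a time.
async function* passThrough(stream: AsyncIterable<Buffer>): AsyncIterable<Buffer> {
    for await (const chunk of stream) {
        yield chunk;
    }
}

// Example consumer that processes chunks without keeping them around.
async function countBytes(stream: AsyncIterable<Buffer>): Promise<number> {
    let total = 0;

    for await (const chunk of passThrough(stream)) {
        total += chunk.length;
    }

    return total;
}
```

With the collecting approach, memory use grows with the size of the resource; with the pass-through approach it depends only on the chunk size, which is why the latter suits large objects.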

The Buffer parser is useful when you need to load a resource of unknown type, and the pass-through parser can be used for large objects that won't fit in memory. The text parser understands the most common character encodings, specified by the charset media type parameter.
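To illustrate the role of the charset parameter, here is a minimal sketch that picks a decoder based on the media type. `decodeText` and `ENCODINGS` are hypothetical names, the UTF-8 fallback is an assumption, and only a few encodings that Node's Buffer decodes natively are covered; this is not how StringParser is implemented.

```typescript
// Hypothetical mapping from a few charset labels to Node.js Buffer encodings.
const ENCODINGS: Record<string, BufferEncoding> = {
    'utf-8': 'utf8',
    'us-ascii': 'ascii',
    'iso-8859-1': 'latin1',
    'utf-16le': 'utf16le',
};

// Hypothetical helper: decode bytes according to the charset parameter
// of a media type such as "text/plain; charset=iso-8859-1".
function decodeText(bytes: Buffer, mediaType: string): string {
    const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(mediaType);
    const charset = (match?.[1] ?? 'utf-8').toLowerCase(); // assumed default
    const encoding = ENCODINGS[charset];

    if (encoding === undefined) {
        throw new Error(`Unsupported charset: ${charset}`);
    }

    return bytes.toString(encoding);
}

// The bytes of "café" in ISO-8859-1; 0xe9 is "é".
const cafeBytes = Buffer.from([0x63, 0x61, 0x66, 0xe9]);

// Decoded with the right charset, the text comes out intact.
const asLatin1 = decodeText(cafeBytes, 'text/plain; charset=iso-8859-1');

// The same bytes are not valid UTF-8, so this does not yield "café".
const asUtf8 = decodeText(cafeBytes, 'text/plain; charset=utf-8');
```

The same byte sequence produces different strings depending on the declared charset, which is why the parameter matters for anything other than plain ASCII.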

The following example shows how an ISO-8859-1-encoded text file might be read into memory in a few different ways [3].

import { ContentType } from '@divine/headers';
import { URI } from '@divine/uri';

const latin1 = new URI('latin1-file.txt');

// Load the whole file into a single Buffer.
const buffer = await latin1.load<Buffer>(ContentType.bytes);

// Decode the file as ISO-8859-1 and unwrap the result to a primitive string.
const string = (await latin1.load('text/plain; charset=iso-8859-1')).valueOf();

// Read the file chunk by chunk, collecting the chunks in an array.
const stream: Buffer[] = [];

for await (const chunk of latin1 /* or latin1.load<AsyncIterator<Buffer>>(ContentType.stream) */) {
    stream.push(chunk);
}

  1. In the WSF, byte streams are represented as AsyncIterable<Buffer>.
  2. This custom media type is only used to identify the pass-through parser and should not be used otherwise.
  3. Notice how the URI class is also an AsyncIterable<Buffer>, which can be iterated directly.