How to Make the JavaScript "atob()" Method Work With Multibyte Strings?

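The JavaScript atob() method decodes a Base64-encoded string and returns a "binary string", in which each character stands for one decoded byte (a character code from 0 to 255). When the original text consisted only of ASCII characters, or was encoded as single-byte Latin-1, that result is already the readable string: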

atob('Zm9vYmFy'); // 'foobar'

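However, when the original text contained characters outside the ASCII or Latin-1 range and was encoded as UTF-8 before being Base64-encoded, each such character occupies several bytes. In that case atob() returns one character per byte rather than the original text: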

atob('8J+mig=='); // 'ð\x9F¦\x8A'
atob('44GT44KT44Gr44Gh44Gv'); // 'ã\x81\x93ã\x82\x93ã\x81«ã\x81¡ã\x81¯'
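
To see what these strings actually contain, it can help to print the character code of each character. The following is a minimal sketch; toHexBytes() is a hypothetical helper name introduced only for this illustration, not part of any API:

```javascript
// Hypothetical helper: returns the code of each character in a binary
// string as a two-digit hexadecimal value.
function toHexBytes(binary) {
  return Array.from(binary, (ch) =>
    ch.charCodeAt(0).toString(16).padStart(2, '0')
  );
}

console.log(toHexBytes(atob('Zm9vYmFy'))); // ['66', '6f', '6f', '62', '61', '72']
console.log(toHexBytes(atob('8J+mig=='))); // ['f0', '9f', 'a6', '8a']
```

The first result is one byte per letter of 'foobar', while the second is the four-byte UTF-8 encoding of a single character.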

The binary string still needs to be converted back to the original, human-readable form. One way to properly decode a multibyte Base64-encoded string is to create a custom function that:

  1. Creates a Uint8Array of bytes from the character codes of the binary string; and
  2. Decodes those bytes as UTF-8 (for example, with TextDecoder) to produce the original string.

You can implement this, for example, like so:

function fromBinaryStr(binary) {
  // 1: create an array of bytes
  const bytes = Uint8Array.from({ length: binary.length }, (_, index) =>
    binary.charCodeAt(index)
  );

  // 2: decode the byte data into a string
  const decoder = new TextDecoder('utf-8');
  return decoder.decode(bytes);
}

console.log(fromBinaryStr(atob('8J+mig=='))); // '🦊'
console.log(fromBinaryStr(atob('44GT44KT44Gr44Gh44Gv'))); // 'こんにちは'
console.log(fromBinaryStr(atob('Zm9vYmFy'))); // 'foobar'

In this code, the fromBinaryStr() function takes the binary string returned by atob() as input and returns the original string. It assumes that the text was encoded as UTF-8 before being Base64-encoded.
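
As a sanity check, you could run a round trip: encode a string to Base64 and decode it again. The sketch below assumes a hypothetical toBinaryStr() helper (not from this post) that mirrors fromBinaryStr() by UTF-8 encoding the text with TextEncoder before passing it to btoa():

```javascript
// Decoding helper, repeated here so the snippet runs on its own.
function fromBinaryStr(binary) {
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder('utf-8').decode(bytes);
}

// Hypothetical inverse: UTF-8 encode the text, then map each byte to
// one character so that btoa() can accept the result.
function toBinaryStr(text) {
  const bytes = new TextEncoder().encode(text);
  let binary = '';
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  return binary;
}

const encoded = btoa(toBinaryStr('こんにちは'));
console.log(encoded); // '44GT44KT44Gr44Gh44Gv'
console.log(fromBinaryStr(atob(encoded))); // 'こんにちは'
```

If the decoded output matches the input, both directions agree on UTF-8 as the byte encoding.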


This post was published by Daniyal Hamid. Daniyal currently works as the Head of Engineering in Germany and has 20+ years of experience in software engineering, design and marketing.