How to Make the JavaScript "btoa()" Method Work With Multibyte Strings?

The JavaScript btoa() method performs Base64 encoding by converting a binary string to a Base64-encoded ASCII string. It can only encode strings in which every character has a code point in the 0-255 range, that is, characters that fit in a single byte (the Latin-1 character set, which includes ASCII). If you pass btoa() a string that contains a character outside this range, it throws an error:

// ES5+
// Uncaught DOMException: Failed to execute 'btoa' on 'Window': The string to be encoded contains characters outside of the Latin1 range.
btoa('🦊');

// Uncaught DOMException: Failed to execute 'btoa' on 'Window': The string to be encoded contains characters outside of the Latin1 range.
btoa('こんにちは');
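If you would rather check a string up front than catch the exception, one option is to test every UTF-16 code unit against the 0-255 range. The following is only a minimal sketch; the isLatin1() helper name is an assumption made for illustration and is not a built-in:

```javascript
// Hypothetical helper (not a built-in): returns true when every UTF-16
// code unit in the string is in the 0-255 range that btoa() accepts.
function isLatin1(str) {
  return /^[\u0000-\u00ff]*$/.test(str);
}

console.log(isLatin1('foobar')); // true
console.log(isLatin1('🦊')); // false
```

A string that passes this check can be given to btoa() directly; anything else needs to be converted first.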

One way to Base64-encode characters that require more than one byte is to create a custom function that converts a multibyte string into a string of single-byte characters. You can do so in the following steps:

  1. Encode the string as UTF-8 using TextEncoder, which returns the bytes as a Uint8Array;
  2. Concatenate the array of bytes to create a binary string that can be used with the btoa() method.

You can implement this, for example, like so:

function toBinaryStr(str) {
  const encoder = new TextEncoder();
  // 1: encode the string as UTF-8, producing a Uint8Array of bytes
  const bytes = encoder.encode(str);
  // 2: map each byte to a character (code 0-255) and join them into a binary string
  // note: spreading a very large array may exceed the engine's argument limit
  return String.fromCharCode(...bytes);
}

console.log(btoa(toBinaryStr('🦊'))); // '8J+mig=='
console.log(btoa(toBinaryStr('こんにちは'))); // '44GT44KT44Gr44Gh44Gv'
console.log(btoa(toBinaryStr('foobar'))); // 'Zm9vYmFy'

In this code, the toBinaryStr() function takes a string as input and returns a binary string, in which each character represents one byte of the input's UTF-8 encoding. The result can be passed to the btoa() method for Base64 encoding.

The TextEncoder.encode() method encodes the input string as UTF-8 and returns the resulting bytes as a Uint8Array. A character that needs more than one byte is represented by several consecutive bytes. The String.fromCharCode() method then converts each byte into the character with the corresponding code (0-255), and these characters are concatenated to form a binary string that is compatible with the btoa() method.

This conversion is necessary for strings that contain characters the btoa() method does not support, as it only accepts characters that can be represented using a single byte (i.e. characters in the Latin-1 character set, which includes ASCII). The toBinaryStr() function converts the string (which JavaScript stores as UTF-16) into a string of 8-bit characters, which can then be Base64-encoded with the btoa() method.
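For long inputs, passing every byte to String.fromCharCode() in a single call can exceed the maximum number of arguments an engine allows. One possible workaround is to convert the bytes in chunks. The sketch below assumes a hypothetical helper name, toBinaryStrChunked(), and an arbitrary chunk size of 0x8000; neither comes from the original example:

```javascript
// Hypothetical variant of toBinaryStr(); the name and the default
// chunk size are assumptions chosen for illustration.
function toBinaryStrChunked(str, chunkSize = 0x8000) {
  // encode the string as UTF-8 bytes
  const bytes = new TextEncoder().encode(str);
  let binary = '';

  for (let i = 0; i < bytes.length; i += chunkSize) {
    // subarray() returns a view over the same buffer (no copy), so each
    // call to String.fromCharCode() receives at most chunkSize arguments
    binary += String.fromCharCode(...bytes.subarray(i, i + chunkSize));
  }

  return binary;
}

console.log(btoa(toBinaryStrChunked('🦊'))); // '8J+mig=='
```

Because every byte becomes exactly one character, splitting the byte array at arbitrary positions does not change the resulting binary string.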

One way to decode a multibyte Base64-encoded string is to create a custom function that converts the binary string returned by the atob() method back to the original string.
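A minimal sketch of such a function might look like the following. It assumes the Base64 input was produced from UTF-8 bytes (as in the encoding example), and the fromBase64() name is hypothetical:

```javascript
// Hypothetical helper; assumes the Base64 data holds UTF-8 encoded text.
function fromBase64(base64) {
  // 1: decode Base64 into a binary string (one character per byte)
  const binary = atob(base64);
  // 2: convert each character back to its byte value
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  // 3: interpret the bytes as UTF-8 to recover the original string
  return new TextDecoder().decode(bytes);
}

console.log(fromBase64('8J+mig==')); // '🦊'
console.log(fromBase64('44GT44KT44Gr44Gh44Gv')); // 'こんにちは'
console.log(fromBase64('Zm9vYmFy')); // 'foobar'
```

This reverses the encoding steps: atob() yields the binary string, and TextDecoder (which defaults to UTF-8) turns the bytes back into text.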


This post was published by Daniyal Hamid. Daniyal currently works as the Head of Engineering in Germany and has 20+ years of experience in software engineering, design and marketing.