How to Convert an HTML String to a NodeList in JavaScript?

Suppose we have the following string, which we want to parse so that we can get its DOM elements as a NodeList:

const htmlStr = `
<html>
    <head>
        <title>hello world</title>
        <link rel="stylesheet" href="style.css" />
    </head>
    <body>
        <p>foo</p>
        <p>bar</p>
        <a href="#baz">baz</a>
    </body>
</html>
`;

The first thing we need to do is convert this string into a Document object using DOMParser, like so:

const parser = new DOMParser();
const doc = parser.parseFromString(htmlStr, 'text/html');

console.log(doc);

/*
Output:
    #document
        <html>
            <head>
                <title>hello world</title>
                <link rel="stylesheet" href="style.css">
            </head>
            <body>
                <p>foo</p>
                <p>bar</p>
                <a href="#baz">baz</a>
            </body>
        </html>
*/

Now, to get the child nodes of any node in the document as a NodeList object, we can use the childNodes property like so:

console.log(doc.childNodes); // NodeList [html]

console.log(doc.documentElement.childNodes); // NodeList(3) [head, text, body]

console.log(doc.head.childNodes); // NodeList(5) [text, title, text, link, text]

console.log(doc.body.childNodes); // NodeList(7) [text, p, text, p, text, a, text]

An important thing to note in the output above is that it contains several text nodes. This is because the spaces and newlines between the tags in htmlStr are parsed as text nodes.

To exclude these whitespace-only text nodes from the resulting NodeList, you can either remove all unnecessary spaces and newlines from htmlStr, or build the multi-line string in a way that does not carry the source indentation into the string itself. One such way is to concatenate single-line strings with the + operator like so:

const htmlStr = '<html>' +
                    '<head>' +
                        '<title>hello world</title>' +
                        '<link rel="stylesheet" href="style.css" />' +
                    '</head>' +
                    '<body>' +
                        '<p>foo</p>' +
                        '<p>bar</p>' +
                        '<a href="#baz">baz</a>' +
                    '</body>' +
                '</html>';

Now the resulting NodeList objects won't contain whitespace-only text nodes:

const parser = new DOMParser();
const doc = parser.parseFromString(htmlStr, 'text/html');

console.log(doc.documentElement.childNodes); // NodeList(2) [head, body]

console.log(doc.head.childNodes); // NodeList(2) [title, link]

console.log(doc.body.childNodes); // NodeList(3) [p, p, a]
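
Removing the whitespace between tags can also be done programmatically before parsing. The following is only a minimal sketch: compactHtml is a hypothetical helper name (it is not a built-in), and the regular expression assumes the markup has no whitespace-sensitive content such as <pre> or <textarea>, or spaces between inline elements that should be preserved.

```javascript
// Hypothetical helper: removes whitespace that sits between two tags and
// trims the ends of the string. Assumption: no whitespace-sensitive content
// (e.g. <pre>, <textarea>) appears in the markup.
function compactHtml(htmlStr) {
    return htmlStr.replace(/>\s+</g, '><').trim();
}

const compacted = compactHtml(`
<body>
    <p>foo</p>
    <p>bar</p>
</body>
`);

console.log(compacted); // <body><p>foo</p><p>bar</p></body>
```

The compacted string can then be passed to parser.parseFromString() in place of the original template literal. Text inside elements (such as "hello world") is left untouched because the pattern only matches whitespace that lies directly between a closing ">" and an opening "<".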

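If for some reason you must use template literals for htmlStr, then you can loop through the child nodes and remove those that are not Element nodes. Note that this modifies the parsed document, and that childNodes is a live NodeList, so the loop should run over a static copy of it. For example:

function getElemNodes(nodeList) {
    // childNodes is live, so iterate over a static copy while removing nodes
    Array.from(nodeList).forEach(node => {
        if (node.nodeType !== Node.ELEMENT_NODE) {
            node.parentNode.removeChild(node);
        }
    });

    return nodeList;
}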

console.log(getElemNodes(doc.documentElement.childNodes)); // NodeList(2) [head, body]

console.log(getElemNodes(doc.head.childNodes)); // NodeList(2) [title, link]

console.log(getElemNodes(doc.body.childNodes)); // NodeList(3) [p, p, a]
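
One thing to keep in mind with this approach is that childNodes is a live list: removing a node shifts every later node down by one position. If nodes are removed while looping over the live list directly, an entry can be skipped whenever two non-element nodes sit next to each other (for example, a text node followed by a comment). The following sketch illustrates the effect using a plain array as a stand-in for the live list; the function names and the '#'-prefixed node names are hypothetical and exist only for this illustration.

```javascript
// A plain array stands in for a live list such as childNodes; removing an
// item shifts the later items down, just as removing a child node does.
const isNonElement = (name) => name.startsWith('#');

// Removes items while iterating over the same list that is being changed.
function removeWhileIterating(items) {
    items.forEach((item, index) => {
        if (isNonElement(item)) items.splice(index, 1);
    });
    return items;
}

// Iterates over a static copy, so removals cannot shift the loop position.
function removeFromSnapshot(items) {
    Array.from(items).forEach((item) => {
        if (isNonElement(item)) items.splice(items.indexOf(item), 1);
    });
    return items;
}

const skipped = removeWhileIterating(['#text', '#comment', 'p', '#text', 'a']);
const cleaned = removeFromSnapshot(['#text', '#comment', 'p', '#text', 'a']);

console.log(skipped); // ['#comment', 'p', 'a']
console.log(cleaned); // ['p', 'a']
```

In the first call, '#comment' moves into the position that was just visited and is never checked. Looping over a copy (or looping backwards) avoids this.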

Another way to do this, without modifying the parsed document, is to filter out the nodes that aren't Element nodes. Since a NodeList does not have a filter() method, we first have to convert it into an array, which means the result is an array rather than a NodeList. For example:

const getElemNodes = (nodeList) => Array.from(nodeList).filter(node => node.nodeType === Node.ELEMENT_NODE);

console.log(getElemNodes(doc.documentElement.childNodes)); // [head, body]

console.log(getElemNodes(doc.head.childNodes)); // [title, link]

console.log(getElemNodes(doc.body.childNodes)); // [p, p, a]
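
Filtering by Element type also drops text nodes that carry real content. If only the whitespace-only text nodes should be excluded, the check can be narrowed. The following is a minimal sketch under a few assumptions: isBlankText and withoutBlankText are hypothetical helper names, the node type constants are written out as numbers (1 and 3 are the values of Node.ELEMENT_NODE and Node.TEXT_NODE), and plain objects stand in for parsed nodes so that the snippet also runs outside a browser.

```javascript
// Numeric values of Node.ELEMENT_NODE and Node.TEXT_NODE, written out so
// the sketch does not depend on the global Node object.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Hypothetical helper: true for text nodes that contain only whitespace.
const isBlankText = (node) =>
    node.nodeType === TEXT_NODE && !/\S/.test(node.nodeValue);

// Hypothetical helper: returns an array without whitespace-only text nodes.
const withoutBlankText = (nodeList) =>
    Array.from(nodeList).filter(node => !isBlankText(node));

// Plain objects standing in for parsed nodes (assumption for illustration).
const sampleNodes = [
    { nodeType: TEXT_NODE, nodeValue: '\n    ' },
    { nodeType: ELEMENT_NODE, nodeName: 'P' },
    { nodeType: TEXT_NODE, nodeValue: ' trailing note ' },
];

console.log(withoutBlankText(sampleNodes).length); // 2
```

In a browser, the same function could be called with a real list such as doc.body.childNodes.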

If the reason you're parsing an HTML string is to inject the parsed HTML into an existing DOM element, then there may be more efficient ways of doing it. Have a look at our article about how to append an HTML string to an existing DOM element for more information.

This post was published 05 Oct, 2020 by Daniyal Hamid. Daniyal currently works as the Head of Engineering in Germany and has 20+ years of experience in software engineering, design and marketing.