How to Remove Punctuation From a String in PHP?

Removing Specific Punctuation Marks Using str_replace()

You could leverage the fact that str_replace() can take an array of values/needles to find and replace in a string, and do something like the following:

$string = 'Hello, how are you?';

echo str_replace(['?', '!', '.'], '', $string);

// output: 'Hello, how are you';

In the example above, we're only replacing the ?, !, and . characters. Therefore, the resulting string has the question mark stripped, but the comma remains.
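As a small extension of the example above, str_replace() also accepts an optional fourth argument that receives the total number of replacements made. The sketch below uses an illustrative input string and variable names ($clean, $count) that are not part of the original example:

```php
<?php
// Illustrative input; the variable names are hypothetical.
$string = 'Wait... what?! Really?';

// The fourth argument is filled with the total number of replacements.
$clean = str_replace(['?', '!', '.'], '', $string, $count);

echo $clean . PHP_EOL; // Wait what Really
echo $count . PHP_EOL; // 6
```

This can be handy for checking whether a string contained any of the listed marks at all.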

Removing Punctuation Using preg_replace()

Below are some examples of how we can remove punctuation using regular expressions and built-in PCRE (Perl Compatible Regular Expressions) character classes:

Removing Specific Punctuation Marks:

You could specify which marks you wish to remove in a regular expression and replace them in a string using preg_replace() like so:

$string = 'Hello, how are you?';

echo preg_replace('/[?.!]/', '', $string);

// output: 'Hello, how are you';

In the example above, we're only replacing the ?, !, and . characters. Therefore, the resulting string has the question mark stripped, but the comma remains.

In a character class (i.e. characters inside square brackets in a regular expression) any character, except ^, -, ] or \, is a literal and does not need to be escaped. This also applies to the pipe (|): inside a character class it is a literal character rather than an alternation operator, so the marks are simply listed one after another.
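When the marks to remove are not known in advance, the character class can be built from a string instead of being written by hand. The following is a minimal sketch, assuming a hypothetical helper named remove_marks(); it relies on preg_quote() to escape the characters that are special inside a character class:

```php
<?php
// Hypothetical helper: removes every character listed in $marks from $string.
function remove_marks(string $string, string $marks): string
{
    // preg_quote() escapes regex metacharacters (including ^, -, ] and \),
    // so the marks are safe to place inside a character class.
    $pattern = '/[' . preg_quote($marks, '/') . ']/';

    // preg_replace() returns null on error (e.g. an empty $marks yields an
    // invalid pattern); fall back to the unchanged input in that case.
    return preg_replace($pattern, '', $string) ?? $string;
}

// Input is: a-b]c^d\e, [f]!   Marks are: - ] ^ \ [ !
echo remove_marks('a-b]c^d\\e, [f]!', '-]^\\[!') . PHP_EOL; // abcde, f
```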

Removing Unicode Punctuation Characters Using PCRE Character Classes:

Using the PCRE Unicode character class \p{P} (or \pP), together with the u (UTF-8) pattern modifier, we can remove all Unicode punctuation characters from a string like so:

$string = 'Hello, how are you?';

echo preg_replace('/\p{P}/u', '', $string);

// output: 'Hello how are you';
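The benefit of \p{P} shows up with punctuation outside the ASCII range. The sketch below uses an illustrative string and assumes that the source file is saved as UTF-8; the u modifier tells PCRE to treat both the pattern and the subject as UTF-8, so multibyte punctuation is matched as whole characters rather than as individual bytes:

```php
<?php
// Assumes this source file is saved as UTF-8.
$string = '¿Cómo estás? “Bien”…';

// \p{P} covers the inverted question mark, curly quotes and the ellipsis;
// the accented letters are left untouched.
$clean = preg_replace('/\p{P}/u', '', $string);

echo $clean . PHP_EOL; // Cómo estás Bien
```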

There are, of course, other PHP PCRE Unicode punctuation character sets that you can use:

Sequence  Description
\p{P}     All punctuation characters.
\p{Pd}    All hyphens and dashes.
\p{Ps}    Any kind of opening bracket.
\p{Pe}    Any kind of closing bracket.
\p{Pi}    Any kind of opening/initial quote.
\p{Pf}    Any kind of closing/final quote.
\p{Pc}    Any kind of character that connects words (e.g. underscore, etc.).

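If these character classes are not available in your environment, the PCRE library that PHP was built against was probably compiled without Unicode property support (for the older PCRE 8.x library, this is the --enable-unicode-properties configure option).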

The following shows how you could combine multiple character sets using the pipe/alternation operator:

$string = 'Strip-All-Dashes_And_Underscores';

echo preg_replace('/\p{Pd}|\p{Pc}/u', '', $string);

// output: 'StripAllDashesAndUnderscores';
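As an alternative to alternation, several Unicode properties can also be placed inside a single bracketed character class. The following sketch, with an illustrative input string that is assumed to be UTF-8 encoded, replaces each run of dashes or connectors with one space instead of deleting it:

```php
<?php
// Assumes this source file is saved as UTF-8 (the input contains an en dash).
$string = 'Strip-All–Dashes_And__Underscores';

// One character class holding both properties; the + quantifier collapses
// consecutive matches so that each run becomes a single space.
$words = preg_replace('/[\p{Pd}\p{Pc}]+/u', ' ', $string);

echo $words . PHP_EOL; // Strip All Dashes And Underscores
```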

Removing Punctuation Characters Using POSIX Character Class:

We could use the POSIX punct character class to find and replace all the punctuation characters with preg_replace(), like so:

$string = 'Hello, how are you?';

echo preg_replace('/[[:punct:]]/', '', $string);

// output: 'Hello how are you';

[:punct:] is a POSIX character class, written inside a bracket expression as [[:punct:]], that denotes a locale-aware set of punctuation characters. The syntax is also supported by PCRE, which is the regex engine PHP uses.

POSIX character classes can be a bit flaky in some instances since they are locale-dependent. This means that if the vendor implementation of a character set for a locale is not up to date, the results might suffer. By comparison, Unicode character classes rely solely on the standard Unicode punctuation categories and are, therefore, more reliable.
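The two approaches also differ in which characters they cover. For ASCII input, [[:punct:]] matches characters such as $, + and =, which Unicode classifies as symbols (\p{S}) rather than punctuation. The following sketch, using an illustrative input string and the default locale, shows the difference:

```php
<?php
// Illustrative ASCII-only input.
$string = '$5 + $3 = $8!';

// POSIX class: removes $, +, = and ! (the surrounding spaces remain).
$posix = preg_replace('/[[:punct:]]/', '', $string);

// Unicode punctuation: removes only the exclamation mark.
$unicode = preg_replace('/\p{P}/u', '', $string);

echo $posix . PHP_EOL;   // 5  3  8
echo $unicode . PHP_EOL; // $5 + $3 = $8
```

Which behavior is preferable depends on whether currency and math symbols should count as punctuation for your use case.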

Removing All Punctuation Excluding Some:

We can combine negative lookahead (?!) with a punctuation character set to exclude some punctuation characters from being removed, like so:

$string = 'Hello, how are you?';

// using PCRE character class:
echo preg_replace('/(?![!,])\p{P}/u', '', $string);

// using POSIX character class:
echo preg_replace('/(?![!,])[[:punct:]]/', '', $string);

// output: 'Hello, how are you';

In the example above, we're removing all punctuation characters except the exclamation mark and the comma.
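The same lookahead technique can be wrapped in a reusable function. This is only a sketch: the function name strip_punctuation_except() and its parameters are hypothetical, and it assumes UTF-8 input and a non-empty list of marks to keep:

```php
<?php
// Hypothetical helper: removes all Unicode punctuation from $string,
// except for the characters listed in $keep (assumed to be non-empty).
function strip_punctuation_except(string $string, string $keep): string
{
    // preg_quote() makes the kept characters safe inside a character class.
    $pattern = '/(?![' . preg_quote($keep, '/') . '])\p{P}/u';

    // Fall back to the unchanged input if preg_replace() reports an error.
    return preg_replace($pattern, '', $string) ?? $string;
}

// Keep apostrophes and hyphens, remove everything else.
echo strip_punctuation_except("Don't over-think it, okay?!", "'-") . PHP_EOL;
// Don't over-think it okay
```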

Another way of doing the same is to strip everything from the string except for the characters we allow. We can achieve this by using a negated character class (^) that lists all the characters we want to keep. Note that this removes every character not listed, not just punctuation:

$string = 'Hello, how are you?';

// using a negated character class:
echo preg_replace('/[^a-z0-9!, ]/i', '', $string);

// output: 'Hello, how are you';
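One caveat of the a-z0-9 allow-list is that it also strips accented and other non-ASCII letters. A possible variation, sketched below with an illustrative UTF-8 string, allows any Unicode letter (\p{L}) and number (\p{N}) instead:

```php
<?php
// Assumes this source file is saved as UTF-8.
$string = 'Héllo, wörld... 123!';

// ASCII allow-list: the bytes of the accented letters are removed as well.
$ascii = preg_replace('/[^a-z0-9!, ]/i', '', $string);

// Unicode allow-list: letters and numbers from any script are kept.
$unicode = preg_replace('/[^\p{L}\p{N}!, ]/u', '', $string);

echo $ascii . PHP_EOL;   // Hllo, wrld 123!
echo $unicode . PHP_EOL; // Héllo, wörld 123!
```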

This post was published by Daniyal Hamid. Daniyal currently works as the Head of Engineering in Germany and has 20+ years of experience in software engineering, design and marketing.