PHP: Remove Everything But Letters And Numbers – Reg Expressions

Maruf
Jul 15th, 2008
Maruf scribbled this post.


I always forget this PHP function, but I’m finding myself regularly needing it. I thought I’d just jot it down for my own reference and for anyone else who might need it.

Function: ereg_replace

What this function basically does is remove every character from a string that isn’t a letter or a number. It can be a very handy function for error checking. Regular expression functions like this let you search for patterns within a string. Note that ereg_replace was deprecated in PHP 5.3.0 and removed in PHP 7.0.0; preg_replace is the supported replacement.

$string = "remove ever^&thing but *&^*&%£ letters & numbers*&^*";
$cleansedstring = ereg_replace("[^A-Za-z0-9]", "", $string );
echo $cleansedstring;

This should output: removeeverthingbutlettersnumbers
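
Since ereg_replace is not available in PHP 7 and later, here is a minimal sketch of the same idea using preg_replace with the identical character class. The variable name $cleansed is just my choice for illustration, and the `/` delimiters are one common convention among several that PCRE accepts.

```php
<?php
// Same input string as in the ereg_replace example.
$string = "remove ever^&thing but *&^*&%£ letters & numbers*&^*";

// [^A-Za-z0-9] matches any single byte that is NOT an ASCII letter or digit;
// replacing each match with an empty string leaves only letters and digits.
$cleansed = preg_replace('/[^A-Za-z0-9]/', '', $string);

echo $cleansed, PHP_EOL; // removeeverthingbutlettersnumbers
```

The only changes from the ereg version are the function name and the delimiters around the pattern.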

I used “[^A-Za-z0-9]” to remove every character that isn’t a letter or number, but here are some different matches:

[abc] a, b, or c
[a-z] Any lowercase letter
[^A-Z] Any character that is not an uppercase letter
(gif|jpg) Matches either “gif” or “jpg”
[a-z]+ One or more lowercase letters
[0-9\.\-] Any digit, dot, or minus sign
^[a-zA-Z0-9_]{1,}$ Any word of at least one letter, number or _
([wx])([yz]) wy, wz, xy, or xz
[^A-Za-z0-9] Any character that is not a letter or a number
([A-Z]{3}|[0-9]{4}) Matches three uppercase letters or four digits
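
To make a few of the patterns above concrete, here is a rough sketch that runs them through preg_match. The test subjects, the $checks array and the extra anchors (^, $ and the leading \.) are my own additions for illustration, so that each pattern has to match the whole subject or the file extension rather than just some part of it.

```php
<?php
// Each entry: [pattern, subject, expected result].
// The subjects and anchors are illustrative assumptions, not from the list itself.
$checks = [
    ['/^[a-z]+$/',               'hello',      true],
    ['/^[a-z]+$/',               'Hello',      false], // uppercase H is not in [a-z]
    ['/\.(gif|jpg)$/',           'photo.jpg',  true],
    ['/\.(gif|jpg)$/',           'photo.jpeg', false], // "jpeg" is neither gif nor jpg
    ['/^([wx])([yz])$/',         'xz',         true],
    ['/^[a-zA-Z0-9_]{1,}$/',     'user_42',    true],
    ['/^[a-zA-Z0-9_]{1,}$/',     'user 42',    false], // space is not allowed
    ['/^([A-Z]{3}|[0-9]{4})$/',  'ABC',        true],
    ['/^([A-Z]{3}|[0-9]{4})$/',  '12345',      false], // five digits, not four
];

$results = [];
foreach ($checks as [$pattern, $subject, $expected]) {
    // preg_match returns 1 on a match, 0 on no match.
    $matched = preg_match($pattern, $subject) === 1;
    $results[] = $matched;
    printf("%-28s %-12s %s\n", $pattern, $subject, $matched ? 'match' : 'no match');
}
```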

Note that because regular expressions are more powerful than plain string functions, they are generally also slower, so only use them when you have a particular need. Out of curiosity, does anyone know a less taxing way of getting the same result, or are regular expressions the most suitable option for stripping everything but numbers and letters?
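
For comparison, here is one possible regex-free sketch. The function name keepAlnum is hypothetical, and I am assuming only ASCII letters and digits should be kept. Whether this is actually faster than a regular expression would need measuring on your own data; I am not claiming that it is.

```php
<?php
// Hypothetical helper: keep only ASCII letters and digits, without using a regex.
function keepAlnum(string $input): string
{
    $out = '';
    $len = strlen($input);
    for ($i = 0; $i < $len; $i++) {
        $code = ord($input[$i]); // byte value of the current character
        $isDigit = $code >= 48 && $code <= 57;   // '0'..'9'
        $isUpper = $code >= 65 && $code <= 90;   // 'A'..'Z'
        $isLower = $code >= 97 && $code <= 122;  // 'a'..'z'
        if ($isDigit || $isUpper || $isLower) {
            $out .= $input[$i];
        }
    }
    return $out;
}

echo keepAlnum("remove ever^&thing but *&^*&%£ letters & numbers*&^*"), PHP_EOL;
// removeeverthingbutlettersnumbers
```

Working on byte values keeps the result independent of the locale, at the cost of being more long-winded than the one-line regex.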

For a more in-depth explanation, here’s an awesome tutorial on Using Regular Expressions.

UPDATE: 04 Sept, 2009

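Thanks to a reader, I’ve been informed about a more efficient way of stripping everything but alphanumeric characters. Note that \W matches anything that is not a letter, digit, or underscore, so underscores are kept as well.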

$string = "remove ever^&thing but *&^*&%£ letters & numbers*&^*";
$cleansedstring = preg_replace('#\W#', '', $string);
echo $cleansedstring;

Excellent.
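
One thing worth checking with the \W version is how it treats underscores, since \w counts the underscore as a word character. The sketch below compares the two patterns on a sample string of my own choosing; the variable names are illustrative only.

```php
<?php
// Sample input chosen to contain an underscore, a hyphen and punctuation.
$sample = "user_name-42!";

// \W removes everything except letters, digits AND underscores.
$withW = preg_replace('#\W#', '', $sample);

// The explicit class removes underscores as well.
$withClass = preg_replace('#[^A-Za-z0-9]#', '', $sample);

echo $withW, PHP_EOL;     // user_name42
echo $withClass, PHP_EOL; // username42
```

If underscores must go too, the explicit [^A-Za-z0-9] class is the safer choice of the two.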


Filed away: PHP

comments

1

This is actually incorrect.
Your posted example will have the exact opposite effect that you are intending it to here.
Your replace will actually match ALL letters and numbers, and replace them with nothing. Leaving you a string with a bunch of random chars.

Nick Overstreet
Aug 11th, 2008
2

You’re wrong :)
To double check, I just tried it out. Only numbers and letters remained.

Maruf
Aug 11th, 2008
3

Thanks, I needed this regular expression.

deniar
Sep 12th, 2008
4

Another way would be to make an array of all the unwanted characters and then use str_replace(). More long-winded, but faster... who knows?

Ben
Feb 16th, 2009
5

This is perfect! Thanks. Worked like a charm.

Tom
Mar 30th, 2009
6

Thank you very much! You saved my ass. Works perfectly.

Daniel
Aug 4th, 2009
7

Maybe I don’t quite get what you are after, but if you just want to be left with alphanumeric characters then wouldn’t you use \w?

eg:
preg_replace('#\W#', '', $string)

BLOGERCISE
Sep 4th, 2009
8

Hey Blogercise,

I’ve never come across that method before, but it works great!! Thanks a lot.

I’ve updated the blog post :)

Maruf
Sep 4th, 2009
9

Excellent, thanks!

Ian
Oct 27th, 2009
