strip_tags() – Less Than You Bargained For

While strip_tags() is handy for removing HTML from a string, be forewarned that the function is a lot less picky than you might think when it comes to the less-than symbol ( < ). First, the section from the PHP book:


strip_tags($string [, allowed_tags])

allowed_tags – [optional] $string

Remove HTML tags and comments from $string. If specific tags should be
excluded, they can be specified inside allowed_tags.

Examples:
$string = "<p>This is a paragraph. </p><strong>Yay!</strong>";
echo strip_tags($string), strip_tags($string, '<p>');

HTML Source Code:

This is a paragraph. Yay!<p>This is a paragraph. </p>Yay!


So what happens in the following example, where we want to remove all the tags? Fair warning, something strange happens:

$string = "I <strong>love</strong> this book because it costs <$20.";
echo strip_tags($string);

HTML Source Code:

I love this book because it costs

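As you can see, it removed the <$20. portion of the string as well, even without a closing greater-than ( > ) symbol at the end. Once strip_tags() sees a < that is not followed by whitespace, it treats everything after it as a tag, and if the tag never closes, the rest of the string is thrown away. Be careful when using strip_tags() on text that may contain a literal less-than symbol, or consider an alternative such as htmlspecialchars(), which encodes the special characters as their HTML entities rather than removing them.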
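
To see the difference side by side, here is a minimal sketch that runs both functions on the same string. The variable names $stripped and $encoded are only for illustration.

```php
<?php
// The sample string from above. Single quotes keep $20 literal.
$string = 'I <strong>love</strong> this book because it costs <$20.';

// strip_tags() treats "<$20." as the start of an unclosed tag and drops it.
$stripped = strip_tags($string);

// htmlspecialchars() keeps every character and encodes the special ones.
$encoded = htmlspecialchars($string);

echo $stripped, "\n";
echo $encoded, "\n";
```

The second line of output keeps the price, at the cost of showing the <strong> tags as visible text instead of removing them.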

Display all PHP Errors and Warnings

Every time I was debugging my pages, I found myself searching around for this little chunk of code to display PHP errors. So I put it in the book, where it was always nearby. Since people still search Google endlessly for it, I thought I would provide it here as well. If you wish to see all the PHP errors and warnings in your script, include the following bit of code at the top:

error_reporting(E_ALL);
ini_set('display_errors', '1');

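Now, continue to beat your head against your keyboard while you hunt down your missing semicolon or closing parenthesis. One caveat: a syntax error in the same file stops PHP before these two lines ever run, so for those you will need display_errors turned on in php.ini instead.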
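
If you would rather not paste those two lines into every page, one option is to wrap them in a small helper. This is only a sketch: the function name show_all_errors() and its $debug flag are made up for this example and are not part of PHP.

```php
<?php
// Hypothetical helper: turn on-page error display on or off in one call.
function show_all_errors(bool $debug): void
{
    // Report every error level either way.
    error_reporting(E_ALL);
    // Only print errors to the page while debugging.
    ini_set('display_errors', $debug ? '1' : '0');
}

show_all_errors(true);
```

Calling show_all_errors(false) on a live site keeps error details out of your visitors' view.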

Formatting Characters

We’ve all seen them:

  • \n – new line
  • \r – carriage return
  • \t – tab
  • \b – backspace (in C and JavaScript; PHP strings leave \b as a literal backslash and b)

But many wonder when to use them or, more specifically, why they aren’t working as expected. So let’s address the basic usage and rules.

Rule #1: When using a formatting character in your code, it must be within “double quotes”. Inside ‘single quotes’ it will be taken as a literal backslash and letter.
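
A quick way to convince yourself of Rule #1 is to compare string lengths. In this small sketch the variable names are arbitrary:

```php
<?php
$double = "Line one\nLine two"; // \n becomes a single new line character
$single = 'Line one\nLine two'; // \n stays as a backslash followed by n

echo strlen($double), "\n"; // 17
echo strlen($single), "\n"; // 18
```

The single-quoted version is one character longer because the backslash and the n are stored as two separate characters.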

When do you use them? When writing to a file with fwrite() or file_put_contents(), sending a text email with mail(), or adding formatting to pre-populated data in a <textarea> form element. Notice I made no mention of direct HTML output. New lines, tabs, and carriage returns do show up in HTML inside the preformatted tags <pre></pre>, but in most cases those tags are not present and the browser collapses these characters into ordinary spaces.

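Rule #2: Not all computer systems treat the formatting characters the same way. Unix-like systems end a line with \n (new line) alone, while Windows and many Internet protocols, email included, expect a carriage return followed by a new line (\r\n, in that order). When in doubt about where your text will end up, use \r\n.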
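
Since line endings differ between systems, text that comes in from a form or a file may mix several styles. One common approach is to convert everything to \n before working with it. The helper name normalize_newlines() below is hypothetical, not a built-in PHP function.

```php
<?php
// Hypothetical helper: convert \r\n and lone \r line endings to \n.
function normalize_newlines(string $text): string
{
    // Replace \r\n first so it does not turn into two new lines.
    return str_replace(["\r\n", "\r"], "\n", $text);
}

$mixed = "one\r\ntwo\rthree\nfour";
$clean = normalize_newlines($mixed);

echo $clean;
```

After this call every line in $clean ends with a plain \n, whatever system the text came from.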

So what do you do if you have a paragraph, for instance one submitted through a <textarea> form, that is preformatted and you want its \n (new line) breaks to show up in the HTML? That’s when you pass the string to nl2br(), which inserts the XHTML line break <br /> before every new line.

Example:

echo nl2br("Hello\r\nWorld\r\n!!!");

Results:

Hello
World
!!!
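
When the text comes from a visitor, it is usually worth encoding it before adding the line breaks. The sketch below assumes a made-up $submitted value standing in for real <textarea> input.

```php
<?php
// Hypothetical stand-in for text submitted through a <textarea>.
$submitted = "Hello\r\nWorld & <friends>";

// Encode first so the input cannot inject tags, then add the <br /> tags.
$html = nl2br(htmlspecialchars($submitted));

echo $html;
```

The order matters here: calling htmlspecialchars() after nl2br() would encode the <br /> tags too, and they would show up as text on the page.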