We’re just about ready to start building our theme’s template files. Before we do this, however, it’s time for a quick briefing on data validation and sanitation, an important procedure we’ll take to ensure that our theme follows best security practices.
Why Is Theme Security Important?
The following line from the WordPress Codex page on Data Validationsums it up nicely:
Untrusted data comes from many sources (users, third party sites, your own database!, …) and all of it needs to be validated both on input and output.
We have to assume that all data coming in and out of your WordPress database is unsafe, and validate and sanitize it depending on the nature of the data and the context in which it is used. This helps to prevent code and markup from becoming “live” when you try to display it on your site. For example, we don’t want HTML code entered into a text box on a settings page to actually run as real HTML within the theme files, because that could break our layout. Even worse is if that “live” code is JavaScript, or an SQL query, because then your site could be at risk for Cross-Site Scripting (XSS) attacks, or SQL Injections.
WordPress provides a number of functions that we can use to make our data safe. These functions help by:
- Converting special characters such as single and double quotes, ampersands, and greater-than and less-than signs into their entity equivalents (", <, >, etc) so that they can’t be interpreted as code. This is known as output sanitation, or escaping.
- Ensuring that data about to be input into your database is what you intend it to be (for example, checking that a text box actually contains safe text that is free of HTML tags). This is typically known as input validation.
During this tutorial, we’ll be mostly concerned with #1 above, sanitizing/escaping data.
Scenario #2 becomes important for themes that collect data from users, such as on a theme options page. Theme Options pages are outside of the scope of this tutorial, however.
Output Sanitation/Escaping
Our primary sanitation weapons of choice throughout this tutorial will be esc_attr(), and esc_attr_e(). We may use others at times, and I’ll point them out when we get to them.
Both of these functions weed out characters such as quotes, ampersands and greater-than and less-than signs that, when printed inside HTML attributes, could be misinterpreted as code. esc_attr() is meant for escaping code for use in PHP, while esc_attr_e() is used when we want to echo (display on the screen) the code we’re escaping.
Here’s a live example, using code that we’ll work with in our lesson on the index template.
<h1 class="entry-title">
<a href="<?php the_permalink(); ?>" title="<?php echo esc_attr( sprintf( __( 'Permalink to %s', 'shape' ), the_title_attribute( 'echo=0' ) ) ); ?>" rel="bookmark">
<?php the_title(); ?></a></h1>
This code displays post titles. Even if you don’t understand everything it’s doing, notice how we use esc_attr() to wrap everything inside the “title=” attribute on the <a>
tag? All data inside HTML attribute tags is assumed to be unsafe. Thus: <?php echo esc_attr( sprintf( __( 'Permalink to %s', 'book' ), the_title_attribute( 'echo=0' ) ) ); ?>
could contain anything, including potentially unsafe characters. esc_attr() adds a layer of protection by converting unsafe characters into their HTML entity equivalents.
We’ll see many more examples like this as we work through the lessons.
For an in-depth overview of Data Sanitation and Validation, check outData Validation and Sanitization With WordPress by Stephen Harris.
You’re on your way to becoming a security-conscious developer!