Contextual Escaping

Websites and web applications are vulnerable to XSS attacks. Although PHP provides escaping functions, they are not sufficient or appropriate in some contexts. :doc:`Phalcon\Escaper <../api/Phalcon_Escaper>` provides contextual escaping. The component is written in C, which keeps overhead minimal when escaping different kinds of text.

This component is designed around the XSS (Cross Site Scripting) Prevention Cheat Sheet published by OWASP.

Additionally, this component relies on the mbstring extension to support almost any charset.

To illustrate how this component works and why it is important, consider the following example:

  <?php
  // Document title with malicious extra HTML tags
  $maliciousTitle = '</title><script>alert(1)</script>';
  // Malicious CSS class name
  $className = ';`(';
  // Malicious CSS font name
  $fontName = 'Verdana"</style>';
  // Malicious JavaScript text
  $javascriptText = "';</script>Hello";
  // Create an escaper
  $e = new Phalcon\Escaper();
  ?>
  <html>
  <head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
      <title><?php echo $e->escapeHtml($maliciousTitle) ?></title>
      <style type="text/css">
      .<?php echo $e->escapeCss($className) ?> {
          font-family : "<?php echo $e->escapeCss($fontName) ?>";
          color: red;
      }
      </style>
  </head>
  <body>
      <div class='<?php echo $e->escapeHtmlAttr($className) ?>'>hello</div>
      <script>var some = '<?php echo $e->escapeJs($javascriptText) ?>'</script>
  </body>
  </html>

Which produces the following:

../_images/escape.jpeg

Each piece of text is escaped according to its context. Using the appropriate context is essential to avoid XSS attacks.

Escaping HTML

The most common place where unsafe data gets inserted is between HTML tags:

  <div class="comments"><!-- Escape untrusted data here! --></div>

You can escape this data using the escapeHtml method:

  <div class="comments"><?php echo $e->escapeHtml('></div><h1>myattack</h1>'); ?></div>

Which produces:

  <div class="comments">&gt;&lt;/div&gt;&lt;h1&gt;myattack&lt;/h1&gt;</div>

Escaping HTML Attributes

Escaping HTML attributes differs from escaping full HTML content. The escaper works by changing every non-alphanumeric character to the form &#xHH;. This kind of escaping is intended for simple attributes, excluding complex ones like 'href' or 'url':

  <table width="Escape untrusted data here!"><tr><td>Hello</td></tr></table>

You can escape an HTML attribute using the escapeHtmlAttr method:

  <table width="<?php echo $e->escapeHtmlAttr('"><h1>Hello</table'); ?>"><tr><td>Hello</td></tr></table>

Which produces:

  <table width="&#x22;&#x3e;&#x3c;h1&#x3e;Hello&#x3c;&#x2f;table"><tr><td>Hello</td></tr></table>

Escaping URLs

Some HTML attributes, like 'href' or 'url', need to be escaped differently:

  <a href="Escape untrusted data here!">Some link</a>

You can escape a URL using the escapeUrl method:

  <a href="<?php echo $e->escapeUrl('"><script>alert(1)</script><a href="#'); ?>">Some link</a>

Which produces:

  <a href="%22%3E%3Cscript%3Ealert%281%29%3C%2Fscript%3E%3Ca%20href%3D%22%23">Some link</a>
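The percent-encoded output above has the same shape as what PHP's built-in rawurlencode (RFC 3986 encoding) returns for this input. The sketch below uses that function as a stand-in to show the transformation. Whether escapeUrl delegates to it is an assumption, not something stated here.

```php
<?php
// Untrusted value destined for an href attribute (same as above).
$untrusted = '"><script>alert(1)</script><a href="#';

// rawurlencode percent-encodes everything except letters, digits
// and the characters "-", "_", "." and "~".
$encoded = rawurlencode($untrusted);

echo '<a href="' . $encoded . '">Some link</a>' . PHP_EOL;
```

Percent-encoding suits a single URL component such as a query value. Applying it to a complete URL would also encode the scheme and path separators.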

Escaping CSS

CSS identifiers and values can be escaped too:

  <a style="color: Escape untrusted data here">Some link</a>

You can escape a CSS value using the escapeCss method:

  <a style="color: <?php echo $e->escapeCss('"><script>alert(1)</script><a href="#'); ?>">Some link</a>

Which produces:

  <a style="color: \22 \3e \3c script\3e alert\28 1\29 \3c \2f script\3e \3c a\20 href\3d \22 \23 ">Some link</a>
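The output above follows the CSS escape syntax: a backslash, the hexadecimal code point, and a terminating space. A minimal sketch of that rule in plain PHP follows. The helper name sketchEscapeCss is hypothetical, and the real component may treat some characters differently. It assumes UTF-8 input and the mbstring extension.

```php
<?php
// Hypothetical helper for illustration; not Phalcon's implementation.
function sketchEscapeCss(string $value): string
{
    // Rewrite each non-alphanumeric character as "\HH " (backslash,
    // hex code point, trailing space that terminates the escape).
    $result = preg_replace_callback(
        '/[^a-zA-Z0-9]/u',
        static function (array $m): string {
            return '\\' . dechex(mb_ord($m[0], 'UTF-8')) . ' ';
        },
        $value
    );

    // preg_replace_callback returns null on invalid UTF-8 input.
    return $result ?? '';
}

$css = sketchEscapeCss('"><script>alert(1)</script><a href="#');
echo $css . PHP_EOL;
```

The trailing space keeps a following hexadecimal digit or letter from being read as part of the escape sequence.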

Escaping JavaScript

Strings inserted into JavaScript code must also be properly escaped:

  <script>document.title = 'Escape untrusted data here'</script>

You can escape a JavaScript string using the escapeJs method:

  <script>document.title = '<?php echo $e->escapeJs("'; alert(100); var x='"); ?>'</script>

Which produces:

  <script>document.title = '\x27; alert(100); var x\x3d\x27'</script>
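A different technique that is common in plain PHP is to emit the value as a JSON string literal with json_encode and the JSON_HEX_* flags. This is not what escapeJs does (the output format differs, using \u0027 rather than \x27); it is shown only as a sketch of escaping for the JavaScript string context without the extension.

```php
<?php
// Untrusted value destined for a JavaScript string (same as above).
$untrusted = "'; alert(100); var x='";

// The JSON_HEX_* flags encode <, >, &, ' and " as \uXXXX sequences,
// so the value cannot close the string or the <script> element.
$literal = json_encode(
    $untrusted,
    JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_AMP
);

// json_encode adds the surrounding double quotes itself.
echo '<script>document.title = ' . $literal . '</script>' . PHP_EOL;
```

Since the returned literal already includes its own quotes, the template must not wrap it in another pair.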