NorthSec 2020 (Online Edition)

Unicode vulnerabilities that could byͥte you
2020-05-15, 17:40–18:25, Twitch

Transformation of Unicode characters can lead to various side effects. In this talk, you will learn why normalization and capitalization can be misused and affect modern applications.


The number of Unicode code points has never stopped growing just like its integration in modern technologies. Web applications you have developed or used are likely to support input and output formatted in UTF-8 character encoding.

In this talk, you will learn about the security implications of encoding conversion. Normalizing a UTF-8 string to ASCII only character has numerous potential side effects. The latest research affecting Unicode will be summarized including the HostSplit and HostBond attacks. The HostSplit attack abuses minor characters conversion to trigger open redirect or Server-Side Request Forgery (SSRF). While HostBond is a risk affecting service provider giving subdomain to account created by users. Aside from normalization, uppercase and lowercase transformation can introduce vulnerability. Encoding can be used to circumvent security controls such as Web Application Firewalls. Punycode is the new representation to support domains with special characters outside of ASCII. This representation can be used to create visual confusion to end users.

While some issues were patched in major software, many risks remain or are likely to resurface. Get ready for a complete summary of everything security professionals should know about Unicode!