TrendLabs Malware Blog - Trend Micro
    I recently made up two nonsensical domain names. Can you spot the difference between them?

    In a modern Unicode-capable browser, they are likely to appear identical, but if you copy and paste each one into a search engine, you will get different results. The domain on the right was created using Cyrillic characters, while the one on the left was created using Latin (US-ASCII) characters. While most Cyrillic characters differ vastly from US-ASCII characters, a handful of them look at home in either character set (see page 2 of the chart).
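
To make the look-alike effect concrete, here is a minimal Python sketch. It uses a hypothetical pair of my own choosing, Latin "caxap" and a Cyrillic string that renders almost identically, rather than the two domain names discussed above. It only shows that the two strings are built from different code points.

```python
import unicodedata

# Hypothetical look-alike pair (an assumption for illustration only).
latin = "caxap"                              # plain US-ASCII letters
cyrillic = "\u0441\u0430\u0445\u0430\u0440"  # Cyrillic letters that resemble c, a, x, a, p


def describe(text):
    """Return 'U+XXXX NAME' for every character in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


print(latin == cyrillic)  # False: the strings only look the same
for left, right in zip(describe(latin), describe(cyrillic)):
    print(f"{left:<30} | {right}")
```

Each row of the output pairs a LATIN SMALL LETTER with the CYRILLIC SMALL LETTER that imitates it.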

    In a hex dump, you can clearly see the difference between the two domain names. Note that .com is plain ASCII in both names (see Figure 1).
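
A similar byte-level comparison can be sketched in Python. The example assumes the names are stored as UTF-8 and again uses a hypothetical look-alike pair, not the domains shown in Figure 1.

```python
def hexdump(text):
    """Return the UTF-8 bytes of text as space-separated hex pairs."""
    return " ".join(f"{byte:02x}" for byte in text.encode("utf-8"))


# Hypothetical look-alike pair (an assumption for illustration only).
latin_domain = "caxap.com"
cyrillic_domain = "\u0441\u0430\u0445\u0430\u0440.com"

print(hexdump(latin_domain))     # 63 61 78 61 70 2e 63 6f 6d
print(hexdump(cyrillic_domain))  # d1 81 d0 b0 d1 85 d0 b0 d1 80 2e 63 6f 6d
```

Each Cyrillic letter takes two bytes in UTF-8, while the trailing ".com" (2e 63 6f 6d) is byte-for-byte the same in both names.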


    I then created a simple HTML file with links to each domain name. Note that the ASCII domain name is italicized in the anchor tag while the Unicode domain name is not (see Figure 2).

    When I first opened the file in Firefox 3.5, the character encoding was set to ISO-8859-1 (Western), so the Unicode link clearly differed (see Figure 3).


    A quick change of the character encoding to Unicode (UTF-8), however, produced an altogether different result (see Figure 4).
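
The effect of the encoding setting can be approximated in Python by decoding the same bytes two ways. This is only a sketch: it assumes the page was saved as UTF-8, uses a hypothetical Cyrillic name, and relies on Python's "iso-8859-1" codec, which may not match a browser's "Western" handling byte for byte.

```python
# Hypothetical Cyrillic look-alike name (an assumption for illustration only).
cyrillic_domain = "\u0441\u0430\u0445\u0430\u0440.com"

raw = cyrillic_domain.encode("utf-8")   # the bytes as saved in the HTML file

as_western = raw.decode("iso-8859-1")   # every byte becomes one character
as_utf8 = raw.decode("utf-8")           # byte pairs become Cyrillic letters again

print(repr(as_western))  # garbled text followed by ".com"
print(as_utf8)           # renders like the Latin look-alike
```

Under the single-byte decoding the link text is obviously wrong. Under UTF-8 it is restored to the Cyrillic string that passes for a Latin one.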

    Gizmodo points out that this works with other strings as well. An attacker thus only needs to find commonly used code pages whose characters can be pieced together to spoof legitimate sites. In my brief testing, however, I found that using more than one Unicode block in a single URL produces unpredictable results.
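
One way to flag names that draw on more than one script is sketched below. Python's standard library has no script property, so this takes the first word of each character's Unicode name as a rough stand-in. That shortcut, the function names, and the sample labels are all assumptions for illustration, not how any browser or registry actually performs the check.

```python
import unicodedata


def scripts_in_label(label):
    """Approximate the set of scripts used by the letters in label.

    Heuristic: the first word of the Unicode character name
    (for example LATIN, CYRILLIC, GREEK) stands in for the script.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts


def is_mixed_script(label):
    """Return True when the letters of label come from more than one script."""
    return len(scripts_in_label(label)) > 1


print(is_mixed_script("caxap"))                           # all Latin
print(is_mixed_script("\u0441\u0430\u0445\u0430\u0440"))  # all Cyrillic
print(is_mixed_script("p\u0430ypal"))                     # Latin with one Cyrillic letter
```

A whole-script look-alike passes this check, which is why single-script spoofs are the harder case.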

    The Internet Corporation for Assigned Names and Numbers (ICANN) recently approved the use of internationalized domain names (IDNs), and the additional security risks they may pose have been under discussion since. Some believe that allowing IDNs can make antiphishing efforts harder.

    Simply put, IDNs work by converting Unicode strings into punycode strings before the browser queries the Domain Name System (DNS). For instance, the punycode version of is At, they have a handy tool for converting Unicode strings into punycode and back.
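
The conversion can be tried with Python's built-in "idna" and "punycode" codecs. This is a sketch with a hypothetical Cyrillic name. Note that the built-in "idna" codec follows the older IDNA 2003 rules, so it may not match current registry behavior in every case.

```python
# Hypothetical Cyrillic look-alike name (an assumption for illustration only).
cyrillic_domain = "\u0441\u0430\u0445\u0430\u0440.com"

# The "idna" codec converts each non-ASCII label to punycode and adds "xn--".
ascii_form = cyrillic_domain.encode("idna")
print(ascii_form)

# Decoding reverses the conversion.
round_trip = ascii_form.decode("idna")
print(round_trip == cyrillic_domain)

# The raw "punycode" codec produces the same label without the "xn--" prefix.
label = "\u0441\u0430\u0445\u0430\u0440".encode("punycode")
print(b"xn--" + label + b".com" == ascii_form)

# A pure ASCII name is left unchanged.
print("caxap.com".encode("idna"))
```

The ASCII form starting with "xn--" is what is actually sent in the DNS query, so the two look-alike names resolve as entirely different domains.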

    It took some digging, but I did find a few registrars that support punycoded domain names under existing top-level domains (.com, .net, etc.). Should a cybercriminal register such a look-alike name, he or she can create a pass-through of the ASCII page and use email, instant messaging (IM), or social networking to entice users to click a Unicode link that connects them to a look-alike phishing page. Simply double-checking a site’s name may thus no longer be enough.

    If you want to see how real IDNs react in your browsers and other tools, take a look here for some active samples.


    • Felisa Picker

      Have you ever considered adding more videos to your blog posts to keep readers more entertained? I just read through your entire article and it was quite good, but since I'm more of a visual learner, I find videos more helpful. Let me know how it turns out. Keep up the great work, guys; I've added you to my blogroll. This is a great, informative article, thanks for sharing. I will visit your blog regularly for the latest posts.

    • Normm

      There are safeguards already in place. The DNS community and ICANN have been working on this for years.

      Just because a "character" exists in Unicode does not make it available for use in domain names. Several characters are what we call DISALLOWED in the IDNA protocol. All domain names need to be valid under the IDNA protocol in order to be registered and resolvable. On top of the protocol come the IDN Guidelines, which contain the rule against mixing scripts and languages and are enforced through a contractual relationship with ICANN, and further rules and requirements go on top of that.
      See these articles for clarification:

      Or contact the people who look after this directly:

      Tina Dam is the Sr. Director of IDNs at ICANN

    • Normm

      At least you made an effort to be accurate by not using the PayPal example. isn't going to fool anyone, however, since it doesn't mean anything to anyone. The Gizmodo article was also erroneous, by the way. It isn't possible to mix scripts between Cyrillic and Latin. Can you find an example that might actually be a threat? I think you will find it is fairly difficult. If you do find one, better check that it hasn't been defensively registered, thus blocking it.

      ICANN has been working since 2005 on this specific issue but nobody has asked them about the systems and procedures that have been put in place.

      I'm sure Tina Dam at ICANN would be happy to answer any questions you might have.

    • ameyer

      I don't think this will be much of a problem. IE7 and IE8 already recognize international website addresses and display them in their punycode format in the browser address bar. Even hovering the mouse over the link causes the punycode version to be displayed in the status bar at the bottom of the browser window.
      However, IE6 is vulnerable to this form of spoofing. I did note, however, that in IE7 and IE8, if you view the source code for the HTML page, the two href values look identical in that window.
      Still, I imagine someone will try it or some derivative, and they may catch a few unwitting users, but for the most part this potential problem has already been recognized by web browser developers.


    © Copyright 2013 Trend Micro Inc. All rights reserved. Legal Notice