It uses an encoding scheme called Punycode (as you've already discovered from your testing in Python), which can represent Unicode characters using only ASCII. Each label (labels are delimited by dots, so get.me.a.coffee.com has five) that contains non-ASCII characters is encoded in Punycode and prefixed with the string xn--. The encoding first copies all the ASCII characters of the label, then appends the encoded non-ASCII characters. The encoded part always comes after the final - in the label, so a - is inserted after the ASCII characters when the label contains any. More detail can be found in this page over at the W3C site, and in RFC 3987. Punycode itself is specified in RFC 3492; for a walkthrough of how it actually encodes labels, see the Wikipedia page.
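As a rough illustration of the per-label scheme described above, here is a minimal Python sketch built on the standard library's punycode codec. The helper names to_ascii_label and to_ascii_hostname are made up for this example, and the sketch deliberately skips the normalization and case-folding steps that real IDNA processing performs, so treat it as a demonstration rather than a complete implementation.

```python
def to_ascii_label(label: str) -> str:
    """Encode a single label (hypothetical helper, not a library function)."""
    # Labels that are already pure ASCII are left untouched.
    if label.isascii():
        return label
    # Python's built-in "punycode" codec copies the ASCII characters first,
    # adds a "-" delimiter if there were any, then appends the encoded rest.
    return "xn--" + label.encode("punycode").decode("ascii")


def to_ascii_hostname(hostname: str) -> str:
    """Split on dots, encode each label separately, and rejoin."""
    return ".".join(to_ascii_label(label) for label in hostname.split("."))


print(to_ascii_hostname("bücher.example"))  # xn--bcher-kva.example

# The standard library's "idna" codec does the same job, plus normalization.
print("bücher.example".encode("idna"))      # b'xn--bcher-kva.example'
```

In xn--bcher-kva, the ASCII characters bcher are copied first, and everything after the final - (kva) encodes the ü and its position. For real use, the built-in idna codec (or a dedicated IDNA library) is probably a better choice than hand-rolled helpers like these.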