UTF-8 Numerical Character References

It is well known that there are many writing systems in the world and on the internet, and the Unicode standard has done its best to assign characters for them. The encoding known as Universal Character Set Transformation Format, 8-bit (UTF-8 for short) is the most common way to store those characters on the web. If you want to use these characters in a webpage, you may find that copying and pasting them directly into a text editor or the like simply results in a question mark, blank rectangle, or something equally vague.

One reliable way around this is to use an HTML Numerical Character Reference, or NCR, which writes a character as its code point number (for example, &#233; for é). The difficulty, of course, is finding out the specific code for the glyph(s) you want, as Unicode has over a million code points (1,114,112, to be exact, though many are reserved for special uses or not yet assigned), all of which UTF-8 can encode.

Thus this webpage. It will find the code point of any character that can be encoded in UTF-8. All you need to do is copy and paste the character or characters you want to look up into the text input below and click on the button labelled Get Character Codes. The table captioned Characters And Codes will display each character alongside its HTML NCR and its character codes for CSS and JavaScript.
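
For the curious, the lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code this page actually runs; the function name character_codes and the exact output formats (decimal NCR, six-digit CSS escape, \u escape for JavaScript) are assumptions made for the example.

```python
def character_codes(text):
    """Return one (character, HTML NCR, CSS escape, JavaScript escape) row per character.

    Hypothetical helper for illustration; the formats chosen here are assumptions.
    """
    rows = []
    for ch in text:
        cp = ord(ch)  # the character's Unicode code point as an integer

        # HTML numeric character reference, decimal form: &#233;
        html = f"&#{cp};"

        # CSS escape: backslash plus hex digits. Padding to six digits means
        # a following hex-looking character cannot be swallowed by the escape.
        css = f"\\{cp:06X}"

        # JavaScript escape: \uXXXX covers code points up to U+FFFF;
        # above that, the ES2015 \u{...} form is used.
        if cp <= 0xFFFF:
            js = f"\\u{cp:04X}"
        else:
            js = f"\\u{{{cp:X}}}"

        rows.append((ch, html, css, js))
    return rows


if __name__ == "__main__":
    for row in character_codes("é😀"):
        print(*row, sep="\t")
```

Each row corresponds to one line of the Characters And Codes table: the character itself, then its HTML, CSS, and JavaScript forms.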

The paragraph entitled Coded Phrase For HTML has three options for how ASCII characters are handled:

  1. Leave the ASCII characters in plain text (that is, uncoded).
  2. Change the ASCII to their HTML NCRs, excluding spaces.
  3. Change the ASCII to their HTML NCRs, including spaces.
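
The three options above can be sketched roughly as follows, again in Python and again only as an assumption about the behaviour rather than this page's real code. The function name encode_phrase and the mode names "plain", "encode_except_spaces", and "encode_all" are made up for the example; non-ASCII characters are always converted to NCRs.

```python
def encode_phrase(text, ascii_mode="plain"):
    """Convert a phrase to HTML NCRs, treating ASCII according to ascii_mode.

    Hypothetical helper; the mode names are assumptions for illustration:
      "plain"                - option 1: leave ASCII characters uncoded
      "encode_except_spaces" - option 2: code ASCII characters, but not spaces
      "encode_all"           - option 3: code ASCII characters, spaces included
    """
    modes = ("plain", "encode_except_spaces", "encode_all")
    if ascii_mode not in modes:
        raise ValueError(f"ascii_mode must be one of {modes}")

    pieces = []
    for ch in text:
        is_ascii = ord(ch) < 128
        keep_plain = is_ascii and (
            ascii_mode == "plain"
            or (ascii_mode == "encode_except_spaces" and ch == " ")
        )
        # Anything not kept as plain text becomes a decimal NCR.
        pieces.append(ch if keep_plain else f"&#{ord(ch)};")
    return "".join(pieces)
```

For the phrase "a é", option 1 gives "a &#233;", option 2 gives "&#97; &#233;", and option 3 gives "&#97;&#32;&#233;". Note that in this sketch option 1 passes characters such as & and < through unchanged, which would need separate handling in real HTML.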

The character/code table won't be affected, whichever option you choose.

Just a warning: This page only works for UTF-8, not for UTF-16 or UTF-32.

