Unicode Normalization

EmEditor supports normalizing Unicode characters and sequences. Normalization is useful, for example, when a dataset contains Unicode input from many sources: converting every string to a single form makes equivalent characters easier to match.
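Outside the editor, the same idea can be sketched in a few lines of Python. This is only an illustration using the standard `unicodedata` module, not a description of how EmEditor works internally; the helper name `normalize_all` and the sample strings are assumptions made for the example, and canonical composition (the form UAX #15 calls NFC) is chosen arbitrarily as the target form.

```python
import unicodedata

def normalize_all(strings, form="NFC"):
    """Return the strings converted to one Unicode normalization form.

    `normalize_all` is a hypothetical helper for this sketch.
    """
    return [unicodedata.normalize(form, s) for s in strings]

# "Español" entered two ways: precomposed ñ versus n + COMBINING TILDE.
raw = ["Espa\u00f1ol", "Espan\u0303ol"]
raw_match = raw[0] == raw[1]              # compared code point by code point

cleaned = normalize_all(raw)
cleaned_match = cleaned[0] == cleaned[1]  # compared after normalization

print(raw_match, cleaned_match)
```

Before normalization the two spellings compare as different strings even though they render identically; once both are in the same form they match.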

UAX #15, Unicode Normalization Forms, describes four normalization forms: canonical composition (NFC), canonical decomposition (NFD), compatibility composition (NFKC), and compatibility decomposition (NFKD).

Decomposition breaks a character into its constituent parts. If you apply canonical decomposition to the single character ñ (LATIN SMALL LETTER N WITH TILDE) and then view the Character Code Value (Ctrl+I), you will see that the sequence is now two characters: LATIN SMALL LETTER N followed by COMBINING TILDE. Canonical composition reverses this, recombining the two characters into the single ñ.
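The ñ example can be reproduced with Python's standard `unicodedata` module. This is a minimal sketch for checking the behavior outside EmEditor; the variable names are chosen for illustration only.

```python
import unicodedata

composed = "\u00f1"  # ñ, LATIN SMALL LETTER N WITH TILDE

# Canonical decomposition (NFD) splits ñ into a base letter and a combining mark.
decomposed = unicodedata.normalize("NFD", composed)
names = [unicodedata.name(ch) for ch in decomposed]

# Canonical composition (NFC) recombines the pair into the single character.
recomposed = unicodedata.normalize("NFC", decomposed)

print(len(composed), len(decomposed))  # code point counts before and after
print(names)
print(recomposed == composed)
```

The decomposed string holds two code points whose names match the ones shown by Character Code Value, and composing it again yields the original single character.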

Every canonical equivalence is also a compatibility equivalence, but not every compatibility equivalence is canonical. Canonically equivalent forms are identical in appearance and meaning, as in the previous example with ñ.

Compatibility-equivalent forms, on the other hand, may look slightly different and have the same meaning only in certain contexts. ¼ and 1/4 are compatibility equivalent but not canonically equivalent: ¼ looks slightly different from 1/4, and whereas ¼ means “a quarter,” 1/4 sometimes means “one divided by four,” so the two are interchangeable only in certain contexts.
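The difference between the canonical and compatibility forms shows up when both are applied to ¼. The sketch below again uses Python's standard `unicodedata` module as a stand-in and makes no claim about EmEditor's internals. One detail worth noting: the compatibility decomposition of ¼ uses U+2044 FRACTION SLASH rather than the ASCII slash.

```python
import unicodedata

quarter = "\u00bc"  # ¼, VULGAR FRACTION ONE QUARTER

# Canonical decomposition leaves ¼ alone: it has no canonical equivalent.
canonical = unicodedata.normalize("NFD", quarter)

# Compatibility decomposition rewrites it as three characters.
compat = unicodedata.normalize("NFKD", quarter)
compat_names = [unicodedata.name(ch) for ch in compat]

print(canonical == quarter)
print(compat_names)
```

Because the compatibility forms change ¼ into a different sequence of characters, they are better suited to searching and matching than to text whose exact appearance must be preserved.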

The normalization commands are accessed through Convert > Encode/Decode.



Copyright © 1995-2026 by Emurasoft, Inc.