Specials (Unicode block)

Specials
Range: U+FFF0..U+FFFF (16 code points)
Plane: BMP
Scripts: Common
Assigned: 5 code points
Unused: 9 reserved code points, 2 non-characters
Unicode version history:
  1.0.0: 1 (+1)
  2.1: 2 (+1)
  3.0: 5 (+3)
Note: [1][2]

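Specials is a short Unicode block allocated at the very end of the Basic Multilingual Plane, at U+FFF0–U+FFFF. Of these 16 code points, five have been assigned as of Unicode 3.0, two are noncharacters, and the remaining nine are reserved. The assigned characters and the noncharacters are: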

  • U+FFF9 INTERLINEAR ANNOTATION ANCHOR, marks the start of annotated text.
  • U+FFFA INTERLINEAR ANNOTATION SEPARATOR, marks the start of the annotating character(s).
  • U+FFFB INTERLINEAR ANNOTATION TERMINATOR, marks the end of the annotation block.
  • U+FFFC OBJECT REPLACEMENT CHARACTER, a placeholder in the text for another, unspecified object, for example in a compound document.
  • U+FFFD REPLACEMENT CHARACTER, used to replace an unknown, unrecognized or unrepresentable character.
  • U+FFFE <noncharacter-FFFE>, not a character.
  • U+FFFF <noncharacter-FFFF>, not a character.

U+FFFE and U+FFFF are not unassigned in the usual sense; they are noncharacters, guaranteed never to be assigned as Unicode characters. They can therefore help in guessing a text's encoding scheme, because a decoded text that contains them has most likely been read with the wrong encoding or byte order. Unicode's U+FEFF BYTE ORDER MARK character can be inserted at the beginning of a Unicode text to signal its endianness: a program that reads the first 16-bit code unit of such a text as 0xFFFE, the byte-swapped form of 0xFEFF, then knows that it should swap the byte order of all the following code units.
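
The byte order check described above can be sketched in Python as follows. This is a minimal illustration, not a full encoding detector: it assumes the input is UTF-16 with a leading byte order mark, the function names detect_utf16_byte_order and decode_utf16 are hypothetical, and it ignores the fact that UTF-32 little-endian text also begins with the bytes FF FE.

```python
from typing import Optional


def detect_utf16_byte_order(data: bytes) -> Optional[str]:
    """Return "big" or "little" from a leading BOM, or None if absent."""
    if len(data) < 2:
        return None
    # Read the first 16-bit code unit as if the text were big-endian.
    first_unit = int.from_bytes(data[:2], "big")
    if first_unit == 0xFEFF:
        return "big"     # BOM read correctly: the guess was right
    if first_unit == 0xFFFE:
        return "little"  # noncharacter seen: the bytes are swapped
    return None


def decode_utf16(data: bytes) -> str:
    """Decode BOM-prefixed UTF-16 text, dropping the BOM itself."""
    order = detect_utf16_byte_order(data)
    if order is None:
        raise ValueError("no UTF-16 byte order mark found")
    codec = "utf-16-be" if order == "big" else "utf-16-le"
    return data[2:].decode(codec)


# The same two-letter text in both byte orders, each with its BOM.
little_endian_text = b"\xff\xfe" + "hi".encode("utf-16-le")
big_endian_text = b"\xfe\xff" + "hi".encode("utf-16-be")

print(detect_utf16_byte_order(little_endian_text))
print(decode_utf16(little_endian_text) == decode_utf16(big_endian_text))
```

Because U+FFFE can never be a character, reading 0xFFFE as the first code unit is unambiguous evidence of swapped bytes rather than a legitimate first character.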