concept

Byte Order Mark

The Byte Order Mark (BOM) is the Unicode character U+FEFF placed at the very beginning of a text stream. In UTF-16 and UTF-32 it indicates the byte order (endianness) of the code units: a reader that sees the bytes FE FF knows the stream is big-endian, while FF FE means little-endian. In UTF-8, which has no byte-order ambiguity, the BOM (bytes EF BB BF) serves only as a signature that marks the stream as UTF-8. By identifying the encoding up front, the BOM helps software avoid garbled characters and parsing errors. It is optional, and conventions vary: it is common in UTF-16 files and in some Windows tooling, while the Unicode Standard neither requires nor recommends it for UTF-8.
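As a rough illustration of how software can use the BOM, the sketch below inspects the first bytes of a buffer and reports which encoding they suggest. The function name `detect_bom` and its return shape are hypothetical, chosen for this example; the BOM byte sequences come from Python's standard `codecs` module. A missing BOM says nothing about the encoding, so a real reader still needs a fallback.

```python
import codecs

# Longer BOMs are checked first: the UTF-32 LE BOM (FF FE 00 00)
# begins with the same two bytes as the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return (encoding_name, bom_length), or (None, 0) if no BOM is present.

    Hypothetical helper for illustration only.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


if __name__ == "__main__":
    sample = codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")
    encoding, skip = detect_bom(sample)
    # Decode the payload after skipping the BOM bytes.
    print(encoding, sample[skip:].decode(encoding))
```

One known ambiguity: a UTF-16 LE stream whose first character after the BOM is U+0000 has the same leading bytes as a UTF-32 LE BOM, so this kind of sniffing is a heuristic rather than a guarantee.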

Also known as: BOM, Unicode BOM, U+FEFF, Zero Width No-Break Space, ZWNBSP

🧊 Why learn Byte Order Mark?

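Developers should learn about the BOM when working with Unicode text files, especially in cross-platform applications, data interchange, or internationalization projects, to ensure consistent text handling and avoid encoding-related bugs. It matters most when reading or writing formats such as CSV, XML, or JSON that contain non-ASCII characters, or when integrating systems that default to different encodings. A reader that does not expect a BOM may treat it as data, for example as an invisible character prefixed to the first CSV column name, while a reader that relies on one may guess the wrong encoding when it is missing. Expectations also differ by format: the JSON specification (RFC 8259) says implementations must not add a BOM to JSON text sent over a network, although parsers may choose to ignore one.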
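A small sketch of the CSV case, assuming the data arrives as UTF-8 bytes with a leading BOM. The helper `header_fields` and the sample data are made up for this example; `utf-8-sig` is Python's built-in codec that strips a leading BOM when decoding and writes one when encoding.

```python
import csv
import io

# Sample CSV bytes with a UTF-8 BOM in front (illustrative data).
raw = b"\xef\xbb\xbf" + "name,city\nZoë,Kraków\n".encode("utf-8")


def header_fields(data: bytes, encoding: str):
    """Decode `data` with `encoding` and return the first CSV row.

    Hypothetical helper for illustration only.
    """
    text = data.decode(encoding)
    return next(csv.reader(io.StringIO(text)))


# Plain "utf-8" keeps the BOM as U+FEFF inside the first column name.
naive = header_fields(raw, "utf-8")

# "utf-8-sig" removes a leading BOM if present, and also accepts input without one.
clean = header_fields(raw, "utf-8-sig")

if __name__ == "__main__":
    print(repr(naive[0]), repr(clean[0]))
```

With the plain codec, a lookup such as `row["name"]` would fail because the key is actually `"\ufeffname"`. Decoding with `utf-8-sig` is one common way to tolerate both BOM and BOM-less input.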