Eric Bergman-Terrell's Blog

.NET Programming Tip: How to Determine the Encoding of a Unicode File
October 4, 2010

The StreamReader class allows you to read in Unicode text from a file without having to worry about the precise encoding:

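String Contents;

// Dispose of the reader (and the underlying file handle) as soon as the text is read.
using (StreamReader SR = new StreamReader(FileName, true))
    Contents = SR.ReadToEnd();
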
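For example, the above code works for Unicode files saved with any of the following Encodings: Encoding.BigEndianUnicode, Encoding.Unicode, and Encoding.UTF8. It also works for files in Encoding.ASCII format, because ASCII text is also valid UTF-8, which is what the StreamReader falls back to when it finds no byte order mark.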

The file's encoding is automatically detected because the StreamReader constructor's second argument (detectEncodingFromByteOrderMarks) is true.
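To make this concrete, here is a minimal sketch that writes a temporary file in a given Encoding and reads it back with detection turned on. The class and method names (DetectDemo, RoundTrip) are made up for this illustration and are not part of the .NET libraries.

```csharp
using System;
using System.IO;
using System.Text;

public static class DetectDemo
{
    // Hypothetical helper: writes text to a temp file in the given encoding,
    // then reads it back with BOM detection enabled and returns what was read.
    public static string RoundTrip(Encoding encoding, string text)
    {
        string path = Path.GetTempFileName();

        try
        {
            // StreamWriter emits the encoding's preamble (BOM), if it has one,
            // at the start of the file.
            using (StreamWriter writer = new StreamWriter(path, false, encoding))
            {
                writer.Write(text);
            }

            // Second argument is detectEncodingFromByteOrderMarks.
            using (StreamReader reader = new StreamReader(path, true))
            {
                return reader.ReadToEnd();
            }
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Calling DetectDemo.RoundTrip(Encoding.BigEndianUnicode, "h\u00e9llo") should hand back the same string, even though the reader was never told which Encoding the file uses.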

There's no problem reading in Unicode text using the StreamReader. The problem is writing updated text back to the file with the original Encoding intact. For example, if your program reads text in Encoding.BigEndianUnicode format, it should write it back in the same format.

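Unfortunately, the StreamReader doesn't make the original Encoding easy to retrieve for later use. Be careful with the CurrentEncoding property: when I experimented with it, it always returned Encoding.UTF8, regardless of the text file's actual Encoding. The .NET documentation notes that encoding detection is not performed until the first read, so the property can only reflect the file's real Encoding after a Read call, which is easy to get wrong.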
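One likely explanation for CurrentEncoding reporting Encoding.UTF8 is timing: per the .NET documentation, the StreamReader does not look at the byte order mark until the first read. The sketch below checks the property before and after ReadToEnd(). The class and method names are hypothetical, and I compare code pages rather than Encoding objects because the reader may return its own Encoding instance.

```csharp
using System;
using System.IO;
using System.Text;

public static class CurrentEncodingDemo
{
    // Hypothetical helper: returns the code page reported by CurrentEncoding
    // before the first read (index 0) and after ReadToEnd() (index 1).
    public static int[] CodePagesBeforeAndAfterRead(Encoding fileEncoding)
    {
        string path = Path.GetTempFileName();

        try
        {
            // Create a sample file that starts with the encoding's BOM.
            using (StreamWriter writer = new StreamWriter(path, false, fileEncoding))
            {
                writer.Write("sample text");
            }

            using (StreamReader reader = new StreamReader(path, true))
            {
                int before = reader.CurrentEncoding.CodePage; // nothing read yet
                reader.ReadToEnd();                           // detection happens here
                int after = reader.CurrentEncoding.CodePage;

                return new[] { before, after };
            }
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

For a file written with Encoding.BigEndianUnicode, I would expect the first value to be the UTF-8 code page (65001) and the second to be the big-endian UTF-16 code page (1201). A file with no BOM at all will still report UTF-8 after reading, so this is not a complete substitute for inspecting the BOM yourself.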

So how can you use a StreamWriter to write back text read from a StreamReader, with the original encoding intact? Use the following code to determine the file's original encoding, and specify that encoding in the StreamWriter's constructor.

Unicode files usually start with a short prefix called a BOM (Byte Order Mark) that identifies the exact Encoding of the file. The BOM is two bytes long for the UTF-16 Encodings (Encoding.Unicode and Encoding.BigEndianUnicode) and three bytes long for Encoding.UTF8. GetFileEncoding() iterates through various Unicode Encoding values and compares the first bytes of the file with the current Encoding's BOM (returned by the GetPreamble() member). When a match is found, the corresponding Encoding value is returned. If no matches are found, the Encoding.Default value is returned.
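As a quick aside before the function itself, this small sketch shows what GetPreamble() returns for each Encoding. PreambleDemo and PreambleHex are names I made up for the illustration; GetPreamble() and BitConverter.ToString() are the real framework calls.

```csharp
using System;
using System.Text;

public static class PreambleDemo
{
    // Hypothetical helper: formats an encoding's BOM as dash-separated hex bytes.
    // Encodings without a BOM (such as Encoding.ASCII) yield an empty string.
    public static string PreambleHex(Encoding encoding)
    {
        return BitConverter.ToString(encoding.GetPreamble());
    }
}
```

PreambleHex returns "FE-FF" for Encoding.BigEndianUnicode, "FF-FE" for Encoding.Unicode, and "EF-BB-BF" for Encoding.UTF8.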

// Requires: using System.IO; using System.Text;

// Return the Encoding of a text file.  Return Encoding.Default if no Unicode
// BOM (byte order mark) is found.
public static Encoding GetFileEncoding(String FileName)
{
    Encoding Result = null;

    FileInfo FI = new FileInfo(FileName);

    FileStream FS = null;

    try
    {
        FS = FI.OpenRead();

        Encoding[] UnicodeEncodings = { Encoding.BigEndianUnicode, Encoding.Unicode, Encoding.UTF8 };

        for (int i = 0; Result == null && i < UnicodeEncodings.Length; i++)
        {
            // Rewind so each candidate BOM is compared against the start of the file.
            FS.Position = 0;

            byte[] Preamble = UnicodeEncodings[i].GetPreamble();

            bool PreamblesAreEqual = true;

            // ReadByte() returns -1 at end of file, which never equals a BOM byte.
            for (int j = 0; PreamblesAreEqual && j < Preamble.Length; j++)
            {
                PreamblesAreEqual = Preamble[j] == FS.ReadByte();
            }

            if (PreamblesAreEqual)
            {
                Result = UnicodeEncodings[i];
            }
        }
    }
    catch (System.IO.IOException)
    {
        // Fall through and return Encoding.Default if the file cannot be read.
    }
    finally
    {
        if (FS != null)
        {
            FS.Close();
        }
    }

    if (Result == null)
    {
        Result = Encoding.Default;
    }

    return Result;
}

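Here is a rough sketch of how the detected Encoding can be fed to a StreamWriter so that an edited file keeps its original format. To keep the example self-contained it includes a compact variant of GetFileEncoding(); the class name EncodingPreservingEdit and the AppendText() method are hypothetical names for this illustration. It passes the detected Encoding to the StreamReader as well, so that reading and writing agree when the file has no BOM.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingPreservingEdit
{
    // Compact variant of GetFileEncoding(): compare the file's first bytes
    // with each candidate BOM. Returns Encoding.Default if none match.
    public static Encoding GetFileEncoding(string fileName)
    {
        Encoding[] candidates = { Encoding.BigEndianUnicode, Encoding.Unicode, Encoding.UTF8 };
        byte[] header = new byte[3]; // longest BOM checked here is 3 bytes (UTF-8)
        int count = 0;

        using (FileStream fs = File.OpenRead(fileName))
        {
            for (; count < header.Length; count++)
            {
                int b = fs.ReadByte();
                if (b < 0) break; // file is shorter than the header buffer
                header[count] = (byte)b;
            }
        }

        foreach (Encoding candidate in candidates)
        {
            byte[] preamble = candidate.GetPreamble();
            if (count < preamble.Length) continue;

            bool match = true;
            for (int i = 0; match && i < preamble.Length; i++)
            {
                match = header[i] == preamble[i];
            }

            if (match) return candidate;
        }

        return Encoding.Default;
    }

    // Hypothetical helper: appends text to a file while keeping its original encoding.
    public static void AppendText(string fileName, string extra)
    {
        Encoding original = GetFileEncoding(fileName);
        string contents;

        // 'original' is the fallback when no BOM is present; detection stays enabled.
        using (StreamReader reader = new StreamReader(fileName, original, true))
        {
            contents = reader.ReadToEnd();
        }

        // Overwrite the file (append: false) using the encoding it started with.
        using (StreamWriter writer = new StreamWriter(fileName, false, original))
        {
            writer.Write(contents + extra);
        }
    }
}
```

After AppendText() runs on a file saved as Encoding.BigEndianUnicode, the file should still begin with the bytes FE FF.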
Keywords: Unicode, Encoding, StreamReader, StreamWriter, BOM, Byte Order Mark, BigEndianUnicode, Encoding.Default, Encoding.ASCII, Encoding.Unicode, GetPreamble
