Fixing Byte-Order-Mark issues with WordPress on IIS | Accidental Scientist

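One common cause, it seems, of WordPress acting just plain funky on IIS is that byte-order marks (BOMs) occasionally get inserted into the UTF-8 PHP files that make up the WordPress code. PHP treats those three bytes (EF BB BF) as output that comes before the opening PHP tag, which can break headers, redirects and feeds.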

Sometimes these come in as part of templates and plugins; other times they just magically* appear.

If you’ve got Visual C# (or Visual C# Express), here’s a little nugget of code you can use to strip them out. No warranties. No expressed suitability for a given purpose. Just free code I wrote. Free, crappy, one-off, single-purpose code. But it works. Compile it, run it, and watch it go.

I recommend running it in two passes, a report-only pass followed by a fix pass:

  • Generate a list of files to check using DIR /S *.* /B /A-D > out.cmd
  • Edit out.cmd so that each line calls BOMFix on that file (e.g. BOMFix c:\myfile\app.php ), then run it. Quote any paths that contain spaces. Make a note of which files have a BOM.
  • Run it again with /F as the second argument (e.g. BOMFix c:\myfile\app.php /F ). This strips the BOM from those files.
  • Throw your files back up onto the server.

And for your viewing pleasure, here’s the code:

using System;
using System.IO;

namespace BOMFix
{
    class Program
    {
        static void Main( string[] args )
        {
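            if ( args.Length == 0 )
            {
                // Nothing to check; say how the tool is meant to be called.
                Console.Out.WriteLine( "Usage: BOMFix <file> [/F]" );
                return;
            }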

            bool bHasBom = FileHasBOM( args[ 0 ] );

            if ( bHasBom )
            {
                Console.Out.WriteLine( "{0} has a BOM", args[ 0 ] );
                if ( args.Length == 2 && (args[1] == "/F" || 
                     args[1] == "/f") )
                {
                   StripBOM( args[0] );
                   Console.Out.WriteLine( "Removed BOM from {0}",
                                           args[ 0 ] );
                }
            }
        }

        const long READSIZE = 8192;

        public static bool FileHasBOM( string path )
        {
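            byte[] file = new byte[ 3 ];

            // 'using' closes the stream on every path, including the
            // early return for files too short to hold a BOM.
            using ( FileStream s = new FileStream( path, FileMode.Open,
                                FileAccess.Read, FileShare.Read ) )
            {
                if ( s.Length < 3 )
                    return false;

                if ( s.Read( file, 0, 3 ) != 3 )
                    return false;
            }
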
            return ( file[ 0 ] == 0xEF && file[ 1 ] == 0xBB &&
                     file[ 2 ] == 0xBF );
        }

        public static void StripBOM( string path )
        {
            // Put the temporary file next to the original: File.Replace
            // needs both files on the same volume, which the system temp
            // directory does not guarantee.
            string tempFileName = path + ".bomfix.tmp";
            byte[] buffer = new byte[ READSIZE ];

            using ( FileStream s = new FileStream( path, FileMode.Open,
                               FileAccess.Read, FileShare.Read ) )
            using ( FileStream outStream = new FileStream( tempFileName,
                FileMode.Create, FileAccess.Write, FileShare.None ) )
            {
                // Skip the three-byte UTF-8 BOM (EF BB BF).
                s.Seek( 3, SeekOrigin.Begin );

                // Read returns 0 only at end of stream; a short read is
                // not an error, so copy whatever came back.
                int bytesRead;
                while ( ( bytesRead = s.Read( buffer, 0, buffer.Length ) ) > 0 )
                {
                    outStream.Write( buffer, 0, bytesRead );
                }
            }

            // Swap the stripped copy into place; no backup file is kept.
            File.Replace( tempFileName, path, null );
        }
    }
}

*Yes, I know, not actually magically… but no-one seems to have root-caused it.
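
If editing out.cmd by hand gets tedious, the report-only pass could probably be done in one go by walking the directory tree from C#. What follows is only a rough sketch, not part of BOMFix: the namespace BOMScanSketch, the helpers StartsWithUtf8Bom and FindFilesWithBom, and the default *.php pattern are all names and choices made up for this illustration. It only lists files; it does not modify anything.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

namespace BOMScanSketch   // hypothetical name, not part of BOMFix
{
    static class Program
    {
        // True when the file starts with the UTF-8 BOM bytes EF BB BF.
        public static bool StartsWithUtf8Bom( string path )
        {
            byte[] head = new byte[ 3 ];
            using ( FileStream s = File.OpenRead( path ) )
            {
                // Keep reading until we have three bytes or hit end of file.
                int total = 0;
                while ( total < 3 )
                {
                    int n = s.Read( head, total, 3 - total );
                    if ( n == 0 )
                        break;
                    total += n;
                }

                return total == 3 && head[ 0 ] == 0xEF &&
                       head[ 1 ] == 0xBB && head[ 2 ] == 0xBF;
            }
        }

        // Walks root and its subdirectories, returning every file that
        // matches pattern and starts with a BOM.
        public static List<string> FindFilesWithBom( string root, string pattern )
        {
            List<string> hits = new List<string>();
            foreach ( string file in Directory.GetFiles( root, pattern,
                                         SearchOption.AllDirectories ) )
            {
                if ( StartsWithUtf8Bom( file ) )
                    hits.Add( file );
            }
            return hits;
        }

        static void Main( string[] args )
        {
            // Assumed defaults: current directory, PHP files only.
            string root = args.Length > 0 ? args[ 0 ] : ".";
            string pattern = args.Length > 1 ? args[ 1 ] : "*.php";

            foreach ( string file in FindFilesWithBom( root, pattern ) )
                Console.Out.WriteLine( file );
        }
    }
}
```

Run it with a folder and an optional search pattern, and it prints one path per line for each file that starts with a BOM. Those are the files to feed to BOMFix with /F.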

About Simon Cooke

Simon Cooke is a video game developer, ex-freelance journalist, screenwriter, film-maker and all-round good egg in Seattle, WA. The views posted on this blog are his and his alone, and have no relation to anything he's working on, his employer, or anything else and are not an official statement of any kind.

One Response to Fixing Byte-Order-Mark issues with WordPress on IIS

  1. Mark says:

    I ran into this issue, and your code fixed it for me.

    I updated your code to scan a directory and optionally run the /fix command on the files.
    You can also use search patterns (e.g. *.php)

    http://pastebin.com/6gWBmudU

    Thank you for taking the time to write a blog post about this.

    Regards,
    Mark