Consequence of Mac\Unix\Win Format Text Files
Last week, I was working on updating some old PHP code from 1999 and 2000. The code works in PHP4 but was not quite up to today's standards. As an example, it used the short PHP start tag <? instead of the much better <?php. I also needed to make certain that all of my variables were declared and that no global or unsanitized variables were used. Overall, not a big deal, just a good use for grep or sed.
I write most of my code on a WinXP laptop, and the files are then uploaded to a FreeBSD server. I do not own a Mac and never have. Usually for this type of job I would run grep or sed on my FreeBSD server, but this time I decided to try Windows Grep. I ran Windows Grep with the proper regular expressions and voila, my PHP start tags were updated. Next, the files were uploaded to a new server for testing. ERRORS, errors everywhere.
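For illustration, here is roughly what that kind of start tag search-and-replace looks like, written in Python rather than grep or sed. This is only a sketch and not the expression I actually ran: the name upgrade_short_tags is made up for this example, and the pattern will also rewrite a <? that happens to sit inside a string or a comment.

```python
import re

# Match "<?" only when it is not already "<?php", "<?=" (short echo) or "<?xml".
SHORT_TAG = re.compile(r"<\?(?!php|=|xml)", re.IGNORECASE)


def upgrade_short_tags(source):
    """Rewrite short open tags as "<?php" (hypothetical helper for this sketch)."""

    def fix(match):
        following = source[match.end():match.end() + 1]
        # "<?echo" would otherwise become "<?phpecho", so add a space
        # unless the tag is already followed by whitespace or end of input.
        if following == "" or following.isspace():
            return "<?php"
        return "<?php "

    return SHORT_TAG.sub(fix, source)


if __name__ == "__main__":
    print(upgrade_short_tags("<? echo 'hello'; ?>"))
```

Note that text mode on Windows can translate line endings when reading and writing, so a real batch version should open the files in binary mode or with newline="" to leave them alone.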
About half of the updated files were kicking out errors or failing completely. The failed files were rather random and the code looked just fine in the editor. What was going on?
Comments not recognized by PHP
Digging a little, it became obvious that PHP was not recognizing the comments properly. Multi-line comments were fine, but single-line comments were not.
// This type of comment would cause all following
// lines to also be viewed as a comment.
/* This type of comment worked fine and as expected. */
Try searching for this type of problem in Google and you will find NOTHING. I dug into my php.ini files, read the relevant PHP documentation, searched on many different terms, spent a few hours on IRC, and reformatted code. Nothing worked.
Finally I thought, well, maybe my character encoding is incorrect (I was getting desperate by now). So I opened my code in Notepad++, went to check the character encoding, and noticed that the file was in Mac format. Huh? Well now, that could be a problem, so I changed the file to Windows format, uploaded it to the server, and it worked!
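If you would rather not rely on an editor to tell you the format, the check is easy to script. Below is a minimal Python sketch that counts each kind of line ending in a file's raw bytes. The function names and the "windows", "mac" and "unix" labels are my own choices for this example; this is not how Notepad++ does it.

```python
def count_line_endings(data):
    """Count CRLF pairs, lone CRs and lone LFs in raw bytes."""
    crlf = data.count(b"\r\n")
    lone_cr = data.count(b"\r") - crlf  # CRs that are not part of a CRLF pair
    lone_lf = data.count(b"\n") - crlf  # LFs that are not part of a CRLF pair
    return {"windows": crlf, "mac": lone_cr, "unix": lone_lf}


def guess_format(data):
    """Return "windows", "mac", "unix", "mixed", or "none" for raw bytes."""
    counts = count_line_endings(data)
    used = [name for name, total in counts.items() if total > 0]
    if not used:
        return "none"
    if len(used) > 1:
        return "mixed"
    return used[0]


if __name__ == "__main__":
    import sys

    # Read in binary mode so Python does not translate the line endings.
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            print(path, guess_format(handle.read()))
```

Run it against the list of failing files; anything reported as "mac" or "mixed" is a suspect.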
Proper text format conversions
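Windows uses a pair of CR and LF characters to terminate lines. UNIX (including Linux and FreeBSD) uses an LF character only. Classic Mac OS (before OS X) uses a CR character only. That fits the symptom: with only CR characters in the file, PHP evidently never saw the end of the line that a // comment is supposed to stop at. Time to dig into the text toolbox. On my FreeBSD server I have used the win2unix tool before; this time I would use mac2unix. Sorry Windows, this type of work is much easier done in FreeBSD.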
I uploaded all of my files to FreeBSD, ran mac2unix on them, and then downloaded the files back to Windows. Comments in PHP now work as expected.
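For anyone without a FreeBSD box handy, the same conversion can be sketched in a few lines of Python. This is not what mac2unix does internally, just the same idea under my own assumptions: the helper names are made up for this example, and it converts Windows endings to LF as well as Mac ones.

```python
def to_unix_line_endings(data):
    """Convert CRLF and lone CR line endings in raw bytes to LF."""
    # Handle CRLF first so a Windows line ending does not turn into two LFs.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(path):
    """Rewrite the file in place; return True if anything changed."""
    # Binary mode keeps Python from translating the line endings itself.
    with open(path, "rb") as handle:
        original = handle.read()
    converted = to_unix_line_endings(original)
    if converted == original:
        return False
    with open(path, "wb") as handle:
        handle.write(converted)
    return True


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        status = "converted" if convert_file(path) else "unchanged"
        print(path, status)
```

Keep a backup before running anything like this across a whole tree, and only point it at text files, since it will happily mangle binaries.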







April 28th, 2009 at 4:55 pm
I haven’t used Mac line endings since I started using Mac OS X, but then, I use my Mac like a *nix box when I do web development… and Windows? I just try to avoid it!