was successfully added to your cart.

Two Inexplicably Dangerous Characters

By May 15, 2010 January 3rd, 2011 Development, Humor

Yesterday, as I was writing an e-mail to one of my employer’s clients, I encountered a strange bug with Excel. The e-mail I was writing included instructions that the client should perform, and one of the steps actually involved creating a CSV file, which would then be opened in Excel. Before sending the e-mail, I decided to step through the instructions myself, and make sure everything worked as I expected. When I reached the CSV step, I found that while I could export it fine – I couldn’t open it in Excel. I’ll guide you through what followed using a really simple example CSV:

ID,NAME
1,KARA
2,GAIUS
3,SAUL

As you can see, the above CSV contents are completely valid. However, when I tried to open it in Excel 2000, I received this error message:

SYLK: file format is not valid.

When I tried to press the OK button, Excel just gave up loading the CSV. While I didn’t know what they meant by “SYLK“, I assumed something was wrong with the CSV export. Maybe there was an invalid, invisible character in the file, preventing Excel from parsing it – perhaps making it think it was another format. To test this theory, I used one of my favorite features in Notepad++, one that allows me to see all characters, even those which are normally invisible.

Using Notepad++ to reveal hidden characters

Using the feature, I could tell that there were only carriage returns (CR) and line breaks/feeds (LF), which are perfectly valid for a CSV.

With that possibility ruled out, I decided to try the same CSV in a newer version of Excel – the 2007 edition. When I went to open it, I was unfortunately greeted with the same type of errors:

The file you are trying to open, 'Example.csv', is in a different format than specified by the file extension. Verify that the file is not corrupted and is from a trusted source before opening the file. Do you want to open the file now?

Excel has detected that 'Example.csv' is a SYLK file, but cannot load it. Either the file has errors or it is not a SYLK file format. Click OK to try and open the file in a different format.

Despite these errors, through the selection of the appropriate dialog options – I was at least still able to view the CSV within Excel:

At this point, however, I was really intrigued – so I googled the original error message I received in Excel 2000. I found a Microsoft article explaining the behavior, while also offering a workaround. When reading the article, I could barely hold in my laughter:

SYMPTOMS
When you try to open a text file or a comma-separated variable (CSV) file, you may receive the following error message:
SYLK: File format is not valid

CAUSE
This problem occurs when you open a text file or CSV file and the first two characters of the file are the uppercase letters “I” and “D”. For example, the text file may contain the following text:
ID, STATUS
123, open
456, closed
Note This problem does not occur if the first two letters are lowercase “i” and “d”.

WORKAROUND
To open your file in Excel, open the file in a text editor, and then insert an apostrophe at the beginning of the first line of text.

How to Insert an Apostrophe
To add an apostrophe to the beginning of the first line of text in your file, follow these steps:
1.  Open the text file in a text editor, such as Notepad. Click before the first character in the first line of text. Press the APOSTROPHE key on your keyboard (‘).
2. On the File menu, click Save. Quit the text editor. You can now open the file in Excel.

Source: http://support.microsoft.com/kb/323626

I think you need to be a developer to truly appreciate how funny this is. I had to mention this bug to the client, because the likelihood of the ID being the first text in the file was pretty high. Even then, I mentioned they should read the Microsoft article only if they encountered the problem. No way am I going to explain the awkward workaround myself.

So, I confirmed that the bug was still present in Excel 2007. I just hope that Microsoft fired the developer that came up with the SYLK detection logic. Assuming that a file is a particular type based solely on the first two characters (“ID”) is a big risk, especially when those two characters are part of the alphabet, and are also common in IT.  I’m guessing there are complexities I don’t see… or at least, I hope so. Either way, their workaround is really unbelievable… they’re actually telling people to add an apostrophe as the first character of the file. I’m aware that Microsoft is a large complicated beast with many different teams  involved in releasing bug fixes, but this one is just so ridiculous.