Quotes and Web Apps

by Eric M. Burke

I am sick of this. I live in O'Fallon, MO. In my estimation, about 75% of all web applications are broken. They refuse to accept the apostrophe, forcing me to type "OFallon" or "O Fallon".


This is ridiculous. I see this constantly. I get messages telling me my city does not match my ZIP code. Or better yet, the page completely CRASHES as just happened when I tried to activate my TurboCad license. Taking out the apostrophe corrected the problem.


The problem goes beyond web apps. People don't seem to understand how to store quotes and apostrophes in database tables, either. I hardly ever see "O'Fallon" on any bills from various utility companies, on catalog mailing labels, etc. I'm sure there are big ZIP code lookup data files out there with my city name stored incorrectly, and these data files are sold and copied from company-to-company, further spreading incorrect information.


Here's a tip. Test your applications with "unusual" input data. Don't just try Jones and Smith. Try names like O'Reilly or O'Fallon, and throw in some hyphens, ampersands, less-than characters, and quotes.


19 Comments

anonymous2
2004-01-06 20:35:52
Can't handle first initials either.
Most software can't handle names like J. George Smith, either.
jwenting
2004-01-06 23:36:09
Can't handle first initials either.
I had to change the name on my passport when the government here switched to another system that does not allow for more than one middle initial...
bazzargh
2004-01-07 04:53:27
Also, name changes...
An astonishing amount of software conflates the users name with their id, or required user ids to include the users surname as company policy.


As Mr T would say, I pity the fool who tells a woman she has to log in with her maiden name (or her married name, in the case of divorcees). And continue to use that as her email address, too.

anonymous2
2004-01-07 05:35:27
Apostrophes and Double Quotes
I'm sure that it's just an unsophisticated way to guard against a SQL injection attack.
anonymous2
2004-01-07 06:20:31
I hear you
I grew up in St. John's, Newfoundland, Canada.. I always had to try several different "spellings" of the city before I could get anything to recognize it.
anonymous2
2004-01-07 07:35:48
+ in email addresses as well
The plus sign is an acceptable character in e-mail addresses. It is also the standard local separator in most OSS MUA, and is very convenient when you want to automatically sort your mail according to the recipient email address.


Why then should we cope with poorly written form entry verification scripts that reject the plus character?

jimothy
2004-01-07 07:55:25
Also, name changes...
I pity the fool named Mr. T who has to fill out web forms and has no first name and only an initial for a last name! "First name is a required field."
jimothy
2004-01-07 08:01:52
Apostrophes and Double Quotes
I suspect it's much more accidental than that, and the result of careless string concatenation in constructing an SQL statement:


insert into accounts (..., city, ...)
values (..., 'O'Fallon', ...)


Notice how the SQL statement breaks. In Java, the easiest way to avoid this is to use a PreparedStatement and setString() instead of concatenation. I'm sure other environments have an equivalent.


The other option is to escape single quotes, etc. before concatenating. i.e., 'O'Fallon' becomes 'O''Fallon'). But PreparedStatement, or its equivalent, places that responsibility where it belongs: in the database driver.

anonymous2
2004-01-07 08:04:31
Also, name changes...
I recently bought my younger brother a Mr. T keyring with the several recordings of the actual Mr. T's catchphrases, one of which was:


"First name: Mister, middle name: period, last name: T"


and a short bio on the back claimed that the name was chosen so that people had to call him "mister".

anonymous2
2004-01-07 11:01:50
Apostrophes and Double Quotes
This "careless string concatenation" is the essence of SQL injection vulnerabilities. I would be extremely wary of purchasing from a vendor who did not guard against such simple attacks.
jimothy
2004-01-07 14:33:05
Apostrophes and Double Quotes
I hadn't thought about the vulnerability to SQL injection attacks (and in fact, didn't know what they were 'till I googled). But this is another argument for using PreparedStatements, as they should also guard against this type of attack.
anonymous2
2004-01-07 16:45:04
I feel your pain
My last name is O'Connor. I used to be a software tester, so THOSE apps work, but it sure drives me nuts when I crash others with just my name!
anonymous2
2004-01-08 17:08:57
passwords are worse
When you have to choose a password, and it's restricted to alphanums. Bah. O'Fallon, eh? Tell me you're not a Cardinals fan. Go Cubs!
feint
2004-01-08 23:30:51
The reason. (not an excuse)
The reason that this is "more difficult" and often omitted in testing is that the single quote character is used in SQL. The solution is to escape every string going to the database and unescape it on the way out. Thus O'Connell would become O'Connell or something similar.


So why isn't this done more often? SQL fields are generally fixed length or have a maximum size. To ensure that the string will fit in the table, a programmer will use strlen in C or the String length attribute in Java. However, what should be used is the length of the escaped string, not the original string.


So if the answer is that easy, why not just do it? The issue is ususally actually the UI. Suppose that the UI says that it can accept 10 characters because the database field it 10 characters. Now if you type O'Connell in the blank, the user thinks it should fit. However the app will report that it is too long because it is actually stored in the DB as O'Connell which is 14 characters. Thus the user is confused. This is even worse with 256 char description fields that may have multiple things that need to be escaped. (think of all the contractions used in open text fields in feedback forms etc)


This is not to say that their isn't a solution, but the solution becomes more complex than what most small web sites are willing to invest to fix. Commercial enterprise software on the other hand *should* do this, but once again normally does not. The excuse on that level however is programmer and QA laziness (as reported by previous posters)

feint
2004-01-08 23:32:51
The reason. (not an excuse)
of course, being html and all.. it doesn't display the expanded version.. hopefully pre tags work. Here is my example from above:

O'Connell

feint
2004-01-08 23:33:32
The reason. (not an excuse)
nope.. still not working.. just look at the HTML source if you want to know what it looks like escaped..
coyner_b
2004-01-09 06:13:56
You Missed The Point
I think you are missing the point. The point is that applications should not crash if a user enters an apostrophe. The application should be able to accept an apostrophe (or any character) and render it correctly on the screen and store it correctly in a DB. Your example states that an apostrophe would be stored in a DB as $apos;, where the $ represents & (otherwise you would see a tick). This is totally wrong. The escape sequence $apos; is for rendering HTML only and should not be stored as data. JDBC's PreparedStatement implementation should take care of any escaping necessary to store and retrieve data.
anonymous2
2004-01-15 08:49:05
There's a security issue too.
Quite often when apostrophes aren't being correctly escaped for SQL it's possible to inject some SQL code into the data that will be executed with, as the trailers for bad sit-coms always put it "hilarious results".
kenj0418
2004-02-09 11:45:31
Post office is also to blame
Part of your problem with "Your zip code doesn't match your city" errors is that the post office doesn't seem to like O'Fallon either.


If you go to: http://www.usps.com/zip4/citytown.htm
and enter 63366, they list the 'Acceptable' city names. They like:
O Fallon
but they don't like:
OFallon
and they don't even mention:
O'Fallon


So even if your app was properly handling quote's it still would reject the city if it was validating it against the post office's data.