Nothing more, and nothing less

Dear CMS writers:

I think I hate all of you. Including me. Will you take it personally if I take your code and fling it against the nearest wall, screaming and cursing, until your code either a) bleeds or b) apologizes and fixes itself? It would really make me feel better if I could just do that and get it out of my system.

See, I'm just trying to write a portal script here, and every CMS I've tackled keeps making me madder and madder. At this rate, in approximately another ten minutes I will have puffed up to the size and shape of that enormous Sta-Puf marshmallow guy from Ghostbusters. Since I suspect that swelling to that size and shape will not be good for my longterm health, I am currently beating on my keyboard and creating words that are quickly taking on the form and shape of a rant.

Part I: what time is it?

Could someone, somewhere, please wake up to the realization that storing dates in local time is stupid, stupid, stupid? I mean, sure, saving things in local time is the quickest way in the short run, but not only are you shooting yourself in the foot, you are guaranteeing that aftermarket script writers like myself are dreaming of your ultimately gory demise.

Why, you say?

Take the script that I'm currently writing. You know what I need for each post? A GMT/UTC timestamp. Bless you, Gareth, for threatening to shoot me a year ago if I didn't store dates/times in GMT. You know what it takes in Quarto to get a nice happy seconds-from-epoch timestamp for a post? 'SELECT unix_timestamp(entry_date) AS timestamp FROM qt_entries.' Boom. That's it. You're set. Move on to something more interesting.

You know what it takes everywhere else? Finding out on a CMS-by-CMS basis how they're saving offsets from "server time" (and no, server time does not necessarily equate to GMT). In greymatter it's saved as a [+/-]decimal in the 9th (8th, to PHP) line of gm-config.cgi. In MovableType it's saved as [+/-]hh:mm and is accessible as the variable . b2 appears to save it as a [+/-]decimal under the column time_difference in the settings table of the database. (I dread finding out what PMachine's specs are; I'm sure they will somehow manage to be different from everything else that has come before.)

Once you've got that information - jolly! - you can eventually reconstruct a unix timestamp after a few jogs and giggles. Yay for PHP's gmmktime() function.

It should not be this way. Timezones are completely bloody irrelevant to a server; they're human constructs to help us equate time of day across the globe. Times should be saved as GMT, so that things are consistent across the globe, and only formatted into timezones when they are presented to a user, who, unlike the server, thinks in terms of timezones.

(In addition, this provides for the possibility of allowing users to set their own timezone, which is classy. Not everyone lives in your timezone - there are people reading domesticat who live in Hawaii, Australia, and England. It's nice to be able to provide each of them times that make sense … to them.)

Part II: give me only what I ask for

What, you thought timezones were all I wanted to rant about today? Ha.

When I request information from a CMS - database or otherwise - do not give me more than what I ask for. Don't return extraneous carriage returns. If I set up a file that, on the first line, requests data to be placed on that line, don't insert carriage returns before it without my permission. I beat on the wonderPortal data file for one of the data files for well over an hour before I realized that I was doing nothing wrong - that, indeed, it was always going to insert an unrequested carriage return at the beginning of the document.****

Why does this matter? wonderPortal expects one bit of information per line. For now, line[0] equates to the author's name, line[1] to the entry's URL, line[2] to the entry's title. An unasked-for carriage return shifts all information down a line and would, if I hadn't caught it, force wonderPortal to try to use the author's name as the URL, and the entry's URL as the entry's title.

Don't return extra crap when I query your CMS for information. Give me what I ask for. Nothing more, and nothing less. Otherwise every other data file has to be changed to work around your issues, and forcing others to work around your issues is tacky, even by my standards.

Part III: database design

So, I've made it this far. Now we reach the point where my head actually exploded (cleanup crews will arrive shortly). So I've figured out how to make greymatter and MovableType cough up timestamps in GMT. Works for me. It was a pain in the ass, but I figured out how to do it in a way that should cause the minimum amount of annoyance on the part of the end-user.

It was b2 that very nearly made me hurl my monitor against the wall. I thought, "Hey! It's based off of PHP and mySQL, so perhaps I won't have to do too much work on this one. Maybe they got it right…" Nope. Local timestamps. Grr. I can work with that, though I don't like to - so I started working through the conversion.

(I'm cranky - I refuse to look at the source code or install scripts of any other CMS while I'm working on my own, because I want my code to be wholly my own. Luckily, Chris was kind enough to help generate mySQL descriptions of all the tables in the b2 schema. Dancing virgins and pounds of Ben & Jerry's to you, sir.)

I started comparing table schemas, and began to piece together a query. Then I realized what I was looking at: fields in different tables with exactly the same names. Lovely. So, let me get this straight… the unique id for each entry in the posts table is called id, the unique id for each site in the settings table is called id, and the unique id for each user in the users table is called id?

So what happens when you join two tables together, and they both have fields named id for different things? Yep - every time, you must rename each instance of id to something that is more obvious, like entry_id, site_id, or user_id. True, it's not difficult to rename these fields in your queries, but do I really have to point out how excruciatingly annoying it is to have to type things like "SELECT j_users(id) AS user_id, j_posts(id) AS entry_id…" for every multi-table query when it should have been something like "SELECT user_id, entry_id…" in the first place?

Once upon a time, Gareth chastised me for feeling bad about how long it was taking me to get a Quarto database schema finalized, telling me that extra time spent on streamlining and bulletproofing my database schema would pay for itself in ease of use (and customization) later.

I'm starting to see that once again, he was right. In this sort of thing, he usually is.

Part IV: all those apologies

I have to be careful. Am I a professional coder? No. I am an experienced and patient amateur—but I am an amateur that has written her own CMS, and can speak firsthand of the thousands of arbitrary decisions that have to be made in the design and coding stages. I make absolutely no pretensions whatsoever about any supremacy of what I wrote; I can guarantee you that Quarto undoubtedly fails on tests of efficiency and usability.

I spent a lot of time questioning why I was even bothering to write a CMS of my own, when other CMSes, written by far more skilled coders, already existed. Initially, I believed those CMSes to be so far ahead of my own ability that anything I wrote would exist as nothing but a paler, less-functional copy of those better originals.

I spent a lot of time asking myself why the hell I bothered, when it would have been easier to pay $30, get a fully-functioning version of another CMS, convert my files to their database format of choice, and be done with it. At first my answer was sheer obstinacy. Then it was the native ability to save posts in multiple-and-equal categories (not primary and secondary categories). Then it was the ability to send notification emails on new posts/comments/user signups. Then it was providing extra privileges to email-verified users, such as being able to subscribe to new comments on posts.

I never, ever expected it to be something so basic as GMT timestamps - a concept so simple that I had it drilled into me during my one measly semester of database theory in college. This was basic stuff, the stuff I took for granted from day one and never thought about again.

There are plenty of 'compatibility' things out there now that are de rigueur bits for various CMSes, like supporting a Blogger API. Forgive me, but I'm having just a wee bit of trouble not rolling my eyes on that point - people championing 'cross-compatibility' features like that when, if you ask me, there are existing problems far deeper and basic than figuring out how to let some web wonk have the instant gratification of posting from their cell phone.

With that said, I think it unlikely now that I'm going to kill anyone or go Sta-Puf on the populace. I'll get back to working on the data file for b2. I realize that I can't change how others will code their own CMSes. All I can do is make mine the best it can be with my [admittedly, limited] coding ability, and do my best to make my scripts, like the wonderPortal script I'm working on, be as cross-CMS-compatible as possible.

Just don't expect me to always be cheery about it. Here's to stubborn independence.

**** I may have finally found a way around this. One issue down, eighty-four to go.

all tags: 

Comments

Well, you know, pushing for standards-compliance is a good thing ... might make the CMS's all get semi-standardized. Or they won't care, and end-users will hate them all. Kinda like my users hated me when I had to switch from YaBB SE to phpBB ... heh.

Well, I can't speak for any other CMS authors, but I can disavow myself as being a "professional coder." I was literally learning Perl while I was writing Greymatter (and yes, alas, it definitely shows). I do like the idea of developing "CMS standards" though (not that love or money will get me to work on Greymatter again, I just like the idea in principle). And if I ever write another CMS, I promise I'll store my timestamps in GMT. ;)

Along the unique column names for IDs, I think you should have gone with the system we used at Success. I think it was some bastardization of Hungarian notation. Primary keys started with "pk", foreign keys started with "fk". Then you'd tack on your data type -- "int" for integers, "lng" for longs. Then the name of the column -- like "entry" or "user". The reason I bring this up is because foreign keys, usually consisting of a long integer, would end up being named "fklnguser". Imagine the first time I saw this in a table at Success. "Why would they name something 'fucking user'? That's not very nice."

Amy: I have a feeling that we're going to have a talk about my wg_update.pl script. (Which I obviously run by hand, since I forgot to after my most recent journal entry.) How about this: You write the portal code, and when you're done, I'll worry about how to make my script talk to your stuff. :)

brad: Hungarian notation == poo. Variables should be named descriptively based on what they do, not on what they are. But then, that's just my opinion.

/me is amused ... ;)

John - I'm not condoning the use of Hungarian notation. I haven't used it since I left Success. I could argue either way about it, but in practice I use descriptive variables. Like the value you'll be returning from a function will be put in $return. Or an instrument name will be stored in $instrument. This might be a side-effect of using Perl, though. A scalar variable doesn't care if you're a string or an integer or a real number, most of the time it'll do what you expect. I'm of the opinion that some sort of pseudo-Hungarian notation is useful for databases since you're so often working with a schema. You've got the database layout printed out, and you don't want to have to go to the database to see if you'll be getting back a DATETIME value or a LONG value. This is especially good if you're writing a front-end to a database, since you don't often want to have to go to the database to get a table description each time you want to pull back a date field. But it was nice having a standard since a number of people were working on that code. And it was nice to have a company standard, not just a project standard.

Heh, that was a semi-random comment about the Windows API, more than anything else. Yeah, I know hungarian or semi-hungarian has it's uses. I just think MS took a good idea and went too far with it. Hungarian is great for when types aren't obvious. But I don't need to be doing things like: for(int int_i; int_i <= int_j; int_i++) ... etc. (fun note: I'm using links, a text based browser to enter this. The comment box is split amongst two screens (kinda hard to describe) so I have no idea how it's going to look. Why a text based browser, you ask? Well... it's a long, sorry tale of debian sid switching to gcc 3.2, and me trying to make vmware work. (hint: vmware needs a kernel module compiled for it. The current running kernel was compiled with 2.95, so I figured, ah, just re-compile the kernel with the new compiler, right? Ah, no.) Pardon my re-post. /me shakes his head at < thingies.

how do you enter the date in gmt time?

that is, in the PHP code to add it to the database. by the way, for some reason i can't edit my comments!

Kyle - if I get into too much detail here, just pat me on the head and I'll go about my business. I set quarto up so that it has a default timezone (for cat.net, that's -6 hours and CST) but that users can choose their own timezone as part of editing their account information. Since all timestamps are saved in GMT, the dates have to be converted to something prior to being displayed on the page. Each quarto-related page checks user/password info against the database. If there's a match, and there's a user-selected timezone associated with that account, quarto uses their preference to format times. If there's no preference, or there is no user, then quarto uses the default preference I set. As for editing comments - that's QUITE odd. Drop me a mail when you get a moment. I checked the logfile, and you were able to view the edit-comments page. Let me know what happened from there. Were your comments listed in the option box, or was it empty? If there were comments listed in the box, what happened when you tried to edit them? The edit-comments page hasn't been overhauled, and I don't doubt that there are some errors in it.

Oh, and your other question - about entering dates in GMT time - you don't. Everything the user sees is tailored to their timezone. PHP does the work of converting the dates back to GMT. The user shouldn't have to think about that sort of thing.

Amy, it might be a rant, but it's an educational rant. Note to self - always set time in GMT. When I had my webpage I did most of my coding by hand - every time I tried to use Word or Microcrap it made the HTML intelligible (yes, it worked, but man did it look sloppy). One day I'll actually brush up my skills & put up my webpage again....

I agree with you about timezones and GMT times, wholeheartedly. But by the way, just for the record -- pMachine (what i use as a CMS, until you make Quarto public *grin*) DOES store dates in GMT.

I meant actually in the PHP code, but I'll send you an email.