2014-03-21

A Wrinkle in Time

You've built a prototype, and everything is going great. All your dates and times look right, they load and store correctly, everything is spiffy. You have your buddy give it a whirl, and it works great for them too. Then you have a friend in Curaçao test it, and they complain that all the times are wrong - time zones strike again!

But, you've got this covered. You just add an offset to every stored date/time, so you know the origin time zone, and then you get the user's time zone, and voila! You can correct for time zones! Everything is going great, summer turns to fall, the leaves change, the clocks change, and it all falls apart again. Now you're storing dates in various time zones, without DST information, you're adjusting them to the user's time zone, trying to account for DST, trying to find a spot here or there where you forgot to account for offsets...

Don't fall into this trap. UTC is always the answer. It is effectively time-zone-less, as it has an offset of zero and does not observe daylight saving time. It's reliable, it's universal, it's always there when you need it, and you can always convert it to any time you need. Storing a date/time with time zone information is like telling someone your age by giving your birthday and today's date - you're dealing with additional data and additional processing with zero benefit.
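To make the DST trap concrete, here is a minimal TypeScript sketch of the "store UTC, convert at display time" idea. It relies on the standard Intl.DateTimeFormat API with IANA zone names; the helper name localHourMinute and the sample dates are made up for illustration and don't come from any real project.

```typescript
// Store instants in UTC; convert to a named time zone only when displaying.
// localHourMinute is a hypothetical helper for this illustration.
function localHourMinute(utcIso: string, timeZone: string): string {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hour: "2-digit",
    minute: "2-digit",
    hour12: false,
  }).formatToParts(new Date(utcIso));
  const get = (type: string): string =>
    parts.find((p) => p.type === type)?.value ?? "";
  return `${get("hour")}:${get("minute")}`;
}

// The same UTC hour on a winter day and a summer day.
const winter = "2014-01-15T16:00:00Z";
const summer = "2014-07-15T16:00:00Z";

// New York observes DST, so its offset from UTC differs between the two dates.
console.log(localHourMinute(winter, "America/New_York")); // 11:00
console.log(localHourMinute(summer, "America/New_York")); // 12:00

// Curaçao stays at UTC-4 all year, so the local hour does not move.
console.log(localHourMinute(winter, "America/Curacao")); // 12:00
console.log(localHourMinute(summer, "America/Curacao")); // 12:00
```

A fixed offset saved alongside the New York timestamp would be right for one of those dates and an hour off for the other. Keeping the instant in UTC and letting the zone rules apply at display time avoids that.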

When starting a project, you're going to be better off storing all dates as UTC from the get-go; it'll save you innumerable headaches later on. I think it is atrocious that .NET defaults to system-local time for dates; it's one of the few areas where I think Java has a clearly better design. .NET's date handling in general is a mess, but simply defaulting to local time when you call DateTime.Now encourages developers to adopt bad practices. That is the exact opposite of the stated goal of the platform, which is to make sure that the easy thing and the correct thing are, in fact, the same thing.

On a vaguely related note, I've found a (in my opinion) rather elegant solution for providing localized date/time data on a website, and it's all wrapped up in a tiny Gist for your use: https://gist.github.com/aprice/7846212

This simple jQuery script goes through elements with a data attribute providing a timestamp in UTC, and replaces the contents (which can be the formatted date in UTC, as a placeholder) with the date/time information in the user's local time zone and localized date/time format. You don't have to ask the user their time zone or date format.
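For readers who just want the shape of the idea, here is a rough TypeScript sketch of it. This is not the code from the Gist: it skips jQuery and the DOM entirely so it can run anywhere, and the data-utc attribute name, the TimestampElement shape, and the localizeTimestamps function are all assumptions made for this example.

```typescript
// Minimal element shape so the sketch runs without a browser.
// The data-utc attribute name is an assumption, not taken from the Gist.
interface TimestampElement {
  dataset: { utc?: string };
  textContent: string | null;
}

// Replace each element's placeholder text with its UTC timestamp rendered in
// the given locale and time zone. Both default to the runtime's own settings,
// which in a browser means the user's settings.
function localizeTimestamps(
  elements: TimestampElement[],
  locale?: string,
  timeZone?: string
): void {
  const formatter = new Intl.DateTimeFormat(locale, {
    year: "numeric",
    month: "short",
    day: "numeric",
    hour: "numeric",
    minute: "2-digit",
    timeZone,
  });
  for (const element of elements) {
    const raw = element.dataset.utc;
    if (raw === undefined) continue; // no timestamp: keep the placeholder
    const instant = new Date(raw);
    if (Number.isNaN(instant.getTime())) continue; // unparseable: keep it too
    element.textContent = formatter.format(instant);
  }
}

// Usage with plain objects standing in for page elements.
const sample: TimestampElement[] = [
  { dataset: { utc: "2014-03-21T16:30:00Z" }, textContent: "2014-03-21 16:30 UTC" },
  { dataset: { utc: "not a date" }, textContent: "unknown" },
];
localizeTimestamps(sample, "en-US", "America/Curacao");
console.log(sample.map((e) => e.textContent));
```

Leaving the placeholder alone when the timestamp is missing or unparseable means the page still shows the UTC text rather than "Invalid Date".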

Unfortunately it looks like most browsers don't take into account customized date/time formatting settings; for example, on my computer, I have the date format as yyyy-mm-dd, but Chrome still renders the standard US format of mm/dd/yyyy. However, I think this is a relatively small downside, especially considering that getting around this requires allowing users to customize the date format, complete with UI and storage mechanism for doing so.

2014-03-13

On Code Comments

I've been seeing a lot of posts lately on code comments; it's a debate that's raged on for ages and will continue to do so, but for some reason it's been popping up in my feeds more than usual the last few days. What I find odd is that all of the posts generally take on the same basic format: "on the gradient of too many to too few comments, you should aim for this balance, in this way, don't use this type of comments, make your code self-documenting." The reasoning is fairly consistent as well: comments get stale, or don't add value, or may lead developers astray if they don't accurately represent the code.

And therein lies the rub: they shouldn't be representing the code at all. Code - clean, self-documenting code - represents itself. It doesn't need a plain-text representative to speak on its behalf unless it's poorly written in the first place.

It may sound like I'm simply suggesting aiming for the "fewer comments" end of the spectrum, but I'm not; there's still an entity that may occasionally need representation in plain text: the developer. Comments are an excellent way to describe intent, which just so happens to take a lot longer to go stale, and is often the missing piece of the puzzle when trying to grok some obscure or obtuse section of code. The code is the content; the comments are the author's footnotes, the director's commentary.

Well-written code doesn't need comments to say what it's doing - which is just as well since, as so many others have pointed out, those comments are highly likely to wind up out-of-sync with what the code is actually doing. However, sometimes - not always, maybe even not often, but sometimes - code needs comments to explain why it's doing whatever it's doing. Sure, you're incrementing Frobulator.Foo, and everybody is familiar with the Frobulator and everybody knows why Foo is important and anyone looking at the code can plainly see you're trying to increment it. But why are you incrementing it? Why are you incrementing it the way you're doing it in this case? What is the intent, separate from its execution? That's where comments can provide value.
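A tiny TypeScript illustration of the difference follows. The Frobulator class here is only a stand-in, and the rationale in the second comment is invented purely to show what a "why" comment looks like.

```typescript
// Hypothetical stand-in for the Frobulator mentioned above.
class Frobulator {
  foo = 0;
}

const frobulator = new Frobulator();

// A "what" comment: it restates the code and goes stale along with it.
// Increment foo by one.
frobulator.foo += 1;

// A "why" comment: it records intent that the code cannot express.
// Count this retry against the limit even though it succeeded, so that a
// flapping connection still trips the circuit breaker (invented rationale).
frobulator.foo += 1;

console.log(frobulator.foo); // 2
```

Both statements are identical; only the second comment tells a future reader something the code doesn't already say.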

As a side note (no pun intended), I hope we can all agree that doc comments are a separate beast entirely here. Doc comments provide metadata that can be used by source code analyzers, prediction/suggestion/auto-completion engines, API documentation generators, and the like; they provide value through some technical mechanism and are generally intended for reading somewhere else, not for reading them in the source code itself. Because of this I consider doc comments to be a completely separate entity that just happens to be encoded in comment syntax.

My feelings on doc comments are mixed; generally speaking, I think they're an excellent tool and should be widely used to document any public API. However, there are few things in the world more frustrating than looking up the documentation for a method you don't understand, only to find that the doc comments are there but blank (probably generated or templated), or are there but so out of date that they're missing parameters or the types are wrong. This is the kind of thing that can have developers flipping desks at two in the morning when they're trying to get something done.

2014-02-25

New Laptop! ASUS ROG G750JW

I recently received the generous gift of an ASUS ROG (Republic of Gamers) G750JW laptop, and let me tell you, the thing is a beast. Seriously, it's huge.

It's a 17" widescreen laptop (1920x1080 TN panel, no touch thankyouverymuch), with an extra two inches or so of chassis behind the hinge. It also weighs just short of ten pounds.

But, I wasn't looking for an ultraportable. I wanted something that I could use around the house and on the road, primarily for software development, but also for occasional gaming. That meant I needed a comfortably-sized keyboard, trackpad, and display; that meant a 17" laptop. I wanted decent battery life and decent performance, which meant it would be heavy for its size. And I got exactly what I asked for.

The G750JW runs a Core i7 at 3.2GHz, 12GB of RAM, an NVIDIA GeForce GTX 765M, and a 750GB HDD. Step one was replacing the HDD with a 240GB Crucial M500 SSD I picked up for $135 on Amazon - less than half what I paid for a nearly identical drive just over a year ago. The difference in speed is truly staggering, going from a 5400 RPM laptop hard drive to a full-tilt SSD. It also cut a few ounces off the weight, and added a good half hour to hour of working time on the battery, so a win across the board.

I tried installing Windows 7 on it as I despise Windows 8, but kept running into an error during the "extracting files" stage of the installation. I found numerous posts online from people with the same problem, some of them with solutions, but none of those solutions worked for me; from what I can tell, it appears to be some conflict between the latest-and-greatest UEFI in the G750's motherboard and the aging Windows 7 OS. It's a shame, but I suppose being forced to gain more familiarity with Windows 8 isn't all bad; I just wish I had the option to use something more, well... usable.

Other than the OS though, it's been a joy. It performs extremely well, it has all the features and specs I need for what I'm using it for, and it's a beast for gaming - more horsepower than I really need considering I'm not a huge gamer and gaming was not the primary purpose of the laptop to begin with. Part of its bulk comes from the two huge rear-venting fans in the thing, which do a good job of keeping it cool - something I've had problems with when using other laptops, and which was the ultimate bane of my wife's old MacBook Air. I don't think I need to worry about it overheating and locking up while playing video like the MBA did on a regular basis.

My only gripe at the moment is that it seems to be impossible to find a decent Bluetooth mouse. Sure, the market is flooded with wireless laptop mice; but 95% of them use a proprietary receiver (I'm looking at you, Logitech!) that requires the provided USB dongle, rather than native Bluetooth. That seems like an utter waste considering the laptop has a built-in transceiver capable of handling mice without any USB dongle.

All I really want is a decent-sized (I have large hands) Bluetooth wireless mouse, with a clickable scroll wheel and back/forward thumb buttons. That doesn't seem like too much to ask, but as far as I can tell, it just doesn't exist. Thankfully the laptop has a very generous touchpad with multi-touch, and clicking both the left and right buttons together generates a middle-click. Still, I really hope Logitech gives up on the proprietary wireless idea and gets on board with the Bluetooth standard, because I'd like to have a decent mouse to use with it.

It's telling that, on Amazon, you can find a discontinued Logitech Bluetooth mouse that meets my requirements - selling in new condition for a mere three hundred dollars. That's three times what Logitech's finest current proprietary wireless mouse costs, for an outdated, basic mouse. That's how much standard Bluetooth wireless is worth to people. Wake up Logitech!

Any suggestions on a suitable mouse in the comments would be greatly appreciated...

2014-02-07

Optimizing Entity Framework Using View-Backed Entities

I was profiling a Web application built on Entity Framework 6 and MVC 5, using the excellent Glimpse. I found that a page with three lists of five entities each was causing over a hundred query executions, eventually loading a huge object graph with hundreds of entities. I could eliminate the round trips using Include(), but that still left me loading way too much data when all I needed was aggregate/summary data.

The problem was that the aggregates I needed were complex and involved calculated properties, some of which were based on aggregates of navigation collection properties: a parent had sums of its children's properties, which in turn had sums of their children's properties, and in some cases parents had properties that were calculated partly based on aggregates of children's properties. You can see how this quickly spun out of control.

My requirements were that the solution had to perform better while returning the same data, and still let me use standard Entity Framework Code First with migrations. My solution was to calculate this data on the database server, using entities backed by views that did the joining, grouping, and aggregation. I also found a neat trick for backward-compatible view releases:

IF NOT EXISTS (SELECT Table_Name FROM INFORMATION_SCHEMA.VIEWS WHERE Table_Name = 'MyView')
    EXEC sp_executesql N'create view [dbo].[MyView] as select test = 1'
GO
ALTER VIEW [dbo].[MyView] AS
SELECT ...

It's effectively upsert for views - it's safe to run whether or not the view already exists, doesn't ever drop the view if it does exist (leaving no period where a missing view might cause an error), and it doesn't require keeping separate create and alter scripts in sync when changes are made.

I then created the entities that would represent the views, using unit tests to ensure that the properties now calculated on the server matched expected values the same way that the original, app-calculated properties did. Creating entities backed by views is fairly straightforward; they behave just like tables, but obviously can't be modified - I made the property setters protected to enforce this at compile time. Because my View includes an entry for every "real" entity, any query against the entity type can be cast to the View-backed type and it will pull full statistics (there is no possibility of an entity existing in the base table but not in the view).
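To illustrate why pushing the aggregation into a view cuts the query count, here is a small in-memory simulation in TypeScript. It is not Entity Framework code and doesn't reflect my real schema: the Row shape, the sample data, and both query functions are hypothetical, and each call to a query function simply stands in for one database round trip.

```typescript
// In-memory stand-in for a table of nested entities.
interface Row {
  id: number;
  parentId: number | null;
  amount: number;
}

const rows: Row[] = [
  { id: 1, parentId: null, amount: 0 },
  { id: 2, parentId: 1, amount: 5 },
  { id: 3, parentId: 1, amount: 7 },
  { id: 4, parentId: 2, amount: 1 },
  { id: 5, parentId: 2, amount: 2 },
  { id: 6, parentId: 3, amount: 3 },
];

let roundTrips = 0;

// Simulates lazy-loading one navigation collection: one round trip per call.
function queryChildren(parentId: number): Row[] {
  roundTrips += 1;
  return rows.filter((r) => r.parentId === parentId);
}

// App-side aggregation: walks the graph, issuing a query at every node.
function totalInApp(id: number): number {
  let total = 0;
  for (const child of queryChildren(id)) {
    total += child.amount + totalInApp(child.id);
  }
  return total;
}

// Simulates selecting from a view: the "database" does the summing and the
// app pays for a single round trip.
function queryTotalsView(id: number): number {
  roundTrips += 1;
  const sum = (parentId: number): number =>
    rows
      .filter((r) => r.parentId === parentId)
      .reduce((acc, r) => acc + r.amount + sum(r.id), 0);
  return sum(id);
}

roundTrips = 0;
const appTotal = totalInApp(1);
const appTrips = roundTrips;

roundTrips = 0;
const viewTotal = queryTotalsView(1);
const viewTrips = roundTrips;

console.log(appTotal, appTrips); // 18 6
console.log(viewTotal, viewTrips); // 18 1
```

Even with six rows the app-side walk needs six round trips for a number the database can hand back in one, and the gap grows with every level of nesting.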

Next I had to create a one-to-one association between the now bare entity type and the view type holding the aggregate statistics. The only ID I had for the view was the ID of the raw entity it was connected to. This turned out to be easier said than done - Entity Framework expects that, in a one-to-one relationship, it will be managing the ID at one end of the relationship; in my case, the IDs at both ends were DB-generated, even though they were guaranteed to match (since the ID in the view was pulled directly from the ID in the entity table).

I ended up abandoning the one-to-one mapping idea after a couple days' struggle, instead opting to map the statistics objects as subclasses of the real types in a table-per-type structure. This wound up being relatively easy to accomplish - I added a Table attribute to the subtype, giving the name of the view, and it was off to the races. I went through updating references to the statistics throughout LINQ queries, views, and unit tests. The unit and integration tests proved very helpful in validating the output of the views and offering confidence in the changes.

I then ran my benchmarks again and found that pages that had required over a hundred queries to generate now used only ten to twenty, and were rendering in half to a third the time - a one to two hundred percent improvement, using views designed purely to mimic the existing functionality - I hadn't even gone about optimizing them for performance yet!

After benchmarking, it looks even better (times are in milliseconds, min/avg/max):

Page                                        EF + LINQ      EF + Views
3 lists of 5 entities (3 types)             360/785/1675   60/105/675
2 lists of 6 entities (1 type)              325/790/1935   90/140/740
1 entity's details + 1 list of 50 entities  465/975/2685   90/140/650

These tests were conducted by running Apache JMeter on my own machine against the application running on Windows Azure, across a sampling of 500 requests per page per run. That's a phenomenal 450 to 650 percent improvement across the board on the most intensive pages in the application, and has them all responding to 100% of requests in under 1 second. The performance gap will only widen as data sets grow; using views will make the scaling much more linear.
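For anyone checking the math, the short TypeScript sketch below reproduces that range from the average times in the table. It assumes "improvement" means the old average divided by the new average, minus one; the percentImprovement helper is just a name chosen for this example.

```typescript
// Average response times in milliseconds, copied from the table above.
const averages = [
  { page: "3 lists of 5 entities (3 types)", linq: 785, views: 105 },
  { page: "2 lists of 6 entities (1 type)", linq: 790, views: 140 },
  { page: "1 entity's details + 1 list of 50 entities", linq: 975, views: 140 },
];

// Improvement as a percentage: (old time / new time - 1) * 100, rounded.
function percentImprovement(before: number, after: number): number {
  return Math.round((before / after - 1) * 100);
}

const improvements = averages.map((row) => percentImprovement(row.linq, row.views));

averages.forEach((row, i) => {
  console.log(`${row.page}: ${improvements[i]}%`);
});
// Prints 648%, 464%, and 596% for the three pages, in that order.
```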

I'm very pleased with the performance improvement I've gotten. Calculating fields on the app side works for prototyping, but it just can't meet the efficiency requirements of a production application. View-backed entities came to the rescue in a big way. Give it a try!

2014-02-06

You're Being Held Hostage and You May Not Even Know It

To me, net neutrality isn't about fair business practices between businesses. That's certainly part of it, but it's not the crux of the issue. To me, net neutrality is about consumer protection.

Your broadband provider would like to charge companies - particularly content companies - extra in order to bring you their content. Setting aside the utterly delirious reasoning behind this for the moment, let's think about this from the consumer's perspective. You're paying your ISP to provide you access to the internet - the whole thing. When you sign up for service, you're signing up for just that: access to the internet. Period. What your ISP fails to disclose, at least in any useful detail, is how they intend to shape that access.

For your $40, $50, $60 or more each month, you might get high-speed access to some things, and not to others. You don't get to know what ahead of time, or even after you sign up - the last thing your ISP wants is for you to be well-informed about your purchase in this regard. They'll do whatever they can to convince you that your service is plain, simple, high-speed access to the whole internet.

Then, in negotiations behind closed doors, they're using you as a hostage to extort money from the businesses you're a customer of. Take Netflix as an example: you pay your ISP for internet service. Netflix also has an ISP, or several, that they pay for internet service. Those ISPs have what are called "peering arrangements" that determine who, if anyone, pays, and how much, when traffic travels between their networks on behalf of their customers. This is part and parcel of what you and Netflix pay for your service. You pay Netflix a monthly fee to receive Netflix service, which you access using your ISP. Netflix uses some part of that monthly fee to pay for their own internet service.

Your ISP has gone to Netflix and said "hey, if you want to deliver high-definition video to your customers who are also my customers, you have to pay me extra, otherwise my customers which are also your customers will receive a sub-par experience, and they might cancel their Netflix account." They're using you as a bargaining chip without your knowledge or consent, in order to demand money they never earned to begin with; everyone involved is already paying their fair share for their connection to the global network, and for the interconnections between parts of that global network.

To me, when a company I do business with uses me, and degrades my experience of their product, without my knowledge or consent, that's fraud from a consumer standpoint. Whatever Netflix might think about the deal, whether Netflix is right or wrong in the matter, doesn't enter into it; I'm paying for broadband so that I can watch Netflix movies, I'm paying for Netflix so that I can watch movies over my broadband connection, and my ISP is going behind my back and threatening to make my experience worse if Netflix doesn't do what they want. Nobody asked me how I feel about it.

Of course, they could give full disclosure to their customers (though they never would), and it wouldn't matter a whole lot, because your options as a broadband consumer are extremely limited; in the majority of cases, the only viable solution is cable, and when there is competition, it comes from exactly one place: the phone company. The cable companies and phone companies are alike in their use of their customers as hostages in negotiations.

What about fiber broadband? It's a red herring - it's provided by the phone company anyway. Calling fiber competition is like saying Coke in cans competes with Coke in bottles - it's all Coke, and whichever one you buy, your money goes into Coke's pocket.

What about wireless? Wireless will never, ever be able to compete with wired service, due to simple physics. The bandwidth just isn't there, the spectrum isn't there, there's noise to contend with, and usage caps make wireless broadband a non-starter for many cases, especially streaming HD video. Besides, the majority of truly high-speed wireless service is provided by the phone companies anyway; see the previous paragraph.

Why aren't they regulated? The FCC is trying, in its own way, but there's little traction; the cable and telephone companies have the government in their collective pockets with millions of dollars of lobbying money, and We The People haven't convinced Our Government that we care enough for them to even consider turning down that money.

In the United States, we pay many, many times what people pay in much of the developed world, and we get many, many times less for what we spend. On top of that, our ISPs are using us as bargaining chips, threatening to make our already overpriced, underpowered service even worse if the companies we actually chose in a competitive market - unlike our ISPs - don't pay up. This is absolutely preposterous, it's bordering on consumer fraud, and you should be angry about it. You should be angry enough to write your congressman, your senator, the president, the FCC, and your ISP (not that the last will do you much good, but it can't hurt).

Some excellent places to find more information: