This started as a Google+ post about an article on getting top engineering talent and got way too long, so I'm posting here instead.
I wholeheartedly agree that 18-hour days are just not sustainable. It might work for a brand-new startup cranking out an initial release, understaffed and desperate to be first to market. At that stage, you can expect the kind of passion and dedication that leads a small team to put in those hours and give up their lives to build something new.
Once you've built it, though, the hours become an issue, and the playpen becomes a nuisance. You can't expect people to work 18-hour days forever, or even 12-hour days. People far smarter than I have posited that the most productive time an intellectual worker can put in on a regular basis is 4 to 6 hours per day; after that, productivity and effectiveness plummet, and it only gets worse the longer it goes on.
Foosball isn't a magical sigil protecting engineers from burn-out. Paintball with your coworkers isn't a substitute for drinks with your friends or a night in with your family. An in-house chef sounds great on paper, until you realize that the only reason they'd need to provide breakfast and dinner is if you're expected to be there basically every waking moment of your day.
Burn-out isn't the only concern, either. Engineering is both an art and a science, and like any art, it requires inspiration. Inspiration, in turn, requires experience. The same experience, day-in, day-out - interacting with the same people, in the same place, doing the same things - leaves one's mind stale, devoid of inspiration. Developers get tunnel-vision, and stop bringing new ideas to the table, because they have no source for them. Thinking outside the box isn't possible if you haven't been outside the box in months.
Give your people free coffee. Give them lunch. Give them great benefits. Pay them well. Treat them with dignity and respect. Let them go home and have lives. Let them get the most out of their day, both at work and at home. You'll keep people longer, and those people will be more productive while they're there. And you'll attract more mature engineers, who are more likely to stick around rather than hopping to the next hip startup as soon as the mood strikes them.
There's a certain pride in being up until sunrise cranking out code. There's a certain macho attitude, a one-upmanship, a competition to see who worked the longest and got the least sleep and still came back the next morning. I worked from 8am until 4am yesterday, and I'm here, see how tough I am? It's the geek's equivalent to fitness nuts talking about their morning 10-mile run. The human ego balloons when given the opportunity to endure self-inflicted tortures.
But I'm inclined to prefer an engineer who takes pride in the output, not the struggle to achieve it. I want someone who is stoked that they achieved so much progress, and still left the office at four in the afternoon. Are they slackers, compared to the guy who stayed another twelve hours, glued to his desk? Not if the output is there. It's the product that matters, and if the product is good, and gets done in time, then I'd rather have the engineer that can get it done without killing themselves in the process.
"I did this really cool thing! I had to work late into the night, but caffeine is all I need to keep me going. I kept having to put in hacks and work-arounds, but the important thing is that it's done and it works. I'm a coding GOD!" Your typical young, proud engineer. They're proud of the battle, not the victory; they're proud of how difficult it was.
"I did this really cool thing! Because I had set myself up well with past code, it was a breeze. I was amazed how little time it took. I'm a coding GOD!" That's my kind of developer. That's pride I can agree with. They're proud because of how easy it was.
This might sound like an unfair comparison at first, but think about it. When you're on a 20-hour coding bender, you aren't writing your best code. You're frantically trying to write working code, because you're trying to get it done as fast as you can. Every cut corner, every hack, every workaround makes the next task take that much longer. Long hours breed technical debt, and technical debt slows development, and slow development demands longer hours. It's a vicious cycle that can be extremely difficult to escape, especially once it's been institutionalized and turned into a badge of honor.
2013-12-18
2013-12-16
My Personal Project Workflow/Toolset
I do a lot of side projects, and my personal workflow and tooling is something that's constantly evolving. Right now, it looks something like this:
- Prognosticator for tracking features/improvements, measuring the iceberg, and tracking progress
- WorkFlowy for tracking non-development tasks (the most recent addition to the toolset)
- Trac for project documentation, and theoretically for defect tracking, though I've not been good about entering defects in Trac recently; it doesn't seem worth the effort on a one-person project, though with multiple people I think it would be a must
- Trello for cross-cutting all the above and indicating what's next/in progress/recently completed, and for quickly jotting down ideas/defects. Most of the defect tracking actually goes in here on one-man projects right now. This is a lot of duplication and the main source of waste in my current process.
- Bitbucket for source control (I also use Atlassian's excellent SourceTree as a Git/Hg client.)
It's been working well for me; the only issue I have is duplication between the tools, and failing to consistently use Trac for defect tracking. What keeps me in Trello is how quick and easy it is to add items to it, and the fact that I'm using it as a catch-all - I can put a defect or an idea or a task into it in a couple of seconds; I just have to replicate it to the appropriate place later, which is the problem.
I think the issue boils down to being torn between convenience, having a centralized repository for "stuff to be done" (Trello), and having dedicated repositories catered to each type of thing to be done (Prognosticator, Trac, and WorkFlowy). Trello is excellent for jotting something down quickly, but lacks the specialized utility of the other tools for their specific purposes.
I think what I'll end up doing is creating a "whiteboard" list in WorkFlowy, and using that instead of Trello to jot down quick notes when I don't have the time to use the individual tools; then I can copy from there to the other tools when I need to. That will allow me to cut Trello down to basically being a Kanban board.
2013-11-04
Behold, My Newest Creation!
I'm entirely too proud to announce my latest creation, Rogue Prognosticator. This is a web-based application for project estimation and scheduling for software development. I've written about these topics before, and rest assured I will again; you can count on the concepts you see discussed here being taken into account in the software.
Right now the site is in open beta, free for public use. As features are added, some may be subscriber-only, or may start out being subscriber-only.
This project breaks a lot of new ground for me, and I've learned a lot already.
- It's my first project from scratch using ASP.NET MVC or Entity Framework.
- It's my first personal project in production using C# or .NET.
- It's the first time I've used Windows Azure.
- It's the first time I've used UserVoice.
- It's the first time I've used continuous deployment from Git on a personal project in production.
- It's the first time I've used SQL Server on a personal project in production.
- It's the first time I've used WordPress in production.
This project was, as you may have guessed, the source for my post on entity framework model queries, as well as my post on value-based prioritization.
I've been using the project as I've been building it, and it's already been an excellent tool for me. Prioritizing features by estimated return was a particularly enlightening experience; it really helped me to get an objective look at the near-term plan and organize development more effectively.
I'll still talk here about the nitty-gritty of development, but official product announcements will be coming through the product blog, Rogue Prognostications. I hope that others will find this project as useful as I have. Please feel free to drop any comments, questions, suggestions, or other feedback on the Rogue UserVoice.
More to come - watch this space!
2013-10-29
Simple Entity Framework Model Structure
I'll say right up front, I don't have a lot of experience with Entity Framework, and this could either be a well-known solution or a completely foolish one. All I know is that, so far, it has worked extremely well for my purposes.
Coming from the Java world, I'm used to using DAOs to serve as an abstraction layer between the controllers and the database, with the basic CRUD methods, plus additional methods for any specific queries needed for a given entity type.
Conveniently, Entity Framework provides a fairly complete DAO in the form of DbSet. DbSet is very easy to work with, provides full CRUD functionality, and acts as a proxy for more complex queries. I wanted to keep queries out of my logic, however, and in the model.
Looking at it, I didn't want to have to write an entire wrapper for DbSet, and subclassing it seemed like asking for trouble. That's when it occurred to me to use extension methods for queries. It turns out you can define extension methods against a generic type with a type argument specified (e.g. this IEnumerable<Tag>). This not only allowed me to abstract out the queries and keep them in the model without having to wrap or subclass anything; by defining the extensions on IEnumerable<Tag> instead of DbSet<Tag>, I also have access to my queries on any collection of the appropriate entity type, not just DbSet. I can then chain my custom queries in a very intuitive and fluid way, keeping all of the code clean, simple, and separate.
For example, I have a table of tags. I've created extension methods on IEnumerable<Tag> to filter to tags used by a given user, and to filter to tags starting with a given string. I can chain these to get tags used by a given user and starting with a given string. I can also use these queries on the list of tags associated with another entity, as IList<Tag> implements IEnumerable<Tag>, and thus inherits my query extension methods.
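To make that concrete, here's a minimal sketch of the pattern; the Tag entity and the method names (UsedBy, StartingWith) are my own illustration rather than the project's actual code:

using System;
using System.Collections.Generic;
using System.Linq;

public class Tag
{
    public int Id { get; set; }
    public int UserId { get; set; }
    public string Name { get; set; }
}

public static class TagQueries
{
    // Defined against IEnumerable<Tag>, so these work on a DbSet<Tag>,
    // a navigation property like IList<Tag>, or any other collection.
    public static IEnumerable<Tag> UsedBy(this IEnumerable<Tag> tags, int userId)
    {
        return tags.Where(t => t.UserId == userId);
    }

    public static IEnumerable<Tag> StartingWith(this IEnumerable<Tag> tags, string prefix)
    {
        return tags.Where(t => t.Name.StartsWith(prefix, StringComparison.OrdinalIgnoreCase));
    }
}

// Usage - the queries chain naturally:
// var suggestions = context.Tags.UsedBy(currentUserId).StartingWith("dev");

One caveat worth noting: because these extensions target IEnumerable<Tag>, calling them on a DbSet<Tag> filters in memory (LINQ to Objects) rather than translating the predicate to SQL. For small tables that's fine; defining the same extensions against IQueryable<Tag> would let Entity Framework push the work to the database.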
I don't know if this is the best way - or even a good way - but it's worked for me so far. I do see some possible shortcomings; mainly, the extensions don't have access to the context, so they can't query any other DbSets, only the collection they're called against. This means that only explicit relationships can be queried, which hasn't been a roadblock so far in my (admittedly simple) application. I'm not sure this is really a drawback, though - you can still add a parameter to pass in an IEnumerable to query against, which again offers the flexibility to pass a DbSet or anything else.
2013-10-18
Pragmatic Prioritization
The typical release scheduling process works something like this:
1. Stakeholders build a backlog of features they'd like to see in the product eventually.
2. The stakeholders decide among themselves the relative priority of the features in the backlog.
3. The development team estimates the development time for each feature.
4. The stakeholders set a target feature list and ship date based on the priorities and estimates.
The problem here is primarily in step 2; this step tends to involve a lot of discussion bordering on arguing bordering on in-fighting. Priorities are set at best based on a sense of relative importance, at worst based on emotional attachment. Business value is a vague and nebulous consideration at most.
I propose a new way of looking at feature priorities:
1. Stakeholders build a backlog of features they'd like to see in the product eventually.
2. The stakeholders estimate the business value of each feature in the backlog.
3. The development team estimates the development time for each feature.
4. The stakeholders set a target feature list and ship date based on the projected return of each feature - i.e., the estimated business value divided by the estimated development time.
This turns a subjective assessment of relative priorities into an objective estimate of business value, which is used to determine a projected return on investment for each feature. This can then be used to objectively prioritize features and schedule releases.
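To make the arithmetic concrete, here's a minimal sketch in C#; the Feature class, its fields, and the sample numbers are hypothetical:

using System.Collections.Generic;
using System.Linq;

public class Feature
{
    public string Name { get; set; }
    public double BusinessValue { get; set; }  // stakeholder estimate (dollars, points, any consistent unit)
    public double DevDays { get; set; }        // development team's estimate

    // Projected return: estimated business value per estimated day of work.
    public double ProjectedReturn
    {
        get { return BusinessValue / DevDays; }
    }
}

public static class ReleasePlanner
{
    // Highest projected return first; schedule releases from the top down.
    public static List<Feature> Prioritize(IEnumerable<Feature> backlog)
    {
        return backlog.OrderByDescending(f => f.ProjectedReturn).ToList();
    }
}

For example, a feature valued at 10,000 that takes 5 days (2,000 per day) outranks a feature valued at 30,000 that takes 30 days (1,000 per day), even though the second is "worth more" in absolute terms.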
I've been using this workflow recently for one of my upcoming projects, and I feel like it's helped me to determine feature priorities more objectively, taking a lot of the fuzziness and hand-waving out of the equation.
Shameless self-promotion: Pragmatic prioritization is a feature of my project scheduling and estimation tool, Rogue Prognosticator.
2013-09-17
JComboBox with Custom ComboBoxModel Not Updating Value on setSelectedItem()
I wrestled with this issue for some time before figuring out the cause, so I hope this helps someone out there. I had a JComboBox with a custom ComboBoxModel. Everything was working fine, except that a call to setSelectedItem would do everything it should (fire events, update the selected item property) without updating the displayed value in the JComboBox itself. Clicking the drop-down would even show the correct item selected, and getSelectedItem() returned the correct result; it was just the box itself that was wrong.
Apparently the displayed value in the JComboBox isn't based on getSelectedItem(), but rather on the top item in the list. I don't know how or why this is, or if it's due to some intricacies of my GUI code, but bringing the selected item to the top of the ComboBoxModel's item list when calling setSelectedItem fixed the issue. Go figure.
If anyone has any insight into what causes this, please drop a comment!
2013-09-10
Quadropus Rampage
I've taken up the excellent indie mobile title (and product of the 7-day roguelike challenge) Quadropus Rampage. It's an all-around excellent game, with some hilarious content, solid gameplay, and excellent replayability. It's free to play, with in-app purchases, and one of the few cases where I've made an IAP in order to support the developers.
I haven't been playing long, and I haven't beaten it, but I thought I'd toss out a few tips, tricks, and strategies I've learned along the way.
Mechanics:
- Attack has longer range than you think, and different weapons have different ranges.
- Hold down attack to get a spin attack that damages all enemies around you. You'll end up turned about 60 degrees counter-clockwise from the direction you were facing when you started the spin.
- Note that the spin attack deals less damage than your normal attack. Note also that pressing and holding the attack button still triggers a normal attack while the spin charges. This means you can strike, hold down the button, then release to get a quick one-two combo. Practice the timing of holding down the attack button; it can make a huge difference on crowded maps.
- Smash attack does a ton of damage in a radius similar to the spin attack (farther with upgrades & masteries), as well as knocking enemies back (and possibly off ledges.)
- Dodge lets you move over empty spaces and even off the edges of the map. You can hold down the dodge button to continue flying around the map until you release it.
- Bubble gives you a temporary shield that blocks all damage until it expires.
- Bingo flings himself toward a random nearby enemy every few seconds. If there are enemies grouped together, or in a line, he will damage every enemy he passes through. He does a lot of damage, and can crit.
- The Rage meter (top of the screen) fills up as you deal damage to enemies, and rapidly depletes over time. If it gets to full, you enter a Rampage, dealing bonus damage and taking reduced damage from enemies. In order to enter a rampage, you'll have to continuously dish out damage long enough to fill the meter before it starts to fall. This gets easier with more upgrades, and at lower depths (when there are more enemies to work with.)
Techniques:
- Most levels I start by dodging into the middle of the map, trying to lure as many enemies as possible into a central area, then I smash attack to take out as many as I can at once, and knock the rest away from me to get some breathing room.
- Dodging toward an enemy and then attacking is an excellent way to deal damage without taking any yourself. You can dodge in, attack, and dodge back out if the attack isn't enough to kill.
- Against large enemies, you can always run up, bubble, and hack away at them continuously until the bubble expires, then dodge away.
- Heartfish move pretty slowly, but they do follow you. If you're in trouble, dodge toward them to grab them, or bubble then dodge so you can grab them without dying on the way. If you're at or near full health, dodge away from them toward your enemies, to avoid picking them up until you actually need them.
- The most important weapon stats are health and damage; everything else is nice, but not nearly as important. Weapon size also plays a part, but generally speaking, just look for weapons where the top two stats (damage and health) are green (better than what you have now.) Learn to swap weapons quickly in the midst of a melee when you find a better weapon.
- Depth charges are excellent tools, but can be difficult to use properly. They always appear at the edges/corners of the map, so often the best technique is to dodge off the edge of the map, come at the depth charge from the far side, then smack it toward your enemies. The same basic techniques for the depth charges apply to Bingo's ball as well.
- If you smash attack off the edge or through a hole, you'll land in the next depth with a smash attack. If you have the mastery upgrade that refreshes your smash attack cooldown on each depth, you'll land with a smash attack and no cooldown. This makes it a viable strategy, if you end a level with full health and full smash, to smash off the edge of the map, destroy what you can when you land, dodge off toward another group of enemies, and smash attack again. At later depths, this is almost certain to trigger a rampage, letting you clean up the level in no time.
Upgrades:
- Strength, Vitality, and Smash are the most important skills; invest in these first. I did it round-robin in that order (Strength 1, Vitality 1, Smash 1, Strength 2, etc.) and it worked well for me.
- Next most important are probably Bingo and Bubble, in that order.
- Rampage isn't the least important; however, it doesn't really come into play until the lower depths, and until you've got the other skills levelled up enough.
- Keep in mind what upgrade you want next and how much it costs; you can pause mid-game and buy the upgrade as soon as you can afford it. You aren't limited to purchasing upgrades between games.
Masteries:
- Masteries are a combination of upgrade and achievement. When you hit a certain goal, the mastery will be unlocked, and you'll get the option of two upgrades for each mastery, which you can switch between at any time (including mid-game).
- Any time you get an achievement while playing, it's a good idea to pause, go into the character screen, and choose an upgrade for that mastery, to gain the bonus as soon as possible (neither option is selected by default, you must select one yourself to gain any benefit.)
- Keep in mind that you can switch mastery bonuses mid-game as well if you need to. I've not run into a situation where this would be needed.
- Many of the masteries will happen when they happen, but most can be achieved with considered action. I strongly recommend picking a mastery and focusing on it during your gameplay; for example, focus on using your smash attack as often as possible until you get that mastery, or focus on dodging over and over and over until you get that mastery, and so on.
- None of the mastery bonuses are game-changing, but many are very good, and the combination of a few of them, plus some upgrades, quickly make the first few depths a cakewalk.
Pets:
- You can have two pets active at a time, not including Bingo. Bingo is always active, and does not count as a pet. Likewise, the Bingo upgrades don't affect your other pets.
- I've only used Cy and Saww, but both have been very effective for me, though I'm considering swapping Cy for Smiles.
- It doesn't seem like there's a significant imbalance between them; I think it's mainly a matter of personal preference and play style.
Artifacts/Grubby:
- Don't bother purchasing anything from Grubby until you've maxed out all the upgrades. Your orbs are better spent there. You're very unlikely to beat the game without maxed upgrades, no matter how many fancy items you pick up from Grubby.
- Many of the artifacts just give a 20% increase to damage against a particular type of enemy. These are nice, but not worth spending orbs to buy them from Grubby.
- The best artifacts, in my experience, are Heartfish Food, Fountain Pen, Bingarang, Forn Orb, Third Eye, Bermuda Triangle, Bingo Unchained, Spiked Collar, Star Biscuit, Embiggener, Urchin Spines, Gorgo's Shovel, and particularly Lucky Coin (free resurrection!).
- Don't waste your orbs on buying weapons from Grubby unless a) it's ridiculously better than anything you've seen at your current depth, and/or b) you're within two depths of facing off against Pete. The rest of the time, it's just not worthwhile unless you have so many orbs you don't care any more.
Purchases:
- You can purchase orbs (for buying upgrades and buying items from Grubby in game) and dubloons (for buying pets and resurrections) in the in-app store. These are relatively cheap compared to most games with similar freemium models.
- Any purchase will earn you a new starting weapon that's significantly better than the starting tennis racket; in fact, if you make a purchase, your new starting weapon will last you the first several depths easily.
- Don't waste your dubloons on unlocking masteries; they generally aren't worth what they cost in dubloons, especially since you can earn them through playing anyway.
Synergies: some things just work particularly well in combination. For example:
- All Dodge upgrades, Saww, Flurry upgrade from Quick mastery, Inksplosion upgrade from Nimble mastery, and Fountain Pen: dodge to kill. You cause an explosion (dealing damage and causing knockback) when you start a dodge, you get bonus damage when you end a dodge, and both you and your pet deal damage during a dodge.
- All Rampage upgrades, either upgrade from the Brawler mastery, Supple Crits upgrade from the Hulk mastery, Bingarang, Forn Orb, Eye Patch, and Third Eye: ultimate rampage. You shoot lasers out of your face. Bingo shoots lasers out of his face. He does this while spinning continuously around the map until the rampage ends. And you rampage more often. What's not to love? If you take Pain Tolerance from Brawler, and have some or all of the above dodge stuff, you can indiscriminately fly around the map lasering everything in sight while taking reduced damage. Alternatively, take the I'm Always Angry upgrade to rampage more often.
- Saww, Smiles, Bingarang, Forn Orb, Bingo Unchained, Spiked Collar, Star Biscuit, and Urchin Spines: let the pets do the work. Park yourself in an urchin for safety, dodging briefly to keep Saww going.
2013-08-04
Assumptions and Unit Tests
I've written previously on assumptions and how they affect software development. Taking this as a foundation, the value proposition of unit testing becomes much more apparent: it offers a measure of reassurance that certain assumptions are valid. By mechanically testing components for correctness, you're allowing yourself the freedom to safely assume that code which passes its tests is highly unlikely to be the cause of an issue, so long as there is a test in place for the behavior you're using.
This can be a double-edged sword: it's important to remember that a passing test is not a guarantee. Tests are written by developers, and developers are fallible. Test cases may not exercise the behavior in precisely the same way as the code you're troubleshooting. Test cases may even be missing for the particular scenario you're operating under.
By offering a solid foundation of trustworthy assumptions, along with empirical proof as to their validity, you can eliminate many possible points of failure while troubleshooting, allowing you to focus on what remains. You must still take steps to verify that you do have test coverage for the situation you're looking at, in order to have confidence in the test results. If you find missing coverage, you can add a test to fill the gap; this will either pass, eliminating another possible point of failure, or it will fail, indicating a probable source of the issue.
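As a minimal sketch of what a gap-filling test might look like (using NUnit; the Order class and the zero-quantity scenario are hypothetical):

using NUnit.Framework;

public class Order
{
    public int Quantity { get; set; }
    public decimal UnitPrice { get; set; }
    public decimal Total() { return Quantity * UnitPrice; }
}

[TestFixture]
public class OrderTests
{
    // Suppose existing tests only cover positive quantities; this test fills
    // the gap. If it passes, zero-quantity handling is eliminated as a suspect;
    // if it fails, we've probably found the source of the issue.
    [Test]
    public void Total_IsZero_WhenQuantityIsZero()
    {
        var order = new Order { Quantity = 0, UnitPrice = 9.99m };
        Assert.AreEqual(0m, order.Total());
    }
}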
Just don't take unit test results as gospel; tests must be maintained just like any other code, and just like any other code, they're capable of containing errors and oversights. Trust the results, but not the tests, and learn the difference between the two: results will reliably tell you whether the test you've written passed or failed. It is the test, however, that executes the code and judges passing or failing. The execution may not cover everything you need, and the judgement may be incorrect, or not checking all factors of the result.
2013-07-04
Feature Disparity Between Web, Mobile, and Desktop
I understand that mobile is new territory, and that web applications have certain restrictions on them (though less and less so with modern standards and modern browsers), but it seems very strange to me that there are still such glaring disparities between the web, mobile, and desktop versions of some products - even products designed with mobile in mind.
Take Evernote as an example. It's been out for Android for ages, with regular new releases offering new features and functionality. Yet there are still basic features that are not available in the mobile client, including strike-through text, horizontal rules, alignment, and font face/size changes. If you have a note with these features, and you edit the note in the Android app, you get a friendly warning that the note contains unsupported features, and the editor forces you to edit paragraph-by-paragraph, like the old and irritating Google Docs app for Android. I find this more than a little bit ridiculous; why are you adding new, nice-to-have features when basic functionality is still unsupported?
Look at Google Keep for the opposite example. The mobile app allows reordering the items in a checklist with drag-and-drop. The web app doesn't allow you to reorder items. The only way to reorder items is using cut and paste. This is something you can absolutely achieve in a web app, and they've done it before, but for some reason that one, basic, important feature is just somehow missing.
The Mint mobile app allows changing budgets, but not changing whether the budget surplus/deficit should roll over month-to-month, which you can do in the web app. It's most of the feature, just missing one little piece - and that causes frustration, because when most of the feature is there, you expect the whole feature to be there.
The GitHub web app doesn't even include a git client - the closest you can get is downloading a repo, but you can't actually check out and manage a working copy.
The Google Maps app for Android doesn't allow editing your "My Maps", or choosing from (or creating) alternate routes when getting directions. It also doesn't include the web version's traffic forecasting. The Blogger mobile app is next to useless; editing a post created on the desktop gives you a WYSIWYG editor with the plain text littered with markup, and writing a post on mobile and then looking at it on desktop shows that there are some serious inconsistencies in the handling of basic formatting elements like paragraphs. Don't even get me started on the useless bundle of bytes that is the Google Analytics Android app; it's such a pathetic shadow of the web application that there's no point in even having it installed.
These seem to me like cases of failure to eat your own dog food. If employees of these companies - especially developers or product managers - were using these applications on each supported platform, these issues would have been solved. They're the sorts of things that look small and insignificant on a backlog until they affect you on a day-to-day basis; those little annoyances, repeated often enough, become sources of frustration.
2013-06-11
Teaching a Developer to Fish
I write a lot about development philosophy here, and very little about technique. There are reasons for this, and I'd like to explain.
In my experience, often what separates an easy problem from an intractable one is method and mindset. How you approach a problem tends to be more important than the implementation you end up devising to solve it.
Let's say you're given the task of designing a recommendation engine - people like you were interested in X, Y, and Z. Clearly this is an algorithmic problem, and a relatively difficult one at that. How do you solve it?
The algorithm itself isn't significant; as a developer, the algorithm is your output. The process you use to achieve the desired output is what determines how successful you'll be. I could talk about an algorithm I wrote, but that's giving a man a fish. I'd much rather teach a man to fish.
So how do you fish, as it were, for the perfect algorithm? You follow solid practices, you iterate, and you measure. That means you start with a couple of prototypes, you measure the results, you whittle down the candidate solutions until you have a best candidate, and then you refine it until it's as good as it can get. Then you deploy it to production, you continue to measure, and you continue to refine it. If you code well, you can A/B test multiple potential algorithms, in production, and compare the results.
How do you fish for a fix to a defect? You follow solid practices, you iterate, and you measure. You start by visual inspection, checking for code quality, and doing light refactoring to try to simplify the code and eliminate points of failure, to narrow down the possibilities. Often this alone will bring the root cause of the defect to the surface quickly, or even solve it outright. If it doesn't, you add logging, and you watch the results as you recreate the error, trying to recreate it in different ways, to assess the boundaries of the defect; if this is for an edge case, what exactly defines the "edge" that's affected? What happens during each step of execution when it happens? Which code is executing and which code isn't? What parameters are being passed around?
In my experience, logging tends to be a far more effective debugging tool than a step-wise debugger in most cases. With a strong logging framework, you can leave your logging statements in place in production with negligible performance impact (with debug logging disabled), and fine-grained controls let you turn up verbosity for the code you're inspecting without turning all logging on and destroying the signal-to-noise ratio of your logging output.
You follow solid practices, you iterate, and you measure. If you use the right process, with the right mindset, you'll wind up at the right solution.
That's why I tend to wax philosophical instead of writing about concrete solutions I've implemented. Chances are I wrote the solution to my problem, not your problem; and besides, I'd much rather teach a man to fish than give a man a fish.
2013-06-08
My Present Setup
I thought I'd take a quick moment to lay out my current setup. It's not perfect, it's not top-of-the-line (nor was it when any of the parts were purchased), it's not extravagant, but I find it extremely effective for the way I work.
The Machine (DIY Chronos Mark IV):
- Intel Core i5-750 LGA1156, overclocked from 2.66GHz to 3.2GHz
- ASRock P55 Extreme
- 8GB DDR3 from GSkill
- ATi Radeon HD 5870
- 256GB Crucial m4 SSD (SATA3) - OS, applications, caches & pagefile
- 2 x 1TB Seagate HDD - one data drive, one backup drive
- Plextor DVD-RW with LightScribe
I find this configuration plenty performant for most of my needs. The only thing that would prompt an upgrade at this point would be if I started needing to run multiple VMs simultaneously on a regular basis. The GPU is enough to play my games of choice (League of Legends, StarCraft 2, Total War) full-screen, high-quality, with no lag. The SSD keeps everything feeling snappy, and the data drive has plenty of space for projects, documents, and media. The second drive I have set up in Windows Backup to take nightly backups of both the primary and data drives.
My interface to it:
- Logitech G9x mouse (wired)
- Microsoft Natural Elite 4000 keyboard (wired)
- 2 x Dell U2412M 24" IPS LCD @ 1920x1200
- Behringer MS16 monitor speakers
If you couldn't tell, I have a strong preference for wired peripherals. This is a desktop machine; it doesn't go anywhere. Wireless keyboards I find particularly baffling for anything other than an HTPC setup; the keyboard doesn't move, why would I keep feeding it batteries for no benefit? The mouse is an excellent performer, and I love the switchable click/free scroll wheel (though I wish the button weren't on the bottom).
The displays are brilliant, beautiful, and low-power, and I definitely appreciate the extra few rows from 1920x1200 over standard 1080p. Having two of them suits my workflow extremely well; I tend to have one screen with what I'm actively working on, while the other holds some combination of reference materials, research, communications (chat, etc.), and testing for whatever I'm actively working on. Particularly when working with web applications, it's extremely helpful to be able to have code on one screen and the browser on the other, so you can make a change and refresh the page to view it without having to swap around. These are mounted on an articulated dual-arm mount to keep them up high (I'm 6'6", making ergonomics a significant challenge) and free up a tremendous amount of desk space - more than you'd think until you do it.
The Behringers are absolutely fantastic speakers - I love them to death, and I think I need to replace them. I recently rearranged my desk, and since hooking everything back up, the speakers have had a constant drone as long as they're turned on, even with the volume all the way down. I've swapped cables and fiddled with knobs, and I'm still not sure of the cause.
The network:
- ASUS RT-N66U "Dark Knight" router
- Brother MFC-9320CW color laser printer/scanner/copier/fax (on LAN via Ethernet)
- Seagate 2TB USB HDD (on LAN via USB)
The RT-N66U, or "Dark Knight" as it's often called, is an absolutely fantastic router. It has excellent wireless signal, it's extremely stable, and it's got two USB ports for printer sharing, a 3G/4G dongle, or NAS using a flash drive or HDD (which can be shared using FTP, Samba, and ASUS' aiDisk and aiCloud services). The firmware source is published regularly by ASUS, it's Linux-based, and it includes a complete OpenVPN server. It offers a separate guest wireless network with its own password, which you can throttle separately and restrict from accessing the internal network. It has enough features to fill an entire post on its own.
Mobility:
- Samsung Galaxy S4 (Verizon)
- ASUS Transformer Prime (WiFi only)
The SGS4 is an excellent phone, with a few quirks due to Samsung's modifications of the base Android OS. The display is outstanding, the camera is great, the phone is snappy and stable, and it has an SD card slot. That's about all I could ask for. The tablet I bought because I thought it would make an excellent mobile client for my VPN+VNC setup; unfortunately, I've had some issues getting VNC to work, and now that I'm on a 3840x1200 resolution, VNC @ 1080p has become less practical. However, it still serves as a decent mobile workstation using Evernote, Dropbox, and DroidEdit.
All in all, this setup allows me to be very productive at home, while providing remote access to files and machines, and shared access to the printer and network drive for everyone in the house. The router's NAS even supports streaming media to iTunes and XBox, which is a plus; between that, Hulu, and Netflix, I haven't watched cable TV in months.
2013-06-07
Code Patterns as Microevolution
Code patterns abide by survival of the fittest, within a gene pool of the code base. Patterns reproduce through repetition, sometimes with small mutations along the way. Patterns can even mate, after a fashion, by combining them, taking elements of each to form a new whole. This is the natural evolution of source code.
The first step to taming a code base is to realize the importance of assessing fitness and taking control over what patterns are permitted or encouraged to continue to reproduce. Code reviews are your opportunity to thin the herd, to cull the weak, and allow the strong to flourish.
Team meetings, internal discussions, training sessions, and learning investments are then your opportunity to improve both the quality of new patterns and mutations that emerge, as well as the group's ability to effectively manage the evolution of your source, to correctly identify the weak and the strong, and to have a lasting impact on the overall quality of the product.
If you think about it, the "broken windows" problem could also be viewed as bad genes being allowed to perpetuate. As the bad patterns continue to reproduce, their number grows, and so does their impact on the overall gene pool of your code. Given the opportunity, you want to do everything you can to make sure that it's the good code that's continuing to live on, not the bad.
Consider a new developer joining your project. A new developer will look to existing code as an example to learn from, and as a template for their own work on the project, perpetuating the "genes" already established. That being the case, it seems imperative that you make sure those genes are good ones.
They will also bring their own ideas and perspectives to the process, establishing new patterns and mutating existing ones, bringing new blood into the gene pool. This sort of cross-breeding is tremendously helpful to the overall health of the "code population" - but only if the new blood is healthy, which is why strong hiring practices are so critical.
2013-06-06
The New GMail for Android
The new GMail for Android UX sucks. I mean... it's really awful.
They've replaced the checkboxes next to each message (useful) with sender images (gimmick), or, if there is no sender image (i.e., everything that's not a G+ contact - so, every newsletter, receipt, order confirmation, etc. you'll ever get), a big colorful first initial (a completely useless waste of space). This image then acts as if it were the checkbox that used to be there (confusing) for selecting messages. You can turn off the images, but you don't get the checkboxes back; you can only tap-hold to select multiple messages, though this isn't mentioned anywhere - you just have to guess.
They've gotten rid of the delete button (why?), and moved it to the menu.
If you have no messages selected, pressing the device's menu key gives you the menu. However, if you do have messages selected, the menu key does nothing, instead you must tap the menu button that appears at the top-right of the display. It's not there if you don't have messages selected.
Once you're viewing a message, there are two menus: one when you tap the menu button, with 90% of the options in it, and another at the top-right that gives you just two options, forward and reply-all; this almost makes sense, except that it uses the same, standard "here's the menu" button that's used on (some) other screens as the *only* available menu.
In the message view they've also gotten rid of the delete button (to match the annoyance of the message list, I suppose).
There is also a new "label settings" screen that's fairly mysterious; I assume it applies to the current label, though this includes "Inbox", which - while I understand it's treated internally as a label - I think most users don't think of as being a label in the typical sense.
2013-06-02
Building a Foundation
It's been said that pharmaceutical companies produce drugs for pennies per pill - except the first pill, which costs millions. Things aren't so different in the land of software development: the first usage of some new functionality might take hours, building the foundation and related pieces. But it could be re-used a hundred times trivially, and usually expanded or modified with little effort as well (assuming it was well-written to start with).
This is precisely what you should be aiming for: take the time to build a foundation that will turn complex tasks into trivial ones as you progress. This is the main purpose behind design concepts like the single responsibility principle, the Hollywood principle, encapsulation, DRY, and so on.
This isn't to be confused with big upfront design; in fact, it's especially important to keep these concepts in mind in an agile process, where you're building the architecture as you go. It can be tempting to just hack together what you need at the moment. That's exactly what you should be doing for a prototype, but not for real development. For lasting functionality, you should assemble a foundation to support the functionality you're adding now, and similar functionality in the future.
It can be difficult to balance this against YAGNI - you don't want to build what you don't need, but you want to build what you do need in such a way that it will be reusable. You want to save yourself time in the future, without wasting time now.
To achieve a perfect balance would require an extraordinary fortune teller, of course. Experience will help you get better at determining what foundation will be helpful, though. The more experience you have and the more projects you work on, the better sense you'll have of what can be done now to help out future you.
2013-05-29
My Take on "Collective Ownership"/"Everyone is an Architect"
I love the idea of "collective ownership" in a development project. I love the idea that in a development team, "everyone is an architect". My problem is with the cut-and-dried "Agile" definition of these concepts.
What I've been reading lately is a definition of "collective ownership" that revolves around the idea of distributing responsibility, primarily in order to shift the focus away from finger-pointing and blaming. A defect isn't "your fault", it's "our fault", and "we need to fix it." That's all well and good, but distributing blame isn't exactly distributing ownership; and ignoring the source of an issue is a blatant mistake.
The latter point first: identifying the source of an issue is important. I see no need for blame, or calling people out, and certainly no point in trying to use defects as a hard metric in performance analysis. However, a development team isn't a factory; it's a group of individuals who are constantly continuing their education, and honing their craft, and in that endeavor they need the help of their peers and managers to identify their weaknesses so they know what to focus on. "Finding the source of an issue" isn't about placing blame or reprimanding someone, it's about providing a learning opportunity so that a team member can improve, and the team as a whole can improve through the continuing education of each member.
In regard to distributing ownership, it's all too rare to see discussion of distributing ownership in a positive way. I see plenty of people writing about eliminating blame, but very few speaking of a team wherein every member looks at the entire code base and says "I wrote that." And why should they? They didn't write it alone, so they can't make that claim. For the product, they can say "I had a hand in that," surely. But it's unlikely they feel like they had a hand in the development of every component.
That brings us around to the idea that "everyone is an architect." In the Agile sense, this is generally taken to mean that every developer is given relatively free rein to architect the component they're working on at any given moment, without bowing down to The Architect for their product. I like this idea, in a sense - I'm all for every developer doing their own prototyping, their own architecture, learning their own lessons, and writing their own code. Up to a point.
There is a level of architecture that it is necessary for the entire team to agree on. This is where many teams, even Agile teams, tend to fall back on The Architect to keep track of The Big Picture and ensure that All The Pieces Fit Together. This is clearly the opposite of "everyone is an architect". So where's the middle ground?
If a project requires some level of architecture that everyone has to agree on - language, platform, database, ORM, package structure, whatever applies to a given situation - then the only way to have everyone be an architect is design by committee. Panning design by committee has become a cliche at this point, but it has its uses, and I feel this is one of them.
In order to achieve collective ownership, you must have everyone be an architect. In order for everyone to be an architect, and feel like they gave their input into The Product as a whole - or at least had the opportunity to do so - you must make architectural decisions into group discussions. People won't always agree, and that's where the project manager comes in; as a not-an-architect, they should have no bias and no vested interest in what choices are made, only that some decision is made on each issue that requires consideration. Their only job in architectural discussions is to help the group reach a consensus or, barring that, a firm decision.
This is where things too often break down. A senior developer or two, or maybe a project manager with development experience, become de facto architects. They make the calls and pass down their decrees, and quickly everyone learns that if they have an architecture question, they shouldn't try to make their own decision, they shouldn't pose it to the group in a meeting, they should ask The Guy, the architect-pro-tem. Stand-up meetings turn into a doldrum of pointless status updates, and discussion of architecture is left out entirely.
Luckily, every team member can change this. Rather than asking The Guy when a key decision comes up, ask The Group. Even better, throw together a prototype, get some research together, and bring some options with pros and cons to the next stand-up meeting. Every developer can do their part to keep the team involved in architecture, and in ownership, and to slowly shift the culture from having The Architect to having Everyone Is An Architect.
2013-05-10
Lack of Progress Indicator
Zen Templates is still under active development, I just haven't been able to work on it much lately due to life interruptions - primarily, two weeks ago, we went out of town for my birthday, and when we got back, our ferret had become spontaneously 90% paralyzed in the hind half of his body. This was all at once terrifying, saddening, frustrating, and heartbreaking. This put a great many things on hold, and catching up with those things took priority over working on side projects. But, the ferret is back in fighting form now, and work will resume soon.
If you aren't familiar with ferrets, they are energetic, playful, crazy, impulsive, fearless creatures. Even as a kit, our little guy would take flying leaps at me, arms akimbo, ready to take me on. I'm six and a half feet tall and near two hundred pounds. He was about eight inches long, and about two pounds. He either didn't know or didn't care, because he was ready to party.
Seeing this little ball of fur and excitement - or, as my friend calls them, meat slinky - be crippled, but remain his energetic, fearless/clueless self, was absolutely heartbreaking. He still wanted to play, so he would race around as best he could with his front legs - dragging his back legs behind him like dead weight. He would try to climb the bars of his cage, arms-only, with a facility that would make a capoeirista stare slack-jawed, but he couldn't keep it up for long without his back legs. It was just too much effort.
The worst, the most heartbreaking sight of all, though, was the weasel war dance. If you've never seen or heard of the weasel war dance, look it up on YouTube. There are tons of videos. Really, go, I'll wait.
Charming, isn't it? Now, imagine you've seen this creature do this dance a hundred times, full of grit and energy and fearlessness. Now, imagine watching this same creature, attempting to do this same dance, only for reasons unknown to him, half of his body won't cooperate. He desperately tries to hop and scurry and fling himself around, but it just doesn't work. Instead, he hobbles, crawls, and falters, and in short order, is too exhausted to try, collapsing on himself, panting.
Needless to say, it was awful; for him, and for us. But, thankfully, with the help of the excellent people at the Ark Animal Hospital, modern medicine in the form of Prednisone, and some old fashioned TLC, he's back in fighting form.
As soon as I get caught up, I will be too.
The Importance of Logging
Add more logging. I'm serious.
Logging is what separates an impossible bug report from an easy one. Logging lets you replace comments with functionality. I'd even go so far as to say good logging separates good developers from great ones.
Try this: replace your inline comments with equivalent logging statements. Run your program and tail the log file. Suddenly, you don't need a stepwise debugger for the vast majority of situations, because you can see, in the log, exactly what the program is doing, what execution path it's taking, where in the source each logging statement comes from, and where execution stopped in the event of a crash.
My general development process focuses on clean, readable, maintainable, refactorable, self-documenting code. The process is roughly like this:
- Block out the overall process, step by step, in comments.
- For any complex step (more than five or ten lines of code), replace the comment with a clearly-named method or function call, and create a stub method/function.
- Replace comments with equivalent logging statements.
- Implement functionality.
- Give all functions, methods, classes, parameters, properties, and variables clear, concise names, so that the code ends up in some semblance of readable English.
- Use thorough sanity checking, by means of assertions or simple if blocks. When using if blocks, include logging for any failed checks, including what was expected and what was found. These should be warnings.
- Include logging in any error/exception handling code. These should be errors if recoverable, or fatal if not. This is all too often the only logging a developer includes!
- Replace inline comments with equivalent logging statements. These should be debug or info/trace level; major section starts should be higher level, while mid-process statements should be lower level.
- Add logging statements to the start of each method/function. These should also be debug or info/trace level. Use higher-level logging statements for higher-level procedures, and lower-level logging statements for more deeply-nested calls.
- For long-running or resource-intensive processes, particularly long loops, add logging statements at regular intervals to provide progress and resource utilization details.
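To make this concrete, here's a minimal sketch of where the process ends up - comments replaced by leveled logging statements. (SLF4J is used here purely for illustration; any leveled logging facade works, and the class and method names are hypothetical.)

import java.util.List;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class OrderImporter {
    private static final Logger log = LoggerFactory.getLogger(OrderImporter.class);

    // Each logging statement stands where an inline comment would have gone.
    public void importOrders(List<String> records) {
        log.info("Importing {} order records", records.size());

        int processed = 0;
        for (String record : records) {
            log.debug("Parsing record: {}", record);

            // Sanity check, logging what was expected and what was found.
            if (record == null || record.isEmpty()) {
                log.warn("Expected a non-empty record, found '{}'; skipping", record);
                continue;
            }

            try {
                saveOrder(record);
            } catch (RuntimeException e) {
                // Recoverable error handling gets error-level logging with context.
                log.error("Failed to save order record '{}'", record, e);
                continue;
            }

            processed++;
            // Progress logging at regular intervals for long loops.
            if (processed % 1000 == 0) {
                log.info("Processed {} of {} records", processed, records.size());
            }
        }
        log.info("Import complete: {} of {} records saved", processed, records.size());
    }

    private void saveOrder(String record) {
        // Stub; the implementation isn't relevant to the example.
    }
}

Tail the log while this runs and you can watch the execution path scroll by, no debugger required.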
Make good use of logging levels! Production systems should only output warnings and higher by default, but it should always be possible to enable deeper logging in order to troubleshoot any issues that arise. However, keep the defaults in mind, and ensure that any logging you have in place to catch defects will provide enough information in the production logs to at least begin an investigation.
Your logging messages should be crafted with dual purpose in mind: first, to provide useful, meaningful outputs to the log files during execution (obviously), but also to provide useful, meaningful information to a developer reading the source - i.e., the same purpose served by comments. After a short time with this method you'll find it's very easy to craft a message that serves both purposes well.
Good logging is especially useful in an agile environment employing fast iteration and/or continuous integration. It may not be obvious why at first, but all the advantages of good logging (self-documenting code, ease of maintenance, transparency in execution) do a lot to facilitate agile development by making code easier to work with and easier to troubleshoot.
But wait, there's more! Good logging also makes it a lot easier for new developers to get up to speed on a project. Instead of slogging through code, developers can execute the program with full logging, and see exactly how it runs. They can then review the source code, using the logging statements as waypoints, to see exactly how the code relates to the execution.
If you need a tool for tailing log files, allow me a shameless plug: try out my free log monitor, Rogue Informant. It's been in development for several years now, it's stable, it's cross-platform, and it's completely free to use privately or commercially. It allows you to monitor multiple logs at once, filter and search logs, and float a log monitoring window on top of other applications, making it easier to watch the log while using the program and see exactly what's going on behind the scenes. Give it a try, and if you find any issues or have feature suggestions, feel free to let me know!
2013-05-06
The Problem with Responsive Design
A huge problem I see with responsive/adaptive design today is that, all too often, it treats "small viewport" and "mobile" as being synonymous, when the two concepts are orthogonal. A mobile device can have a high-resolution display, just as a desktop user can have a small display, or just a small browser window.
Responsive designs need to respond to viewport size, and nothing more. It's not mobile, it's a small display. Repeat that to yourself about a thousand times.
What's holding back single-design philosophies isn't display size, it's user interface; for decades, web designers have counted on there being a mouse cursor to generate events - mouseovers, clicks, drags. That's not how it works on touchscreen devices, and we need some facility - JavaScript checks, CSS media queries - to cater to touch-based devices as opposed to cursor-based devices.
Sanity Checks: Assumptions and Expectations
Assertions and unit tests are all well and good, but they're too narrow-minded in my eyes. Unit tests are great for, well, testing small units of code to ensure they meet the basic requirements of a software contract - maybe a couple of typical cases, a couple of edge cases, and then additional cases as bugs arise and new test cases are created for them. No matter how many cases you create, however, you'll never have a test case for every possible scenario.
Assertions are excellent for testing in-situ; you can ensure that unacceptable values aren't given to or by a piece of code, even in production (though there is a performance penalty to enabling assertions in production, of course.) I think assertions are excellent, but not specific enough: any assertion that fails is automatically a fatal error, which is great, unless it's not really a fatal error.
That's where the concept of assumptions and expectations comes in. What assertions and unit tests really do is test assumptions and expectations. A unit test says "does this code behave correctly when given this data, all assumptions considered?" An assertion says "this code assumes this thing, and will not behave correctly if it gets another, so throw an error."
When documenting an API, it's important to document assumptions and expectations, so users of the API know how to work with your code. Before I go any further, let me define what I mean by these very similar terms: to me, code that assumes something operates as if its assumptions are correct, and will likely fail if its assumptions turn out to be incorrect. Code that expects something operates as if its expectations are met, but will likely still operate correctly even if they aren't. It's not guaranteed to work, or guaranteed to fail; it's likely to work, but someone should probably know about it and look into it.
Therein lies the rub: these are basically two types of assertions, one fatal, one not. What we need is an assertion framework that allows for warning-level assertion failures. What's more, we need an assertion framework that is performant enough to be regularly enabled in production.
So, any code that's happily humming along in production, that says:
Assume.that(percentage).isBetween(0,100);
will fail immediately if percentage is outside those bounds. It's assuming that percentage is between zero and one hundred, and if it assumes wrong, it will likely fail. Since it's always better to fail fast, any case where percentage is outside that range should trigger a fatal error - preferably even if it's running in production.
On the other hand, code that says:
Expect.that(numRows).isLessThan(1000);
will trigger a warning if numRows is over a thousand. It expects numRows to be under a thousand; if it isn't, it can still complete correctly, but it may take longer than normal, or use more memory than normal - or it may simply be that something is amiss with the query that fetched the rows, or with the dataset they came from. It's not a critical failure, but it's cause for investigation.
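To my knowledge no such framework exists off the shelf, so here's a minimal sketch of how one might look; the names and behavior are extrapolated from the two examples above, not taken from any real library.

import java.util.logging.Logger;

// Sketch of a two-level assertion framework: Assume failures are fatal,
// Expect failures log a warning and let execution continue.
final class Assume {
    static Checked that(long value) { return new Checked(value, true); }
}

final class Expect {
    static Checked that(long value) { return new Checked(value, false); }
}

final class Checked {
    private static final Logger log = Logger.getLogger(Checked.class.getName());
    private final long value;
    private final boolean fatal;

    Checked(long value, boolean fatal) {
        this.value = value;
        this.fatal = fatal;
    }

    Checked isBetween(long min, long max) {
        if (value < min || value > max) {
            fail(value + " is not between " + min + " and " + max);
        }
        return this;
    }

    Checked isLessThan(long limit) {
        if (value >= limit) {
            fail(value + " is not less than " + limit);
        }
        return this;
    }

    private void fail(String message) {
        if (fatal) {
            // A failed assumption is a fatal error: fail fast, even in production.
            throw new AssertionError("Assumption failed: " + message);
        }
        // A failed expectation is recoverable: warn and carry on.
        log.warning("Expectation failed: " + message);
    }
}

With this in place, both examples work as written: the Assume call throws, while the Expect call only logs a warning.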
Any assumption or expectation that fails should of course be automatically and immediately reported to the development team for investigation. Naturally a failed assumption, being fatal, should take priority over a failed expectation, which is recoverable.
This not only provides greater flexibility than a simple assertion framework, it also provides more explicit self-documenting code.
2013-05-03
Be Maxwell's Demon
Source code tends to follow the second law of thermodynamics, with some small differences. In software, as in thermodynamics, systems tend toward entropy: as you continue to develop an application, the source will increase in complexity. In software, as well as in thermodynamics, connected systems tend toward equilibrium: in development, this is known as the "broken windows" theory, and is generally considered to mean that bad code begets bad code. People often discount the fact that good code also begets good code, but this effect is often hidden by the fact that the overall system, as mentioned earlier, tends toward entropy. That means that the effect of broken windows is magnified, and the effect of good examples is diminished.
In thermodynamics, Maxwell's Demon thought experiment is, in reality, impossible - it is purely a thought experiment. However, in software development, we're in luck: any developer can play the demon, and should, at every available opportunity.
Maxwell's demon stands between two connected systems, defeating the second law of thermodynamics by selectively allowing less-energetic particles through only in one direction, and more-energetic particles through only in the other direction, causing the two systems to tend toward opposite ends of the spectrum, rather than naturally tending toward entropy.
By doing peer reviews, you're doing exactly that; you're reducing the natural entropy in the system and preventing it from reaching its natural equilibrium by only letting the good code through, and keeping the bad code out. Over time, rather than tending toward a system where all code is average, you tend toward a system where all code is at the lowest end of the entropic spectrum.
Refactoring serves a similar, but more active role; rather than simply "only letting the good code through", you're actively seeking out the worse code and bringing it to a level that makes it acceptable to the demon. In effect, you're reducing the overall entropy of the system.
If you combine these two effects, you can achieve clean, efficient, effective source. If your review process only allows code through that is as good or better than the average, and your refactoring process is constantly improving the average, then your final code will, over time, tend toward excellence.
Without a demon, any project will be on a continuous slide toward greater and greater entropy. If you're on a development project, and it doesn't have a demon, it needs one. Why not you?
2013-05-01
Tales of Accidental SEO
I have a post on this blog that's a top-10 result for a pretty generic search term. Yes, the page is relevant, and uses the terms in question pretty frequently. But, honestly, I don't make any effort at SEO on this blog: it's a soapbox, not a marketing engine, and I don't care to invest the time and energy necessary to get myself into the top ranks on the SERPs. But somehow I've done it accidentally, and I think I know how.
By linking my Blogger profile to my Google+ profile, my blog posts become "social media" content in some part of Google's algorithm. Because "social media" and the "live web" are the hip things in search engineering these days, that gets me an arbitrarily huge boost in rank. It's not based on profiling either: I can run the search anonymously and get the same results, and I can have friends that don't use Google+ run the search and get the same results.
Why do I think it's related to Google+ at all? My G+ profile picture is right next to the post (though, oddly, none of the photos from the actual post are in there), and it includes the byline "by Adrian Price - in 70 Google+ circles". That's not part of my blog post; it's not even part of my Blogger profile, aside from the fact that the profile is linked to my Google+ profile.
Social marketing in so many ways is your parents trying to talk to you in the same language you use with your friends. To poach a phrase, it's so unhip it's a wonder their bums don't fall off. Honestly, almost every social marketing effort I've ever seen reeks of desperation, confusion, and so much effort trying to "seem engaged" that they would have saved time in the end actually being engaged.
So, any marketers out there desperately trying to squeeze every drop of ROI they can out of social media, consider this: it looks like, just maybe, you can get quite a lot out of it just by having it at all, even if you aren't using it to push out news, or contests, or desperately promoting your latest video in an attempt to force it to "go viral". Who knows how long the free lunch will last, but you might as well take advantage while you can.
2013-04-24
Real Sprints
Agile methodologies talk about "sprints" - workloads organized into one to four week blocks. You schedule tasks for each sprint, you endeavour to complete all of it by the end of the sprint, then you look back and see how close your expectations (schedule) were to reality (what actually got done).
Wait, wait, back up. When I think of a sprint, I think short and fast. That's what sprinting means. You can't sprint for a month straight; you'll die. That's a marathon, not a sprint.
There are numerous coding competitions out there. Generally, you get around 48 hours, give or take, to build an entire, working, functional game or application. Think about that. You get two days to build a complete piece of software from scratch. Now that's what I call sprinting.
Of course, a 48 hour push is a lot to ask for on a regular basis; sure, your application isn't in a competition, this is the real world, and you need to get real work done on an ongoing basis. You can't expect your developers to camp out in sleeping bags under their desks. But that doesn't mean turning a sprint into a marathon.
The key is instilling urgency, while moderating burnout. This is entirely achievable, and can even make development more fun and engaging for the whole team. Since the term sprint has already been thoroughly corrupted, I'll use the term "dash". Consider this weekly schedule:
- Monday: Demo last week's accomplishments for stakeholders, and plan this week's dash. This is a good day to schedule any unavoidable meetings.
- Tuesday and Wednesday: your 48 hours to get it done and working. These are crunch days, and they will probably be pretty exhausting. These don't need to be 18-hour days, but 10 hours wouldn't be unreasonable. Let people get in the zone and stay there as long as they can.
- Thursday: Refactoring and peer reviews. After a run, athletes don't just take a seat and rest; they slow to a jog, then a walk. They stretch. They cool off slowly. Developers, as mental athletes, should do the same.
- Friday: Testing. QA goes through the application with a fine-toothed comb. The developers are browsing the web, playing games, reading books, propping their feet up, and generally being lazy bums, with one exception: they're available at a moment's notice if a QA has any questions or finds any issues. Friday is a good day for your development book club to meet.
- By the end of the week, your application should be ready again for Monday's demo, and by Tuesday, everyone should be well-rested and ready for the next dash.
Ouch. That's a tough sell. The developers are only going to spend two days a week implementing features? And one basically slacking off? Balderdash! Poppycock!
Think about it, though. Developers aren't factory workers; they can't churn out X lines of code per hour, 40 hours per week. That's not how it works. A really talented developer might achieve 5 or 6 truly productive hours per day, but at that rate, they'll rapidly burn out. 4 hours a day might be sustainable for longer. Now, mind you, in those four hours a day, they'll get more done, better, with fewer defects, than an army of incompetent developers could do in a whole week. But the point stands: you can't run your brain at maximum capacity eight hours straight, five days a week. You just can't - not for long, anyway.
The solution is to plan to push yourself, and to plan to relax, and to keep the cycle going to maximize the effectiveness of those productive hours. It's also crucial not to discount refactoring as not being productive; it sets up the following weeks' work, and reduces the effort required to get the rest of the development done for the rest of the life of the application. It's a critical investment in the future.
Spending a third of your development time on refactoring may seem excessive, and if it were that simple, I'd agree. But if you really push yourself for two days, you can get a lot done - and write a lot of code to be reviewed and refactored. In one day of refactoring, you can learn a lot, get important work done, and still start to cool off from the big dash.
That lazy Friday really lets you relax, improve your craft, and get your product ready for next week, when you get to do it all over again.
Zen Templates Development Journal, Part 2
Having my concept complete (see Part 0) and my simple test case working (see Part 1), I was ready to start on my moderate-complexity test case. This would use more features than the simple test case, and more pages. I really didn't want to have to build a complete site just for the proof of concept, so I decided to use an existing site, and I happened to have one handy: rogue-technologies.com.
The site is currently built in HTML5, using server-side includes for all of the content that remains the same between pages. It seemed like a pretty straightforward process to convert this to my template engine, so I got to work: I started with one page (the home page), and turned it into the master template. I took all of the include directives and replaced them with the content they were including. I replaced all of the variable references with model references using injection or substitution. I ID'd all the elements in the master template that would need to be replaced by child templates. I then made another copy of the homepage, and set it up to derive from the master template.
I didn't want to convert the site to use servlets, since it wasn't really a dynamic site; I just wanted to be able to generate usable HTML from my template files. So I created a new class that would walk a directory, parse the templates, and write the output to files in an output directory. Initially, it set up the model data statically by hand at the start of execution.
All was well, but I needed a way for the child template to add elements to the page, rather than just replacing elements from the parent template. I came up with the idea of appending elements using a data attribute - data-z-append="before:someId" or "after:someId" - to cause an element from the child to be inserted either before or after the element from the parent with the specified ID. This worked perfectly, allowing me to add the Google Webmaster Tools meta tag to the homepage.
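As a rough sketch, the handling for this might look something like the following in Jsoup; the class and method names here are hypothetical, for illustration only.

import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Sketch: move each child element carrying data-z-append="before:someId"
// or "after:someId" into position relative to the identified element in
// the output document.
final class AppendHandler {
    static void handleAppends(Document childDoc, Document outDoc) {
        for (Element el : childDoc.select("[data-z-append]")) {
            String[] directive = el.attr("data-z-append").split(":", 2);
            if (directive.length < 2) continue; // malformed attribute; skip

            Element target = outDoc.getElementById(directive[1]);
            if (target == null) continue;       // no matching ID to anchor to

            if ("before".equals(directive[0])) {
                target.before(el.outerHtml());
            } else {
                target.after(el.outerHtml());
            }
        }
    }
}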
With this done, I set to work converting the remaining pages. Most of the pages were pretty straightforward, being handled just like the homepage; I dumped the SSI directives, added some appropriate IDs and classes, and all was well. However, the software pages presented some challenges. For one thing, they used a different footer than the rest of the site. It was time to put nested derivation to the test.
I created a software page template, which derived from the master template, that appended the additional footer content. I then had the software pages derive from this template, instead of deriving from the master template and - by some stroke of luck - it worked perfectly on the first try. I wasn't out of the woods yet, though.
The software pages also used SSI directives to dynamically insert the file size for downloadable files next to the links to download them. I wasn't going to reimplement this functionality; instead, I was prepared to replace these directives with file size data stored in the model. But I wanted to keep the model data organized, so I needed to support nesting. The software pages also used include directives to include a Google+ widget on the pages; this couldn't be added to the template, as it was embedded in the body content, so it seemed like a perfect case for snippets - which meant I needed to implement snippet support.
Snippet support was pretty easy - find the data attribute, look up the snippet file, parse it as an HTML fragment, and replace the placeholder element with the snippet. Easy to implement, worked pretty well.
I thought nested properties would be a breeze, as I had assumed they were natively supported by StrSubstitutor. Unfortunately they weren't, so I had to write my own StrLookup. I decided that, since I was already doing some complex property lookups for injection, I'd build a unified model lookup class that could act as my StrLookup and could be used elsewhere. I wanted nested scope support as well, for my project list: each project had an entry in the model that consisted of a name, latest version, and so on. I wanted the engine to iterate this list, and for each entry, rather than replacing the entire content of the element with the text value of the model entry, I wanted it to find sub-elements and replace each with an appropriate property of the model entry. This meant I needed nested scoping support.
I implemented this using a scope stack and a recursive lookup. Basically, every time a nested scope was entered (e.g., content injection using an object or map, or iteration over a list), I would push the current scope onto the stack. When the nested scope was exited (i.e., the end of the element that added the scope), I popped the scope off. When iterating a loop, at the start of the iteration, I'd push the current index, and at the end of the iteration, I'd pop it back off.
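A simplified, non-recursive sketch of the idea - the real lookup handled more cases, such as nested property paths, and the names here are illustrative:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Sketch: a scope stack for model lookups. Lookups try the innermost
// scope first and fall back outward, so nested scopes shadow outer ones.
final class ModelScope {
    private final Deque<Object> scopes = new ArrayDeque<>();

    void push(Object scope) { scopes.push(scope); } // entering a nested scope
    void pop() { scopes.pop(); }                    // leaving it

    Object lookup(String key) {
        for (Object scope : scopes) {               // innermost scope first
            Object value = resolve(scope, key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    private Object resolve(Object scope, String key) {
        if (scope instanceof Map) {
            return ((Map<?, ?>) scope).get(key);
        }
        // A list scope resolves numeric keys, e.g. the current loop index.
        if (scope instanceof List && key.matches("\\d+")) {
            List<?> list = (List<?>) scope;
            int index = Integer.parseInt(key);
            return index < list.size() ? list.get(index) : null;
        }
        return null;
    }
}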
This turned out to be very complex to implement, but after some trial and error, I got it working correctly. I then re-tested against my simple test case, having to fix a couple of minor defects introduced there with the new changes. But, at last, both my simple and moderate test cases were working.
I didn't like the static creation of model data - not very flexible at all - so I decided to swap it out with JSON processing. This introduced a couple of minor bugs, but it wasn't all that difficult to get it all working. The main downside was that it added several additional dependencies, and dependency management was getting more difficult. I wasn't too concerned on that front though, since I was already planning for the real product to use Maven for dependency tracking; I was just beginning to wish I had used Maven for the prototype as well. Oh well, a lesson for next time. For now, I was ready for my complex test case - I just had to decide what to use.
2013-04-22
Zen Templates Development Journal, Part 0
Zen Templates is based on an idea I've been tossing around for about six months. It started with a frustration that there was no way to validate a page written in PHP or JSP as valid HTML without executing it to get the output. It seemed like there had to be a way to accomplish that.
I started out looking into what I knew were global attributes, class and id. I did some research and found that the standard allows any character in a class or id; this includes parens and such, meaning a functional syntax could be used in these attributes, which a parser could then process to render the template.
This seemed practically ideal; I could inject content directly into the document, identifying the injection targets using these custom classes. I toyed with the idea of using this exclusively, but saw a couple of serious shortcomings. For one, sometimes you want to insert dynamic data into element attributes, and I didn't see a good way to handle that without allowing a substitution syntax like that of JSP or ASP. I decided this would be a requirement to do any real work with it.
I also saw the problem of templates. Often each page in a dynamic site is called a template, but I'm referring to the global templates that all pages on a site share, so there is only one place to update the global footer, for example. I had no good solution for this. I started thinking about the idea of each page being a template and sharing a global template - much akin to subclasses in object oriented programming, one template derives from another.
I started batting around different possibilities for deriving one template from another, and decided on having a function (in a class attribute) to identify the template being derived from, with hooks in the parent template to indicate which parts the "subtemplate" would be expected/allowed to "override".
I let the idea percolate for a while - a few weeks - as other things in life kept me too busy to work on it. Eventually it occurred to me that all these special functions in class attributes were pretty messy, and a poor abstraction for designers. It could potentially interfere with styling. It would produce ugly code. And I was on a semantic markup kick, and it seemed like a perfect opportunity to do something useful with semantic markup.
So I started rebuilding the concept and the current Zen Templates was born (and the name changed from its original, Tabula Obscura.) As I committed to maximizing the use of semantic markup and keeping template files as valid, usable HTML, I reworked old ideas and everything started falling into place. I remembered that the new HTML5 data attributes are global attributes as well, and would give me an additional option for adding data to markup without interfering with classes or ruining the semantics of the document.
I ironed out all the details of how derivation should work; it made semantic sense that a page that derived from another page could be identified by class; and, taking a page from OOP's book, it made sense that an element in the subpage with the same ID as an element in the parent page would override that element, making any element with an ID automatically a hook; somewhat like a subclass overriding methods in the superclass by defining a method with the same signature.
I sorted out the details of content injection as well, thinking that, semantically, it made sense that an element of a certain class would accept data from the model with an identifier matching the class name. Even better, I didn't need a looping syntax; if you try to inject a list of items into an element, it would simply repeat the element for each item in the list. This simplified a lot of the syntax I've had to use in the past with JSP or Smarty.
I also wrote out how substitution should work, using a syntax derived from JSP. Leaning on JSP allowed me to answer a lot of little questions easily. I would try to avoid the use of functions in the substitution syntax, because it does make the document messier, and forces more programming tasks on the designer. I conceded that some functions would likely be unavoidable.
When I felt like I had most of the details ironed out, a guiding principle in mind, and a set of rules of thumb to help guide me through questions down the road, I was ready for a prototype. Stay tuned for Part 1!
Zen Templates Development Journal, Part 1
Once my concept was well documented (see Part 0), I was ready to start developing my prototype. I had many questions I needed to answer:
- Is the concept feasible, useful, and superior in some meaningful way to existing solutions?
- What kind of performance could I expect?
- How would derivation work in real-world scenarios? What shortcomings are there in the simple system described in my concept?
- Ditto for content injection and substitution.
- How would I handle model data scoping?
- Would it be better to parse the template file into a DOM Document, or to parse it as a stream?
I started with an extremely simple use case: two templates, one deriving from the other; a couple of model data fields, one of them a list; use of basic derivation, injection, and substitution, with no scope concerns. I built the template files and dummy model data such that I could quickly tell what was working and what wasn't ("this text is from the parent template", "this text is from the child template", "this text shouldn't appear in the final output", etc.). I also built a dead-simple servlet that did nothing but build the model, run the template renderer, and dump the output to the HttpServletResponse.
With this most basic use case in place, I started to work on the template renderer. I started with the state, initialization, and entry point. For the state, I knew I needed a reference to the template file, and I needed a Map for the model data. For initialization, I needed to take in a template file, and initialize the empty model. For convenience, I allowed initialization with a relative file path and a ServletContext, to allow referencing template files located under WEB-INF, so that they couldn't be accessed directly (a good practice borrowed from JSP.) I created accessors for adding data to the model.
The entry point was a function simply named "render". It was 5 lines long, each line calling an unimplemented protected method: loadTemplate, handleDerivation, handleInjection, handleSubstitution, and writeOut. These were the five steps needed for my basic use case.
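In outline, the skeleton looked something like this (a simplified sketch, not the verbatim code; the class name and exact signatures are assumptions):

import java.io.IOException;

import javax.servlet.http.HttpServletResponse;

// Each phase is a protected method, so the derivation step can re-run
// individual phases on a parent template's renderer.
public class TemplateRenderer {
    public void render(HttpServletResponse response) throws IOException {
        loadTemplate();        // parse the template file into inDoc
        handleDerivation();    // merge parent template(s) into outDoc
        handleInjection();     // inject model data by class name
        handleSubstitution();  // replace ${...} placeholders
        writeOut(response);    // set the content type, write outDoc's HTML
    }

    protected void loadTemplate() { /* implemented below */ }
    protected void handleDerivation() { /* implemented below */ }
    protected void handleInjection() { /* implemented below */ }
    protected void handleSubstitution() { /* implemented below */ }
    protected void writeOut(HttpServletResponse response) throws IOException { /* implemented below */ }
}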
I then went to work on building out each of the steps. The easiest was loading the template file from disk into a DOM Document using Jsoup (since XML handlers don't deal well with HTML content). At this point I added two Documents to the renderer's state, inDoc and outDoc. inDoc was the Document parsed from the template file, outDoc was the Document in memory being prepared for output. I followed a basic Applications Hungarian Notation, prefixing references to the input document with "in" and references to the output document with "out".
Since I needed to be able to execute derivation recursively, I decided to do it by creating a new renderer, passing it the parent template, and running only the loadTemplate and handleDerivation methods; the parent renderer's outDoc then became the child's starting outDoc. In this way, if the parent derived from another template, the nested derivation would happen automagically. I would then scan the parent document for IDs that matched elements in the child document, and replace them accordingly. Derivation was done.
Next up was injection: I started out by iterating over the keys in my model Map, scanning the document for matching class names. Where I found them, I simply replaced the innerHtml of the found element with the toString() value of the model data; if the model data was an array or collection, I would instead iterate the data, duplicating the matched element for each value, and replacing the cloned element's innerHtml with the list item's toString() value. This was enough functionality for my simple test case.
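That injection pass, as a simplified Jsoup sketch (the real code handled more details, and the class name is hypothetical):

import java.util.Collection;
import java.util.Map;

import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Sketch: scalar model values replace the matched element's inner HTML;
// collections repeat the matched element once per item.
final class Injector {
    static void inject(Document outDoc, Map<String, Object> model) {
        for (Map.Entry<String, Object> entry : model.entrySet()) {
            for (Element el : outDoc.getElementsByClass(entry.getKey())) {
                Object value = entry.getValue();
                if (value instanceof Collection) {
                    for (Object item : (Collection<?>) value) {
                        Element copy = el.clone();
                        copy.html(String.valueOf(item));
                        el.before(copy); // insert clones in list order
                    }
                    el.remove(); // drop the original placeholder element
                } else {
                    el.html(String.valueOf(value));
                }
            }
        }
    }
}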
Reaching the home stretch, I did substitution ridiculously simply, using a very basic regex to find substitution placeholders (${somevariable}) and replacing each with the appropriate value from the model. I knew this solution wouldn't last, but it was enough for this early prototype.
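The regex version amounted to something like this sketch (soon replaced by StrSubstitutor, as noted below):

import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: find ${name} placeholders and replace each with the
// corresponding model value, or an empty string if there is none.
final class Substitutor {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    static String substitute(String html, Map<String, Object> model) {
        Matcher m = PLACEHOLDER.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object value = model.get(m.group(1));
            String replacement = value == null ? "" : value.toString();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}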
Last up was writing the rendered output, and in this case, I allowed passing in an HttpServletResponse to write to. I would set the content type of the response, and dump the HTML of my final Document to the response stream.
I ran it, and somehow, it actually worked. I was shocked, but excited: in the course of a little over an hour, I had a fully working prototype of the most basic functions of my template engine. Not a complete or usable product by any means, but an excellent sign. I made a few tweaks here and there, correcting some minor issues (collection items were being inserted in reverse order, for example), but it was pretty much solid. I also replaced my RegEx-based substitution mechanism with the StrSubstitutor from commons-lang; this was pretty much a direct swap that worked perfectly.
Time for the next test, my moderate complexity test case.
The Development Stream
I was reading today about GitHub's use of chat bots to handle releases and continuous integration, and I think this is absolutely brilliant. In fact, it occurs to me that using a chat bot, or a set of chat bots, can provide an extremely effective workflow for any continuous-deployment project. Of course, it doesn't necessarily have to be a chat room with chat bots; it can be any sort of stream that can be updated in real-time - it could be a Twitter feed, or a web page, or anything. The sort of setup I envision would work something like this:
Everyone on the engineering team - developers, testers, managers, the whole lot - stays signed in to the stream as long as they're "on duty". Every time code is committed to a global branch - that is, a general-use preproduction or production branch - it shows up in the stream. Then the automated integration tests run, and the results are output to the stream. The commit is deployed to the appropriate environment, and the deployment status is output to the stream. Any issues that occur after deployment are output to the stream as well, for immediate investigation; this includes logged errors, crashes, alerts, assertion failures, and so on. Any time a QA opens a defect against a branch, the ticket summary is output to the stream. The stream history (if it's not already compiled from some set of persistent-storage sources) should be logged and archived for a period of time, maybe 7 to 30 days.
It's very important that the stream be as sparse as possible: no full stack traces with error messages, no full commit messages, just enough information to keep developers informed of what they will need to look into further elsewhere. This sort of live, real-time information stream is crucial in the success of any continuous-deployment environment, in order to keep the whole team abreast of any issues that might be introduced into production, along with when and how they were introduced.
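As a purely illustrative sketch of the shape of such a stream - the interface, event types, and message formats are all assumptions, and any chat or feed backend could sit behind post():

// One sparse line per event, fanned out to whatever backs the stream
// (IRC, Campfire, a Twitter feed, a web page...).
interface DevStream {
    void post(String line);
}

final class StreamEvents {
    private final DevStream stream;

    StreamEvents(DevStream stream) { this.stream = stream; }

    void committed(String branch, String sha, String summary) {
        stream.post("[commit] " + branch + " " + sha + " - " + summary);
    }

    void testsFinished(String branch, int passed, int failed) {
        stream.post("[tests] " + branch + ": " + passed + " passed, " + failed + " failed");
    }

    void deployed(String env, String sha, boolean ok) {
        stream.post("[deploy] " + env + " " + sha + (ok ? " OK" : " FAILED"));
    }

    void defectOpened(String branch, String ticket, String summary) {
        stream.post("[defect] " + branch + " " + ticket + " - " + summary);
    }
}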
Now, what I've described is a read-only stream: you can't do anything with it. GitHub's system of using an IRC bot allows them to issue commands to the bot to trigger deployments and the like. That could be part of the stream, or it could be part of another tool; as long as the deployment, and its results, are output to the shared stream for all to see. This is part of having the operational awareness necessary to quickly identify and fix issues, and to maintain maximum uptime.
There are a lot of possible solutions for this sort of thing; Campfire looks particularly promising because of its integration with other tools for aggregating instrumentation data. If you have experience with this sort of setup, please post in the comments, I'd love to hear about it!