Friday, February 12, 2010

Authorization And You

I have been noticing a dangerous anti-pattern recently. It comes in the form of not enough authorization checking. When you have a function in your website that should only be available for particular users (administrators, moderators, whatever), you might be tempted to implement the code to restrict access once, in the view. Consider the following view to delete from a list of users (I snuck in the %: scriptlet tag which I've discussed before):


<ul>
<% @users.each do |user| %>
<li>
<%: user.name %>
<% if @me.can_delete_users? %>
<a href="/users/delete?id=<%: user.id %>">delete</a>
<% end %>
</li>
<% end %>
</ul>


This list might be accessible to everyone, but the deletion should be allowed only to privileged users. This could be a good argument to put all administrative features like this in their own actions/controllers, but there's really no need. Just be sure to check authorization again on the controller side:


class UsersController < ApplicationController
before_filter :construct_me # Set up @me from session

def delete
raise "Nice try!" unless @me.can_delete_users?
User.delete params["id"].to_i
redirect_to "/users/list"
end

def list
@users = User.find :all
end
end


Did you see the extra check that the user is allowed to delete users from within the delete action? That is extra critical. It's easy to consider relying on security through obscurity. Who will try that URL to delete the user if they have no idea what the URL is and what parameters are needed? If the URL were less obvious, you may be even more strongly considering just hiding the URL. While the thought to reduce code clutter and duplicated work may be tempting, it leaves a gaping hole to anyone malicious who decides to start playing around with URLs, searching for administrative functions they shouldn't have access to. You may even get disgruntled former administrators/moderators/whatever who knew the URLs to decide to have some nasty fun with those functions they've been since blocked from.

Security is a hard thing to get right, but there's no excuse to succumb to laziness and knowingly leave dangerous holes in your system. The more powerful the function, the more on edge you need to be for possible ways to break your security.

Note that my snippets include an XSRF hole, but that's a problem for another post.

Friday, January 15, 2010

Design for Extension

I'm in the beginning of writing a simple library, primarily written in Java, with the intention of open sourcing it when it is in a reasonably functional state. As I began work on this project, I decided to try out a new way of designing my classes.

Ruby is my favorite programming language, and one of my favorite aspects of the language is that anything goes. Any object is extensible, even its private methods. Not even the built in classes are safe from extension. This is what's called giving you enough rope to hang yourself. Sure, you could hang yourself with this much power... but you could also use that rope to do some pretty cool things.

Nothing irks me more than non-extensible code. Many times I have run across Java code in the wild that either made a method final that I felt the need to override, or didn't expose a critical bit of API that prevented me from extending it in a key way I needed. It is fine if it is code I can change... I can just make something more open and extend as needed. The problem lies in third party libraries that I don't want to maintain private patches for. This, combined with my love of Ruby's openness, has led me to the goal of making all my Java classes as extensible as possible. The final keyword will only be used in dire circumstances (aka likely never), or to fake a closure in an anonymous class. The private keyword will be reserved only for fields and methods of no consequence.

Before you prepare to write a nasty comment about how ignorant and stupid this design will be, let me jump the gun and tell you why I think it is a bad idea. Either the API becomes locked into potentially poor choices, or the API changes underneath the feet of users of my code. The former is a serious problem if I want to have the freedom to evolve my code, so I opt for the latter. I see exposing a bunch of protected methods as a way to let the users of my code extend my classes in ways I didn't anticipate, yet also imply that it is fair game to change as I see fit. My goal would be to keep public methods as stable as possible, but even those I would be willing to shift around in the name of a better design (be it for code clarity, performance, or whatever).

Granted, this exposes implementation details that aren't really necessary for the client code. However, if someone abuses the details enough to cause themselves pain... I feel the pain is necessary to teach them a lesson. After all, we grow by making mistakes.

The benefit is code that can be essentially patched without needing to modify the source directly, and the ability to use the code in whatever way the client author can imagine. I'm always in favor of more freedom. If it were really that bad of an idea, I don't think Ruby (and other dynamic languages) would be as successful and popular as they are.

As I begin to try out this design mentality, I have run across some important discoveries immediately. It is very important to not depend on fields, because if you do, you limit the ways your code can be extended. Consider:


class Person {
private String firstName;
private String lastName;

public String getFirstName() {
return firstName;
}

public String getLastName() {
return lastName;
}

@Override
public String toString() {
return firstName + " " + lastName;
}
}


The above code is fairly extensible, but what if I wanted to change how getFirstName() works? Maybe I want to prepend a prefix, for example. Regardless, I can override getFirstName() and getLastName(), but toString() will not follow suit to the newly defined behavior in the subclass. Instead, I should have written Person as follows:


class Person {
private String firstName;
private String lastName;

public String getFirstName() {
return firstName;
}

public String getLastName() {
return lastName;
}

@Override
public String toString() {
return getFirstName() + " " + getLastName();
}
}


I could have also resolved this by making the fields protected, or exposing setters. Exposing fields beyond private still doesn't typically sit well with me in Java land (perhaps that's foolishly old school, but I'm stuck in some of my ways). Exposing setters seems to just expose complexity that isn't needed. I may not want my Person to have changeable names. Besides, the ability to redefine a getter by appending, prepending, replacing, or mangling the result is quite different from needing to set the value ahead of time.

The other key aspect I have discovered is to be very cautious when designing your constructors. Constructors in Java pose one of the biggest ways to accidentally lock your code into specific details. For example:


public class Person {
public Person(String username) {
// Connect to a database and load the person
...
}
}


Do you see the problem? Yeah... you have made it so that you cannot construct a Person without connecting to a database and loading a person. Even if you subclass this, you are still stuck with that behavior no matter what. This has a couple possible solutions. You could extract the database connection details like so:


public class Person {
public Person(String username) {
loadFromDatabase(username);
}

protected void loadFromDatabase(String username) {
// Connect to a database and load the person
...
}
}


This will allow a subclass to override loadFromDatabase(String) so that it does nothing. This feels a bit hacky to me, but it works perfectly. The other way around this is to expose an alternative constructor (such as one without parameters). This other constructor could have an empty body, and you could even make it protected if you only want to expose it to give more power to subclasses.

So, I will see how this design works for me, but I already like it. Also, don't take my meaning to say that all methods will be exposed as protected or public. For example, a method that has a very well defined behavior might grow rather large and require some breaking apart. There is no need to make all the pieces protected if they really only make sense as a means of implementing the bigger method. There might be no serious harm in exposing them, but I don't believe in absolute rules either.

There is no One True Way to design. The flaws in this plan might prove more serious than I anticipate, but it doesn't negate that there is some value in allowing free extension. Design is also a bit of a personal adventure, so if this doesn't float your boat, look for alternatives that give you the Happy Feeling.

Any thoughts? Am I destined to fail? Is this something you've tried and enjoy?

Wednesday, January 13, 2010

Static State

Beware of static state. Need I say more?

Ok, I guess I shall, regardless of if I need to. If your automated tests all run from the same context (in my case, a JVM instance), static state will bite you... eventually. You may think you are dealing with the state just fine, but it will hurt you eventually.

I have seen many problems crop up with static state so far. Cached data from the database causes data from one test to bleed into another test. Between tests, you will likely have different data. If the data is not based on a fixed set, it is inevitable that some of the data will be slightly tweaked between tests. If that data is cached, the later test will end up with a cached copy of the earlier test's data. This may not crop up right away, and can happen at any time with cached data. A new test that runs before some old tests could cause the old tests to fail. If the order of tests is not predictable in your automated build, you could get a new ordering that sets up a perfect storm to cause some invalid cached data to stick around and haunt another test.

Similarly, you might have some mock versions of static data that makes testing easier. If you don't clean up the mock data, another test may be expecting the default behavior, and again you are failing because of an indirect cause.

Ultimately, static data can be useful and convenient at times, just be aware that your test environment gets weakened with every static cache or storage you use. You can't always simply try to clean up the cached data when you use it. Consider a class, A, that caches some form of data. Your current test depends on class B, which luckily doesn't use the cached data in A, but it does use class C. You don't deal with cleaning up A. Then, down the road, someone changes C to update or view the static state in A. All of a sudden the tests for B are dependent on static state you had no idea about. Or you are modifying that data unknowingly, and affecting all tests that run after yours that depend on the state of A being in a default state.

I've found the most effective approach is a setup or tear down method in your tests that explicitly clears all known cache. This will ensure no test can hide data that leaks into another test, causing an unexpected (and difficult to track) test failure.

Just be on the lookout for the telltale sign of static state causing failures. If a test class fails on the build machine (or in a group), but it doesn't when run individually... you might be seeing static state showing its ugly head.

Monday, January 11, 2010

Meaningful Comments

Comments are an interesting beast. You can comment your code too much, which will clutter the code so much as to decrease readability. You can comment your code too little such that it is difficult to understand the more complicated areas. You can even not comment at all, and hope the code speaks for itself.

Probably due mostly to laziness, I don't comment my code heavily. When I do, I like my comments to have meaning. I try to write JavaDocs or RDocs for every method, even if it is just a simple comment describing the purpose of the function, with expected behavior. More often than not, I inspect unknown code by looking at the actual implementation. However, sometimes I prefer to jump to the online documentation, so I try to do my part in making sure it is fleshed out.

The problem I see with JavaDocs and RDocs is that it takes diligence to keep them up to date. Sometimes you can unintentionally change the behavior of some method by twiddling with a private or protected method, causing an undocumented change that breaks the usefulness of your JavaDocs and RDocs. Even so, there is benefit in being able to quickly see what a method does, and use that as your base assumptions. If tests prove otherwise, the code can hopefully speak for itself and resolve your conflicts.

The trickiest comments, though, are inline comments. When it comes time to document the details of your code, I've found it very difficult to do a good job of thoroughly commenting it. The problem lies in the fact that the comments are useless until it is time to revisit old code. Once you have lost the context of why you wrote some code, it is very difficult to get it back. We have all run into this problem... you find a bug in some month old code, and start to retrace what the code is doing. You are browsing around, and you see some oddities, but can't for the life of you figure out why the code was written in this way. If you wrote it, you might remember you made some critical decision based on some immediate goal, but the goal and decision are now completely lost.

Therein lies the problem. Code needs to be documented to show why you wrote it that way, not what that code does. This comment is useless:


// Add 1 to the count of items.
itemCount++;


If you describe why you did something, you might give your future self a break when it comes time to track down an error, so try a comment more like:


// The item count is off by 1 because the first item is
// implicitly dealt with already.
itemCount++;


Now, this still leaves the problem open of when to write those comments. The best advice I can give is to look for those moments that you had to think a while to figure out what to do, or the code just looks too complicated. The best solution is so obvious it documents itself, but that can sometimes be rather difficult to achieve.

What are your guidelines of when to document code? Do you still run into those "what was I thinking" moments?

Friday, January 8, 2010

Peer Demo

A lot of software practices make sense when described. Source control, automated tests, automated build server, etc. Even the most obvious practices become crystal clear the moment they have a real impact on your development. That automated test that caught the bug as soon as you created it. The automated build that notified you of an old test that is now failing because of a new bug. A massive refactoring that turns out to be fruitless, easily reverted with some easy source control commands. An old diff helping you regain context of why you (or someone else) made a particular change.

Recently I had the revelation with simple peer review. Not even peer review of code, but the simple practice of demoing your finished code to a peer. Before going in to this recent demo, I fully understood the benefits of such practice, but it transformed a neat idea to a must do.

Now, like most developers, I am very obsessive about my code. I do everything I can to make sure it is polished and perfect when it is time to deploy. I try to write as many reasonable unit tests as I can. I constantly exercise my code, both with test driven development and with actually looking at the visible output to make sure it is doing what I expect. With all my obsessing over my code, I tend to finish with a fairly confident feeling that it is ready for production. Don't misunderstand my meaning to think that I feel my code is perfect, or even necessarily good, and most definitely not bug free, but I take pride in my craft and try to call it done when I really think it is done.

There is a big problem though... like every other developer, I am human. I don't anticipate a certain kind of error condition. I forget to check a particular path in the code. I over complicate the solution and miss a way to make the UI a lot simpler and easier to use. Ultimately, I am left with a couple bugs that I just couldn't quite run across.

For one, several, or all of the above reasons, I end up with a demo that makes blatantly clear a few key mistakes that I made. This happened with my most recent demo. Together, the reviewer and I filed quite a few bugs in our issue tracking software that I needed to resolve. Not all were directly caused by me and my code... some were issues that dated back quite a while... but quite a few of them were most definitely because I hadn't exercised the code quite as thoroughly as I had thought.

The key thing I take away from this reminds me of Jeff Atwood's post on not going dark. I'm not afraid to share my code as I develop it. In fact, I enjoy showing off a small snippet or clever bug fix I just wrote that I feel does something neat. I similarly like to see when other developers finish something they feel proud of. However, I still end up writing my feature in relative darkness. The demo is like shining a flashlight in the corners that I didn't get to. By having another person walk through your code, or even just the simple act of showing that it works, you will often end up seeing a different perspective that you never thought of while you were working. Sometimes you run across code that had been manually tested, but changed since the tests and is now broken. Whatever the case, you will inevitably run across something that could be improved upon that you didn't quite catch.

It doesn't even have to be a bug, necessarily. For example, in the demo, we decided that the action we were exposing to the user could benefit from being run multiple times, sometimes. So, we made an easy to access link that allows you to retry the action when you want to, which saved the user from the need to search a table for the same row to run the action again. This was a missing feature in the original spec, not exactly a bug. Nevertheless, the demo made this change an obvious need. Perhaps it is the halfway point between just deploying new code, and dogfooding your code. In the absence of being able to dogfood your code, a simple demo of all new features will expose a good chunk of what you would have been able to find through dogfooding.

So before you deploy that new feature... grab a friend and just demo it to them. Even if you don't find a single issue, at least you will be more confident that your code is ready.

Wednesday, January 6, 2010

Android Critique

I like to think that I can evaluate and judge a product based on its merits, but there are some companies that elicit emotional responses from me. Apple is one. Contrary to common emotional responses, I get repulsed by Apple products, just because it is Apple. I used to be neutral with them... I liked the iPod, but was never very interested in anything else they did. With their handling of the iPhone (and the App Store specifically), I have come to mistrust them, and now I will actively avoid their products when possible.

Google also brings up emotional feelings for me, but quite the opposite. I have a soft spot for Google for a few reasons. They are strong supporters of open source, both by providing a lot of open source projects (Android, Chrome, Chrome OS, some neat Java libraries... to name a few), but also by providing infrastructure (google code). Most of their products are free, which is hard to dislike. Not only are they free, they tend to be pretty good. I have never had problems with their search. GMail has always been so simple and easy to use that I would never think of switching (especially after seeing people struggle with Yahoo Mail, which seems so cluttered and complicated by contrast). Android blew me away as my first smart phone (with the G1).

With the emotional responses, it's a bit hard to judge whether I like a product from its merits, or it is just my fanboyism that is blinding me to the issues. I want to take a moment to critique Android in this light. I still love the platform, but it would be foolish to claim there are no issues. I honestly believe that overall it is a step up from the iPhone, but I have never had extended time with the iPhone to really have an honestly unbiased opinion there.

First, the good things. Given the right hardware, it can be a snappy OS. The G1 can be sluggish, but my new Droid has proven a significant speed boost. Widgets represent a huge advantage over the bland icon-only iPhone interface. I like having the calendar straight there on my desktop, no touching to see what I have coming up. The ability for an application to always display pertinent information is pretty cool.

Multithreading and background processes can be a pain to implement, but it is a neat feature that Android has over the iPhone. I can have applications actively retrieving useful information, or interacting with each other, or switch between processes while long running tasks are executing. It makes my phone experience much more in line with my desktop computer experience.

The updates are constantly making the Android experience better. Before I switched to the Droid, I had navigation on my G1, something that was nowhere in sight when I first got it. Widgets didn't really exist, and I can now do things such as search with voice commands. The Market (equivalent to the Apple App Store) has slowly evolved with the platform. It only supported free apps initially, but now has both paid and free. Screenshots were added recently, along with the ability to view the top free and paid apps separately. I expect the Market experience only to improve with time.

Now to move on to some bad things. I really like the Droid hardware, but one thing I miss from the G1 are a dedicated number row on the physical keyboard. It is really annoying to have to press alt to type a number. I also miss a dedicated call/answer and hangup button. Those 2 missing features of the hardware are offset by the much better screen and the increased processing and storage power. However, I really miss them.

As for Android itself, the biggest issue is probably caused by the choice of Java as the development language. Applications tend to pause every now and then, including the standard apps like the browser. It could be too many background processes running, but it is very reminiscent of garbage collection pauses which are common in Java. I hope this will be improved with ever better hardware along with improvements in the Dalvik VM that Android uses.

My G1 didn't seem to lock up that often, but my Droid has already locked up a few times, and experienced other glitches that you might not be used to happening in a phone. The oddest recently happened where I could hear just fine during a call, but only when the speaker was on. Switch to non-speaker mode and I couldn't hear a word. Rebooting the phone fixed it. I expect software to fail like this now and then, but it is not good to have my phone crash when I could be receiving an important call, or otherwise experience weird glitches while speaking with someone. I expect updates to resolve these kinds of issues, but it is unfortunate to see any of such catastrophic failures. Probably the worst failure I have had was on my G1, when I had lost reception for about a day without realizing it. The phone just stopped receiving phone calls until I rebooted.

There are a lot of features that are very useful, but very hard to discover. For example, long press on the home button will bring up a dialog that will let you run recent applications. This is extremely useful when switching between a couple applications. Similarly, I only recently learned how to cut and paste with keyboard shortcuts instead of clunky menu controls. I had to find those by looking online. Perhaps it isn't fair to blame the platform for hard to discover features, but it might be nice if the platform gave easy to use, short tutorials the first time you do certain key tasks.

The Market still needs a lot of work. What is the one thing Google should get right in its mobile platform? Search! Searching in the Market doesn't work very well. I have run keywords looking for specific applications, only to get back no results, even though the keywords should have worked. For example, "snogg" will not find the "DoggCatcher" application, even though the developers are "SnoggDoggler." There are also very minimal ways to find applications. You can search, you can browse various categories, and you can look at the top paid, top free, or "just in" of a particular category. Unfortunately "top" has no clear meaning. Under the top free list of all games, PapiJump is below WordSearch Unlimited Free. Except, WordSearch has less downloads (50k - 250k vs. PapiJump's over 250k), and a lower rating (4 stars vs. 4.5 stars). It would be nice to sort the lists in various ways, such as by overall vote score, number of downloads, whatever the current "top" algorithm is, alphabetically, etc.

Overall I am very happy with the platform, but some of the issues I have run into are probably non-starters for some people. It has clearly gotten better since I first started with my G1, so I have no doubts that the platform will only improve with time, and eventually beat out Apple as the #1 smart phone.

Monday, January 4, 2010

Love Hate

Is there a language feature you absolutely adore? For me it's closures. Blocks in Ruby make me warm and fuzzy, and anonymous functions in JavaScript make me giddy. Anonymous classes in Java make me... puke. Sorry, Java just doesn't have good closure support.

If you have a language feature you adore, chances are good that there is a language feature you loath. For me, it is the ternary operator. I don't care which language you are talking about, ternary operators cause me to run in the corner and hide, cowering, and waiting till they go away, on their merry business.

Perhaps it is a bit of an irrational loathing. I like dense code, but I want my dense code to be readable. I find nothing wrong with something like:


str = list.map { |x|
x * x if x
}.compact.uniq.sort.join ","


But write something like:


str = value.nil? ? "null" : "'#{value}'"


And I will refactor your code. Whatever the case, a ternary operator makes me stop and think more carefully about what is going on whenever I see it. Code nirvana, to me, means code that is intuitive and self documenting. Method names like compact, uniq and sort all lend themselves to self documentation. The ternary operator, on the other hand, feels cluttered and doesn't lend itself to immediately understanding what the purpose of the code is.

I will go to great lengths to avoid a ternary operator. I prefer the more verbose if control structure. Ruby makes the deal even sweeter since an if is an expression (like everything else in the language). I will happily pay the cost of the added lines of code, though, if it means I can stay clear of any ternary code.

If you feel you must use a ternary, do me a favor and use just one per expression. Not just use one, do nothing else on that line of code, so the ternary avoids external clutter to make it even more non-obvious what is going on in the code.

Are there language features that make you cringe every time a fellow developer uses it, regardless of context?