May 31, 2009

Maker Faire

Went to the Maker Faire in San Mateo yesterday. Quite a crowd compared to my recollection of previous years - way too packed. I think next year I might go on Sunday to see if that reduces the crowd any. A few choice photos are below.

First, the cardboard surfboard.

The steampunk victorian house:

The shark car:

And the snail car:

A flaming flower:

A child with wings:

And an egg painting robot:

Tennessee Cove Trail

View Tennessee Cove Trail in a larger map

Today Cristin and I took a short hike on a little-known trail up in the Marin Headlands called Tennessee Cove. I had never been to this area of the park, and it was nice to explore somewhere new so close to home. The hike is 1.9 miles one-way, almost perfectly flat with only about 200ft of elevation change, and paved for the first half. A little after the route leaves the paved trail, it splits into two parallel trails for about a mile: a completely flat walking trail through the marsh, and a dirt road with small elevation changes that makes for an easy bike ride. Both end up at a small cove on the ocean just north of the Golden Gate Bridge. You can't see the bridge from the beach, but it's still a nice view. Here is a photo looking down the beach:

There are also several side-trails with lots of elevation climbing over the headlands for the more adventurous.

May 27, 2009

Backpacking the Lost Coast

This Monday was Memorial Day in the US, meaning a 3-day weekend.  My friend, Jeremy Shapiro, organized a backpacking trip on California's Lost Coast trail.  He convinced (tricked?) me into coming along for the trip.

Here is an embedded map of our 3 days showing each day's trail with a different color:

View Lost Coast Trail in a larger map

It's a great trail if you are ever looking for a backpacking trip. It's ~25 miles, completely flat, but still rough terrain: much of the time you are walking directly on beach sand or hopping across boulders. You have to carry a tide table because parts of the trail are only passable at low tide. And lastly, since there are bears, you are legally required to haul around a heavy, bulky bear canister that keeps bears out of your food even if they do find it. However, for your efforts, you get to hike along almost completely undisturbed coast for 3 days. The trail has almost no structures, definitely no roads, and is pretty isolated. At many times I felt like our group was the only one around, even though it was a pretty popular time to hike the trail. The best campsite areas were a bit full, but since it's BLM land you could camp anywhere you pleased and, even rarer for CA, you could have campfires. As for wildlife, I saw several deer (one with a fawn), seals (up close), sea lions, an octopus (washed up), tidepool life, pelicans (hundreds), and other fun stuff. No bears, sadly.

Here is a photo of our small crew. Click the image to be taken to a facebook photo gallery of the trip.

May 9, 2009

Strip unused form fields from form submissions

In my recent post "Why do we even need url shorteners?", I laid out a case for why URLs on the web are actually useful User Interface elements, and wrote a little about how certain bits of history colluded to make needlessly long URLs the norm. One specific case raised was that of HTML forms. Today I want to show you one way to simplify form submissions.

Many forms contain lots of different input elements, with very few of them actually being used at the same time. Most server-side software is equipped to accept the form even if some of the arguments are missing; it will just assume that those arguments take on their default values. For example, the Google advanced search interface has dozens of input elements. Just entering the query [gregable] and submitting the form will generate the following URL:

However, take away all of the default values for the fields, and we get this completely equivalent URL:

Browsers offer no easy way for the creator of an HTML form to craft these more usable URLs, but through JavaScript we certainly can. First, the demo. This demo is a stripped down version of the advanced search interface. I also added a checkbox and two radio buttons to illustrate some additional input types. If you fill out any field and hit submit, that set of fields and only that set of fields gets submitted. If you are not running JavaScript, this trick doesn't work and all the fields get submitted, but the form still operates. Give it a whirl:

How does it work? You can of course look at the source code. I define one short JavaScript function: stripFormDefaults. Then where I declare the form, I add onsubmit="return stripFormDefaults(this)". Nothing else fancy is going on.

The W3C tells us that when submitting a form, you only submit the values from the successful form controls. One of the rules for being successful is that the control must not be disabled. So our stripFormDefaults code just disables all of the form controls that we don't want to submit immediately before submitting. It then re-enables them in case the user hits the back button.

In order to know which form controls to submit, we take advantage of a little known set of form element properties that store the default value or state of each form control. Depending on the type of control element, this property is called defaultValue, defaultSelected, or defaultChecked. We simply compare the actual value of each form control with the default value and if they are the same, we disable that control before submitting.
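Putting the pieces together, the idea can be sketched in code. This is my reconstruction of the approach described above, not the author's exact script: the name stripFormDefaults comes from the post, while isDefaultControl is a helper I introduced for clarity.

```javascript
// Helper (my name, not from the post): true if a control still holds
// its default value/state, comparing against the default* properties
// described above.
function isDefaultControl(control) {
  switch (control.type) {
    case 'checkbox':
    case 'radio':
      return control.checked === control.defaultChecked;
    case 'select-one':
    case 'select-multiple':
      // A select is at its default when every option's selected state
      // matches its defaultSelected state.
      for (var i = 0; i < control.options.length; i++) {
        if (control.options[i].selected !== control.options[i].defaultSelected) {
          return false;
        }
      }
      return true;
    default: // text, textarea, hidden, etc.
      return control.value === control.defaultValue;
  }
}

// Attach as: <form onsubmit="return stripFormDefaults(this)">
function stripFormDefaults(form) {
  var disabled = [];
  for (var i = 0; i < form.elements.length; i++) {
    var el = form.elements[i];
    if (!el.disabled && isDefaultControl(el)) {
      el.disabled = true;  // disabled controls are not "successful",
      disabled.push(el);   // so the browser leaves them out of the URL
    }
  }
  // Re-enable the controls once submission has started, in case the
  // user returns via the back button.
  setTimeout(function () {
    for (var j = 0; j < disabled.length; j++) {
      disabled[j].disabled = false;
    }
  }, 0);
  return true; // allow the submission to proceed
}
```

The exact timing of re-enabling may differ in the real demo; the point is simply that disabling happens before submission and is undone afterward.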

May 6, 2009

Why do we even need URL Shorteners?

My first thought was to title this post "Why are URLs long?", but I realized that the reason I'm writing it is the recent issues being raised around URL Shorteners (aka TinyURL).  While this post is over a month late to the party, the context seems relevant.

So, why do we even need URL Shorteners?  The answer is simple: because URLs are too long.  This may be an issue made more obvious with twitter, cell phones, or any kind of manual text-entry, but it isn't only related to this.  Essentially, most interesting content on the web has a URL that is too long to remember/type in/share.  This can be a problem if you are:
  1. Sending an email to someone who uses a crappy email client that wraps (breaks) lines over some character limit.
  2. Hanging posters in your dorm with a URL to get more information.
  3. Giving a talk at a conference and want the audience to write down/remember some URL later.
  4. Having a verbal conversation with a friend: "I'll send you a link later" is a symptom of this issue.
Worse than just long, most URLs are a crappy User Interface.

Root Causes:
When "moving pictures" (video) first became possible to a large audience, we largely just recorded plays - what we were used to pre-video.  Only with time did we learn that the new medium afforded interesting new possibilities: camera angles, shifting scenes, overlaid audio, special effects, etc.

The web evolved similarly.  In the original web, most web servers were designed to be a way to access a collection of files on a server somewhere.  We were familiar with file systems, and the pre-web internet was a lot of FTP and BBS servers.  Our URLs naturally then mirrored file systems.  There was certainly nothing that I know of in the HTTP spec that said they had to.  This got us into some trouble:

With the file system as a metaphor, URLs got extensions (.html, .php, .asp).  Even though the HTTP spec defined a way to communicate the content type outside the URL structure, we were familiar with the extension as a UI element.  However, the vast majority of the URLs we interacted with were all one content-type: HTML.  Sure, HTML embedded .gif and .js, but users didn't directly interact with those URLs often; they were hidden.  What type of software generated the page (.php, .asp, .jsp) wasn't remotely interesting.  For the vast majority of URLs we were viewing, the information presented in the extension was redundantly obvious or plain irrelevant.  Even this post will have a URL that ends with .html, 5 characters of needless redundancy!

With the file system as a metaphor, URLs became organized hierarchically into directories.  We grouped them by topic, date or whatever with well-defined levels of hierarchy.  Each file in one folder.  Most early http servers would even automatically generate and serve an "index" page which listed all the files in a particular directory. What was a weak metaphor for a hard drive file system became worse on the internet.  Hyperlinks made certain of that.  Instead of there being only one path to navigate through a series of directories to a document on the internet, links made sure there were plenty of paths to navigate.  Our URLs looked like a tree, but on closer inspection, we had really built a web.

Take this post for example.  Its path looks something like:

However, I sincerely doubt that you navigated to this post by first looking for documents that I created in 2009, followed by those I created in May (month 05).  You came through either a hyperlink or a feed reader.  The directory structure here is showing information that isn't usually that interesting to a user actually interacting with a URL.  How often are book titles based on Dewey Decimal categories?

Search Engines
The file system metaphor can't explain all our woes.  After all, who in their right mind would ever name a file something as long as why-do-we-even-need-url-shorteners.html?  And originally, the web wasn't named this way.  Had I chosen it, this page might have been named url-shorteners.html or long-URLs-rant.html.  But then search engines came along.  And before long it became known that one of their ranking signals was the words contained in the URL.  Users didn't type in URLs anyway, right?  They just clicked on them, so it quickly became more important to craft URLs for Search Engine Marketing than for usability: more keywords are always better.

But you can't blame search engines.  People frequently named their pages with descriptive URLs.  Using this as a signal made lots of sense.  And once webmasters noticed it and reacted to it, the custom was only further reinforced.  As a result we have why-do-we-even-need-url-shorteners.html (39 characters) instead of url-shorteners.html (19 characters).

The HTML spec isn't completely blameless either.  Since our metaphor was a file system, we never really expected significant amounts of dynamic content.  When HTML forms were designed, we imagined things like a way to leave a comment for a webmaster, or a way to upload a file.  After all, what other interactions had we really done in the days of FTP or BBS systems?

As content on the internet became more dynamic, forms started to be used more frequently for navigation: search boxes, preference settings pages, JavaScript drop-down elements.  All of these created URLs that were strictly defined by how the HTML spec required GET method forms to behave.  For example, when submitting a form, even if only one of the fields is filled in, all of the fields become part of the URL: ?q=foo+bar&page=&sort=&width=.  Repeated values create repeated keys as well: ?opt=red&opt=blue&opt=green.  What a waste.
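To make the waste concrete, here is a small illustrative sketch of my own (not from the HTML spec or any browser): a naive serializer that keeps every field, versus one that drops fields still at their empty default, producing a shorter but equivalent query string.

```javascript
// Build a query string from a flat map of field names to values.
// With stripEmpty set, fields still holding an empty default value are
// omitted - mimicking what a browser could do for GET forms but doesn't.
function buildQuery(fields, stripEmpty) {
  var parts = [];
  for (var key in fields) {
    if (!Object.prototype.hasOwnProperty.call(fields, key)) continue;
    if (stripEmpty && fields[key] === '') continue;
    parts.push(encodeURIComponent(key) + '=' + encodeURIComponent(fields[key]));
  }
  return '?' + parts.join('&');
}

// buildQuery({q: 'foo bar', page: '', sort: '', width: ''}, false)
//   -> "?q=foo%20bar&page=&sort=&width="
// buildQuery({q: 'foo bar', page: '', sort: '', width: ''}, true)
//   -> "?q=foo%20bar"
```

One small difference from real form submissions: browsers encode spaces as + in form data, while encodeURIComponent uses %20; both decode to the same value.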


Historically, each hostname (subdomain) generally referred to a different machine.  Most machines exposed to the internet were not running HTTP servers.  As a result, most uses of hostnames were for things other than a web browser.  Since the default was not HTTP, we needed a way to refer to the machine running the HTTP server.  A custom arose: the HTTP server would run on the machine named www.  It was short, easy to type, memorable, and unique.  These days, with hardware load balancers, HTTP hostnames rarely refer to individual machines directly.  Instead a single hostname can refer to hundreds of separate machines.  However, www has stuck around because people have come to expect it.  The mere presence of a www prefix calls up the concept of a web page in most minds.  As you'll notice, this blog doesn't have a www and neither do URL shorteners - 4 unneeded characters that will be with most URLs for a long time.

Change you can believe in:
Fortunately, this is not a chicken and egg problem.  If you run a website or a CMS, you could write better URLs today without waiting for your customers to do something first.  Not all chickens have that much control, but many do.  And many websites are already paying attention.  Take a close look at how Twitter carefully crafts its URLs to be user interface elements in themselves.

A few of my suggested rules of thumb, but first an important disclaimer: I do work for a search engine company, but the opinions expressed on my blog are my own and not necessarily those of my employer.  These recommendations may not be valid in the context of search engine optimization.  They are simply my opinions about how URLs could be effectively used as a User Interface element.  With that out of the way, here we go:

  1. Drop the www.  But if your users type it, make sure you still get them to the right place.
  2. Drop the extensions (.html, .php) for HTML pages - they are the default.  Keep them for non-HTML documents (PDF, images, text) because they are useful hints to a user about what to expect.
  3. Don't let HTML forms dictate your URL structure.  They are a necessary evil for actual user-input, but they create awful URL UI experiences.
  4. Use directory structures for things users care about, not uninteresting categorization.  Each level you add makes the URL longer and potentially harder to remember/reuse.
  5. URLs should be descriptive.  Long numbers are often really bad; a few words are really good.
Finally, think about the shortest URL for a given page that would still be specific and convey a lot of information about what a user might expect to find there.

For example, this URL could easily have been as long as:  (80 chars)

Or it could potentially have been as short and descriptive as: (34 chars)
34 chars isn't bad.  Even a tinyurl would look like (25 chars).  And consider how much more information is conveyed in the short and descriptive URL for a cost of 9 measly characters.