Jason Morrison

Make stuff.

RedStorm: Distributed Computation in Ruby

On December 11, 2012 I gave a talk at Boston.rb about writing distributed realtime computations in Ruby using Storm by Nathan Marz and RedStorm by Colin Surprenant.

There is a video of the talk on the Boston.rb website, and the slides are posted online.

Basically, Storm provides a framework for building streaming/realtime computations (like log analysis, for example) and distributed RPC for running large ad hoc computations on a cluster. RedStorm is a JRuby-based adapter for writing these computations and assembling them into topologies (workflows) in Ruby.
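
To give a feel for what a topology is, here is a toy sketch in plain Ruby. It is not the Storm or RedStorm API: the class names (ParseBolt, StatusCountBolt, ToyTopology) and the log format are made up for illustration, and everything runs in one process rather than on a cluster. The idea it shows is just the shape: a source of tuples (a "spout") feeding a chain of processing steps ("bolts").

```ruby
# Toy illustration only. These classes are hypothetical and are NOT part of
# Storm or RedStorm; a real topology is distributed across a cluster.

# A "bolt" that parses one raw log line into a structured tuple.
class ParseBolt
  def execute(line)
    verb, path, status = line.split(" ")
    { verb: verb, path: path, status: status.to_i }
  end
end

# A "bolt" that keeps a running count of requests per HTTP status code.
class StatusCountBolt
  attr_reader :counts

  def initialize
    @counts = Hash.new(0)
  end

  def execute(tuple)
    @counts[tuple[:status]] += 1
    tuple
  end
end

# The "topology" wires a spout (any enumerable source of tuples) to an
# ordered chain of bolts, passing each bolt's output to the next one.
class ToyTopology
  def initialize(spout, bolts)
    @spout = spout
    @bolts = bolts
  end

  def run
    @spout.each do |tuple|
      @bolts.reduce(tuple) { |current, bolt| bolt.execute(current) }
    end
  end
end

LOG_LINES = [
  "GET /index.html 200",
  "GET /missing 404",
  "POST /login 200"
].freeze

counter = StatusCountBolt.new
ToyTopology.new(LOG_LINES, [ParseBolt.new, counter]).run
puts counter.counts.inspect
```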

Here are the recommended resources from my talk:

Getting started

Related software tools

  • storm-contrib provides integration with many third-party tools, such as queues, service buses, and databases.
  • storm-deploy “makes it dead-simple to deploy Storm clusters on AWS.”
  • storm-mesos provides integration with Apache Mesos for cluster resource management.

Documentation

Talks

Two excellent talks by Storm author Nathan Marz:

Book

  • Big Data is an early access book by Nathan Marz which covers “Principles and best practices of scalable realtime data systems”

Other ESP/CEP resources

Storm lives in a space that’s often referred to as ESP (“Event Stream Processing”) or CEP (“Complex Event Processing”):

CLAHub: Easy Contributor Agreements on GitHub

CLAHub is a small side project I cooked up a few months ago, and just got around to open-sourcing. The goal is to remove the friction of Contributor License Agreements for contributors and maintainers alike. It’s not done yet, but I’m curious to hear what people think.

What is it?

The general idea with CLAs is this: contributors grant the maintainer a license to distribute their code, and state that they’re legally able to do so. A fair number of projects have a CLA in place, including jQuery, Node.js, Django, and Chef. In the best cases the CLA is signed via electronic signature, like Node.js does with a Google Form. In the worst cases you have to print, sign, and fax the agreement. In all cases, maintainers are responsible for cross-referencing contributions and signatures to make sure all contributions have a corresponding signature.

With CLAHub and an open source project on GitHub you can:

  • Sign in with GitHub and create a CLA for your project.
  • Ask contributors to sign in with GitHub to electronically sign the CLA.
  • See on each pull request whether the contributors have all signed your CLA. This uses the handy Commit Status API, similar to what CI tools do.
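
The pull request check in the last item boils down to a set comparison. Here is a minimal sketch of that decision, assuming we already have the GitHub logins of the commit authors and of the people who signed. The method name cla_status and the description text are made up for illustration and are not taken from the CLAHub source; "success" and "failure" are two of the states the Commit Status API accepts.

```ruby
require "set"

# Hypothetical helper, not CLAHub's actual code.
# Given the logins that authored a pull request's commits and the logins that
# have signed the CLA, build the state and description for a commit status.
def cla_status(commit_authors, signers)
  missing = commit_authors.uniq.reject { |login| signers.include?(login) }

  if missing.empty?
    { state: "success", description: "All contributors have signed the CLA." }
  else
    { state: "failure", description: "Not yet signed: #{missing.join(', ')}" }
  end
end

signers = Set.new(%w[alice bob])
puts cla_status(%w[alice bob alice], signers)
puts cla_status(%w[alice carol], signers)
```

Posting the resulting hash to GitHub is a separate step that needs an authenticated API client, which is left out here.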

Here’s the app. There’s a little slideshow on the frontpage to see how it works. And here’s the source on GitHub.

Learn more about CLAs

Here’s some more background on CLAs:

Want to choose a CLA? Project Harmony is a web tool that helps you quickly select a CLA.

Feedback

There’s more that needs to be done, but the core of the app works. The next steps are in GitHub issues.

Do you use a CLA for your project(s)? Would this encourage you to add a CLA if you don’t have one already? (That’s not really my goal - just to reduce friction where CLAs are already valuable.) If you have a CLA, would you use something like this to reduce the barrier to entry and your overhead? What kinds of features would be useful?

Papernaut: Exploring Online Discussion of Academic Papers

If you regularly read scholarly papers, you likely use a reference manager to maintain your personal library. Papernaut connects to your library to find online coverage and discussion of your papers in blogs, forums, and mainstream media. My hope is that these discussions can provide broader perspective on research and, in some cases, be the spark that starts a new collaboration.

Here’s a very quick video demo. We start with a Zotero library that includes a paper from Science on the effect of pesticides on honey bees. We then connect to Papernaut, and find several discussions and articles, including one in The Guardian:

I’ve been working on Papernaut in my spare time for a few months, and I’m happy to say that it’s now open source. The project comes in two parts, and the source is on GitHub:

If you are interested in how the application is put together, the rest of this article is a technical overview of the moving parts and how they interact.

Overview: A simple example

Let’s walk through a simplified example. Say I have only one paper in my reference manager – that paper from earlier, about the effect of pesticides on honey bees:

Henry, M., Beguin, M., Requier, F., Rollin, O., Odoux, J., Aupinel, P., Aptel, J., Tchamitchian, S., & Decourtye, A. (2012). A Common Pesticide Decreases Foraging Success and Survival in Honey Bees. Science, 336 (6079), 348-350 DOI:10.1126/science.1215039

Let’s also say that the engine is crawling content from only one source feed, ResearchBlogging.org. Among many other content items, that source feed contains a relevant entry, whose content page is on The Guardian.

We’ll look at how the engine crawls and indexes this source feed. Then, we’ll see how the frontend pulls the paper from my reference manager and asks the engine for relevant discussions.

Papernaut-engine: Loading content and identifying papers

The goal of the engine is to produce a collection of Discussion records, each of which links to several Identifier records, representing journal papers that are referenced from the Discussion. In our example, the Discussion is the article in The Guardian, and the relevant Identifier is DOI:10.1126/science.1215039. There are also intermediate objects, Page and Link which connect Discussions to Identifiers.
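
As a rough picture of how these records hang together, here is a sketch using plain Ruby Structs rather than the real Rails models. The attribute lists are trimmed, the shape of Link (a join between a Discussion and a Page) is my assumption for illustration, and identifier_bodies_for is a hypothetical helper.

```ruby
# Simplified stand-ins for the engine's models. The real ones are Rails
# models; the Link fields below are an assumption for illustration.
Discussion = Struct.new(:id, :url, :title)
Page       = Struct.new(:id, :url)
Link       = Struct.new(:discussion_id, :page_id)
Identifier = Struct.new(:id, :page_id, :body)

# Walk Discussion -> Link -> Page -> Identifier and return identifier bodies.
def identifier_bodies_for(discussion, links, identifiers)
  page_ids = links.select { |link| link.discussion_id == discussion.id }.map(&:page_id)
  identifiers.select { |identifier| page_ids.include?(identifier.page_id) }.map(&:body)
end

guardian = Discussion.new(
  3424,
  "http://www.guardian.co.uk/science/grrlscientist/2012/may/08/1",
  "Bee deaths linked to common pesticides"
)
paper_page  = Page.new(7531, "http://dx.doi.org/10.1126/science.1215039")
links       = [Link.new(guardian.id, paper_page.id)]
identifiers = [Identifier.new(1819, paper_page.id, "DOI:10.1126/science.1215039")]

puts identifier_bodies_for(guardian, links, identifiers).inspect
```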

The engine consists of two main parts: loaders (which are Ruby classes), and the query API (a Rails app). For loading, it also depends on an external running instance of the Zotero translation-server.

Loading content by crawling feeds

The loaders load discussion candidates from feeds and archives, extract outbound links, and store these in the database.
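
The link extraction step can be sketched as follows. This is a simplification that uses a regular expression over the HTML and only the Ruby standard library; a real loader would more likely use a proper HTML parser. The method name outbound_links is hypothetical.

```ruby
require "uri"

# Hypothetical helper: return the absolute http(s) URLs that an HTML fragment
# links to, in document order and without duplicates.
def outbound_links(html, base_url)
  hrefs = html.scan(/<a\s[^>]*href=["']([^"']+)["']/i).flatten

  absolute = hrefs.map do |href|
    begin
      URI.join(base_url, href).to_s # resolves relative links against the page URL
    rescue URI::Error
      nil # skip hrefs that are not valid URIs
    end
  end

  absolute.compact.select { |url| url.start_with?("http://", "https://") }.uniq
end

html = '<p>See <a href="http://dx.doi.org/10.1126/science.1215039">the paper</a> ' \
       'and <a href="/about">about us</a>.</p>'
puts outbound_links(html, "http://example.com/posts/1").inspect
```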

In the first step, I invoke the ResearchBlogging.org loader to crawl and index the most recent 100 pages of their archives:

[engine] rails runner "Loaders::ResearchbloggingWebLoader.new(100).load"

This will load a large number of Discussion entries into the database, with zero or more Page entries for each Discussion, corresponding to outbound links.

At this point, the engine database contains the Discussion:

#<Discussion id: 3424,
             url: "http://www.guardian.co.uk/science/grrlscientist/2012/may/08/1",
             title: " Bee deaths linked to common pesticides | video | G...", ...>

and the linked Page entries:

[#<Page id: 7531, url: "http://dx.doi.org/10.1126/science.1215039", ... >,
 #<Page id: 7532, url: "http://pubget.com/doi/10.1126/science.1215039", ... >,
 #<Page id: 7533, url: "http://dx.doi.org/10.1126/science.1215025", ... >,
 #<Page id: 7534, url: "http://pubget.com/doi/10.1126/science.1215025", ... >]

Identifying papers via the Zotero translation-server

The engine determines which outbound links (or Pages) are academic papers by issuing calls to the Zotero translation-server HTTP API. The translation-server is a third-party project from open-source reference manager Zotero. It examines a given URL and, if that page contains an academic paper, it returns common publication identifiers such as DOI or PMID.

The translation-server wraps the Zotero translators, a set of JavaScript scripts that do the heavy lifting of parsing a webpage and attempting to identify it as one or more academic publications. These translators are maintained by the community, keeping them fairly up-to-date with publishers. The translation-server uses XULRunner to run these scripts in a Gecko environment, and makes them available through a simple HTTP API:

[~] ~/dev/zotero/translation-server/build/run_translation-server.sh &
    zotero(3)(+0000000): HTTP server listening on *:1969

[~] curl -d '{"url":"http://www.sciencemag.org/content/336/6079/348.short","sessionid":"abc123"}' \
     --header "Content-Type: application/json" \
     http://localhost:1969/web | jsonpp

    [
      {
        "itemType": "journalArticle",
        "creators": [
          { "firstName": "M.", "lastName": "Henry", "creatorType": "author" },
          { "firstName": "M.", "lastName": "Beguin", "creatorType": "author" },
          { "firstName": "F.", "lastName": "Requier", "creatorType": "author" },
          { "firstName": "O.", "lastName": "Rollin", "creatorType": "author" },
          { "firstName": "J.-F.", "lastName": "Odoux", "creatorType": "author" },
          { "firstName": "P.", "lastName": "Aupinel", "creatorType": "author" },
          { "firstName": "J.", "lastName": "Aptel", "creatorType": "author" },
          { "firstName": "S.", "lastName": "Tchamitchian", "creatorType": "author" },
          { "firstName": "A.", "lastName": "Decourtye", "creatorType": "author" }
        ],
        "notes": [],
        "tags": [],
        "publicationTitle": "Science",
        "volume": "336",
        "issue": "6079",
        "ISSN": "0036-8075, 1095-9203",
        "date": "2012-03-29",
        "pages": "348-350",
        "DOI": "10.1126/science.1215039",
        "url": "http://www.sciencemag.org/content/336/6079/348.short",
        "title": "A Common Pesticide Decreases Foraging Success and Survival in Honey Bees",
        "libraryCatalog": "CrossRef",
        "accessDate": "CURRENT_TIMESTAMP"
      }
    ]

There are several useful standardized identifiers here - DOI, URL, and ISSN.
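
Pulling those identifiers out of a response like the one above can be sketched as below. The field-to-type mapping and the names IDENTIFIER_FIELDS and identifier_strings are assumptions for illustration; the engine's real code may clean the values further (for example, normalizing URLs).

```ruby
require "json"

# Assumed mapping from translation-server item fields to identifier types.
IDENTIFIER_FIELDS = { "DOI" => "DOI", "url" => "URL", "ISSN" => "ISSN" }.freeze

# Hypothetical helper: turn a translation-server JSON response body into
# TYPE:value identifier strings.
def identifier_strings(response_body)
  JSON.parse(response_body).flat_map do |item|
    IDENTIFIER_FIELDS.flat_map do |field, type|
      raw = item[field].to_s
      # ISSN may hold several comma-separated values, as in the response above.
      values = field == "ISSN" ? raw.split(",") : [raw]
      values.map(&:strip).reject(&:empty?).map { |value| "#{type}:#{value}" }
    end
  end
end

body = <<~JSON
  [{
    "itemType": "journalArticle",
    "ISSN": "0036-8075, 1095-9203",
    "DOI": "10.1126/science.1215039",
    "url": "http://www.sciencemag.org/content/336/6079/348.short"
  }]
JSON

puts identifier_strings(body)
```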

So, continuing with our example from above, I’ll next start the Zotero translation server and identify the pages:

[engine] ~/dev/zotero/translation-server/build/run_translation-server.sh &
         zotero(3)(+0000000): HTTP server listening on *:1969

[engine] rails runner "ParallelIdentifier.new(Page.unidentified).run"

The engine issues calls to the translation-server and records new Identifiers. Now, the Page entries we previously crawled:

[#<Page id: 7531, url: "http://dx.doi.org/10.1126/science.1215039", ... >,
 #<Page id: 7532, url: "http://pubget.com/doi/10.1126/science.1215039", ... >,
 #<Page id: 7533, url: "http://dx.doi.org/10.1126/science.1215025", ... >,
 #<Page id: 7534, url: "http://pubget.com/doi/10.1126/science.1215025", ... >]

have corresponding Identifier records:

[#<Identifier id: 1819, page_id: 7531, body: "DOI:10.1126/science.1215039" ...>,
 #<Identifier id: 1820, page_id: 7531, body: "URL:http://www.sciencemag.org/content/336/6079/348" ...>,
 #<Identifier id: 1821, page_id: 7533, body: "DOI:10.1126/science.1215025" ...>,
 #<Identifier id: 1822, page_id: 7533, body: "URL:http://www.sciencemag.org/content/336/6079/351" ...>]

Two of the four pages were identified (7531 and 7533), and both of those pages received two identifiers apiece. This means that the Guardian Discussion actually referenced two different papers, not just the one we’re interested in.
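
The identification pass above can be pictured as a small worker pool. The sketch below is not the ParallelIdentifier source; it is one way to fan lookups out over threads, assuming a lookup block that stands in for the HTTP call to the translation-server and returns an array of identifier strings. The method name identify_in_parallel and the workers option are hypothetical.

```ruby
# Hypothetical sketch, not the engine's ParallelIdentifier class.
# Runs the given block over every URL using a small pool of worker threads
# and returns a hash of url => identifiers.
def identify_in_parallel(urls, workers: 4, &lookup)
  queue = Queue.new
  urls.each { |url| queue << url }

  results = {}
  lock = Mutex.new

  threads = Array.new(workers) do
    Thread.new do
      loop do
        url = begin
          queue.pop(true) # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          break
        end
        identifiers = lookup.call(url)
        lock.synchronize { results[url] = identifiers }
      end
    end
  end

  threads.each(&:join)
  results
end

# Stand-in for the translation-server: only the dx.doi.org page is "a paper".
pages = [
  "http://dx.doi.org/10.1126/science.1215039",
  "http://pubget.com/doi/10.1126/science.1215039"
]
results = identify_in_parallel(pages, workers: 2) do |url|
  url.include?("dx.doi.org") ? ["DOI:10.1126/science.1215039"] : []
end
puts results.inspect
```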

Now that there is a link between the paper in question and this discussion page, we are ready to visit the frontend.

Papernaut-frontend: importing libraries, finding discussions

The frontend works in two distinct phases: first, it helps you import papers from your reference manager. Second, it shows you discussions for those papers.

You can import your papers via the Zotero API or Mendeley API by giving Papernaut access to your libraries via OAuth. This happens with omniauth-zotero and omniauth-mendeley libraries, followed by the ZoteroClient and MendeleyClient classes.

Alternatively, you can import papers from most reference management software by exporting and uploading a .bibtex file. Papers and their identifiers are then extracted with the BibtexImport class.

Many papers will have multiple identifiers, and the frontend attempts to clean and validate your papers’ identifiers as best it can in an attempt to find the best matches.
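
As one example of that cleaning, here is a sketch of normalizing a DOI as it might appear in an exported library. The pattern is a commonly used heuristic for modern DOIs rather than a complete validator, and normalize_doi and DOI_PATTERN are hypothetical names, not the frontend's actual code.

```ruby
# Heuristic pattern for modern DOIs: "10.", a 4-9 digit registrant code,
# a slash, then a non-empty suffix without whitespace.
DOI_PATTERN = %r{\A10\.\d{4,9}/\S+\z}

# Hypothetical helper: strip common prefixes and return a "DOI:..." identifier
# string, or nil when the input does not look like a DOI.
def normalize_doi(raw)
  candidate = raw.to_s.strip
                 .sub(%r{\Ahttps?://(dx\.)?doi\.org/}i, "")
                 .sub(/\Adoi:\s*/i, "")
  candidate.match?(DOI_PATTERN) ? "DOI:#{candidate}" : nil
end

puts normalize_doi("http://dx.doi.org/10.1126/science.1215039").inspect
puts normalize_doi(" doi:10.1038/nphys2376 ").inspect
puts normalize_doi("not a doi").inspect
```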

Once your papers are loaded into the frontend, it issues requests to the papernaut-engine query API to find discussions that match papers in your library.

The interface between the frontend and the engine consists of Identifier strings, which take a type/value form:

  • DOI:10.1038/nphys2376
  • ISSN:1542-4065
  • PMID:10659856
  • URL:http://nar.oxfordjournals.org/content/40/D1/D742.full
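
Reading strings like the ones above back into a type and a value takes a little care, because URL values contain colons of their own. A minimal sketch, with ParsedIdentifier and parse_identifier as hypothetical names:

```ruby
ParsedIdentifier = Struct.new(:type, :value)

# Hypothetical helper: split a TYPE:value identifier string on its first
# colon only, so that URL values keep their own colons.
def parse_identifier(string)
  type, value = string.to_s.split(":", 2)
  if value.nil? || value.empty?
    raise ArgumentError, "not a TYPE:value identifier: #{string.inspect}"
  end

  ParsedIdentifier.new(type.upcase, value)
end

parsed = parse_identifier("URL:http://nar.oxfordjournals.org/content/40/D1/D742.full")
puts parsed.type
puts parsed.value
```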

So, in our example video above, we authenticate via Zotero and authorize Papernaut’s API access via OAuth. The frontend extracts our library of papers from Zotero and stores their Identifiers locally. It issues requests to the engine’s query API for matching discussions, and displays those to the end user:

Deployment

In production, the Papernaut engine and frontend are deployed to Heroku. The translation-server is deployed to EC2. I spin it up and run the loaders periodically, to reduce hosting overhead.

There is a DEPLOY.md file for both the frontend and the engine that goes into further detail.

Next steps

I’m excited to see what kinds of results people get with Papernaut, but it’s still very early software. I look forward to making a variety of improvements.

I’d really like to add a bulk request API endpoint to the engine, so that the frontend can discover discussions in a single HTTP request, rather than one request per paper. Issuing one request per paper is a big performance hit, and the user experience right now for large libraries is that the frontend just hangs for a while.

On the engine side, I’d like to do a better job of culling false positives in the matching engine, and of contributing to Zotero’s translators to improve the match rate. I think the primary issue there is that the translation-server actually only runs a subset of all the Zotero translators, as some declare that they only work inside a real browser context (see “browserSupport”).

I’d like to get a larger sample set of BibTeX files to try, as there are probably edge cases and assumptions in the importer waiting to be hit.

I’d also like to background some of the tasks in the frontend’s import process; validating DOIs is a big one there. Ideally, the whole library import would be backgrounded, and the user interface would be notified when the import is complete.

Currently, some matches are missed because the engine and frontend have different identifiers for the same paper - say a DOI and a PMID. I also have an experimental branch that cross-references papers with the crossref.org API, which yields more complete information. Ideally that would happen in the engine. I’ve also seen some library management and import tools that use Google Scholar to improve matching and identification.

After that, I’d like loaders to run semi-continuously instead of manually, and to have more robust infrastructure around paper identification.

In the long term, it would be interesting to try and bring the discussion matching experience directly into reference managers. This is one reason why I provide the engine query API separately from the frontend.

Conclusion

I’m most interested in hearing feedback from people. Is this useful to you? If you use a reference manager, give Papernaut a spin and let me know how it goes.

A Year of Travel

On December 4, Lindsay and I returned to the US after a year of traveling abroad. Lindsay diligently blogged our experiences and her photos at cadeparade.com.

We have spent December visiting family and friends. On December 31, we fly to San Francisco to start the next chapter of our lives.

It’s adventure time all over again

Time to sift through apartments and carefully consider our work, to reacquaint ourselves with first world amenities and first world problems. To reunite with family and friends, to fondly shuffle through our notes and photos, and to reflect on our travel experiences and put them into context.

Also, to eat fajitas and burritos en masse, because let me tell you: Mexican and Tex-Mex food outside the Americas just is not the same.

The first half in photos

All photos are by Lindsay Cade, and are from the first half of the year.

During the first six months, December 2011 through May 2012, we traveled in India, Thailand, Laos, Vietnam, Cambodia, and Burma (Myanmar).

We traveled to places beautiful and remote:

Ate incredible foods:

And some not-so-incredible ones:

We enjoyed amazing sunsets:

We ventured across deserts:

into backwaters:

through rivers and valleys:

We marvelled at constructions old and new:

During the second six months, June through November, we traveled in the Czech Republic, Italy, Turkey, Germany, France, England, Thailand (again! we are quite fond of it), South Korea, Malaysia, and Hong Kong. We ended the trip where we began, returning to India for a month.

I cannot recommend this experience highly enough. My sense of perspective and patience have been changed at a fundamental level. At the same time, I’m very much ready for this return to the US, to be with friends and to focus on my career, to do good in this world of which I’ve now seen a tiny slice more.

Hitting the Road!

On November 28, my wife Lindsay and I are flying to India. We have no return tickets, and little plan. I’m leaving a great job; “professional ennui” is the furthest thing from my motivations. What’s going on?!

Adventure Time!

It’s adventure time!

If there’s one common lesson I could distill from my collegiate and professional engagements, it would be the value of diverse experience, and the difficulty of planning to build that experience. Sometimes you just gotta jump in learning’s way.

We’re young, not tied down, and have seen like 0.0001% of the world. So, earlier this year, after getting engaged, we decided: let’s hit the road! Our plans are loose. As of now, we:

  • Have 1-way tickets to Delhi and 5-year visas to India. Many countries in Asia have VOA (visa on arrival) for US citizens.
  • Got our arms jabbed (immunizations).
  • Are brandishing a fat sack of doxy and a veritable menagerie of antibiotics.
  • Booked two days at a hotel to buffer our jetlag.
  • Asked a friend-of-a-friend to find a short-term lease in Delhi.
  • Are super frigging pumped. I mean, come on!

I’ll miss the crap out of my friends here in the US. We’re flying around a bit to visit folks before heading overseas - San Fran tomorrow through Wednesday, then Buffalo, then Houston for Thanksgiving.

Then, on November 28, IAH-ORD-DEL.

Closing thoughts

Journeys are the midwives of thought. Few places are more conducive to internal conversations than a moving plane, ship or train. There is an almost quaint correlation between what is in front of our eyes and the thoughts we are able to have in our heads: large thoughts at times requiring large views, new thoughts new places. Introspective reflections which are liable to stall are helped along by the flow of the landscape. The mind may be reluctant to think properly when thinking is all it is supposed to do.

If we find poetry in the service station and motel, if we are drawn to the airport or train carriage, it is perhaps because, in spite of their architectural compromises and discomforts, in spite of their garish colours and harsh lighting, we implicitly feel that these isolated places offer us a material setting for an alternative to the selfish ease, the habits and confinement of the ordinary, rooted world.

― Alain de Botton, The Art of Travel

Backbone.js on Rails Talk

On Tuesday, September 20, I gave a talk at the New Hampshire Ruby Users Group on Backbone.js on Rails. I’ll be giving a very similar talk on Tuesday, October 11 at boston.rb and a version more targeted to front-end developers on Wednesday, October 26 at the Boston Front End Developers meetup.

I have posted the Backbone.js on Rails slides online, and the slide source is on my GitHub.

As an aside, I’m using landslide for the slides - I love the resulting HTML and interface, though I’ve heard great things about deck.js.

People found the resources sections useful. Many of the links are buried in the presenter notes, so I’ll repeat them here. There are plenty more online, and I’m sure I’m missing some content. Please link to any of your favorites in the comments, and I’ll add them.

Testing

Push synchronization

Get started with Backbone

Further reading: Books on JavaScript

Further reading: Online resources

Notes From the MIT Startup Bootcamp 2011

If you’d like to talk with other people who made it to this event, check out the Hacker News discussion thread.

Yesterday, September 24 2011, I had the pleasure of attending MIT’s 2011 Startup Bootcamp. In its third year, Startup Bootcamp brought an inspiring and thoughtful collection of speakers who have had a variety of startup successes.

The event hashtag #sb2011 is a stream of reactions and pull-quotes from the event - mixed here and there with excited anticipation for a dance festival in Goa.

Ten speakers presented a variety of viewpoints, insight, and food for thought.

It was a mixed bag - yes, there was unnecessary focus on vanity metrics and the rah-rah of startup theater. Breathless celebration of hockeysticking uniques and of flying around to court VCs makes for good TechCrunch articles. Like it or not, that’s an inculcated part of startup culture.

But if you get past the Hollywooding and the Silicon Valley adulation, there were gems of solid advice, grounded in experience, on hiring (Paul English of Kayak), data-driven product development (Naveen Selvadurai of foursquare), optimizing your life for personal growth (Drew Houston of Dropbox), identifying underlying social and technological shifts that enable new products (Charlie Cheever of Quora, Patrick Collison of Stripe), negotiation (Alex Polvi of Cloudkick), the importance of on-the-ground and unscalable product development tactics early on (Nathan Blecharczyk of Airbnb), earning and answering to the responsibility of finding your own way in the world (Anthony Volodkin of Hype Machine) and how important it is to empower yourself in perhaps the largest disruptive theme of our time by learning to code (Patrick Collison of Stripe).

Paul English, CTO and co-founder of Kayak

Recruit a diversity of success.

Paul spoke on three kinds of recruiting: companies recruiting new hires, companies recruiting investors, and job-seekers recruiting companies.

When you’re recruiting, look for success, regardless of the kind. In fact, look for a diversity of success. Paul once hired an Olympic rower and a chess grandmaster, and couldn’t be happier with these decisions. Find people who operate at the top levels of excellence.

Some companies have a “no assholes” rule - at Kayak, they have a policy of “no neutrals”. Like Charlie Cheever, who later discussed the importance of hiring people you have high-bandwidth communication with, Paul encouraged building a team of people who are fully engaged: “intense and in-your-face - in a good way.”

Leah Culver, CEO and co-founder of Convore

Show up, say yes.

Leah told a lighthearted and likeable story, full of serendipity and luck, of her journey from big state school CS major to Silicon Valley startup founder. She shared stories of driving a UHaul from her native Minnesota out to the Bay Area (picked not primarily for its burgeoning tech scene, but for how much better the weather is), getting started with Instructables, and bumping into Pownce co-founders Kevin Rose and Daniel Burka at a party.

Have a good story to tell the press - you don’t have to tell people the ugly, dirty truth.

Another of Leah’s pieces of advice was a common thread through the talks - that of consistent applied effort. “Show up,” she said - in places with a critical mass of startup people, such as Silicon Valley - and “say yes” to opportunities that come your way.

Andrew Sutherland, founder of Quizlet

I didn’t just rush it on my parents that I was leaving MIT. It took two whole weeks.

Andrew shared his story of inspiration for an online learning tool. When he hacked together a prototype to help study for a French III class in high school and subsequently aced the test, he knew he was onto something.

Andrew discouraged market research - “If I had googled for online flash cards, I would have found other sites, that were not as good, and I wouldn’t have made Quizlet. Now, we’re 10x the [volume] of our next competitor.”

This phrasing raised some contention. I would reframe his advice as: focus on your own products rather than on the competition, and don’t be discouraged by incumbent players; rather, recognize them as a validation of the market space, and proceed to out-execute them.

Naveen Selvadurai, co-founder of foursquare

At first, go with your hunch. Later, with data.

Naveen worked for Lucent and Sun in college. This was important - it was real-world learning. Seeing engineering culture, doing code reviews, shipping real products. Sun had an open culture of learning where you can dive into other products. “How’d they build Solaris? File systems?” Just sign up for the mailing list.

Naveen shared seven pieces of distilled advice:

  1. Keep good company.
  2. Make something that people want.
  3. Build around an atomic action.
  4. Seek mentors early.
  5. At first, go with your hunch. Later, with data.
  6. Balance unknowns with knowns.
  7. Always be recruiting.

On the last point Naveen shared the four stages of foursquare’s hiring strategy:

  1. Hire friends
  2. Hire friends of friends
  3. Use an external agency (but they didn’t find this valuable)
  4. Hire an internal fulltime recruiter.

It needs to be someone’s job to think about recruiting, seven days a week. Additionally, as a founder, you must always be recruiting.

Charlie Cheever, founder of Quora

Work with people you have really high-bandwidth communication with. Understand how the other person is thinking.

Charlie shared great advice on early-stage tactics. Start with few users (Quora started with fewer than fifty) and a low-cost MVP. Foster the community by hand, be high-touch and, if your business builds on user-generated content, be prepared at the beginning to build a lot of it by yourself. See how the experiment goes, and then take the learning from that experience and apply it to your MVP.

He shared the importance of collecting metrics early on. With Quora, they actually stored the entire webpage for every visit for every customer, so that they could go back later, having identified trends or formulated hypotheses, and see the site as their users saw it.

They noticed a set of high-engagement users, looked at these users’ experiences, and found that they had all used Facebook Connect. Running with this, the team spent time focusing on improving their social experience.

Charlie also left the audience with good food for thought:

What wave enables your product? Why is now the right time to build it?

For foursquare, it was GPS-enabled mobile phones. For Quora, it was that “normal” people were comfortable sharing things online, and that the web was turning into a mess; with Google turning up more content farm results, people were moving onto safe harbors of organized information like IMDB and Wikipedia. The timing was right.

Drew Houston, co-founder of Dropbox

Get out of your comfort zone. Learn a little about a lot.

“Everything big starts small” - Drew’s original perception of startups was that of Tolkien’s Mount Doom. His original strategy to build a successful startup was to be overwhelmingly prepared - nab an MIT CS degree, get a few years’ experience working for small companies and big companies alike, come back for a PhD, maybe an MBA.

He then related a story from Dropbox’s origins: Drew had just settled into his seat on a Chinatown bus from Boston, in which he could usually get in several hours of undisturbed work. He popped open his laptop, and searched his pockets for his ever-present USB thumb drive. “Shit.” Realization set in just as he visualized, in his mind’s eye, the thumb drive sitting on his desk at home. “Like any good engineer with a problem to solve, I opened my editor.” Drew then wrote the first lines of what would eventually become Dropbox. Today, his company has a multi-billion dollar valuation and “stores more files than Twitter stores tweets.”

Drew exhorted the audience to learn about a broad variety of topics: sales, marketing, finance, accounting, product design, psychology, influence, negotiation, organizational design, management and leadership, business strategy. Buy books (“today we have this amazing thing, Amazon”), dip in, find mentors, and surround yourself with smart people.

Wrapping up, Drew shared his advice for success:

  • Take on more than you’re “ready for.”
  • Maximize how much you learn per unit time.
  • Stack the odds in your favor. Surround yourself with great people; you are the average of your five closest friends.
  • The fastest way to learn about startups is to join one.
  • Starting a company is one of the best ways for engineers to change the world.

Alex Polvi, founder of Cloudkick

No matter what number they offer, pause, count to 10 in your head, and then act as disappointed as possible.

Alex spoke on negotiation, specifically about his experience of his company Cloudkick being acquired by Rackspace.

  • If a VP of Corp Dev says “strategic” to you, they are talking about acquisition.
  • Acquisitions are a bit like romantic relationships: you often get the most attention when you’re looking for it the least. Once you are involved with one party, others can sense it. You somehow become more desirable.
  • Once you have a term sheet from one prospective buyer, you have great leverage. When others call you up, you can very quickly get to hard numbers.

The best negotiation position is one of truth. Build something of value that people want, and your position is irrefutable.

Alex also discussed the importance of taking care of your team, and the people around you. Upon acquisition, he fully accelerated all employees’ options - whether they had been with Cloudkick for four years or four weeks, they were all fully vested and could share in the company’s success. It was important that the acquiring party, Rackspace, was on board with this - and they were. Rackspace wanted the new team members to stick around not because they were waiting to vest, but because they wanted to be there.

Anthony Volodkin, founder of Hype Machine

Venture Capital? You do not need anyone’s permission to make stuff.

Anthony shared the perspective that VC or angel investment can be very important, but it’s not for everyone. “I don’t want to shut something off because the math doesn’t work. For people to not remember it. That would make me sad.”

Anthony’s vision was a question: while people with cool friends can get interesting music recommendations from that network, what about people without cool friends? He knew that there was great taste and insight being shared by music bloggers online, and sought to aggregate and distill it. “I didn’t want to miss anything.”

(If music startups are your thing, Anthony couldn’t recommend highly enough Dalton Caldwell’s talk from Startup School 3 on music startups.)

Find your own way.

He started Hype Machine from his dorm room. He didn’t take investor money. This gave Anthony and his team the freedom to run the company as they pleased.

“We wanted to travel,” he said - so they packed their bags and hung out in Berlin for a month. It was cheaper than they would have thought, “about six thousand dollars,” and incredibly fun. But if they’d had VC money? “No way,” Anthony imagined an advisor’s response, “we thought you were, you know, going to be working sixteen hour days. Now you want to go to Berlin and maybe work?”

YCombinator? TechStars? Just fucking make something.

Anthony exhorted: it’s okay to have a different process. Don’t discount investment and the accompanying advisors, but don’t go blindly down that most celebrated path. With a different process, it’s easier to stand out, to be differentiated. You can always get money if you are making something great.

Nathan Blecharczyk of Airbnb

You have to have a vision, you have to be able to execute that vision.

Nathan shared a 2008 pitch deck for Airbnb (then AirBed&Breakfast) - the first time this deck had ever seen the light of day. Tiffany Kosolcharoen posted photos of the slides on her blog.

He highlighted its strengths - it had a problem statement, and had a bottom-up business projection by analogy to CouchSurfing and Craigslist. He was also quick to point out its weaknesses - it involved hand-wavy notions of unlikely major player partnerships, and touted top-down projections (“If we can capture 2% of the $1.9B travel booking market… imagine!”) that are quick to raise doubt from savvy advisors or investors.

The company was accepted into Y Combinator’s Winter 2009 class. YC companies are supposed to be heads-down; but at Paul Graham’s behest, the cofounders narrowed their market focus to just New York and hopped redeyes back and forth every few weeks. They met with their initial supply-side renters in bars, and chatted about how things were going. As the team refined the product and identified sticking points, they could be on the ground to help optimize listings. They’d go with people into their homes and take high-quality photos. They found that the initial asking rates were a little too high, so they asked their listers (after a few drinks) to lower their prices. Things clicked, and soon they had handled $250,000 in bookings, of which they collected 10%.

Fast-forward to the YC W09 Demo Day, and although at that point Airbnb had already accepted Sequoia investment, they had prepared a Demo Day deck. Gone was the hand-wavy top-down projection and partnership hopefulness, replaced with a quarter million dollars of demonstrable traction, a tight initial market focus, and a clear problem statement.

Like many of the speakers, Nathan stressed the importance of finding quality mentors.

Patrick Collison, co-founder of Stripe

It is impossible to motivate great people by something that is merely going to be profitable.

Patrick’s talk was an excellent finish to the day. He delivered an essay full of engaging stories - I sincerely hope it will be posted online in full.

Patrick’s story was of his trip from hardcore Lisp academic to startup founder. Along the way, he developed one of the first iPhone apps, an offline Wikipedia, before the SDK and App Store, by debugging ARM assembly. He shared the touching experience of getting emails from users whose lives he had changed, from bringing the world’s knowledge to villages in rural Peru and Ghana to delivering the freedom to browse Wikipedia without oversight to people behind the Great Firewall of China. At nineteen, he co-founded and sold an online auction tool, and is currently working on a new payment startup, Stripe.

The anthropological story of the last twenty years is that software is taking over the world. Even if you’re a traveling violinist, you should learn how to program. Do all you can to ensure code is not a foreign language.

Hello, Octopress. Hello, Blog.

On writing

I’ve written sporadically here for several years about programming and language theory, synthetic biology, amateur biology, running user groups and barcamps, multitouch and immersive interactions.

I’ve imported my old posts from WordPress into Octopress. That was – oh wait, I was about to write about that experience before I even began. I was going to say how buttercream-frosting-smooth it was, and that’s probably because I have a lot of confidence in exactly that, mostly due to their well-coiffed htmls. Update! Turns out they’re Jekyll migrations instead. Still easy-peasy.

I’ve written more frequently and recently over on the thoughtbot blog, on development-related topics from little tips to medium-size tips to architecture deep-dives, from product announcements to high-performance bears.

I’ll be traveling extensively over the next year, and will be writing about that, too. But that’s a different post.

On tools

I wrote most of my previous posts in Mephisto, which was kind of janky after a while, and then switched to WordPress, which is totally not Ruby, and more or less means I have to run a VPS and make sure I don’t get chainsawed by spammers. Also, I’m interested in switching to a toolset more near and dear to my heart. Octopress fits the bill.

This also means I can write using vim and git, like a champ.