August 30, 2012»

My small startup, Tasty Labs, has recently launched human.io. Human.io provides a simple way to allow a publisher to turn a passive audience into a mobile army of participants. This allows publishers to easily create missions and activities to get people involved more directly than just reading stuff on a screen. If Twitter is HTML, then Human.io is CGI.

Human.io lends itself to small, simple tasks: Vote on an item, take a picture of a storefront, etc. It allows you to script with humans as easily as you would script with software. It also offers easy access to the sensors on the phone: GPS, camera, and so on.

The architecture borrows from Twilio: The developer sends UI widgets to the client, receives HTTP callbacks when their users perform actions, and then responds with more UI widgets. We also support long polling, so you can easily run apps from behind a firewall, on your laptop, or anywhere else without opening a port.
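
To make that loop concrete, here is a rough sketch in Python. It is not the actual human.io API: the `widget` helper, the event fields, and the widget types are all hypothetical and exist only to show the shape of "callback in, widgets out."

```python
import json


def widget(kind, **props):
    """Build a widget description. Hypothetical: real widget types may differ."""
    return {"type": kind, **props}


def handle_callback(event):
    """Map one user action (a callback payload) to the next widgets to show."""
    action = event.get("action")
    if action == "start":
        return [
            widget("text", body="Take a picture of a storefront."),
            widget("camera", name="photo"),
        ]
    if action == "submit" and "photo" in event.get("data", {}):
        return [widget("text", body="Thanks! Mission complete.")]
    # Unknown or incomplete action: ask again rather than fail.
    return [widget("text", body="Sorry, please try again.")]


if __name__ == "__main__":
    print(json.dumps(handle_callback({"action": "start"}), indent=2))
```

Because the handler is just a function from an event to a list of widgets, the same code could sit behind an HTTP endpoint or inside a long-poll loop.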

We also built Photo Scavenger Hunt on the platform as a fun little demo. I'm going to give away an ELPH camera to whoever's winning by tomorrow.

July 29, 2010»

Well, that's one way to get a potential investor's attention. I'm either impressed or creeped out.

June 10, 2010»

A few months ago, I traveled to New York City to participate in Rhizome's Seven on Seven, which paired artists and technologists and challenged them to develop something new. The New York Times wrote about the event.

Seven on Seven: Trailer

Seven on Seven: Monica Narula & Joshua Schachter

The data was gathered using the Mechanical Turk. I didn't really have enough time to talk about the data, so here is the actual dataset.

I've posted the presentation as well.

December 21, 2009»

I haven't really written anything for this blog in a while.

There are a variety of reasons for this, but I'm generally pretty sensitive to my tools, and I haven't been thrilled with either what I am currently using or what I might use in the future. Do I want to use Wordpress on a virtual machine at some hosting provider? Do I want to write something custom on AppEngine? Or one of a dozen dozen other choices? It makes me want to lie down.

It occurs to me that the tools we have available each do a large variety of things, and that there's no good reason for these functions to be bound together into one application. For example, Maciej's recent article on why not to have a public installation of Wordpress (and more details) shows that serving the website and editing it can be very separate pieces. The original ancient Blogger software also used to push a copy up to your site via FTP.

There are a number of separable pieces in the system:

Authoring - the actual role of editing a blog entry is usually just a big text field, but several tools use FCKEditor and other nice javascript-based editors. Google Documents could fill this role too. Current weblog APIs allow this part to be decoupled, for the most part, usually for desktop or mobile clients.

Storage - a simple database would suffice. Not much metadata is required, nor is complicated indexing. Amazon S3 and Google Docs both fit the bill here.

Templating - The system that turns the raw blog posts from the storage engine into the pretty HTML version. There's nothing that really fits this bill in the current systems.

Hosting - there's no need for the system that runs the blog authoring and storage software to serve the raw HTML pages. Amazon S3 would also suffice here, IF it dealt with directory index pages in a useful manner. (Currently, a url ending in a slash cannot map to a document on S3, so far as I know.) RSS/Atom feeds would also be served from the same system.

Feeds - as standards change over time, it would be nice to be able to add the appropriate functionality. Feedburner already does some of this.

Comments - there are several solutions for hosting comments outside the blog applications: Disqus, Intense Debate, JSKit, and so on. I think moderation outsourcing and aggregating comment behavior will be increasingly necessary due to spam issues. Nor do I think that publishers should own comments, but that's a matter for another article.

I wonder if there is a way to define loose interfaces between these systems so that they could both work together but also not set APIs in concrete solid enough to stop innovation. Because the various pieces of the systems currently are all tightly bound together, it is very hard for the parts to move forward separately. For example, I've wanted to be able to specifically reply to comments in place in a visually differentiated way as the publisher, rather than just as another commenter. But this feature hasn't emerged, and if I hacked it into one platform via plugins, I'd be stuck with it forever.
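
One way to picture loose interfaces is a handful of very small contracts that any implementation can satisfy. The sketch below is only an illustration under my own assumptions: the names `Storage`, `Templater`, `Host`, and `build_site` are invented here and do not correspond to any existing weblog API.

```python
from dataclasses import dataclass
from html import escape
from typing import Dict, Iterable, Protocol


@dataclass
class Post:
    slug: str
    title: str
    body: str


class Storage(Protocol):
    def posts(self) -> Iterable[Post]: ...


class Templater(Protocol):
    def render(self, post: Post) -> str: ...


class Host(Protocol):
    def publish(self, path: str, html: str) -> None: ...


class MemoryStorage:
    """Toy storage; a real one might wrap S3 or a document store."""

    def __init__(self, posts):
        self._posts = list(posts)

    def posts(self):
        return iter(self._posts)


class PlainTemplater:
    def render(self, post):
        return f"<h1>{escape(post.title)}</h1>\n<p>{escape(post.body)}</p>"


class DictHost:
    """Stands in for a static host; keys are the paths it would serve."""

    def __init__(self):
        self.files: Dict[str, str] = {}

    def publish(self, path, html):
        self.files[path] = html


def build_site(storage: Storage, templater: Templater, host: Host) -> None:
    # Write an explicit index.html per post, since a URL ending in a slash
    # may not map to a document on a plain static host.
    for post in storage.posts():
        host.publish(f"{post.slug}/index.html", templater.render(post))
```

Each piece only knows the one method it calls on its neighbor, so swapping the templater or the host does not touch the others.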

It would also be nice if these systems could work together without all being client-side embeddable widgets. This usually slows down page loads tremendously.

What else have I left out?

April 03, 2009»

URL shortening services have been around for a number of years. Their original purpose was to prevent cumbersome URLs from getting fragmented by broken email clients that felt the need to wrap everything to an 80 column screen. Addendum: They're useful in print, too. But it's 2009 now, and this problem no longer exists. Instead it's been replaced by the SMS-oriented 140 character constraints of sites like Twitter. (Let's leave aside the fact that any phone that can run a web browser and thus follow links can also run a proper client, and doesn't have to hew to the SMS character limit.) Since TinyURL, there has been a rapid proliferation of shortening services.

Aside from the raw utility of allowing URLs to fit within a Twitter message, newer services add several interesting bits of functionality. The most important of these is that they let the linker turn any link into THEIR link, and view metrics on how far it's spread and how many clicks it's gotten. Showing a user how popular his actions are is inevitably addictive. Shorteners are relatively easy and lightweight to set up. Adding a simple interstitial before the redirect provides an obvious way to monetize. And maybe someday all the link data will be worth something.

So there are clear benefits for both the service (low cost of entry, potentially easy profit) and the linker (the quick rush of popularity). But URL shorteners are bad for the rest of us.

The worst problem is that shortening services add another layer of indirection to an already creaky system. A regular hyperlink implicates a browser, its DNS resolver, the publisher's DNS server, and the publisher's website. With a shortening service, you're adding something that acts like a third DNS resolver, except one that is assembled out of unvetted PHP and MySQL, without the benevolent oversight of luminaries like Dan Kaminsky and St. Postel.

There are three other parties in the ecosystem of a link: the publisher (the site the link points to), the transit (places where that shortened link is used, such as Twitter or Typepad), and the clicker (the person who ultimately follows the shortened links). Each is harmed to some extent by URL shortening.

The transit's main problem with these systems is that a link that used to be transparent is now opaque and requires a lookup operation. From my past experience with Delicious, I know that a huge proportion of shortened links are just a disguise for spam, so examining the expanded URL is a necessary step. The transit has to hit every shortened link to get at the underlying link and hope that it doesn't get throttled. It also has to log and store every redirect it ever sees.
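
A minimal sketch of what that lookup-and-log step might look like, in Python. The `fetch` function is an assumption: in practice it would make an HTTP request and read the redirect target, but here it is passed in so the logic can be shown (and tested) without touching the network.

```python
from typing import Callable, List, Optional, Tuple

# A fetcher takes a URL and returns where it redirects to,
# or None if the URL does not redirect.
Fetcher = Callable[[str], Optional[str]]


def expand(url: str, fetch: Fetcher, log: List[Tuple[str, str]],
           max_hops: int = 5) -> str:
    """Follow redirects to the final URL, recording every hop in `log`."""
    seen = {url}
    for _ in range(max_hops + 1):
        target = fetch(url)
        if target is None:
            return url  # no further redirect: this is the real destination
        log.append((url, target))  # keep every redirect ever seen
        if target in seen:
            raise ValueError(f"redirect loop at {target}")
        seen.add(target)
        url = target
    raise ValueError("too many redirects")


if __name__ == "__main__":
    # Made-up redirect table standing in for real shortening services.
    table = {
        "http://sho.rt/a": "http://sho.rt/b",
        "http://sho.rt/b": "http://example.com/page",
    }
    hops: List[Tuple[str, str]] = []
    print(expand("http://sho.rt/a", table.get, hops), hops)
```

The hop limit and loop check matter because shorteners can point at other shorteners, and the log is what survives if one of them goes away.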

The publisher's problems are milder. It's possible that the redirection step steals search juice; I don't know how search engines handle these kinds of redirects. It certainly makes it harder to track down links to the published site if the publisher ever needs to reach their authors. And the publisher may lose information about the source of its traffic.

But the biggest burden falls on the clicker, the person who follows the links. The extra layer of indirection slows down browsing with additional DNS lookups and server hits. A new and potentially unreliable middleman now sits between the link and its destination. And the long-term archivability of the hyperlink now depends on the health of a third party. The shortener may decide a link is a Terms Of Service violation and delete it. If the shortener accidentally erases a database, forgets to renew its domain, or just disappears, the link will break. If a top-level domain changes its policy on commercial use, the link will break. If the shortener gets hacked, every link becomes a potential phishing attack.

There are usability issues as well. The clicker can't even tell by hovering where a link will take them, which is bad form. Some sites offer link previews, but there's no way to make a preview preference stick globally across the many shortening services. And just like ad networks, link shortening services could track a user's behavior across many domains. That makes the paranoid among us uncomfortable. We hope the shortener never decides to add interstitials or otherwise "monetize" the link with ads, but we have no guarantee.

For these reasons, I feel that shorteners are bad for the ecosystem as a whole. But what can be done to improve the situation?

One important conclusion is that services that provide transit (or at least require a shortening service) should log all redirects, in case the shortening services disappear. If the data is as important as everyone seems to think, they should own it. And websites that generate very long URLs, such as map sites, could provide their own shortening services. Or, better yet, take steps to keep the URLs from growing monstrous in the first place.

You could guarantee that the shortened link is the one that was originally shortened by using a cryptographic hash. But this produces URLs that aren't as short as possible.
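
As a sketch of the hash idea (one possible scheme, assumed here for illustration, not something any particular shortener does): derive the short code from the URL itself, so anyone holding the expanded URL can check that it matches the code.

```python
import hashlib


def short_code(url: str, length: int = 10) -> str:
    """Derive a code from the URL: the first `length` hex digits of its SHA-256."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()[:length]


def matches(code: str, url: str) -> bool:
    """Check that an expanded URL is the one the code was derived from."""
    return short_code(url, len(code)) == code


if __name__ == "__main__":
    code = short_code("http://example.com/a")
    print(code, matches(code, "http://example.com/a"))
```

The tradeoff is visible in the `length` parameter: ten hex digits is only 40 bits, and every digit trimmed to make the URL shorter makes it easier to find a different URL with the same code.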

A variety of greasemonkey scripts resolve shortened URLs and replace them inline.

Finally, shortening services could provide archives of their entire database - but this raises all sorts of privacy concerns that I hesitate to even dig into.

The most likely outcome, of course, is that we don't do anything and that the great linkrot apocalypse causes all of modern culture to disappear in a puff of smoke. Hopefully.

With thanks to Maciej Ceglowski

Updates

  • June 15th, 2009: cli.gs, the "4th most popular" shortener, gets hacked, redirecting a huge number of sites to a new location. 93% of hacked URLs can be restored from backup.
  • August 9th, 2009: tr.im throws in the towel after being unable to figure out how to monetize the site. There are zero interested buyers. The site will redirect links until "at least" the beginning of 2010, but no future is guaranteed.

November 02, 2008»

The growth of both bandwidth and storage means that in the last few years practically everyone from individuals to large universities has begun putting lectures and talks online. While I can easily pick out a dozen or a hundred videos that would be fascinating and educational, I am hamstrung by my short attention span, and I drift off almost immediately. Not to mention the fact that one browser crash or accidental tab closure loses my place and probably the video itself as well.

After tinkering a while, I've managed to figure out a way to cut down the time it takes to watch a video. This works for me, on my Mac; your mileage may vary:

  1. Make sure you have the appropriate codecs installed. I generally use the Perian codec package. I additionally find that some FLVs require QTPro to be installed; it's not very expensive.
  2. Download the video somehow. Some sites, like Google Video, let you download a copy. Others, like YouTube, do not allow this. However, most embedded flash video can be grabbed via the technique in the bottom video in the demo videos at Perian.
  3. Open the video in QuickTime. The video is now happily outside the browser.
  4. Go to Window → Show A/V Controls; change the playback speed in the relevant window. I find that 2.0x generally works pretty well; the video will be faster and the audio is a little clipped but nicely de-chipmunked.
  5. Enjoy your new lecture! The glacial discussion now arrives at a rapid-fire pace. You'll be too busy trying to keep up to play Desktop Tower Defense, and you'll be done in a half hour.

how to watch lectures faster

Continue reading "overclocking the lecture" »

September 12, 2008»

Ever since seeing a presentation by Dolores Labs about Amazon's Mechanical Turk, I've been itching for an excuse to play with the system.

I recently saw a thread that highlights the distinction between expected value and utility. Would you take a more likely but lower payoff instead of a less likely but higher payoff? Similarly, the St. Petersburg Paradox takes the problem to its logical extreme. By constructing a game that has a series of increasingly rare payoffs of increasingly larger size, a game with infinite expected value is created.
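
To see the gap between expected value and utility in numbers, here is a small sketch using one common formulation of the game (an assumption on my part): the payoff is 2^k dollars with probability 2^-k. Cutting the game off after a fixed number of rounds shows the expected value growing without bound while the expected log-utility settles down.

```python
import math
from fractions import Fraction


def truncated_expected_value(rounds: int) -> Fraction:
    """Expected payoff if the game stops after `rounds` rounds.

    Each round k contributes (1 / 2**k) * 2**k = 1, so the total is `rounds`.
    """
    return sum((Fraction(1, 2**k) * 2**k for k in range(1, rounds + 1)),
               Fraction(0))


def truncated_expected_log_utility(rounds: int) -> float:
    """Expected value of log(payoff) over the same truncated game."""
    return sum(math.log(2**k) / 2**k for k in range(1, rounds + 1))


if __name__ == "__main__":
    for n in (10, 20, 40):
        print(n, truncated_expected_value(n), truncated_expected_log_utility(n))
```

The expected value rises by one dollar with every extra round allowed, but the log-utility sum converges to 2 ln 2, which is the utility of a sure payoff of only four dollars. Log utility is just one conventional choice of utility function, not the only one.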

So I constructed 21 versions of the question, varying the dollar amounts as well as the rate of payoff for the second outcome.

Example Question

For one cent apiece, I sent the questions to be answered by one hundred people each, and collated the results. 2100 questions, three hours, and thirty dollars later, I have my results.
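
Collating answers like these is only a few lines of Python. The layout of the results file isn't described here, so the sketch below assumes the answers have already been reduced to (question id, answer) pairs; the ids and answer labels in the example are made up, not taken from the actual batch.

```python
from collections import Counter, defaultdict


def collate(rows):
    """Turn (question_id, answer) pairs into each answer's share per question."""
    counts = defaultdict(Counter)
    for question_id, answer in rows:
        counts[question_id][answer] += 1
    return {
        question_id: {
            answer: n / sum(tally.values()) for answer, n in tally.items()
        }
        for question_id, tally in counts.items()
    }


if __name__ == "__main__":
    # Made-up rows for illustration only.
    sample = [("q1", "sure thing"), ("q1", "gamble"), ("q1", "gamble"),
              ("q2", "sure thing")]
    print(collate(sample))
```

With 21 question versions and 100 respondents each, the input would be the 2100 rows mentioned above.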

Batch_3890_result.csv

Clearly, people (or at least these Turks) begin to cross over at larger values, reaching equilibrium at around $1,000.

While this isn't the most groundbreaking work, it is nice to be able to generate an experiment and gather the results in the course of an evening and then have the results be so pleasing.

The Mechanical Turk is presented as a way to solve problems that are easily explained to people but difficult to implement for computers, frequently described as "artificial artificial intelligence." However, I think some of the most intriguing uses yet will be to explore the edges of our own uniquely human behavior and self-understanding.


by joshua schachter | projects |