Jeff Duntemann's Contrapositive Diary

Annotating Reality

We’ve had evening clouds here for well over a week. Maybe ten days. I’ve lost count, but I may well have to kiss off seeing that supernova in M101. That’s a shame, because I’ve downloaded the Google Sky Map app to my new Android phone, and I want to try it out under the stars.

The app knows what time it is and where you are, and if you hold the phone up against the sky, it will show you what stars/planets/constellations lie in that part of the sky. Move the phone, and the star map moves to reflect the phone’s new position. How the phone knows which way it’s pointed is an interesting technical question that I still need to research, but let it pass: The phone basically annotates your view of the sky, and that’s not only useful, it suggests boggling possibilities. I’m guessing there are now apps that will identify a business if you point your phone at it, and possibly display a menu (the food kind) or a list of daily sales and special deals. With a rich enough database, a phone could display short history writeups of historical buildings, identify landforms for hikers, and things like that.

That mechanism is not an original insight with me; Vernor Vinge described almost exactly that (and much more) in his Hugo-winning 2006 novel Rainbows End. Most of my current boggle stems from not expecting so much of it to happen this soon. When I read the book back in 2006 I was thinking 2060. We are well on our way, and may be there by 2040. (Vinge himself said 2025, but me, well, I’m a severe pessimist on such questions. How long have we been waiting thirty years for commercial fusion power?)

In general terms, I call this idea “annotating reality.” In its fully realized form, it would be an app that will tell me in very specific terms (and in as much detail as I request) what I’m looking at. I do a certain amount of this now, but with the limitation that I have to know how to name what I’m looking at, and that’s hit-or-miss. I have an excellent visual vocabulary in certain areas (tools, electronic components, wheeled vehicles, aircraft) and almost none in others (clothes, shoes, sports paraphernalia, exotic animals). I was 25 before I’d ever heard the term “lamé” (metallic-looking cloth) and had no idea what it was when I saw it mentioned in one novel or another. I had indeed seen lamé cloth and lamé women’s shoes, but I didn’t know the word. It’s more than the simple ignorance of youth. As much as Carol and I are involved in the dog show scene, I still see dog breeds here and there that I don’t recognize. (Is that a bergamasco or a Swedish vallhund?) Even my core competence has limits: I received a Snap-On A173 radiator hose tool in Uncle Louie’s estate, and if it hadn’t had Snap-On’s part number on it I doubt that I’d know what it was even today, because I don’t work on cars.

I want something that lives in my shirt pocket and works like Google Images in reverse: Show it the image and it gives you the text description, with links to longer descriptions, reviews, and shopping. This is a nasty computational challenge; much worse, I’m guessing, than query-by-humming. (I’ve been experimenting with Android’s SoundHound app recently. Nice work!) Dual-core smartphones won’t hack it, and we’ll need lots more bandwidth than even our best 4G networks can offer.

But we’re working on it. Facial recognition may be worst-case, so I have hopes that the same algorithms that can discriminate between almost-identical faces can easily tell a tubax from a soprillo. I can’t imagine that identifying the Insane Clown Posse band logo is all that hard–unless, of course, you don’t follow rap. (I don’t.) Bp. Sam’l Bassett did some clever googling and identified Li’l Orby for me, but as with the Insane Clowns logo, the problem isn’t so much drawing distinctions as building the database. Pace Sagan, there are billions and billions of things right down here in the workaday world. Giving them all names may be the ultimate exercise in crowdsourcing. But hey, if we can do Wikipedia in forward, we can do it in reverse. C’mon, let’s get started–it’s gotta be easier than fusion power!

UPDATE: Well, if I read Bruce Sterling more I’m sure I’d have known this, but Google’s already started, with Google Goggles. I downloaded the app to the Droid X2, and surezhell, it knew I was drinking a Coke Zero. The app said clearly that it doesn’t work on animals, but when I snapped QBit it returned photos of three white animals as “similar,” including a poodle, a kitten, and two bunnies. Close enough to warrant a cigar, at least in 2011. More as I play with it. (And thanks to the six or seven people who wrote to tell me!)


  1. Erbo says:

    Your phone’s accelerometer or gyro should be able to determine what direction “down” is, and its compass should be able to tell you what “north” is, so that’s all it’d need to know to determine which way the camera lens is pointing. Yes, you’d need to correct for magnetic north, but that’s a matter of computing the correction factor based on your current location relative to the magnetic north pole…and, of course, thanks to GPS, your phone knows that, too. No doubt this is a common-enough task that the routines for doing so are part of the system libraries.

    Applications that overlay things on the real-world view in the camera are commonly known as “augmented reality” apps. A good example is the “Monocle” feature in Yelp’s iPhone app, where you can point the camera down the street and see review tags for businesses Yelp knows about. Another one I have is called “Augmented Reality USA” by Presselite.

    Google Goggles is also on iPhone, as part of Google’s standard app.

  2. What you’re talking about is what they mean by augmented reality. As for identifying products on the fly, didja know you can point the phone’s camera at the barcodes and have it look them up for you?


  3. […] recognizing things in the physical world and looking them up online, as I wistfully wished for in my September 17, 2011 entry: Google Goggles. I vaguely recall hearing of the product on its first release, which (because it […]
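
Erbo's description in the first comment (accelerometer for "down," compass for "north," then a correction for magnetic declination) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Google Sky Map is actually written: the function name `camera_direction` is made up, the device axes follow the Android convention (x to the right of the screen, y toward the top of the screen, z out of the screen, so the rear camera looks along -z), the accelerometer reading of a phone at rest is assumed to point away from the ground, and the declination is simply passed in rather than looked up from a geomagnetic model the way a real phone would do it.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _normalize(v):
    length = math.sqrt(_dot(v, v))
    if length < 1e-9:
        raise ValueError("vector too small to give a direction")
    return tuple(x / length for x in v)


def camera_direction(accel, mag, declination_deg=0.0,
                     camera_axis=(0.0, 0.0, -1.0)):
    """Return (azimuth, altitude) in degrees for the camera's line of sight.

    accel and mag are 3-tuples in device coordinates (assumed Android-style).
    declination_deg is the local magnetic declination, east positive
    (an input here; a real app would get it from a geomagnetic model).
    """
    up = _normalize(accel)                 # at rest, the reading points skyward
    east = _normalize(_cross(mag, up))     # horizontal, perpendicular to the field
    north = _cross(up, east)               # horizontal magnetic north
    e = _dot(camera_axis, east)
    n = _dot(camera_axis, north)
    u = _dot(camera_axis, up)
    # Compass bearing from magnetic north, corrected to true north.
    azimuth = (math.degrees(math.atan2(e, n)) + declination_deg) % 360.0
    # Angle above the horizon; clamp guards against rounding just past 1.
    altitude = math.degrees(math.asin(max(-1.0, min(1.0, u))))
    return azimuth, altitude


if __name__ == "__main__":
    # Phone held upright with the rear camera toward magnetic north,
    # field dipping downward as it does in the northern hemisphere.
    print(camera_direction((0.0, 9.81, 0.0), (0.0, -40.0, -20.0),
                           declination_deg=8.5))
```

With an azimuth and altitude in hand, plus the time and the GPS position, a star-map app has what it needs to work out which patch of sky is behind the phone. The sketch ignores sensor noise and gives an unreliable azimuth when the camera points almost straight up or down, so a real app would presumably smooth the readings and may blend in gyro data as well.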
