BLOG MOVED: I'm now blogging from here. Update your bookmarks! If you're following on RSS, you can keep your old subscription -- I'll have FeedBurner forward properly.
MY FAVORITE ANDROID APPS (SO FAR): Since December, Google employees have been carrying T-Mobile G1 devices with the Android operating system installed. Here are some of my favorite applications so far:
- G-Backup: Backs up your call log, text messages and pictures to your Gmail account. $4 one-time charge to download.
- PhonePlus Callback: Typically when I don't answer a call, I like to reassure the caller that I'm planning to get back to them. The PhonePlus app adds a button to your incoming call screen that sends the person a text message: "Will call you back in a bit!," or a message of your choice. $1 for a high quality version (one-time payment).
- Loopt: The original social location sharing service. Of course, Google has its own social location sharing service, Latitude, which also works on Android.
- My Tracks: An application you can launch to record your GPS tracks every few seconds. This is useful for sharing where you hiked, biked, drove or whatever on your vacation or day trip. Here's a recent motorcycle trip I took around San Francisco. After you record your journey, you can easily upload the data to Google Maps directly from your phone and send it to your friends.
- Zombie, Run!: This game, by my friend and colleague Peter Dolan, locates you on a map of your town using your phone's GPS (or cell phone triangulation). It then populates the map with "zombies" -- monsters trying to eat you, whose locations appear on the map near you. Your job: Escape the zombies moving towards your location on the map. If you don't run fast enough, you get eaten and you lose!
- Google Reader Mobile: Note that this isn't an installable app, just a nice mobile friendly version for Android's browser.
- ChompSMS: An SMS client that looks like the iPhone's. As if the default doesn't already.
- Trap: Jezzball for Android. For those dull moments in the subway where you don't have a connection for your feed reader.
Granted, some of these should be part of the OS. Maybe they will be in future iterations. But one of the great things about an open handset is that developers can plug the gaps while the OS evolves.
OOPS!: A seriously unfortunate house listing. Check out the Street View for this house by Google Maps.
PREDICTIVE MODEL MARKUP LANGUAGE: Here's a cool product: The Predictive Model Markup Language (PMML). PMML is an open standard for sharing predictive models.
This allows different statistical packages and data mining systems to talk coherently to each other about a particular model. PMML supports a variety of algorithms -- so whether your model is based on regression, trees or whatever, it can be expressed in PMML.
Here is a package for R and here is a tutorial and background. If you're looking for support, check out Zementis.
FREE ZIP CODE TO CONGRESSIONAL DISTRICT DATA: If you're doing quantitative research on Congressional districts, it's often helpful to get a mapping between zip code and Congressional district. Shouldn't this kind of data be free and easy to find?
It wasn't, at least in my experience. There are a number of commercial providers of this data who seem to be charging way too much.
Fortunately, a few cool new organizations have popped up that can help. Sunlight Labs, a technology branch of the Sunlight Foundation, has an API that will allow you to extract this data. Similarly, Watchdog.net makes this data accessible via an API as well as text files (check out this 1.5GB mapping of zip+4 values to Congressional districts or this smaller file containing zip codes and districts only, without the +4).
In case these files move, the links came from this Watchdog.net support page. While you're at it, why not donate to the Sunlight Foundation?
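For anyone working with the zip-to-district text files mentioned above, here is a minimal Python sketch of loading such a mapping. The file layout is an assumption on my part (one whitespace-separated `zip district` pair per line), and the sample rows and the function name `load_zip_to_districts` are made-up placeholders, not real data or anything from Watchdog.net. Check the actual file format before relying on it.

```python
from collections import defaultdict


def load_zip_to_districts(lines):
    """Build a zip -> set-of-districts mapping from text lines.

    ASSUMPTION: each line looks like "<zip> <district>", separated by
    whitespace. The real files may use a different layout.
    """
    mapping = defaultdict(set)
    for line in lines:
        parts = line.split()
        # Skip blank, malformed, or comment lines.
        if len(parts) < 2 or parts[0].startswith("#"):
            continue
        zip_code, district = parts[0], parts[1]
        # A zip code can straddle more than one district, so keep a set.
        mapping[zip_code].add(district)
    return dict(mapping)


# Placeholder rows for illustration only (not real zip codes or districts).
sample_lines = [
    "00000 XX-01",
    "00000 XX-02",
    "11111 YY-03",
]

zip_to_districts = load_zip_to_districts(sample_lines)
print(sorted(zip_to_districts["00000"]))
```

Zip codes are kept as strings so that leading zeros survive, and each zip maps to a set because a single zip code can span more than one district.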
RECENTLY I WAS ON A PANEL at the Predictive Analytics World '09 conference in San Francisco about the R Programming Language and the R Project for Statistical Computing.
The topic of the panel was "How Google and Facebook are using R" -- and I was joined by Itamar Rosenn of the Facebook Data team (for some info about what their team does, see this slightly old but still informative talk).
Jim Porzak gave some remarks about the basic features and history of R, and David Smith gave a presentation about Revolution Computing -- a company aiming to do for R what Red Hat did for Linux. Red Hat seems to be the inspiration for a lot of open-source startups these days.
Google uses R a lot -- as far as I can tell, more than any other stats package. We have a fairly active discussion forum on R topics consisting of over 200 people. Since unqualified praise is boring, I offered some critiques of the language on the panel -- I was thinking that members of the audience might be R contributors who might be considering future projects. These shouldn't obscure the fact that the R Foundation has done a brilliant job of creating a highly flexible and valuable piece of open source technology.
Michael Driscoll moderated the panel and had a writeup here. If I get some time this week, I'll write up my full remarks.
HAL VARIAN in McKinsey Quarterly.
YOU KNOW IT'S A RECESSION WHEN: More people are searching Google for coupons than Britney Spears.
The switch came in early 2008. Does this have something to do with Britney's declining popularity? I don't think so. Her query share numbers have been around the same since 2006.
Hat tip: Hal Varian.
NEW PREDICTION MARKET PAPER by Neil Malhotra and Eric Snowberg here. See review from the Social Science Statistics Blog here:
The most clever insight in the paper is that you can combine data from different prediction markets to estimate an interesting conditional probability -- the probability that a primary candidate will win the general election conditional on winning the nomination. (If p(G) is the probability of winning the general election and p(N) is the probability of winning the nomination (both of which are evident in prediction market contract prices), p(G|N) -- the probability of winning the general election if nominated -- can be calculated as p(G)/p(N).) In the first part of the paper, the authors focus on how individual primaries in the 2008 election affected this conditional probability for each candidate. This is interesting because classic theories in political science posit that primary elections force candidates to take positions that satisfy their partisans but hurt their general election prospects by making it harder for them to appeal to the electoral middle. If that is the case, then ceteris paribus one would expect that the conditional election probabilities would have gone down for Obama and Clinton each time it looked like the primary season would become more drawn out -- which is what happened as results of several of the primaries rolled in.
As it turns out, p(G|N) didn't move much in most primaries; if anything, it went up when the primary season seemed likely to extend longer (e.g. for Obama in New Hampshire). Perhaps this was because of the much talked about positive countervailing factors -- i.e. the extended primary season actually sharpened each candidate's electoral machines and increased their free media exposure. Of course, Malhotra and Snowberg have no way of knowing whether the binding effect of primaries exists and was almost perfectly counterbalanced by these positive factors, or whether none of these factors really mattered very much. [...]
The second part of the paper explicitly considers the problem of assessing how "surprised" the prediction markets were in particular primaries (without explaining why this was not an issue in the first part), and employs a pretty ad hoc means of upweighting effect estimates for the relatively unsurprising contests. Some kind of correction makes sense but it seemed to me that the correction was so important in producing their results that it should be explained more fully in further revisions of the paper.
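To make the p(G|N) calculation in the review concrete, here is a small Python sketch. The identity p(G|N) = p(G)/p(N) relies on the fact that a candidate can only win the general election after winning the nomination, so p(G and N) = p(G). The contract prices below are hypothetical numbers chosen for illustration, not figures from the paper, and the function name is my own.

```python
def conditional_win_probability(p_general: float, p_nomination: float) -> float:
    """Return p(G|N) = p(G) / p(N).

    Assumes winning the general election implies winning the nomination,
    so p(G and N) = p(G), and that contract prices can be read as
    probabilities.
    """
    if not 0.0 < p_nomination <= 1.0:
        raise ValueError("p_nomination must be in (0, 1]")
    if not 0.0 <= p_general <= p_nomination:
        raise ValueError("p_general must be in [0, p_nomination]")
    return p_general / p_nomination


# Hypothetical prices: the general election contract trades at 0.30 and
# the nomination contract at 0.60 (illustrative values only).
p_g = 0.30
p_n = 0.60
print(conditional_win_probability(p_g, p_n))
```

With these made-up prices, the market would be implying that the candidate wins the general election about half the time if nominated. Tracking how that ratio moves around each primary is the exercise the review describes.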
CHECK OUT THIS NEW PAPER by Hal Varian discussing some of the microeconomic features of online ad auctions. It's a follow-up to this previous 2006 paper on position auctions and also cites David Pennock.
ONE OF THE MORE MYSTERIOUS LINES I'VE EVER READ IN AN ACADEMIC PAPER: Here: "We have no reason to thank the University of Chicago Press." Okay!
YELPDROID.COM: One of the cooler, lesser known features of Google Gears lets you tell a website your location (without having to enter it manually). This enables website publishers to give you location targeted content. For example: Imagine navigating Yelp without having to enter your address.
I'm not sure why Yelp hasn't implemented this, but fortunately my colleague Rob On moved ahead without them. Check out Yelpdroid.com from a machine that has Google Gears installed (this includes all Android phones).
It's simple: Just type what you're looking for, and Yelpdroid will send your query to Yelp with your location appended. Note: You'll need to give Yelpdroid permission to view your location so it can send it to Yelp. Yelpdroid and Yelp don't otherwise read or store your Gears-provided location.
How does it do this? If you're on a mobile, it uses GPS or cell tower triangulation. If you're on a device without GPS or cell towers (such as a desktop or laptop), it uses your IP address. This works astonishingly well at deriving your approximate location.
You also need to authorize each website to view your location data (otherwise all kinds of companies would know potentially private information about where you are). See more about this feature here.
HERE'S A BETTER COPY OF GOOGLE AND YAHOO'S LETTER TO THE CFTC ABOUT PREDICTION MARKETS.
This is identical to the one on the CFTC's website (linked from this page of responses to the CFTC's request for comment on prediction markets) -- except that it can be more easily copied and pasted.
Disclaimer: Opinions expressed on this site are the author's and not necessarily his employer's.