This is part 2 of our post of the same title. In part 1 we described the problem and what makes it interesting. Here we share our story – our iterations and what we have learned since early 2009.
Our approach differs from that of Microsoft Research and Google; as a tiny product startup, our bets differed from those of a research lab. In principle, all three approaches rely on the user community to contribute improvements to machine translation, and all three store, retrieve, and manage the contributed translations in the cloud. The metric for success is also the same across all three teams – improve the rate of translated sentences contributed per language.
Here’s our story:
The first few ideas we hashed out were inspired by popular Web 2.0 models that relied on the power of crowd contributions in different problem spaces. We tinkered with a Digg-like idea for translation and a Mahalo-like idea for translated pages for popular keywords. However, we found that voting something up or down had little correlation with its ability to gather translated sentences. A single contributor with the right motivation gives a better push toward completing a translation. The other, search-based (SEO) approach was painstakingly slow in gaining readership, and it took a lot of manual effort to find queries that had good search volume yet were not answered well in the target language.
Battling Monetization vs. Gut at DEMO
After initial experiments and probing we decided to include a business model focus as well. Our best bet was to target businesses first. Our first offering was a virtual cloud service that could instantly enable any website or web application for localization. Just give us the original URL and we would take care of the rest.
This prototype was selected by Matt Marshall (CEO, VentureBeat) to present at DEMO, September 2009, San Diego. We were the only product company from India at DEMO and it was exhilarating to give live demos of our working prototype at the conference.
The lessons learned at DEMO were invaluable. Small and medium businesses were not ready to spend their marketing budgets on secondary markets unless we took on the onus and risk of proving the market's value. Would you be willing to pay thousands of dollars for an unproven secondary market without any lead data? Given the length and complexity of the sales cycle, this solution was best pitched alongside a range of other website and content services. Most importantly, this path was not going to bring us any closer or any faster to an answer for the original problem – how will the Web transform translation into an online and hyper-connected activity? An answer here could impact millions of existing users and bring new users online. This was definitely more challenging and a little scary given that our competition was Google and Microsoft Research. But finally our decision boiled down to internally answering a simple question – which problem do we subconsciously think about in the shower?
Making Collaboration the Core
Back at the drawing board, we studied the dynamics of every crowdsourced model out there – from social bookmarking, social answers, social videos, and social shopping to community support services and many others. The foremost goal with which we built the currently available version of Dubzer was to make it easy to translate web content collaboratively and share it with others. As we say on our home page, “every page in our pool is continually improved by everyone who reads it”. What did this goal mean for the product?
For instance, we made it incredibly easy to translate and share links. None of the others are doing this. We bet not only on Wikipedia pages but on any URL: a user can now share Dubzer’s collaborative and flexible translation of a page instead of an inflexible machine translation.
We also reduced the size of the contribution required from each user, making the process as easy as adding a comment on a blog. If you see an entire page that needs to be translated, you may never get started. But if all we ask is for you to translate a single line or submit a simple URL, you may do it. This approach has worked well for us. On Dubzer every sentence can be uniquely referenced and shared; for instance, you can tweet the Dubzer URL of a single sentence if you need to ask your friends for its correct translation.
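To make the idea of sentence-level references concrete, here is a minimal sketch in Python. It is not how Dubzer is implemented; the URL layout, the `BASE_URL` placeholder, and every function name below are assumptions made up for illustration. It only shows one way an article could be split into sentences that each get a stable, shareable link.

```python
import hashlib
import re
from dataclasses import dataclass
from typing import List, Optional

# Placeholder host, NOT Dubzer's real URL scheme (assumption for illustration).
BASE_URL = "https://translate.example.invalid"


@dataclass
class Sentence:
    index: int                         # position of the sentence in the article
    source_text: str                   # original sentence
    translation: Optional[str] = None  # filled in by a contributor later


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def article_id(source_url: str) -> str:
    """Derive a short, stable identifier from the article's original URL."""
    return hashlib.sha1(source_url.encode("utf-8")).hexdigest()[:10]


def sentence_permalink(source_url: str, lang: str, index: int) -> str:
    """Build a hypothetical link pointing at one sentence of one translation."""
    return f"{BASE_URL}/{article_id(source_url)}/{lang}/s/{index}"


def build_article(text: str) -> List[Sentence]:
    """Turn raw text into individually addressable sentences."""
    return [Sentence(i, s) for i, s in enumerate(split_sentences(text))]


if __name__ == "__main__":
    url = "http://example.com/some-article"
    article = build_article("Hello world. How are you? Fine!")
    for sentence in article:
        print(sentence_permalink(url, "hi", sentence.index), sentence.source_text)
```

Because the identifier is derived from the source URL and the sentence position, the same sentence always maps to the same link, which is what makes it safe to tweet. A real system would need a more careful sentence splitter and a way to keep links stable when the source page changes.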
Our goal was always user traction, so we bet on collaboration. On Dubzer, several users can work on the same article, each contributing to a different sentence and each aware of who is simultaneously working on other sentences. Articles bookmarked for translation on Dubzer have permalinks with different views for reading and editing, making them easily shareable on Facebook, Twitter, and other social platforms. Bloggers can use our embed option to embed the translated version and seek contributions from their readers.
Many language enthusiasts told us they wanted a vanity page. We made each user’s activity available on a public Dubzer profile (dubzer.com/username) where they can showcase their language skills and contributions to build credibility.
Did it work?
We’ve been out there since late July. Many submissions have a complete translation – the WikiLeaks CIA report, Steve Jobs’ speech at Stanford in 2005, and several articles from popular websites such as Wikipedia and Mashable.com are on their way to completion. Last week we released our Firefox add-on, MyTranslationShoes. We’ve also created communities on Facebook to help us actively engage language enthusiasts. These help validate ideas before we build them into the product, saving us both time and money.
That’s the story so far. A lot of learning has happened and a lot still remains to be tested. We’re breaking it down into achievable milestones as we go along. The answer is not going to come easily or predictably; all good ideas have come from patience, persistence, and invaluable serendipity!
We’re happy to share data and insights beyond the scope of this blog. We crave opportunities to collaborate with individuals or companies interested in similar problems. Do connect with us.