tinker on

dreaming up a web that works.

Category: Lipikaar

Digital Library of 100 timeless titles in each Indian language. Interested?

by Anjali Gupta

Project Gutenberg makes over 33,000 previously published books available as free e-books. This is done with the help of thousands of volunteers, through a project called Distributed Proofreaders. The contributions made by these volunteers empower readers to enjoy these books on Apple’s iPad, Kindle, Android, and similar platforms.

With OCR or even manual typing there will be several errors in the text produced. Human proofreading becomes a necessary activity before the book is converted into a downloadable e-book. Similar to translation, only a real person can spot and correct the errors. I often notice at least a few typos in newly published books. I wonder why authors don’t employ crowd-sourcing to get their chapters proofread. The ability to read the content early is reward enough for volunteers.

Old magazine articles, comics, and famous letters from India can be made available with the power of distributed or crowd-powered proofreading. It’s unfortunate that there are no digitized old books available in Indian languages on Gutenberg.

Using Dubzer (free crowd-sourced proofreading), Lipikaar (easy Unicode-based typing for Indian languages), Pothi (self-publishing, print on demand, downloadable e-books), and other such web-based platforms, we can create a digital library of timeless Indian content whose copyright has expired and which can be publicly distributed. Even semi-urban or rural folks who read well in their local language but have poor access to libraries will be able to make reading an enjoyable leisure activity. With India’s 3G-powered smartphone revolution, is this hard to imagine? We can initially aim to create 100 e-book titles in each Indian language, including English.

The possibilities are exciting and challenging. These ideas came out of our conversations with Abhaya Agarwal, co-founder of Pothi.com, who has a keen interest in the work of Indian authors and journalists who did not have the benefit of digitization.

We would love to jump-start this initiative with a group of like-minded folks. Do write to us if you have any of these: insights or leads to similar attempts, OCR expertise, relevant open-source OCR software, timeless books/articles/magazines/literature, typed text, etc. Even if you don’t have these, please join in with your ideas and enthusiasm. Students are welcome too!

Update (February 1, 2011)

StoryDB.in (A Story Database for India) has been created. It now lists four Hindi books that are out of copyright, contributed to the database by Abhaya Agarwal.

Lipikaar crowdsources translations for its Help pages on Dubzer

by Anjali Gupta

Lipikaar, a popular software for typing in Indian languages, is now using Dubzer’s social translation platform to translate its help pages, currently in English, into major Indian languages. We’re excited to see crowdsourced translation in action.

How does this work?


Lipikaar’s website gets several thousand visits every day. The visitors are fluent in at least one local Indian language and are using Lipikaar’s technology for that language.

Therefore, Lipikaar already has a bilingual user following which is fluent in the target language. That’s the only prerequisite for social translation to work.


Lipikaar is offering a free license of its Windows Desktop Software (Rs. 499 value) to those users who translate 10 sentences or more. The reward is aligned to the user community and is something that Lipikaar can offer without having a budget for translation.
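For readers who want to try a similar reward scheme, here is a minimal sketch of the eligibility rule in Python. Only the threshold of 10 sentences comes from this post; the contribution log format and the function name are assumptions made for illustration, not how Lipikaar or Dubzer actually track contributions.

```python
from collections import Counter

# Threshold taken from the post: 10 translated sentences or more.
MIN_SENTENCES = 10


def eligible_contributors(contributions, min_sentences=MIN_SENTENCES):
    """Return the users who translated at least `min_sentences` sentences.

    `contributions` is a hypothetical log with one (user, sentence_id)
    pair per translated sentence. A repeated pair is counted only once.
    """
    counts = Counter(user for user, _ in set(contributions))
    return sorted(user for user, n in counts.items() if n >= min_sentences)


# Made-up example data: one user with 12 sentences, one with 4.
log = [("asha", i) for i in range(12)] + [("ravi", i) for i in range(4)]
print(eligible_contributors(log))  # prints ['asha']
```

Counting distinct sentences rather than raw submissions is a design choice in this sketch, so that resubmitting the same sentence does not earn the reward.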

The reward is highlighted on each Help page as shown here: http://www.lipikaar.com/desktop/help/how-to-type-in-hindi

With one click, the user directly lands on the corresponding Dubzer page in Hindi where he can contribute translations. Adding the relevant pages to Dubzer takes only a few minutes. At no point does the user have to select his language or select the page he wants to translate. The experience is designed to be intuitive and engaging.


Within 3 weeks of launching this social translation experiment on the website, Lipikaar completed translations for 3 major languages: Hindi, Marathi, and Kannada. The contributors were new users who were probably visiting Lipikaar.com for the first time. To involve loyal users, Lipikaar plans to include this initiative in the monthly newsletter. This should further increase the pace of translation.

8 weeks later: The translations for Hindi, Telugu, Bengali, Oriya, Marathi, and Kannada were completed.

Lipikaar gets popular across languages and applications

by Anjali Gupta

The recent data from Lipikaar shows that we have gathered users across the spectrum.

Key highlights:

No one language accounts for more than 20% of users. A year ago we had Hindi and Punjabi dominating our charts.

The top 10 languages used by Lipikaar users are Hindi (19%), Arabic (17%), Punjabi (13%), Marathi (10%), Gujarati (8%), Telugu, Malayalam, Bengali, and Tamil, with Urdu and Kannada tied for the 10th spot.

On the applications front, we have users across 300 unique software applications! Users have typed in the languages above in 300 different Windows applications. The most popular one, Microsoft Word, accounts for only 3%!

The top applications include Microsoft Word, Excel, Access, Internet Explorer, Acrobat, Firefox, Chrome, Outlook, Notepad, PowerPoint, Google Talk, Yahoo Messenger, Photoshop, and so on.

Some of the new entrants that are being actively used with Lipikaar are Google Earth and iTunes.

After powering the PC and websites, we’re gearing up to power the mobile phone with Indian languages. Do send us your ideas. Write to me if you would like to include Lipikaar with your software or mobile application.