Languages

The main languages covered by the EMILLE Corpora are as follows:

Bengali
Gujarati
Hindi
Punjabi
Urdu
Sinhalese
Tamil

In addition, the EMILLE/CIIL Monolingual written corpora contain data in Assamese, Kannada, Kashmiri, Malayalam, Marathi, Oriya and Telugu.
