THE EMILLE PROJECT: About the EMILLE Project

The EMILLE Corpora were released in August 2003. They are distributed via ELRA / ELDA.

PLEASE NOTE: the EMILLE team at Lancaster University cannot provide you with a copy of the data. Please direct all requests to ELRA/ELDA.

The data is available in two forms. The full EMILLE/CIIL Corpus is available free of charge, for research use only. See:

Resource page for W0037 on ELRA catalogue (or catalogue search)

The EMILLE Lancaster Corpus, a 59-million-word subset of the full data, is available for commercial exploitation at a cost. See:

Resource page for W0038 on ELRA catalogue (or catalogue search)

For more information and pricing details, see the ELDA catalogue here:

http://catalog.elra.info

For information on how to order, see the following page on the ELRA site:

http://catalog.elra.info/purchase_procedure.php

A beta version of the corpus, consisting of a restricted sample of the data, was released in 2003. We advise against using this version, as it contains some encoding errors that were identified after the beta release.

Distribution