CONTRAST-IT and COMPARE-IT are comparable corpora: both consist of collections of similar texts (see "General features of the corpora" below).
CONTRAST-IT is a multilingual corpus and COMPARE-IT a monolingual one. They were created during two research projects funded by the Swiss National Science Foundation (ICOCP and ISAaC) with the aim of investigating Italian from a contrastive and comparative perspective. The publications resulting from these projects are listed under "CONTRAST-IT. Specific references".
The CONTRAST-IT and COMPARE-IT corpora include the following language components:
The CONTRAST-IT and COMPARE-IT corpora make it possible to work on a wide range of language pairs or groups, as well as on a single language. The two corpora can be used, for instance, to investigate the following language combinations:
The comparable CONTRAST-IT and COMPARE-IT corpora are based on:
- comparable text collections
- original text collections
- full-length text collections
- commonly occurring text collections
- small to medium size text collections
CONTRAST-IT and COMPARE-IT are high-quality corpora. All texts have been manually checked to ensure that all their components are present and that they belong to the correct news section (politics, economy, sports, etc.).
More information on the design of the CONTRAST-IT and COMPARE-IT corpora is available on the pages devoted to each corpus.