Simplext: Text Simplification for Spanish

Stefan Bott

Omega-S208 Campus Nord UPC

Wed Oct 17, 2012



Text Simplification (TS) is the task of making a text easier to read and understand for target groups that may otherwise have problems with reading comprehension. It may be aimed at people with cognitive disabilities or second language learners, among others. TS has also been used as a preprocessing step for other NLP tasks, such as Information Extraction or Automatic Translation. The task has received growing attention in recent years, but for several reasons most of the research has been carried out for English. These reasons include, for example, the availability of new data sets, such as the Simple English Wikipedia.

TS is a composite task that combines the simplification of syntactic structures and lexical items, content reduction and, possibly, the insertion of clarifying content. As a result, TS is similar to, but distinct from, other NLP tasks, such as Automatic Translation, Automatic Summarization or Paraphrasing.

The Simplext project aims to develop text simplification tools for Spanish. In this talk we will present two modules developed within the project: a simplification grammar for syntactic simplification and a lexical simplification module.


