All news on TALP

LC-STAR Lexica and Corpora for Speech-to-Speech Translation Components

12 May, 2002

The project “Lexica and Corpora for Speech-to-Speech Translation Components” (LC-STAR) aims to develop lexica for automatic speech recognition and text-to-speech synthesis for thirteen languages, and multilingual corpora for speech-centered translation applications for nine languages. The project is led by a consortium comprising two universities and several industrial companies. All resources to be developed are encoded using the Extensible Markup Language (XML). This paper describes XML-related issues in the LC-STAR project from three perspectives: the XML encoding of the lexica, the XML encoding of the multilingual corpora, and the validation of XML encodings such as that of the LC-STAR lexica.

Category: Projects

HISTORIC Domestic Research Projects

24 May, 2001

Multilingual access to interactive communication services for the Mediterranean and the Middle East. 2001 5030-E. Years: 2002-2004. PI: Asuncion Moreno

Multilingual speech technologies (TEHAM). TIC 2000 1005 C03. Years: 2001-2003. PI: Javier Hernando. Participating universities: UP Vigo, U Pais Vasco

Oral interfaces for advanced unified messaging applications
TIC 2000 1735 C02-01. Years: 2001-2003. PI: Javier Hernando

Speech processing for the integration of telephony and computers
TIC 98 0683. Years: 1998-2000. PI: J. A. Rodríguez Fonollosa

Development of Spanish linguistic resources for building voice-driven systems
TIC98-0685. Years: 1998-2000. PI: Asuncion Moreno

Development of a speech database for speech recognition applications in cars
TIC –981715-CE. Years: 1998-2000. PI: Asuncion Moreno

Speech databases for the creation of voice-driven teleservices
TIC 97-0271-CE. Years: 1997-1998

Signal processing in interactive voice servers
TIC 95-1022-C05-03. Years: 1995-1998. PI: J. A. Rodríguez Fonollosa

Application of advanced image recognition techniques to a text-to-speech conversion system
TIC 95-1022-C05-04. Years: 1995-1997.

Speech corpus for the development of automatic speech recognition applications for Catalan
Generalitat de Catalunya. Directorate-General for Research. CIRIT. 1994-1995

Category: Projects

FAME Facilitating Agent in Multiculture Exchange

12 May, 2001

The FAME project goal is to provide and integrate core technologies (video and speech perception, augmented reality, translation, information retrieval) to show feasibility of the concept.

• Partners: Uni Karlsruhe, INPG Grenoble, UJF Grenoble, UPC Barcelona, ITC-irst Trento, SONY Stuttgart, ATLAS SA

Category: Projects

Meaning Developing Multilingual Web-scale Language Technologies

12 May, 2001

MEANING will be concerned with automatically collecting and analysing language data from the WWW on a large scale, and building more comprehensive multilingual lexical knowledge bases to support improved word sense disambiguation (WSD).

Current web access applications are based on words; MEANING will open the way for access to the Multilingual Web based on concepts, providing applications with capabilities that significantly exceed those currently available. MEANING will facilitate development of concept-based open domain Internet applications (such as Question/Answering, Cross Lingual Information Retrieval, Summarisation, Text Categorisation, Event Tracking, Information Extraction, Machine Translation, etc.). Furthermore, MEANING will supply a common conceptual structure to Internet documents, thus facilitating knowledge management of web content.

Progress is being made in Human Language Technology (HLT) but there is still a long way towards Natural Language Understanding (NLU). An important step towards this goal is the development of technologies and resources that deal with concepts rather than words. MEANING will develop concept-based technologies and resources through large-scale knowledge processing over the web, robust and fast machine learning algorithms, very large lexical resources and novel strategies for combining them. Small-scale, isolated experiments with limited infrastructure (such as Internet access, processing power, and storage space) have no chance of bridging the gap to understanding. Advances in this area can only be expected in the context of large-scale long-term research projects.

Category: Projects

NAMIC News Agencies Multilingual Information Categorization

12 May, 1999

This project aims to develop and bring to a marketable stage advanced Natural Language Processing technologies for multilingual news customisation and broadcasting across distributed services, a major problem for international and national News Agencies (NAs) as well as for the spread of Web technologies. Within their own business cases, NAs need to integrate into their own repositories news distributed by other NAs, usually in different languages and according to different classification standards. The mismatch is at the language level, as different languages are used, as well as at the conceptual level, as the organization and storage of news follow diverging schemes. This project will include the linguistic engineering tools developed in ITEM, RILE and EuroWordNet.

This project is developed by our group from the Technical University of Catalonia (UPC), together with members of the University of Barcelona (UB) Computational Linguistics research group, research groups from the University of Sheffield, University of Roma Tor Vergata and Vrije Universiteit Brussel, and Itaca, the International Press Telecommunications Council, Agenzia ANSA, Agencia EFE and the Financial Times.

Funding: European Union through Information Society Technologies programme (IST-1999-12392).

Category: Projects

EUROWORDNET Building a multilingual wordnet with semantic relations between words

22 January, 1999

This project aims to develop a generic multilingual database with WordNets for several European languages (English, Dutch, Italian and Spanish), each with 30,000 senses. Those WordNets will be linked through the English WordNet, so each English synonym will be associated with its equivalent in the other languages.

Partners in this project are: University of Amsterdam, University of Sheffield, Istituto di Linguistica Computazionale of CNR, Lernout & Hauspie and the NL research groups from UPC, UB and UNED.

Funding: European Union through Language Engineering sector of the Telematics Applications Programme (LE-2 4003)

Category: Projects

HISTORIC European Projects

10 January, 1999

Spoken Language Interaction in Telecommunication. COST 278. Years: 2002-2005

Preparatory action for the project "Technology and Corpora for Speech to Speech Translation" (TC-STAR_P).
IST-2001-37376. Years: 2002-2003. PI and Project Coordinator: Asuncion Moreno

COST Action 276: Information and Knowledge Management for Integrated Media Communication. Years: 2000-2003

Speech Driven Interfaces for Consumer Applications (SPEECON). IST 1999-10003. Years: 2000-2002. Subcontracted by Siemens. PI: Asuncion Moreno

Interface: Multimodal Analysis/Synthesis System for Human Interaction to Virtual and Augmented Environments. IST-1999-10036. Years: 2000-2002. PI: Asuncion Moreno

Evaluation of Multilingual Spoken Language Resources and Tools. Bilateral cooperation with Maribor University, Slovenia. Years: 2000-2002. Ministerio de Asuntos Exteriores (Spanish Ministry of Foreign Affairs). PI: Asuncion Moreno

SpeechDat Car: Speech databases for voice driven teleservices and control in automotive environments. Telematics Program. LE4-8334. Years: 1998-2000. PI: Asuncion Moreno

Speech Databases for the creation of voice driven teleservices - SPEECHDAT(II). LRE 4002. Years: 1996-1998. PI: Asuncion Moreno

Aquilex I II 1992-1995

HANDY: Design and development for manufacturing of a pocket ergonomic data keyboard for handicap people use. CEE CRAFT BAST-CT95-5016. Years: 1996-1998. PI: Francesc Vallverdú

Category: Projects

Other technology transfer Projects

13 December, 1997

2013-2015 Ministerio de Ciencia e Innovación (INNPACTO): SEGUNDA VOZ programme
2012 DatknoSys: On-line community crosslingual summarization pilot
2011-2012 viClone: NLP techniques for virtual assistants
2011-2012 Verbio Technologies: Advising and delivery of training courses on speech, audio and language processing technologies
2010-2012 TV3 - Ministerio de Ciencia e Innovación (CENIT): Advising on the development of speech technology within the BUSCAMEDIA project of the CENIT programme
2010 Catedra Telefónica-UPC: Workshops and pilot to promote speech and text processing technologies.
2010 Fundació: Pat.Mapa/Romànic-cellers
2010-2011 Programa Avanza: QUESTIA: Scalable answer engine able to answer questions expressed directly in natural language
2010 SemantixGroup: Integration of a language detector and a morphosyntactic disambiguator
2009 Herta: Text-dependent speaker recognition
2008 Fundación CTIC: T-INCLUYE, an inclusive language recommendation system to detect exclusive language in Spanish documents, developed in the framework of the Web con Género project
2007-2010 Ministerio de Industria, Turismo y Comercio (CENIT): INREDIS, Interfaces de Relación entre el Entorno y las personas con Discapacidad (interfaces between the environment and people with disabilities)
2007-2009 Secretaría de Política Lingüística. Generalitat de Catalunya: TECNOPARLA: Speech technologies in Catalan.
2007-2010 Ministerio de Industria (CENIT): MARTA: Movilidad y Automoción con Redes de Transportes Avanzadas (mobility and automotive with advanced transport networks)
2007 Cromosoma: Videogame knowledge management

Category: Projects
