Marco Turchi
Miscellaneous

- Co-Organizer of:

Intelligent Analysis and Processing of Web News Content workshop at WI-IAT - Milan 15 September 2009
Statistical Multilingual Analysis for Retrieval and Translation associated workshop at EAMT - Barcelona 13 May 2009
European Project SMART meeting in Bristol, May 2008

- Coordinator and head coach of basketball teams since September 1993

- Student co-advisor for Master's and degree theses on Text Analysis

Mentions of My Work

- ONTS: "Optima" News Translation System has been mentioned here

- Our PLoS ONE paper "The Structure of EU Mediasphere" has been mentioned in the following media

NLP/Text Mining Libraries

- GATE: a General Architecture for Text Engineering

- Weka: data mining software in Java

- Apache Lucene: information retrieval library

- LingPipe: Java libraries for the linguistic analysis of human language

SMT tools

- Moses: statistical machine translation system

- SRILM: toolkit for building and applying statistical language models (LMs)

- IRSTLM: LM toolkit

- GIZA++: training of statistical translation models

- Multi-thread GIZA: multi-threaded extension of the GIZA++ word alignment tool

General purpose Libraries

- SVMlight: an implementation of Support Vector Machines (SVMs) in C

- Apache Cayenne: persistence framework providing object-relational mapping (ORM) and remoting services

- SciPy: software for mathematics, science, and engineering in Python

- MySQL++: C++ wrapper for MySQL’s C API

Corpora

- Europarl: parallel corpus for SMT in 11 European languages: Romance (French, Italian, Spanish, Portuguese), Germanic (English, Dutch, German, Danish, Swedish), Greek and Finnish.

- JRC-Acquis: parallel corpus for SMT in 22 languages.

- SETimes: parallel corpus for SMT covering Balkan languages and English: Turkish, Croatian, Albanian, Serbian, Macedonian, Bulgarian, Greek, Romanian, English.

- EMEA: parallel corpus from the European Medicines Agency in 22 languages.

- CzEng: Czech-English parallel corpus.

- EPPS: word alignment documents

- Spanish-Dutch NER human annotated data

My extended CV

- Download here

email me