Cesare Furlanello

Head of Unit
  • Phone: +39 0461 314580
  • FBK Povo
Short bio

Cesare Furlanello received his degree in Mathematics from the University of Padua, Italy, in 1986. He has been at Fondazione Bruno Kessler (Centre for Scientific and Technological Research of Trento) since 1987, where he is now a Senior Researcher. He currently leads the MPBA Project (previously the ITC-IRST Neural Networks for Complex Data Analysis Project, since 1995).

In general terms, he is a data scientist whose main research interests lie in interdisciplinary applications of machine learning methods to biomedical and environmental data. He is active in bioinformatics, developing methods and software solutions to find patterns in very high-throughput molecular data (such as Next Generation Sequencing and microarrays). He also has years of experience with machine learning and data analysis for geoinformatics, aiming to create a bridge (geo-bioinformatics) between molecular profiles and spatial data structures. He has designed and managed many collaborative studies with life science researchers, in which mathematics and software infrastructures are integrated to discover patterns in high-throughput datasets. He was first Project Manager at IRST for the National Bioelectronics Project (1991-94), and he is currently PI and project manager of research projects in which predictive models are applied to biomedical and environmental data, for a total of 58 funded projects since 1988. These studies combine statistical machine learning methods with new software infrastructures for data collection, management, and distribution of the resulting models; Predictive Health Platforms and Geoinformatics platforms are the final outcome. The most recent research is directed at applications in functional genomics, including the development of computational pipelines and a focus on the problems of scientific reproducibility.

Basic and applied studies have been developed at the MPBA group with colleagues in other institutions on molecular oncology, vector-borne disease mapping, wildlife epidemiology, traffic safety, and landscape risk analysis. CF has actively contributed to computational aspects, supporting the development of open source geoinformatics (GRASS GIS, WebGIS) and high performance machine learning (mlpy). Since 2002, he has contributed to the development of predictive classification models and gene selection procedures for molecular diagnostics, in collaboration with national and international centres of excellence in molecular oncology. He is a bioinformatics PI collaborator of international consortia such as the SEQC/MAQC FDA initiative and the FANTOM5 project led by the RIKEN OMICS centre. He has been a PI for AIRC with the IFOM-FIRC institute. He is also a collaborator PI in several projects of the Mach Foundation (FEM) for computational biology (metagenomics) and environmental mapping (climate change and plant genomics).

Several of the systems built in experimental studies are now data platforms used as infrastructures by public agencies: IET, MITRIS (Trentino and Friuli-VG), UXB-TN (Trentino), FaunaTN and FaunaBL (Trentino and Belluno) are the largest. The spinoff company MPA Solutions maintains these systems and develops WebGIS technologies with predictive modeling functions.

CF was Scientific Secretary of the GNCB-CNR school on Neural Networks for Signal Processing (Trento 1989) and organizer of other workshops on Applications of Machine Learning and Neural Networks. In September 2008, he was Local Conference Chair of the MGED11 International Workshop of the MGED Society (on its Advisory Board since 2007), and he is now on the Board of Directors of the FGED Society. Lecturer on Neural Networks and Statistics at the Master School of Advanced Information Science of Salerno University. Chairman of Session Theory 1 at IEEE NNSP-95, Cambridge MA, 1995. Member of the Scientific Board of the Multiple Classifier Systems series of conferences. Invited participant in the Machine Learning and Neural Networks Program of the Newton Institute for Mathematical Sciences, Cambridge UK, 1998. Member of the Italian Neural Network Society (serving on its Scientific Board 1991-2005) and of the International Association for Pattern Recognition.

Invited lectures (a selection): NATO-ASI school "Learning with Ensemble models" (Vietri 2002), the ECEM/EAML Conference (Bled 2004), the Int. BCB-Workshop on Machine Learning in Bioinformatics (Oct. 2005, Berlin), Int. School "The analysis of patterns" (Nov. 2005, Erice), "Predictive modeling on spatio-temporal patterns" (April 2007, Univ. Bristol), and "Signature Stability Analysis" (Nov. 2007, Silver Spring, FDA).

He has supervised 30 graduate or postgraduate theses for the universities of Trento (Mathematics and Engineering), Milano, Bologna, and Torino, supervised Leonardo graduate placements, and tutored 8 ASI-CONAE fellows in 2003-2012. He currently supervises internships for Master's theses in Mathematics, Information and Telecommunication Engineering for the University of Trento, and tutors post-doc fellowships. Courses held: 1998-2003, Lecturer on Computational Statistics and Predictive Models, Math MSc, Trento University; 2004-06, Lecturer on "Statistical Machine Learning", a course for the International Graduate School in ICT, Trento University. He is currently a member of the PhD School in Biomolecular Sciences of UniTN.

He is a founder of the WEBVALLEY project, the FBK summer course for dissemination of interdisciplinary scientific research. Since 2001, CF has been responsible for the WebValley scientific program and has served as a resident tutor for all 12 editions of the event. The theme of its 3 fast-paced weeks is developing a culture of data with open source platforms (web scripting, geodatabases, WebGIS, tools for data visualization, statistical analysis, decision making) around a challenging project, in which about 20 high school students team up with senior and junior researchers. In 2012, for this activity CF was listed as "one of the 50 persons that are changing the world" by Wired, Italian edition (at #42, as in the H Guide).

  1. S. Merler; B. Caprile; C. Furlanello,
    Bias-Variance Control via Hard Points Shaving,
    vol. 18,
    n. 5,
    , pp. 891 -
  2. C. Furlanello; M. Serafini; S. Merler; G. Jurman,
    Methods for predictive classification and molecular profiling from DNA microarray data,
    vol. 5,
    n. 1,
    , pp. 199 -
  3. M. Neteler; D. Grasso; I. Michelazzi; L. Miori; S. Merler; C. Furlanello,
    New image processing tools for GRASS,
    Proceedings of FOSS/GRASS 2004 Conference,
  4. S. Menegon; M. Neteler; C. Furlanello; S. Fontanari,
    Open Source GIS/WebGIS nella Amministrazione Pubblica,
    On-line Proceedings of SALPA Forum 2004: Sapere libero e aperto nella Pubblica Amministrazione,
  5. B. Caprile; S. Merler; C. Furlanello; G. Jurman,
    Exact Bagging with k-Nearest Neighbour Classifiers,
    Multiple Classifier Systems, Proceedings of the 5th International Workshop, MCS 2004,
    , pp. 72-
    , (5th International Workshop on Multiple Classifier Systems,
    06/09/2004 to 06/11/2004)
  6. P. Brunetti; D. Minati; R. Flor; C. Furlanello,
    Utilizzo di palmari GNU/Linux per rilevamenti statistici,
    Data collection and statistical processing play a fundamental role in understanding the transformations under way in the various sectors of business, in public administration activities, and in family life. The field interview method, based solely on paper forms, cannot guarantee the flexibility required by the growing complexity of modern statistical surveys. Information technologies offer numerous advantages, such as simplified management of even a large sample of respondents, automatic control of the interview flow, consistency checking of the answers with automatic error correction, and centralized transfer and synchronization among data sources. In this context, we propose a new statistical survey method based on the Mod_survey web questionnaire, filled in on an iPAQ H5550 Linux handheld with wireless connectivity. We illustrate the implementation of the system, developed within the WILMA (Wireless Internet and Location Management Architecture) project and in agreement with the Statistics Service of the Autonomous Province of Trento, and evaluate the strengths and limitations that emerged during its trial,
  7. C. Furlanello; M. Serafini; S. Merler; G. Jurman,
    Semi-supervised learning for molecular profiling,
    Class prediction and feature selection are two learning tasks that are strictly paired in the search for molecular profiles from microarray data. Researchers have become aware of how easy it is to incur a selection bias effect, and complex validation setups are required to avoid overly optimistic estimates of the predictive accuracy of the models and incorrect gene selections. This paper describes a semi-supervised pattern discovery approach that uses the by-products of complete validation studies on experimental setups for gene profiling. In particular, we introduce the study of the patterns of single sample responses (sample-tracking profiles) to the gene selection process induced by typical supervised learning tasks in microarray studies. We originate sample-tracking profiles as the aggregated off-training evaluation of SVM models of increasing gene panel sizes. Genes are ranked by E-RFE, an entropy-based variant of the recursive feature elimination for support vector machines (RFE-SVM). A Dynamic Time Warping (DTW) algorithm is then applied to define a metric between sample-tracking profiles. An unsupervised clustering based on the DTW metric allows automating the discovery of outliers and of subtypes of different molecular profiles. Applications are described on synthetic data and in two gene expression studies,
  8. S. Merler; C. Furlanello; B. Caprile,
    Giving AdaBoost a Parallel Boost,
    AdaBoost is one of the most popular classification methods in use. Unlike other ensemble methods (e.g., Bagging), AdaBoost is inherently sequential. In many data-intensive, real-world applications this may limit the practical applicability of the method. In this paper, a scheme is presented for the parallelization of AdaBoost. The procedure builds upon earlier results concerning the dynamics of AdaBoost weights, and yields approximations to the standard AdaBoost models that can be easily and efficiently distributed over a network of computing nodes. Margin maximization properties of the proposed procedure are discussed, and experiments are reported on both synthetic and benchmark data sets,
  9. Stefano Merler; Cesare Furlanello; Barbara Larcher; Andrea Sboner,
    Automatic model selection in cost-sensitive boosting,
    vol. 4,
    n. 1,
    , pp. 3 -
  10. Stefano Merler; Bruno Caprile; Cesare Furlanello,
    Bias-Variance Control via Hard Points Shaving,
    vol. 18,