Year 2004

  1. C. Furlanello; M. Serafini; S. Merler; G. Jurman, Methods for predictive classification and molecular profiling from DNA microarray data, in «ITALIAN HEART JOURNAL», vol. 5, n. 1, 2004, pp. 199-202
  2. S. Merler; B. Caprile; C. Furlanello, Bias-Variance Control via Hard Points Shaving, in «INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE», vol. 18, n. 5, 2004, pp. 891-903
  3. G. Jurman, in «JOURNAL OF ALGEBRA», vol. 271, n. 2, 2004, pp. 454-481
  4. B. Caprile; S. Merler; C. Furlanello; G. Jurman, Exact Bagging with k-Nearest Neighbour Classifiers, in Multiple Classifier Systems: Proceedings of the 5th International Workshop, MCS 2004, Berlin, Springer, vol. 3077, 2004, pp. 72-81 (5th International Workshop on Multiple Classifier Systems, Cagliari, 06/09/2004 to 06/11/2004)
  5. S. Menegon; M. Neteler; C. Furlanello; S. Fontanari, Open Source GIS/WebGIS nella Amministrazione Pubblica [Open Source GIS/WebGIS in the Public Administration], in On-line Proceedings of SALPA Forum 2004: Sapere libero e aperto nella Pubblica Amministrazione, 2004
  6. M. Neteler; D. Grasso; I. Michelazzi; L. Miori; S. Merler; C. Furlanello, New image processing tools for GRASS, in Proceedings of the FOSS/GRASS 2004 Conference, 2004
  7. S. Merler; C. Furlanello; B. Caprile, Giving AdaBoost a Parallel Boost, 2004.
     AdaBoost is one of the most popular classification methods in use. Unlike other ensemble methods (e.g., Bagging), AdaBoost is inherently sequential; in many data-intensive, real-world applications this may limit its practical applicability. This paper presents a scheme for parallelizing AdaBoost. The procedure builds on earlier results concerning the dynamics of AdaBoost weights and yields approximations to the standard AdaBoost models that can be easily and efficiently distributed over a network of computing nodes. Margin-maximization properties of the proposed procedure are discussed, and experiments are reported on both synthetic and benchmark data sets (a minimal illustrative sketch of the sequential weight update follows this list).
  8. S. Merler; G. Jurman, Regularized Slope Function Networks for Microarray Data Analysis, 2004.
     We propose a novel algorithm, Regularized Slope Function Networks (RSFN), for classification and feature ranking in the family of Support Vector Machines. The main improvement is that the kernel is determined automatically by the training examples: it is built as a function of local classifiers, called slope functions, obtained by separating oppositely labeled pairs of training points. The algorithm, while admitting a meaningful geometric interpretation, is derived in the framework of Tikhonov regularization theory; its only free parameter is the regularization parameter, which represents a trade-off between empirical error and solution complexity. A theoretical bound on the generalization error is also derived, together with the Vapnik-Chervonenkis dimension. Performance is tested on a number of synthetic and real data sets, with emphasis on the microarray case (a sketch of the underlying regularization framework follows this list).
  9. C. Furlanello; M. Serafini; S. Merler; G. Jurman, Semi-supervised learning for molecular profiling, 2004.
     Class prediction and feature selection are two learning tasks that are strictly paired in the search for molecular profiles from microarray data. Researchers have become aware of how easy it is to incur a selection-bias effect, and complex validation setups are required to avoid overly optimistic estimates of predictive accuracy and incorrect gene selections. This paper describes a semi-supervised pattern-discovery approach that uses the by-products of complete validation studies on experimental setups for gene profiling. In particular, we introduce the study of the patterns of single-sample responses (sample-tracking profiles) to the gene selection process induced by typical supervised learning tasks in microarray studies. Sample-tracking profiles are obtained as the aggregated off-training evaluation of SVM models of increasing gene-panel size, with genes ranked by E-RFE, an entropy-based variant of recursive feature elimination for support vector machines (RFE-SVM). A Dynamic Time Warping (DTW) algorithm is then applied to define a metric between sample-tracking profiles, and an unsupervised clustering based on the DTW metric automates the discovery of outliers and of subtypes with different molecular profiles. Applications are described on synthetic data and in two gene expression studies (a minimal DTW sketch follows this list).
  10. P. Brunetti; D. Minati; R. Flor; C. Furlanello, Utilizzo di palmari GNU/Linux per rilevamenti statistici [Using GNU/Linux handhelds for statistical surveys], 2004.
     The collection of data and their statistical analysis play a fundamental role in understanding the transformations taking place in the various sectors of business, in the activities of the Public Administration, and in the life of households. Field interviews based on paper forms alone cannot guarantee the flexibility demanded by the growing complexity of modern statistical surveys. Information technology offers numerous advantages, such as simplified management of even a large sample of respondents, automatic control of the interview flow, consistency checking of answers with automatic error correction, and centralized transfer and synchronization among data sources. In this context we propose a new statistical survey method based on the Mod_survey web questionnaire, filled in on a Linux iPAQ H5550 handheld with wireless connectivity. We describe the implementation of the system, carried out within the WILMA (Wireless Internet and Location Management Architecture) project and in agreement with the Statistics Office of the Autonomous Province of Trento, and evaluate the strengths and limitations that emerged during its trial.
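
Illustrative sketch for entry 7 (Giving AdaBoost a Parallel Boost). The abstract notes that AdaBoost is inherently sequential because each round reweights the training set for the next one; the paper's parallel approximation of this process is not reproduced here. Below is a minimal sketch, in Python with NumPy, of the standard sequential weight update that the paper sets out to parallelize; the decision-stump weak learner and all function names are illustrative choices, not taken from the paper.

```python
# Minimal sketch of sequential AdaBoost with threshold stumps (labels y in {-1, +1}).
# Illustrative only: the paper's parallel approximation of this loop is not shown.
import numpy as np

def fit_stump(X, y, w):
    """Pick the threshold stump minimizing the weighted training error."""
    n, d = X.shape
    best = (np.inf, 0, 0.0, 1)                    # (error, feature, threshold, polarity)
    for j in range(d):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = pol * np.where(X[:, j] <= thr, 1, -1)
                err = np.sum(w[pred != y])
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost(X, y, rounds=20):
    """Plain AdaBoost: each round depends on the weights left by the previous one."""
    n = len(y)
    w = np.full(n, 1.0 / n)
    models = []
    for _ in range(rounds):
        err, j, thr, pol = fit_stump(X, y, w)
        err = max(err, 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)
        pred = pol * np.where(X[:, j] <= thr, 1, -1)
        w *= np.exp(-alpha * y * pred)            # sequential dependence between rounds
        w /= w.sum()
        models.append((alpha, j, thr, pol))
    return models

def predict(models, X):
    score = sum(a * p * np.where(X[:, j] <= t, 1, -1) for a, j, t, p in models)
    return np.sign(score)
```

The weight-update line is the sequential dependence the paper's scheme approximates so that the rounds can be distributed over a network of computing nodes.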
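Illustrative sketch for entry 8 (Regularized Slope Function Networks). The abstract places RSFN in the Tikhonov regularization framework, with a kernel built automatically from slope functions on oppositely labeled training pairs and a single regularization parameter trading off empirical error against solution complexity. The slope-function kernel itself is not specified here; the sketch below shows only the generic kernel regularized least-squares classifier that this framework yields, with a standard Gaussian kernel standing in for the data-driven RSFN kernel. All names and parameters are illustrative assumptions.

```python
# Generic kernel regularized least-squares classifier (Tikhonov framework).
# The Gaussian kernel is a placeholder for the RSFN slope-function kernel.
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def fit_rls(X, y, lam=0.1, sigma=1.0):
    """Solve (K + lam*n*I) c = y, the Tikhonov-regularized empirical risk minimizer."""
    n = len(y)
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * n * np.eye(n), y)

def predict_rls(c, X_train, X_test, sigma=1.0):
    """Classify by the sign of the regularized kernel expansion."""
    return np.sign(gaussian_kernel(X_test, X_train, sigma) @ c)
```

Here lam plays the role of the single free regularization parameter mentioned in the abstract, balancing fit against solution complexity.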
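Illustrative sketch for entry 9 (Semi-supervised learning for molecular profiling). The approach defines a metric between per-sample sample-tracking profiles via Dynamic Time Warping and then clusters samples with that metric. The sketch below is a minimal textbook DTW distance between two 1-D profiles; the construction of the profiles (aggregated off-training SVM evaluations over E-RFE-ranked gene panels) and the clustering step are not reproduced, and the function name is an assumption.

```python
# Minimal Dynamic Time Warping distance between two 1-D "sample-tracking profiles".
# The profiles themselves would be computed upstream; here they are plain sequences.
import numpy as np

def dtw_distance(a, b):
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

The resulting pairwise distances could then feed a standard clustering routine (for example, hierarchical clustering on the condensed distance matrix), which is one way to automate the outlier and subtype discovery the abstract describes.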