Advanced Statistics (year 2021)

Academic year 2020/2021

Course ID

AdStat

Teaching staff

Dott. Davide Ascoli
Luigi Bollani
Prof. Matteo Garbarino
Prof. Giampiero Lombardi
Prof. Michele Lonati
Prof. Andrea Onofri

Type

Basic

Credits/Recognition

6.7

Delivery

Formal authority

Language

English

Attendance

Obligatory

Program

Correspondence analysis (Luigi BOLLANI)

Correspondence analysis (CA) and multiple correspondence analysis (MCA) build on dimension-reduction techniques such as principal component analysis and extend them to qualitative variables. CA analyses the contingency table of two qualitative variables. It complements a test of association such as the chi-square test by showing which categories of the two variables account for the association. MCA extends the method to more than two qualitative variables. Both CA and MCA produce quantitative factorial dimensions ordered by decreasing importance (i.e. explained variance), which can then be used as input for clustering techniques. The module is practical in focus and uses the R environment for data processing.
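
As a rough illustration of the link between the chi-square test and CA, the sketch below computes the chi-square statistic of a two-way contingency table, the total inertia that CA decomposes (chi-square divided by the table total), and the standardized residuals that indicate which cells contribute to the association. The module itself works in R; Python is used here only for illustration, and the function name `chi_square_summary` and the counts are hypothetical.

```python
import math


def chi_square_summary(table):
    """Return (chi2, inertia, residuals) for a two-way table given as a list of rows.

    residuals[i][j] is the standardized residual (observed - expected) / sqrt(expected).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    residuals = []
    for i, row in enumerate(table):
        res_row = []
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
            res_row.append((observed - expected) / math.sqrt(expected))
        residuals.append(res_row)
    # Total inertia analysed by CA equals chi-square divided by the grand total.
    return chi2, chi2 / n, residuals


# Hypothetical counts, not taken from the course material.
table = [[20, 10], [10, 20]]
chi2, inertia, residuals = chi_square_summary(table)
print(chi2, inertia)
```

Positive residuals mark cells observed more often than independence would predict, negative residuals the opposite.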

Multivariate Statistical Analysis for Ecologists (Matteo GARBARINO)

The module introduces the main multivariate statistical tools (grouping, ordination, group testing and modeling) used in ecological studies. Multivariate statistics will be covered from both a theoretical and a practical point of view. The tools include cluster analysis, PCA, NMDS, RDA, CCA, the Mantel test, MRPP and MANOVA, among others. Several software packages will be presented and discussed (PC-ORD, CANOCO, R, PAST), but only the simplest one (PAST) will be used in the lab exercises.

Cluster Analysis (Michele LONATI, Giampiero LOMBARDI)

Many biological experiments gather large quantities of data. Among the techniques used to explore such data, cluster analysis is common in many fields. Cluster analysis groups a set of objects so that objects in the same group (called a cluster) are more similar to each other than to objects in other clusters. The seminar presents the theoretical basis of hierarchical clustering and goes through the steps needed to group objects and to check the quality of the results. Students will apply the main clustering techniques straight away, using the clustering algorithms available in SPSS, R and PAST.
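
The grouping idea can be sketched with a naive agglomerative procedure. The example assumes Euclidean distance and single linkage (the distance between two clusters is the distance between their closest members); this is only one of several possible choices and not necessarily the one used in the seminar. The function name `single_linkage` and the points are hypothetical, and Python is used only for illustration.

```python
import math


def single_linkage(points, k):
    """Merge the two closest clusters repeatedly until k clusters remain.

    Returns a list of clusters, each a sorted list of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(a, b):
        # Single linkage: smallest pairwise distance between members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > k:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = cluster_distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Hypothetical 2-D observations: three near the origin, two near (5, 5).
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
clusters = single_linkage(points, 2)
print(clusters)
```

Recording the merge distances at each step would give the heights of a dendrogram, which is the usual way to inspect a hierarchical solution.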

Time Series Analysis in R (Davide ASCOLI)

The module aims at describing the properties of environmental time series (ts), understanding the mechanisms by which ts evolve over time, and studying short-term dependencies and low-frequency relationships between ts. It covers modeling approaches in ts analysis: time- and frequency-domain properties (e.g. autocorrelation, cross-correlation, stationarity, and series decomposition into trend, cycles and white noise), linear dynamic models of univariate and multivariate relationships, autoregressive stochastic models (AR, ARIMA), and spectral and wavelet analysis. The module makes intensive use of R, and students will run analyses in real time. A basic knowledge of R is required for effective learning.
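
Autocorrelation, one of the properties listed above, can be illustrated with a minimal sketch of the sample autocorrelation at a given lag. The module itself works in R; Python is used here only for illustration, and the function name `autocorrelation` and the series are hypothetical.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at a non-negative lag.

    Uses the common estimator that divides the lagged sum of cross-products
    by the total sum of squared deviations from the mean.
    """
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom


# Hypothetical series that alternates sign at every step.
series = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
acf = [autocorrelation(series, lag) for lag in range(3)]
print(acf)
```

For this alternating series the lag-1 value is negative and the lag-2 value is positive, which is the pattern expected for a cycle of period two.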

Linear and nonlinear mixed models in R (Andrea ONOFRI)

Very often, experimental data are grouped by locations, years, blocks, main-plots, randomisation units and individuals. Under these conditions, modelling and statistical inference require great care, because a violation of the basic independence assumption invalidates the results. Furthermore, for some experimental factors we may not be interested in the observed response at each factor level, but mainly in the overall variability that the factor produces across the experimental units. For these situations, mixed models provide a very convenient modelling platform: they introduce random effects and account for the differences among groups through the estimation of variance and covariance components. Within this platform, it is also possible to introduce complex variance and covariance structures, which can help explain the relationships among individual observations and among groups of observations. This seminar will cover linear and nonlinear mixed models, with particular attention to the practical aspects of model definition and implementation within the R statistical environment.
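
To give a feel for variance components, the sketch below estimates the between-group and within-group variances of the simplest mixed model, a balanced one-way design with a random group effect, using the classical ANOVA (method-of-moments) estimator. This is a deliberately simplified stand-in and not the estimation method of any particular R package; the function name `variance_components` and the data are hypothetical, and Python is used only for illustration.

```python
def variance_components(groups):
    """ANOVA estimates for a balanced one-way random-effects model.

    groups is a list of equally sized lists of observations.
    Returns (between_group_variance, within_group_variance).
    """
    a = len(groups)
    n = len(groups[0])
    if a < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need at least 2 groups of equal size >= 2")
    group_means = [sum(g) / n for g in groups]
    grand_mean = sum(group_means) / a
    # Mean squares between and within groups.
    msb = n * sum((m - grand_mean) ** 2 for m in group_means) / (a - 1)
    msw = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g) / (a * (n - 1))
    # E[MSB] = within + n * between, so solve for the between component;
    # negative estimates are truncated at zero.
    between = max(0.0, (msb - msw) / n)
    return between, msw


# Hypothetical data: three blocks with two observations each.
groups = [[9.0, 11.0], [19.0, 21.0], [29.0, 31.0]]
between, within = variance_components(groups)
print(between, within)
```

Here most of the variability lies between blocks, which is exactly the situation in which treating the observations as independent would be misleading.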

Description