
Workshop 9: Multivariate Analyses

QCBS R Workshop Series

Québec Centre for Biodiversity Science

1 / 111

About this workshop


2 / 111

QCBS members who contributed to this workshop

by modifying and improving its content as part of the
Learning and Development Award

2022 - 2021 - 2020

Pedro Henrique P. Braga

Katherine Hébert

Mi Lin

Linley Sherin


2019 - 2018 - 2017

Gabriel Muñoz

Marie-Hélène Brice

Pedro Henrique P. Braga

2016 - 2015 - 2014

Bérenger Bourgeois

Xavier Giroux-Bougard

Amanda Winegardner

Emmanuelle Chrétien

Monica Granados


If you would like to contribute too, visit r.qcbs.ca/contributing
and don't hesitate to get in touch with us!

3 / 111

Required material

This workshop requires the latest RStudio and R versions.

You must also use these packages:

To install them from CRAN, run:

install.packages(c("ape",
"ade4",
"codep",
"gclus",
"vegan",
"GGally",
"PlaneGeometry",
"remotes"))


Throughout this workshop, there will be a series of challenges that you can recognize by this Rubik's cube.

During these challenges, do not hesitate to collaborate!

4 / 111

Learning objectives

  1. Learn the basics of multivariate analysis to reveal patterns in community composition data

  2. Use R to perform an unconstrained ordination

  3. Learn about similarity and dissimilarity coefficients and transformations to perform multivariate analysis

  4. Use R to create dendrograms

  5. Learn the following methods:

    • Clustering analysis
    • Principal Component Analysis (PCA)
    • Principal Coordinate Analysis (PCoA)
    • Non-Metric MultiDimensional Scaling (NMDS)
5 / 111

1. Preamble

6 / 111

Recap: Univariate analyses

We have learned a multitude of analyses that allowed us to interpret ecological data while depicting the effects of one or multiple variables on one response variable.

We can recall the:

  1. General Linear Models

    1. lm();
    2. anova();
    3. t.test();
    4. lmer().
  2. Generalized Linear Models

    1. glm() and glmer() with several family() link functions.
  3. Generalized Additive Models

    1. gam().

These models allowed us to ask questions such as:

  1. What are the effects of precipitation and temperature on species richness?
  2. How does the abundance of microbes change between hosts?
  3. Do co-occurring fish become more aggressive after being induced to fear?
7 / 111

Multivariate statistics

However, one may be interested in making inferences from ecological data containing more than one outcome or dependent variable.

This interest may be driven by hypothesis testing and modelling, but also be entirely exploratory.

For instance, our research question might be:

  1. How does the bacterial composition on maple leaves change along the elevational gradient?

  2. What is the composition dissimilarity of bat communities?

  3. How closely-related are local spider communities in terms of their composition?

In all these cases, the outcome is composed of several variables, usually recorded as a sample-by-species or sample-by-environment matrix.


We will now dive into multivariate statistics, a tool set that will allow us to address questions requiring the simultaneous observation or analysis of more than one outcome variable.

8 / 111

Background on multivariate methods

  1. Association (or dis-similarity) measures and matrices

  2. Classification (or cluster) analysis

  3. Unconstrained ordination

  4. Constrained (or canonical) ordination


But, before that, we must recall the basics of matrix algebra.

9 / 111

Matrix algebra: a very brief summary

Matrix algebra is well-suited for ecology, because most (if not all) data sets we work with are in a matrix format.

Ecological data tables are obtained as object-observations or sampling units, and are often recorded like this:

Objects   y1      y2      ...   yn
x1        y1,1    y1,2    ...   y1,n
x2        y2,1    y2,2    ...   y2,n
...       ...     ...     ...   ...
xm        ym,1    ym,2    ...   ym,n


where xm is the sampling unit m, and yn is an ecological descriptor that can be, for example, a species present in a sampling unit, a locality, or a chemical variable.

The same ecological data table can be represented in matrix notation like this:

Y = [y_{m,n}] =
\begin{bmatrix}
y_{1,1} & y_{1,2} & \cdots & y_{1,n} \\
y_{2,1} & y_{2,2} & \cdots & y_{2,n} \\
\vdots  & \vdots  & \ddots & \vdots  \\
y_{m,1} & y_{m,2} & \cdots & y_{m,n}
\end{bmatrix}

where lowercase letters indicate elements, and the subscript letters indicate the position of these elements in the matrix (and in the table!).

10 / 111

Matrix algebra: a very brief summary

Matrix algebra is well-suited for ecology, because most (if not all) data sets we work with are in a matrix format.

Moreover, any subset of a matrix can be recognized!


a row matrix: \begin{bmatrix} y_{1,1} & y_{1,2} & \cdots & y_{1,n} \end{bmatrix}

a column matrix: \begin{bmatrix} y_{1,1} \\ y_{2,1} \\ \vdots \\ y_{m,1} \end{bmatrix}

The same ecological data table can be represented in matrix notation like this:

Y = [y_{m,n}] =
\begin{bmatrix}
y_{1,1} & y_{1,2} & \cdots & y_{1,n} \\
y_{2,1} & y_{2,2} & \cdots & y_{2,n} \\
\vdots  & \vdots  & \ddots & \vdots  \\
y_{m,1} & y_{m,2} & \cdots & y_{m,n}
\end{bmatrix}

where lowercase letters indicate elements, and the subscript letters indicate the position of these elements in the matrix (and in the table!).

11 / 111

Matrix algebra: a very brief summary

Association matrices

Two important matrices can be derived from the ecological data matrix: the association matrix among objects and the association matrix among descriptors.

Using the data from our matrix Y,

Y =
\begin{matrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{matrix}
\begin{bmatrix}
y_{1,1} & y_{1,2} & \cdots & y_{1,n} \\
y_{2,1} & y_{2,2} & \cdots & y_{2,n} \\
\vdots  & \vdots  & \ddots & \vdots  \\
y_{m,1} & y_{m,2} & \cdots & y_{m,n}
\end{bmatrix}

one can examine the relationship between the first two objects:

x_1: \begin{bmatrix} y_{1,1} & y_{1,2} & \cdots & y_{1,n} \end{bmatrix}
x_2: \begin{bmatrix} y_{2,1} & y_{2,2} & \cdots & y_{2,n} \end{bmatrix}

and obtain a_{1,2}.

We can populate the association matrix A_{m,m} with the relationships between all objects from Y:

A_{m,m} =
\begin{bmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,m} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,m} \\
\vdots  & \vdots  & \ddots & \vdots  \\
a_{m,1} & a_{m,2} & \cdots & a_{m,m}
\end{bmatrix}

Because A_{m,m} has the same number of rows and columns (one per object), it is denoted a square matrix.

Therefore, it has m^2 elements.

12 / 111

Matrix algebra: a very brief summary

Association matrices

Two important matrices can be derived from the ecological data matrix: the association matrix among objects and the association matrix among descriptors.

We can also obtain the relationship between the first two descriptors of Y, y_1 and y_2:

y_1: \begin{bmatrix} y_{1,1} \\ y_{2,1} \\ \vdots \\ y_{m,1} \end{bmatrix}
y_2: \begin{bmatrix} y_{1,2} \\ y_{2,2} \\ \vdots \\ y_{m,2} \end{bmatrix}

and store it in a_{1,2}.

We can populate the association matrix A_{n,n} with the relationships between all descriptors from Y:

A_{n,n} =
\begin{bmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\
\vdots  & \vdots  & \ddots & \vdots  \\
a_{n,1} & a_{n,2} & \cdots & a_{n,n}
\end{bmatrix}

This A_{n,n} is a square matrix, and it has n^2 elements.

13 / 111

Matrix algebra: a very brief summary

Association matrices

Two important matrices can be derived from the ecological data matrix: the association matrix among objects and the association matrix among descriptors.

These matrices are the basis of Q-mode and R-mode analyses in ecology.

Y_{m,n} =
\begin{bmatrix}
y_{1,1} & y_{1,2} & \cdots & y_{1,n} \\
y_{2,1} & y_{2,2} & \cdots & y_{2,n} \\
\vdots  & \vdots  & \ddots & \vdots  \\
y_{m,1} & y_{m,2} & \cdots & y_{m,n}
\end{bmatrix}

A_{n,n} =
\begin{bmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\
\vdots  & \vdots  & \ddots & \vdots  \\
a_{n,1} & a_{n,2} & \cdots & a_{n,n}
\end{bmatrix}

R-mode analysis: based on the association matrix among descriptors or species (A_{n,n})

A_{m,m} =
\begin{bmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,m} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,m} \\
\vdots  & \vdots  & \ddots & \vdots  \\
a_{m,1} & a_{m,2} & \cdots & a_{m,m}
\end{bmatrix}

Q-mode analysis: based on the association matrix among objects or sites (A_{m,m}) — see the short sketch below.
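As a concrete illustration (not from the original slides), here is a minimal sketch in base R of the two association matrices, using a toy 4-object by 3-descriptor matrix Y; dist() and cor() are just two possible association measures among the many introduced next.

# A minimal sketch: Q-mode and R-mode association matrices from a toy matrix Y
Y <- matrix(c(1, 0, 3, 2,    # descriptor y1
              4, 1, 0, 5,    # descriptor y2
              0, 2, 2, 1),   # descriptor y3
            nrow = 4,
            dimnames = list(paste0("x", 1:4), paste0("y", 1:3)))
A.objects     <- as.matrix(dist(Y))  # Q-mode: among the m = 4 objects  (4 x 4)
A.descriptors <- cor(Y)              # R-mode: among the n = 3 descriptors (3 x 3)
dim(A.objects)      # 4 4
dim(A.descriptors)  # 3 3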

14 / 111

Our plans for the next steps...

We will dive into R-mode and Q-mode analyses, and we will explore:

  1. Association coefficients: dissimilarity and similarity
  2. Composition data transformation
  3. Cluster analyses
  4. Ordinations in the reduced space


But, first, let us introduce a real data set.

It will be your time to get your hands dirty!

15 / 111

Doubs river fish communities

Verneaux (1973) proposed to use fish species to characterize ecological zones along European rivers and streams.

He collected data at 30 localities along the Doubs river, which runs near the France-Switzerland border, in the Jura Mountains.

He showed that fish communities were biological indicators of these water bodies.

These data are split into three matrices:

  1. The abundance of 27 fish species across the communities;
  2. The environmental variables recorded at each site; and,
  3. The geographical coordinates of each site.


Verneaux, J. (1973) Cours d'eau de Franche-Comté (Massif du Jura). Recherches écologiques sur le réseau hydrographique du Doubs. Essai de biotypologie. Thèse d'état, Besançon. 1–257.

16 / 111

Doubs river fish communities

You can download these datasets from r.qcbs.ca/workshops/r-workshop-09.

spe <- read.csv("data/doubsspe.csv",
row.names = 1)
env <- read.csv("data/doubsenv.csv",
row.names = 1)

This data can also be retrieved from the ade4 package:

library(ade4)
data(doubs)
spe <- doubs$fish
env <- doubs$env

Alternatively, from the codep package:

library(codep)
data(Doubs)
spe <- Doubs.fish
env <- Doubs.env
17 / 111

Doubs river environmental data

We can then explore the objects containing our newly loaded data.

Let us peek into the env data:

str(env)
# 'data.frame': 30 obs. of 11 variables:
# $ das: num 0.3 2.2 10.2 18.5 21.5 32.4 36.8 49.1 70.5 99 ...
# $ alt: int 934 932 914 854 849 846 841 792 752 617 ...
# $ pen: num 48 3 3.7 3.2 2.3 3.2 6.6 2.5 1.2 9.9 ...
# $ deb: num 0.84 1 1.8 2.53 2.64 2.86 4 1.3 4.8 10 ...
# $ pH : num 7.9 8 8.3 8 8.1 7.9 8.1 8.1 8 7.7 ...
# $ dur: int 45 40 52 72 84 60 88 94 90 82 ...
# $ pho: num 0.01 0.02 0.05 0.1 0.38 0.2 0.07 0.2 0.3 0.06 ...
# $ nit: num 0.2 0.2 0.22 0.21 0.52 0.15 0.15 0.41 0.82 0.75 ...
# $ amm: num 0 0.1 0.05 0 0.2 0 0 0.12 0.12 0.01 ...
# $ oxy: num 12.2 10.3 10.5 11 8 10.2 11.1 7 7.2 10 ...
# $ dbo: num 2.7 1.9 3.5 1.3 6.2 5.3 2.2 8.1 5.2 4.3 ...
Variable   Description
das        Distance from the source [km]
alt        Altitude [m a.s.l.]
pen        Slope [per thousand]
deb        Mean minimum discharge [m³/s]
pH         pH of water
dur        Ca concentration (hardness) [mg/L]
pho        Phosphate concentration [mg/L]
nit        Nitrate concentration [mg/L]
amm        Ammonium concentration [mg/L]
oxy        Dissolved oxygen [mg/L]
dbo        Biological oxygen demand [mg/L]
summary(env) # summary statistics
18 / 111

Doubs river fish communities

Let us peek into the spe data:

head(spe)[, 1:8]
# CHA TRU VAI LOC OMB BLA HOT TOX
# 1 0 3 0 0 0 0 0 0
# 2 0 5 4 3 0 0 0 0
# 3 0 5 5 5 0 0 0 0
# 4 0 4 5 5 0 0 0 0
# 5 0 2 3 2 0 0 0 0
# 6 0 3 4 5 0 0 0 0
str(spe)
...
# 'data.frame': 30 obs. of 27 variables:
# $ CHA: int 0 0 0 0 0 0 0 0 0 0 ...
# $ TRU: int 3 5 5 4 2 3 5 0 0 1 ...
# $ VAI: int 0 4 5 5 3 4 4 0 1 4 ...
# $ LOC: int 0 3 5 5 2 5 5 0 3 4 ...
# $ OMB: int 0 0 0 0 0 0 0 0 0 0 ...
# $ BLA: int 0 0 0 0 0 0 0 0 0 0 ...
# $ HOT: int 0 0 0 0 0 0 0 0 0 0 ...
...
# Try some of these!
names(spe) # names of objects
dim(spe) # dimensions
str(spe) # structure of objects
summary(spe) # summary statistics
head(spe) # first 6 rows
19 / 111

(Dis)similarity measures

Community resemblance is almost always assessed on the basis of species composition data in the form of a site-by-species data table Ym,n.

We can obtain an association matrix Am,m in the form of pairwise distances or dissimilarities Dm,m (or similarities Sm,m) and then analyse those distances.

In R, we can compute distance or dissimilarity matrices using stats::dist():

dist(spe)
# 1 2 3 4 5 6 7
# 2 5.385165
# 3 7.416198 2.449490
# 4 7.874008 4.123106 3.000000
# 5 10.816654 10.677078 10.862780 9.219544
# 6 7.348469 4.582576 4.123106 2.828427 8.185353
# 7 6.855655 2.449490 2.000000 3.605551 10.488088 3.605551
# 8 3.000000 7.071068 8.717798 8.774964 10.954451 7.937254 8.246211
# 9 7.810250 8.717798 9.380832 8.774964 9.380832 6.708204 8.485281
# 10 6.708204 5.099020 5.291503 5.000000 9.273618 3.605551 4.472136
# 11 4.472136 3.316625 5.000000 5.477226 10.246951 5.291503 4.795832
# 12 6.708204 3.162278 3.464102 4.582576 11.045361 4.795832 3.162278
# 13 7.071068 4.358899 5.196152 6.164414 11.532563 6.633250 5.385165
# 14 9.110434 6.324555 6.164414 6.557439 11.916375 7.000000 6.324555
# 15 9.899495 7.810250 7.681146 7.483315 10.816654 6.633250 6.855655
# 16 11.090537 9.899495 9.797959 9.327379 10.198039 8.426150 9.055385
# 17 10.630146 9.486833 9.486833 8.774964 9.797959 8.062258 9.055385
# 18 9.848858 9.591663 9.899495 8.660254 9.055385 7.810250 9.380832
# 19 11.704700 11.135529 11.045361 10.148892 9.380832 8.888194 10.770330
# 20 13.453624 14.212670 14.628739 13.453624 10.770330 12.206556 14.212670
# 21 14.456832 15.362291 15.811388 14.525839 11.489125 13.601471 15.556349
# 22 16.643317 17.663522 18.110770 16.763055 13.416408 15.779734 17.663522
# 23 3.872983 7.483315 9.055385 9.000000 10.583005 7.937254 8.485281
# 24 7.071068 9.539392 10.816654 10.583005 11.357817 9.486833 10.246951
# 25 5.291503 8.306624 9.643651 9.273618 9.746794 8.246211 9.110434
# 26 11.135529 12.609520 13.304135 12.409674 10.723805 11.224972 12.922848
# 27 15.362291 16.462078 16.881943 15.811388 13.076697 14.696938 16.583124
# 28 16.522712 17.549929 18.000000 16.941074 14.352700 15.905974 17.606817
# 29 18.761663 19.364917 19.621417 18.384776 15.524175 17.776389 19.364917
# 30 20.396078 21.377558 21.794495 20.542639 17.117243 19.849433 21.517435
# 8 9 10 11 12 13 14
# 2
# 3
# 4
# 5
# 6
# 7
# 8
# 9 7.211103
# 10 6.480741 6.480741
# 11 5.385165 7.549834 4.582576
# 12 8.124038 8.717798 5.477226 3.872983
# 13 8.426150 10.049876 6.855655 4.000000 3.316625
# 14 10.198039 10.583005 7.615773 6.244998 4.242641 3.316625
# 15 10.630146 9.746794 6.855655 7.745967 6.403124 6.633250 5.000000
# 16 11.489125 10.862780 8.485281 10.049876 9.591663 9.746794 8.944272
# 17 10.770330 9.899495 8.246211 9.219544 9.273618 9.643651 9.380832
# 18 9.695360 8.602325 7.874008 8.774964 9.380832 9.949874 9.797959
# 19 11.313708 9.486833 9.273618 11.000000 11.313708 12.041595 11.832160
# 20 13.114877 11.489125 12.727922 13.527749 14.422205 15.000000 14.966630
# 21 14.142136 13.266499 14.142136 14.662878 15.684387 16.031220 16.062378
# 22 16.370706 15.033296 16.248077 16.941074 17.832555 18.303005 18.165902
# 23 2.449490 6.324555 6.633250 5.744563 8.366600 8.774964 10.392305
# 24 6.403124 7.549834 8.544004 8.124038 10.148892 10.583005 11.789826
# 25 4.358899 7.280110 7.000000 6.782330 9.110434 9.486833 10.723805
# 26 10.723805 10.148892 11.445523 11.747340 13.000000 13.490738 13.892444
# 27 15.066519 13.892444 15.264338 15.748016 16.703293 17.146428 17.117243
# 28 16.248077 14.899664 16.309506 16.822604 17.720045 18.193405 18.220867
# 29 18.681542 17.521415 18.303005 18.761663 19.416488 19.646883 19.313208
# 30 20.174241 19.467922 20.371549 20.736441 21.610183 21.863211 21.931712
# 15 16 17 18 19 20 21
# 2
# 3
# 4
# 5
# 6
# 7
# 8
# 9
# 10
# 11
# 12
# 13
# 14
# 15
# 16 5.916080
# 17 8.185353 6.164414
# 18 8.660254 7.615773 3.464102
# 19 10.630146 9.165151 6.633250 6.000000
# 20 13.674794 12.649111 9.380832 7.615773 6.324555
# 21 15.066519 13.928388 11.224972 9.380832 7.874008 3.741657
# 22 16.763055 15.556349 13.416408 11.489125 10.583005 6.480741 4.000000
# 23 10.630146 11.489125 10.295630 9.055385 10.488088 11.916375 13.114877
# 24 11.747340 12.449900 10.723805 9.110434 9.746794 10.148892 11.269428
# 25 10.583005 11.180340 10.148892 8.660254 9.539392 10.630146 11.618950
# 26 13.190906 13.076697 10.816654 8.774964 7.810250 5.385165 5.385165
# 27 16.062378 15.329710 13.304135 11.532563 10.246951 6.244998 4.582576
# 28 17.175564 16.370706 14.352700 12.489996 11.489125 7.745967 6.000000
# 29 18.330303 17.058722 14.662878 13.076697 12.767145 9.000000 7.000000
# 30 21.023796 19.621417 17.117243 15.652476 15.000000 11.180340 9.000000
# 22 23 24 25 26 27 28
# 2
# 3
# 4
# 5
# 6
# 7
# 8
# 9
# 10
# 11
# 12
# 13
# 14
# 15
# 16
# 17
# 18
# 19
# 20
# 21
# 22
# 23 15.362291
# 24 13.304135 4.358899
# 25 13.964240 3.000000 3.741657
# 26 7.549834 9.433981 7.071068 8.000000
# 27 5.000000 14.035669 11.916375 12.649111 5.291503
# 28 5.099020 15.231546 13.076697 13.964240 7.000000 3.872983
# 29 5.567764 17.804494 15.874508 16.431677 9.899495 6.633250 5.567764
# 30 8.185353 19.416488 17.832555 18.055470 11.916375 9.165151 7.280110
# 29
# 2
# 3
# 4
# 5
# 6
# 7
# 8
# 9
# 10
# 11
# 12
# 13
# 14
# 15
# 16
# 17
# 18
# 19
# 20
# 21
# 22
# 23
# 24
# 25
# 26
# 27
# 28
# 29
# 30 6.480741
20 / 111

(Dis)similarity measures

For simplicity, we have done this without specifying arguments.

Run dist(spe) from your end, note what you observe and see what the commands below show you:

class(dist(spe))
str(dist(spe))
as.matrix(dist(spe))
dim(as.matrix(dist(spe)))
21 / 111

If possible, do this part with the participants.

Show them how a dist class object is structured. Highlight that it is a lower-triangular matrix, where the distances between the same objects are hidden.

(Dis)similarity measures

There are three groups of distance coefficients: metrics, ... , ... .

The first group consists of metrics, and its coefficients satisfy the following properties:

  1. minimum 0: if species a is equal to species b, then D(a, b) = 0;
  2. positiveness: if a ≠ b, then D(a, b) > 0;
  3. symmetry: D(a, b) = D(b, a);
  4. triangle inequality: D(a, b) + D(b, c) ≥ D(a, c). The sum of two sides of a triangle drawn in Euclidean space is equal to or greater than the third side.

We can spot all these properties below:

as.matrix(dist(spe))[1:3, 1:3]
# 1 2 3
# 1 0.000000 5.385165 7.416198
# 2 5.385165 0.000000 2.449490
# 3 7.416198 2.449490 0.000000
22 / 111

Highlight the triangular nature of the dist class, that it can be converted to a matrix, and the dimensions (noting that it has the same number of species).

(Dis)similarity measures

There are three groups of distance coefficients: metrics, semimetrics, ... .

The second group consists of semimetrics, and they violate the triangle inequality property:

  1. minimum 0: if species a is equal to species b, then D(a, b) = 0;
  2. positiveness: if a ≠ b, then D(a, b) > 0;
  3. symmetry: D(a, b) = D(b, a);
  4. triangle inequality: D(a, b) + D(b, c) may be ≥ or < D(a, c). The sum of two sides of a triangle drawn in Euclidean space is not necessarily equal to or greater than the third side.
23 / 111

(Dis)similarity measures

There are three groups of distance coefficients: metrics, semimetrics, nonmetrics.

The third group consists of nonmetrics, and they violate the positiveness property of metric distances:

  1. minimum 0: if species a is equal to species b, then D(a, b) = 0;
  2. positiveness: if a ≠ b, then D(a, b) may be > or < 0;
  3. symmetry: D(a, b) = D(b, a);
  4. triangle inequality: D(a, b) + D(b, c) may be ≥ or < D(a, c). The sum of two sides of a triangle drawn in Euclidean space is not necessarily equal to or greater than the third side.
24 / 111

(Dis)similarity measures: Euclidean distances

The most common metric distance measure is the Euclidean distance.

It is computed using the Pythagorean formula:

D_1(x_1, x_2) = \sqrt{\sum_{j=1}^{p} (y_{1j} - y_{2j})^2}

Using stats::dist(), we can compute it with:

spe.D.Euclid <- dist(x = spe,
method = "euclidean")

And, we can test whether a distance is Euclidean using:

is.euclid(spe.D.Euclid)
# [1] TRUE
25 / 111

The figure on the side needs to be finished. It is supposed to represent the Euclidean distance between sites x1 and x2 in the 2D Cartesian space. The axes are the descriptors y1 and y2, respectively. The arrow on the bottom is the distance between the position of y21 and y11.

You can use it to recall the Pythagorean theorem that everyone learns in high school, c^2 = a^2 + b^2, where c denotes the length of the hypotenuse and a and b denote the lengths of the legs of a right triangle.

method = "euclidean" is the default parameter

Challenge #1

Your turn! Using the dist() function, compute the Euclidean distance matrix for the site-by-species abundance matrix Y.hmm below:

Y.hmm <- data.frame(
y1 = c(0, 0, 1),
y2 = c(4, 1, 0),
y3 = c(8, 1, 0))
Sites y1 y2 y3
s1 0 4 8
s2 0 1 1
s3 1 0 0

After this, look into the numbers, think critically about them, and be ready to chat! (5 to 10 minutes)

Solution:

Y.hmm.DistEu <- dist(x = Y.hmm,
method = "euclidean")
as.matrix(Y.hmm.DistEu)

Output:

# 1 2 3
# 1 0.000000 7.615773 9.000000
# 2 7.615773 0.000000 1.732051
# 3 9.000000 1.732051 0.000000


Tip: Look into the composition and the distances between sites s2 and s3 and between s1 and s2.

26 / 111

Challenge #1

Solution:

Y.hmm
# y1 y2 y3
# 1 0 4 8
# 2 0 1 1
# 3 1 0 0

Output:

as.matrix(Y.hmm.DistEu)
# 1 2 3
# 1 0.000000 7.615773 9.000000
# 2 7.615773 0.000000 1.732051
# 3 9.000000 1.732051 0.000000

The Euclidean distance between sites s2 and s3, which have no species in common, is smaller than the distance between s1 and s2, which share species y2 and y3 (!).

From an ecological perspective, this is a problematic assessment of the relationship among sites.

This issue is known as the double-zero problem, i.e. double zeroes are treated in the same way as double presences, so that the double zeros shrink the distance between two sites.

Euclidean distances ( D1 ) should thus not be used to compare sites based on species abundances.

27 / 111

(Dis)similarity measures: Chord distances

Orlóci (1967) proposed the Chord distance to analyse community composition.

It consists of:

1. Normalizing the data, i.e. scaling each site vector to length 1 by dividing each species abundance by the square root of the sum of the squared abundances of all species at that site:

y'_{Uj} = \frac{y_{Uj}}{\sqrt{\sum_{j=1}^{p} y_{Uj}^2}}

2. Calculating the Euclidean distances on this normalized data:

D_3(x_1, x_2) = \sqrt{\sum_{j=1}^{p} (y'_{1j} - y'_{2j})^2}

We can use vegan::vegdist() for this one:

spe.D.Ch <- vegdist(spe,
method = "chord")
as.matrix(spe.D.Ch)[1:3, 1:3]
# 1 2 3
# 1 0.0000000 0.7653669 0.9235374
# 2 0.7653669 0.0000000 0.2309609
# 3 0.9235374 0.2309609 0.0000000

When two sites share the same species in the same proportions of the number of individuals, the value of D_3 is 0, and when no species are shared, its value is \sqrt{2} (≈ 1.414).

28 / 111

method = "euclidean" is the default parametre

(Dis)similarity measures: Chord distances

There are many other metric distances of interest, which we can use:

What happens if we compute Chord distances in the same site-by-species matrix Yhmm?

Y.hmm
# y1 y2 y3
# 1 0 4 8
# 2 0 1 1
# 3 1 0 0
as.matrix(Y.hmm.DistEu)
# 1 2 3
# 1 0.000000 7.615773 9.000000
# 2 7.615773 0.000000 1.732051
# 3 9.000000 1.732051 0.000000
Y.hmm.DistCh <- vegdist(Y.hmm,
method = "chord")
as.matrix(Y.hmm.DistCh)
# 1 2 3
# 1 0.0000000 0.3203645 1.414214
# 2 0.3203645 0.0000000 1.414214
# 3 1.4142136 1.4142136 0.000000
29 / 111

(Dis)similarity measures: Chord distances

There are many other metric distances of interest, which we can use:

What happens if we compute Chord distances in the same site-by-species matrix Yhmm?

Y.hmm
# y1 y2 y3
# 1 0 4 8
# 2 0 1 1
# 3 1 0 0
as.matrix(Y.hmm.DistEu)
# 1 2 3
# 1 0.000000 7.615773 9.000000
# 2 7.615773 0.000000 1.732051
# 3 9.000000 1.732051 0.000000
Y.hmm.DistCh <- vegdist(Y.hmm,
method = "chord")
as.matrix(Y.hmm.DistCh)
# 1 2 3
# 1 0.0000000 0.3203645 1.414214
# 2 0.3203645 0.0000000 1.414214
# 3 1.4142136 1.4142136 0.000000

Adding any number of double zeroes to a pair of sites does not change the value of D3.

Hence, Chord distances can be used to compare sites described by species abundances!

29 / 111

method = "euclidean" is the default parameter

(Dis)similarity measures: Jaccard's

Another popular association coefficient is the Jaccard similarity coefficient (1900).

It is only appropriate for binary data, and its distance coefficient is defined with the size of the intersection divided by the size of the union of the sample sets.

D7(x1,x2)=1|x1x2||x1x2|=1|x1x2||x1|+|x2||x1x2|=1aa+b+c

where,

  • a is the number of species shared between x1 and x2 that are coded 1;
  • b is the number of occurrences where x1 and x2 are known to be different;
  • c is the number of common absences between x1 and x2, i.e. both 0.
30 / 111

(Dis)similarity measures: Jaccard's

Another popular association coefficient is the Jaccard similarity coefficient (1900).

It is only appropriate for binary (presence-absence) data, and its distance coefficient is defined as one minus the size of the intersection divided by the size of the union of the two sample sets.

D_7(x_1, x_2) = 1 - \frac{|x_1 \cap x_2|}{|x_1 \cup x_2|} = 1 - \frac{|x_1 \cap x_2|}{|x_1| + |x_2| - |x_1 \cap x_2|} = 1 - \frac{a}{a + b + c}

where,

  • a is the number of species present in both x1 and x2 (double presences, coded 1-1);
  • b is the number of species present in x1 but absent from x2;
  • c is the number of species present in x2 but absent from x1.

Species absent from both sites (double zeros) do not enter the calculation.


For example, for sites x1 and x2:

x1,x2 y1 y2 y3 y4 y5
x1 0 1 0 1 0
x2 0 1 1 1 1

So:

  • a = 2 (y2 and y4 are present in both sites)
  • b = 0 (no species is present in x1 only)
  • c = 2 (y3 and y5 are present in x2 only)


D_7(x_1, x_2) = 1 - \frac{2}{2 + 0 + 2} = 0.5
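A quick check of the worked example, as a minimal sketch assuming the vegan package loaded earlier:

x.pair <- rbind(x1 = c(0, 1, 0, 1, 0),
                x2 = c(0, 1, 1, 1, 1))
vegdist(x.pair, method = "jaccard", binary = TRUE)
# 0.5, i.e. 1 - 2/4: the double absence (y1) does not enter the calculation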

30 / 111

(Dis)similarity measures: Sørensen's

All parameters in Jaccard's similarity coefficient have equal weights.

D_7(x_1, x_2) = 1 - \frac{a}{a + b + c}

However, you may want to consider that a presence of a species is more informative than its absence.

The distance corresponding to Sørensen's similarity coefficient (1948) gives weight to double presences:

D_{13}(x_1, x_2) = 1 - \frac{2a}{2a + b + c} = \frac{b + c}{2a + b + c}

where,

  • a is the number of species present in both x1 and x2 (double presences, coded 1-1);
  • b is the number of species present in x1 but absent from x2;
  • c is the number of species present in x2 but absent from x1.
31 / 111

(Dis)similarity measures: Bray-Curtis

The Bray-Curtis dissimilarity coefficient is a modified version of the Sørensen's index and allows for species abundances:

D_{14}(x_1, x_2) = \frac{\sum_j |y_{1j} - y_{2j}|}{\sum_j (y_{1j} + y_{2j})} = 1 - \frac{2W}{A + B}

where,

  • W is the sum of the lowest abundances in each species found between sites x1 and x2;
  • A is the sum of all abundances in x1; and,
  • B is the sum of all abundances in x2.

For example, for sites x1 and x2:

x1,x2 y1 y2 y3 y4 y5
x1 2 1 0 5 2
x2 5 1 3 1 1


So:

  • W = 2 + 1 + 0 + 1 + 1 = 5
  • A = 2 + 1 + 0 + 5 + 2 = 10
  • B = 5 + 1 + 3 + 1 + 1 = 11

D_{14}(x_1, x_2) = 1 - \frac{2 \times 5}{10 + 11} ≈ 0.52
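A quick numerical check of this example, as a minimal sketch assuming the vegan package loaded earlier:

x.pair <- rbind(x1 = c(2, 1, 0, 5, 2),
                x2 = c(5, 1, 3, 1, 1))
vegdist(x.pair, method = "bray")
# ~0.524, i.e. 1 - (2 * 5) / (10 + 11)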

32 / 111
(Dis)similarity measures: Jaccard's and Sørensen's coefficients

In R, you can use the vegan::vegdist() function to calculate Jaccard's and Sørensen's indices:

spe.D.Jac <- vegdist(spe,
method = "jaccard",
binary = TRUE)
spe.D.Sor <- vegdist(spe,
method = "bray",
binary = TRUE)

Because both Jaccard's and Sørensen's are only appropriate for presence-absence data, you must binary-transform abundance data using binary = TRUE in vegdist().

Bray-Curtis dissimilarity coefficient

To calculate the Bray-Curtis dissimilarity coefficient, which can account for abundances, you need to set binary = FALSE.

spe.D.bc <- vegdist(spe,
                    method = "bray",
                    binary = FALSE)
spe.db <- as.matrix(spe.D.bc)
33 / 111

(Dis)similarity measures: Mahalanobis distance

In many datasets, variables are correlated, and using a distance metric that ignores this can lead to misleading results.

The Mahalanobis distance takes into account the correlation between variables in the data.

D^2_{Mahal}(x) = (x - \mu)^\intercal \, \Sigma^{-1} \, (x - \mu)

where D is the Mahalanobis distance, x is the point you are measuring the distance from, \mu is the mean vector of the distribution you are measuring against, and \Sigma is the covariance matrix of that distribution.

It is also robust to changes in the distribution of the data (it does not assume normality).

In R, we can compute it as follows:

# Set the seed for reproducibility
set.seed(123)
# Generate random data from a multivariate normal distribution
data <- data.frame(x = rnorm(100, mean = 10, sd = 2),
y = rnorm(100, mean = 5, sd = 1),
z = rnorm(100, mean = 20, sd = 4))
# Calculate the Mahalanobis distance
mahalanobis_dist <- mahalanobis(data,
center = colMeans(data),
cov = cov(data))
34 / 111

(Dis)similarity measures: representation

We can create graphical depictions of association matrices using the coldiss() function:

coldiss(spe.D.Jac)
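Note that coldiss() is a helper function distributed with the workshop material (it builds on the gclus package) rather than a base R function. If it is not available, a rough base R stand-in, as a minimal sketch with no reordering of sites, is:

jac.mat <- as.matrix(spe.D.Jac)
heatmap(jac.mat, Rowv = NA, Colv = NA, symm = TRUE, scale = "none",
        main = "Jaccard dissimilarities among sites")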

35 / 111

Transformations for community composition data

Communities sampled over homogeneous or short environmental gradients can have species compositions with few zeroes, so Euclidean distances could be enough to characterize them.

Nevertheless, this is rarely the reality.

Species may be highly frequent when conditions are favourable, or may be absent from many sites. Sometimes, this skewness can distort our analyses.

We may then have to transform our composition data to appropriately analyze it.

In R, we can rely on vegan::decostand() for many types of transformations.

Take a look into the help of this function to see the available options:

?decostand()

Let us see some of them.

36 / 111

Transformations: presence-absence

We can change the argument method to "pa" to transform our abundance data into presence-absence data:

If y_{ij} \geq 1, then y'_{ij} = 1.

Let us recall our spe data set:

spe[1:6, 1:6]
# CHA TRU VAI LOC OMB BLA
# 1 0 3 0 0 0 0
# 2 0 5 4 3 0 0
# 3 0 5 5 5 0 0
# 4 0 4 5 5 0 0
# 5 0 2 3 2 0 0
# 6 0 3 4 5 0 0

Let us transform spe abundances to presence-absences:

spe.pa <- decostand(spe, method = "pa")
spe.pa[1:6, 1:6]
# CHA TRU VAI LOC OMB BLA
# 1 0 1 0 0 0 0
# 2 0 1 1 1 0 0
# 3 0 1 1 1 0 0
# 4 0 1 1 1 0 0
# 5 0 1 1 1 0 0
# 6 0 1 1 1 0 0
37 / 111

Transformations: species profiles

Sometimes, one wants to remove the effects of highly abundant units. We can transform the data into profiles of relative species abundances through the following equation:

y'_{ij} = \frac{y_{ij}}{y_{i+}}

Let us recall our spe dataset:

spe[1:5, 1:6]
# CHA TRU VAI LOC OMB BLA
# 1 0 3 0 0 0 0
# 2 0 5 4 3 0 0
# 3 0 5 5 5 0 0
# 4 0 4 5 5 0 0
# 5 0 2 3 2 0 0

In decostand():

spe.total <- decostand(spe,
method = "total")
spe.total[1:5, 1:6]
# CHA TRU VAI LOC OMB BLA
# 1 0 1.00000000 0.00000000 0.00000000 0 0
# 2 0 0.41666667 0.33333333 0.25000000 0 0
# 3 0 0.31250000 0.31250000 0.31250000 0 0
# 4 0 0.19047619 0.23809524 0.23809524 0 0
# 5 0 0.05882353 0.08823529 0.05882353 0 0
38 / 111

y_{i+} denotes the total abundance over all species in sample i (the row sum).

Transformations: Hellinger

We can take the square-root of the species profile transformation and obtain the Hellinger transformation, which has very good mathematical properties and allows us to reduce the effects of yij values that are extremely large.

y'_{ij} = \sqrt{\frac{y_{ij}}{y_{i+}}}

Let us recall our spe dataset:

spe[1:5, 1:6]
# CHA TRU VAI LOC OMB BLA
# 1 0 3 0 0 0 0
# 2 0 5 4 3 0 0
# 3 0 5 5 5 0 0
# 4 0 4 5 5 0 0
# 5 0 2 3 2 0 0

In decostand():

spe.hel <- decostand(spe,
method = "hellinger")
spe.hel[1:5, 1:6]
# CHA TRU VAI LOC OMB BLA
# 1 0 1.0000000 0.0000000 0.0000000 0 0
# 2 0 0.6454972 0.5773503 0.5000000 0 0
# 3 0 0.5590170 0.5590170 0.5590170 0 0
# 4 0 0.4364358 0.4879500 0.4879500 0 0
# 5 0 0.2425356 0.2970443 0.2425356 0 0
39 / 111

Transformations: Standardization

Standardizing environmental variables is crucial as you cannot compare the effects of variables with different units:

## ?decostand
env.z <- decostand(env, method = "standardize")


This centres and scales the variables to make your downstream analysis more appropriate:

apply(env.z, 2, mean)
# das alt pen deb pH
# 1.000429e-16 1.814232e-18 -1.659010e-17 1.233099e-17 -4.096709e-15
# dur pho nit amm oxy
# 3.348595e-16 1.327063e-17 -8.925898e-17 -4.289646e-17 -2.886092e-16
# dbo
# 7.656545e-17
apply(env.z, 2, sd)
# das alt pen deb pH dur pho nit amm oxy dbo
# 1 1 1 1 1 1 1 1 1 1 1

More on this later!

40 / 111

Clustering

41 / 111

Clustering

One application of association matrices is clustering.

Clustering highlights structures in the data by partitioning either the objects or the descriptors.

One goal of ecologists could be to divide a set of sites into groups with respect to their environmental conditions or their community composition.

Its results can be represented as dendrograms (tree-like diagrams), which describe how closely related the observations are.

42 / 111

Hierarchical clustering methods

Clustering algorithms generally build on this series of steps:

  1. calculate an association matrix with all pair-wise similarities among all objects;
  2. Join together pairs of objects that are most similar (or dissimilar);
  3. Recompute the similarity matrix for that cluster versus all remaining objects;
  4. Repeat steps 2 and 3 until all objects are joined.


From the many existing hierarchical clustering algorithms, we will explore:

  1. Single linkage agglomerative clustering;
  2. Complete linkage, agglomerative clustering;
  3. Ward's minimum variance clustering.


In R, we will use the functions hclust() and agnes() to build our dendrograms.

43 / 111

In single linkage agglomerative clustering (also called nearest neighbour sorting), the objects at the closest distances agglomerate, which often generates long, thin clusters or chains of objects. Conversely, in complete linkage agglomerative clustering, an object agglomerates to a group only when it is linked to the furthest element of the group, which in turn links it to all members of that group. It forms many small, separate groups and is more appropriate for looking for contrasts and discontinuities in the data. Ward's minimum variance clustering differs from these two methods in that it clusters objects into groups using the criterion of least squares (similar to linear models). Its dendrogram shows squared distances by default; to compare with other methods, calculate the square root of the distances first.

Complete linkage clustering

  • The objects are divided into small groups (1-2, 3-4, 5)

  • Small groups are connected using the largest distance between their elements

  • (1-3 = 0.15, 2-4 = 0.35, 2-3 = 0.6; select 0.6 to connect groups 1-2 and 3-4)

44 / 111

Comparison

Create a distance matrix from Hellinger transformed Doubs river data and compute the single linkage clustering:

spe.dhel <- vegdist(spe.hel, method = "euclidean")
spe.dhel.single <- hclust(spe.dhel, method = "single")
plot(spe.dhel.single)

45 / 111

Comparison

spe.dhel <- vegdist(spe.hel, method = "euclidean")
spe.dhel.complete <- hclust(spe.dhel, method = "complete")
plot(spe.dhel.single, main = "Single linkage clustering", hang = -1)
plot(spe.dhel.complete, main = "Complete linkage clustering", hang = -1)

Single linkage:

Chains of objects occur (e.g. 19,29,30,26)

Complete linkage: contrasted groups of objects are formed

46 / 111

Ward's minimum variance method

  • Uses the criterion of least squares to cluster objects into groups

  • At each step, the pair of clusters merging is the one leading to the minimum increase in total within-group sum of squares

  • Clusters generated using this method tend to be more spherical and to contain similar numbers of objects

47 / 111

Ward's method

Compute the Ward's minimum variance clustering and plot the dendrogram by using the square root of the distances:

spe.dhel.ward <- hclust(spe.dhel, method = "ward.D2")
spe.dhel.ward$height <- sqrt(spe.dhel.ward$height)
plot(spe.dhel.ward, hang = -1) # hang = -1 aligns objects at 0

Ward's clustering algorithm can only be applied to Euclidean distances. Bray-Curtis distances are not Euclidean, so they must be square-rooted before being used with Ward's algorithm.

48 / 111

Important questions to address

  • Choosing the 'appropriate' method depends on your objective;
  • Do you want to highlight gradients or contrasts?
  • This is not a statistical method, and you may need to rely on complementary methods to support your hypotheses.

However, clustering allows us to:

  • Determine the optimal number of interpretable clusters;
  • Compute clustering statistics;
  • Combine clustering with ordination to distinguish groups of sites (see the short sketch below).
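One common follow-up, as a minimal sketch using the Ward dendrogram computed above (k = 4 is an arbitrary choice here, not a recommendation):

groups <- cutree(spe.dhel.ward, k = 4)  # cut the dendrogram into k groups
table(groups)                           # number of sites assigned to each cluster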
49 / 111

Now, what?

While cluster analysis looks for discontinuities in a dataset, ordination extracts the main trends in the form of continuous axes.

From now, we will look into four types of unconstrained ordination methods...

...wait, what do we mean by unconstrained ordinations? Anyone?
If no one speaks out, choose a "volunteer", presenter!


Unconstrained ordinations assess relationships within a single set of variables. No attempt is made to define the relationship between a set of independent variables and one or more dependent variables.

In other words, the interpretation of potential effects of other factors that generated observed patterns can only be made indirectly, because those factors are not explicitly included in the analyses.

Here, we will explore:

Principal Component Analysis

Principal Coordinate Analysis

Correspondence Analysis

Non-Metric MultiDimensional Scaling

50 / 111

But first, let us recap...

We already understand the meaning of variance and mean, and how to calculate them:

\sigma^2_x = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}

\mu_x = \frac{1}{n}\sum_{i=1}^{n} x_i

These are very useful to understand the centre and the dispersion of a given variable or dimension.

Nevertheless, we are often interested in more than one dimension, and want to measure how much each dimension varies from the mean with respect to the others.

Covariance is such a measure, which depicts how two dimensions co-vary:

var_x = \sigma^2_x = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}

cov_{x,y} = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{N - 1}

51 / 111

But first, let us recap...

Intuitively, we can measure the covariance between more than two variables. Let us say, between the variables x, y, and z:

cov_{x,y} = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{N - 1}

cov_{x,z} = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(z_i - \bar{z})}{N - 1}

cov_{z,y} = \frac{\sum_{i=1}^{N} (z_i - \bar{z})(y_i - \bar{y})}{N - 1}

We can represent these calculations in a covariance matrix:

C_{(x,y,z)} =
\begin{bmatrix}
cov_{x,x} & cov_{x,y} & cov_{x,z} \\
cov_{y,x} & cov_{y,y} & cov_{y,z} \\
cov_{z,x} & cov_{z,y} & cov_{z,z}
\end{bmatrix}

QUIZ TIME

What are the diagonals? And, what happens if variables are independent?

52 / 111

Still recapping...

What are the diagonals?

If cov_{x,y} = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{N - 1}, then:

cov_{x,x} = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(x_i - \bar{x})}{N - 1} = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1} = var_x

So that:

C_{(x,y,z)} =
\begin{bmatrix}
cov_{x,x} & cov_{x,y} & cov_{x,z} \\
cov_{y,x} & cov_{y,y} & cov_{y,z} \\
cov_{z,x} & cov_{z,y} & cov_{z,z}
\end{bmatrix}
=
\begin{bmatrix}
var_x     & cov_{x,y} & cov_{x,z} \\
cov_{y,x} & var_y     & cov_{y,z} \\
cov_{z,x} & cov_{z,y} & var_z
\end{bmatrix}


The covariance of a variable with itself is its variance!

53 / 111

Still recapping...

What happens if the variables are independent?

x <- rnorm(5000, mean = 0, sd = 1)
y <- rnorm(5000, mean = 0, sd = 1)
z <- rnorm(5000, mean = 0, sd = 1)
xyz <- data.frame(x, y, z)
GGally::ggpairs(xyz)

cov(xyz)
# x y z
# x 0.98469 -0.00787 0.00880
# y -0.00787 1.00997 0.00286
# z 0.00880 0.00286 1.01408

If variables are perfectly independent (or uncorrelated), the covariance matrix C(x,y,z) is:

C_{(x,y,z)} =
\begin{bmatrix}
var_x & 0     & 0 \\
0     & var_y & 0 \\
0     & 0     & var_z
\end{bmatrix}

i.e. the off-diagonal covariances are (close to) zero. Here, because each variable has variance 1, a covariance close to 1 (or -1) would instead mean that the variables are collinear.

And, here, var_x = var_y = var_z = 1.

54 / 111

Linear transformations

We are often interested in observing variables in different ways.

In this process, we create a new variable, let us say xnew, by multiplying and/or adding the values of the original variable x by constants. For instance:

We can transform a variable of distances measured in kilometres, d_{km}, into miles as: d_{mi} = 0.62 \times d_{km}

We can also transform Fahrenheit degrees to Celsius degrees as:

T_{C} = 0.5556 \times T_{Fahr} - 17.778

These are examples of linear transformations: the transformed variables are linearly related to the original variables and the shapes of the distribution are not changed.

Two types of transformations are very important for us:

Centering, which subtracts the mean from the values of a predictor:

x' = x_i - \bar{x}

Scaling, which divides the predictor variables by their standard deviation:

x' = \frac{x_i}{\sigma_x}
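A minimal sketch (not from the original slides) showing centring and scaling on a toy vector, and that applying both matches base R's scale():

x <- c(2, 4, 6, 8)
x.centred <- x - mean(x)          # centred: the mean is now 0
x.std     <- x.centred / sd(x)    # centred and scaled: the standard deviation is now 1
c(mean(x.std), sd(x.std))         # ~0 and 1
all.equal(as.numeric(scale(x)), x.std)  # TRUE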

???

Centering is basically a technique where the mean of independent variables is subtracted from all the values. It means all independent variables have zero mean. Scaling is similar to centering. Predictor variables are divided by their standard deviation.

The presenter should mention here that centering brings the mean to zero and scaling brings the standard deviation to one-unit. They also should mention that variables become comparable when scaling, as their unit is lost.

55 / 111

Eigendecomposition

Square matrices, such as the covariance matrix, can be decomposed into Eigenvalues and Eigenvectors.

For a square matrix, A_{n \times n}, a vector v is an Eigenvector of A, if there is a scalar, \lambda, for which:

A_{n \times n} v_{n \times 1} = \lambda v_{n \times 1}, or \left(\begin{matrix}a_{11}&\cdots&a_{1n}\\\vdots&\ddots&\vdots\\a_{1n}&\cdots&a_{nn}\\\end{matrix}\right)\left(\begin{matrix}v_1\\\vdots\\v_n\\\end{matrix}\right)=\lambda\left(\begin{matrix}v_1\\\vdots\\v_n\\\end{matrix}\right)

with the value of \lambda being the corresponding Eigenvalue.

In other words, the matrix A effectively stretches the Eigenvector v by the amount specified by the Eigenvalue (scalar) \lambda.

An Eigenvector is a vector whose direction remains unchanged when a linear transformation is applied to it.
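We can verify this numerically with a minimal sketch (not from the original slides), using eigen() on the covariance matrix of the xyz data frame simulated a few slides back as our square, symmetric matrix A:

A <- cov(xyz)
eig <- eigen(A)
v1 <- eig$vectors[, 1]      # first Eigenvector
lambda1 <- eig$values[1]    # corresponding Eigenvalue
all.equal(as.numeric(A %*% v1), lambda1 * v1)  # TRUE: A stretches v1 by lambda1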


Wait! What do we mean by unchanged direction?

56 / 111

Eigendecomposition

Wait! What do we mean by unchanged direction?

Let us represent this with this simple example.

We can transform a square into a parallelogram using a single-axis shear transformation.

Let S be the square with vertices (0,0),\,(1,0),\,(1,1),\,(0,1) that will be shear-transformed into the parallelogram P with vertices (0,0),\,(1,0),\,(1.57,1),\,(0.57,1).

We can see that after the linear transformation, the purple arrow has not changed direction, i.e. it is an Eigenvector of S.

On the other hand, the red arrow changed direction, and thus is not an Eigenvector of S.

57 / 111

Eigendecomposition: implications

Keep up with the algebra torture!

Orthogonality

A fabulous and simple property of symmetric matrices that we can explain here!

Let us assume that x is an eigenvector of A corresponding to the eigenvalue λ_1 and y an eigenvector of A corresponding to the eigenvalue λ_2, with λ_1≠λ_2.

Ax=\lambda_1x \\ Ay=\lambda_2y

Let us multiply each one by the other transposed Eigenvector. y^{\intercal}Ax=\lambda_1y^{\intercal}x \\ x^{\intercal}A^{\intercal}y=\lambda_2x^{\intercal}y

Now subtract the second equation from the first one and use the commutativity of the scalar product:

y^{\intercal}Ax-x^{\intercal}A^{\intercal}y=\lambda_1y^{\intercal}x - \lambda_2x^{\intercal}y \\ 0 = (\lambda_1 - \lambda_2)y^{\intercal}x

Because we know that \lambda_1-\lambda_2\neq 0, it follows that y^{\intercal}x = 0, i.e. \mathbf{x}\perp\mathbf{y}: the two Eigenvectors are orthogonal!



So, what does the Eigendecomposition of a variance-covariance matrix tell us?

58 / 111

Hi, Presenter. The explanation of this part is very useful and quite simple, so everyone can understand what orthogonality is. It is a matter of simple equation operations and subtractions.

Eigendecomposition: implications

Keep up with the algebra torture!

Maximization

If v_i' v_i = 1, then Av_i=\lambda_iv_i can be written as: v_i' A v_i = \lambda_i. In fact, if A is the variance-covariance matrix of the data X, then v' A v is the variance of the linear combination of the original variables with weights in v, i.e. \text{Var}(X\,v_i)=v_i'\,\text{Var}(X)\,v_i = v_i'\,A\,v_i.

Hence, we can connect the dots!

Remember that the Eigenvalues in our variance-covariance matrix A are directly related to the variance!

Thus, to find a vector v that maximizes the variance, v' A v, all we must do is to choose the Eigenvector corresponding to the largest Eigenvalue \lambda_i!

So that the maximum variance is \lambda_1!

The explained variance of each Eigenvector obeys the order: \lambda_1 > \lambda_2 > \dots > \lambda_k.

This allows us to condense a larger number of original variables into a smaller set of selected vectors with minimal loss of information (i.e. dimensionality reduction).

59 / 111
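A minimal numerical sketch of this maximization property (using simulated correlated data, not one of the workshop datasets): the variance of the data projected onto the first Eigenvector equals \lambda_1, and any other unit vector gives a smaller variance.

set.seed(42)
X <- scale(matrix(rnorm(200), ncol = 2) %*% matrix(c(1, 0.6, 0, 1), nrow = 2)) # correlated data
A <- cov(X) # variance-covariance matrix
v1 <- eigen(A)$vectors[, 1] # Eigenvector with the largest Eigenvalue
var(X %*% v1) # variance along v1 ...
eigen(A)$values[1] # ... equals lambda_1
u <- c(1, 1) / sqrt(2) # any other unit vector
var(X %*% u) # gives a smaller variance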

Unconstrained ordination methods

This is a good starting point for the unconstrained ordination methods we will study today!

They allow us to:

  • Assess relationships within a set of variables (species or environmental variables);

  • Find key components of variation among samples, sites, species;

  • Reduce the number of dimensions in multivariate data while limiting substantial loss of information;

  • Create new variables for use in subsequent analyses.

Here, we will learn:

  1. Principal Component Analysis;

  2. Principal Coordinate Analysis;

  3. Non-Metric Multidimensional Scaling;

60 / 111

Principal Component Analysis

Principal Component Analysis (PCA) is a linear dimensionality-reduction technique: it summarizes sets of strongly correlated variables into a smaller number of uncorrelated components.

In a nutshell, PCA linearly transforms the features from the original space to a new feature space whose axes, the principal components, successively explain as much of the variance in the dataset as possible.

61 / 111

Principal Component Analysis

Principal Component Analysis (PCA) is a linear dimensionality-reduction technique: it summarizes sets of strongly correlated variables into a smaller number of uncorrelated components.

In a nutshell, PCA linearly transforms the features from the original space to a new feature space whose axes, the principal components, successively explain as much of the variance in the dataset as possible.

The principal component space can be written as:

Z_m = \sum_{j=1}^{p} \phi_{jm} X_j where,

  1. Z_m is the principal component m;
  2. \phi_{jm} are the loadings of principal component m, i.e. the coefficients of the linear combination of the original variables from which the principal component is constructed;
  3. X_j are the standardized predictors, i.e. with means equal to zero and standard deviations equal to one.
61 / 111

That is: the data can be reconstructed as a simple linear combination of the components.

Future versions of this should include a figure (such as a heatmap) with the original standardized data on the left side, and on the right side, a heatmap of the loadings times a heatmap of the components (which can be thought of as the sum of rank-one matrices) being equal to the heatmap of the reconstructed data.

Principal Component Analysis

PCA can be computed in at least four different ways.

For the sake of simplicity, we will focus here on how to obtain principal components from a correlation matrix.

We will learn how to do it from "scratch" and then how to use R packages to compute the principal components.

62 / 111

Principal Component Analysis: step-by-step

  1. Starting point: a matrix Y of n observations and p normally distributed continuous variables;

In R, from scratch!

data(varechem)
str(varechem)
# 'data.frame': 24 obs. of 14 variables:
# $ N : num 19.8 13.4 20.2 20.6 23.8 22.8 26.6 24.2 29.8 28.1 ...
# $ P : num 42.1 39.1 67.7 60.8 54.5 40.9 36.7 31 73.5 40.5 ...
# $ K : num 140 167 207 234 181 ...
# $ Ca : num 519 357 973 834 777 ...
# $ Mg : num 90 70.7 209.1 127.2 125.8 ...
# $ S : num 32.3 35.2 58.1 40.7 39.5 40.8 33.8 27.1 42.5 60.2 ...
# $ Al : num 39 88.1 138 15.4 24.2 ...
# $ Fe : num 40.9 39 35.4 4.4 3 ...
# $ Mn : num 58.1 52.4 32.1 132 50.1 ...
# $ Zn : num 4.5 5.4 16.8 10.7 6.6 9.1 7.4 5.2 9.3 9.1 ...
# $ Mo : num 0.3 0.3 0.8 0.2 0.3 0.4 0.3 0.3 0.3 0.5 ...
# $ Baresoil: num 43.9 23.6 21.2 18.7 46 40.5 23 29.8 17.6 29.9 ...
# $ Humdepth: num 2.2 2.2 2 2.9 3 3.8 2.8 2 3 2.2 ...
# $ pH : num 2.7 2.8 3 2.8 2.7 2.7 2.8 2.8 2.8 2.8 ...
63 / 111

Principal Component Analysis: step-by-step

  1. Starting point: a matrix Y of n observations and p normally distributed continuous variables;

In R, from scratch!

data(varechem)
# Step 1
Y <- varechem[, 1:2]
head(Y)
# N P
# 18 19.8 42.1
# 15 13.4 39.1
# 24 20.2 67.7
# 27 20.6 60.8
# 23 23.8 54.5
# 19 22.8 40.9
64 / 111

Principal Component Analysis: step-by-step

  1. Starting point: a matrix Y of n observations and p normally distributed continuous variables;

  2. Standardizing observations, as in Y_{std} = \frac{y_i - \bar{y}}{\sigma_y}; which is the same as centring, as in y_c = [y_i - \bar{y}], and then scaling, as in y_s = \frac{y_i}{\sigma_y};

In R, from scratch!

data(varechem)
# Step 1
Y <- varechem[, 1:2]
# Step 2
Y_std <- as.matrix(scale(Y))
head(Y_std)
# N P
# 18 -0.46731082 -0.1993234
# 15 -1.62503567 -0.4000407
# 24 -0.39495301 1.5134640
# 27 -0.32259521 1.0518143
# 23 0.25626722 0.6303080
# 19 0.07537271 -0.2796103
round(apply(Y_std, 2, mean))
# N P
# 0 0
round(apply(Y_std, 2, sd))
# N P
# 1 1
65 / 111

The presenter may recall here or even ask the participants what the standardization is doing.
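To make this concrete, here is a small check (a sketch using the Y and Y_std objects defined above) showing that scale() is exactly the centring-then-scaling described in step 2:

# Centring (subtract column means) then scaling (divide by column standard deviations) by hand
Y_manual <- sweep(sweep(Y, 2, colMeans(Y), "-"), 2, apply(Y, 2, sd), "/")
all.equal(as.matrix(Y_manual), Y_std, check.attributes = FALSE)
# should return TRUE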

Principal Component Analysis: step-by-step

  1. Starting point: a matrix Y of n observations and p normally distributed continuous variables;

  2. Standardizing observations, as in Y_{std} = \frac{y_i - \bar{y}}{\sigma_y}; which is the same as centring, as in y_c = [y_i - \bar{y}], and then scaling, as in y_s = \frac{y_i}{\sigma_y};

  3. Compute the variance-covariance matrix R = cov(Y_{std});

In R, from scratch!

data(varechem)
# Step 1
Y <- varechem[, 1:2]
# Step 2
Y_std <- as.matrix(scale(Y))
# Step 3
(Y_R <- cov(Y_std))
# N P
# N 1.0000000 -0.2511603
# P -0.2511603 1.0000000
66 / 111

Principal Component Analysis: step-by-step

  1. Starting point: a matrix Y of n observations and p normally distributed continuous variables;

  2. Standardizing observations, as in Y_{std} = \frac{y_i - \bar{y}}{\sigma_y}; which is the same as centring, as in y_c = [y_i - \bar{y}], and then scaling, as in y_s = \frac{y_i}{\sigma_y};

  3. Compute the variance-covariance matrix R = cov(Y_{std});

  4. Perform the Eigendecomposition of the covariance matrix to obtain the matrix U of Eigenvectors, containing the Principal Components;

In R, from scratch!

data(varechem)
# Step 1
Y <- varechem[, 1:2]
# Step 2
Y_std <- as.matrix(scale(Y))
# Step 3
Y_R <- cov(Y_std)
# Step 4
(Eigenvalues <- eigen(Y_R)$values)
# [1] 1.2511603 0.7488397
(Eigenvectors <- eigen(Y_R)$vectors)
# [,1] [,2]
# [1,] -0.7071068 -0.7071068
# [2,] 0.7071068 -0.7071068
67 / 111

Principal Component Analysis: step-by-step

The Eigenvectors here are the Principal Components, and as we have seen, each Eigenvector has its corresponding Eigenvalue.

We can represent the distances from the observations to the first Eigenvector (PC1, in red).

The first principal component is drawn so that the variation of the values along its line is maximal.

The arrows on the principal components are obtained by multiplying their Eigenvalues by the Eigenvectors.

68 / 111

In this first representation, we can observe that a first direction (or the first linear component) is drawn attempting to maximize the variance of the data.

Participants may ask you how different this is from a linear regression. One key difference is that here the squared errors are minimized perpendicularly to the straight line (at 90 degrees, making it orthogonal), while in linear regression the squared errors are minimized along the y-direction.

Principal Component Analysis: step-by-step

The Eigenvectors here are the Principal Components, and as we have seen, each Eigenvector has its corresponding Eigenvalue.

We can then represent the distances from the observations to the second Eigenvector (PC2, in orange).

The second principal component is also drawn maximizing the variance of the data.

Note how the principal components are orthogonal!

69 / 111

Principal Component Analysis: step-by-step

The Eigenvectors here are the Principal Components, and as we have seen, each Eigenvector has its corresponding Eigenvalue.

We can then represent the distances from the observations to the second Eigenvector (PC2, in orange).

The second principal component is also drawn maximizing the variance of the data.

Note how the principal components are orthogonal!

We represented the Eigenvectors, i.e. the principal components!

But, what is the use of the Eigenvalues?

69 / 111

Here, a second direction (or second linear component) is drawn such that the variance of the data with respect to this second component is maximized.

Principal Component Analysis: step-by-step

We have seen that the Eigenvalues represent the magnitude (the variance) in the principal components.

In fact, the sum of all Eigenvalues is equal to the sum of variances, which are represented on the diagonal of the variance-covariance matrix.

sum(diag(cov(Y_std)))
# [1] 2
sum(eigen(cov(Y_std))$values)
# [1] 2
70 / 111

Principal Component Analysis: step-by-step

We have seen that the Eigenvalues represent the magnitude (the variance) in the principal components.

In fact, the sum of all Eigenvalues is equal to the sum of variances, which are represented on the diagonal of the variance-covariance matrix.

sum(diag(cov(Y_std)))
# [1] 2
sum(eigen(cov(Y_std))$values)
# [1] 2

Intuitively, one can obtain the relative influence of each Eigenvector v_{k} (or \text{PC}_{k}) by dividing its Eigenvalue by the sum of all Eigenvalues.

\text{Explained variance of}~v_{k} = \frac{\lambda_{k}}{\sum^p_{i=1}{\lambda_{i}}}

By doing this, we can say that the \text{PC}1 explains 63% of the variance in the data, while \text{PC}2 explains 37% of the variance.

70 / 111

Principal Component Analysis: step-by-step

We have seen that the Eigenvalues represent the magnitude (the variance) in the principal components.

In fact, the sum of all Eigenvalues is equal to the sum of variances, which are represented on the diagonal of the variance-covariance matrix.

sum(diag(cov(Y_std)))
# [1] 2
sum(eigen(cov(Y_std))$values)
# [1] 2

Intuitively, one can obtain the relative influence of each Eigenvector v_{k} (or \text{PC}_{k}) by dividing its Eigenvalue by the sum of all Eigenvalues.

\text{Explained variance of}~v_{k} = \frac{\lambda_{k}}{\sum^p_{i=1}{\lambda_{i}}}

By doing this, we can say that the \text{PC}1 explains 63% of the variance in the data, while \text{PC}2 explains 37% of the variance.

Finally, we can proceed to the last step of our computation of principal components!

70 / 111
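With the Eigenvalues computed in step 4, this is a one-liner in R:

# Relative influence (explained variance) of each Eigenvector
Eigenvalues / sum(Eigenvalues)
# PC1 explains ~63% and PC2 ~37% of the variance, as stated above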

Principal Component Analysis: step-by-step

  1. Starting point: a matrix Y of n observations and p normally distributed continuous variables;

  2. Standardizing observations, as in Y_{std} = \frac{y_i - \bar{y}}{\sigma_y}; which is the same as centring, as in y_c = [y_i - \bar{y}], and then scaling, as in y_s = \frac{y_i}{\sigma_y};

  3. Compute the variance-covariance matrix R = cov(Y_{std});

  4. Perform the Eigendecomposition of the covariance matrix to obtain the matrix U of Eigenvectors, containing the Principal Components;

  5. Obtain the feature space by multiplying U with the standardized matrix Y_{std}, i.e. the score matrix F.

In R, from scratch!

# Step 1
Y <- varechem[, 1:2]
# Step 2
Y_std <- as.matrix(scale(Y))
# Step 3
Y_R <- cov(Y_std)
# Step 4
Eigenvalues <- eigen(Y_R)$values
Eigenvectors <- eigen(Y_R)$vectors
# Step 5
F_PrComps <- Y_std %*% Eigenvectors
head(F_PrComps)
# [,1] [,2]
# 18 0.1894957 0.4713816
# 15 0.8662023 1.4319452
# 24 1.3494546 -0.7909067
# 27 0.9718543 -0.5156358
# 23 0.2644868 -0.6269034
# 19 -0.2510109 0.1444178
71 / 111

Principal Component Analysis: step-by-step

The score matrix, F (object F_PrComps), allows one to rotate the data into the new space, so that the observations are represented in relation to the principal components.

[Plots: the observations shown in the original \text{N} and \text{P} space, and in the rotated \text{PC}1 and \text{PC}2 space]

72 / 111

The axis labels are not being printed. I included them at the top of each plot while this is not fixed.

The presenter here should emphasize the rotation, and speak about what the scores are. You can hover the points and show them what was the position of the points in the "de-rotated" plots, and now in the "rotated" ones, highlighting that now, the "new axes" are PC1 and PC2.

This understanding will be useful when the participants use the PCA functions implemented in R.

Principal Component Analysis: step-by-step

PCA can also be computed using the stats::prcomp(), stats::princomp(), vegan::rda(), and ade4::dudi.pca() functions.

How our PCA from scratch compares

data(varechem)
Y <- varechem[, 1:2]
Y_std <- as.matrix(scale(Y))
Y_R <- cov(Y_std)
Eigenvalues <- eigen(Y_R)$values
Eigenvectors <- eigen(Y_R)$vectors
F_PrComps <- Y_std %*% Eigenvectors
head(F_PrComps)
# [,1] [,2]
# 18 0.1894957 0.4713816
# 15 0.8662023 1.4319452
# 24 1.3494546 -0.7909067
# 27 0.9718543 -0.5156358
# 23 0.2644868 -0.6269034
# 19 -0.2510109 0.1444178

to stats::prcomp()?

PCA_prcomp <- prcomp(Y,
center = TRUE,
scale = TRUE)
# or PCA_prcomp <- prcomp(Y_std)
head(PCA_prcomp$x)
# PC1 PC2
# 18 -0.1894957 -0.4713816
# 15 -0.8662023 -1.4319452
# 24 -1.3494546 0.7909067
# 27 -0.9718543 0.5156358
# 23 -0.2644868 0.6269034
# 19 0.2510109 -0.1444178
73 / 111
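Note that the two sets of scores differ only in sign: the orientation of a principal component is arbitrary, because an Eigenvector multiplied by -1 is still an Eigenvector. A quick check (a sketch) that the scores match up to sign:

all.equal(abs(unname(F_PrComps)),
abs(unname(PCA_prcomp$x)))
# should return TRUE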

Principal Component Analysis: step-by-step

PCA can also be computed using the stats::prcomp(), stats::princomp(), vegan::rda(), and ade4::dudi.pca() functions.

How our PCA from scratch compares

data(varechem)
Y <- varechem[, 1:2]
Y_std <- as.matrix(scale(Y))
Y_R <- cov(Y_std)
Eigenvalues <- eigen(Y_R)$values
Eigenvectors <- eigen(Y_R)$vectors
F_PrComps <- Y_std %*% Eigenvectors
head(F_PrComps)
# [,1] [,2]
# 18 0.1894957 0.4713816
# 15 0.8662023 1.4319452
# 24 1.3494546 -0.7909067
# 27 0.9718543 -0.5156358
# 23 0.2644868 -0.6269034
# 19 -0.2510109 0.1444178

to stats::princomp()?

PCA_princomp <- princomp(Y_std)
head(PCA_princomp$scores)
# Comp.1 Comp.2
# 18 -0.1894957 -0.4713816
# 15 -0.8662023 -1.4319452
# 24 -1.3494546 0.7909067
# 27 -0.9718543 0.5156358
# 23 -0.2644868 0.6269034
# 19 0.2510109 -0.1444178
74 / 111

Principal Component Analysis: step-by-step

PCA can also be computed using the stats::prcomp(), stats::princomp(), vegan::rda(), and ade4::dudi.pca() functions.

How our PCA from scratch compares

data(varechem)
Y <- varechem[, 1:2]
Y_std <- as.matrix(scale(Y))
Y_R <- cov(Y_std)
Eigenvalues <- eigen(Y_R)$values
Eigenvectors <- eigen(Y_R)$vectors
F_PrComps <- Y_std %*% Eigenvectors
head(F_PrComps)
# [,1] [,2]
# 18 0.1894957 0.4713816
# 15 0.8662023 1.4319452
# 24 1.3494546 -0.7909067
# 27 0.9718543 -0.5156358
# 23 0.2644868 -0.6269034
# 19 -0.2510109 0.1444178

to vegan::rda()?

PCA_vegan_rda <- rda(Y_std)
scores(PCA_vegan_rda,
display = "sites",
scaling = 1,
choices = seq_len(PCA_vegan_rda$CA$rank),
const = sqrt(PCA_vegan_rda$tot.chi * (nrow(PCA_vegan_rda$CA$u) - 1)))[1:5, ]
# PC1 PC2
# 18 -0.1894957 -0.4713816
# 15 -0.8662023 -1.4319452
# 24 -1.3494546 0.7909067
# 27 -0.9718543 0.5156358
# 23 -0.2644868 0.6269034

vegan::rda() is a bit special. It uses alternative scalings. We will not cover them here, but you can study the vignette("decision-vegan").

75 / 111

Tell participants that the name rda refers to a different type of (constrained) ordination technique, but that when rda() is run with a single data matrix and no explanatory variables, it performs a PCA.

Principal Component Analysis

We have implemented PCA on a two-variables dataset, for simplicity.

Let us advance and apply it to our fish species dataset.

For this, we will use the vegan::rda() function on the Hellinger-transformed fish data and summarise the results:

spe.h.pca <- rda(spe.hel)
# summary(spe.h.pca)
76 / 111

Principal Component Analysis

The first lines of summary.rda() tell us about the Total variance and Unconstrained variance in our model.

# [1] "Partitioning of variance:" " Inertia Proportion"
# [3] "Total 0.5023 1" "Unconstrained 0.5023 1"
77 / 111

Principal Component Analysis

The first lines of summary.rda() tell us about the Total variance and Unconstrained variance in our model.

# [1] "Partitioning of variance:" " Inertia Proportion"
# [3] "Total 0.5023 1" "Unconstrained 0.5023 1"
# [1] "Importance of components:"
# [2] " PC1 PC2 PC3 PC4 PC5 PC6 PC7"
# [3] "Eigenvalue 0.2491 0.06592 0.04615 0.03723 0.02125 0.01662 0.01477"
# [4] "Proportion Explained 0.4959 0.13122 0.09188 0.07412 0.04230 0.03309 0.02940"
# [5] "Cumulative Proportion 0.4959 0.62715 0.71903 0.79315 0.83544 0.86853 0.89794"
# [6] " PC14 PC15 PC16 PC17 PC18 PC19"
# [7] "Eigenvalue 0.002612 0.001505 0.001387 0.001037 0.0007815 0.0004749"
# [8] "Proportion Explained 0.005200 0.002996 0.002761 0.002065 0.0015557 0.0009454"
# [9] "Cumulative Proportion 0.987229 0.990225 0.992986 0.995051 0.9966069 0.9975523"

This is followed by the Eigenvalues, and their contribution to the variance.

In fact, if we sum all our Eigenvalues, we will obtain the total amount of unconstrained variance explained by the analysis!

sum(spe.h.pca$CA$eig)
# [1] 0.5023429
77 / 111

Since we have not constrained our ordination, the unconstrained variance is equal to the total variance.

Take a moment to explain the proportion explained, and show that the cumulative proportion reaches 1 at the 27th PC.
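The Proportion Explained row shown above can be reproduced directly from the Eigenvalues:

round(head(spe.h.pca$CA$eig / sum(spe.h.pca$CA$eig)), 5)
# reproduces the first entries of the "Proportion Explained" row: ~0.4959, 0.1312, ...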

Principal Component Analysis

The next information is related to the scaling, to the species scores, and to the site scores.

# [1] "Eigenvalue 0.0004263 0.0002812 0.0002188 0.0001382 0.0000876"
# [2] "Proportion Explained 0.0008487 0.0005598 0.0004356 0.0002752 0.0001744"
# [3] "Cumulative Proportion 0.9984010 0.9989608 0.9993965 0.9996717 0.9998460"
# [4] " PC25 PC26 PC27"
# [5] "Proportion Explained 1.062e-04 2.938e-05 1.835e-05"
# [6] "Cumulative Proportion 1.000e+00 1.000e+00 1.000e+00"
# [7] "Scaling 2 for species and site scores"
# [8] "* Species are scaled proportional to eigenvalues"
# [9] "* Sites are unscaled: weighted dispersion equal on all dimensions"
# [10] "* General scaling constant of scores: 1.953663 "
# [11] ""
# [12] ""
# [13] "Species scores"
# [14] "BCO -0.20055 0.08332 -0.074787 -0.0504875 0.073890 0.0249842"
# [15] "PCH -0.14626 0.05268 -0.072012 -0.0432572 0.050318 0.0178776"
# [16] "GAR -0.35085 -0.09353 0.198664 0.0178669 0.023796 -0.0971362"
# [17] "BBO -0.24167 0.03598 -0.079528 -0.0339049 0.096690 0.0620979"
# [18] "ABL -0.42269 -0.22879 0.007158 0.1128353 0.006759 0.1248913"
# [19] "ANG -0.20521 0.11557 -0.072060 -0.0159902 0.072030 -0.0003801"
# [20] ""
# [21] ""
# [22] "Site scores (weighted sums of species scores)"


78 / 111

Principal Component Analysis

Species refer to your descriptors (i.e. the columns in your dataset), which here are the fish species.

Scores refer to the position of every species along the principal components.

# [1] "Cumulative Proportion 1.000e+00 1.000e+00 1.000e+00"
# [2] "Scaling 2 for species and site scores"
# [3] "* Species are scaled proportional to eigenvalues"
# [4] "* Sites are unscaled: weighted dispersion equal on all dimensions"
# [5] "* General scaling constant of scores: 1.953663 "
# [6] ""
# [7] ""
# [8] "Species scores"
79 / 111

Principal Component Analysis

Species refer to your descriptors (i.e. the columns in your dataset), which here are the fish species.

Scores refer to the position of every species along the principal components.

# [1] "Cumulative Proportion 1.000e+00 1.000e+00 1.000e+00"
# [2] "Scaling 2 for species and site scores"
# [3] "* Species are scaled proportional to eigenvalues"
# [4] "* Sites are unscaled: weighted dispersion equal on all dimensions"
# [5] "* General scaling constant of scores: 1.953663 "
# [6] ""
# [7] ""
# [8] "Species scores"


# [1] "PCH -0.14626 0.05268 -0.072012 -0.0432572 0.050318 0.0178776"
# [2] "GAR -0.35085 -0.09353 0.198664 0.0178669 0.023796 -0.0971362"
# [3] "BBO -0.24167 0.03598 -0.079528 -0.0339049 0.096690 0.0620979"
# [4] "ABL -0.42269 -0.22879 0.007158 0.1128353 0.006759 0.1248913"
# [5] "ANG -0.20521 0.11557 -0.072060 -0.0159902 0.072030 -0.0003801"
# [6] ""
# [7] ""
# [8] "Site scores (weighted sums of species scores)"

Sites represent the rows in your dataset, which here are the different sites along the Doubs river.

79 / 111

Principal Component Analysis

Species refer to your descriptors (i.e. the columns in your dataset), which here are the fish species.

Scores refer to the position of every species along the principal components.

# [1] "Cumulative Proportion 1.000e+00 1.000e+00 1.000e+00"
# [2] "Scaling 2 for species and site scores"
# [3] "* Species are scaled proportional to eigenvalues"
# [4] "* Sites are unscaled: weighted dispersion equal on all dimensions"
# [5] "* General scaling constant of scores: 1.953663 "
# [6] ""
# [7] ""
# [8] "Species scores"


# [1] "PCH -0.14626 0.05268 -0.072012 -0.0432572 0.050318 0.0178776"
# [2] "GAR -0.35085 -0.09353 0.198664 0.0178669 0.023796 -0.0971362"
# [3] "BBO -0.24167 0.03598 -0.079528 -0.0339049 0.096690 0.0620979"
# [4] "ABL -0.42269 -0.22879 0.007158 0.1128353 0.006759 0.1248913"
# [5] "ANG -0.20521 0.11557 -0.072060 -0.0159902 0.072030 -0.0003801"
# [6] ""
# [7] ""
# [8] "Site scores (weighted sums of species scores)"

Sites represent the rows in your dataset, which here are the different sites along the Doubs river.


This information can be obtained with the scores() function that we used before:

scores(spe.h.pca,
display = "species") # or display = "sites"
79 / 111

Principal Component Analysis: condensing data

Here, we have 27 principal components. However, we can apply algorithms to select the lowest number of principal components that still account for a large variance in the data.

80 / 111

Principal Component Analysis: condensing data

Here, we have 27 principal components. However, we can apply algorithms to select the lowest number of principal components that still account for a large variance in the data.

Kaiser-Guttman criterion

We can select the principal components whose Eigenvalues are larger than the average of all Eigenvalues, i.e. that capture more variance than the average principal component. We do this by:

  1. Extracting the Eigenvalues associated to the principal components;

  2. Subsetting the Eigenvalues above the mean Eigenvalue:

ev <- spe.h.pca$CA$eig
# ev[ev > mean(ev)]
n <- length(ev)
barplot(ev, main = "", col = "grey", las = 2)
abline(h = mean(ev), col = "red3", lwd = 2)
legend("topright", "Average eigenvalue",
lwd = 2, col = "red3" , bty = "n")

80 / 111
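For a quick count of how many axes pass the criterion (using the ev object defined above):

# Number of principal components with an Eigenvalue above the mean Eigenvalue
sum(ev > mean(ev))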

Principal Component Analysis: condensing data

Here, we have 27 principal components. However, we can apply algorithms to select the lowest number of principal components that still account for a large variance in the data.

Broken-stick model

The broken-stick model retains components that explain more variance than would be expected by randomly dividing the variance into p parts.

head(bstick(spe.h.pca))
# PC1 PC2 PC3 PC4 PC5 PC6
# 0.07240169 0.05379640 0.04449375 0.03829199 0.03364067 0.02991961
screeplot(spe.h.pca,
bstick = TRUE, type = "lines")

81 / 111
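Under the hood, the broken-stick expectation for the k-th of p components, scaled by the total variance, is b_k = \frac{\text{total variance}}{p}\sum_{i=k}^{p}\frac{1}{i}. A minimal sketch of this computation, which should reproduce the bstick() values shown above:

# Broken-stick expectations computed by hand
p <- length(spe.h.pca$CA$eig) # number of components
tot <- sum(spe.h.pca$CA$eig) # total variance
bs <- sapply(seq_len(p), function(k) tot / p * sum(1 / (k:p)))
head(bs)
# should match head(bstick(spe.h.pca)): 0.0724, 0.0538, 0.0445, ...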

Principal Component Analysis

All that is left is to discuss scaling and to visualize our results.

Let us practice and compute a PCA on the standardized environmental variables for the same dataset.

env.pca <- rda(env.z)
# summary(env.pca, scaling = 2)
82 / 111

Principal Component Analysis

All that is left is to discuss scaling and to visualize our results.

Let us practice and compute a PCA on the standardized environmental variables for the same dataset.

env.pca <- rda(env.z)
# summary(env.pca, scaling = 2)

Determine our subset of Eigenvalues and their corresponding Eigenvectors:

ev <- env.pca$CA$eig
ev[ev>mean(ev)]
# PC1 PC2 PC3
# 5.968749 2.163818 1.065164
82 / 111

Principal Component Analysis

All that is left is to discuss scaling and to visualize our results.

Let us practice and compute a PCA on the standardized environmental variables for the same dataset.

env.pca <- rda(env.z)
# summary(env.pca, scaling = 2)

Determine our subset of Eigenvalues and their corresponding Eigenvectors:

ev <- env.pca$CA$eig
ev[ev>mean(ev)]
# PC1 PC2 PC3
# 5.968749 2.163818 1.065164

82 / 111

Principal Component Analysis: plot()

The information computed by the PCA can be represented with biplots.

We can produce a quick and dirty biplot of the PCA using the function plot() in base R.

plot(spe.h.pca)

83 / 111

Principal Component Analysis: biplot()

biplot() from base R allows for a better interpretation.

biplot(spe.h.pca)

The arrows are plotted to show the directionality and angle of the descriptors in the ordination.

Descriptors at 180 degrees of each other are negatively correlated;

Descriptors at 90 degrees of each other have zero correlation;

Descriptors at 0 degrees of each other are positively correlated.

84 / 111

Principal Component Analysis: Scaling

Type 2 scaling (default): distances among objects are not approximations of Euclidean distances; angles between descriptor (species) vectors reflect their correlations.

Type 1 scaling: attempts to preserve the Euclidean distance (in multidimensional space) among objects (sites): the angles among descriptor (species) vectors are not meaningful.

biplot(spe.h.pca, scaling = 1)

biplot(spe.h.pca, scaling = 2)

85 / 111

2: Best for interpreting relationships among descriptors (species)!

1: Best for interpreting relationships among objects (sites)!

Challenge #2

Using everything you have learned, compute a PCA on the mite species abundance data

data(mite)

Be ready to discuss and answer:

  • Which principal components are most relevant, i.e. which ones would you retain?
  • Which groups of sites can you identify?
  • Which groups of species are related to these groups of sites?
86 / 111

Challenge #2: Solution

Compute PCA on the Hellinger-transformed species data

mite.spe.hel <- decostand(mite,
method = "hellinger")
mite.spe.h.pca <- rda(mite.spe.hel)
87 / 111

Challenge #2: Solution

Compute PCA on the Hellinger-transformed species data

mite.spe.hel <- decostand(mite,
method = "hellinger")
mite.spe.h.pca <- rda(mite.spe.hel)

Apply the Kaiser-Guttman criterion

ev <- mite.spe.h.pca$CA$eig
ev[ev>mean(ev)]
n <- length(ev)
barplot(ev, main = "Eigenvalues",
col = "grey", las = 2)
abline(h = mean(ev),
col = "red3", lwd = 2)
legend("topright",
"Average eigenvalue",
lwd = 2,
col = "red3", bty = "n")

87 / 111

Challenge #2: Solution

biplot(mite.spe.h.pca,
col = c("red3", "grey15"))

88 / 111

Principal Coordinates Analysis

The PCoA is similar in spirit to PCA, but it takes dissimilarities as input data!

It was previously known as Classical Multidimensional Scaling (MDS), and it aims at faithfully representing the distances in a space with as few dimensions as possible.

It begins with (i) the computation of a distance matrix among the objects, then (ii) the centering of the matrix by rows and columns, and finally (iii) the Eigendecomposition of the centered distance matrix.

89 / 111

Principal Coordinates Analysis

The PCoA is similar in spirit to PCA, but it takes dissimilarities as input data!

It was previously known as Classical Multidimensional Scaling (MDS), and it aims at faithfully representing the distances in a space with as few dimensions as possible.

It begins with (i) the computation of a distance matrix among the objects, then (ii) the centering of the matrix by rows and columns, and finally (iii) the Eigendecomposition of the centered distance matrix.

To compute a PCoA, we can use the cmdscale() or pcoa() functions from the stats and ape packages:

Run a PCoA on the Hellinger distance-transformed fish dataset:

library(ape)
spe.h.pcoa <- pcoa(dist(spe.hel))
summary(spe.h.pcoa)
# Length Class Mode
# correction 2 -none- character
# note 1 -none- character
# values 5 data.frame list
# vectors 810 -none- numeric
# trace 1 -none- numeric
89 / 111
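For comparison, the same ordination can be obtained with stats::cmdscale() (a sketch; with eig = TRUE it also returns the Eigenvalues, and the coordinates may differ from pcoa() only in the sign of some axes):

spe.h.pcoa.cmd <- cmdscale(dist(spe.hel),
k = 2, eig = TRUE)
head(spe.h.pcoa.cmd$points) # site coordinates on the first two axes
head(spe.h.pcoa.cmd$eig) # Eigenvalues, comparable to spe.h.pcoa$values$Eigenvalues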

Principal Coordinates Analysis

head(spe.h.pcoa$values)
# Eigenvalues Relative_eig Broken_stick Cumul_eig Cumul_br_stick
# 1 7.2247720 0.49593627 0.14412803 0.4959363 0.1441280
# 2 1.9115366 0.13121526 0.10709099 0.6271515 0.2512190
# 3 1.3384761 0.09187817 0.08857247 0.7190297 0.3397915
# 4 1.0797364 0.07411728 0.07622679 0.7931470 0.4160183
# 5 0.6161946 0.04229798 0.06696753 0.8354450 0.4829858
# 6 0.4820459 0.03308949 0.05956013 0.8685345 0.5425459
90 / 111

Principal Coordinates Analysis

We can also look at the Eigenvectors associated with each Eigenvalue, which contain the coordinates of each site in Euclidean space.

head(spe.h.pcoa$vectors)[, 1:5]
# Axis.1 Axis.2 Axis.3 Axis.4 Axis.5
# 1 -0.510156945 0.34756659 0.60308559 -0.3105433 -0.197553521
# 2 -0.697567847 0.04025083 0.10420566 -0.2256121 0.163849922
# 3 -0.639279415 -0.02078733 0.03988954 -0.2636284 0.130319412
# 4 -0.411968300 -0.13472113 -0.14290276 -0.2900746 0.003953021
# 5 0.005052046 -0.09541567 -0.30540211 -0.2852990 -0.319905818
# 6 -0.292971968 -0.11423456 -0.31865809 -0.2359718 -0.055670665
91 / 111

Principal Coordinates Analysis: biplot.pcoa()

We can display the distances between sites using the biplot.pcoa() function, as well as represent the species associated with each site.

biplot.pcoa(spe.h.pcoa, spe.hel)

92 / 111

Principal Coordinates Analysis: non-metric distances

PCoA can also be used to capture information contained in non-metric distances, such as the popular Bray-Curtis distance. Let us give it a try:

spe.bray.pcoa <- pcoa(spe.db.pa)
spe.bray.pcoa$values$Eigenvalues
# [1] 3.695332e+00 1.197352e+00 8.465325e-01 5.253896e-01 4.147227e-01
# [6] 2.977969e-01 1.913094e-01 1.554757e-01 1.307617e-01 1.272105e-01
# [11] 8.550928e-02 4.357746e-02 3.862321e-02 2.727686e-02 1.303318e-02
# [16] 6.704839e-03 3.817779e-03 1.300463e-03 0.000000e+00 -3.597737e-05
# [21] -4.167639e-03 -8.970687e-03 -1.473688e-02 -1.601025e-02 -2.165231e-02
# [26] -3.127263e-02 -3.443356e-02 -3.763160e-02 -6.064415e-02 -6.882758e-02

Note the negative eigenvalues!

This is because non-metric distances cannot be represented in Euclidean space without corrections (see Legendre & Legendre 2012 for more details on this):

spe.bray.pcoa <- pcoa(spe.db.pa,
correction = "cailliez")
93 / 111

Principal Coordinates Analysis: non-metric distances

The corrected Eigenvalues are now stored in a new column!

spe.bray.pcoa$values$Corr_eig
# [1] 5.208616362 1.735210656 1.249426268 0.813115534 0.690430331 0.513559025
# [7] 0.387520962 0.363702867 0.296206500 0.273082611 0.215178142 0.154525233
# [13] 0.150803988 0.118203435 0.088726577 0.073233820 0.069460754 0.057344134
# [19] 0.056008651 0.054525339 0.049309077 0.040482831 0.037906147 0.034709860
# [25] 0.029748166 0.023624321 0.019072278 0.003071851 0.000000000 0.000000000

Use a biplot without the species to represent it!

biplot.pcoa(spe.bray.pcoa)

94 / 111

Challenge #3

Compute a PCoA on the Hellinger-transformed mite species abundance data

Be ready to answer:

  • What are the significant Eigenvectors and Eigenvalues?
  • Which groups of sites can you identify?
  • Which groups of species are related to these groups of sites?
  • How do the PCoA results compare with the PCA results?
95 / 111

Challenge #3: Solution

  • Hellinger transform the species data
mite.spe <- mite
mite.spe.hel <- decostand(mite.spe, method = "hellinger")
  • Compute PCoA
mite.spe.h.pcoa <- pcoa(dist(mite.spe.hel))
96 / 111

Challenge #3: Solution

  • Build a biplot to visualize the results:
biplot.pcoa(mite.spe.h.pcoa, mite.spe.hel)

97 / 111

Non-metric Multidimensional Scaling

  • In PCA and PCoA, objects are ordinated in a reduced space that generally still has more than two dimensions;

  • 2D-biplots may not represent all the variation within the dataset;

  • Sometimes, we aim at representing the data in a specified, smaller number of dimensions;

  • How can we plot the ordination space so that it represents as much of the variation in the data as possible?

98 / 111

Non-metric Multidimensional Scaling

  • In PCA and PCoA, objects are ordinated in a reduced space that generally still has more than two dimensions;

  • 2D-biplots may not represent all the variation within the dataset;

  • Sometimes, we aim at representing the data in a specified, smaller number of dimensions;

  • How can we plot the ordination space so that it represents as much of the variation in the data as possible?

We can attempt using non-metric multidimensional scaling!

  • NMDS is the non-metric counterpart of PCoA;
  • It uses an iterative optimization algorithm to find the best representation of distances in reduced space;
98 / 111

Non-metric Multidimensional Scaling

  • NMDS applies an iterative procedure that tries to position the objects in the requested number of dimensions in such a way as to minimize a stress function (scaled from 0 to 1), which measures the goodness-of-fit of the distance adjustment in the reduced-space configuration.

  • Consequently, the lower the stress value, the better the representation of objects in the ordination-space is.

  • NMDS is implemented in vegan as metaMDS(), where:

  • distance specifies the distance metric to use;
  • k specifies the number of dimensions.
# Dune meadow vegetation data, dune, has cover class values of 30 species on 20
# sites.
data(dune)
spe.nmds <- metaMDS(dune,
distance = 'bray',
k = 2)
99 / 111

Non-metric Multidimensional Scaling: goodness-of-fit

The Shepard diagram and stress values can be obtained from stressplot():

spe.nmds$stress
# [1] 0.1183186
stressplot(spe.nmds, main = "Shepard plot")

The Shepard plot shows a strong correlation between observed dissimilarity and ordination distance (R2 > 0.95), highlighting a good fit of the NMDS.

100 / 111
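Because stress measures how well the distances are represented in the chosen number of dimensions, allowing one more dimension should lower it. A quick sketch (keeping in mind that metaMDS() uses random starts, so the exact values vary between runs):

# Compare the stress of 2-D and 3-D solutions for the dune data
spe.nmds.k3 <- metaMDS(dune, distance = "bray", k = 3)
spe.nmds$stress # 2 dimensions
spe.nmds.k3$stress # 3 dimensions: expected to be lower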

Non-metric Multidimensional Scaling: biplot()

Construct the biplot

plot(spe.nmds, type = "none",
main = paste("NMDS/Bray - Stress =",
round(spe.nmds$stress, 3)),
xlab = c("NMDS1"), ylab = "NMDS2")
points(scores(spe.nmds, display = "sites",
choices = c(1,2)),
pch = 21,
col = "black",
bg = "steelblue",
cex = 1.2)
text(scores(spe.nmds, display = "species", choices = c(1)),
scores(spe.nmds, display = "species", choices = c(2)),
labels = rownames(scores(spe.nmds, display = "species")),
col = "red", cex = 0.8)
101 / 111

Non-metric Multidimensional Scaling: biplot()

The biplot of the NMDS shows a group of closely clustered sites characterized by the species BLA, TRU, VAI, LOC, CHA and OMB, while the other species form a cluster of sites in the upper right part of the graph.

Four sites in the lower part of the graph are strongly different from the others.

102 / 111

Challenge #4


Run the NMDS of the mite.spe species abundances in 2 dimensions based on a Bray-Curtis distance.

Assess the goodness-of-fit of the ordination and interpret the biplot.

Recall these helpful functions:

?metaMDS
?stressplot
103 / 111

Challenge #4: Solution

Run the NMDS of the mite species abundances in 2 dimensions based on a Bray-Curtis distance:

mite.nmds <- metaMDS(mite.spe, distance = 'bray', k = 2)
104 / 111

Challenge #4: Solution

plot(mite.nmds, type = "none",
main = paste("NMDS/Bray - Stress =",
round(mite.nmds$stress, 3)),
xlab = c("NMDS1"), ylab = "NMDS2")
points(scores(mite.nmds, display = "sites",
choices = c(1,2)),
pch = 21,
col = "black",
bg = "steelblue",
cex = 1.2)
text(scores(mite.nmds, display = "species", choices = c(1)),
scores(mite.nmds, display = "species", choices = c(2)),
labels = rownames(scores(mite.nmds, display = "species")),
col = "red", cex = 0.8)
105 / 111

Challenge #4: Solution

There is not really any obvious clustering of sites in the NMDS biplot. This tells us that the species occurred in most of the sites. Only a few sites shelter specific communities.

106 / 111

Challenge #4: Solution

stressplot(mite.nmds, main = "Shepard plot")

The observed dissimilarity is highly correlated with ordination distance, and the NMDS stress value is relatively low, which tells us that the NMDS ordination is relatively accurate.

107 / 111

Conclusion

Many ordination techniques exist, but their specific characteristics should guide your choice of method.

Method   Distance preserved   Variables                                                   Maximum number of axes
PCA      Euclidean            Quantitative data, linear relationships                     p
CA       Chi-square (χ²)      Non-negative, quantitative homogeneous data; binary data    p-1
PCoA     User defined         Quantitative, semi-quantitative, mixed data                 p-1
NMDS     User defined         Quantitative, semi-quantitative, mixed data                 User defined
108 / 111

Quiz time!

What does PCA stand for?

109 / 111

Quiz time!

What does PCA stand for?

Principal Component Analysis

109 / 111

Quiz time!

What does PCA stand for?

Principal Component Analysis

Which one is the best way to visualize the distances between the community composition of many sites?

109 / 111

Quiz time!

What does PCA stand for?

Principal Component Analysis

Which one is the best way to visualize the distances between the community composition of many sites?

Principal Coordinate Analysis (PCoA)

109 / 111

Quiz time!

What does PCA stand for?

Principal Component Analysis

Which one is the best way to visualize the distances between the community composition of many sites?

Principal Coordinate Analysis (PCoA)

What does an eigenvalue represent in PCA?

109 / 111

Quiz time!

What does PCA stand for?

Principal Component Analysis

Which one is the best way to visualize the distances between the community composition of many sites?

Principal Coordinate Analysis (PCoA)

What does an eigenvalue represent in PCA?

The amount of variance explained by a principal component

109 / 111

Quiz time!

What is wrong with this plot?

110 / 111

Quiz time!

What is wrong with this plot?

  • The data is not centred. Yikes!
110 / 111

Quiz time!

What is wrong with this plot?

  • The data is not centred. Yikes!
  • The first two principal components explain 100% of the variation!
110 / 111

Thank you for attending this workshop!

111 / 111
