In this vignette, we evaluate the contribution of individual species to each bioregion, using the function contribution().
We use the vegetation dataset that comes with bioregion.
We use the same three bioregionalization algorithms as in the visualization
vignette, i.e. a non-hierarchical, a hierarchical and a network
bioregionalization.
We chose 3 bioregions for the non-hierarchical and hierarchical
bioregionalizations.
# Non-hierarchical bioregionalization
vege_nhclu_kmeans <- nhclu_kmeans(vegedissim, n_clust = 3, index = "Simpson")
vege_nhclu_kmeans$cluster_info # 3
##     partition_name n_clust
## K_3            K_3       3
# Hierarchical bioregionalization
set.seed(1)
vege_hclu_hierarclust <- hclu_hierarclust(dissimilarity = vegedissim,
                                          index = names(vegedissim)[3],
                                          method = "average", n_clust = 3)
vege_hclu_hierarclust$cluster_info # 3
##   partition_name n_clust requested_n_clust output_cut_height
## 1            K_3       3                 3            0.5625
# Network bioregionalization
set.seed(1)
vege_netclu_walktrap <- netclu_walktrap(vegesim,
                                        index = names(vegesim)[3])
vege_netclu_walktrap$cluster_info # 3
##     partition_name n_clust
## K_3            K_3       3
The contribution index $\rho$ is calculated for each species × bioregion combination, following (Lenormand et al., 2019).
It is defined as:
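$$\rho_{ij} = \frac{n_{ij} - \frac{n_i n_j}{n}}{\sqrt{\frac{n - n_j}{n-1} \left(1-\frac{n_j}{n}\right) \frac{n_i n_j}{n}}}$$
where $n$ is the number of sites, $n_i$ the number of sites in which species $i$ is present, $n_j$ the number of sites belonging to bioregion $j$, and $n_{ij}$ the number of occurrences of species $i$ in sites belonging to bioregion $j$. The numerator compares the observed $n_{ij}$ with the value $n_i n_j / n$ expected if species $i$ were distributed at random across sites, so positive values of $\rho_{ij}$ indicate that the species occurs in bioregion $j$ more often than expected.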
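To make the formula concrete, here is a minimal base R sketch that transcribes it term by term for a toy presence-absence matrix. The matrix, the site assignment and the helper name `rho_index()` are made up for illustration; this is not the internal code of contribution().

```r
# Toy presence-absence matrix: 6 sites (rows) x 3 species (columns).
# All values are invented for illustration.
comat <- matrix(c(1, 0, 1,
                  1, 0, 0,
                  1, 1, 0,
                  0, 1, 0,
                  0, 1, 1,
                  0, 1, 0),
                nrow = 6, byrow = TRUE,
                dimnames = list(paste0("site", 1:6), c("sp1", "sp2", "sp3")))

# Hypothetical assignment of each site to a bioregion
bioregion_of_site <- c("A", "A", "A", "B", "B", "B")

# rho_index() is a hypothetical helper, not a bioregion function
rho_index <- function(comat, clusters) {
  pres <- comat > 0
  n    <- nrow(pres)        # n: number of sites
  n_i  <- colSums(pres)     # n_i: sites in which each species is present
  n_j  <- table(clusters)   # n_j: sites in each bioregion
  # n_ij: occurrences of species i in the sites of bioregion j
  n_ij <- sapply(names(n_j), function(j) {
    colSums(pres[clusters == j, , drop = FALSE])
  })
  n_j  <- as.numeric(n_j)
  expected <- outer(n_i, n_j) / n   # n_i * n_j / n
  # Denominator term: (n - n_j) / (n - 1) * (1 - n_j / n) * n_i * n_j / n
  variance <- sweep(expected, 2, (n - n_j) / (n - 1) * (1 - n_j / n), "*")
  (n_ij - expected) / sqrt(variance)
}

rho <- rho_index(comat, bioregion_of_site)
round(rho, 3)
```

In this toy case sp1 occurs only in the sites of bioregion A and gets a positive value there, while sp3 occurs once in each bioregion and gets 0 in both.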
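The Cz metrics, namely the participation coefficient $C$ and the within-bioregion degree z-score $z$, are derived from Guimerà & Amaral (2005). Their respective formulas
are: $$C_i = 1 - \sum_{s=1}^{N_M}{\left(\frac{k_{is}}{k_i}\right)^2}$$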
where $k_{is}$ is the number of links of node (species or site) $i$ to nodes in bioregion $s$, $k_i$ is the total degree of node $i$, and $N_M$ is the number of bioregions. The participation coefficient of a node is therefore close to 1 if its links are uniformly distributed among all the bioregions and 0 if all its links are within its own bioregion.
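As an illustration, the participation coefficient can be computed by hand on a small bipartite network. The sketch below uses base R only; the link table, the node memberships and the helper name `participation()` are assumptions made for the example, not objects or functions from bioregion.

```r
# Toy bipartite links (site - species); all values are invented.
links <- data.frame(
  site    = c("s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4"),
  species = c("a",  "b",  "a",  "c",  "b",  "c",  "c",  "d"),
  stringsAsFactors = FALSE
)

# Hypothetical bioregion of every node (sites and species alike)
bioregion_of_node <- c(s1 = 1, s2 = 1, a = 1, b = 1,
                       s3 = 2, s4 = 2, c = 2, d = 2)

# participation() is a hypothetical helper, not a bioregion function
participation <- function(links, membership) {
  # List each link in both directions so every node appears in `from`
  from <- c(links$site, links$species)
  to   <- c(links$species, links$site)
  # k_is: number of links from node i to nodes of bioregion s
  k_is <- table(from, membership[to])
  k_i  <- rowSums(k_is)   # total degree of node i
  1 - rowSums((k_is / k_i)^2)
}

C <- participation(links, bioregion_of_node)
round(C, 3)
```

Nodes whose links all fall in one bioregion (for example s1) get 0, whereas s2, whose two links are split evenly between the two bioregions, gets 0.5.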
And: $$z_i = \frac{k_i - \overline{k_{si}}}{\sigma_{k_{si}}}$$
where $k_i$ is the number of links of node (species or site) $i$ to other nodes in its bioregion $s_i$ (unlike in $C_i$, where $k_i$ is the total degree), $\overline{k_{si}}$ is the average of $k$ over all the nodes in $s_i$, and $\sigma_{k_{si}}$ is the standard deviation of $k$ in $s_i$. The within-bioregion degree z-score measures how well connected node $i$ is to other nodes in its bioregion.
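The z-score can be sketched in the same way on a toy bipartite network. Everything below is illustrative: the links, the memberships and the helper name `within_degree_z()` are made up, and the use of R's `sd()` (sample standard deviation, with an n - 1 denominator) is an assumption rather than a statement about how contribution() computes it.

```r
# Toy bipartite links (site - species); all values are invented.
links <- data.frame(
  site    = c("s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4"),
  species = c("a",  "b",  "a",  "c",  "b",  "c",  "c",  "d"),
  stringsAsFactors = FALSE
)

# Hypothetical bioregion of every node (sites and species alike)
bioregion_of_node <- c(s1 = 1, s2 = 1, a = 1, b = 1,
                       s3 = 2, s4 = 2, c = 2, d = 2)

# within_degree_z() is a hypothetical helper, not a bioregion function
within_degree_z <- function(links, membership) {
  # List each link in both directions so every node appears in `from`
  from <- c(links$site, links$species)
  to   <- c(links$species, links$site)
  nodes <- names(membership)
  # Flag the links that stay inside a single bioregion
  same_region <- membership[from] == membership[to]
  # k_i: links from node i to other nodes of its own bioregion
  k <- vapply(nodes, function(i) sum(from == i & same_region), numeric(1))
  # Mean and standard deviation of k over the nodes of each bioregion
  k_mean <- ave(k, membership[nodes], FUN = mean)
  k_sd   <- ave(k, membership[nodes], FUN = sd)
  (k - k_mean) / k_sd
}

z <- within_degree_z(links, bioregion_of_node)
round(z, 3)
```

Note that a bioregion in which every node has the same within-bioregion degree has a standard deviation of 0, so the z-score is undefined (`NaN`) for its nodes.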
We can now run the function contribution().
contrib_kmeans <- contribution(vege_nhclu_kmeans, vegemat,
                               indices = "contribution")
contrib_hclu <- contribution(vege_hclu_hierarclust, vegemat,
                             indices = "contribution")
contrib_netclu <- contribution(vege_netclu_walktrap, vegemat,
                               indices = "contribution")
# Cz indices
clust_bip <- netclu_greedy(vegedf, bipartite = TRUE)
cz_netclu <- contribution(cluster_object = clust_bip, comat = vegemat,
                          bipartite_link = vegedf, indices = "Cz")
contribution() returns a data.frame with the contribution metrics available at the species level.