Modelling biological regions from multi-species and environmental data

Partitioning the environment into areas that appear to contain similar biological content is useful for investigating questions of distribution and habitat and for helping to guide resource conservation and utilization. The statistical task requires relating presence/absence data from multiple species to co-located environmental data. In this article, we introduce a statistical modelling framework that models the environment as a set of regions, where the vector of probabilities of observing a set of species is approximately constant within a region and differs between regions. This is achieved within a mixture-of-experts model framework, which treats the region type as a latent variable whose distribution varies as a function of the environment. This approach allows us to predict probabilities of region types for both sampled and unsampled locations. The model synthesizes biological and environmental data, incorporating both in a single likelihood that enables propagation of uncertainty through the entire model. The method is demonstrated using a synthetic example and data from a survey of fish from the North West Shelf off Western Australia. An R package, RCPmod, which implements the methods described in this article, is available from CRAN. Copyright © 2013 John Wiley & Sons, Ltd.
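
To make the structure of such a model concrete, the following is a minimal sketch in Python of a mixture-of-experts likelihood for presence/absence data. It is an illustration under stated assumptions, not the RCPmod implementation: it assumes a multinomial-logit (softmax) link from environmental covariates to region-type probabilities, and independent Bernoulli observations for each species given the region type. The function names and the parameters `beta` and `theta` are hypothetical and chosen only for this example.

```python
import numpy as np


def region_probabilities(X, beta):
    """Prior region-type probabilities at each site (assumed softmax gate).

    X    : (n_sites, n_covariates) environmental covariates, including
           an intercept column if one is wanted.
    beta : (n_covariates, n_regions) gating coefficients (hypothetical).
    """
    eta = X @ beta
    eta = eta - eta.max(axis=1, keepdims=True)  # stabilise the exponentials
    expeta = np.exp(eta)
    return expeta / expeta.sum(axis=1, keepdims=True)


def mixture_log_likelihood(Y, X, beta, theta):
    """Log-likelihood and posterior region-type probabilities.

    Y     : (n_sites, n_species) array of 0/1 presence/absence records.
    theta : (n_regions, n_species) probability of observing each species
            in each region type, strictly between 0 and 1 (hypothetical).
    """
    log_pi = np.log(region_probabilities(X, beta))
    # Bernoulli log-density of each site's species vector under each region
    # type, assuming species are independent given the region type.
    log_f = Y @ np.log(theta).T + (1 - Y) @ np.log1p(-theta).T
    joint = log_pi + log_f  # (n_sites, n_regions)
    # Sum over the latent region type with a log-sum-exp for stability.
    m = joint.max(axis=1, keepdims=True)
    log_site = m[:, 0] + np.log(np.exp(joint - m).sum(axis=1))
    posterior = np.exp(joint - log_site[:, None])
    return log_site.sum(), posterior
```

Under these assumptions, `region_probabilities` needs only environmental covariates, so it can be evaluated at unsampled locations, while the posterior returned by `mixture_log_likelihood` also uses the species records and so applies to sampled sites. Because both data sources enter one likelihood, uncertainty in `beta` and `theta` is estimated jointly rather than in separate stages.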
