Issue 
Knowl. Manag. Aquat. Ecosyst.
Number 424, 2023
Management of habitats and populations/communities



Article Number  5  
Number of page(s)  13  
DOI  https://doi.org/10.1051/kmae/2023002  
Published online  23 February 2023 
Research Paper
ABC model for estimating sea lamprey local population size using a simple nest count during the spawning season
^{1} Université de Pau et des Pays de l'Adour, E2S UPPA, INRAE, ECOBIOP, Saint-Pée-sur-Nivelle, France
^{2} Pôle Gestion des Migrateurs Amphihalins dans leur Environnement, OFB, INRAE, Agrocampus Ouest, Univ Pau & Pays Adour/E2S UPPA, Pau, France
^{3} Department of Plant Biology and Ecology, University of the Basque Country (UPV/EHU), Bilbao, Spain
^{*} Corresponding author: marius.dhamelincourt@inrae.fr
Received: 21 September 2022
Accepted: 12 January 2023
Population estimation requires considering the biology of the species, but also logistic constraints such as cost. While common methods based on individual counts can provide precise estimates, they require an extensive sampling effort. An alternative to these methods is to use cues linked to the species abundance. In that case, producing absolute estimates requires assessing the relationship between the individuals and these cues. In this paper, we propose a model based on spawning behaviour data and Approximate Bayesian Computation to estimate the number of sea lamprey spawners from nest count data. By counting the daily number of occupied nests and using parameters from a behavioural study, we set up a model that simulates a spawning season and returns a population estimate by comparison with field data. Our model gives realistic estimates. Using a sensitivity analysis, we discuss the parameters on which data collection should be prioritized, and we show that halving the sample size still provides satisfactory accuracy. We made an easily parametrizable application to run the model for anyone interested in sea lamprey population estimation, and believe this framework to be a good way to increase data collection for both endangered and invasive sea lamprey.
Key words: Management / anadromous species / endangered species / nesting behaviour / mechanistic model
© M. Dhamelincourt et al., Published by EDP Sciences 2023
This is an Open Access article distributed under the terms of the Creative Commons Attribution License CC-BY-ND (https://creativecommons.org/licenses/by-nd/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. If you remix, transform, or build upon the material, you may not distribute the modified material.
1 Introduction
Population estimation requires taking into account the biology of the focal species, but also logistic constraints such as cost, available equipment or the number of people involved. A common population estimation method is Capture-Mark-Recapture (CMR), which consists of marking all individuals caught during a first capture and then using the ratio of already marked to new individuals captured on further occasions to estimate the population size. The CMR method makes it possible to estimate a population when a direct count is not possible, gives a relatively precise estimate of its size (Funk et al., 2003) and has high power to detect its decline (Funk et al., 2003; Pace III et al., 2017). The method is highly flexible, as models have been developed to take into account characteristics such as whether or not the population is closed (Schwarz and Seber, 1999), or resighting instead of recapture (McClintock and White, 2009). However, obtaining a precise population estimate using CMR methods requires the capture of a large proportion of the population, which may lead to a high sampling effort (McClintock and White, 2009), and hence to substantial financial and logistic costs. Such costs are not always affordable for agencies in charge of long-term monitoring programs.
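As an illustration of the ratio principle behind CMR, the simplest two-occasion case can be sketched with Chapman's bias-corrected form of the Lincoln-Petersen estimator. This is only a minimal sketch: it is not the CMR model of Dhamelincourt et al. (2021a), and the function name and the numbers are hypothetical.

```python
def chapman_estimate(marked_first, caught_second, recaptured):
    """Chapman's bias-corrected Lincoln-Petersen estimate of population size.

    marked_first  : individuals marked on the first occasion
    caught_second : individuals caught on the second occasion
    recaptured    : individuals of the second catch that were already marked
    """
    return (marked_first + 1) * (caught_second + 1) / (recaptured + 1) - 1


# Hypothetical survey: 50 marked, 40 caught later, 10 of which carried a mark.
n_hat = chapman_estimate(marked_first=50, caught_second=40, recaptured=10)
print(round(n_hat, 1))
```

The estimate becomes unstable when few marked individuals are recaptured, which is why a large share of the population has to be captured for the method to be precise.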
Instead of directly counting or detecting the individuals, methods based on cues characteristic of a species' presence may be a way to limit detection or marking problems. For example, beaver colonies can be indirectly detected and counted using their dams (Johnston and Windels, 2015). Birds can be detected using a wide range of indicators such as their auditory signals, feeding and dusting sites, and roost, fecal, and nest counts (Morgan et al., 1983). While the latter are mainly used for birds (Rodgers et al., 1995) or primates (Kouakou et al., 2009), the population size of nest-building fishes may also be estimated using nest counts (Al-Chokhachy et al., 2005; Hamstreet, 2012). Nonetheless, many of these indirect methods do not produce absolute estimates but only relative ones. Indeed, producing absolute estimates requires assessing the relationship between the number of individuals and these indirect cues. The classical procedure consists of calibrating (usually through regression) the relationship between population size and the indirect cue on a set of observations in which both are available, and later using inverse prediction to infer population size from the indirect cue in the usual case where only the cue is quantified (Southwood, 1978). However, the relationship can depend on environmental conditions (e.g. birds singing more or less intensely depending on the time of day) or on the characteristics of individuals in the population (e.g. unpaired male birds singing more or less than paired males depending on the time of day and the stage of the breeding cycle, Amrhein et al., 2002). Accounting for these additional sources of variation in the relationship between actual population size and the measurable cue would require calibrating the relationship on a very large number of observations where all variables are available, which is often not possible.
An alternative, proposed in this paper, is to (1) collect information on the behavioural processes leading to cue production by individuals, thus accounting for variability in these processes; (2) build a model that simulates the cues produced by a population composed of a given number of such individuals; and (3) compare the number of cues simulated by the model to the number actually observed during a survey. Steps (1) and (2) correspond to model calibration by the modellers, and step (3) is carried out by users. The model is therefore both mechanistic, as it implements individual behaviour explicitly, and statistical, as it produces an estimate and its uncertainty. The flexibility of Bayesian modelling (Kruschke, 2010) makes it a good tool for this task. However, when the measured variables are generated by complex mechanisms including, for example, interactions between individuals, deriving the likelihood may be problematic. This is where Approximate Bayesian Computation (ABC) is useful.
The Approximate Bayesian Computation framework is particularly suitable when it is difficult to sample parameter values from the posterior distribution, as there is no need to compute the likelihood function (Beaumont, 2010; Csilléry et al., 2010; Turner and Van Zandt, 2012). Instead, the model returns summary statistics calculated from a simulated dataset, which are compared to the observed values. This framework allows the simulation of complex processes behind an observed outcome and is widely used in population genetics (Estoup et al., 2001; Nielsen and Beaumont, 2009) and epidemiology (McKinley et al., 2018). In ecology, the method has been used to infer speciation rates and immigration of species under a neutral ecological model (Jabot and Chave, 2009), to test the existence of socially induced reproductive synchrony (Koizumi and Shimatani, 2016), and to determine species richness (Solow and Smith, 2009).
The sea lamprey, Petromyzon marinus, is an anadromous jawless fish. Both males and females build nests, often in pairs but sometimes in groups of several individuals. This species is considered endangered in the most important part of its native area in Europe and North America, where the largest sea lamprey fisheries occur (Beaulaton et al., 2008), while being considered invasive in the Laurentian Great Lakes (Hume et al., 2021), where the invasive populations affect fisheries by killing salmonids (Farmer et al., 1975). These opposing concerns create a need for accurate estimation of sea lamprey populations to monitor the efficiency of either conservation or control policies. Captures from fisheries, where they exist, may be used to provide an annual indication of the number of adults returning to their spawning grounds (Beaulaton et al., 2008). However, the data provided usually correspond to a relative number of migrants and not to a number of spawners, as the individuals are caught during their upstream migration and may later be preyed upon (Boulêtreau et al., 2020) or be unable to reach their spawning grounds due to impassable barriers (Lasne et al., 2015). Furthermore, fishery catches depend strongly on environmental conditions. Fish passes equipped with a counting device are useful, as they may provide an exhaustive count near the spawning areas. However, this approach is hampered because most dams are not totally impassable, so some individuals move upstream without using the pass. Furthermore, the number of equipped rivers is most often limited. Given the limits of the current methods, sea lamprey nests appear to be a promising cue of spawner abundance, as they directly reflect spawner activity and can be counted easily on spawning sites, since they are built in shallow and identifiable zones (Johnson et al., 2015).
Nonetheless, it is necessary to take into account the polygamy of the species, which causes nests to be built by more than one male and one female (Applegate, 1950; Migradour, 2010). Furthermore, individuals may build several nests, with differences between males and females (Dhamelincourt et al., 2021a). To include these constraints while keeping monitoring easy, the Approximate Bayesian Computation (ABC) framework was selected.
This paper presents a model based on this ABC framework and uses it to estimate the number of sea lamprey spawners from nest count data. Our model aims to produce sea lamprey spawner abundance estimates (including uncertainty) from simple and cost-efficient nest counts performed repeatedly throughout a spawning season (Fig. 1). To assess the relationship between individuals and their nests, our model was based on a previous CMR model and behavioural study from Dhamelincourt et al. (2021a), conducted over an entire spawning season and used to estimate the parameters. After presenting the model structure, its performance and its sensitivity to parameter magnitudes and sampling design, we discuss its implications from a user perspective, recommending sampling schemes and highlighting the evolvability of the model as more information on lamprey nesting behaviour becomes available for different river systems.
Fig. 1 Scheme of the process behind the ABC model implemented to estimate the sea lamprey spawner abundance with nest counts. 
2 Material and methods
2.1 ABC framework
Our model was built using the Approximate Bayesian Computation (ABC) framework (Beaumont, 2010; Csilléry et al., 2010) and the ABC_sequential function of the EasyABC package (Jabot et al., 2015) for R (version 4.1.2; R Core Team, 2021), which implements the sequential algorithm of Lenormand et al. (2013). The ABC framework was chosen because it allows inferring the posterior distribution of key parameters (here, population size) of a model whose likelihood is too complicated to derive, here an individual-based model of a sea lamprey spawning season. Using a set of values for individual parameters, we simulated the individual nesting history of male and female spawning lampreys, which may or may not build a nest on each day d of the season, depending on their parameters and their previous spawning activity on days 1 to d − 1. These sex-dependent parameters were the maximum number of frequented nests, the number of individuals in each nest, the delay of arrival on the spawning ground, and the residence time. The occupation duration of a nest was also considered. At the end of each simulation, the model produces summary statistics (i.e., the maximum, median, mean, Q_{25} and Q_{75} of the number of nests built on each day of the season). After k simulations of spawning seasons, each with a different number of individuals N, the model returns a posterior distribution of the most likely number of individuals N_{estimate} given the summary statistics actually observed during the spawning season.
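The five summary statistics named above can be computed from a daily series of nest counts as in the following minimal sketch. It is written in Python rather than R, the function name and the counts are hypothetical, and the "inclusive" quantile method is an assumption chosen because it matches the default quantile definition in R.

```python
import statistics


def summary_statistics(daily_active_nests):
    """Return max, median, mean, Q25 and Q75 of a daily series of nest counts."""
    # method="inclusive" treats the data as the full population of days.
    q25, _, q75 = statistics.quantiles(daily_active_nests, n=4, method="inclusive")
    return {
        "max": max(daily_active_nests),
        "median": statistics.median(daily_active_nests),
        "mean": statistics.mean(daily_active_nests),
        "q25": q25,
        "q75": q75,
    }


# Hypothetical counts of active nests on ten consecutive survey days.
counts = [0, 2, 5, 9, 15, 11, 6, 3, 1, 0]
stats = summary_statistics(counts)
print(stats)
```

The same function would be applied both to the simulated seasons and to the field counts, so that the two sets of statistics are directly comparable.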
An ABC model initially generates a sample of model parameter values (often also called particles) from the prior distribution and selects the values leading to model outputs (summarized as summary statistics) that satisfy a proximity criterion with the target data (for example, data observed in the field). The selected sample of parameter values approximates the posterior distribution of the parameters, leading to model outputs with the expected quality of approximation. Within this general framework, the sequential algorithm of Lenormand et al. (2013) was selected as it minimizes the number of runs and automatically determines its stopping criterion. This algorithm determines the final tolerance level so as to ensure good quality while avoiding too many simulations and excessive computation time. These characteristics are of interest for our model, as it is intended to be implemented in a user-friendly application. We set nb_simul = 400, corresponding to 200 simulations below each tolerance level (see Jabot et al. (2015) for parametrization details in R and Lenormand et al. (2013) for an overview of the algorithm), and did not specify a tolerance level, as Lenormand's algorithm automatically determines its decreasing sequence of tolerance levels.
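The general principle (draw from the prior, simulate, keep the draws whose summary statistics are closest to the observed ones) can be illustrated with a plain rejection scheme. The sketch below is deliberately not the sequential algorithm of Lenormand et al. (2013) and does not use EasyABC: the simulator is a toy stand-in, and every name, prior bound and setting in it is an assumption made for illustration.

```python
import random
import statistics


def simulate_season(n_individuals, rng, season_length=30):
    """Toy simulator (assumption): each pair occupies one nest for two days."""
    counts = [0] * season_length
    for _ in range(n_individuals // 2):
        start = rng.randrange(season_length - 1)
        counts[start] += 1
        counts[start + 1] += 1
    return counts


def summarize(counts):
    """Two summary statistics are enough for this toy example."""
    return (max(counts), statistics.mean(counts))


def abc_rejection(observed, prior_low, prior_high, n_draws, keep_fraction, rng):
    """Keep the prior draws whose simulated summaries are closest to the target."""
    target = summarize(observed)
    scored = []
    for _ in range(n_draws):
        n = rng.randint(prior_low, prior_high)  # uniform prior on population size
        simulated = summarize(simulate_season(n, rng))
        distance = sum((s - t) ** 2 for s, t in zip(simulated, target)) ** 0.5
        scored.append((distance, n))
    scored.sort()
    n_keep = max(1, int(keep_fraction * n_draws))
    return [n for _, n in scored[:n_keep]]


rng = random.Random(1)
observed = simulate_season(100, rng)  # pseudo-observed data, true N = 100
posterior = abc_rejection(observed, 2, 300, 2000, 0.05, rng)
print(statistics.median(posterior), min(posterior), max(posterior))
```

A sequential algorithm improves on this by resampling around the retained particles and lowering the tolerance step by step, which reaches a comparable approximation with far fewer simulations.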
2.2 Parameter distributions and priors
All parameter distributions were determined using data from Dhamelincourt et al. (2021a), available in Data INRAE. These parameters are all needed to run the model, as they underpin the realistic arrangement of individuals in the nests simulated by the model. Briefly, these data were obtained from a daily survey of a 1 km long stretch of the Nive river (France), on which 202 nests were found over a 47-day-long spawning season. The number of males and females was counted on 69 nests. A total of 114 individuals were captured with a fishing net when first observed on a nest, marked, and (for 60 of them) observed again on either the same or subsequent nests. The number of nests and the number of individuals per nest followed a zero-truncated negative binomial distribution (generated using the rztnbinom function of the countreg package, which generates values according to the distribution of the observed data; see Zeileis et al., 2008) to prevent zero values. Delay and residence time were both obtained using the daily survival probability ϕ_{d,t} of an individual d on day t estimated by the CMR model in Dhamelincourt et al. (2021a). Delay followed a truncated and skewed normal distribution (rsnorm function of the fGarch package; Wuertz et al., 2016) to avoid zero values and to take into account the observed skewness of the real delay. Residence time (and nest duration) followed a zero-truncated normal distribution without skewness (rtnorm function of the msm package; Jackson, 2011). Sex ratio was initially fixed at 0.5, as we observed an equal number of males and females. A uniform, non-informative prior was used for the simulations, as the a priori number of individuals was assumed to be unknown and should be compatible with spawner abundances at other sites. This prior can be modified according to the number of spawners expected at a given site. Table 1 lists all distributions and the prior.
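To make the zero-truncated distributions concrete, the sketch below draws from them by simple rejection: sample the untruncated distribution and discard non-positive values. It is a Python stand-in under stated assumptions, not the rztnbinom or rtnorm implementations; the skewed normal used for the delay is left out, and the parameter values are hypothetical rather than those of Table 1.

```python
import math
import random


def poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    threshold, k, product = math.exp(-lam), 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def zero_truncated_nbinom(mu, size, rng):
    """Negative binomial draw (mean mu, dispersion size), rejecting zeros."""
    while True:
        rate = rng.gammavariate(size, mu / size)  # gamma-Poisson mixture
        value = poisson(rate, rng)
        if value > 0:
            return value


def truncated_normal(mean, sd, rng, lower=0.0):
    """Normal draw, rejecting values at or below `lower`."""
    while True:
        value = rng.gauss(mean, sd)
        if value > lower:
            return value


rng = random.Random(0)
# Hypothetical parameter values, chosen only to exercise the functions.
nests_per_individual = [zero_truncated_nbinom(2.0, 1.5, rng) for _ in range(500)]
residence_times = [truncated_normal(5.0, 3.0, rng) for _ in range(500)]
print(min(nests_per_individual), round(min(residence_times), 2))
```

Rejection is exact for truncated distributions but becomes slow when most of the probability mass lies in the rejected region, which is not the case for the moderate means assumed here.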
Table 1 Parameter estimates obtained from Dhamelincourt et al., 2021a and Dhamelincourt, 2021b; m = males, f = females. With for negative binomial distributions and Variance = sd^{2} for normal distributions.
Table 2 Percentage of over- or underestimation of the median of the spawner abundance estimate obtained with the default values, from 0.1 to twice the default value regarding our field data and CMR model; m = males, f = females.
Table 3 Percentage of over- or underestimation of the standard deviation of the spawner abundance estimate obtained with the default values, from 0.1 to twice the default value regarding our field data and CMR model; m = males, f = females.
2.3 Individualbased model processes
In our individual-based model, each individual has a continuous set of days during which it may build nests and encounter other individuals, depending on its own delay and residence time. Since individuals reach a given spawning ground sequentially, we set up an individual delay of arrival, defined as the number of days since the arrival of the first spawner. Each day d, an individual builds a nest with other individuals if it and the other individuals are active and satisfy the following conditions. First, an individual cannot exceed a maximum number of nests built during the season. Second, an individual can only be involved in the building of one nest at a time. Third, a nest cannot be built by more males and females than a random limit, specific to each sex. Finally, if an individual is active but finds no partner, it can initiate a nest on its own, as can be observed for both males and females. Figure 2 describes these successive conditions and their consequences on whether a nest is built or not.
This process is repeated each day until the end of the season, when all individuals have reached either their maximum number of nests or their maximum residence time on the spawning ground (linked to the rapid senescence associated with semelparity). The model then returns a daily number of active nests (i.e., occupied nests), but not the total number of nests detectable on the spawning ground, which includes nests that were completed on previous days. Indeed, depending on hydrological conditions, nests can be detected from a few days to several weeks after they have been built. It is easier for operators to count only the active nests, because this does not require labelling the nests each day to avoid double counting from one day to the next, which would complicate a protocol that is intended to be simple. Figure 3 shows the process through the example of a hypothetical individual spawning season.
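The daily process described in this section can be sketched as follows. This is a heavily simplified Python stand-in, not the published model code: delay is drawn uniformly, the other individual parameters are fixed constants instead of being drawn from the distributions of Table 1, a nest stays active for its full duration even if a builder leaves, and all names and default values are assumptions.

```python
import random


def simulate_active_nests(n_individuals, season_length, rng,
                          sex_ratio=0.5, max_nests=3, max_per_sex=2,
                          nest_duration=2, max_delay=20, residence=10):
    """Return the number of occupied nests on each day of a simulated season."""
    individuals = []
    for _ in range(n_individuals):
        arrival = rng.randrange(max_delay)
        individuals.append({
            "sex": "m" if rng.random() < sex_ratio else "f",
            "arrival": arrival,
            "departure": arrival + residence,  # first day the individual is gone
            "nests_left": max_nests,
            "busy_until": 0,                   # first day the individual is free
        })
    open_nests = []    # for each occupied nest, the first day it is empty again
    daily_counts = []
    for day in range(season_length):
        open_nests = [end for end in open_nests if end > day]
        free = [ind for ind in individuals
                if ind["arrival"] <= day < ind["departure"]
                and ind["nests_left"] > 0 and ind["busy_until"] <= day]
        rng.shuffle(free)
        while free:
            # A random cap per sex limits how many builders share one nest;
            # a lone individual still starts a nest on its own.
            caps = {"m": rng.randint(1, max_per_sex),
                    "f": rng.randint(1, max_per_sex)}
            builders, remaining = [], []
            for ind in free:
                if caps[ind["sex"]] > 0:
                    caps[ind["sex"]] -= 1
                    builders.append(ind)
                else:
                    remaining.append(ind)
            for ind in builders:
                ind["nests_left"] -= 1
                ind["busy_until"] = day + nest_duration
            open_nests.append(day + nest_duration)
            free = remaining
        daily_counts.append(len(open_nests))
    return daily_counts


daily_counts = simulate_active_nests(100, 47, random.Random(42))
print(max(daily_counts), sum(daily_counts))
```

Running such a simulator for many candidate population sizes and summarizing each output series is what supplies the simulated statistics compared with the field counts in the ABC step.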
Model code is freely available at Data INRAE.
Fig. 2 Decision tree synthesizing the actions of the individual depending on the conditions implemented in the model. This decision tree is repeated for each individual i each day d. 
Fig. 3 Spawning season of a hypothetical individual i generated by the individual-based ABC model. This individual (either a male or a female) first met two available partners (either males, females or both) on day d. On day d + 1, the individual was still on the nest built the previous day, but one partner left the nest, possibly joining or building another one. Then, at d + 2, the individual built a new nest and was joined by only one of the two partners available that day. Indeed, available partners may join or build another nest with other individuals. At d + 3, individual i built a nest on its own, as no other partner was available. Finally, from d + 4 until the residence time was reached, the individual did not build or join more nests, as its maximum number of nests was reached at d + 3. It finally disappeared from the pool of individuals at d + residence time + 1, simulating its death. The table indicates the number of active nests belonging to this individual. Each day this individual was observed on a nest, +1 was added to the count of “active” nests. For this reason, four nests appear in the count although only three nests were built.
2.4 Model validation
The estimates produced by our model were compared to the estimates from a CMR model fitted to the data obtained on the individuals marked in the Nive in 2019 (Dhamelincourt et al., 2021a).
2.5 Sensitivity analysis
Sensitivity of our model to parameter variations was assessed using the default parametrization but changing one parameter value for each run. Twenty values were tested for each parameter and its variability (standard deviation or dispersion, depending on the distribution), from 0.1 × observed value to 2 × observed value with a step of 0.1. The nest duration varied only between four values, because integer values are required and the range of realistic variations is relatively limited (no nest occupied for more than four days was observed or reported elsewhere). Deviation from the spawner abundance estimated with CMR was then quantified using the mean ± sd of this deviation.
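The one-at-a-time design can be written down in a few lines. The sketch below builds the grid of twenty multipliers and expresses a deviation as a percentage of a reference value; the function names and the default parameter values are hypothetical, and the ABC run itself is left out.

```python
def sensitivity_grid(default_value, low=0.1, high=2.0, step=0.1):
    """Values from low x default to high x default in increments of step x default."""
    n_steps = round((high - low) / step) + 1
    # Rounding removes floating-point residue such as 0.30000000000000004.
    return [round(default_value * (low + i * step), 10) for i in range(n_steps)]


def relative_deviation_pct(estimate, reference):
    """Signed over- or underestimation of `estimate`, in percent of `reference`."""
    return 100.0 * (estimate - reference) / reference


multipliers = sensitivity_grid(1.0)

# Hypothetical default values; each parameter is varied while the others stay fixed.
defaults = {"delay_sd": 4.0, "nests_mu": 2.0}
runs = [(name, value)
        for name, default in defaults.items()
        for value in sensitivity_grid(default)]
print(len(multipliers), len(runs))
```

Each (parameter, value) pair in `runs` would correspond to one full ABC estimation, whose median and standard deviation are then compared with the reference.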
2.6 Sampling schemes simulation
To test how the accuracy of the estimated number of spawners varied with the way field data are collected, we simulated a variety of sampling schemes, in terms of both frequency and regularity of field work. Four frequencies were simulated: (a) every day of the season, (b) half of the days, (c) once a week, (d) once every two weeks. For each of the last three frequencies, two regularities were simulated: (1) regular: every other day, every seven days, or every 14 days, or (2) randomized across the season (e.g., possibly two days in a row followed by 12 days off, for the weekly frequency). This combination of frequency and regularity therefore resulted in seven sampling schemes. First, a full spawning season was simulated and served as a basis. For all sampling schemes, the summary statistics were calculated on a subset of this full spawning season according to the sampling scheme considered. A single set of statistics was calculated for the full sampling scheme, but for the nest count every other day we used the two possibilities of simulated field sampling (starting on either the first or the second day of the season). For all other non-random sampling schemes, we calculated seven summary statistic sets corresponding to all the different possibilities of field campaigns (count every other week and weekly count with a fixed 7-day step). For random ones, and to be consistent across the sampling schemes, we kept the same number of sets (seven) even if there were many more possible combinations. For each instance of each sampling scheme, the ABC model estimated a spawner abundance. The bias, or inaccuracy, associated with each sampling scheme was computed as the difference between the median of the estimates obtained from the sampling scheme and the median of the estimates obtained with the full data. The imprecision associated with each sampling scheme was reflected by the differences among the estimates obtained from that sampling scheme.
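The regular and randomized schedules can be generated as sketched below, with days indexed from 0. The function names are hypothetical and the 47-day season length is borrowed from the field survey as an assumption. The sketch enumerates every possible starting day of a regular schedule, which gives two sets for the every-other-day scheme and seven for the weekly one.

```python
import random


def regular_schemes(season_length, step):
    """All regular schedules with a fixed step, one per possible starting day."""
    return [list(range(start, season_length, step)) for start in range(step)]


def random_scheme(season_length, n_days, rng):
    """A randomized schedule with a given number of survey days."""
    return sorted(rng.sample(range(season_length), n_days))


def subsample(daily_counts, days):
    """Counts that would have been recorded on the surveyed days only."""
    return [daily_counts[d] for d in days]


season_length = 47
every_other_day = regular_schemes(season_length, 2)
weekly = regular_schemes(season_length, 7)
# Same effort as the first weekly schedule, but on randomly chosen days.
randomized_weekly = random_scheme(season_length, len(weekly[0]), random.Random(3))
print(len(every_other_day), len(weekly), randomized_weekly)
```

The summary statistics are then recomputed on each subsample and passed to the ABC model in place of the full-season statistics.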
3 Results
3.1 Spawner abundance estimate
After k = 4600 simulations (a number that depends on the decreasing sequence of tolerance levels automatically determined by Lenormand's algorithm), the model estimated a mean value of 148 ± 18 individuals, males and females combined. This corresponds to a 25% underestimation compared to the CMR model (197 ± 17 individuals). The estimated distributions are shown in Figure 4. The distribution is, however, narrower for the ABC model when the confidence intervals are considered.
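The 25% figure follows directly from the two mean estimates, taking the CMR estimate as the reference:

```latex
\frac{197 - 148}{197} = \frac{49}{197} \approx 0.249
```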
Fig. 4 Median and 95% confidence interval of the spawner abundance estimate obtained with (1) the ABC model (blue and dashed lines) using the set of parameters described in Table 1 and (2) the Capture-Mark-Recapture model (orange and dotted lines) from Dhamelincourt et al. (2021a).
3.2 Sensitivity analysis
Sensitivity analysis indicated substantial variability in the effects of parameter values on both the median (Table 2) and the standard deviation (Table 3) of the spawner abundance estimate. In each table, the parameters whose variation affected the model output the most are the ones with the most contrasting rows. The further the value moves from green to either yellow or blue, the further the estimated abundance is from the one obtained with the default parameters.
For the median, a lower standard deviation of the delay of arrival for males and females induced an underestimation of 1 to 70%, while higher values caused an overestimation of 3 to 21%. The mean parameter (mu) of the zero-truncated negative binomial distribution, which assigns a number of nests per individual, caused an underestimation of 2% to 27% when higher than the initial value, while increasing the estimated spawner abundance by up to 31% when set 90% lower than the initial value. The sex ratio is another parameter that strongly influenced the median estimate when set to high values, corresponding to a high proportion of males. A sex ratio of 0.95 decreased the spawner abundance estimate by 55%, while a change of similar magnitude in the other direction (sex ratio = 0.05) only decreased the estimate by 7%. A longer occupation of a nest tended to lower the estimated spawner abundance, with a decrease of 30% for nests occupied for three days.
The standard deviation of the spawner abundance estimate followed less progressive trends from low to high parameter values. However, the model was again highly sensitive to the standard deviation of the delay. Roughly, low values decreased the uncertainty (down to 58%) while high values increased it (up to 25%). As observed for the median, the mean parameter (mu) of the number of nests per individual strongly influenced the final standard deviation of the estimate, with lower uncertainty at high values. The mean parameter of the number of females per nest showed the opposite trend: low values decreased the standard deviation, down to 30%. The nest duration reduced the uncertainty down to 23% at high values. It is difficult to identify trends for the other parameters, which can either increase or decrease the standard deviation whether they increase or decrease.
3.3 Sampling schemes simulation
The summary statistics obtained from the different sampling schemes (Fig. 5) indicated a general increase in variability with a decreasing sampling effort, showing the difficulty to obtain the same statistics for a scheme when a reduced field sampling frequency was repeated. This is especially evident for the maximal number of active nests on a given day, ranging from 6 to 15 for the four least intensive sampling schemes, without the real value (17 nests) included. To a lesser extent, the tendency was the same for the other statistics. The “once in two weeks” random sampling was the worst, with the highest variability among all schemes.
The spawner abundance estimates (Fig. 6) reflected the results of the summary statistics. Sampling schemes with the most variable summaries also had the most variable estimates. While the “every other day” scheme provided estimates between −12% and +5% of the “full sampling” estimate, and the “once in two days” random scheme estimates between −6% and +13%, the less intensive schemes were more variable. The estimates of the “weekly count” lay between −44% and +27% of the full-effort estimate. The same sampling intensity, but applied randomly, gave estimates between −52% and +11%. Finally, the “every two weeks” sampling scheme provided estimates between −29% and +40%, while its randomly applied version provided estimates between −55% and +42%. As the model always uses the same number of statistics, which are more or less representative of the spawning season, the uncertainty did not increase between simulations. Decreasing the sampling effort seems to affect the accuracy more than the precision of the estimates.
Fig. 5 Values of summary statistics calculated with the distribution of active nests, obtained assuming different sampling schemes (maximum, median, mean, Q_{25} and Q_{75}), then used for ABC computation. 
Fig. 6 Comparison of the spawner abundance estimated by ABC models for the different sampling schemes studied, each sampling scheme being repeated several times. Each point corresponds to one of the 100 values provided by the model to calculate the estimate as a distribution, presented here as a boxplot. The horizontal red line corresponds to the estimate obtained with a daily sampling scheme. For all non-random sampling schemes (except the daily nest count and the "every other day" scheme, which only has two possibilities) we calculated seven summary statistic sets corresponding to all the different possibilities of field campaigns (count every other week and weekly count with a fixed 7-day step). For random ones, we kept the same number of sets (seven) even if they have many more possible combinations, in order to be consistent across the sampling schemes. Coloured boxes correspond to the range of spawner abundance estimated within each sampling scheme and make it possible to visualize the interval between the lowest and highest estimates.
4 Discussion
The objectives of the study were to (1) build an individual-based model simulating a daily number of active nests produced by lamprey spawners; (2) evaluate the model performance by comparing the estimate with those of a previous CMR model using the same data; (3) determine the sensitivity to parameter magnitudes; (4) assess the effects of several sampling designs on the estimates; and (5) discuss the implications and evolvability of the model from a management perspective. After building the model, we obtained a spawner abundance estimate 25% below the one given by the CMR method. We identified delay of arrival, sex ratio, nest duration, number of nests per individual and number of individuals per nest as the most sensitive parameters. In addition, we showed that an “every other day” or “once in two days” sampling scheme decreased the accuracy to a lesser extent than less intensive sampling schemes. We will now discuss the implications and evolvability of our approach.
The spawner abundance estimated with the ABC model was 25% below that of the CMR method. This result suggests that our model may produce a biased estimate, possibly caused by two factors. First, even if we simulated a realistic nest-building process using the knowledge available from previous work on the sea lamprey spawning season (e.g., Applegate, 1950; Hardisty and Potter, 1971; Johnson et al., 2015; Dhamelincourt et al., 2021a), we may have missed important behaviours determining nest-building. One hypothesis is that some other parameters determine the arrangement of individuals within nests. In the model, individuals can begin or join a nest depending solely on the number of individuals they may spawn with or the number of nests they should visit. However, no information is provided on the influence of environmental parameters such as the water temperature or the density of individuals on the spawning ground. Water temperature is known to influence the spawning activity of sea lamprey, with a threshold for nest-building behaviour of 15 °C (Manion and Hanson, 1980) and a strong sensitivity to sudden drops of 1 or 2 °C (Applegate, 1950; Manion and McLain, 1971). During the 2019 spawning season, the mean temperature was below 15 °C on 26 days, and even if we observed spawning activity, we possibly overestimated the capacity of individuals to build nests on these days. However, little information exists to define a continuous nest-digging probability depending on temperature, making it difficult to implement this variable in our model. Furthermore, the density of individuals may increase the competition for favourable nesting habitat and increase the number of individuals per nest, therefore modifying the individual parameters throughout the spawning season.
On some days we observed a peak of active nests (15 active nests observed on June 3rd, 2019), possibly with increased nest-building cooperation compared to days with lower activity and/or density. Another hypothesis concerning the model underestimation is the lack of precision of some of our parameters. Among the most sensitive parameters, the delay of arrival, the sex ratio, the number of nests and the nest duration had the greatest influence on the spawner abundance estimate. We determined these parameters using the protocol described in Dhamelincourt et al. (2021a), with no continuous monitoring of the spawning site, especially at night. Consequently, we possibly missed part or all of the breeding activity of some individuals, which would explain the uncertainty of our parameters. For that reason, we recommend that future studies primarily estimate these parameters. As the model code is freely available and the dedicated application allows easy modification of the parameters, we regard our model as a baseline to be improved by complementary studies. Although some parameters (obtained from Dhamelincourt et al., 2021a) used in our model are consistent with observations made in other populations (e.g., Applegate, 1950 for the number of individuals per nest in the Ocqueoc river, MI, USA; Gardner et al., 2012 for the duration of individual activity in the Sedgeunkedunc stream, ME, USA), some parameters may vary across populations, depending on the genetic background of the population, habitat features, or the local density of lamprey. Additional studies describing lamprey spawning behaviour in different localities will therefore be welcome to refine our model and possibly adapt it to local situations.
In this perspective, our sensitivity analysis should help users prioritize the parameters on which to collect information, depending both on their effect on the estimated number of spawners and on the suspicion that they may locally differ from what Dhamelincourt et al. (2021a) observed at their study site. For example, a user working at a site situated upstream of an obstacle whose permeability varies strongly with flow may want to design a telemetry-based experiment to assess the mean and standard deviation of the delay of individual arrival on the spawning ground of interest. Likewise, if local conditions are suspected to bias the adult sex ratio, sexing migrating adults caught by a nearby fishery may help adapt this parameter in the model. Since all parameters are needed to run the model, users must either supply parameters collected from their own population or keep the default values.
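For readers less familiar with the inference scheme discussed above, the following Python sketch illustrates the general logic of rejection-based ABC on a deliberately simplified case. It is not the model used in this study: the simulator `simulate_active_nests`, its parameter values, the uniform prior and the plain rejection step are all assumptions made for illustration. Only the choice of summary statistics (maximum, median, mean, Q25 and Q75 of the daily active nest counts) follows Fig. 5.

```python
import random
import statistics

def simulate_active_nests(n_spawners, season_days=30, mean_arrival=10.0,
                          sd_arrival=4.0, nest_duration=3, rng=None):
    """Toy simulator (hypothetical): each spawner arrives on a random day
    and occupies a single nest for `nest_duration` consecutive days."""
    rng = rng or random.Random()
    counts = [0] * season_days
    for _ in range(n_spawners):
        arrival = int(round(rng.gauss(mean_arrival, sd_arrival)))
        for day in range(arrival, arrival + nest_duration):
            if 0 <= day < season_days:
                counts[day] += 1
    return counts

def summary_stats(counts):
    """Maximum, median, mean, Q25 and Q75 of the daily counts."""
    q25, _, q75 = statistics.quantiles(counts, n=4, method="inclusive")
    return [max(counts), statistics.median(counts),
            statistics.fmean(counts), q25, q75]

def distance(a, b):
    """Euclidean distance between two vectors of summary statistics."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def abc_rejection(observed, prior_max=200, n_sim=2000, keep=100, seed=1):
    """Draw abundances from a uniform prior (an assumption), simulate a
    season for each draw, and keep the `keep` draws whose summary
    statistics are closest to those of the observed counts."""
    rng = random.Random(seed)
    obs_stats = summary_stats(observed)
    draws = []
    for _ in range(n_sim):
        n = rng.randint(1, prior_max)
        sim = simulate_active_nests(n, season_days=len(observed), rng=rng)
        draws.append((distance(summary_stats(sim), obs_stats), n))
    draws.sort()
    return [n for _, n in draws[:keep]]

# Synthetic "field data" generated with a known abundance of 60 spawners.
observed = simulate_active_nests(60, rng=random.Random(42))
posterior = abc_rejection(observed)
posterior_median = statistics.median(posterior)
```

The retained draws approximate a posterior distribution of abundance, from which a median and an interval can be read. Replacing the toy simulator with a behaviourally realistic one is what makes the estimate meaningful in practice.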
Sensitivity analysis showed that some parameters strongly influenced the spawner abundance estimate. Low values of the standard deviation of our parameter “delay” led to an underestimation of spawner abundance, whereas high values led to an overestimation. This result is consistent with the individuals' behaviour. Indeed, when many individuals arrive on the spawning ground at the same time (low standard deviation values), they should build fewer nests but with more individuals, due to the space limit (Ostfeld, 1986; Harris et al., 1995) or the attractiveness of already built nests for opportunistic individuals with poor body condition (Harris, 2008). In contrast, when individuals arrive sequentially, they have less opportunity to join or be joined by other spawners within a nest and are therefore more likely to build nests with few individuals. The mean number of nests per individual is another parameter of great influence on the spawner abundance estimate. This result is expected since this parameter directly influences the nesting process. A high number of nests per individual decreased the final estimate, while a low digging capacity implied that more individuals were needed to account for a given number of nests. The sex ratio is another important parameter, as high values, indicating a sex ratio largely in favour of males, decreased the spawner abundance estimate. Here again, this result indicates a consistent functioning of the model, as a high number of males means a high number of nests per individual (because the maximal number of males allowed per nest is lower than the maximal number of females), therefore decreasing the number of individuals required to build the number of active nests observed on a given day. Our sensitivity analysis points to the parameters on which future studies may focus in order to improve the performance of our model. 
Moreover, the parameters to which our model's output was the most sensitive are also likely to vary across populations, or across years for a given population. Our results therefore also highlight the parameters that users of our model may need to determine for their own system in order to parametrize the model accordingly and obtain an accurate estimate for their population of interest.
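The one-at-a-time logic of such a sensitivity analysis can be sketched in Python as follows. This is a minimal illustration, not the analysis code of this study: `toy_estimator` is a hypothetical stand-in for a full ABC run, its default values are invented, and the intermediate multipliers between 0.1 and 2 are assumptions. Each parameter is scaled in turn while the others stay at their defaults, and the change in the estimate is expressed as a percentage of the default estimate.

```python
def percent_deviation(estimate, reference):
    """Over- (positive) or underestimation (negative), in percent."""
    return 100.0 * (estimate - reference) / reference

def one_at_a_time(estimator, defaults, multipliers=(0.1, 0.5, 1.0, 1.5, 2.0)):
    """Scale each parameter in turn, keeping the others at their default,
    and report the percent deviation from the default estimate."""
    reference = estimator(**defaults)
    table = {}
    for name, default in defaults.items():
        row = []
        for m in multipliers:
            params = dict(defaults, **{name: default * m})
            row.append(round(percent_deviation(estimator(**params), reference), 1))
        table[name] = row
    return table

def toy_estimator(nests_per_individual, nest_duration, nest_days_observed=120):
    """Hypothetical stand-in for the model: the number of spawners needed
    to account for a fixed number of observed nest-days."""
    return nest_days_observed / (nests_per_individual * nest_duration)

# Invented default values, for illustration only.
defaults = {"nests_per_individual": 2.0, "nest_duration": 3.0}
table = one_at_a_time(toy_estimator, defaults)
```

In this toy case, doubling the number of nests per individual halves the estimate, which mirrors the direction of the effect reported above; the magnitudes have no bearing on the real model.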
The simulation of several sampling schemes, from a daily nest count to a count once every two weeks, showed an important decrease in the accuracy of the spawner abundance estimate for the least intensive sampling schemes. Because the summary statistics were biased compared to the reality of the nest-building process and highly dependent on the days monitored, these estimates were largely above or below the reference value. Determining a trade-off between cost and accuracy requires thinking about the limit one would like to set on the bias of the estimate. According to our results, it appeared necessary to count the nests on at least half the days of a spawning season, with days either randomly chosen or following a regular every-other-day step. For both of them, the bias did not exceed 13%, while it reached up to 50% with weekly counts. Thus, the operator is free to choose which days to survey as long as half of the spawning season is covered. To illustrate the sensitivity of our model's performance to sampling design, we only simulated six sampling schemes differing in frequency and evenness of nest counts. However, a potential user of our model could also simulate a raw dataset inspired by the probable magnitude of spawner abundance, apply several custom sampling schemes based on their constraints (e.g., operator availability, upper limit on total number of days in the field), and run our model on each generated dataset. This would allow assessing the accuracy and precision of the estimate associated with each sampling scheme considered, and scheduling the field season accordingly.
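As an illustration of how custom sampling schemes could be applied to a daily series before running the model, the Python sketch below thins a series of daily active nest counts and recomputes the summary statistics on the monitored days only. The function names and the example counts are hypothetical, and the use of `None` for unmonitored days is an assumption of this sketch.

```python
import random
import statistics

def regular_scheme(daily_counts, step, offset=0):
    """Keep one count every `step` days, starting on day `offset`
    (0 <= offset < step); other days become None (not monitored)."""
    return [c if (i - offset) % step == 0 else None
            for i, c in enumerate(daily_counts)]

def random_scheme(daily_counts, n_days, rng):
    """Keep `n_days` randomly chosen days; other days become None."""
    monitored = set(rng.sample(range(len(daily_counts)), n_days))
    return [c if i in monitored else None
            for i, c in enumerate(daily_counts)]

def summary_stats(sampled):
    """Maximum, median, mean, Q25 and Q75 over monitored days only."""
    counts = [c for c in sampled if c is not None]
    q25, _, q75 = statistics.quantiles(counts, n=4, method="inclusive")
    return {"max": max(counts), "median": statistics.median(counts),
            "mean": statistics.fmean(counts), "q25": q25, "q75": q75}

# Invented daily counts of active nests, for illustration only.
daily = [0, 1, 3, 5, 8, 6, 4, 2, 1, 0]

every_other_day = regular_scheme(daily, step=2, offset=0)
random_half = random_scheme(daily, n_days=len(daily) // 2, rng=random.Random(7))

stats_daily = summary_stats(daily)
stats_every_other_day = summary_stats(every_other_day)
stats_random_half = summary_stats(random_half)
```

Changing `offset` enumerates the alternative field campaigns of a regular scheme (two for an every-other-day step, seven for a weekly step). Comparing the resulting statistics with those of the daily series gives a first impression of how much information a given scheme discards.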
In order to make the model easily usable by anyone interested in sea lamprey population management, even without R coding knowledge, we developed a user-friendly web application (https://mdhamelincourt.shinyapps.io/Lamproie_tracker/; Chang et al., 2015). The simplest way to use it is to upload a dataset and launch the analysis, using the default parameter values given in the present article. A .csv dataset is required, with a first column corresponding to the days of monitoring and a second column indicating the number of active nests counted each day. Column names do not matter, but non-monitored days must be marked 'NA' if they appear in the dataset. A value of 0 is interpreted as a day without active nests. This dataset must be loaded in the tab “Chargement des données” (data loading), where the user can check that the data are correctly formatted (some options help with loading the dataset when necessary). Then, one just launches the analysis in the tab “Lancement de l'analyse” (analysis launch) and waits a few minutes until the end of the computation. A plot of the posterior distribution of spawner abundance is then displayed, and the user can save it or simply note the median and 95% confidence interval of the spawner abundance estimate. As we wanted the model to be adjustable to the study site and the characteristics of the population studied, the user can modify all the parameters of the model in the tab “Paramètres” (parameters). However, as the estimates can change drastically depending on these values, the user must have reliable information before changing them. If supplementary information needs to be included in the model, the code is freely available (https://github.com/MariusDhamelincourt/Lamproietracker; license CC BY-NC-SA) and can be used outside of the application. We did not include this possibility inside the application as it would have compromised ease of use. In addition, it may be possible to infer more parameters than solely spawner abundance. 
For example, the model could estimate any other parameter used to simulate the spawning behaviour (see Tab. 1), such as the residence time. However, the objective of our model is specifically to estimate abundance. Furthermore, adding parameters to estimate may require more computation time while providing information that would not be of interest to managers. Although the model was used in this study to estimate spawner abundance at the scale of a single river section, the time required to build a nest and the brevity of the spawning season (residence time estimated at 8.33 ± 1.02 days for males and 3.57 ± 1.04 days for females) make it unlikely that individuals will be highly mobile once spawning has begun. Thus, implementing this model on all the spawning sites of a system, or at least on the main sites, may allow obtaining a spawner abundance estimate at the watershed scale by simply summing the estimates provided by the model. However, if additional data revealed high mobility of lamprey across spawning sites, the model could be made more complex accordingly, with parameters such as the rate of inter-site migration. We believe our model to be adaptable to any nest-building fish such as salmonids, simply by changing the parameter values or adding some other biological features. Since nest-building is a widespread behaviour among fish species (Bessa et al., 2022), this model may be an interesting tool when no practical method exists to accurately estimate populations.
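The input file layout expected by the web application described above (one column of monitoring days, one column of active nest counts, 'NA' for non-monitored days and 0 for days without active nests) can be illustrated with a short Python sketch. The column names, the date format, the comma separator and all counts except the 15 nests of June 3rd, 2019 are assumptions made for the example; the function `read_nest_counts` is hypothetical and is not part of the application.

```python
import csv
import io

def read_nest_counts(handle):
    """Parse a two-column CSV (day, active nests) with a header row.
    Returns (day, count) pairs; count is None for non-monitored days."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row: column names do not matter
    records = []
    for row in reader:
        if not row:
            continue  # ignore empty lines
        day, raw = row[0].strip(), row[1].strip()
        # 'NA' marks a non-monitored day; 0 is a real observation.
        count = None if raw == "NA" else int(raw)
        records.append((day, count))
    return records

# Example file content; only the value for 2019-06-03 comes from the text.
example = io.StringIO(
    "date,active_nests\n"
    "2019-06-01,4\n"
    "2019-06-02,NA\n"
    "2019-06-03,15\n"
    "2019-06-04,0\n"
)
records = read_nest_counts(example)
monitored = [count for _, count in records if count is not None]
```

Keeping 'NA' distinct from 0 matters because a missing count carries no information about nest activity, whereas a zero is an observation that the summary statistics should include.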
The model described in this paper aims to provide sea lamprey population managers with an easy-to-use, accurate and economical way to overcome the limits of current methods for estimating sea lamprey populations. Even though our model requires adapting the parameter values to the population considered, we believe this framework to be a good way to facilitate and increase data collection for this species, which is both endangered and invasive. The flexibility of the model architecture also allows for adaptation to other nest-building fish species with minor modifications.
Funding
Operating costs were funded by Pôle Gestion des Migrateurs Amphihalins dans leur Environnement. M.D.'s PhD is financed by Univ. Pau & Pays Adour and UPV/EHU.
Conflict of interest statement
All authors disclose any potential sources of conflict of interest.
Data availability statement
Analyses reported in this article can be reproduced using the data provided in Data INRAE.
Author contribution
M.D. conceived and designed the model, collected and analyzed field data and drafted the manuscript; A.E. critically revised the manuscript; C.T. contributed to the conception and design of the study, obtained the funding, contributed to data analysis and critically revised the manuscript. All authors gave final approval for publication and agree to be held accountable for the work performed therein.
Acknowledgements
Field work used resources from the IE ECP Experimental Facility of the UMR Ecobiop (ECP, 2018).
References
Al-Chokhachy R, Budy P, Schaller H. 2005. Understanding the significance of redd counts: A comparison between two methods for estimating the abundance of and monitoring bull trout populations. N Am J Fish Manag 25: 1505–1512.
Amrhein V, Korner P, Naguib M. 2002. Nocturnal and diurnal singing activity in the nightingale: correlations with mating status and breeding cycle. Anim Behav 64: 939–944.
Applegate VC. 1950. Natural history of the sea lamprey, Petromyzon marinus, in Michigan. Federal Government Series No. 55. U.S. Fish and Wildlife Service.
Beaulaton L, Taverny C, Castelnaud G. 2008. Fishing, abundance and life history traits of the anadromous sea lamprey (Petromyzon marinus) in Europe. Fish Res 92: 90–101.
Beaumont MA. 2010. Approximate Bayesian computation in evolution and ecology. Annu Rev Ecol Evol System 41: 379–406.
Bessa E, Brandão ML, Gonçalves-de-Freitas E. 2022. Integrative approach on the diversity of nesting behaviour in fishes. Fish Fish 23: 564–583.
Boulêtreau S, Carry L, Meyer E, Filloux D, Menchi O, Mataix V, Santoul F. 2020. High predation of native sea lamprey during spawning migration. Sci Rep 10: 6122.
Chang W, Cheng J, Allaire J, Xie Y, McPherson J. 2015. Package ‘shiny.’
Csilléry K, Blum MGB, Gaggiotti OE, François O. 2010. Approximate Bayesian Computation (ABC) in practice. Trends Ecol Evol 25: 410–418.
Dhamelincourt M, Buoro M, Rives J, Sebihi S, Tentelier C. 2021a. Individual and group characteristics affecting nest building in sea lamprey (Petromyzon marinus L. 1758). J Fish Biol 98: 557–565.
ECP. 2018. Ecology and Fish Population Biology Facility. INRAE.
Estoup A, Wilson IJ, Sullivan C, Cornuet JM, Moritz C. 2001. Inferring population history from microsatellite and enzyme data in serially introduced cane toads, Bufo marinus. Genetics 159: 1671–1687.
Farmer GJ, Beamish FWH, Robinson GA. 1975. Food consumption of the adult landlocked sea lamprey, Petromyzon marinus, L. Comp Biochem Physiol Part A: Physiol 50: 753–757.
Funk WC, Almeida-Reinoso D, Nogales-Sornosa F, Bustamante MR. 2003. Monitoring population trends of Eleutherodactylus frogs. J Herpetol 37: 245–256.
Gardner C, Coghlan SM Jr, Zydlewski J. 2012. Distribution and abundance of anadromous sea lamprey spawners in a fragmented stream: current status and potential range expansion following barrier removal. Northeast Nat 19: 99–110.
Hamstreet CO. 2012. Spring and Summer Chinook Salmon. US Fish and Wildlife Service, Leavenworth Washington 26.
Hardisty MW, Potter IC. 1971. The Biology of Lampreys Volume 2, Academic Press, 488 p.
Harris RN. 2008. Body condition and order of arrival affect cooperative nesting behaviour in four-toed salamanders Hemidactylium scutatum. Anim Behav 75: 229–233.
Harris RN, Hames WW, Knight IT, Carreno CA, Vess TJ. 1995. An experimental analysis of joint nesting in the salamander Hemidactylium scutatum (Caudata: Plethodontidae): the effects of population density. Anim Behav 50: 1309–1316.
Hume JB, Almeida PR, Buckley CM, Criger LA, Madenjian CP, Robinson KF, Wang CJ, Muir AM. 2021. Managing native and non-native sea lamprey (Petromyzon marinus) through anthropogenic change: a prospective assessment of key threats and uncertainties. J Great Lakes Res 47: S704–S722.
Jabot F, Chave J. 2009. Inferring the parameters of the neutral theory of biodiversity using phylogenetic information and implications for tropical forests. Ecol Lett 12: 239–248.
Jabot F, Faure T, Dumoulin N, Albert C. 2015. EasyABC: a R package to perform efficient approximate Bayesian computation sampling schemes. 38.
Jackson C. 2011. Multi-State Models for Panel Data: The msm Package for R. J Stat Softw 38: 1–28.
Johnson NS, Buchinger TJ, Li W. 2015. Reproductive ecology of lampreys. In: Docker MF, ed. Lampreys: Biology, Conservation and Control: Volume 1. Netherlands, Dordrecht: Springer, pp. 265–303.
Johnston CA, Windels SK. 2015. Using beaver works to estimate colony activity in boreal landscapes. J Wildlife Manag 79: 1072–1080.
Koizumi I, Shimatani IK. 2016. Socially induced reproductive synchrony in a salmonid: an approximate Bayesian computation approach. Behav Ecol 27: 1386–1396.
Kouakou CY, Boesch C, Kuehl H. 2009. Estimating chimpanzee population size with nest counts: validating methods in Taï National Park. Am J Primatol 71: 447–457.
Kruschke JK. 2010. What to believe: Bayesian methods for data analysis. Trends Cogn Sci 14: 293–300.
Lasne E, Sabatié MR, Jeannot N, Cucherousset J. 2015. The effects of dam removal on river colonization by sea lamprey Petromyzon marinus. River Res Appl 31: 904–911.
Lenormand M, Jabot F, Deffuant G. 2013. Adaptive approximate Bayesian computation for complex models. Comput Stat 28: 2777–2796.
Manion PJ, Hanson LH. 1980. Spawning behavior and fecundity of lampreys from the upper three Great Lakes. Can J Fish Aquat Sci 37: 1635–1640.
Manion PJ, McLain AL. 1971. Biology of larval sea lampreys (Petromyzon marinus) of the 1960 year class, isolated in the Big Garlic River, Michigan, 1960-65. Organization Series No. 16. Great Lakes Fishery Commission.
McClintock BT, White GC. 2009. A less field-intensive robust design for estimating demographic parameters with mark-resight data. Ecology 90: 8.
McKinley TJ, Vernon I, Andrianakis I, McCreesh N, Oakley JE, Nsubuga RN, Goldstein M, White RG. 2018. Approximate Bayesian computation and simulation-based inference for complex stochastic epidemic models. Stat Sci 33: 4–18.
Migradour. 2010. Suivi de la reproduction de la Lamproie marine sur le bassin de l’Adour – Tranche 1/3, gaves et nives.
Morgan BJT, North PM, Ralph CJ, Scott JM. 1983. Estimating numbers of terrestrial birds. Biometrics 39: 1123.
Nielsen R, Beaumont MA. 2009. Statistical inferences in phylogeography. Mol Ecol 18: 1034–1047.
Ostfeld RS. 1986. Territoriality and mating system of California voles. J Anim Ecol 55: 691–706.
Pace III RM, Corkeron PJ, Kraus SD. 2017. State–space mark–recapture estimates reveal a recent decline in abundance of North Atlantic right whales. Ecol Evol 7: 8730–8741.
R Core Team. 2021. R: A language and environment for statistical computing.
Rodgers JA, Linda SB, Nesbitt SA. 1995. Comparing aerial estimates with ground counts of nests in wood stork colonies. J Wildlife Manag 59: 656–666.
Schwarz CJ, Seber GAF. 1999. Estimating animal abundance: review III. Stat Sci 14: 427–456.
Solow AR, Smith WK. 2009. Estimating species number under an inconvenient abundance model. JABES 14: 242–252.
Southwood TRE. 1978. Estimates based on products and effects of insects. In: Southwood TRE, ed. Ecological methods: with particular reference to the study of insect populations. Dordrecht: Springer Netherlands, pp. 288–301.
Turner BM, Van Zandt T. 2012. A tutorial on approximate Bayesian computation. J Math Psychol 56: 69–85.
Wuertz D, Setz T, Chalabi Y, Boudt C, Chausse P, Miklovac M. 2016. fGarch: Rmetrics - Autoregressive conditional heteroscedastic modelling.
Zeileis A, Kleiber C, Jackman S. 2008. Regression models for count data in R. J Stat Soft 27.
Cite this article as: Dhamelincourt M, Tentelier C, Elosegi A. 2023. ABC model for estimating sea lamprey local population size using a simple nest count during the spawning season. Knowl. Manag. Aquat. Ecosyst., 424, 5.
All Tables
Parameter estimates obtained from Dhamelincourt et al., 2021a and Dhamelincourt, 2021b; m = males, f = females. With for negative binomial distributions and Variance = sd^{2} for normal distributions.
Percentage of over or underestimation of the median of the spawner abundance estimate obtained with the default values, from 0.1 to twice the default value regarding our field data and CMR model; m= males, f= females.
Percentage of over or underestimation of the standard deviation of the spawner abundance estimate obtained with the default values, from 0.1 to twice the default value regarding our field data and CMR model; m= males, f= females.
All Figures
Fig. 1 Scheme of the process behind the ABC model implemented to estimate the sea lamprey spawner abundance with nest counts. 

Fig. 2 Decision tree synthesizing the actions of the individual depending on the conditions implemented in the model. This decision tree is repeated for each individual i each day d. 

Fig. 3 Spawning season of a hypothetical individual i generated by the individual-based ABC model. This individual (either a male or a female) first met two available partners (either males, females or both) on day d. On day d + 1, the individual was still on the nest built the previous day, but one partner left the nest, possibly joining or building another one. Then, at d + 2, the individual built a new nest and was joined solely by one of the two partners available this day. Indeed, available partners may join or build another nest with other individuals. At d + 3, the individual i built a nest on its own, as no other partner was available. Finally, from d + 4 until the residence time was reached, the individual did not build or join more nests, as its maximum number of nests was reached at d + 3. It finally disappeared from the pool of individuals at d + residence time + 1, simulating its death. The table indicates the number of active nests belonging to this individual. Each day this individual was observed on a nest, +1 was added to the count of “active” nests. For this reason, four nests appear in the count but three nests were built. 
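The difference between nests built and active nests counted in this example can be checked with a few lines of Python. The dictionary below is a hypothetical encoding of the schedule described in the caption, with arbitrary nest labels.

```python
# Nest occupied by the hypothetical individual on each day (None = no nest).
schedule = {
    "d": "nest 1",      # first nest, built with two partners
    "d+1": "nest 1",    # still on the nest built the previous day
    "d+2": "nest 2",    # new nest, joined by one partner
    "d+3": "nest 3",    # new nest, built alone
    "d+4": None,        # maximum number of nests reached
}

# Each day spent on a nest adds one to the count of active nests.
active_nest_days = sum(1 for nest in schedule.values() if nest is not None)

# Distinct nests actually built by the individual.
nests_built = len({nest for nest in schedule.values() if nest is not None})
```

Here `active_nest_days` is 4 while `nests_built` is 3, because the first nest is counted on two consecutive days.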

Fig. 4 Median and 95% confidence interval of spawner abundance estimate obtained with (1) the ABC model (blue and dashed lines) using the set of parameters described in Table 1 and (2) the Capture-Mark-Recapture model (orange and dotted lines) from Dhamelincourt et al. (2021a). 

Fig. 5 Values of summary statistics calculated with the distribution of active nests, obtained assuming different sampling schemes (maximum, median, mean, Q_{25} and Q_{75}), then used for ABC computation. 

Fig. 6 Comparison of the spawner abundance estimated by ABC models for the different sampling schemes studied, each sampling scheme being repeated several times. Each point corresponds to one of the 100 values provided by the model to calculate the estimate as a distribution, presented here as a boxplot. The horizontal red line corresponds to the estimate obtained with a daily sampling scheme. For all sampling schemes (except the daily nest count and the "every other day" scheme, which only has two possibilities) we calculated seven summary statistic sets. For non-random sampling schemes (count every other week and weekly count with a fixed 7-day step), these sets correspond to all the different possibilities of field campaigns. For random ones, we kept the same number of sets (seven) even if they have many more possible combinations, in order to be consistent across the sampling schemes. Coloured boxes correspond to the range of spawner abundance estimated within each sampling scheme and allow visualizing the interval between the lowest and highest estimate. 
