Introduction

Signals are elaborate and costly traits that can function via many modalities, including acoustic (e.g. bird song; Murphy et al. 2008; Nemeth et al. 2012; Baldo et al. 2015), visual (e.g. horns, tusks, various colourful traits; Geist 1966; West and Packer 2002; Tibbetts and Dale 2004; Loyau et al. 2005; Girard and Endler 2014; Graham et al. 2020) or behavioural ones, which combine both acoustic and visual elements (e.g. nuptial dance; McDonald 1989; Lukianchuk and Doucet 2014). Although research in recent decades has accumulated a large amount of information about communication in general and signalling in particular, the precise indicator value of traits presumably serving as signals, like colourful ornaments of the integument, is still unknown in many species (e.g. Kemp et al. 2012; Pérez-Rodríguez et al. 2017).

Conspicuous colour signals are usually characteristic of males in sexually dimorphic species, and are frequently used as sexual signals to attract females, which typically have more cryptic colouration (Dale et al. 2015). As a consequence of this common asymmetry between the sexes, there has been a strong bias, ever since Darwin (Darwin 1871), in research interest and effort towards studying species with strong sexual dimorphism, and in particular the role of elaborate ornaments in males in the context of sexual selection (e.g. Pérez i de Lanuza et al. 2013; Seddon et al. 2013; Dale et al. 2015). This phenomenon is well illustrated by the strongly male-skewed sex ratio found in museum collections from all over the world for both birds and mammals (Cooper et al. 2019). As a result, the adaptive role of female ornaments has been overlooked for a long time, and conspicuous female characters were considered purely as a “by-product” of the shared genome (i.e. genetic correlation) between males and females, without selection acting on these traits in females (Amundsen 2000). Studies from the last decades (e.g. Amundsen and Forsgren 2001; Jawor et al. 2004; Heinsohn et al. 2005; Komdeur et al. 2005; Weiss 2006; Weiss et al. 2009; Griggio et al. 2010; Baldauf et al. 2011; but see Hill 1993), however, provide considerable evidence that female colouration, similar to that of males, can be adaptive and evolutionarily persistent in contexts of sexual (e.g. mate choice, mate competition), social (e.g. competition over non-breeding resources) or natural selection (e.g. camouflage against predators) (Amundsen 2000; Clutton-Brock 2007; Rosvall 2011; Tobias et al. 2012).

Colour traits, including those serving as signals, can be structurally acquired (Prum 2006) or pigment-based (e.g. carotenoids, melanins; McGraw 2006a, b). Melanins are the most common pigment molecules found in the vertebrate integument and are of two types, eumelanins and pheomelanins, which are responsible for the brown to black and yellow to reddish-brown colourations, respectively (Ducrest et al. 2008). The central mechanism orchestrating melanin synthesis from amino acid precursors (i.e. tyrosine, cysteine) is the so-called melanocortin system (Ducrest et al. 2008). This system consists of the proopiomelanocortin (POMC) gene that produces four different types of melanocortin hormones (α-, β- and γ-MSH and ACTH) that bind to five different melanocortin receptors (MC1-5R). The main receptor expressed in the integument, thus responsible for colouration, is the MC1R, the other four receptors (MC2-5R) being expressed weakly at the level of skin but having a more important role in other physiological processes in various parts of the body. Overall, the location and hence functions of melanocortin receptors are highly diverse and include skin pigmentation, energy expenditure, stress response, immune capacity and behaviour (Ducrest et al. 2008; San-Jose and Roulin 2018). Eumelanin and pheomelanin are synthesised in different pathways, and the synthesis is controlled by receptor agonists binding to MC1R (Ducrest et al. 2008).

Melanin-based colouration has been found to covary with several phenotypic traits (reviewed by Ducrest et al. 2008; San-Jose and Roulin 2018). One of the possible reasons for this phenomenon is the pleiotropic effects of the genes (see above) regulating the melanin synthesis (“melanocortin hypothesis”; Ducrest et al. 2008). The melanocortin hypothesis thus provides a good mechanistic framework to set up predictions about associations between eumelanic colouration, physiology and behaviour. For instance, according to the melanocortin hypothesis, more eumelanised individuals, due to a higher melanocortin activity, should have a more balanced energy homeostasis (i.e. lower body mass) and a higher immune capacity (i.e. higher anti-inflammatory capacity, lower oxidative stress levels), and should be less sensitive to acute stress (i.e. lower glucocorticoid-mediated stress response) (Ducrest et al. 2008). Moreover, melanin-based colouration should be also correlated with individual behaviour (e.g. aggressiveness; Ducrest et al. 2008).

Behavioural traits are usually consistent within individuals across time and contexts, and this behavioural consistency is defined as individual personality (Dall et al. 2004; Réale et al. 2007). Personality of an individual often mirrors its physiology as well, the two being interrelated, forming different “coping styles” (Groothuis and Carere 2005; Carere et al. 2010). For instance, individuals with different coping styles are characterised by different levels of glucocorticoid-mediated stress responsiveness (e.g. Koolhaas et al. 1999): proactive individuals have a less elevated stress response in risky situations, while reactive individuals show the opposite pattern (Cockrem 2007). Similarly, the coping style of individuals is also linked to their immunity (Koolhaas 2008; Lopes 2017). Furthermore, coping style is also translated into a specific individual behavioural profile. For example, individuals with a more proactive coping style are usually more active, exploratory, aggressive and bold, while reactive individuals are less active, exploratory and aggressive, and shyer (e.g. Barnett et al. 2012; Thys et al. 2017). Therefore, given that the melanocortin machinery is associated with physiology (e.g. stress responsiveness, immunity), melanin-based colouration is also expected to correlate with some of the individual personality traits (e.g. activity, exploration, boldness, aggressiveness), and indeed, previous studies found correlations in multiple instances (e.g. Mafli et al. 2011; Mateos-Gonzalez and Senar 2012; Fargallo et al. 2014; Schweitzer et al. 2015; Thys et al. 2020; but see Nicolaus et al. 2016). Taken together, a large body of evidence seems to support associations between melanin-based colouration, physiology and behaviour (reviewed in detail by San-Jose and Roulin 2018; but see e.g. Santostefano et al. 2019).
It has to be noted, though, that correlations between melanin-based colouration and phenotype are not caused exclusively by gene pleiotropy in the melanocortin system, and several exceptions are known where melanin-based colour variations among individuals (e.g. colour polymorphism) were not explained by the variation of the MC1R gene (e.g. Derelle et al. 2013; Farrell et al. 2015; Riyahi et al. 2015; Corti et al. 2018), suggesting that other mechanisms can also be important.

Melanin-based colouration, by often showing rather complex co-variations with multiple aspects of individuals’ phenotype (see above), forms so-called integrated phenotypes (Pigliucci and Preston 2004). Phenotypic integration of colouration can be similar, but can also differ between the sexes, since trait correlations can be shaped by sex-specific selection on life-history profiles and associated physiological (e.g. hormonal) mechanisms (e.g. Stoehr and Kokko 2006; Ketterson et al. 2009), eventually resulting in sex-specific relationships between traits (i.e. stronger phenotypic integration in one of the sexes). For instance, in female fledgling boobies (Sula dactylatra), brown patch size on the head and major wing coverts and testosterone levels were both found to be negatively related to boldness, while in males, these traits were not related (Fargallo et al. 2014). A similar pattern of correlations was found in juvenile kestrels (Falco tinnunculus) as well between plumage blackness and boldness (López-Idiáquez et al. 2019).

Melanin-based colour traits are often exhibited as conspicuous visual attributes (e.g. “badges of status”) which serve as signals in various contexts (i.e. melanin-based signals; Jawor and Breitwisch 2003; McGraw 2008; Roulin 2016). Here, we study melanin-based signalling in the Eurasian tree sparrow (Passer montanus), a species which appears fully monochromatic to the human eye (i.e. males and females look alike), but is dichromatic outside the colour spectrum visible to humans, according to a study using avian vision modelling (Eaton 2005). Tree sparrows are mutually ornamented (Kraaijeveld et al. 2007), meaning that both sexes possess a black throat patch (hereafter “bib”), a conspicuous eumelanin-based plumage ornament with demonstrated signalling value in a social context during the non-reproductive season (i.e. badge of status; Torda et al. 2004; Mónus et al. 2017): in males, bib size seems to indicate fighting success over both males and females, whereas in females, the social status signalling value of the black bib was not confirmed, although it could not be excluded (Mónus et al. 2017). Other signalling functions of the bib have not been studied. Nevertheless, the continued expression of the black bib in both sexes, especially given that females in several closely related species lack it (Tibbetts and Safran 2009), leaves open the possibility of a signalling value of the bib in females as well, and also of sex-specific effects in the signalling role of this plumage trait.

In this study, we aim to widen our knowledge on the signalling value of the black bib in tree sparrows using the melanocortin hypothesis as a framework (Ducrest et al. 2008). Specifically, we investigate by correlative means whether the size of the bib conveys information, apart from dominance status, about the individuals’ body condition (i.e. size-corrected body mass), physiology (i.e. cellular innate immunity/inflammation status, expressed through total leukocyte counts, and chronic physiological stress, expressed through the ratio of heterophils to lymphocytes) and an axis of personality, namely activity in a novel environment (sometimes used as a synonym for exploratory behaviour; Réale et al. 2007; Carter et al. 2013). Based on the melanocortin hypothesis (Ducrest et al. 2008), we predict a positive relationship between bib size and activity, and negative relationships between bib size and each of body condition, cellular innate immunity/inflammation status and chronic physiological stress levels. Since phenotypic integration of melanin-based ornamental traits can be similar or different in males and females, we also explore whether trait correlations of the black bib differ between the sexes by considering various scenarios regarding the sex specificity of the investigated phenotypic relationships (Fig. 1).

Fig. 1 Predicted relationships between bib size and phenotype in the two sexes under different scenarios. Blue solid and red dashed lines indicate males (sex A) and females (sex B), respectively

Methods

Study site and period

The study was conducted in the Botanical Garden and the Central Campus of the University of Debrecen (Debrecen, Hungary) during the wintering seasons (i.e. between mid-October and mid-March) of 2016/2017, 2017/2018, 2018/2019 and 2019/2020. The study site is mainly an open area with scattered trees and bushes, also containing some buildings of various sizes forming a heterogeneous semi-urban landscape mosaic (Barta et al. 2004; Fülöp et al. 2019).

Field procedures

Field procedures followed those described by Fülöp et al. (2019). We captured tree sparrows with mist nets (Ecotone, Poland) at bird feeders each winter between mid-October (after the completion of the annual moult) and late January. Upon capture, we marked all the birds with a uniquely numbered aluminium ring and a unique combination of three plastic colour rings to allow individual identification. Furthermore, we recorded standard biometry of individuals: body mass (± 0.1 g with a Pesola spring balance), tarsus length (± 0.01 mm with a digital calliper) and wing length (± 0.5 mm with a ruler). We also took a blood sample (~ 50–150 μl) from the brachial vein to determine sex using molecular methods (see below), since sexes cannot be reliably distinguished on the basis of plumage characteristics and/or biometry alone (Mónus et al. 2011). A drop of blood was also used to prepare a blood smear for leukocyte counts (see below). All blood samples were collected within 30 min after capture (in minutes: mean = 5.33, median = 3.50, SD = 5.67, range = 0–26) to exclude the effect of capture stress on blood cell counts (see e.g. Cīrule et al. 2012). At capture, we also photographed the black bib of the birds with a digital camera next to a standard reference for length (i.e. ruler or millimetre paper), while the birds were held in a standardised position with their head facing the camera so that the axis of the beak was perpendicular to the axis of the body and the camera sensor plane (Fülöp et al. 2019). Finally, each individual was tested for activity (see below). After completing all the procedures, individuals were released at the location of their capture.

During the four seasons of the study, we caught 199 tree sparrows in total (70 males and 129 females). A number of individuals, which were used to calculate the repeatability of activity (see below), were recaptured and underwent the same procedure again as detailed above. The female-biased sex ratio of our sample could result from sex-dependent trappability of individuals or could reflect a biased sex ratio in the population (i.e. random sample). Evidence from another population (Kato et al. 2017) shows that in tree sparrows, the mortality of male embryos during incubation can cause a female-biased adult sex ratio similar to that observed in our population. Therefore, we are reasonably confident that our sample is random, but we are aware that a sampling bias caused by different trapping probability of males and females cannot be excluded completely. Also, it was not possible to record data blind because our study involved focal animals in the field. However, we minimised potential observer bias in our data, as the analysis of the materials and samples collected during fieldwork was carried out by different persons (see below) without prior knowledge of the individuals.

Activity test

The activity of individuals was tested in a mobile test cage, a field-adapted version of the standard open-field test used to quantify exploratory behaviour (see Stuber et al. 2013). Nevertheless, we prefer to label the behaviour measured in this mobile test cage “activity”, since we believe that due to the rather small size of the test cage (see below), the test is more suitable for measuring the general level of activity of the individuals than their exploratory tendency. Moreover, considering also that the behavioural test was carried out at capture, we consider that this measure adequately mirrors the behavioural (acute) stress response of individuals to a risky life event (i.e. trapping; e.g. Martins et al. 2007; Baugh et al. 2013; but see Baugh et al. 2012). It has to be noted though that “activity” and “exploration” are frequently measured using the same test (i.e. open-field test), and the two terms are often used as synonyms (Réale et al. 2007; Carter et al. 2013). The test cage was a 75 × 45 × 55 (L × W × H) cm solid wooden box of which the front wall (75 × 55 cm) was made of transparent wire mesh. The inside space of the test cage was divided into six equal-sized virtual compartments in two rows and three columns, which were marked by reference lines traced on the inner walls of the test cage. Three of these virtual compartments (the two upper corners and the lower middle one) each contained a perch. Individuals were released into the test cage directly from the hands of an experimenter. The behaviour of individuals in the test cage was then video-recorded for 5 min using a video camera (Panasonic HC-V510), which was mounted on a tripod and placed at a distance of 1 m from the wire mesh wall of the cage. While the behaviour of the focal individual was recorded, the experimenter waited quietly on the opposite side of the test cage without disturbing the tested individual.
For the same reason, in all seasons, except in 2016/2017, the test cage was visually isolated from the surrounding environment using a camouflage tent, which was installed behind the camera in the front of the cage. The presence or absence of the tent was taken into account during the statistical analyses (see below).

The activity level of individuals was quantified from the video recordings using the “mwrap” event recorder software (Bán et al. 2017). The activity score of each individual was expressed as the total number of position switches between the six compartments of the test cage. We note that, although not all the compartments contained perches, birds also frequently used the bottom and the wire mesh wall of the cage to land. Therefore, the number of perches seemingly did not influence the activity pattern of individuals in the test cage. Cases when the bird transited a compartment, but did not land in it (e.g. the upper middle part of the test cage while moving from one upper corner to the other), were also counted as position switches. In total, between 2016 and 2020, we tested 167 individuals (56 males and 111 females) for activity, out of which 23 individuals (6 males and 17 females) were tested more than once (21 individuals twice and 2 individuals three times) in order to calculate the repeatability of activity. Repeated tests were performed only on birds that were recaptured on days different from the day of their first capture. Time spent between tests, expressed in number of days, was as follows: mean = 59.28, median = 19, SD = 115.44, and range = 2–379. Out of all ringed individuals, 32 individuals were not tested for activity due to logistical reasons. All videos were analysed by the same person (DL).
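The scoring rule described above can be illustrated with a minimal Python sketch. This is not the event recorder software used in the study; the function name `activity_score`, the compartment labels 1 to 6 and the example track are hypothetical and serve only to show how position switches, including transits without landing, would be counted.

```python
def activity_score(compartments):
    """Count position switches in a chronological list of compartment labels.

    Every change of compartment counts as one switch, whether or not the
    bird landed in the new compartment (transits are listed like landings).
    """
    switches = 0
    for previous, current in zip(compartments, compartments[1:]):
        if current != previous:
            switches += 1
    return switches


# Hypothetical labelling: 1-3 = upper row, 4-6 = lower row (left to right).
# Flying from the upper-left (1) to the upper-right corner (3) passes through
# compartment 2, so that transit appears in the track as its own entry.
example_track = [1, 2, 3, 3, 6, 5, 5, 4]
print(activity_score(example_track))
```

Consecutive identical labels (e.g. a bird hopping within one compartment) do not add to the score, which matches the definition of the score as the number of switches between compartments.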

Bib size measurements

The bib size (i.e. area) was measured from digital photographs using the ImageJ software (ver 1.51i for Linux; https://imagej.nih.gov/ij/index.html). For each photo, with the “set scale” function, we first calibrated the unit length on the photograph using the ruler or the millimetre paper that was also included in the photograph as a reference. Then, with the “freehand selection tool”, we traced the outline of the black bib patch, and measured its area (mm²) using the “measure” function. The bib area for each individual was measured from the same photo twice, and the values were averaged to increase measurement precision (Fülöp et al. 2019). Repeatability between these two measurements was high (intraclass correlation coefficient, performed with the “ICC” package for R (Wolak et al. 2012; R Core Team 2020): ICC = 0.913, 95% CI = 0.888–0.933, N = 192). Bib areas were all measured by the same person (AF).
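The logic of the measurement (calibrate the scale, trace the outline, convert to mm², average two measurements) can be sketched as follows. This is an illustrative approximation and not ImageJ's internal routine: it assumes the traced outline is available as a polygon of pixel coordinates, computes its area with the shoelace formula, and uses hypothetical names (`polygon_area`, `bib_area_mm2`) and invented example values.

```python
def polygon_area(points):
    """Shoelace formula: area enclosed by an ordered list of (x, y) vertices."""
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the outline
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def bib_area_mm2(outline_px, pixels_per_mm):
    """Convert a traced outline (pixel coordinates) to an area in mm^2."""
    # An area scales with the square of the length calibration.
    return polygon_area(outline_px) / pixels_per_mm ** 2


# Hypothetical calibration: a 10 mm ruler segment spans 200 px on the photo.
pixels_per_mm = 200 / 10

# Two hypothetical tracings of the same bib on the same photograph.
trace_1 = [(0, 0), (400, 0), (400, 300), (0, 300)]
trace_2 = [(0, 0), (410, 0), (410, 300), (0, 300)]

area_1 = bib_area_mm2(trace_1, pixels_per_mm)
area_2 = bib_area_mm2(trace_2, pixels_per_mm)
mean_area = (area_1 + area_2) / 2  # value that would enter the analyses
print(area_1, area_2, mean_area)
```

Averaging the two tracings reduces the influence of tracing error on the final value, which is the rationale given above for measuring each photograph twice.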

Leukocyte counts

The number of leukocytes (i.e. white blood cells; WBC) was counted from blood smears following the method described in Pap et al. (2011). Briefly, blood smears were air-dried, then fixed and stained using Dia-Fix, Dia-Red and Dia-Blue Panoptic (Diagon Ltd., Hungary). Afterwards, five different types of white blood cells (i.e. heterophils, lymphocytes, monocytes, eosinophils and basophils) were identified and counted under a microscope at × 1000 magnification. Total WBC was expressed as the total number of leukocytes per 10,000 erythrocytes after counting a total of 50 leukocytes. For the statistical analyses, we used two parameters calculated from leukocyte counts: the total WBC and the ratio of heterophils to lymphocytes (H/L ratio). Total WBC level indicates the cellular innate immune capacity and inflammation status of the organism (Norris and Evans 2000; Salvante 2006), whereas H/L ratio reflects chronic physiological stress (Davis et al. 2008; Davis and Maney 2018). H/L ratios have been found to correlate with the intensity of the glucocorticoid-mediated acute stress response (i.e. corticosterone level; Goessling et al. 2015). However, while corticosterone level tends to decrease over time, even if the stressor persists, the H/L ratio seems to remain stable or even to increase (Davis et al. 2008; Goessling et al. 2015; Davis and Maney 2018). Therefore, H/L ratio is a more robust marker of chronic physiological stress, resulting especially from ecological stressors (e.g. hunger, parasites; Davis et al. 2008; Goessling et al. 2015; Davis and Maney 2018). Counts were highly repeatable based on a subset of blood smears chosen at random and counted twice (WBC: ICC = 0.911, 95% CI = 0.792–0.963; heterophils: ICC = 0.931, 95% CI = 0.837–0.972; lymphocytes: ICC = 0.904, 95% CI = 0.779–0.961; H/L ratio: ICC = 0.912, 95% CI = 0.795–0.964; N = 20 for each test). All leukocyte counts were carried out by the same person (PIF).
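As a worked illustration of the two leukocyte parameters, the following Python sketch computes total WBC per 10,000 erythrocytes and the H/L ratio from raw counts. The function names and all numbers are hypothetical; in particular, the number of erythrocytes scanned while counting the leukocytes is an assumed input, since the text does not describe how it was recorded.

```python
def total_wbc_per_10000(leukocytes_counted, erythrocytes_scanned):
    """Leukocytes per 10,000 erythrocytes for one blood smear."""
    if erythrocytes_scanned <= 0:
        raise ValueError("erythrocytes_scanned must be positive")
    return leukocytes_counted * 10_000 / erythrocytes_scanned


def hl_ratio(heterophils, lymphocytes):
    """Ratio of heterophils to lymphocytes among the counted leukocytes."""
    if lymphocytes == 0:
        raise ValueError("H/L ratio is undefined when no lymphocytes were counted")
    return heterophils / lymphocytes


# Hypothetical smear: 50 leukocytes found while scanning 25,000 erythrocytes,
# of which 10 were heterophils and 35 lymphocytes (5 were other cell types).
wbc = total_wbc_per_10000(50, 25_000)
hl = hl_ratio(10, 35)
print(wbc, round(hl, 3))
```

Expressing the count relative to erythrocytes makes smears of different thickness or scanned area comparable, while the H/L ratio depends only on the differential count.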

Molecular sexing

Blood samples were stored without any cryoprotectant at − 20 °C until molecular analyses. DNA was extracted by homogenising 5 μl of blood in 150 μl extraction buffer following the protocol described in Bereczki et al. (2014). Molecular sexing was carried out using the 2550F and 2718R primers (Fridolfsson and Ellegren 1999) following the amplification protocol in Bereczki et al. (2014). PCR products were loaded on a silver-stained polyacrylamide gel. The electrophoresis buffer systems and running conditions as well as the staining solutions were used according to Bereczki et al. (2005, see Appendix 2, 4a) and An et al. (2009). Sex was identified from the banding pattern after visualisation under white light. Molecular sexing was performed by JB and by a laboratory technician (V. Mester).

Statistical analyses

All calculations and statistical analyses were performed in the R statistical environment version 4.0.3 (R Core Team 2020).

Repeatability of activity level

We tested the repeatability of the activity scores using variance partitioning with a linear mixed-effects model (LMM) with Gaussian error distribution (“lmer” function in R, package “lme4”; Bates et al. 2015) as recommended by Nakagawa and Schielzeth (2010). We built a full model that contained the activity score of the individuals as a dependent variable. Sex (male or female) was entered as a fixed factor in the model, while body condition, test date (expressed as the number of days since 1 October of each wintering season), test time (expressed in minutes from midnight) and handling time (expressed as minutes elapsed from capture until the start of the activity test) were included as continuous predictors. Additionally, all second-order interactions of the fixed factor “sex” and other predictors were tested. Individual ID and season were entered as crossed random factors in the model. Since the effect of the tent cannot be statistically separated from the effect of season (i.e. the tent was not used in the first wintering season, but was used in all other seasons), we accounted for the presence or absence of the camouflage tent in the model through the random factor “season”. Body condition was characterised using the scaled mass index (SMI; Peig and Green 2009), which is a size-corrected body mass index. SMI was calculated using the body mass and tarsus length data of individuals with the formula: SMI = body mass × (mean tarsus length of the sample/tarsus length)^bSMA, where bSMA is the slope of a model II standard major axis regression of log mass on log tarsus length, calculated from the sampled individuals (“lmodel2” function in R, package “lmodel2”; Legendre 2018). The activity score of the individuals, and all the continuous predictors included in the model, except SMI, were first square root–transformed to improve their distributional properties and then Z-transformed (to have mean = 0, SD = 1) to help model convergence (Schielzeth 2010).
The full model was simplified in a stepwise manner (using the “drop1” function in R) by sequentially removing terms with non-significant effects (P > 0.050), until reaching the minimal adequate model that included only significant interactions, significant main effects or main effects involved in significant interactions. Significance level (P) of the predictors from the minimal model was tested using the “Anova” function (type II) from the R package “car” (Fox and Weisberg 2019). Post hoc comparisons between different slopes of continuous predictors involved in significant interactions from the minimal model were performed using the R package “emmeans” (function “emtrends”; Lenth 2020). The repeatability of activity (i.e. adjusted repeatability; r), the associated standard error (SE), 95% confidence intervals (95% CI) and significance levels (P and Pperm) were calculated following Nakagawa and Schielzeth (2010) using the R package “rptR” with 4999 parametric bootstraps (for the 95% CI) and 4999 permutation steps (for Pperm) (function “rptGaussian”; Stoffel et al. 2017).
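The scaled mass index used above can be made concrete with a short Python sketch following the definition of Peig and Green (2009): each mass is rescaled to the value expected at the mean tarsus length of the sample, using the standard major axis (SMA) slope of log mass on log tarsus length. The sketch uses only the standard library and is not the “lmodel2” implementation; the function names and the example measurements are hypothetical.

```python
import math
import statistics


def sma_slope(x, y):
    """Standard major axis slope of y on x: sign(cov) * SD(y) / SD(x)."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    covariance_sum = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return math.copysign(statistics.stdev(y) / statistics.stdev(x), covariance_sum)


def scaled_mass_index(masses, tarsi):
    """SMI_i = M_i * (L0 / L_i) ** bSMA, with L0 the mean tarsus length."""
    log_tarsi = [math.log(t) for t in tarsi]
    log_masses = [math.log(m) for m in masses]
    b_sma = sma_slope(log_tarsi, log_masses)
    reference_length = statistics.fmean(tarsi)
    return [m * (reference_length / t) ** b_sma for m, t in zip(masses, tarsi)]


# Hypothetical body masses (g) and tarsus lengths (mm) of five birds.
masses = [20.0, 21.5, 22.0, 23.5, 24.0]
tarsi = [16.0, 17.0, 17.5, 18.0, 19.0]
smi = scaled_mass_index(masses, tarsi)
print([round(value, 2) for value in smi])
```

A quick sanity check on the formula: a bird whose tarsus length equals the sample mean (the third bird here) keeps its observed mass as its SMI, because the scaling ratio is exactly 1.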

Bib size and individual phenotypic traits

We tested the relationship between bib size and the different phenotypic traits (i.e. body size, SMI, total WBC, H/L ratio and activity) using a LMM with season entered as a random effect with four levels. In this analysis, we used only the data from the first capture of the individuals. Body size index was expressed using predicted values of the first principal component (PC1) extracted using a principal component analysis (PCA; “prcomp” function in R, package “ade4”; Dray and Dufour 2007) that included three biometry measures of individuals: body mass, tarsus length and wing length. PC1 explained 62.13% of variance and had the following loadings for body mass, tarsus length and wing length, respectively: 0.633, 0.593 and 0.498. Because we are interested in how bib size might integrate information on different aspects of the phenotype, we first built a full model in which we entered bib size as the dependent variable, while sex was set as a fixed factor, and body size index, SMI, activity score of individuals, total WBC and H/L ratio were entered as continuous predictors. The second-order interactions of the fixed factor “sex” with all the continuous predictors were also included in the model. Since the activity of individuals was significantly influenced by a series of confounding effects (see the “Results” section), and also to minimise potential collinearity between predictors, we used the residuals of activity scores extracted from the statistical model used to test behavioural repeatability as individual activity scores (i.e. residual activity score) in the subsequent analyses. Prior to entering into the models, all variables with a skewed distribution were transformed to improve model fit (bib size was natural log-transformed, total WBC and H/L ratio were square root–transformed); and both the dependent variable and the continuous predictors were Z-transformed (Schielzeth 2010).
The full model was simplified to a minimal adequate model in the same manner as described above, and the significance level of predictors from the minimal model was also tested as described above. The model fit in every case was assessed visually using model diagnostic plots. We note that we omitted one outlier bib size value from the analysis to obtain an adequate model fit (i.e. normality and homoscedasticity of model residuals). Multicollinearity between predictors was excluded as a potential confounding effect, since all variance inflation factor (VIF) values were less than 5 (e.g. James et al. 2013). Results of both full and minimal models are presented.
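The variable transformations and the VIF criterion described above can be sketched in Python as follows. The sketch is illustrative only: the data are invented, the function names are hypothetical, and the VIF helper simply applies the textbook definition VIF = 1 / (1 − R²), where R² comes from regressing one predictor on all the others (that regression itself is not shown).

```python
import math
import statistics


def z_transform(values):
    """Centre to mean 0 and scale to SD 1 (sample SD, as in R's scale())."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def vif_from_r_squared(r_squared):
    """Variance inflation factor of a predictor given its R^2 on the others."""
    return 1.0 / (1.0 - r_squared)


# Hypothetical raw data for five birds.
bib_area_mm2 = [210.0, 265.0, 240.0, 310.0, 285.0]
total_wbc = [12.0, 30.0, 18.0, 25.0, 9.0]

# Skewed variables are transformed first, then standardised.
bib_z = z_transform([math.log(v) for v in bib_area_mm2])
wbc_z = z_transform([math.sqrt(v) for v in total_wbc])

print([round(v, 2) for v in bib_z])
print(round(vif_from_r_squared(0.5), 2))  # R^2 = 0.5 gives VIF = 2
```

Under this definition, the threshold of VIF < 5 corresponds to each predictor sharing less than 80% of its variance with the remaining predictors.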

Results

Repeatability of activity level

Activity was significantly repeatable (adjusted repeatability for the random factor “individual ID”, full model: r = 0.361, SE = 0.138, 95% CI = 0.216–0.740, P = 0.001, Pperm = 0.038; minimal model: r = 0.311, SE = 0.137, 95% CI = 0.116–0.644, P = 0.004, Pperm = 0.030). Activity was significantly influenced by sex, females being less active than males (Table 1). Furthermore, activity was significantly negatively related to the time of the day when the test was performed (Table 1). Finally, activity and SMI were related in a sex-dependent manner (Table 1, Fig. 2), post hoc tests revealing that males with better body condition (i.e. SMI) were less active (β = − 0.262, SE = 0.107, df = 182, t = − 2.443, P = 0.016), whereas in females SMI was not related to activity (β = 0.041, SE = 0.081, df = 181, t = 0.507, P = 0.612). Test date, handling time and the interactions of sex with test date, test time and handling time did not influence the activity of individuals (all P > 0.050; Table 1).

Table 1 Results of the linear mixed-effects model on the relationship between individual activity and predictors. For the main effect of the fixed factor “sex”, males are included in the intercept, and therefore, the reported estimates show the extent to which females differ from males. The sign of estimates indicates the direction of associations. Significance levels indicate results from likelihood ratio tests (ANOVA, type II). Significant effects (P ≤ 0.050) are highlighted in bold. SMI – scaled mass index (body condition)

Fig. 2 Sex-dependent relationship between activity (square root) and body condition, expressed through the scaled mass index (SMI), in free-living tree sparrows (Passer montanus). Lines and shaded areas are model-predicted linear regression lines ± 95% confidence intervals. Filled circles and blue solid line indicate males, while empty circles and red dashed line indicate females

Bib size and individual phenotypic traits

Bib size was significantly larger in male than in female tree sparrows (Table 2, Fig. 3). Bib size was associated with activity (i.e. residual activity) in a sex-dependent way (Table 2), results of post hoc tests indicating that in males, activity was marginally negatively related to bib size (β = − 0.268, SE = 0.137, df = 147, t = − 1.957, P = 0.052), but in females the two were positively associated (β = 0.183, SE = 0.090, df = 146, t = 2.032, P = 0.044; Fig. 4). Furthermore, H/L ratio was significantly positively correlated with bib size, regardless of sex (Table 2, Fig. 5). Body size, SMI and total WBC were not associated with bib size, either independently of or in interaction with sex (Table 2).

Table 2 Results of the linear mixed-effects model on the relationship between bib size and aspects of individual phenotype. For the main effect of the fixed factor “sex”, males are included in the intercept, and therefore, the reported estimates show the extent to which females differ from males. The sign of estimates indicates the direction of associations. Significance levels indicate results from likelihood ratio tests (ANOVA, type II). Significant effects (P ≤ 0.050) are highlighted in bold. SMI – scaled mass index (body condition), total WBC – total number of leukocytes, H/L ratio – ratio of heterophils to lymphocytes

Fig. 3 Boxplot showing the sex differences in bib size in free-living tree sparrows (Passer montanus). Horizontal bold lines indicate median values, box margins mark the lower (25%) and upper (75%) quartiles, while whiskers show the lower and upper range values, excluding outliers (i.e. values more than 1.5 times the interquartile range above the upper quartile or below the lower quartile), which are denoted by asterisks. The blue box indicates males, while the red box indicates females. Points denote individual values that are jittered to reduce overlaps of data points on the plot

Fig. 4

Sex-dependent relationship between activity (i.e. residual activity) and bib size in free-living tree sparrows (Passer montanus). Lines and shaded areas are model-predicted logistic regression lines ± 95% confidence intervals. Filled circles and blue solid line indicate males, while empty circles and red dashed line indicate females

Fig. 5

Relationship between the ratio of heterophils to lymphocytes (square root of H/L ratio) and bib size in free-living tree sparrows (Passer montanus). The black line and the shaded area are the model-predicted logistic regression line ± 95% confidence interval

Discussion

We studied the potential signalling value, and its sex dependency, of a eumelanin-based mutual plumage ornament, the black throat patch (i.e. bib), in tree sparrows. We found that bib size was positively related in both sexes to H/L ratio (i.e. ratio of heterophils to lymphocytes), a measure characterising chronic physiological stress. Interestingly, the relationship between bib size and activity was influenced by sex: while males with larger bibs were less active, females with larger bibs were more active. We found no association between bib size and either SMI (i.e. body condition) or total WBC (i.e. cellular innate immune capacity/inflammation status of the organism).

The melanocortin hypothesis provides a series of predictions for trait correlations between eumelanic colouration and individual phenotype (Ducrest et al. 2008). While some of our results may fit into this framework, for a number of phenotypic measures, we found no relationship with colouration. An interesting prediction of the melanocortin hypothesis is that more eumelanic individuals should be more active (Ducrest et al. 2008). The activity level of an individual, as quantified in the present study, might mirror both its physiological state (e.g. acute stress response to capture) and other aspects of its personality (e.g. exploration, boldness). More eumelanised individuals are expected to be more resistant to acute stress, since melanocortins are involved in the modulation of the glucocorticoid-mediated stress response (Ducrest et al. 2008); and more eumelanised individuals have been shown to be more active, exploratory, aggressive and bold (e.g. Mafli et al. 2011; Mateos-Gonzalez and Senar 2012; Schweitzer et al. 2015; Thys et al. 2020; but see Nicolaus et al. 2016). Therefore, we might reasonably expect more eumelanised individuals also to be more active. Our findings apparently contradict this prediction, at least in part, since we found different relationships between bib size and activity in males and females. The sex-specific relationships between activity and bib size we identified (i.e. negative in males and positive in females) raise several questions. First, what proximate mechanisms might be responsible for this difference between the two sexes? Second, why do females and males advertise their personality differently in this species? In other words, does signalling activity levels differently in the two sexes have an adaptive value?

Phenotypic traits, including colouration, often covary with each other, forming integrated phenotypes (Pigliucci and Preston 2004). Mechanistically, covariation between traits is usually the result of a shared physiological mechanism determining trait expression, for instance, when the expression of multiple phenotypic traits is mediated by the same hormone (Ketterson et al. 2009). Hormones, however, can have weaker or stronger effects on the expression of different traits, due to differing circulating hormone concentrations or sensitivity of target tissues, which might result in low or high phenotypic integration (Ketterson et al. 2009). For example, in males and females, trait correlations might be altered by gonadal hormones (e.g. testosterone), which have been found to influence sex-specific trait covariances (e.g. pace-of-life syndrome; Immonen et al. 2018). Melanocortins and sex hormone (e.g. testosterone) production are linked to each other (Ducrest et al. 2008) and might thus have sex-specific consequences in terms of colour-related trait correlations (e.g. Fargallo et al. 2014). Furthermore, since some hormone levels, for instance those of testosterone, can vary strongly between seasons (e.g. Van Duyse et al. 2003; Laucht et al. 2010), trait correlations mediated by testosterone might also change seasonally, and correlational patterns at other times of year might differ from those reported in this study, which were found during winter.

Why do males and females signal their activity differently? Sex-dependent associations between phenotypic traits, including behaviour, might be the result of adaptations to differences in social context. For instance, in great tits, Dingemanse and de Goede (2004) found a context-dependent relationship between exploration and dominance (the latter is signalled in great tits by their black “tie”). More dominant territory-holding adults were also more exploratory, while in juvenile, non-territorial birds, the relationship was reversed, with more dominant individuals being less exploratory. Tree sparrows form large flocks during the winter (Summers-Smith 1995; Mónus and Barta 2010); therefore, the most relevant context shaping trait correlations in this species during this period might, at least partly, be related to aspects of social behaviour (e.g. sex roles within the group, social dominance status, group composition). Sex-specific associations between phenotypic traits are already known in this species from previous studies (e.g. dominance signalling, Mónus et al. 2017; determinants of social foraging tactic use, Fülöp et al. 2019), and these suggest that males and females may use distinct behavioural strategies in various social situations. As a consequence, the signalling role of mutual plumage ornaments, and implicitly trait correlations, might evolve in sex-specific ways. Beyond this, however, the question of why individuals should signal their personality remains open. Previously, we have found that personality is related to social foraging tactic use in this species (Fülöp et al. 2019), suggesting that signalling individual personality can be meaningful, at least in this context. Still, we acknowledge that a significant correlation between bib size and activity does not necessarily signify a direct causal link between the two traits and can have other (e.g. genetic) causes (San-Jose and Roulin 2018).

The melanocortin hypothesis also predicts that more eumelanised individuals should have a higher immune capacity (i.e. better anti-inflammatory capacity, more resistant to oxidative stress; Ducrest et al. 2008). We found no relationship between bib size and total WBC (i.e. cellular innate immunity/inflammation status), but found a positive association of bib size with H/L ratio (i.e. chronic physiological stress). H/L ratios adequately indicate individual physiological responses to various internal and/or external stressors (e.g. hunger, parasites; Davis et al. 2008; Davis and Maney 2018) and tend to remain stable over time; thus, H/L ratios mirror long-lasting exposure to stressful stimuli (Davis et al. 2008; Goessling et al. 2015; Davis and Maney 2018). The positive relationship between bib size and H/L ratio is inconsistent with most of the earlier studies (e.g. Minias et al. 2014, 2019; Moore et al. 2015; Svobodová et al. 2018; but see e.g. Svobodová et al. 2013). One possible interpretation of this positive relationship is that individuals with large bibs might be involved in a higher number of agonistic interactions, which might result in higher physiological stress levels, a cost of social dominance (Davis et al. 2008).

Finally, according to the melanocortin hypothesis, more eumelanised individuals should have a lower body mass, since melanocortins reduce food intake and the amount of adipose tissue, and increase metabolic rate and physical activity (Ducrest et al. 2008). We found no correlation between bib size and body condition (i.e. an index of body reserves). Our results are in contrast with some previous studies reporting either a positive or a negative relationship between melanin-based colouration and body mass and/or body condition (e.g. Kingma et al. 2008; Roulin 2009; Kim et al. 2013; Moore et al. 2015; reviewed by Roulin 2009, 2016; Guindre-Parker and Love 2014), but are in line with many others where no such relationship was found (e.g. Järvi and Bakken 1984; Jawor et al. 2004; reviewed by Roulin 2009). This diversity of results suggests that the relationship between melanic colouration and condition might not be a simple one (see also San-Jose and Roulin 2018). Indeed, the relationship between body reserves and melanin-based colouration can be confounded by several environmental factors (e.g. food availability; Roulin 2009), but can have a genetic basis as well (see discussion in Roulin 2009).

To summarise, our study provides intriguing results that offer only weak support for the melanocortin hypothesis. Although we found significant correlations between activity (i.e. a measure of personality) and eumelanin-based colouration, and between chronic physiological stress levels and eumelanic colouration, both relationships ran contrary to what we predicted based on the melanocortin hypothesis. Several studies seem to support the melanocortin hypothesis (reviewed by Ducrest et al. 2008; San-Jose and Roulin 2018), but some others, like our study, failed to detect the expected trait correlations (e.g. Santostefano et al. 2019). Our findings in tree sparrows, however, show that some correlations between melanin-based colouration and individual phenotypic traits might vary according to sex, indicating that the signalling function of melanin traits might differ between males and females, at least in this species. Nevertheless, the role of melanin signals in mutually ornamented species and the mechanisms responsible for the emergence and evolutionary maintenance of sex-dependent signal information content in such species still require further investigation.