Background

Constipation is a disorder defined by incomplete defecation and/or infrequent bowel movements, associated with persistently difficult and/or painful defecation, fecal incontinence, and abdominal pain [1]. It is a common functional disease in clinical practice. Surveys worldwide report a wide range of prevalence rates, from 1 % to more than 20 % in Western populations, and a recent epidemiological report found that 16 % of the general adult population had constipation [2]. Constipation may affect up to 20 % of community-dwelling elderly individuals. Moreover, the incidence of functional constipation in childhood is estimated at 3 % [3].

Because of its high disease burden, the treatment of constipation has become an important issue for clinicians and patients. During the last two decades, more than 20 clinical practice guidelines (CPGs) have been developed for the management of constipation. The main role of CPGs is to give clear recommendations that help clinicians make appropriate clinical decisions in specific clinical circumstances [4, 5].

However, not all guidelines are developed with the same methodological rigor, and no study has so far evaluated the quality of CPGs on constipation. With this in mind, the objective of the present study was to systematically review guidelines related to constipation using the Appraisal of Guidelines for Research and Evaluation (AGREE II) instrument [6].

Methods

Literature search

An electronic literature search was conducted in multiple databases (PubMed, EMBASE, the China Journal Full-text Database, the Chinese Biomedical Literature Database, and the Chinese Scientific Journals Full-text Database) and in guideline websites or databases, including the Guidelines International Network (GIN) database, the National Guideline Clearinghouse (NGC), the National Institute for Health and Care Excellence (NICE), the Scottish Intercollegiate Guidelines Network (SIGN), and the National Comprehensive Cancer Network (NCCN). The search was limited to publications in Chinese and English from database inception to May 2015. MeSH terms and text words for constipation and guidelines (“guideline”, “consensus”, “recommendation”, “criteria”, “statement”, and “constipation”) were used in the MEDLINE database. The same search strategy was adapted for the other databases and websites.

Guideline selection and data extraction

Four reviewers (THL, DC, LN, GJF) independently selected the guidelines that met the inclusion criteria (for example, a clear guideline definition as proposed by the Institute of Medicine [4] and an exclusive focus on constipation). We constructed a standard form to extract data from the guidelines. The four reviewers extracted data separately; disagreements were resolved by discussion or, if no consensus was reached, by a fifth reviewer (GXL).

Quality appraisal and recommendation

We evaluated the quality of the 22 included CPGs using the AGREE II instrument [6]. The instrument is a 23-item tool comprising six quality domains. The four authors read the entire AGREE II handbook and then independently rated all included guidelines. Domain scores were calculated using the following formula:

$$ \frac{\mathrm{Obtained\ score} - \mathrm{Minimum\ possible\ score}}{\mathrm{Maximum\ possible\ score} - \mathrm{Minimum\ possible\ score}} \times 100\% $$
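As a minimal sketch of the formula above, the scaled score for one domain can be computed from the ratings of all appraisers. The 1 to 7 item scale is taken from the AGREE II manual and is not stated in this text; the function name and the example ratings are hypothetical and are not data from this study.

```python
def scaled_domain_score(ratings, min_item=1, max_item=7):
    """Scaled AGREE II domain score in percent.

    ratings: one list of item scores per appraiser, all for the same domain.
    min_item / max_item: assumed bounds of the item rating scale (1 to 7).
    """
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    if any(len(r) != n_items for r in ratings):
        raise ValueError("every appraiser must rate the same number of items")

    obtained = sum(sum(r) for r in ratings)
    minimum = min_item * n_items * n_appraisers   # every item rated at the minimum
    maximum = max_item * n_items * n_appraisers   # every item rated at the maximum
    return (obtained - minimum) / (maximum - minimum) * 100


# Hypothetical ratings: four appraisers, three items (e.g. a three-item domain).
example_ratings = [
    [5, 6, 6],
    [6, 6, 7],
    [2, 4, 3],
    [3, 3, 2],
]
example_score = scaled_domain_score(example_ratings)
print(f"{example_score:.2f} %")
```

Here the obtained score is 53, the minimum possible score is 12, and the maximum possible score is 84, so the scaled score is (53 - 12) / (84 - 12) × 100 %, or about 57 %.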

According to the handbook for the AGREE II instrument, the six domain scores were considered independently. Finally, a guideline was labelled “strongly recommended” if most domain scores were greater than 60 %, “recommended” if most scores were between 30 % and 60 %, and “not recommended” if most domain scores were less than 30 % [7].
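The labelling rule can be sketched in a few lines. This is an illustration only: it assumes that “most” means a strict majority of the domain scores, and it returns a placeholder label when no band holds a majority, a case the text does not address. The function name and the example scores are hypothetical.

```python
def overall_recommendation(domain_scores):
    """Label a guideline from its scaled domain scores (in percent).

    Assumption: "most" is read as a strict majority of the domains.
    """
    n = len(domain_scores)
    high = sum(score > 60 for score in domain_scores)   # scores above 60 %
    low = sum(score < 30 for score in domain_scores)    # scores below 30 %
    middle = n - high - low                             # scores from 30 % to 60 %

    if high > n / 2:
        return "strongly recommended"
    if middle > n / 2:
        return "recommended"
    if low > n / 2:
        return "not recommended"
    return "no majority"  # placeholder: not covered by the stated rule


# Hypothetical domain scores for three guidelines (six domains each).
print(overall_recommendation([70, 65, 62, 80, 40, 20]))
print(overall_recommendation([45, 50, 35, 55, 20, 70]))
print(overall_recommendation([20, 10, 25, 29, 50, 70]))
```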

Statistical analysis

A descriptive analysis of item frequency was carried out, and the AGREE II domain scores were calculated as means. The intra-class correlation coefficient (ICC) was used as a measure of the reliability of ratings within each domain [8]. A p value of less than 0.05 was considered statistically significant.
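The text does not state which form of the ICC was used. As an illustration only, the sketch below computes one common form, the two-way random-effects, absolute-agreement, single-rater coefficient, often written ICC(2,1), from a table with one row per rated guideline and one column per rater. The choice of this form, the function name, and the ratings are assumptions and are not data from this study.

```python
def icc_two_way_random_single(table):
    """ICC(2,1) for a table with one row per subject and one column per rater."""
    n = len(table)        # number of rated subjects (e.g. guidelines)
    k = len(table[0])     # number of raters
    grand_mean = sum(sum(row) for row in table) / (n * k)

    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustrative ratings: six subjects rated by four raters.
illustrative_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc_value = icc_two_way_random_single(illustrative_ratings)
print(round(icc_value, 2))
```

In this illustrative table the raters differ systematically in how high they score, which lowers an absolute-agreement coefficient; a consistency form of the ICC would give a higher value for the same data.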

Results and discussion

Literature search

Figure 1 shows how the guidelines were screened. The preliminary search found 1234 citations, of which 35 were excluded as duplicates. After screening of titles and abstracts, 1146 citations were ineligible because they did not meet the characteristics of constipation CPGs. Of the remaining 53 studies, 31 were excluded for the following reasons: eight were duplicates, seven were not in English or Chinese, 14 were guidelines not related to constipation, and two were older versions of guidelines. Finally, a total of 22 guidelines were included [9–31].
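The screening counts reported above can be checked with simple arithmetic. The short sketch below uses only the numbers given in the text; the variable names are hypothetical.

```python
# Counts as reported in the text.
identified = 1234
duplicates_removed = 35
excluded_on_title_abstract = 1146

# Reasons for excluding full-text articles, as reported.
full_text_exclusions = {
    "duplicates": 8,
    "not in English or Chinese": 7,
    "not related to constipation": 14,
    "older version of a guideline": 2,
}

after_deduplication = identified - duplicates_removed
assessed_in_full = after_deduplication - excluded_on_title_abstract
included = assessed_in_full - sum(full_text_exclusions.values())

print(after_deduplication, assessed_in_full, included)
```

The counts are internally consistent: 1199 citations remain after deduplication, 53 after title and abstract screening, and 22 after the 31 full-text exclusions.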

Fig. 1 Flow of information through the different phases of the literature search

Guideline characteristics

The baseline characteristics of the CPGs are summarized in Table 1. The 22 CPGs were published between 2000 and 2014. Of the 22 selected CPGs, half were from North America (the United States and Canada), six were from Europe (UK, Ireland, Italy, Sweden), and the remaining five were from Asia (two from China, one from Korea, one from Indonesia, and one multinational). The scope of the CPGs varied: one guideline covered prevention, diagnosis, and treatment of constipation [31]; two focused only on prevention and treatment [14, 16]; 16 covered diagnosis and treatment [9, 12, 13, 15, 17–19, 21–29]; three focused only on treatment [10, 20, 30]; and one focused on prevention [11]. The CPGs cited a varying number of references (range: 0–364, mean: 78) and were of varying length (range: 5–255 pages, mean: 25). Each domain was evaluated using the AGREE II instrument (Table 2). The ICC indicated moderate agreement among raters (0.84; 95 % CI, 0.56–0.86).

Table 1 Characteristics of clinical practice guidelines for constipation
Table 2 Guideline scores on each of the domains assessed by the AGREE II instrument

Appraisal of guidelines

Domain 1

Scope and purpose is concerned with the overall aim of the guideline, the specific health questions, and the target population (items 1–3) [32]. This domain’s mean score was 51.77 %, and nine of the guidelines (47.62 %) scored below 50 % [9, 10, 12, 21, 22, 25, 26, 29, 31].

Domain 2

Stakeholder involvement focuses on the extent to which the guideline was developed by the appropriate stakeholders and represents the views of its intended users (items 4–6) [32]. Of all AGREE II domains, this domain received the lowest mean score (23.73 %), with only one CPG scoring over 50 %. Eighteen CPGs had been developed by a multidisciplinary organization (81.82 %) [9, 11, 13–30].

Domain 3

Rigor of development relates to the process used to gather and synthesize the evidence and to the methods used to formulate and update the recommendations (items 7–14) [32]. Overall, the mean score for this domain was only 32.23 % (range, 3 % to 66 %), with 18 CPGs scoring < 50 %. Only five CPGs reported systematic evidence searching [12, 16, 18, 27, 30], and just 40.90 % (9/22) of the guidelines described the methods for formulating the recommendations [11–16, 26, 27, 29]. Moreover, an explicit link between the recommendations and the evidence was present in 20 of the 22 guidelines, and only five guidelines described a procedure for updating [11, 13, 21, 27, 29].

Domain 4

Clarity of presentation deals with the language, structure, and format of the guideline (items 15–17) [32]. The mean score for this domain was 56.73 % (range, 36 % to 83 %). Most CPGs provided a concrete and precise description of key recommendations with only eight guidelines scoring less than 50 % [9, 10, 14, 22, 25, 26, 30, 31].

Domain 5

Applicability pertains to the likely barriers and facilitators to implementation, strategies to improve uptake, and resource implications of applying the guideline (items 18–21) [32]. This domain’s mean score was 29.14 % (range, 10 % to 58 %), and only three CPGs scored > 50 % [11, 18, 20]. A total of 10 CPGs discussed barriers to implementing the guideline’s recommendations [11, 13, 14, 16, 18, 20, 21, 29–31], and seven guidelines provided advice and/or tools on how the recommendations can be put into practice [11, 16, 18, 20, 29–31]. Resource implications were rarely discussed explicitly; only five CPGs addressed cost implications [11, 18, 20, 29, 30].

Domain 6

Editorial independence is concerned with the formulation of recommendations not being unduly biased by competing interests (items 22–23) [32]. The mean score for this domain was 29.59 %. Fifteen guidelines scored below 50 %. Most guidelines (63.64 %) did not state whether they had received funding [9–11, 13, 14, 16–18, 20–22, 25, 26, 29].

Overall assessment

Guidelines were graded by overall assessment. Only two CPGs could be strongly recommended [11, 18]. Eight could be recommended with provisos or alterations because most of their domain scores were between 30 % and 60 % [13, 14, 16, 19, 20, 27, 28, 30]. The remaining 12 CPGs were labelled “not recommended” because of poor domain scores [9, 10, 12, 15, 17, 21–26, 31] (Table 2).

Stratification of CPG quality

To examine which factors may have affected quality scores in the six domains, we stratified the data by the following variables: guideline area, publication date relative to AGREE II, publication type, working group, comprehensive search or not, funding support or not, and evidence-based or not (Table 3). We found no difference in the quality of the six domains between guidelines published before and after the release of AGREE II in 2010. Guidelines published in guideline databases had significantly higher scores than those published in journals. Scores of CPGs developed by medical societies were higher than those of CPGs developed by individuals in the following domains: Scope & Purpose, Stakeholders, Rigour, and Applicability. Evidence-based CPGs had higher scores in three domains (Rigour, Applicability, and Editorial independence). No differences were found in the remaining comparisons.

Table 3 Mean (±SD) AGREE II scores by subgroups

Discussion

We conducted a comprehensive assessment of the quality of CPGs for constipation. In general, these guidelines had many deficits. Most of the guidelines scored low in domains 2, 3, 5, and 6. Table 4 compares these scores with those of international CPGs [33].

Table 4 A comparison of domain scores between these 22 CPGs and international level (%)

According to the results, the mean score for domain 3 was only 32.23 %. The methods used to search for evidence and the criteria for selecting it must be clearly described. In addition, guidelines should report the health benefits and risks considered and should be externally reviewed by experts. To improve the score for domain 3, particular attention should be paid to these shortcomings.

Only two CPGs included guideline development experts in the panel [11, 18]. Moreover, no patients were invited to participate in guideline development. Domain 5, “applicability”, plays an important role in the promotion of CPGs; guidelines should provide advice and/or tools on how the recommendations can be put into practice. These low scores show that CPG producers still have much work to do to improve guideline applicability.

Lastly, the mean score for domain 6 was less than 30 %. Many guidelines are developed with external funding; the name of the funding body and a statement that the funding body did not influence the content of the guideline should be given explicitly [34]. Moreover, there should be a clear declaration that competing interests of guideline development group members have been recorded and addressed. Conflicts of interest therefore need to be clearly stated.

We strongly recommend two guidelines because of their high overall quality: one developed by the Registered Nurses Association of Ontario-Professional Association (RNAOPA) [11] and an Italian guideline by the National Collaborating Centre for Women’s and Children’s Health (NCC-WCH) [18]. The detailed recommendations are listed in Table 5. Eight of the 22 guidelines can be recommended with provisos and alterations [13, 14, 16, 19, 20, 27, 28, 30], while the remaining 12 CPGs could not be recommended because most of their domain scores were below 30 % [9, 10, 12, 15, 17, 21–26, 29–31].

Table 5 Detailed recommendations of the two strongly recommended guidelines

However, our evaluation has several limitations. First, AGREE II gives little guidance on how guidelines should select topics. To be useful, guidelines should address the challenges that clinicians face in practice, but developers may exclude clinically important topics when the available evidence does not meet minimum standards. Second, the inclusion criteria had a language restriction (English and Chinese), so language bias may have occurred. Third, we used only the AGREE II instrument to evaluate the CPGs, and not using other instruments may have introduced some selection bias [35]. The AGREE II instrument was introduced in 2010, so guidelines published before 2010 could not have been developed to comply with it. Nevertheless, we found no difference in the quality of the six domains before and after 2010. This suggests that even though methodological requirements for CPGs have been published, compliance with them remains unsatisfactory. Moreover, how best to disseminate CPGs is essential for clinical practice [36]. The methodological quality analysis presented here may help promote the development of future constipation CPGs.

Conclusions

The results show that the quality of CPGs for constipation is poor. Guideline quality may be improved if developers comply with the AGREE II instrument.