Background

Chronic kidney disease (CKD) is a progressive loss of renal function over a period of months or years, classified into five stages. Each successive stage is marked by a lower glomerular filtration rate, which is usually estimated indirectly from the creatinine level in the blood serum. As kidney disease progresses, it may lead to kidney failure and possibly require dialysis or a kidney transplant to maintain life. CKD can be caused by diabetes, high blood pressure, and other disorders. It can be detected through three simple tests: blood pressure, urine albumin level, and serum creatinine level [1]. Early detection and treatment of CKD can help prevent patients' conditions from getting worse.

CKD afflicts people all over the world, and thus there is an urgent need for all countries to have a public health policy for dealing with it. In the U.S.A., for example, CKD is a serious public health problem, with national surveys showing a considerably higher prevalence than previously appreciated [2, 3]. According to the analysis of the National Kidney Foundation in the U.S.A., 26 million Americans have CKD and another 20 million are at increased risk of developing it. The American Diabetes Association (ADA) has stated that 20%-30% of individuals with diabetes develop CKD. This is in spite of the facts that the U.S.A. has good quality medical care and that CKD is one of the most preventable of the many serious complications of diabetes. According to Josef Coresh's study of data from the National Health and Nutrition Examination Surveys (NHANES), 10% of Americans had chronic kidney disease between 1988 and 1994, and 13% between 1999 and 2004 [4, 5]. Driving the increase is a dramatic rise in diabetes and high blood pressure, each of which can lead to chronic kidney disease.

In Taiwan, as another example, CKD is the eighth leading cause of death. The mortality rate increased from 11.39 per 100,000 population in 1990 to 20.8 per 100,000 population in 2004. The incidence of end-stage renal disease (ESRD) in Taiwan is the highest in the world according to ESRDS 2002 statistics. In that same year, the incidence of CKD in Taiwan ranked as the second highest in the world, just after Japan. Hsu (2006) rates the prevalence of CKD stages 3 to 5 in Taiwan at 6.9% [6]. The same research also concludes that the high prevalence and low awareness of CKD in Taiwan show the need to advocate more strongly for CKD prevention and education for both physicians and the general populace [6].

Dialysis and kidney transplants are too costly for most people living outside the industrialized world, and too costly even for a large number of people living in industrialized countries. For these people, prevention, early detection, and intervention are the only cost-effective strategies for CKD treatment. For public health programs based on prevention, early detection, and intervention to succeed, however, the informed and active participation of the public is required. Health education programs can deal with the informed aspect of the public's required participation, but not with all aspects of the active part. One important factor in how willingly and actively people cooperate in a public health program is their perception of the quality of the program's service. Perceptions that the quality of the service is poor will result in less willing and less active participation, while perceptions that the service quality is good should result in more willing and more active participation. Thus, accurate and practical measurement tools for assessing participants' perceptions of the quality of health care services are important. Results of such assessments can be used to determine areas with perceived and/or actual poor service quality, so that the service quality and/or the perception of service quality in those areas can be addressed and improved.

Defining service quality

Gronroos (1984) argued that there are two distinct constituents of service quality: the technical and the functional [7]. In the health care field, technical quality focuses on the technical accuracy of medical diagnoses and procedures, while functional quality is the manner in which health care is provided. However, in the context of health care, technical quality is difficult for patients to evaluate [8], and this results in most patients evaluating health care on its functional aspects alone. Parasuraman defined service quality as the difference between customer expectations and customer perceptions; when expectations are greater than perceptions, a service quality gap arises [9].

Patients' satisfaction should be interpreted carefully due to the lack of theoretical foundations on which the concept of satisfaction and its measurement are based [8]. Patients are active consumers of health-care services rather than merely passive recipients [10]. The validity and reliability of many studies of health-care consumers' satisfaction have been questioned [11].

The original PZB (Parasuraman, Zeithaml, and Berry) model identified 10 determinants of service quality. The subsequently developed SERVQUAL [12] recast the 10 determinants into five components: tangibles, reliability, responsiveness, assurance, and empathy. These five components are derived from a factor analysis of the 22-item scale. Measuring quality of care from the patient's perspective has been increasingly used and accepted in health care [13-15]. One study used the SERVQUAL service quality instrument to measure the expectations and perceptions of Greek patients regarding dental health care [9, 12, 16, 17]. Another study used a refined version of SERVQUAL to measure patient satisfaction in health services in Bangladesh [18], and found that the "tangible" factor was the most important factor in health service quality. Li and Amir also applied SERVQUAL to measure patient satisfaction with breastfeeding education and support services [19]. Cock concluded that REFERQUAL, which is derived from SERVQUAL, holds promise as a suitable tool for future evaluation of service quality within the Exercise Referral Systems (ERS) community [20].

Method

Patients and Institution

Taichung City is located in central Taiwan with a population of more than one million people. In order to improve chronic kidney disease (CKD) prevention, awareness, and education, the Taichung City Public Health Bureau conducted a series of 20 free CKD screenings and prevention lectures for the community population from 2006 to 2008 (Program Number PHB-SK-93-001), as part of health exams in community centers and eight district clinics of Taichung City. Participant recruitment was effected through advertising and word of mouth. Screening recipients voluntarily chose to participate, learning of the program from posters in community centers or city district clinics, or hearing of the program from someone else. The questionnaires were self-administered, with in-person help available.

The preventative aspect of the goal was to identify people with abnormal kidney function at an early stage and to treat identified kidney diseases efficiently and effectively using a case management and follow-up model. Persons whose estimated glomerular filtration rate (eGFR) from the screening was below 60 ml/min/1.73 m2 were contacted and encouraged to enroll in the CKD case management plan. The management plan included several interventions, such as health education by public health care nurses (in person or by phone interview) every three months during treatment, case management register card records kept by public health care nurses, and a survey of all cases' kidney function after the first year of health education. The website shows how many people attended this program and the summarized results. The details of the personal health information results were emailed or mailed to the participants.

The study obtained results from a total of 1595 kidney disease screens; 27.8% of those screened were male and 64.9% were aged above 50 years. Based on the National Kidney Foundation's definition, the screening results show that 30.3% of the participants were in Stage 1 (glomerular filtration rate >90 ml/min/1.73 m2), 57.2% were in Stage 2 (glomerular filtration rate of 60-89 ml/min/1.73 m2), 11.7% were in Stage 3 (glomerular filtration rate of 30-59 ml/min/1.73 m2), 0.7% were in Stage 4 (glomerular filtration rate of 15-29 ml/min/1.73 m2), and 0.1% were in Stage 5 (glomerular filtration rate <15 ml/min/1.73 m2).
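The stage boundaries listed above can be written as a small classifier. This is only a sketch: the function names are hypothetical, the text leaves a value of exactly 90 unassigned (it is treated here as Stage 1, an assumption), and the study may have used additional clinical criteria beyond the filtration rate that are not described in this section.

```python
def ckd_stage(egfr: float) -> int:
    """Return the CKD stage (1-5) for an eGFR value in ml/min/1.73 m2."""
    if egfr < 0:
        raise ValueError("eGFR cannot be negative")
    if egfr >= 90:      # assumption: exactly 90 is counted as Stage 1
        return 1
    if egfr >= 60:
        return 2
    if egfr >= 30:
        return 3
    if egfr >= 15:
        return 4
    return 5


def needs_case_management(egfr: float) -> bool:
    """Flag screens below the follow-up cutoff of 60 used by the program."""
    return egfr < 60


# Hypothetical eGFR values, one per stage.
for value in (95.0, 72.5, 45.0, 20.0, 9.0):
    print(value, ckd_stage(value), needs_case_management(value))
```

With these boundaries, the follow-up cutoff of 60 coincides with Stages 3 to 5.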

An adapted and revised SERVQUAL questionnaire was used in the study. A total of 1595 consecutive patients who attended the kidney disease screening program in Taichung City were requested to fill out the questionnaire. A total of 1187 questionnaires were received, of which 102 were excluded due to incompleteness, leaving 1085 effective questionnaires for analysis. The paired t-test, a correlation test, ANOVA, and factor analysis were used to identify the characteristics and factors of quality in the kidney disease service. The paired t-test was used to test the gaps between patients' expectation scores and perception scores. In addition, structural equation modeling (SEM) was used to examine the relationships among satisfaction-based components, and the patient satisfaction model was assessed using goodness-of-fit measures. The SEM approach is considered appropriate for estimating relationships among multiple dependent and independent latent variables, and provides a better model of the complex relationships among satisfaction components.
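The questionnaire counts above imply two different rates, which are easy to confuse. The following lines simply redo the arithmetic from the reported counts; the variable names are illustrative.

```python
approached = 1595   # patients asked to fill out the questionnaire
received = 1187     # questionnaires returned
excluded = 102      # returned but incomplete

effective = received - excluded           # questionnaires kept for analysis
return_rate = received / approached       # share of patients who returned a form
effective_rate = effective / approached   # share of patients with a usable form

print(effective)
print(f"{return_rate:.1%}")
print(f"{effective_rate:.1%}")
```

The return rate is based on all received questionnaires, while the effective rate counts only the complete ones.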

Instruments

SERVQUAL represents service quality as the discrepancy between a customer's expectations for a service offering and the customer's perceptions of the service received [12]. The original SERVQUAL contains 22 paired Likert-scale items covering five service-quality dimensions: tangibility, reliability, responsiveness, assurance, and empathy. The questionnaire used in this study (see Table 1) has three parts, and uses a 7-point Likert scale (strongly disagree = 1 to strongly agree = 7). The first part, the perception and expectation (quality gap) component, is composed of the 22 paired items on service quality. The second part, the loyalty component, has two items on loyalty, which rate overall satisfaction and willingness to recommend to a friend [21, 22]. These loyalty items can serve as anchor items to examine the criterion-related validity of the scale [22]. The third part of the questionnaire is the patient background data component, covering areas such as sex, age, job, and educational degree.

Table 1 Chronic Kidney Disease Screening Questionnaire
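As an illustration of how paired items yield a quality gap, the sketch below computes perception minus expectation for each item and averages the gaps within a dimension. The responses, the `P` item labels, and the item-to-dimension mapping are hypothetical and are not taken from Table 1; a negative gap means expectations exceeded perceptions.

```python
from statistics import mean

# Hypothetical 7-point responses from one respondent, keyed by item label.
expectation = {"E1": 7, "E2": 6, "E3": 7, "E4": 6}
perception = {"P1": 6, "P2": 6, "P3": 5, "P4": 7}

# Illustrative mapping of item numbers to dimensions (an assumption).
dimensions = {"tangibility": [1, 2], "reliability": [3, 4]}


def gap_scores(expect, perceive, n_items):
    """Return perception minus expectation for each paired item."""
    return {i: perceive[f"P{i}"] - expect[f"E{i}"] for i in range(1, n_items + 1)}


def dimension_gaps(item_gaps, mapping):
    """Average the item gaps within each dimension."""
    return {name: mean(item_gaps[i] for i in items) for name, items in mapping.items()}


item_gaps = gap_scores(expectation, perception, n_items=4)
print(item_gaps)
print(dimension_gaps(item_gaps, dimensions))
```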

Power analysis

For a statistical power of 0.9999, the required sample size is 364.

According to the calculation of Get PS version 3.0, 2009, when α equals 0.05 in a two-tailed test, and the sample size is 329, the power is 0.9999. Prior data indicate that the difference in the response of matched pairs is normally distributed with standard deviation 1. If the true difference in the mean response of matched pairs is 0.3, we need to study 364 pairs of subjects to be able to reject the null hypothesis that this response difference is zero with probability (power) 0.9999.
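The order of magnitude of this calculation can be checked with a normal approximation for a two-sided paired test. The sketch below is not the procedure used by the software named above; the function names are hypothetical, and because the approximation ignores the t distribution it gives a figure close to, but not identical with, the reported sample size.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def paired_sample_size(delta, sd, alpha=0.05, power=0.9999):
    """Pairs needed to detect a mean paired difference `delta` (normal approximation)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(((z_alpha + z_power) * sd / delta) ** 2)


def paired_power(n, delta, sd, alpha=0.05):
    """Approximate power with n pairs; the negligible opposite tail is ignored."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(delta * sqrt(n) / sd - z_alpha)


# Inputs taken from the text: difference 0.3, standard deviation 1, alpha 0.05.
print(paired_sample_size(delta=0.3, sd=1.0))
print(paired_power(n=1085, delta=0.3, sd=1.0))
```

Under these assumptions, the 1085 effective questionnaires comfortably exceed the number of pairs required.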

Trial Registration

This program has been waived from trial registration by the Department of Health, Taichung City, Taiwan R.O.C., within Document PHB-SK-93-001.

Reliability and Validity

Internal consistency reliability

The expectation and perception satisfaction scales had Cronbach's alpha coefficients > 0.902. The "item to total" correlations ranged from 0.36 to 0.90.
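For readers unfamiliar with the statistic, Cronbach's alpha compares the sum of the item variances with the variance of the total score. The sketch below uses the standard formula on hypothetical responses; it is not the study's data or the software routine the authors used.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 7-point responses: 5 respondents x 4 items.
responses = [
    [7, 6, 7, 7],
    [6, 6, 5, 6],
    [5, 4, 5, 5],
    [7, 7, 6, 7],
    [4, 5, 4, 4],
]
print(round(cronbach_alpha(responses), 3))
```

Values close to 1 indicate that the items vary together, which is what coefficients above 0.9 suggest for the scales in this study.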

Content validity

Content validity of the questionnaire was confirmed by 3 kidney specialists and 2 healthcare management specialists. Triangulation of content validity was achieved through several literature reviews on the SERVQUAL service model [12-14].

Construct validity

On the basis of a review of the literature, the latent construct of patient expectations and perceptions of quality was theorized to be multidimensional. Factor analysis of the survey data identified three dimensions for expected and perceived quality [23].

Criterion-related validity and predictive validity

Criterion-related validity and predictive validity, shown in Figure 1 and Figure 2, indicate that the expected quality scale is correlated with the perceived quality scale, and that the perceived quality scale is correlated with the dimension of loyalty, which includes overall satisfaction and willingness to recommend to friends [22]. In addition, the goodness-of-fit indices provide model validity [24].

Figure 1

SEM on patients' satisfaction model 1. This is the initial SEM patients' satisfaction model.

Figure 2

SEM on patients' satisfaction model 2. This is the final model, which shows that perceptions are positively correlated with expectations and that loyalty is positively correlated with perceptions.

Convergent validity

Bollen's Rho coefficients were 0.85 and 0.91, both of which are greater than 0.70.

Statistical Analysis

The software STATISTICA® Version 7.1 was used for statistical analysis throughout this research. The Student t-test, a correlation test, ANOVA, and the Least Significant Difference (LSD) test were used to compare mean expectation and perception scores across patients' characteristics. Factor analysis, which is a data-reduction technique, was used to determine the number and nature of the factors of service quality that underlie our set of variables [25]. The principal axis method was used to extract all factors that had eigenvalues greater than 1 and that therefore account for a significant amount of the total variance. Scree tests were used to identify the number of factors to retain. The paired t-test was used to test the gap between expectation scores and perception scores. Structural equation modeling was used to examine relationships among satisfaction components. The three research hypotheses of this study are as follows.

H1: Perceptions are positively correlated with expectations.

H2: Loyalty is positively correlated with perceptions.

H3: Loyalty is positively correlated with expectations.

The hypotheses were tested under the SEM using the STATISTICA®7.1 package. The parameters estimated were the regression coefficients in the structural equation part of the SEM. The assessment of model adequacy was based on the following goodness-of-fit criteria: normed chi-square (χ2/df) <3, root mean square error of approximation (RMSEA) < 0.08, population gamma index (PGI), adjusted population gamma index (APGI), goodness-of-fit (GFI), adjusted goodness-of-fit (AGFI), and Bollen's Rho >0.8 [26].
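The cutoffs above can be applied mechanically once the fit statistics are known. The sketch below uses one common sample estimate of RMSEA and hypothetical input values, not the study's Table 9 results; it also assumes that the 0.8 threshold applies to every listed index, which the text does not state explicitly.

```python
from math import sqrt


def normed_chi_square(chi2, df):
    """Chi-square divided by its degrees of freedom."""
    return chi2 / df


def rmsea(chi2, df, n):
    """One common sample estimate of RMSEA; some packages use n instead of n - 1."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def check_fit(chi2, df, n, indices):
    """Compare fit statistics with the cutoffs listed in the text."""
    results = {
        "chi2/df < 3": normed_chi_square(chi2, df) < 3,
        "RMSEA < 0.08": rmsea(chi2, df, n) < 0.08,
    }
    for name, value in indices.items():
        results[f"{name} > 0.8"] = value > 0.8
    return results


# Hypothetical values for illustration only.
report = check_fit(chi2=250.0, df=50, n=1085, indices={"GFI": 0.95, "AGFI": 0.92})
for criterion, passed in report.items():
    print(criterion, passed)
```

With a large sample, the normed chi-square can fail its cutoff even when RMSEA and the other indices pass, as in this hypothetical case.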

Results

The patients' characteristics are presented in Table 2. The mean age of the study population was 44.72 years, with a standard deviation of 12.12 years; 77.97% (N = 846) of the participants were female, and 67.1% (N = 729) had a college degree or higher. In Table 3, the Student t-test on gender shows that females have significantly higher expectation levels than males. However, there is no significant difference in perception scores. Furthermore, the correlation test shows no significant relationship between age and expectation. However, results from other studies suggest that older patients have higher perception scores. The ANOVA results show significant differences in expectation and perception scores for patients with different types of jobs. The LSD test results reported in Table 4 and Table 5 show further details. Regarding expectations, business and labor workers have lower scores, and housekeepers and farmers have higher scores. With respect to perceptions, public service and business workers have lower scores, and farmers and service workers have higher scores. The most interesting result is for the amount of education: the higher the amount, the lower the scores for both expectation (r = -0.09) and perception (r = -0.26).
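The negative r values for education can be read as correlation coefficients between an education level code and the scores. The sketch below computes a Pearson coefficient on hypothetical data; the text does not say which coefficient or which coding of education was used, so both are assumptions here.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: education coded 1 (lowest) to 5 (highest),
# paired with each respondent's mean perception score.
education = [1, 2, 2, 3, 4, 4, 5, 5]
perception_scores = [6.8, 6.5, 6.9, 6.2, 6.0, 6.4, 5.5, 5.9]
print(round(pearson_r(education, perception_scores), 2))
```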

Table 2 Patients' Characteristics
Table 3 Satisfaction Statistical Test Results With Patient Demographics
Table 4 LSD Test On The Job Variable In Expectation
Table 5 LSD Test On The Job Variable In Perception

The patient score results are very high for expectations, 6.50 (0.82), and perceptions, 6.14 (1.02), as seen in Table 6. Table 6 also shows the gaps between patients' expectations and perceptions. Twenty-three paired t-tests (one on each of the 22 PZB items plus one on the overall average) were conducted. The results show that in all dimensions patients had significantly higher scores for expectations than for perceptions.

Table 6 Paired t-Test Of Kidney Disease Screening Service
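Each paired t-test works on the per-respondent difference between the perception and expectation scores for one item. The sketch below computes the t statistic on hypothetical scores; obtaining the p-value would additionally require the t distribution, which is left to a statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(expectations, perceptions):
    """Return (mean gap, t statistic, degrees of freedom) for paired scores.

    The gap is perception minus expectation, so a negative value means
    that expectations exceeded perceptions.
    """
    if len(expectations) != len(perceptions):
        raise ValueError("paired samples must have equal length")
    diffs = [p - e for e, p in zip(expectations, perceptions)]
    n = len(diffs)
    mean_gap = mean(diffs)
    # Raises ZeroDivisionError if all differences are identical.
    t_stat = mean_gap / (stdev(diffs) / sqrt(n))
    return mean_gap, t_stat, n - 1


# Hypothetical 7-point scores for one item from eight respondents.
expected = [7, 7, 6, 7, 6, 7, 7, 6]
perceived = [6, 7, 5, 6, 6, 6, 7, 5]
gap, t_stat, df = paired_t(expected, perceived)
print(f"mean gap = {gap:.3f}, t = {t_stat:.2f}, df = {df}")
```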

The factor loading results of the factor analysis, seen in Table 7, identify three factors in the SERVQUAL model for both perceived satisfaction scores (accounting for 86.91% of the total variance) and expected satisfaction scores (accounting for 80.78% of the total variance). According to the results in Table 8, for patient expectations, Factor 1 is responsiveness, which combines three of the original PZB model's factors: reliability, responsiveness, and assurance. Factor 2 is empathy, and Factor 3 is tangibility. For patient perceptions, however, Factor 1 is empathy, Factor 2 is tangibility, and Factor 3 is responsiveness. The eigenvalue criterion and scree tests further confirm these three factors.

Table 7 Factor Loading Of Patient Satisfaction
Table 8 Questionnaire Reliability

Figure 1, Figure 2, and Table 9 summarize the goodness-of-fit results of the structural equation modeling, showing the directions and concepts in expectations, perceptions, and loyalty. Since Model 1 (Figure 1) does not show adequate results, it was revised into Model 2 (Figure 2). The revised model shows adequate results for RMSEA, PGI, APGI, GFI, AGFI, and Bollen's Rho, for which values above 0.8 are considered adequate. Only the χ2/df is unsatisfactory, being higher than the usual criterion of 3.0. Based on the SEM results, Research Hypothesis H1 (perceptions are positively correlated with expectations) and H2 (loyalty is positively correlated with perceptions) are accepted, and H3 (loyalty is positively correlated with expectations) is rejected.

Table 9 Goodness-Of-Fit Summary For Patient Satisfaction Models

Discussion and Conclusion

One of the strong points of this research study is the high response rate (1187/1595 = 74.4%), which reduces non-response bias. This rate is just slightly lower than the 79% rate of similar research in Tso [22], but higher than the 63% response rate in Hendriks [27], the 48.8% rate in Oltedal [26], and the 25.6% rate in Bankauskaite [8]. The patients' highest expectations were on the items "Did the screening insist on error free records?" (E9) and "Did the staff instill confidence in you?" (E14), which means that patients were most concerned about the accuracy of the screen, with respect to both the equipment used and the people who operated the equipment and administered the screens. Some mechanisms can be implemented to improve screening accuracy, such as ensuring the use of well-trained medical and nursing staff, high-level technology and equipment, and the standard procedures and certification system of ISO (International Organization for Standardization) in the screening program.

The research results show generally high scores for patient expectations (6.50/7 = 92.9%) and perceptions (6.14/7 = 87.7%). In comparison, a study in India on outpatient (n = 1837) and inpatient services (n = 611) in primary health centers and district hospitals reports scores lower than those in this study, ranging from 3.63/5 = 72.6% to 3.74/5 = 74.8% [22]. In addition, the scores reported in Lin's study [15] on solo practice and group practice are also lower than those from this study, ranging from 3.73/5 = 74.6% to 4.11/5 = 82.2%. In this study, females have higher expectation scores than males, which is similar to results from another study that focused on asthma patients [28]. Furthermore, a study on lung cancer patients found that a low educational level is significantly related to better patient satisfaction regarding nurse care, medical care, and other staff care, which accords with this study's results [29]. However, another study on laser-assisted in situ keratomileusis (Lasik) services showed that gender and education level did not play a significant role in patients' satisfaction [30].
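The percentages used in these comparisons are mean scores divided by the scale maximum, which puts 5-point and 7-point scales on a common footing. A minimal sketch of that conversion follows; the function name is hypothetical.

```python
def percent_of_max(score, scale_max):
    """Express a mean Likert score as a percentage of the scale maximum."""
    if not 0 < score <= scale_max:
        raise ValueError("score must lie within the scale")
    return 100 * score / scale_max


# Mean scores quoted in the text: this study (7-point) and the India study (5-point).
for score, scale_max in ((6.50, 7), (6.14, 7), (3.63, 5), (3.74, 5)):
    print(score, scale_max, round(percent_of_max(score, scale_max), 1))
```

Because these Likert scales start at 1 rather than 0, the lowest possible percentage differs between a 5-point and a 7-point scale, so such comparisons are approximate.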

The results of this research suggest that the SERVQUAL instrument can be a useful measurement tool for assessing and monitoring service quality in chronic kidney disease screening services, enabling staff to identify where improvements are needed from the patient's perspective. There were service quality gaps on all three dimensions. This means that the government workers who administered the screenings did not meet patients' expectations, and more on-the-job training in areas such as etiquette is needed to provide better service. Also, positive incentives for the personnel involved in a screening to achieve higher patient satisfaction scores, and the provision of more health education and information on the screening process to those being screened to make sure their expectations are reasonable, could be effective strategies to use in the future. This study also raised a number of issues, such as a need for more follow-up research on the patients in Stage 3 to Stage 5. Finally, further validation studies in various disease screening programs in Taiwan and other countries are suggested to make future cross-cultural comparisons possible.

Limitations

This research has some limitations. One is that the questionnaires were administered during the screening process and were answered anonymously. Thus they did not include information on the severity of the participants' CKD, and could not later be linked to the participants who filled them out. Another limitation is that 72.2% of the participants were females, although the percentage of females in the general population is much lower, meaning that if there is a gender-related difference in attitudes, the results are likely influenced by it. A third limitation of the study is that its participants were all people who went to a clinic or community center for a health exam, and who thus do not necessarily represent the parts of the general population who for whatever reasons did not do so.

Gender selection bias is a problem for the potential representativeness of the study's results, which future research should address. As discussed in the Method section, the participants were self-recruited, and thus a recruitment/participation rate is not really applicable. In this study, gender self-selection suggests that female participants' decision to participate may be correlated with their traits; for example, females may have had more time than males to respond to the questionnaire. Such traits may affect the study by making the participants a non-representative sample.

Another limitation is the 25.6% (100% - 74.4%) non-response rate, which may introduce non-response bias. Furthermore, a ceiling effect in the expectation and perception data is also a limitation, as the statistical results showed a skewed data distribution. A ceiling effect can affect the means, variances, reliabilities, and validities of an instrument. The observed data distribution therefore implies that the ceiling effect may have adversely affected the measurement of patient satisfaction by the instrument used in this study.