Introduction

Recurrent respiratory papillomatosis (RRP) is a rare disease occurring in both children and adults. RRP predominantly affects the larynx and trachea, but it can spread to any part of the respiratory tract. The etiological agent of RRP is human papillomavirus (HPV) types 6 and/or 11, and rarely 16 or 18. HPV is transmitted by sexual contact (oral sex) and by vertical transmission from infected mothers to infants [1, 2]. The main symptoms of this disease are hoarseness, dysphonia, stridor, and in some cases respiratory distress. RRP is a benign but recurrent disorder that rarely becomes malignant. In addition, scarring of the laryngeal inlet is common in the course of repeated treatment. When RRP is localized on the vocal folds, the main problems are the treatment sequelae, which raise the question of how to preserve good vocal quality after multiple surgical procedures [3].

Treatment options in RRP include surgical excision and optional adjuvant therapies. The administration of adjuvant drugs such as alpha-interferon, acyclovir, bevacizumab, indole-3-carbinol, and cidofovir decreases the frequency of papilloma recurrence and reduces the number of surgeries required [4].

Intralesional cidofovir injection is a popular and effective method for treating patients with RRP, based on the viral nature of the disease. Cidofovir is an injectable antiviral medication for the treatment of cytomegalovirus (CMV) retinitis in patients with AIDS. It suppresses CMV replication by selectively inhibiting viral DNA polymerase, thereby preventing viral replication and transcription. It is an acyclic nucleoside phosphonate and is therefore independent of phosphorylation by viral enzymes. Cidofovir is the most recent adjuvant antiviral treatment for RRP, and its topical use is widely described [5]. Since 1998, several articles have reported the results of intralesional cidofovir injection in the treatment of papillomatosis [4]. Adjuvant therapy was effective in most of these studies [6]. However, there is a lack of data on voice outcomes in this group of patients.

Repeated surgery affects the structure of the vocal folds, causing scars, incomplete vocal fold closure, and poor mucosal wave, which may ultimately reduce vocal quality. Intraepithelial injections of cidofovir, as the sole treatment or as a complement to debulking procedures, reduce the number of recurrences and the intensity of papilloma growth, and thus reduce the aggressiveness of consecutive procedures; both effects favorably influence the outcome [7]. The aim of our study was to compare vocal quality before and after cidofovir and CO2 laser treatment by means of objective phoniatric parameters and to obtain evidence concerning voice outcomes. Our hypothesis, based on clinical observations, was that the additional use of cidofovir leads to significantly improved phonation in this group of patients.

Materials and methods

The study included 42 patients with RRP treated at the Department of Otolaryngology, Head and Neck Surgery, University of Medical Science in XXX, between 2008 and 2016. The mean age of patients was 28.3 years (range 8–82 years). Treatment included CO2 laser debulking (Lumenis AcuPulse 40 CO2 laser, wavelength 10.6 μm; Lumenis Ltd., Yokneam, Israel) followed by intraepithelial cidofovir injections (Vistide, Gilead Sciences Intl Ltd., Cambridge, UK). The laser was used in SuperPulse mode with power tailored to the target structures (average 7 W). The depth of tissue penetration was 1 mm with a single burst of energy lasting 0.3 ms (Fig. 1). Most of the patients had previously undergone surgery (1–105 procedures); in 7/42 patients, the videostroboscopic examination revealed extensive scar tissue covering the mucosa of the vocal folds and limiting the mucosal wave. All patients underwent vocal examination at the Phoniatrics and Audiology Department of the Medical University in XXX. Voice parameters were assessed before and after treatment using subjective voice evaluation (GRBAS), videostroboscopy, acoustic analysis of the laryngeal tone (MDVP), and the VHI.

Fig. 1 Inclusion criteria

Voice handicap index

The effect of the voice disorder on daily life was studied using the Polish translation of the Voice Handicap Index (VHI), a questionnaire containing 30 questions. The VHI has been shown to be an appropriate instrument for quantifying the psychosocial consequences of a voice disorder [8]. When needed, additional oral instruction was given while the patient completed the VHI. Each of the 30 questions was scored from 0 (“never”) to 4 (“always”), giving a total score between 0 and 120. A score between 0 and 30 represents “minimal handicap,” a score from 31 to 60 “moderate handicap,” and a score from 61 to 120 “serious handicap” [9].
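The scoring rule described above can be illustrated with a short sketch. It is not part of the study's methods: the function names `vhi_total` and `vhi_category` are hypothetical, and the cut-offs are simply those quoted in the text [9].

```python
def vhi_total(answers):
    """Sum the 30 item scores (each 0-4) into a total VHI score (0-120)."""
    if len(answers) != 30:
        raise ValueError("the VHI questionnaire has exactly 30 items")
    for score in answers:
        if score not in (0, 1, 2, 3, 4):
            raise ValueError("each item must be scored 0, 1, 2, 3, or 4")
    return sum(answers)


def vhi_category(total):
    """Map a total VHI score to the handicap category quoted in the text."""
    if not 0 <= total <= 120:
        raise ValueError("total VHI score must lie between 0 and 120")
    if total <= 30:
        return "minimal handicap"
    if total <= 60:
        return "moderate handicap"
    return "serious handicap"


# Hypothetical questionnaire: 10 items answered 3, the remaining 20 answered 1.
example_answers = [3] * 10 + [1] * 20
example_total = vhi_total(example_answers)
example_category = vhi_category(example_total)
```

Validating the item count and the score range before summing keeps an incomplete or mis-keyed questionnaire from being silently assigned to a category.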

Perceptual voice assessment

Each patient read short stories before and after treatment and was evaluated during reading by two doctors: a laryngologist (PhD, 17 years of clinical and academic experience) and a phoniatrician (PhD, 21 years of clinical and academic experience). Perceptual voice assessment was performed using the GRBAS scale. The main parameters evaluated in this study were grade of hoarseness, roughness, and breathiness. The GRB parameters appear to be the most appropriate for perceptual voice assessment in patients with RRP [10].

Videostroboscopy

The larynx was visualized with a 90° rigid laryngoscope connected to a XION stroboscope (Endo STROBO E, XION GmbH, Germany). The images were recorded on the hard disk of a PC (Lenovo, Lenovo (Singapore) Pte. Ltd., China). The quality of vocal fold vibration, closure of the glottis in stroboscopic light, and papilloma recurrence were assessed, along with the shape of the anterior commissure. Videostroboscopy was performed before and a few weeks after treatment during a standard examination.

Voice samples

Voice samples of each patient were recorded before and after treatment in a soundproof chamber at the Phoniatric and Audiology Department of XXX Medical University. All acoustic examinations were performed by the same voice analyst in a computerized speech laboratory using the Multi-Dimensional Voice Program (Kay Elemetrics Corporation, Pine Brook, NJ, USA). Each patient was asked to read a short story and to produce a prolonged vowel /a/. The maximum phonation time (MPT) and loudness were assessed during a sustained vowel /a/ produced after deep inspiration at conversational pitch; MPT was timed with a stopwatch. A voice sample from each subject was recorded on digital audiotape (DAT recorder Sony TDCD-D100, Sony Electronics Inc., Park Ridge, NJ, USA, and Sony microphone ECM-7171, Sony Corporation, Tokyo, Japan). The mouth-to-microphone distance was set at 30 cm.

Acoustic analysis

Acoustic parameters of all patients were examined before and after treatment. Acoustic voice analysis was performed using the Multi-Dimensional Voice Program (MDVP) of the Computerized Speech Lab (CSL). Patients were instructed to produce the vowel /a/ for 3 s at a comfortable pitch and volume. Three repetitions were collected and their average was used for analysis. The vowel /a/ was chosen because it is a steady-state vowel and is relatively easy for adults and children to reproduce. For the present study, three parameters were selected for analysis: short-term frequency variation (“jitter”), long-term frequency perturbation (“vFO”), and short-term amplitude perturbation (“shimmer”). The acoustic analysis was performed using the Computerized Speech Lab (MDVP, model 4305, Kay Elemetrics Corp., Lincoln Park, NJ, USA).
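As a rough illustration of what these perturbation measures quantify, the sketch below computes them from a sequence of cycle-to-cycle values using common textbook definitions. It is an assumption that these definitions approximate the MDVP measures; the algorithms implemented in the MDVP software may differ, and the function names and sample values are hypothetical.

```python
from statistics import mean, pstdev


def local_perturbation_percent(values):
    """Short-term perturbation: mean absolute difference between consecutive
    cycles, relative to the mean value, in percent.

    Applied to cycle periods this gives a jitter-like measure; applied to
    cycle peak amplitudes it gives a shimmer-like measure.
    """
    if len(values) < 2:
        raise ValueError("at least two cycles are required")
    differences = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(differences) / mean(values)


def long_term_variation_percent(values):
    """Long-term variation: standard deviation relative to the mean, in
    percent (a coefficient of variation), assumed here to resemble vFO."""
    if len(values) < 2:
        raise ValueError("at least two cycles are required")
    return 100.0 * pstdev(values) / mean(values)


# Hypothetical cycle periods (ms) from three repetitions of the vowel /a/.
repetitions = [
    [7.9, 8.1, 8.0, 8.2, 7.8],
    [8.0, 8.0, 8.1, 7.9, 8.0],
    [8.2, 8.0, 7.9, 8.1, 8.0],
]

# Average the per-repetition result, as done for the three recordings.
jitter_per_repetition = [local_perturbation_percent(r) for r in repetitions]
average_jitter = mean(jitter_per_repetition)
```

A perfectly regular signal yields 0% for both measures, so higher values indicate a less stable vibration of the vocal folds.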

End points and statistical analysis

The main predictor variable was the time point (before or after RRP treatment with cidofovir). The number of previous procedures, sex, and age were additional predictor variables. The primary outcome variables were parameters describing vocal quality: GRBAS scale, VHI score, MPT, and MDVP parameters. The secondary outcome measure was the videostroboscopy findings. The significance level for all calculations was p < 0.05. Statistical analysis was performed using Statistica 10 (StatSoft Polska, Kraków, Poland). The estimated statistical power (1 − β) of the comparison analyses ranged from 81.94 to 97.53%.

Results

The clinical characteristics of the patients are shown in Table 1. Forty-two patients were treated in the Laryngology Department. The standard procedure comprised laser ablation followed by intralesional cidofovir injection. Cidofovir injections were given to the areas previously cleared by laser. The follow-up time ranged from 6 to 48 months. Complete remission was observed in 22/42 and partial remission in 20/42 patients. Before the first cidofovir injection, the majority of patients had undergone several mechanical treatment procedures due to the recurrent nature of the disease (from 1 to more than 100 mechanical debulkings, mean 8) (Table 2). There was no statistically significant correlation between age at RRP onset and number of procedures (p > 0.05).

Table 1 Clinical data on 42 patients with RRP—results of treatment
Table 2 GRB scale

Videostroboscopy

Papillomatous lesions were mostly located on the vocal folds (33 patients), anterior commissure (23 patients), and vestibular folds (22 patients). In 12 patients the disease involved the aryepiglottic folds and supraglottic area, in nine the epiglottis, in three the posterior commissure, and in eight the hypopharynx and arytenoids. Papillomas at the entrance of the esophagus were diagnosed in one patient, in whom papillomas also covered the entire hypopharynx and larynx, with the exception of the laryngeal surface of the epiglottis. The trachea was involved in one patient with previous tracheostomy (no. 6). Diffuse disease was observed in two patients.

Closure of the glottis was almost impossible to assess because of papillomas in the 33 patients with glottic-level involvement. An anterior web was present in seven patients. The medial edge of the vocal fold mucosa was unchanged during phonation in only 5/42 patients.

Six weeks after treatment severe scars were observed on the vocal fold in seven subjects, and the remainder had partial or complete closure of the glottis. Mucosal wave and amplitude were asymmetrically reduced in the affected vocal fold in 28/42 patients.

Perceptual voice assessment

Vocal quality was poor in all patients with papilloma in the glottic region. In the perceptual analysis, patients had statistically significantly higher values for grade of hoarseness, roughness, and breathiness before treatment than after cidofovir injections combined with CO2 laser treatment. Vocal quality correlated with the number of procedures (p < 0.05) (Table 2).

Perceptual voice assessment

During spontaneous conversation, patients showed significantly better voice quality after treatment. In all patients, the maximum phonation time (MPT) was longer after treatment. The difference between MPT before and after treatment was statistically significant (Table 3).

Table 3 MPT results

Acoustic analyses

Acoustic voice analysis was performed using the MDVP of the CSL. Thirty-two patients were instructed to produce the vowel /a/ for 3 s at a comfortable pitch and volume. Three repetitions of the sustained vowel were collected and their average was used for analysis. The vowel /a/ was chosen because it is a steady-state vowel and is relatively easy for children to reproduce. It also allows extraction of frequency and amplitude measures, which were compared with previously published pediatric norms. The MDVP can extract up to 33 acoustic variables from each voice sample. For the present study, five parameters were selected for analysis: short-term frequency variation (“jitter”), long-term frequency perturbation (“vFO”), short-term amplitude perturbation (“shimmer”), long-term amplitude perturbation (“vAM”), and noise-to-harmonic ratio (“NHR”). Jitter, vFO, shimmer, and vAM are all measures of variability in the frequency and amplitude of the voice signal, whereas the noise-to-harmonic ratio is a general measure of the noise present in the analyzed voice signal. Short- and long-term variation measures and signal-to-noise ratios are common parameters used to evaluate deviant voice qualities. In the MDVP analysis, the laryngeal tone before surgery showed significantly elevated parameters defining the amplitude and frequency of the laryngeal tone. After treatment, both jitter and shimmer were reduced, along with the other acoustic parameters. More pronounced changes were observed on spectrographic analysis. After surgery, well-defined harmonic changes were observed with well-marked controls and normal speech amplitude. Nevertheless, perceptual evaluations and acoustic assessments showed that voice quality remained measurably abnormal compared with that of the matched control group (Table 4).

Table 4 MDVP—results

VHI

In terms of the VHI subscale scores, the physical subscale showed the highest scores of all subscales both before and after treatment. The total VHI score before treatment was significantly higher than that after treatment. In almost all patients, total VHI scores after treatment were lower than before treatment. The mean values after treatment showed a 35% reduction for the functional subscale, a 28% reduction for the emotional subscale, and a 42% reduction for the physical subscale. The total VHI score was reduced by 42% (Fig. 2). The reductions in total and subscale VHI scores after treatment were statistically significant (p < 0.001) for the whole group. All four VHI measures (the three subscales and the total score) showed a significant reduction after treatment.

Fig. 2 Total VHI score

We found no significant difference in total VHI scores between males and females. The significant reduction in VHI scores indicates an improvement in voice self-assessment (Table 5).

Table 5 VHI

Discussion

Despite being a histologically benign disease, recurrent laryngeal papillomatosis is considered to be a severe clinical problem due to its therapeutic resistance. A characteristic feature is the relapsing growth of papillomatous tumors in the airway, especially the larynx. Juvenile RRP occurs in patients younger than 5 years of age, whereas adult RRP typically presents during the third decade of life [1]. The most important aim in surgical therapy is to improve the patient’s vocal quality and to avoid vocal fold scars and anterior commissure web formation [11]. Long-term harmful structural and functional consequences of repeated laser therapy, and the potential risk of malignant transformation into a squamous cell carcinoma, make this disease very demanding for laryngologists, phoniatricians, and speech therapists [2].

The role of cidofovir in the therapy of recurrent laryngeal papillomatosis has not yet been sufficiently evaluated. Several studies have shown that when locally applied, cidofovir is virustatic without any side effects and, in particular, is able to significantly reduce the rate of recurrence [12, 13]. The importance of evaluating the impact of any disease on the patient’s quality of life and the outcome of various therapeutic interventions is increasing. Vocal quality and communication skills now play an essential role in social and professional life, which explains the increase in studies in the fields of otolaryngology and speech-language pathology. For example, Stewart et al. evaluated the association between voice satisfaction and quality of life in patients who had been treated for laryngeal cancer with total laryngectomy, radiotherapy, or both [14]; Wiskirska-Woźnica et al. studied voice quality in patients after reconstructive subtotal laryngectomy [15]; and Lindman et al. assessed the vocal quality of prepubescent children with recurrent respiratory papillomatosis [16].

The VHI questionnaire was chosen from among other similar tools because of its simplicity and ease of administration among children [9]. This instrument was developed in order to evaluate dysphonic patients and their treatment outcomes. Despite our patients’ collective history of diffuse disease and multiple surgical excisions, their responses to the VHI questionnaire suggest that they do not perceive their voices to have a notably negative impact on their quality of life, and that they even perceive a clearly marked improvement after cidofovir treatment.

The GRBAS scale was developed by the Committee for Phonatory Function Tests of the Japanese Society of Logopedics and Phoniatrics. It was selected because of its generality and because it is widely used among different classes of judges with consistent results. The results of the perceptual analysis performed in our study provide clinicians with a baseline of voice quality and a way to monitor the course of therapy to improve voice quality. Significant differences were found in the overall quality of the voice, roughness, and breathiness before and after surgery. These findings indicate that CO2 laser surgery combined with intraepithelial cidofovir injections led to even better voice quality.

In our study, the number of surgical procedures correlated with poor voice quality, but patient age did not. Lehto et al. found the opposite, possibly due to the smaller group of patients in their study [17].

Computer-assisted voice analysis is an important diagnostic breakthrough, providing objective measurements that are not available with other instruments.

Fundamental frequency is “an acoustic measure that directly reflects the vibrating rate of the vocal fold.” In our study patients had high fundamental frequencies after cidofovir treatment. Acoustic parameters showed a strong correlation between voice quality before and after treatment.

Chan et al. proposed that hyaluronic acid (HA), a glycosaminoglycan isolated from the extracellular matrix of the vocal fold lamina propria, may play an important role in maintaining optimal tissue firmness, which could contribute to the control of fundamental frequency [18]. Papillomatosis is known to occur most often at sites where ciliated and squamous epithelium are adjacent to each other [19]; however, the impact of this viral disease or its surgical therapy on the HA content of the vocal folds is unknown.

This is confirmed by the fact that other elements determining the amplitude and frequency of vocal fold vibration improved after treatment.

The MPT provides information about the control of the glottic region and respiratory function. All patients had longer MPTs after treatment than before treatment. Possible reasons for this include decreased air escape from the glottis during phonation, increased lung capacity, and the use of proper breathing techniques [18].

The large group of patients and the three kinds of treatment (CO2 laser, mechanical removal, and cidofovir injections) enabled us to assess the impact of surgical techniques on final voice quality.

Conclusions

Treatment with intraepithelial cidofovir injections and CO2 laser debulking can lead to a nearly normal voice. Perceptual voice assessment during spontaneous conversation revealed improved phonation, which was confirmed by objective acoustic voice analysis (MDVP). The maximum phonation time was significantly longer in all patients after treatment. The significant reduction in VHI scores indicated improved voice self-assessment and quality of life.