After applying the search strings, 628 potentially eligible articles were identified, of which 473 were excluded based on title and 52 were duplicates. Another 67 manuscripts were excluded after reviewing the abstract. Contact with the author of one abstract revealed that the trial was still actively recruiting patients. In the next phase of the selection procedure, 35 full articles were reviewed, of which 24 did not meet the predefined eligibility criteria. Two studies were published twice [10, 16, 25, 26]. One report was considered the index report; the other article was searched for additional information. Data from both articles were included in this study. One manuscript was a 13-year follow-up of a previously conducted RCT. Data from both reports were included in the analysis. In total, 11 articles covering eight studies, involving 986 patients, were included in the present review and meta-analysis [13–17, 25–29] (Fig. 1).
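The selection counts above can be checked with simple arithmetic. The sketch below is only a bookkeeping aid; it assumes that the abstract of the trial that was still recruiting counted as one exclusion, which the text implies but does not state, and the variable names are illustrative.

```python
# Counts taken from the selection paragraph above.
identified = 628
excluded_before_full_text = {
    "title": 473,
    "duplicate": 52,
    "abstract": 67,
    "trial still recruiting": 1,  # assumption: counted as an exclusion
}

# Articles that reached full-text review.
full_text_reviewed = identified - sum(excluded_before_full_text.values())

# Articles remaining after applying the eligibility criteria.
included_articles = full_text_reviewed - 24

# Two studies were each reported in two articles, and one study had an
# additional long-term follow-up report, so three articles are secondary reports.
secondary_reports = 2 + 1
included_studies = included_articles - secondary_reports

print(full_text_reviewed, included_articles, included_studies)
```

Under that assumption the counts reproduce the 35 full-text articles, 11 included articles and eight studies reported in the text.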
Tables 1, 2, 3 and 4 summarize the methodological quality, the methodological characteristics, the characteristics of the interventions and the characteristics of the individual studies. Two studies also included a third (internal fixation) arm [17, 27, 28]. These data were not taken into account, as internal fixation was not assessed in the present study. In all studies, inclusion and exclusion criteria were clearly defined beforehand in order to select patients who were ambulatory and cognitively fit before the fracture. The quality of the individual parameters ranged from low to very low (Table 1). In three studies, sealed envelopes were used for randomization [14–16, 26]; one of these was described as block randomization. A fully automated computerized allocation system was used in two studies [10, 13]. Other methods of treatment allocation were by hospital number, by fixed treatment sequence, and by order of admission. The outcome assessor was blinded to the allocated treatment in only one study. Patients were not blinded to treatment in any of the studies. Three studies [10, 15, 16, 25, 26] stated an intention-to-treat analysis, one a per-protocol analysis, and four studies did not specify the method of data analysis [14, 17, 27–29]. For all eight studies [13–17, 25–29] the follow-up period was at least one year (Table 2). All patients in the THA arm were treated with a cemented stem, except in one study in which both cemented and uncemented stems were used. For patients treated with hemiarthroplasty, both cemented and uncemented stems were used in two studies [16, 29]; in one study cementing of the stem was not specified. In four studies cemented stems were used, and in one study uncemented stems were used.
In three studies [13, 14, 28] only unipolar heads were used, in three studies [10, 15, 29] only bipolar heads were used, in one study both types of heads were used, and one study did not specify the polarity of the head component of the hemiarthroplasty (Table 3).
The exact recruitment period was not specified in three studies [14–16]. The number of patients per arm ranged from 17 to 137. Three studies [15, 28, 29] used a single-center design; five studies [10, 13, 14, 16, 17] were performed with a multicenter approach (Table 4).
Data on revision surgery and reported planned revision surgery were pooled, totaling 986 patients and 55 events (5 %). Revision surgery was performed in 4 % of patients in the THA arm versus 7 % in the HA arm (Fig. 2). There was little evidence of heterogeneity across the studies (I² = 9 %, P = 0.36). No statistically significant difference in revision surgery between the two groups was found (relative risk, RR 0.59, 95 % confidence interval, CI 0.32–1.09; absolute risk difference, ARD −0.02, 95 % CI −0.06 to 0.01). However, the pooled data showed a trend towards less revision surgery after total hip arthroplasty than after hemiarthroplasty.
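To illustrate how the reported interval relates to the statement that the difference was not significant, the following minimal sketch recovers an approximate standard error, z statistic and two-sided P value from the pooled RR and its 95 % CI. It assumes a normal approximation on the log scale with a critical value of 1.96; the source does not state how its interval was computed, and the function name is hypothetical.

```python
import math

def z_and_p_from_ratio_ci(ratio, lower, upper, z_crit=1.96):
    """Approximate SE of log(ratio), z and two-sided p from a ratio and its CI.

    Assumes the CI was built as exp(log(ratio) +/- z_crit * SE).
    """
    se_log = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(ratio) / se_log
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p under a standard normal
    return se_log, z, p

# Pooled revision surgery result reported above: RR 0.59, 95 % CI 0.32-1.09.
se_log, z, p = z_and_p_from_ratio_ci(0.59, 0.32, 1.09)
print(f"SE(log RR) = {se_log:.3f}, z = {z:.2f}, two-sided p = {p:.3f}")
```

Because the interval includes 1, the recovered P value lies above 0.05, in line with the description of a trend rather than a significant difference.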
Data for mortality at one year were pooled. Six of the eight selected studies provided adequate data on one-year mortality [10, 13, 15–17, 28], involving a total of 816 patients and 117 deaths (overall 14 %; Fig. 3). The one-year mortality was 13 % in the THA arm versus 15 % in the HA arm. There was no evidence of heterogeneity (I² = 0 %, P = 0.79). The pooled one-year mortality did not differ between patients who had undergone total hip arthroplasty and those who had undergone hemiarthroplasty (RR 0.91, 95 % CI 0.65–1.27; ARD −0.01, 95 % CI −0.05 to 0.03).
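For readers less familiar with the heterogeneity statistic, the sketch below shows the usual definition of I² in terms of Cochran's Q and its degrees of freedom. The Q values in the example are hypothetical, since the source reports only I² and P, and the function name is illustrative.

```python
def i_squared(q, df):
    """I-squared in percent: share of variability beyond chance, truncated at 0."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

# Hypothetical values for illustration only (not taken from the source):
# six studies give df = 5.
print(i_squared(3.0, 5))   # Q below df, so I-squared is truncated to 0
print(i_squared(10.0, 5))  # half of Q exceeds what chance alone would explain
```

An I² of 0 %, as reported for one-year mortality, therefore indicates that Q did not exceed its degrees of freedom.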
Six of the included studies provided data on dislocation [10, 13, 14, 16, 28, 29] (Fig. 4). One other study did not report on dislocation, and one study reported that there were no dislocations in either treatment arm. The risk of dislocation was 9 % in the THA arm versus 3 % in the HA arm. There was little evidence of heterogeneity across the studies (I² = 30 %, P = 0.21). Pooling the data of these 780 patients and 47 events (6 %) revealed a significantly higher risk of dislocation after treatment with total hip arthroplasty for displaced femoral neck fractures (RR 2.53, 95 % CI 1.05–6.10; ARD 0.05, 95 % CI 0.02–0.08).
Complications (Appendix 2)
Data on major complications were retrieved from five studies [10, 13–16] (Fig. 5). One further study reported data on minor and major complications, but these data had to be excluded because they were not reported separately for the two treatment groups. The outcome measures of two other studies focused on functional recovery only, and data on general complications were not presented [17, 28]. Major complications occurred in 25 % of patients after THA versus 24 % after HA. No significant difference in major complication rates was found between the two forms of arthroplasty (RR 1.07, 95 % CI 0.76–1.50; ARD 0.00, 95 % CI −0.08 to 0.08). Heterogeneity across the studies was 17 % (P = 0.31).
The same five studies also presented data on general minor complications [10, 13–16] (Fig. 6). Heterogeneity across the five studies was 39 % (P = 0.16). Minor complications occurred in 13 % of patients after THA versus 14 % after HA. With the three studies mentioned above excluded from the analysis, the pooled data showed no significant difference in general minor complications (RR 0.94, 95 % CI 0.56–1.58; ARD −0.01, 95 % CI −0.08 to 0.07).
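A quick consistency check on the pooled relative risks reported so far: if each interval was constructed on the log scale, which is the usual approach but an assumption here, the point estimate should equal the geometric mean of the interval limits up to rounding. The sketch below uses only the values quoted in the text; the dictionary and function names are illustrative.

```python
import math

# Pooled RR and 95 % CI limits as reported in the text above.
reported = {
    "revision surgery": (0.59, 0.32, 1.09),
    "one-year mortality": (0.91, 0.65, 1.27),
    "dislocation": (2.53, 1.05, 6.10),
    "major complications": (1.07, 0.76, 1.50),
    "minor complications": (0.94, 0.56, 1.58),
}

def geometric_midpoint(lower, upper):
    """Midpoint of an interval that is symmetric on the log scale."""
    return math.sqrt(lower * upper)

midpoints = {
    name: geometric_midpoint(lower, upper)
    for name, (_, lower, upper) in reported.items()
}

for name, (rr, _, _) in reported.items():
    print(f"{name}: reported RR {rr:.2f}, log-scale midpoint {midpoints[name]:.2f}")
```

For all five outcomes the midpoint agrees with the reported RR to two decimals, so the quoted estimates and intervals are internally consistent.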
Four studies reported the Harris hip score (HHS) after total follow-up [13, 15–17]. The Harris hip score ranges from 0 to 100 points and covers function, pain, deformity and range of motion. The weighted mean HHS was 81 (weighted mean SD 11) versus 77 (12) for THA and HA, respectively. Patients treated with THA had statistically significantly higher Harris hip scores (mean difference, MD 5.12, 95 % CI 2.81–7.42). Heterogeneity across the studies was 0 % (P = 0.46) (Fig. 7).
From two papers it was possible to calculate the pain subdomain of the Harris hip score separately [13, 15]. The weighted mean score for the pain subdomain of the HHS was 42 (weighted mean SD 2) versus 39 (3) for THA and HA, respectively. This difference was statistically significant in favour of THA (MD 2.62, 95 % CI 0.18–5.05) (Fig. 8).
Two studies [10, 28] reported the proportion of patients with no to mild pain (without analgesia) after total follow-up. No to mild pain was reported in 75 % of patients after THA and in 56 % after HA. These pooled data also showed a significant difference in favour of the THA group (RR 1.36, 95 % CI 1.20–1.54). Heterogeneity across the studies was 0 % (P = 0.39) (Fig. 9).
One study separately reported pain as scored with the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) questionnaire. The calculated mean difference was 16.60 points (THA 94.4, SD 6.8 versus HA 77.8, SD 20.9; 95 % CI 5.00–28.20, P = 0.005), favouring THA (Fig. 10).
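The WOMAC figures can be cross-checked against each other. The sketch below recomputes the mean difference from the two group means and, assuming a normal approximation with a critical value of 1.96 (the source does not state which test was used), recovers the standard error and two-sided P value from the reported interval. Variable names are illustrative.

```python
import math

# Values reported above for the WOMAC pain score.
mean_tha, mean_ha = 94.4, 77.8
ci_lower, ci_upper = 5.00, 28.20

# Mean difference between the arms.
mean_difference = mean_tha - mean_ha

# Standard error implied by a symmetric 95 % CI (assumption: z = 1.96).
se = (ci_upper - ci_lower) / (2 * 1.96)

# z statistic and two-sided p value under a standard normal distribution.
z = mean_difference / se
p = math.erfc(abs(z) / math.sqrt(2))

print(f"MD = {mean_difference:.2f}, SE = {se:.2f}, z = {z:.2f}, p = {p:.3f}")
```

Under this approximation the recovered P value rounds to the reported 0.005, and the interval is centred on the mean difference of 16.60.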
Quality of life
Two European studies measured quality of life with the EuroQol-5 Dimensions (EQ-5D) questionnaire at the final follow-up, at one and two years, respectively [10, 15]. The weighted mean EQ-5D score was 0.69 (weighted mean SD 0.28) versus 0.57 (0.48) for THA and HA, respectively. A difference was found favouring THA (MD 0.13, 95 % CI 0.03–0.23, P = 0.01). Heterogeneity across the studies was 0 % (P = 0.33) (Fig. 11).