Arbitrator teams and dispute resolution performance: an empirical analysis


In the context of international investment disputes, this paper investigates how arbitrator team characteristics affect team performance in resolving disputes between a host country and a foreign investor. Our data include 277 judgments issued by arbitrator teams at the International Centre for Settlement of Investment Disputes at the World Bank from 1972 to 2018. Team performance is measured by the time to resolution and by the quality of the final judgment, proxied by whether a follow-on proceeding was required to rectify mistakes. We consider both biographical and professional characteristics of the arbitrators as determinants of team performance. We find that mixed-gender teams and previous collaborations among team members increase the time to resolution, whereas team members’ experience and diversity in professional background decrease it. None of the team characteristics considered has an impact on the quality of the final judgment. Our findings speak to the current policy debate on reforming the international investment arbitration system to increase its effectiveness and transparency.

Fig. 1

Source of data: Authors’ calculations based on ICSID’s data

Fig. 2

Source of data: Authors’ calculations based on ICSID’s data


  1.

    Final award date: July 8, 2016, p.167.

  2.

    In the context of the international tribunal, economic agents (e.g. host countries, multinational firms) are primarily affected by the time to resolution. A longer resolution means higher (non-)pecuniary litigation costs, disruptions in contract execution, and uncertainty about justice and the business environment. In addition to the time to resolution, the enforceability of the tribunal’s judgment is another concern. If an award issued by the tribunal does not satisfy the parties, they will delay its enforcement by asking for an annulment proceeding. Time and quality are thus two important performance areas of ICSID. Marciano et al. (2019) provide an interesting discussion of different measures of judicial performance. In particular, the authors insist on distinguishing two terms that have often been confused in previous literature: efficiency and effectiveness. While efficiency, mainly used in the domestic context, refers to the optimal use of public resources to obtain a given outcome, effectiveness (or efficacy) refers to the capacity of a system to respond quickly to the demand for justice (e.g. without delay). Dakolias (1999, p.97) also classified quality (e.g. client satisfaction, appeal rate) under effectiveness. In this paper, we follow the approach proposed by Dakolias (1999) and Marciano et al. (2019) and use the term effectiveness when studying the time to resolution and the quality of the final judgment of ICSID.

  3.

    Since ICSID is a host institution with its own arbitration rules to manage the resolution of investment disputes, its effectiveness in resolving disputes is observed through the performance of arbitrator teams. In this paper, the terms “effectiveness of ICSID” and “arbitrator team performance” are used interchangeably.

  4.

    For more information, see: aspx. Accessed July 25, 2019.

  5.

    The clearance rate is the number of outgoing cases as a percentage of the number of incoming cases during a specific period (e.g. year). The purpose of this indicator is to assess whether a tribunal is keeping up with its incoming caseload.

  6.

    The reversal rate, according to Eisenberg (2004, p.663), is “the proportion or percentage of appeals that reach a decisive outcome and that emerge as reversed rather than affirmed”.

  7.

    Some studies focus on the judge-level characteristics directly (as variables of interest) and indirectly (as control variables). The judge’s gender is an important variable in these studies. However, there is mixed evidence of the effect of the judge’s gender on adjudicatory outcomes. A few authors also consider the judge’s educational background as a determinant. See Choi et al. (2011); Christensen and Szmer (2012); Dimitrova-Grajzl et al. (2012) and Bielen et al. (2018).

  8.

    There is a fascinating debate in the literature about how judges (and arbitrators) reach a decision. While classical legal theorists answer that judges apply the law and only the law to the facts of the case, law and economics scholars studying judicial behavior try to understand how the interaction between the law and non-legal factors (e.g. reputation, personal preferences, political biases) may affect judges’ decision-making. The starting point of this economic analysis is that judges (and arbitrators in our context) maximize “the same thing everybody else does” (Posner, 1993). See Schultz (2015) for more information. In this paper, we do not contribute further to this discussion, but leave open the possibility of conflicts among the appointees of a team.

  9.

    For example, see Sects. 2 and 3 of the ICSID Convention and Chapter 1 of the Arbitration Rules on the constitution, powers and functions of the tribunals.

  10.

    In practice, many professors of law have practiced as lawyers. However, not all professional lawyers have an academic background, i.e. experience working at a university. We account for this difference by distinguishing between arbitrators with and without an academic background.

  11.

    See Christensen and Szmer (2012) for the same argument.

  12.

    We confirm the robustness of our results by applying an estimation method for binary dependent variables, i.e. Probit (see Column 2 of Table 4 in Appendix 2). The main reason for using a Linear Probability Model is that its coefficients can be interpreted directly as marginal effects, without the further calculation required by the Probit model.

  13.

    As the purpose of this article is to investigate the effect of arbitrator teams on dispute resolution performance, we exclude the following cases from the main dataset: (1) cases resolved by a sole arbitrator, and (2) cases in which the parties to the dispute decided to settle before the final judgment.

  14.

    A searchable database on ICSID arbitrators (with curriculum vitae) can be found at: Accessed July 25, 2019.

  15. Accessed July 25, 2019.

  16. Accessed July 25, 2019.

  17. Accessed July 25, 2019.

  18.

    Also, we do not prioritize the use of the time between the parties’ final submissions (whether written or by hearing) and the final judgment, i.e. time to produce the final judgment, to measure the effectiveness for three reasons. First, the increasing criticism about the effectiveness of the international arbitration system over recent years requires a relatively general assessment of the duration of the whole proceeding rather than only of the duration of the award phase. Second, besides disputing parties, the arbitrators have significant discretion in conducting and managing the proceeding and this fact needs to be considered when assessing the ICSID’s effectiveness. Third, the measure Time to produce the final judgment might suffer from missing data due to confidentiality in arbitration. Therefore, this measure is only introduced in Sect. 6 for reference.

  19.

    There is a slight difference between an annulment and an appeal. See Caron (1992) for an interesting discussion about the use of these terms. For example, the author insists that while an appeal can lead to some modifications of the final judgment, an annulment proceeding can only void it (in whole or in part). While an appeal focuses on both the substantive correctness of the judgment and the legitimacy of the proceeding, an annulment is rather based on the second ground. However, the line to distinguish between these two post-judgment remedies remains vague in some contexts (e.g. an illegitimate process can lead to incorrect decisions). Without referring to the lexical difference, a common point between an appeal and an annulment is that the disputing parties are not satisfied with the results conveyed via the final judgment.

  20.

    Another proxy for the quality of the judgment is the number (or the rate) of cases that are truly “rectified” (i.e. the outcome of post-judgment remedies). Unfortunately, comprehensive data on such cases is unavailable to us. Moreover, the number of cases rectified (even through an annulment proceeding) is also an imperfect proxy for two reasons. First, in many cases, the arbitral tribunal constituted to consider the request for a “soft” follow-on proceeding (e.g. rectification, supplementary interpretation or supplementary decision) is the same as in the original proceeding. Second, although in a “hard” proceeding to annul the judgment an ad hoc committee (with different members) is constituted, it is highly possible that members of this committee have previously collaborated with members of the original tribunal. Therefore, a small number of “rectified” cases might simply reflect the fact that arbitrators were not willing to correct judgments issued by themselves or by their colleagues. See Shavell (1995) for more information.

  21.

    This means that we do not count visiting and adjunct positions, as many professional arbitrators were appointed to the university as practitioners rather than as legal academics.

  22.

    We suppose that a party needs more than one representative before the arbitral tribunal (i) when it confidently anticipates a positive outcome of the dispute (i.e. multiple representatives have a complementary effect on the positive outcome) or (ii) when it is unsure of the outcome (i.e. multiple representatives can compensate for uncertainty). If the second hypothesis is true, the time to resolution is expected to be longer for both parties, because they may need more time to find and produce suitable documents and evidence. If the first hypothesis holds, two scenarios are possible. If the investor (the claimant) thinks that he will win the case, the duration of the proceeding may increase because he bears the burden of proving the validity of his claims (see Brower, 1994; Bielen et al., 2015). In contrast, if the respondent state anticipates a favorable outcome, the duration of the proceeding may decrease.

  23.

    The 2006 Rule amendment is the third rule amendment process in the history of ICSID. The first two amendment processes, in 1984 and 2003, resulted in relatively modest changes. In contrast, the 2006 amendment process brought some significant changes, for example, disclosure requirements for arbitrators, the participation of non-disputing parties in the proceeding, and improved transparency provisions favoring the publication of the final award. For more information, see: Accessed July 25, 2019.

  24.

    For example, long-lived cases related to the energy and mining sector often require a relatively high level of sunk costs for investors. Therefore, they may be scrutinized and resolved slowly. Hafner-Burton and Victor (2016) also use the type of industry to proxy for the value of the investment project in the same context.

  25.

    During the study period, we observe 10 different Secretaries-General; the Secretary-General is the legal representative as well as the principal officer of ICSID. It is important to include Secretary-General fixed effects because the Secretary-General has considerable influence on the resolution of disputes administered by ICSID (e.g. the registration of new cases, the appointment of missing arbitrators when the parties disagree on the choice of arbitrator candidates). For more information, see: Accessed July 25, 2019.

  26.

    This rate is relatively high in comparison with the average appeal rate found in the domestic context (e.g. see Eisenberg, 2004). There are some possible explanations for this high rate. First, the host country is a sovereign respondent with international credibility and the claimant often has high-value claims. Given that follow-on proceedings (i.e. annulment) are allowed, the parties (always) try to reverse an unwanted outcome, even though it is highly possible that some errors go unchallenged while some correct decisions are appealed (Shavell, 1995). Second, choosing international investment arbitration to resolve a dispute means agreeing on the “law” by which the parties will be bound. Evidently, they are free to choose the way they will be bound, e.g. by refusing to enforce an award because what was called an “award” is the result of an illegitimate decision-making process (Caron, 1992). As mentioned in Sect. 4.2, the probability of follow-on proceedings should not be considered a perfect proxy for the quality of decisions issued by the tribunal.

  27.

    Given the overrepresentation of male arbitrators in our dataset, from an econometric point of view, we use a dummy instead of the share of female arbitrators in a team to capture the gender composition.

  28.

    Multicollinearity diagnostics of independent variables are presented in Table 7 in Appendix 5.

  29.

    See Note 24. Also, the effect of the claimant’s multiple representatives on the time to resolution is positive and becomes statistically significant in some models in Table 3 (Appendix 1).

  30.

    Consistent with the main regressions in Table 2, we find only weak evidence that a dispute registered after 2006 has a shorter time to resolution or a lower probability of follow-on proceedings (see Tables 3 and 4 in Appendices 1 and 2). The effect is not robust across model specifications.

  31.

    Negative binomial regression is useful for modeling an over-dispersed count outcome variable, i.e. one whose conditional variance exceeds its conditional mean (in other words, extra-Poisson variation). Lnalpha is the log-transformed over-dispersion parameter. Recall that in a Poisson model, alpha is constrained to zero. The larger alpha is, the greater the over-dispersion. See Wooldridge (2010, p.725–736).

  32.

    We also checked the robustness of our findings concerning the probability of having a follow-on proceeding for disputes registered after 2000 (see Columns 3–4 of Table 4 in Appendix 2). Similar to the results reported in Table 2 (Column 4), we find that the team characteristics considered have no impact on the outcome variable.

  33.

    There are good intuitive reasons for this assumption. First, the disputing parties are always rational and choose arbitrators of high quality. Second, the arbitrator market is competitive, and its barriers keep less competent arbitrators out of the network.

  34.

    A long-term suggestion, as supported by Szmer et al. (2010), is that gender barriers are more likely to be removed only when women are no longer a minority in a system, at which point cooperation within gender-diverse teams becomes more equal and effective. However, some institutional rules should be established to pursue this agenda (Puig 2014).

  35.

    See Proposals for Amendment of the ICSID Rules – Working Paper, available at: Accessed July 25, 2019.

  36.

    According to Article 13 of the ICSID Convention: “(1) Each Contracting State may designate to each Panel (panel of arbitrators and panel of conciliators) four persons who may but need not be its nationals”, and “(2) The Chairman may designate ten persons to each Panel. The persons so designated to a Panel shall each have a different nationality”. Also, Article 38 of the ICSID Convention indicates that if the parties fail to agree on appointing arbitrators, the Secretary-General (or the Chairman of the Administrative Council) of ICSID can intervene to appoint the missing arbitrators from that Panel of arbitrators.

  37.

    See also Table 6 in Appendix 4 for the partial correlation between the quantity (proxied by Time to resolution) and the quality (proxied by Follow-on proceeding). Accordingly, we find no quantity – quality tradeoff in case resolution before ICSID.

  38.

    An exclusion restriction is a variable that affects the selection mechanism but not the outcome.

  39.

    All Team variables (not reported) keep the same sign and are not statistically significant.

  40.

    We also check the robustness of the quantity/quality correlation by using the variable Time to produce the final judgment, which reflects the deliberation phase, instead of Time to resolution. The results (not reported) are very similar. A longer time to produce the final judgment does not improve the quality of decisions. This effect, after controlling for other variables, is significant at the 5 percent level.


This paper was presented at the 4th Annual Conference of the French Association of Law and Economics (Rennes, October 2019). The authors thank conference participants, and especially Christophe Charlier for helpful comments.


Appendix 1: Alternative estimations of the time to resolution regression

In this robustness check, we consider four different estimations of the time to resolution regression. Specifically, Column 1 estimates the impact of the explanatory variables in terms of semi-elasticities, Column 2 uses a negative binomial regression to generate the estimates, Column 3 shows the sensitivity of our results to restricting the study sample to disputes registered after 2000, and Column 4 uses an alternative dependent variable: Time to produce the final judgment. This new variable measures the number of days elapsed between the parties’ final submissions (whether written or by hearing) and the official issuance of the final judgment (Table 3).
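To make the semi-elasticity reading concrete, the short sketch below assumes that Column 1 comes from a log-linear specification, ln(y) = a + beta * x + e, which the text does not spell out. The function name and the coefficient value are hypothetical and serve only to illustrate how such a coefficient translates into a percentage change in the outcome.

```python
import math


def semi_elasticity_pct(beta: float, exact: bool = True) -> float:
    """Percent change in y for a one-unit change in x when ln(y) = a + beta*x + e.

    exact=True uses 100 * (exp(beta) - 1); exact=False uses the common
    small-coefficient approximation 100 * beta.
    """
    if exact:
        return 100.0 * (math.exp(beta) - 1.0)
    return 100.0 * beta


# Hypothetical coefficient on a dummy regressor (not taken from Table 3).
beta_dummy = 0.25
print(round(semi_elasticity_pct(beta_dummy), 1))               # 28.4
print(round(semi_elasticity_pct(beta_dummy, exact=False), 1))  # 25.0
```

The gap between the two numbers shows why the exact transformation is usually preferred for dummy regressors with coefficients that are not close to zero.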

Table 3 Determinants of the time to resolution

Appendix 2: Alternative estimations of the quality regression

In this appendix we check the robustness of our results concerning Eq. 2 (Sect. 3), i.e. the probability of having a follow-on proceeding, using different estimation methods. Specifically, in Columns 1 and 2 we report the OLS estimation and the Probit estimation. Similarly, in Columns 3 and 4 we compare the OLS and Probit estimations of Eq. 2 while limiting our sample to the disputes registered after 2000, approximately the year when arbitration became a popular tool for resolving international investment disputes (Table 4).
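When comparing OLS and Probit columns, it helps to recall that a Probit coefficient is not itself a marginal effect: for a continuous regressor, the effect on the probability is the coefficient scaled by the standard normal density evaluated at the linear index. The sketch below illustrates this calculation as an average marginal effect. The function names, coefficients and data rows are hypothetical and do not reproduce the paper's estimates; for a dummy regressor, a discrete change in predicted probabilities would be the more natural quantity.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def probit_average_marginal_effect(coefs, rows, k):
    """Average over observations of phi(x'b) * b_k for regressor k.

    coefs: Probit coefficients (the first one may be the intercept).
    rows:  observations, each a list aligned with coefs.
    """
    total = 0.0
    for x in rows:
        index = sum(b * v for b, v in zip(coefs, x))  # linear index x'b
        total += normal_pdf(index) * coefs[k]
    return total / len(rows)


# Hypothetical coefficients: intercept and one continuous regressor.
coefs = [0.0, 0.5]
rows = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
ame = probit_average_marginal_effect(coefs, rows, k=1)
print(ame)  # always smaller in magnitude than the raw coefficient 0.5
```

In a Linear Probability Model the analogous quantity is simply the OLS coefficient, which is the convenience the main text relies on.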

Table 4 Determinants of the probability of having a follow-on proceeding

Appendix 3: Heckman selection model

To address the issue of sample selection, we apply Heckman’s (1979) selection model estimated with the two-step method. The first stage (selection equation) is a Probit regression where the dependent variable Litigation equals 1 if the parties enter the litigation process and 0 if the dispute is terminated by an early settlement. In the second stage (outcome equation) we consider two OLS regressions, one with Time to resolution as the dependent variable and one with Follow-on proceeding. In both outcome equations, we include the inverse Mills ratio as a covariate in order to control for sample selection. If the coefficient of the inverse Mills ratio is statistically significant, it is clear evidence of sample selection and we need to apply Heckman’s method to reduce selection bias. While the outcome equations include all above-mentioned variables, we borrow the set of independent variables found in Vu (2020) to explain the probability of litigation (selection equation). In particular, to obtain more precise estimates, we estimate this model using the variable Extreme measure as exclusion restriction (see footnote 38). Intuitively, an extreme regulatory measure of the host country is a reason for the negotiation breakdown and the parties’ motivation to go to trial, but it should not affect the time to resolution or the quality of the judgment issued by the tribunal. The results of the two-stage estimations are presented in Table 5.
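As a reminder of what the second stage adds, the sketch below computes the inverse Mills ratio in its standard textbook form for the selected sample, phi(z) / Phi(z), where z is the fitted first-stage Probit index. The function names and index values are hypothetical; this is an illustration of the correction term, not the authors' estimation code.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(z: float) -> float:
    """phi(z) / Phi(z) for an observation with Litigation = 1 and index z."""
    return normal_pdf(z) / normal_cdf(z)


# Hypothetical first-stage indices for three litigated disputes.
for z in (-1.0, 0.0, 1.0):
    print(z, round(inverse_mills_ratio(z), 4))
```

The ratio shrinks as the first-stage index grows: disputes that were very likely to be litigated anyway receive a small correction, while those that were unlikely to reach litigation receive a large one.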

Table 5 Heckman selection model

We find that, as expected, the variable Extreme measure has a positive impact on the probability of starting litigation (Column 1) (Vu, 2020). When focusing on the two outcome equations (Columns 2 and 3), we find that the magnitude of the estimated parameters for the team characteristic variables is almost identical to that found in Table 2. Moreover, there is no evidence of selection bias, since the coefficient of the inverse Mills ratio is not statistically significant. Therefore, the OLS estimations are preferred in the analysis reported in the main text.

Appendix 4: Quantity-Quality tradeoff in case resolution

Recent literature on the economic analysis of court delays highlights the presence of a quantity-quality tradeoff in dispute resolution (Coviello et al. 2015; Dimitrova-Grajzl et al. 2016; Bielen et al. 2018). That is, the implementation of policies aimed at reducing the time to resolution may come at the expense of the quality of decisions. To examine this issue, we follow the approach suggested by Dimitrova-Grajzl et al. (2016) and Bielen et al. (2018). In the following regression, we use Follow-on proceeding (quality) as the dependent variable and Time to resolution (quantity), as well as the other variables on the right-hand side of Eq. 2, as independent variables. If the coefficient on Time to resolution is negative, longer case resolution is associated with better decisions. In that case, policies aiming to increase the speed of case resolution should be implemented carefully, because they may come at the cost of lower quality of decisions. Since Time to resolution and Follow-on proceeding are both outcome variables explained by their own sets of explanatory variables, we cannot rule out that some unobserved determinants of the parties’ decision to seek post-judgment remedies are also correlated with the duration of the proceeding. As mentioned in Bielen et al. (2018), the finding should be viewed as a partial correlation rather than a causal effect. The results of the linear probability regression are presented in Table 6.
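The sign logic described above can be illustrated with a deliberately simplified sketch: a bivariate linear probability regression of a binary follow-on indicator on the duration of the proceeding, without any of the control variables of Eq. 2. The function name and the five data points are made up for illustration and are not drawn from the ICSID sample.

```python
def lpm_slope(x, y):
    """OLS slope from regressing a binary outcome y on a single regressor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


# Toy, made-up observations: duration in days and a follow-on dummy.
days = [400, 800, 1200, 1600, 2000]
follow_on = [0, 0, 1, 0, 1]

slope = lpm_slope(days, follow_on)
# A positive slope means longer proceedings go together with more follow-on
# requests, i.e. no sign that extra time buys higher-quality decisions.
print(slope > 0)
```

With controls added, the coefficient becomes a partial correlation in the sense used in the text, which is why it should not be read causally.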

Table 6 Quantity-Quality tradeoff in case resolution

Controlling for other factors, the coefficient on Time to resolution is positive and significant at the 5 percent level (see footnote 39). A positive correlation between Time to resolution and Follow-on proceeding means that taking longer to conclude a case does not improve the quality of arbitrators’ decisions. This result resonates with some conclusions in the literature. Rosales-López (2008) and Dimitrova-Grajzl et al. (2016) find no significant association between the productivity of judges (in terms of speed) and the appeal or reversal rate. Coviello et al. (2015) report a finding similar to ours. Bielen et al. (2018), by contrast, find a negative relationship between time to reach a verdict and the reversal rate (see footnote 40).

Appendix 5: Multicollinearity

See Table 7

Table 7 Multicollinearity Diagnostics

Keywords

  • Investor-state arbitration
  • Dispute resolution effectiveness
  • Team performance
  • Team composition
  • Economics of litigation

JEL Classification

  • F21
  • F53
  • K33
  • K41