Allele frequency data for 23 aSTR for different ethnic groups from Republic of Zimbabwe

Abstract

To determine the allele frequencies of autosomal STR markers of forensic interest in the Zimbabwean population, we analyzed a sample of 478 individuals from 19 different ethnic groups using the PowerPlex® Fusion 6C Kit (Promega Corp, Madison, Wisconsin). The data were compared among the Zimbabwean ethnic groups and with several other African populations to establish whether significant differences exist among them. No significant differences were found among the ethnic groups in Zimbabwe. Statistically significant differences were observed between allele frequencies in Zimbabwe and some other African populations, although FST values with neighboring Bantu populations from the South and Southeast regions were low (below 0.005 in most single-locus comparisons).

Knowledge of the allele frequencies of forensically relevant autosomal STR (aSTR) markers is essential for paternity and kinship testing, human identification, and the genetic analysis of forensic evidence. To date, few data on aSTR markers in populations of the Republic of Zimbabwe have been published [1]. In this paper, we report the allele frequencies and forensic parameters of the 23 autosomal STR markers included in the PowerPlex® Fusion 6C kit in a sample of 478 individuals from the Republic of Zimbabwe.

Zimbabwe is located in the southeast of the African continent, between the Zambezi River, Victoria Falls, and the Limpopo River. It is bordered to the west by Botswana, to the north by Zambia, to the south by South Africa, and to the east by Mozambique. Its territory corresponds to the former Southern Rhodesia. The country is organized in 8 provinces (Manicaland, Mashonaland Central, Mashonaland East, Mashonaland West, Masvingo, Matabeleland North, Matabeleland South, and Midlands) and two cities with territorial status (Bulawayo and Harare) (Fig. 1). Zimbabwe had a population of 16,530,000 in 2017 according to United Nations Department of Economic and Social Affairs (UNDESA) data.

Fig. 1 Administrative division of the Republic of Zimbabwe

The number of established languages listed for Zimbabwe is 22, all of them living languages. Of these, 16 are indigenous and 6 are non-indigenous (https://www.ethnologue.com/country/ZW). Judged by the languages spoken in the country, there are two major ethnicities, Shona (8.9 million speakers) and Ndebele (1.6 million speakers), which were included in this study together with other ethnic groups such as Kalanga, Manyika, Karanga, Tonga, Shangani, and Sotho. Individuals who self-identified as belonging to other ethnic minority groups, each represented by only a few individuals, were also included. Samples were collected to represent all the provinces and the two cities with territorial status.

A total of 478 unrelated individuals living in the Republic of Zimbabwe (475 males and 3 females) were analyzed in this study. Buccal swabs were collected from the different ethnic groups as follows: 146 from Ndebele, 80 from Shona, 56 from Kalanga, 53 from Tonga, 47 from Manyika, 40 from Karanga, 19 from Sotho, 8 from Shangani, and 29 from individuals classified as “others.” These 29 individuals were distributed as 3 from Venda, 2 from Xhosa, 1 from Tswana, 6 from Lozwi, 4 from Ngoni, 2 from Chewa, 2 from Zezuru, 1 from Chikunda, 1 described as colored, 1 from Malawi, 1 from Mozambique, 1 from Swazi, 2 from Sena, and 2 without ethnic data. This study followed the ethical principles of the 2000 Helsinki Declaration of the World Medical Association [2], and written informed consent was obtained from the participants for cooperation under strictly confidential conditions.

DNA was extracted using a silica-based method. All samples were amplified for 23 autosomal STR markers (CSF1PO, FGA, TH01, vWA, D1S1656, D2S1338, D2S441, D3S1358, D5S818, D7S820, D8S1179, D10S1248, D12S391, D13S317, D16S539, D18S51, D19S433, D21S11, Penta D, Penta E, D22S1045, TPOX, and SE33), in addition to the Y-STR loci DYS391, DYS576, and DYS570, all present in the PowerPlex® Fusion 6C Kit (Promega Corp, Madison, Wisconsin), following the manufacturer’s instructions. PCR amplification was performed on a GeneAmp 9700 thermal cycler (Thermo Fisher Scientific™, Waltham, MA, USA), and capillary electrophoresis (CE) was performed on an Applied Biosystems 3500 Genetic Analyzer (Thermo Fisher Scientific™) according to the manufacturer’s instructions.

Forensically relevant parameters (i.e., GD: gene diversity; DP: discrimination power; MP: matching probability; TPI: typical paternity index; PE: power of exclusion; and MinFrec: minimal frequency) were calculated using an in-house modified version of the Powerstats Version 1.2 software package (Promega, Madison, WI, USA). Exact tests for Hardy–Weinberg equilibrium (HWE) as well as inter-population comparisons were performed with Arlequin Version 3.1 [3]. The same software was used to calculate gene diversity (GD). The significance level for multiple tests was adjusted using Bonferroni’s correction [3, 4]. For the Hardy–Weinberg equilibrium test, the rejection level was 0.0021 (0.05/23 loci). In pairwise population comparisons, the significance level varied between 0.0025 and 0.0038, depending on the number of loci compared. A neighbor-joining tree was built based on Nei’s genetic distances calculated using the PHYLIP software package [5, 6]. The tree was visualized with the Treeview software [7].
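The parameter definitions above can be made concrete with a short sketch. The Python function below is not the modified Powerstats package used in this study. It assumes the commonly used single-locus formulas (MP as the sum of squared genotype frequencies, DP = 1 - MP, PE = h^2(1 - 2hH^2), and TPI = 1/(2H), where h and H are the observed heterozygote and homozygote proportions) and Nei's unbiased estimator for GD. The function name and the toy genotypes are hypothetical.

```python
from collections import Counter


def forensic_parameters(genotypes):
    """Single-locus forensic parameters from diploid genotypes.

    genotypes: list of (allele, allele) tuples, one per unrelated individual.
    Formulas are assumptions (standard textbook definitions), not the
    authors' modified Powerstats implementation.
    """
    n = len(genotypes)
    gene_copies = 2 * n

    # Allele frequencies estimated from the 2n gene copies.
    allele_counts = Counter(a for g in genotypes for a in g)
    freqs = {a: c / gene_copies for a, c in allele_counts.items()}

    # Unbiased gene diversity: 2n/(2n-1) * (1 - sum of squared frequencies).
    gd = gene_copies / (gene_copies - 1) * (1 - sum(p * p for p in freqs.values()))

    # Genotype frequencies; (a, b) and (b, a) count as the same genotype.
    geno_counts = Counter(tuple(sorted(g)) for g in genotypes)
    mp = sum((c / n) ** 2 for c in geno_counts.values())
    dp = 1 - mp

    # Observed homozygote (H) and heterozygote (h) proportions.
    hom = sum(c for g, c in geno_counts.items() if g[0] == g[1]) / n
    het = 1 - hom
    pe = het ** 2 * (1 - 2 * het * hom ** 2)
    tpi = float("inf") if hom == 0 else 1 / (2 * hom)

    return {"freqs": freqs, "GD": gd, "MP": mp, "DP": dp, "PE": pe, "TPI": tpi}


# Hypothetical toy data: four individuals typed at one locus.
toy = [(8, 8), (8, 9), (9, 10), (8, 9)]
params = forensic_parameters(toy)
print({k: round(v, 4) for k, v in params.items() if k != "freqs"})
```

With real data the function would be applied locus by locus to the genotypes retained after quality filtering.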

The allele frequencies and statistical parameters of forensic interest for the 23 autosomal markers are shown in Table 1. The most polymorphic marker was SE33 (GD = 0.9336), whereas the least polymorphic was D13S317 (GD = 0.7084). The DP ranged from 0.8567 (D13S317) to 0.9891 (SE33), and PE values ranged from 0.4571 (D3S1358) to 0.8683 (SE33). No departures from HWE were observed at any locus (P value ≥ 0.045) after Bonferroni’s correction for multiple testing.

Table 1 Allele frequency distributions, gene diversity (GD), discrimination power (DP), matching probability (MP), typical paternity index (TPI), power of exclusion (PE), minimal frequency (MinFrec), and P values for the Hardy–Weinberg exact test (p-HW), obtained in a sample of 465 individuals from Zimbabwe.

Twelve of the 478 samples (2.5%), all from males, showed a tri-allelic pattern at TPOX, and one sample (male) showed this pattern at SE33. These samples were excluded from the data set used to calculate allele frequencies and statistical parameters. A previous study reported that approximately 2.4% of indigenous South Africans have three rather than two alleles at TPOX and that the extra allele is almost always allele 10 [8]. Concordantly, in this Zimbabwean population sample, the tri-allelic patterns always included allele 10: seven samples showed genotype 8-10-11, two samples showed 10-11-12, and patterns 6-8-10, 9-10-11, and 8-10-12 were each observed once. The sample with a tri-allelic pattern at SE33 was 19-20-27.2. All tri-allelic patterns were confirmed using the GlobalFiler™ kit (Thermo Fisher Scientific™, Waltham, MA, USA), following the manufacturer’s recommendations.
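The exclusion step described above can be sketched as a simple filter. This is an illustrative assumption about how such a filter might be written, not the procedure actually used by the authors; the function name, sample identifiers, and toy profiles are hypothetical (only the 8-10-11 TPOX pattern is taken from the text).

```python
def split_triallelic(samples):
    """Separate profiles that show more than two alleles at any locus.

    samples: dict mapping sample id -> dict mapping locus -> tuple of alleles.
    Returns (kept, excluded), both dicts with the same structure.
    """
    kept, excluded = {}, {}
    for sample_id, profile in samples.items():
        # A locus is flagged when it carries three or more distinct alleles.
        flagged = [locus for locus, alleles in profile.items()
                   if len(set(alleles)) > 2]
        if flagged:
            excluded[sample_id] = profile
        else:
            kept[sample_id] = profile
    return kept, excluded


# Hypothetical toy profiles for illustration only.
toy_samples = {
    "S1": {"TPOX": (8, 11), "SE33": (19, 27.2)},
    "S2": {"TPOX": (8, 10, 11), "SE33": (20, 21)},  # tri-allelic at TPOX
    "S3": {"TPOX": (9, 9), "SE33": (15, 30.2)},
}
kept, excluded = split_triallelic(toy_samples)
print(sorted(kept), sorted(excluded))
```

Applied to the study sample, removing the 12 TPOX and 1 SE33 tri-allelic profiles from 478 leaves the 465 individuals reported in Table 1.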

We found intermediate alleles not previously reported in other African populations: 10.3, 14.1, and 14.3 in D2S441; 17.1 in D12S391; 26.2 and 33.2 in FGA; 11.4 in Penta D; 11.2 in Penta E; 7.1 in TH01; and 5.2, 9.2, and 20.1 in SE33. The new alleles in D2S441, FGA, TH01, and SE33 were confirmed by replicating the genotypes with the GlobalFiler™ kit (Thermo Fisher Scientific™, Waltham, MA, USA), following the manufacturer’s recommendations.

The population differentiation test showed no statistically significant differences among the 9 Zimbabwean ethnic groups sampled (FST ≤ 0.0018; p ≥ 0.2043). Therefore, these samples were pooled for the comparisons with other African populations.

For an overview of the genetic affinities between Zimbabwe and other African populations [9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24], pairwise genetic distances were calculated using Nei’s formula as implemented in the PHYLIP software. Genetic distances were calculated for a common set of 13 STRs (CSF1PO, D13S317, D16S539, D18S51, D21S11, D3S1358, D5S818, D7S820, D8S1179, FGA, TH01, TPOX, and vWA). The most distant population from the Sub-Saharan cluster was the “colored” group from South Africa, due to its high European admixture (Fig. 2). The non-Bantu populations from Somalia (mainly Cushitic) and Uganda (Nilotic) are also well apart from this cluster. The other Eastern populations, from Tanzania and Rwanda, although represented by Bantu groups, stand between the Nilotic and Cushitic populations and the root from which the other Bantu groups emerge. Zimbabwe is in a branch represented by south and southeast populations, closer to the neighboring populations from Mozambique and Botswana.
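As an illustration of the distance underlying the tree, the sketch below computes Nei's standard genetic distance, D = -ln(Jxy / sqrt(Jx Jy)), from allele frequency tables, where Jx, Jy, and Jxy are averaged over the shared loci. It is a minimal sketch under the assumption that this is the distance variant used; it is not the PHYLIP implementation, and the function name and toy frequencies are hypothetical.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two populations.

    pop_x, pop_y: dict mapping locus -> dict mapping allele -> frequency.
    Only loci present in both populations are used.
    """
    loci = sorted(set(pop_x) & set(pop_y))
    if not loci:
        raise ValueError("the two populations share no loci")

    jx = jy = jxy = 0.0
    for locus in loci:
        fx, fy = pop_x[locus], pop_y[locus]
        jx += sum(p * p for p in fx.values())                  # identity within x
        jy += sum(p * p for p in fy.values())                  # identity within y
        jxy += sum(p * fy.get(a, 0.0) for a, p in fx.items())  # identity between

    # Average each identity over the shared loci.
    jx, jy, jxy = jx / len(loci), jy / len(loci), jxy / len(loci)
    if jxy == 0:
        return float("inf")  # no shared alleles at any locus
    return -math.log(jxy / math.sqrt(jx * jy))


# Hypothetical toy frequencies at a single locus.
pop_a = {"TH01": {6: 0.5, 7: 0.5}}
pop_b = {"TH01": {6: 1.0}}
print(round(nei_standard_distance(pop_a, pop_b), 4))  # 0.3466
```

A matrix of such pairwise distances is the kind of input a neighbor-joining algorithm takes to build a tree like the one in Fig. 2.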

Fig. 2 Neighbor-joining tree based on pairwise Nei’s genetic distances between 21 African samples. A European sample was used as reference (data retrieved from the STRs for identity ENFSI Reference database, v2/R2 – STRidER, https://strider.online/frequencies). Compared populations included Angola [9], Angola (Cabinda) [10], Botswana [11], Equatorial Guinea [12], Equatorial Guinea (Fang) [13], Guinea Bissau [14], Mozambique [15], Namibia [16], Nigeria (Yoruba, Igbo and Hausa) [17], Rwanda (Hutu [18] and Tutsi [19]), Somalia [20], South Africa (coloured, amaXhosa, amaZulu [21] and Black [22]), Tanzania [23] and Uganda [24]

Using data from the same populations described above, single locus comparisons were performed based on pairwise FSTs, as shown in Table 2. The highest FST values were found between Zimbabwe and Somalia, with 11 out of the 15 compared loci showing statistically significant non-differentiation P values. Statistically significant FSTs were also observed in the comparison with the South African Colored population group (for the loci D16S539, D21S11, D2S441, and TH01), with Uganda (D19S433, TH01, and TPOX), Botswana (D16S539, D21S11, and TPOX), Tutsi from Rwanda (TH01 and TPOX), Namibia (D8S1179), Igbo from Nigeria (TPOX), Black group from South Africa (D16S539), and Tanzania (D16S539). Relatively low FST values (below 1%) were found in most single locus comparisons between Zimbabwe and South, Southeast, Central and Southwest Bantu populations. In contrast, high FST values were obtained for some loci when comparing Zimbabwe with the Eastern populations from Uganda, Tanzania, and Rwanda.
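The logic of a single-locus FST with a permutation-based non-differentiation P value can be sketched as follows. This is a simplified illustration, not the AMOVA-based estimator implemented in Arlequin: it assumes a heterozygosity-based estimator, FST = (HT - HS)/HT, with HT computed from the unweighted mean of the two populations' allele frequencies, and it estimates the P value as the proportion of random reassignments of individuals that give an FST at least as large as the observed one. All names and the toy genotypes are hypothetical.

```python
import random
from collections import Counter


def allele_freqs(genotypes):
    """Allele frequencies from a non-empty list of (allele, allele) genotypes."""
    counts = Counter(a for g in genotypes for a in g)
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()}


def fst_two_pops(geno_a, geno_b):
    """Heterozygosity-based FST between two samples (simplified estimator)."""
    pa, pb = allele_freqs(geno_a), allele_freqs(geno_b)
    h_a = 1 - sum(p * p for p in pa.values())
    h_b = 1 - sum(p * p for p in pb.values())
    h_s = (h_a + h_b) / 2  # mean within-population diversity
    alleles = set(pa) | set(pb)
    # Total diversity from the unweighted mean allele frequencies.
    h_t = 1 - sum(((pa.get(a, 0.0) + pb.get(a, 0.0)) / 2) ** 2 for a in alleles)
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


def permutation_p_value(geno_a, geno_b, n_perm=1000, seed=0):
    """Observed FST and the share of permutations with FST >= observed."""
    rng = random.Random(seed)
    observed = fst_two_pops(geno_a, geno_b)
    pooled = list(geno_a) + list(geno_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign individuals to populations at random
        perm_a, perm_b = pooled[:len(geno_a)], pooled[len(geno_a):]
        if fst_two_pops(perm_a, perm_b) >= observed:
            hits += 1
    return observed, hits / n_perm


# Hypothetical toy samples fixed for different alleles.
pop_1 = [(6, 6)] * 10
pop_2 = [(9, 9)] * 10
fst, p = permutation_p_value(pop_1, pop_2, n_perm=200, seed=1)
print(fst, p)
```

The number of permutations controls the precision of the P value; the analysis reported here used 50,000 permutations in Arlequin.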

Table 2 Single-locus FSTs and non-differentiation P values calculated between Zimbabwe and 20 African populations. P values were obtained after 50,000 permutations (s.e. ≤ 0.0053). P values below 0.05 are represented in blue. In red are the statistically significant values obtained when applying Bonferroni’s correction. The significance level was calculated for each population by dividing 0.05 by the number of loci compared
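The per-population significance levels follow directly from Bonferroni's correction. The sketch below reproduces the thresholds quoted in the text (0.05/23 for the Hardy–Weinberg tests; 0.05/20 and 0.05/13 for comparisons based on 20 and 13 loci). The helper names and the P values in the usage example are hypothetical and do not come from Table 2.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni's correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_loci(p_values, alpha=0.05):
    """Loci whose P value falls below the Bonferroni-corrected threshold.

    p_values: dict mapping locus -> P value for one population comparison.
    """
    threshold = bonferroni_threshold(alpha, len(p_values))
    return sorted(locus for locus, p in p_values.items() if p < threshold)


print(bonferroni_threshold(0.05, 23))  # about 0.00217 (HWE tests, 23 loci)
print(bonferroni_threshold(0.05, 20))  # 0.0025
print(bonferroni_threshold(0.05, 13))  # about 0.00385

# Hypothetical P values for a comparison based on three loci.
made_up = {"TH01": 0.0004, "TPOX": 0.03, "FGA": 0.41}
print(significant_loci(made_up))  # ['TH01']
```

Note that a P value such as 0.03 falls below 0.05 but not below the corrected threshold, which is the distinction the blue and red markings in Table 2 convey.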

In summary, the results from this study did not reveal significant differences among the ethnic groups in Zimbabwe for the aSTRs analyzed, supporting the use of a single frequency database for the whole population. Furthermore, the high diversity observed supports the usefulness of the studied markers for forensic analysis in Zimbabwe, including kinship and human identification.

References

  1. Budowle B, Nhari LT, Moretti TR, Kanoyangwa SB, Masuka E, Defenbaugh DA, Smerick JB (1997) Zimbabwe black population data on the six short tandem repeat loci--CSF1PO, TPOX, THO1, D3S1358, VWA and FGA. Forensic Sci Int 90:215–221
  2. World Medical Association (2013) World Medical Association Declaration of Helsinki: Ethical Principles for Medical Research Involving Human Subjects. JAMA 310:2191–2194
  3. Excoffier L, Laval G, Schneider S (2005) Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol Bioinformatics Online 1:47–50
  4. Weir BS (1996) Genetic Data Analysis II: Methods for Discrete Population Genetic Data. Sinauer Associates, USA, pp 109–110
  5. Tamura K, Stecher G, Peterson D, Filipski A, Kumar S (2013) MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol 30:2725–2729
  6. Felsenstein J (2005) PHYLIP (Phylogeny Inference Package) version 3.6. Distributed by the author. Department of Genome Sciences, University of Washington, Seattle
  7. Page RDM (1996) Treeview: An application to display phylogenetic trees on personal computers. Comput Appl Biosci 12:357–358
  8. Lane AB (2008) The nature of tri-allelic TPOX genotypes in African populations. Forensic Sci Int Genet 2:134–137
  9. Melo MM, Carvalho M, Lopes V, Anjos MJ, Serra A, Vieira DN, Sequeiros J, Corte-Real F, Melo MM (2010) Genetic study of 15 STRs loci of Identifiler system in Angola population. Forensic Sci Int Genet 4:e153–e157
  10. Beleza S, Alves C, Reis F, Amorim A, Carracedo A, Gusmao L (2004) 17 STR data (AmpF/STR Identifiler and Powerplex 16 System) from Cabinda (Angola). Forensic Sci Int 141:193–196
  11. Tau T, Wally A, Fanie TP, Ngono GN, Mpoloka SW, Davison S, D’Amato ME (2017) Genetic variation and population structure of Botswana populations as identified with AmpFLSTR Identifiler short tandem repeat (STR) loci. Sci Rep 7:6768
  12. Alves C, Gusmão L, López-Parra AM, Soledad Mesa M, Amorim A, Arroyo-Pardo E (2005) STR allelic frequencies for an African population sample (Equatorial Guinea) using AmpFlSTR Identifiler and Powerplex 16 kits. Forensic Sci Int 148:239–242
  13. Calzada P, Suárez I, García S, Barrot C, Sánchez C, Ortega M, Mas J, Huguet E, Corbella J, Gené M (2005) The Fang population of Equatorial Guinea characterised by 15 STR-PCR polymorphisms. Int J Legal Med 119:107–110
  14. Gonçalves R, Jesus J, Fernandes AT, Brehm A (2002) Genetic profile of a multi-ethnic population from Guiné-Bissau (west African coast) using the new PowerPlex® 16 System kit. Forensic Sci Int 129:78–80
  15. Alves C, Gusmão L, Damasceno A, Soares B, Amorim A (2004) Contribution for an African autosomic STR database (AmpF/STR Identifiler and Powerplex 16 System) and a report on genotypic variations. Forensic Sci Int 139:201–205
  16. Muro T, Fujihara J, Imamura S, Nakamura H, Yasuda T, Takeshita H (2008) Allele frequencies for 15 STR loci in Ovambo population using AmpFlSTR Identifiler Kit. Legal Med 10:157–159
  17. Okolie V, Cisana S, Schanfield MS, Adekoya KO, Oyedeji OA, Podini D (2018) Population data of 21 autosomal STR loci in the Hausa, Igbo and Yoruba people of Nigeria. Int J Legal Med 132:735–737
  18. Tofanelli S, Boschi I, Bertoneri S, Coia V, Taglioli L, Franceschi MG, Destro-Bisol G, Pascali V, Paoli G (2003) Variation at 16 STR loci in Rwandans (Hutu) and implications on profile frequency estimation in Bantu-speakers. Int J Legal Med 117:121–126
  19. Regueiro M, Carril JC, Pontes ML, Pinheiro MF, Luis JR, Caeiro B (2004) Allele distribution of 15 PCR-based loci in the Rwanda Tutsi population by multiplex amplification and capillary electrophoresis. Forensic Sci Int 143:61–63
  20. Tillmar AO, Bäckström G, Montelius K (2009) Genetic variation of 15 autosomal STR loci in a Somali population. Forensic Sci Int Genet 4:e19–e20
  21. Ristow PG, Cloete KW, D'Amato ME (2016) GlobalFiler® Express DNA amplification kit in South Africa: Extracting the past from the present. Forensic Sci Int Genet 24:194–201
  22. Lucassen A, Ehlers K, Grobler PJ, Shezi AL (2014) Allele frequency data of 15 autosomal STR loci in four major population groups of South Africa. Int J Legal Med 128:275–276
  23. Forward BW, Eastman MW, Nyambo TB, Ballard RE (2008) AMPFlSTR® Identifiler™ STR Allele Frequencies in Tanzania, Africa. J Forensic Sci 53(1):245–247
  24. Gomes V, Sánchez-Diz P, Alves C, Gomes I, Amorim A, Carracedo A, Gusmão L (2009) Population data defined by 15 autosomal STR loci in Karamoja population (Uganda) using AmpF/STR Identifiler kit. Forensic Sci Int Genet 3:e55–e58

Funding

LG was supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico – CNPq (ref. 306342/2019-7), and Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro – FAPERJ (CNE-2018).

Author information

Corresponding author

Correspondence to Carlos Vullo.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Informed consent

This study followed the ethical principles of the 2000 Helsinki Declaration of the World Medical Association and written informed consent was obtained from the participants for cooperation under strictly confidential conditions.

Cite this article

Borosky, A., Rotondo, M., Eppel, S. et al. Allele frequency data for 23 aSTR for different ethnic groups from Republic of Zimbabwe. Int J Legal Med (2021). https://doi.org/10.1007/s00414-021-02514-1

Keywords

  • Africa
  • Autosomal STR
  • Forensic parameters
  • Zimbabwean population