Introduction

HIV-1 remains a matter of global significance and concern, even after over 40 years of research. As of 2022, nearly 38 million individuals were infected with HIV-1 [1]. The virus is categorized into groups M, N, O, and P according to genetic differences [2]. Among these groups, group M is the most prevalent worldwide and is further divided into subtypes A, B, C, D, F, G, H, J, and K [3]. The pathophysiology of HIV-1 varies by region and subtype. Among several contributing mechanisms, subtype-specific alterations of viral protein amino acid sequences are considered influential in HIV-1 pathophysiology and clinical outcomes in people with HIV [4, 5]. HIV encodes 15 unique proteins, which can be broadly categorized into structural, enzymatic, regulatory, and accessory. Structural proteins include the Gag polyproteins: p17 (Matrix), p24 (Capsid), p7 (Nucleocapsid), and p6, and the envelope polyproteins, which include gp120 (Surface glycoprotein) and gp41 (Transmembrane glycoprotein). Enzymatic proteins include the Pol polyproteins: Protease, Integrase, Reverse Transcriptase, and RNase H. Regulatory proteins include Tat (Transactivator of Transcription) and Rev (Regulator of Expression of Virion Proteins), while accessory proteins include Vif (Viral Infectivity Factor), Nef (Negative Regulatory Factor), Vpu (Viral Protein U), and Vpr (Viral Protein R) [6, 7]. Specific amino acid variations in these proteins have been associated with differing pathogenesis [8,9,10,11].

Among the many HIV-1 viral proteins, Tat [12, 13] and Vif [14, 15] have garnered significant attention for their roles in HIV-1 pathogenesis. The HIV-1 Tat protein binds to an RNA element called the trans-activation response element (TAR) and recruits CDK9/cyclin T1, along with other host factors, to trigger HIV-1 transcription, playing a pivotal role in HIV pathogenesis [16]. Likewise, Vif counteracts members of the Apolipoprotein B Editing Complex (APOBEC)3 (A3) family by directing them for degradation. The A3 proteins are host antiviral proteins that induce hypermutations in the viral genome [17].

These viral proteins can still be detected despite effective antiretroviral therapy and undetectable viremia, suggesting they may have a functional role in the modern combined antiretroviral therapy (cART) era and warrant further investigation [18, 19]. Specific amino acids in Tat and Vif are linked to negative clinical outcomes in people with HIV, such as disease progression [19, 20]. Variation of Tat at position 24 may influence proviral DNA load and CD4+ counts [21]. Position 31 of Tat is one of the most widely investigated positions, as the dicysteine motif (C30C31) has been implicated in higher levels of inflammation in cell culture studies [22,23,24]. Position 57 of Tat has also been implicated in the dysregulation of transactivation [25, 26] and inflammation [20, 27]. For Vif, variation at position 17 between Lysine and Arginine was suggested to have a role in viral infectivity and may function in binding APOBEC3G [28]. Lastly, Vif mutations at position 31 between Valine and Isoleucine are involved in cell cycle arrest by inducing the degradation of protein phosphatase 2 regulatory subunit B family (PPP2R5) subunits [29, 30]. Thus, previous studies highlight that these specific amino acid signatures may be important in the pathogenesis of HIV-1.

In line with this, previous investigations by our group, conducted in a limited cohort of South African individuals living with HIV, have shown that Tat amino acid sequence diversity influences the levels of peripheral chemokine ligand 2 (CCL2) and thymidine phosphorylase (TYMP) [20]. Additionally, specific amino acids in Vpr have been shown to influence the levels of certain immune markers and metabolites in the tryptophan-kynurenine (Trp-Kyn) pathway [31, 32]. Therefore, we hypothesized that other viral proteins might also influence specific immune markers and metabolites. Building on these initial findings, we conducted an exploratory analysis to determine whether Tat sequence variations could affect other immune markers/metabolites not previously investigated. Additionally, we recently demonstrated that Vif in South Africa is genetically diverse compared to other regions where subtype C is present [33]; however, the clinical implications of these amino acid variants remain unknown. Thus, we sought to explore whether Vif amino acid sequence variants could influence immune markers and metabolites within this study. Further, our understanding of these proteins and the effects of amino acid sequence variations comes from studies in regions dominated by subtype B (Europe and the United States of America) [34]. This is a notable gap, since subtype C (Southern African region) has the highest global incidence, yet knowledge of sequence variation in such cohorts is sparse [20, 21, 33, 35].

Moreover, dysregulated inflammation [36] and metabolism are crucial mechanisms that contribute to adverse clinical outcomes in people with HIV [37, 38]. Our focus was specifically on the Trp-Kyn metabolic pathway, particularly kynurenic acid (KA), quinolinic acid (QUIN), Trp, and Kyn, as these metabolites have previously been linked to HIV-1 pathogenesis [39, 40]. Further, current evidence suggests a correlation between peripheral inflammation and the Trp-Kyn metabolites [41] in people with HIV [32]. In general, our understanding of the influence of viral protein amino acid sequence diversity on underlying mechanisms is largely based on studies conducted in cohorts where subtype B predominates [34]. Therefore, there is a need to profile the influence of viral protein amino acid sequence diversity in subtype C-specific cohorts. In this study, we used an exploratory approach to investigate the potential associations of specific Tat and Vif amino acid sequence variants with soluble urokinase plasminogen activator receptor (suPAR), interleukin (IL)-6, high-sensitivity C-reactive protein (hsCRP), soluble CD163 (sCD163) and neutrophil gelatinase-associated lipocalin (NGAL), as well as Trp-Kyn pathway metabolites, in a cART treatment-naïve South African cohort.

Methods

Study cohort

The Prospective Urban and Rural Epidemiology (PURE) study is centred on cardiovascular diseases (CVD) and investigates the underlying mechanisms that may influence CVD development, including factors like HIV and inflammation. This comprehensive study spanned 20 countries worldwide, encompassing nations with diverse income levels, ranging from high to low-income categories [42]. As a component of this investigation, individuals aged 30 years and above, of African descent, were recruited from both rural and urban areas in the North West province of South Africa. Trained fieldworkers interviewed participants in their home language to obtain questionnaire data. A questionnaire standardized according to the international PURE study and adapted to each country was used to gather demographic and lifestyle information from participants [42]. Demographic data covered age and gender, while self-reported lifestyle details encompassed location, employment, medication usage, alcohol consumption, and tobacco use. Baseline data were collected from volunteers in 2005 (n = 2010 participants). Data were then collected in 2010 (n = 1288 participants), serving as the first follow-up. This study included participants who were HIV-1 positive and not on treatment (cART-naïve) at the time of data collection in 2010, comprising a total of 71 participants. For the subsequent analysis, we included only those participants for whom viral sequencing was successful. Thus, 67 people with HIV were included for further analysis (Fig. 1). We were particularly interested in specific amino acid variations within the Tat and Vif proteins due to their pathogenic roles in HIV-1. Therefore, for Tat, we investigated amino acid variations at position 24 (Lysine (K) vs. Asparagine (N)), position 31 (Cysteine (C) vs. Serine (S)), position 57 (Arginine (R) vs. Serine (S)), and position 68 (Leucine (L) vs. Proline (P)).
This study protocol was approved by the Health Research Ethics Committee of North-West University in South Africa (NWU-00106–22-A1).

Fig. 1 Study layout

HIV Status

Before conducting HIV testing, all participants received counselling from a trained counsellor. The initial step in determining their HIV status involved using the First Response rapid HIV card test (Premier Medical Corporation Limited, Daman, India), following the protocol outlined by the South African Department of Health. A confirmatory test, the SD BIOLINE HIV 1/2 3.0 card test (Standard Diagnostics, Inc., Korea), was then conducted. Participants who tested positive for HIV received post-test counselling and were referred to the nearest clinic/hospital for further assessment. At the clinic/hospital, CD4+ counts were analysed using the flow cytometric method (Beckman Coulter EPICS XL™ machine, Fullerton, USA).

Preparation of blood samples

After fasting, whole blood samples were collected in EDTA tubes. To isolate plasma, samples underwent centrifugation at 2000 × g for 15 min at 10 °C within a 2-h timeframe. After centrifugation, the samples were transferred into microcentrifuge tubes, promptly frozen using dry ice, and subsequently stored at -80 °C until further analysis. For samples gathered from rural locations, the same rapid freezing process was employed, but they were maintained at -18 °C for a maximum of five days before transportation to the laboratory. Upon arrival, these samples were again stored at -80 °C until they underwent subsequent analysis.

Targeted metabolomics analysis

All chemicals and standards used in the analyses were described previously [32]. We examined Trp-Kyn metabolic profiles, focusing specifically on Trp, QUIN, Kyn, Trp/Kyn ratio, and KA, due to their potential role in HIV-1 pathophysiology. These were separated using high-performance liquid chromatography (HPLC) and quantified through liquid chromatography–tandem mass spectrometry (LC–MS/MS) using a targeted approach. To precipitate proteins from the HIV plasma samples, 300 µl of ice-cold acetonitrile was added to 100 µl of the plasma, along with 100 µl of an internal standard mixture (10 ppm) containing L-Kynurenine-d4 [2-aminophenyl-3,5-d2], Kynurenic acid-d5, D-Tryptophan (Indole-D5, 98%) and 2,3-pyridinedicarboxylic acid-d3 (Major) [53]. Matrix-appropriate external calibrators were prepared in a similar fashion. These calibrators were enriched with Kyn, Trp and QUIN and serially diluted to establish a calibration curve. Thereafter, all samples were vortexed and placed on ice for 10 min to allow protein precipitation, followed by centrifugation at 12,000 × g for 10 min. The supernatant obtained after centrifugation was collected, dried under nitrogen gas, and stored at -80 °C until analysis. Before analysis, the samples were thawed to room temperature. Thereafter, samples were re-dissolved in 100 µl of mobile phase (50% HPLC water and 50% acetonitrile), incubated for 30 min at room temperature, and vortexed once more. Samples were then transferred to glass vials with vial inserts for LC–MS/MS analysis. We assessed the metabolic profiles using multiple reaction monitoring (MRM) with electrospray ionization mass spectrometry in both positive and negative ionization modes. Using commercial standards, we further optimized the MRM transitions and chromatographic conditions on the LC–MS/MS to ensure precise detection and quantification of the target metabolites.
The Kyn to Trp ratio was used to estimate indoleamine 2,3-dioxygenase (IDO) activity [43].
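The quantification steps described above (a calibration curve from serially diluted calibrators, followed by the Kyn to Trp ratio) can be sketched as follows. This is a minimal illustration only, assuming a simple linear calibration of response against concentration; the function names, the calibrator concentrations, and the response values are hypothetical and are not taken from the study.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def quantify(response, slope, intercept):
    """Invert the calibration line to estimate a concentration."""
    return (response - intercept) / slope


def kyn_trp_ratio(kyn, trp):
    """Kyn/Trp ratio, used as a proxy for IDO activity."""
    if trp <= 0:
        raise ValueError("Trp concentration must be positive")
    return kyn / trp


# Hypothetical serial dilution of one calibrator (arbitrary units):
# concentrations and analyte/internal-standard response ratios.
cal_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
cal_resp = [0.1, 0.2, 0.4, 0.8, 1.6]

slope, intercept = fit_line(cal_conc, cal_resp)
sample_conc = quantify(0.5, slope, intercept)  # hypothetical sample response

# Hypothetical Kyn and Trp concentrations in the same units.
ratio = kyn_trp_ratio(kyn=2.5, trp=50.0)
```

In practice each analyte would have its own calibration curve and internal standard; a single curve is shown here only to keep the sketch short.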

LC–MS/MS analyses

HIV-positive plasma samples underwent targeted LC–MS/MS analysis using an Agilent Technologies 6470 triple quadrupole mass spectrometer coupled with a 1200 series HPLC system. Chromatographic separations were performed on an Acquity UPLC CSH C18 1.7µm 2.1 × 100mm column maintained at 80°C, with a 1 µL sample injection. The mobile phases, HPLC water (solvent A) and acetonitrile (solvent B), both contained 0.1% formic acid. The positive and negative polarization characteristics were previously described [32].

Immune marker measurement in blood samples

We prepared samples following the procedure outlined in Sect. "Preparation of blood samples". We focused on examining specific immune markers, namely NGAL, suPAR, hsCRP, IL-6 and sCD163. We selected these markers due to their potential significance in inflammation-related diseases in people with HIV. More specifically, the levels of these markers are known to be influenced by HIV-1 [44,45,46,47,48,49]. Previous work by our group has shown that, in a South African cohort, certain Tat amino acids (R57S) were linked to lower peripheral CCL2 and TYMP levels [20]. Building on these initial findings, it is reasonable to hypothesize that Tat and Vif amino acid variations may influence other immune markers. To quantify plasma suPAR levels, the suPARnostic® ELISA kit (ViroGates, Copenhagen, Denmark) was used. Plasma hsCRP levels were analysed using a particle-enhanced turbidimetric assay (Cobas Integra 400 plus, Roche Diagnostic, Basel, Switzerland), while IL-6 levels were determined via the electrochemiluminescence immunoassay method (Elecsys 2010, Roche, Basel, Switzerland). ELISA assays (R&D Systems DuoSet) were employed for plasma sCD163 and NGAL measurements, following the manufacturer's instructions, with all samples analysed in duplicate. The intra- and inter-assay coefficients of variation fell within acceptable ranges, with values below 8% and 10%, respectively.

Viral sequencing

Viral sequencing was conducted as previously described [31, 32]. In brief, RNA was extracted from plasma and reverse transcribed, and the cDNA was used to amplify the specific Tat/Vif regions (HXB2 position 4900–6351) by polymerase chain reaction (PCR). The following primers were used to amplify this region: Vif-1 (5′-GGGTTTATTACAGGGACAGCAGAG) and CATH-4R (5′-GTACCCCATAATAGACTGTGACC). The PCR cycling conditions have been detailed in our primary investigations [31, 32]. After amplification, the PCR products were purified, sequenced using the BigDye Terminator, and analysed with the ABI Prism 3130xl automated DNA sequencer (Applied Biosystems, Foster City, CA). The acquired sequences were analysed with GeneStudio™ Professional sequence analysis software (Version 2.2) and subsequently translated into amino acid sequences with the Expasy translate tool [50]. Key mutations within the Tat/Vif regions were identified and specifically highlighted for further analysis.
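The translation and variant-identification step can be illustrated with a short sketch. It is a stand-in written for illustration, not the Expasy tool or the authors' pipeline: it translates an in-frame coding sequence with the standard genetic code and reads the residue at a 1-based position. The toy sequence and the function names are hypothetical, and real Tat/Vif sequences would first need alignment to a reference so that position numbers match.

```python
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")

# Standard genetic code: codons enumerated in T, C, A, G order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA sequence; unknown codons become 'X'."""
    dna = dna.upper().replace("U", "T")
    codons = (dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def residue_at(protein, position):
    """Return the amino acid at a 1-based position."""
    if not 1 <= position <= len(protein):
        raise IndexError("position outside the translated sequence")
    return protein[position - 1]


def call_variant(protein, position, expected):
    """Return the residue if it is one of the expected variants, else None."""
    residue = residue_at(protein, position)
    return residue if residue in expected else None


# Toy example (not a real Tat sequence): position 4 is K or N.
toy_protein = translate("ATGGAACCAAAA")
variant = call_variant(toy_protein, 4, expected={"K", "N"})
```

Returning `None` for an unexpected residue keeps participants with other amino acids at that position out of a two-group comparison instead of silently assigning them to a group.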

Statistical approach

All analyses were carried out utilizing SPSS software (IBM, USA), version 29. P-values below 0.05 were deemed statistically significant for all analyses. Normality of variables was evaluated by visually inspecting QQ plots alongside descriptive statistics. It was observed that the data distribution of immune markers IL-6, hsCRP, and sCD163, as well as the metabolite KA, exhibited skewness. Consequently, the data for skewed variables were log-transformed prior to statistical analyses. After log transformation, all test assumptions were checked and met. Data presented acceptable skewness and kurtosis values within the range of -2 and 2, residual plots indicated homoscedasticity and linearity. The Durbin-Watson statistic was within acceptable range, indicating independence of errors. Lastly, residuals of the regression models were normally distributed.
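The log transformation and the skewness/kurtosis check described above can be sketched in a few lines. This is an illustrative stand-in for the SPSS workflow: it uses simple moment-based estimators, which differ slightly from the bias-adjusted statistics SPSS reports, and the data values are hypothetical.

```python
import math
from statistics import mean


def central_moment(values, order):
    """Population central moment of the given order."""
    mu = mean(values)
    return sum((v - mu) ** order for v in values) / len(values)


def skewness(values):
    """Moment-based skewness: m3 / m2^1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2^2 - 3."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


def within_range(value, low=-2.0, high=2.0):
    """Check a statistic against the -2 to 2 range used in the text."""
    return low <= value <= high


# Hypothetical right-skewed marker concentrations (must be positive for log10).
raw = [1, 1, 2, 3, 50]
logged = [math.log10(v) for v in raw]

raw_skew = skewness(raw)
log_skew = skewness(logged)  # smaller than raw_skew for this example
acceptable = within_range(log_skew) and within_range(excess_kurtosis(logged))
```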

Primarily, we aimed to evaluate whether specific immune marker/metabolite levels had a relationship with single amino acid variants in Tat and Vif, respectively. Therefore, we divided participants into groups for Tat: K24 vs. N24, C31 vs. S31, R57 vs. S57, and L68 vs. P68. We divided participants into groups for Vif: K17 vs. R17 and I31 vs. V31. These amino acid positions were selected as previous studies have implicated them in the pathogenesis and neuropathogenesis of HIV-1 [9, 21, 27, 28, 33]. χ2 tests were used to assess group disparities across amino acid variants for sex, smoking, alcohol consumption, and locality. Independent-samples t-tests were used to detect differences in study characteristics (such as age, body mass index (BMI), and CD4+ counts), as well as levels of immune markers/metabolites. For the χ2 tests and independent-samples t-tests, p values of < 0.05 were deemed significant. To correct for the number of immune markers or metabolites tested, a Bonferroni correction was implemented (α/n = 0.05/5 = 0.01) in all relevant analyses. Pearson correlation analysis was used to identify covariates by exploring correlations between sociodemographic and lifestyle variables (age, sex, BMI, smoking status, alcohol use, and locality) and specific immune markers/metabolites. An Analysis of Covariance (ANCOVA) adjusts for the influence of covariates, which helps to isolate the effect of the independent variable on the dependent variable. Therefore, an ANCOVA was performed, with immune marker/metabolite levels as the dependent variables, to compare their levels among Tat or Vif amino acid variants. To prevent model overfitting, adjustments were limited to sex, BMI, and locality in the investigation of immune markers, and to alcohol use and BMI in the examination of metabolites.
Multiple regression analysis using the enter method was employed to determine associations between Tat/Vif amino acid variants and immune marker/metabolite levels after adjusting for covariates. For the Pearson correlation and multiple regression analyses, p values of < 0.05 were deemed significant.
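The Bonferroni step described in this section (α/n = 0.05/5 = 0.01) amounts to comparing each p-value with a stricter threshold. A minimal sketch follows; the function names and the list of p-values are hypothetical and serve only to show the arithmetic.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag each p-value that falls below alpha / number of tests."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [p < threshold for p in p_values]


# Hypothetical p-values for five markers tested against one variant split.
p_values = [0.04, 0.0005, 0.2, 0.03, 0.6]
flags = significant_after_bonferroni(p_values)
# Only the second value is below 0.05 / 5, so only it is flagged.
```

A p-value such as 0.04 passes the uncorrected 0.05 threshold but not the corrected one, which is the situation the correction is designed to catch.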

Results

Study characteristics

This study included a total of n = 67 treatment naïve people with HIV, with a mean age of 47.23 years as shown in Table 1. A total of 70% (n = 47) of the cohort were men. Viral load data was not documented in the primary study; nonetheless, at the time of sample collection, all participants were treatment naive. Only 59% of participants had CD4 + count data, with a mean CD4 + count of 299.5 (148.7) cells/μl. Approximately half of the participants (n = 34, 51%) were recruited from urban areas in South Africa. Self-reported data on alcohol consumption and smoking status were accessible for 99% (n = 66) of the participants. Among these individuals, 62% and 41% were identified as current or former smokers and alcohol consumers, respectively. Participants with available data for both Tat/Vif sequences totalled n = 21, while those with Tat only were n = 37, and Vif only were n = 51. Immune markers were examined in the entire cohort (n = 67); however, due to limited availability of biological samples, metabolomic analysis was conducted on a subset of the entire cohort (n = 48).

Table 1 Study characteristics of people with HIV

Differences in immune marker and metabolite levels across Tat/Vif variants

Participants were stratified based on Tat amino acid variants at position 24 (K: 14 vs. N: 13, n = 27), position 31 (C: 4 vs. S: 29, n = 33), position 57 (R: 5 vs. S: 26, n = 31), and position 68 (L: 17 vs. P: 15, n = 32), as well as Vif amino acid variants at position 17 (K: 25 vs. R: 23, n = 48) and position 31 (I: 13 vs. V: 30, n = 43). Using independent sample t-tests as well as χ2 tests, no significant differences were found in study characteristics (sex, age, locality, alcohol, BMI, and smoking) amongst the investigated groups.

Participants with the Tat N24 variant had higher levels of sCD163 compared to participants with the K24 variant (p = 0.04) (supplementary Fig. 1A). Moreover, sCD163 levels were significantly higher in participants with Tat S31 compared to participants with C31 (p < 0.001) (supplementary Fig. 2A). However, following the application of a Bonferroni correction (α = 0.05/5 = 0.01), only the higher levels of sCD163 in participants with the S31 variant remained statistically significant (p < 0.001). None of the remaining comparisons across Tat amino acid variants (positions 24, 31, 57, and 68) showed significant differences for any of the immune markers or metabolites investigated (supplementary Figs. 1–4). None of the immune markers or metabolites were significantly different between the Vif amino acid variants (supplementary Figs. 5–6).

Furthermore, we aimed to determine whether our findings retained or gained significance after adjusting for covariates. After adjusting for locality, BMI, and sex using ANCOVA models for immune markers, it was found that among all Tat variants investigated, participants with the Tat N24 variant had higher levels of sCD163 (adj R2 = 0.048, p = 0.042) compared to those with the K24 variant (Fig. 2A). After adjusting for alcohol consumption and BMI in the metabolite models, participants in the Tat R57 group had higher levels of KA (adj R2 = 0.166, p = 0.031) compared to those in the S57 group (Fig. 2B). The full ANCOVA data for significant findings are presented in Supplementary Tables 1 and 2. None of the metabolites/immune markers showed significant differences between the Vif amino acid variants or the remaining Tat amino acid variants after adjusting for covariates. Sample-level data for immune markers and metabolites are presented in Supplementary Tables 5 and 6.

Fig. 2 Significant findings for Tat positions 24 and 57. (A) sCD163 levels were significantly higher in participants with the N24 amino acid variant than in participants with the K24 amino acid variant (p = 0.042). (B) KA levels were significantly higher in participants with the R57 amino acid variant than in participants with the S57 variant (p = 0.031). The bars depict the average concentrations across the study groups, expressed as mean values with standard error of the mean (SEM)

Relationship between Tat and Vif variants and immune marker/metabolite levels

Additionally, we sought to assess the associations between Tat/Vif amino acid variations at specific positions and levels of immune markers/metabolites while controlling for covariates. To achieve this, we employed multiple regression analyses in which locality, BMI, and sex were adjusted for in models focusing on immune markers, while alcohol use and BMI were adjusted for in models focusing on metabolites. The volcano plot (Fig. 3) illustrates the distribution of effect sizes and significance levels for the associations between amino acid positions in Tat and Vif and the various immune markers and metabolites. Significant associations were observed for Tat positions 24 and 57. The heatmap (Fig. 4) further illustrates these associations by mapping the normalized effect sizes for each Tat and Vif position against the immune markers/metabolites, with significant associations marked by an asterisk. Among all Tat and Vif amino acid variants investigated, the amino acid variation of Tat at position 24 was associated with sCD163 (adj R2 = 0.048, β = -0.416, p = 0.042). Similarly, the amino acid variation of Tat at position 57 was associated with KA (adj R2 = 0.166, β = 0.535, p = 0.031) (Fig. 3 and Fig. 4). The full regression data for significant findings are presented in Supplementary Tables 3A, 3B, 4A, and 4B.

Fig. 3 Volcano plot demonstrating the association between viral protein amino acid variations and immune marker/metabolite levels. The plot includes Tat amino acid position 24 (blue circles), position 31 (brown circles), position 57 (green circles), and position 68 (black circles). It also includes Vif amino acid positions 17 (orange triangles) and 31 (yellow triangles). Significant values, those with -log10(p) > 1.3 (p < 0.05), are in red text
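The significance cut-off on the volcano plot's vertical axis follows from the p < 0.05 threshold, since -log10(0.05) is approximately 1.3. The short sketch below shows this conversion for the two p-values reported in the text (0.042 and 0.031); the function names are illustrative.

```python
import math


def neg_log10(p):
    """Transform a p-value to the -log10 scale used on a volcano plot."""
    if not 0 < p <= 1:
        raise ValueError("p must be in the interval (0, 1]")
    return -math.log10(p)


def is_highlighted(p, alpha=0.05):
    """A point is highlighted when -log10(p) exceeds -log10(alpha)."""
    return neg_log10(p) > neg_log10(alpha)


threshold = neg_log10(0.05)            # about 1.30
tat24_scd163 = neg_log10(0.042)        # p-value reported for Tat position 24
tat57_ka = neg_log10(0.031)            # p-value reported for Tat position 57
both_highlighted = is_highlighted(0.042) and is_highlighted(0.031)
```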

Fig. 4 Heatmap representing the associations (normalized effect size β) between Tat/Vif amino acid positions and peripheral immune and metabolic markers. Significant associations with p-value < 0.05 are indicated by white asterisks

Discussion

In this study, we investigated the relationship between Tat and Vif amino acid sequence variations and peripheral immune markers, as well as Trp-Kyn metabolism, in a unique cohort of cART-naïve individuals from South Africa. This approach enabled us to analyse metabolic and immune profiles without the potential confounding effects of cART, which is known to influence these profiles in people with HIV [51, 52]. Two key findings emerged: (1) after adjusting for covariates, sCD163 and KA were higher in participants with Tat signatures N24 and R57, respectively, and (2) amino acid variation at positions 24 and 57 of Tat was associated with sCD163 and KA, respectively.

Firstly, we observed that participants with the N24 variant exhibited higher levels of sCD163 than participants with the K24 amino acid variant, and amino acid variation at this position (between N24 and K24) was associated with sCD163. sCD163 is an immune activation marker often considered an indicator of inflammation [53]. Dysregulated inflammation is often regarded as a negative clinical outcome, as higher levels of this specific immune marker have been linked to disease progression and neurocognitive impairment in individuals living with HIV [54, 55]. This finding is intriguing, particularly considering a recent study conducted among treatment-naïve people with HIV in South Africa [11]. That study revealed that the Tat K24 variant was associated with a 2.08 times increased risk of neurocognitive impairment, 3.15 times higher proviral load, and a 69% lower absolute CD4 T-cell count compared to individuals without the signature. In contrast, our findings suggest the N24 variant to be the higher-risk amino acid compared to the K24 variant due to its possible influence on inflammation. It is conceivable that position 24 of the Tat protein may play a functional role in HIV-1 pathogenesis overall, potentially influencing the structure–function relationship of the Tat protein. This hypothesis aligns with observations in other regions of the Tat protein, such as the basic domain, where alterations of amino acids significantly affect Tat transactivation [26]. However, this previous study also investigated a small cohort; therefore, these findings warrant further validation in larger cohorts, and further investigation is necessary to elucidate the complete functional significance of variation at this position.

Furthermore, we observed that the Tat R57 variant exhibited higher levels of KA. KA has been linked to the pathology of several disorders [56]. Although our sample size for the amino acid position 57 split was small (R: 5 vs. S: 26, n = 31), and we cannot completely rule out the possibility of false positive findings in this group, the idea that the R57 amino acid variant may negatively influence clinical outcomes has been documented in clinical and in vitro models. In a previous study conducted within our group, albeit with a limited number of participants (R: 4 vs. S: 45, n = 49), lower TYMP and CCL2 levels were noted in the S57 group compared to the R57 group (p < 0.05) [20]. Similarly, in in vitro studies, the R57 variant was found to be associated with increased cellular uptake of Tat, transactivation (likely through enhanced interaction with TAR), and heightened inflammation [8, 27]. Here our results also showed that participants with the R57 signature had higher levels of KA, and amino acid variation at this position (between R57 and S57) was associated with KA. Hence, a consistent trend for the impact of the R57 variant is observed in the dysregulation of metabolism, indicating the necessity for larger cohort studies to further investigate this amino acid signature.

We also found that the Tat C31/C31S variant does not significantly influence the immune markers and metabolites investigated in this study. Similarly, in a previous study, amino acid variation at position 31 also did not show relevant differences in the levels of MCP-1/CCL2, matrix metalloproteinase (MMP)9, NGAL, transforming growth factor (TGF)-β1, TYMP, and vascular endothelial growth factor (VEGF) [20]. Despite extensive investigation of the C31 variant in cell culture studies, which showed that Tat C31 contributes to increased transmigration of cells into the CNS [57, 58], greater neurotoxicity to neuronal cells [59], and higher levels of inflammation [23, 24], these findings are not mirrored in clinical sample types or clinical outcomes. For instance, previous studies indicate that there are no significant disparities in cognitive performance between individuals exhibiting the C31S motif and those lacking the C31S substitution [60]. Likewise, there were no significant differences observed in either volumetric or diffusion indices between the Tat C31S and C31C groups [61]. This implies that the Tat C31S status might not serve as an adequate biomarker for adverse clinical outcomes, as the effects of mutations at this position could be concealed by other unexplored clinical factors.

A new variant identified in approximately half of our cohort was Tat P68, which also did not yield significant findings. It is noteworthy that the Vif R17 and V31 variants are more commonly observed in South African subtype C Vif sequences than in other regions where subtype C is prevalent, such as India and Uganda [33]. Interestingly, a cell culture study suggests that K17 is more important for viral infectivity [28], and the V31I mutation is known to induce cell cycle arrest [29]. However, within the context of this study, no significant findings were observed for the influence of these amino acid signatures. The precise roles of these Tat and Vif variants are yet to be fully explored.

Overall, it is evident from this study and many others that amino acid sequence variations of viral proteins may influence the structure–function relationships of these proteins, ultimately impacting the underlying mechanisms of HIV-1 pathogenesis. However, the extent to which these variations contribute to overall clinical outcomes in people with HIV requires further investigation and cannot be fully ascertained from this study alone. We acknowledge that certain signatures investigated in cell culture (without confounders) similarly reflect characteristics in clinical sample types, while others do not show this trend. This discrepancy may stem from the fact that in people with HIV, several other proteins may influence the levels of these markers, or there may be other confounding factors at play in clinical investigations. Additionally, it is worth considering that certain amino acids may not directly influence outcomes, but rather, their positioning in critical functional regions of key viral proteins may lead to changes in underlying mechanisms.

Limitations

While recognizing the exploratory nature of this study, we acknowledge several limitations that are apparent, and therefore, caution should be exercised in interpreting our findings. Firstly, the number of participants per group was limited, and there were imbalances in group sizes in some cases, which may introduce the possibility of chance findings. Additionally, we may have lacked statistical power to detect other differences. Moreover, our investigation focused only on Tat and Vif, yet HIV involves multiple proteins that could influence the levels of metabolites and immune markers studied here. For instance, previous research by our group and others has linked certain signatures of other proteins like Vpr to immune markers [32]. Furthermore, there may be additional potential covariates that could have influenced our findings in the clinical sample. However, we have adjusted our analysis based on demographics and study characteristics that we have determined to have an influence. Another limitation of our study is related to the number of statistical tests performed and the multiple testing correction strategy employed. To identify covariates associated with our outcomes of interest, we conducted a correlation analysis. This approach allowed us to adjust for only those covariates that were significantly associated, thereby reducing the risk of overfitting the model. However, this method also has inherent limitations. To account for multiple comparisons, we applied the Bonferroni correction. While this conservative approach helps control the family-wise error rate, it also increases the likelihood of Type II errors, which means some true associations may not have been detected. The stringent nature of the Bonferroni correction can lead to an overly cautious interpretation of results, potentially overlooking meaningful findings. 
Lastly, we only investigated specific immune markers and metabolites, and there may be other markers more directly involved in pathways related to the amino acid changes of the proteins we investigated. Therefore, the exploratory nature of this study should be taken into consideration when interpreting the findings presented here.

Conclusion

In a treatment-naïve, subtype C cohort from South Africa, we investigated the associations between changes in amino acid sequences in Tat and Vif and specific immune and metabolic markers. After adjusting for covariates, sCD163 and KA were higher in participants with Tat signatures N24 and R57, respectively, and amino acid variation at positions 24 and 57 of Tat was associated with sCD163 and KA, respectively. Findings from this study highlight the potential influence of amino acid sequence variation of the Tat protein on inflammatory and metabolic pathways in people with HIV.