Divergence of plastid 2-oxoglutarate “only” transporters away from general transporters by using a cysteine-rich architecture.

The common carbon and nitrogen currency, 2-oxoglutarate, could become a valuable resource for nitrogen assimilation and carbon-centered biochemical fates. In this in silico study, several lines of evidence, namely phylogeny, sequence comparisons, and the presence and location of clustered cysteines in specific plastid transporters of 2-oxoglutarate, were used to examine the evolution of these transporters away from more generalized transporters. This transition would confer the capability of internalizing 2-oxoglutarate alone, or with superior specificity, in exchange for malate. In phylogeny, the specific 2-oxoglutarate transporters (Cluster 1) form a clade separate from the two clades of general transporters (Clusters 2 and 3). The exclusivity (Cluster 1) and promiscuity (Clusters 2 and 3) of the transporters were benchmarked against Arabidopsis counterparts characterized prior to this study. Within the parent clade of exclusive transporters, C4 and C3 2-oxoglutarate transporters again form separate monophyletic clusters. Furthermore, a Cys-X-X-Cys-X(19)-Cys pattern is conserved within the 2-oxoglutarate-only transporters but is missing in general transporters. Cysteines that are functionally key residues are inferred to mediate intra- or intermolecular disulfide bond formation, to use a thiol (sulfhydryl) group for transport, or to form a metal-binding site. A disulfide bond prediction tool identified the Cys-X-X-Cys-X(19)-Cys region, with negligible doubt, as a strong contender for two separate disulfide bonds, although the middle cysteine was predicted to be involved in both. In addition, Cluster 2 general Zea mays C4 transporters are shown to be more recalcitrant to mutations of cysteines than their Panicum and Oryza counterparts.
The availability of 2-oxoglutarate in the chloroplast could play a two-pronged role in C4 plants: supporting, via ammonia assimilation, the synthesis of the bundle sheath cell enzyme Rubisco, which makes up ~50% of plant proteins, and contributing to carbon-centered biochemical pathways. This study could greatly facilitate the choice of the right transporters to tinker with for a future C4 rice in a world impacted by climate change.


Research Article
Open Access
DOI: http://doi.org/10.4038/sljb.v6i2.81

gases and has independently evolved on at least 65 occasions, which consolidates the emergence and reemergence of a convergent evolutionary fate. With this in mind, there is a worldwide effort to transform the staple of Asia, the domesticated genus Oryza, from a C3 engine into a supercharged C4 photosynthetic system in which CO2 is concentrated at the active site of Rubisco (ribulose-1,5-bisphosphate carboxylase/oxygenase) (Zhu et al., 2010; Feldman et al., 2014).
One of the key molecules common to nitrogen assimilation and many carbon-related processes is 2-oxoglutarate (Woo et al., 1987; Taniguchi et al., 2002). This molecule is synthesized as a product of the Krebs cycle and is then internalized into plastids, where its carbon backbone is converted into the amino acid glutamate using ammonia. Both C3 and C4 plants possess transporters that exchange 2-oxoglutarate for malate (to internalize the former), alongside more general counterparts that are less specific for 2-oxoglutarate (Taniguchi et al., 2002). 2-oxoglutarate influences two major elements in plants: nitrogen, the canonical macronutrient, and carbon, the element used for autotrophy, which is converted to 3-phosphoglycerate by Rubisco and used for building organic molecules (Huergo & Dixon, 2015). In nitrogen assimilation, 2-oxoglutarate is converted to glutamate by two enzymes, glutamate synthase (GOGAT) and glutamate dehydrogenase (GDH); starting from the 2-oxoglutarate substrate, GOGAT yields two glutamates and GDH a single glutamate (Huergo & Dixon, 2015). In plants, the synthesized glutamate donates an amino group to glyoxylate, which results in the formation of glycine in the photorespiratory pathway (Dellero et al., 2016).
Cysteines are thought to be among the most functionally significant amino acids and to have emerged later in organismal evolution. A hallmark of cysteine function is disulfide bond formation, which gives a protein extra stability in its three-dimensional structure. Cysteines possess thiol groups (cysteine is the only amino acid to utilize such a group), form disulfide bonds between two cysteine residues, are found as highly conserved residues in protein sequences, form clusters in close proximity, and possess high metal-binding affinities, while their hydrophobic or non-hydrophobic nature is open to two interpretations (Poole, 2015).
A recent study by the C4 rice consortium demonstrated that transgenic expression of key enzymes of maize C4 photosynthesis inside the rice leaf does not increase yield or vegetative growth parameters (biomass), despite correct localization of the transgenic enzymes in planta (Karki et al., preprint). The compartmentalization of photosynthesis and the associated photorespiration makes membrane-bound transporters key contenders for the transport of small organic molecules, chiefly C4 compounds (Dellero et al., 2016). If C4 rice is to be a success, the biochemical currents should be manipulated with diligence along the conduits between mesophyll and bundle sheath cells, between cytosol and plastid and other compartments, and also along longer-range transport routes based on the cell wall (apoplastic transport) and plasmodesmata (symplastic transport). There are many protein gatekeepers on border control at membranes (Taniguchi et al., 2002), and such proteins have to be taken into account if we are to build a C4 prototype. Furthermore, the enigmatic 2-oxoglutarate requires attention, since not all of its key functions are known at present (Huergo & Dixon, 2015). How 2-oxoglutarate can be sensed and used in C and N biochemistry may be the holy grail for a future GM-based C4 rice crop.
Finally, there are two kinds of 2-oxoglutarate shuttles: the specific one and the common one, or simply the specialist and the generalist (Taniguchi et al., 2002). The specific one, although described as associated with 2-oxoglutarate only, is now known to have two biochemical shuttle capacities: oxaloacetate/malate and 2-oxoglutarate/malate (Kinoshita et al., 2011). The oxaloacetate/malate shuttles, together with the malate dehydrogenase enzymes, are known as malate valves (Selinski & Scheibe, 2019) and are key because NADH and NADPH cannot permeate membranes.
This paper is an exploration into what makes the specialist 2-oxoglutarate transporters different from generalist counterparts. Observations on phylogeny, sequence divergence, amino acid composition and reactivities between unique cysteines are inferred here to play key roles in making such transporters unique.

Sri Lankan Journal of Biology 6(2) June 2021

Phylogenetic Reconstructions
The non-redundant amino acid sequences downloaded for each query (as FASTA files) were first aligned with the ClustalW algorithm in MEGA version X (default parameters) (Sohpal et al., 2010) and converted to the MEGA sequence format. Phylogenetic reconstruction was then performed using the neighbor-joining and maximum parsimony methods, with support from 500 bootstrap replications. No outgroups were assigned.
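As a rough illustration of what a single bootstrap replication involves, the sketch below resamples alignment columns with replacement, which is the general idea behind bootstrap support values. It is a minimal sketch and not the MEGA X implementation; the function name `bootstrap_columns`, the toy alignment, and the fixed random seed are assumptions made for this example only.

```python
import random
from typing import Dict


def bootstrap_columns(alignment: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Return one pseudo-replicate built by sampling columns with replacement.

    `alignment` maps a sequence name to its aligned (equal-length) sequence.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # Draw n_cols column indices with replacement; the same index may repeat.
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    # Apply the same picks to every sequence so that columns stay intact.
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


# Hypothetical toy alignment, for illustration only (not data from this study).
toy = {"seqA": "CVACGS", "seqB": "CIACGT", "seqC": "SVAGGS"}
rng = random.Random(0)  # fixed seed so that the run is repeatable
replicates = [bootstrap_columns(toy, rng) for _ in range(500)]
```

Each pseudo-replicate would then be passed to the tree-building step, and the fraction of replicate trees that recover a given clade is reported as its bootstrap support.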

Prediction of disulfide bonds
The protein >XP_015620646.1 (a specific 2-oxoglutarate transporter) from the genus Oryza was submitted to the DiANNA server (http://clavius.bc.edu/~clotelab/DiANNA/) to identify likely disulfide bond pairs.

Multiple sequence alignments
The non-redundant downloaded amino acid sequences (as FASTA files) were aligned with the ClustalW algorithm in MEGA version X (default parameters) (Sohpal et al., 2010).

Results and Discussion
Sequences of 2-oxoglutarate transporters from the genus Oryza (C3) and the genus Panicum (only its C4 species), including both general transporters and those specific for 2-oxoglutarate, were first downloaded from the NCBI protein sequence database. The thresholds were set at 30% identity (the level defined by the Structural Classification of Proteins, SCOP) and 30% coverage (an arbitrary value) to retain authentic candidates and exclude false positives (Lo Conte et al., 2000). For both C3 rice and C4 Panicum, there were three clear clusters of dicarboxylate transporters, which indicates three evolutionary outcomes in both the C3 and C4 candidate plants (Figure 1).
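The 30% identity and 30% coverage thresholds can be illustrated with a small filter over a pairwise alignment. This is a minimal sketch under stated assumptions: identity is computed over columns where both sequences have a residue, and coverage is the share of query residues aligned to a hit residue. Search tools may define both quantities differently, and the function names and toy strings below are hypothetical.

```python
def percent_identity(aligned_query: str, aligned_hit: str) -> float:
    """Identity over columns where both sequences carry a residue (assumed definition)."""
    if len(aligned_query) != len(aligned_hit):
        raise ValueError("aligned sequences must have equal length")
    paired = [(q, h) for q, h in zip(aligned_query, aligned_hit) if q != "-" and h != "-"]
    if not paired:
        return 0.0
    return 100.0 * sum(q == h for q, h in paired) / len(paired)


def query_coverage(aligned_query: str, aligned_hit: str) -> float:
    """Percentage of query residues aligned to a hit residue (assumed definition)."""
    query_len = sum(q != "-" for q in aligned_query)
    if query_len == 0:
        return 0.0
    covered = sum(q != "-" and h != "-" for q, h in zip(aligned_query, aligned_hit))
    return 100.0 * covered / query_len


def keep_hit(aligned_query: str, aligned_hit: str,
             min_identity: float = 30.0, min_coverage: float = 30.0) -> bool:
    """Keep a hit only if it meets both thresholds used in the text."""
    return (percent_identity(aligned_query, aligned_hit) >= min_identity
            and query_coverage(aligned_query, aligned_hit) >= min_coverage)


# Hypothetical aligned pair ("-" marks a gap); not sequences from this study.
toy_query = "ACDEFG-HIK"
toy_hit = "ACDQFGWH--"
print(keep_hit(toy_query, toy_hit))
```

Requiring both conditions guards against short, highly similar fragments, which can score a high identity while covering only a small part of the query.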
In a study performed in 2002, a recombinant AtpOmt1 protein from Arabidopsis thaliana, produced using a yeast expression system, was able to transport 2-oxoglutarate "exclusively", while the related AtpDct1 protein was able to transport both 2-oxoglutarate and glutamate, suggesting that the latter was a general transporter (Taniguchi et al., 2002). However, this exclusivity is now disputed because AtpOmt1 can also internalize oxaloacetate into the chloroplast in exchange for malate (Kinoshita et al., 2011). In the phylogeny, the Arabidopsis AtpOmt1 protein formed a divergent node that branched off prior to the emergence of C4 photosynthesis and clustered with 6 Oryza (C3), 3 Zea (C4) and 4 Panicum (C4) proteins as a monophyletic clade (Figure 2). This monophyletic clade, composed of the "exclusive" 2-oxoglutarate transporters, was named Cluster 1. The AtpOmt1 protein would have been present before the emergence of C4 photosynthesis and would have diverged down the dicot lineage after the monocots broke away from the common plant lineage, which dates back to before 140-150 Mya.
On the other hand, the AtpDct1 protein was found as a basal node to two other clusters (Cluster 2 and Cluster 3), which indicates that the emergence of the two clusters postdated the dicot-monocot breakaway (Figure 2). Based on a combination of sequence similarity and monophyly, the AtpDct1 protein and the remaining proteins from Cluster 2 and Cluster 3 are all inferred to be general dicarboxylate transporters. The general transporters will likely show promiscuity, accommodating a variety of compounds as cargo to the plastid interior while letting malate be transported out to the cytosol. It is interesting, though, that once the monocot clade emerged from the common monocot-dicot lineage, an additional divergence event gave rise to Cluster 2 and Cluster 3 as separate clusters, and that this event predates the emergence of the C4 photosynthesis group 35 Mya (Figure 2). In short, following the monocot-dicot breakaway, a bifurcation between ~140 and ~35 Mya would have let Clusters 2 and 3 emerge from the common monocot lineage.
Sequence features that differed between the exclusive 2-oxoglutarate transporters (Cluster 1) and the generalized transporters (Clusters 2 and 3) were scrutinized. When sequence alignments were produced and analyzed, three cysteines clustered close to each other were present only in Cluster 1 (in both C3 and C4 candidates) and were missing in the general transporter clusters. The first and second cysteines were separated by 2 residues, and the second and third cysteines by 19 residues (Figure 3). The motif was named Cys-X-X-Cys-X(19)-Cys. The same three cysteines are also conserved in the Arabidopsis AtpOmt1 protein (-CVACGSNVGDGTEHRLGSWLMLTC-), suggesting that, for specialist 2-oxoglutarate transporters, the motif does not differ across the monocot-dicot divergence or the emergence of C3/C4 photosynthesis systems, indicating events in convergent evolution. Moreover, cysteines are relatively rare in transmembrane proteins, which implies that the three-cysteine region may be crucial for a specific function.
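The Cys-X-X-Cys-X(19)-Cys spacing can be checked mechanically with a regular expression over one-letter amino acid codes. The snippet below is a minimal sketch applied to the Arabidopsis segment quoted above; the function name `find_motif` is hypothetical, and the scan reports non-overlapping matches only.

```python
import re

# Cys-X-X-Cys-X(19)-Cys written as a regular expression over one-letter codes.
MOTIF = re.compile(r"C.{2}C.{19}C")


def find_motif(sequence: str):
    """Return (1-based positions of the three cysteines, matched text) for each hit."""
    hits = []
    for match in MOTIF.finditer(sequence):
        first = match.start() + 1
        # The second cysteine lies 3 positions after the first, the third 23 after.
        hits.append(((first, first + 3, first + 23), match.group()))
    return hits


# Segment of the Arabidopsis protein quoted in the text.
segment = "CVACGSNVGDGTEHRLGSWLMLTC"
print(find_motif(segment))
```

Within this 24-residue segment the three cysteines fall at positions 1, 4 and 24, that is, offsets of +3 and +23 from the first cysteine.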

Figure 1(A)
The amino acid sequences of dicarboxylate (C4, 2-oxoglutarate) transporters from the genus Oryza were first aligned using the ClustalW algorithm in MEGA version X, and the phylogenetic reconstruction was performed using the neighbor-joining method with support from 500 bootstrap replications.

There are three possible biochemical fates for the above string of three cysteines. First, they can form intramolecular disulfide bonds or, if the monomers are arranged as dimers or multimers, intermolecular disulfide bonds. When the secondary structure of the Oryza protein (>XP_015620646.1) was analyzed using bioinformatics (McGuffin et al., 2000), the first two cysteines (Cys-X-X-Cys) were found in an extracellular domain and the third cysteine was found in a region interacting with the plastid membrane, perhaps in a helical disposition (Figure 4). The Cys-X-X-Cys region is one of the most commonly found motifs in proteins, although it is found less often in plants than in most other life forms, complex or basic (Miseta and Csutora, 2000). It is suggested that any of these three cysteines could take part in intramolecular disulfide bonds. To explore this possibility further, the Oryza sativa japonica sequence >XP_015620646.1 (a specific 2-oxoglutarate transporter belonging to Cluster 1) was submitted to the DiANNA server (http://clavius.bc.edu/~clotelab/DiANNA/) to identify likely disulfide bond pairs (Ferre & Clote, 2005; Ferre & Clote, 2006). The three cysteines (Cys-X-X-Cys-X(19)-Cys) were predicted to participate in disulfide bonds in the following combinations: DiANNA identified bonds 232-235 (Cysteine 1 and Cysteine 2) and 235-255 (Cysteine 2 and Cysteine 3) as the likely disulfide bonds (Table 1). Because the second cysteine appears in both bonds, it is likely that only one of these bonds is present at any given time.
Table 1: Predicted disulfide bonds and their probabilities based on scores, obtained when the Oryza sativa japonica sequence >XP_015620646.1 was submitted to the DiANNA server (http://clavius.bc.edu/~clotelab/DiANNA/). The highest-scoring bonds involve the region demarcated by the Cys-X-X-Cys-X(19)-Cys motif, reflecting its candidacy for disulfide bond formation.

Table 1 columns: Cysteine sequence position; Distance; Bond; Score

The DiANNA server (Ferre & Clote, 2005, 2006) uses a five-step process to identify disulfide bonds: (1) PSIPRED is used to determine secondary structural motifs; (2) PSI-BLAST is run against the SWISS-PROT database to obtain a multiple sequence alignment; (3) the outputs of (1) and (2) together feed a neural network that is trained for the prediction; (4) the oxidation states of the cysteines are predicted; and finally (5) Rothberg's implementation of Gabow's maximum weighted matching algorithm is used to demarcate the likely disulfide bonds (Ferre & Clote, 2006).
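The final matching step explains why a single cysteine cannot be assigned to two bonds at once. The sketch below illustrates the idea with an exhaustive search over disjoint cysteine pairs. It is a brute-force stand-in for small inputs, not Gabow's algorithm or the DiANNA code, and the scores are hypothetical placeholders rather than the values of Table 1.

```python
from typing import Dict, FrozenSet, List, Tuple

Pair = Tuple[int, int]


def best_matching(scores: Dict[Pair, float]) -> Tuple[float, List[Pair]]:
    """Exhaustively pick disjoint cysteine pairs with the largest total score."""
    pairs = sorted(scores)
    best: Tuple[float, List[Pair]] = (0.0, [])

    def search(start: int, used: FrozenSet[int], total: float, chosen: List[Pair]) -> None:
        nonlocal best
        if total > best[0]:
            best = (total, list(chosen))
        for j in range(start, len(pairs)):
            a, b = pairs[j]
            if a in used or b in used:
                continue  # a cysteine can take part in only one bond
            chosen.append(pairs[j])
            search(j + 1, used | {a, b}, total + scores[pairs[j]], chosen)
            chosen.pop()

    search(0, frozenset(), 0.0, [])
    return best


# Hypothetical scores for the three cysteine positions named in the text.
toy_scores = {(232, 235): 0.99, (235, 255): 0.98, (232, 255): 0.10}
print(best_matching(toy_scores))
```

With only three cysteines, any two candidate bonds share a residue, so a matching can contain a single bond. This is consistent with the reading that bonds 232-235 and 235-255 are alternatives rather than simultaneous.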
On the other hand, there have been instances where transmembrane cysteines have facilitated the transport of compounds to the inside of a compartment, which suggests that any of the cysteines could contribute structurally to the permeation of the membrane by the organic molecule 2-oxoglutarate (Jimenez-Vidal et al., 2004). Since the chloroplast is largely an oxygen-rich environment (due to photosynthesis), it is likely that disulfide bond formation would be preferred. Yet another possibility is that the sulfhydryl groups act as sensors of 2-oxoglutarate or bind metal ions, thereby enabling a more discrete function.
A preserved cysteine is found in the Zea mays general C4 transporters of Cluster 2 but is missing in the Panicum and Oryza counterparts of the same clade (Figures 5 and 6). Even in the basal AtpDct1 protein there is a cysteine-to-glycine mutation, indicating that the Zea mays sequences may have preserved the cysteine function while their near and far counterparts in Cluster 2 may have lost the putative functional cysteine to a glycine (Figures 5 and 6). It would be interesting to transfer the candidate Zea mays sequences in Cluster 2 to the C3 rice plant to ascertain their true contribution to the cellular transport of C4 compounds. The cysteine-to-glycine mutation is inferred to be a loss-of-function mutation. According to the genetic code, the cysteine-to-glycine mutation is not a simple wobble (third) position mutation, and this consolidates my theory that the cysteine is likely a functional feature preserved in Zea mays and common to all C4 transporters of this food crop.
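The codon argument can be verified directly from the standard genetic code, in which cysteine is encoded by TGT and TGC and glycine by GGT, GGC, GGA and GGG. The sketch below lists every cysteine-to-glycine change reachable by a single base substitution and reports the codon position involved; the function name `single_base_routes` is hypothetical.

```python
# Standard genetic code, DNA alphabet.
CYS_CODONS = ("TGT", "TGC")
GLY_CODONS = ("GGT", "GGC", "GGA", "GGG")


def single_base_routes(src_codons, dst_codons):
    """List (source codon, target codon, 1-based position) for pairs differing at one base."""
    routes = []
    for src in src_codons:
        for dst in dst_codons:
            diffs = [i + 1 for i in range(3) if src[i] != dst[i]]
            if len(diffs) == 1:
                routes.append((src, dst, diffs[0]))
    return routes


routes = single_base_routes(CYS_CODONS, GLY_CODONS)
print(routes)
```

Both single-substitution routes (TGT to GGT and TGC to GGC) change the first codon position, a T-to-G transversion, so the cysteine-to-glycine replacement cannot arise from a third-position change alone.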
The role of the three cysteines needs to be explored using site-directed mutagenesis or CRISPR/Cas9, and perhaps also the Xenopus oocyte expression system, to study the transporter functions. It is crucial to learn how 2-oxoglutarate is channeled across the many conduits that make up compartmentalized anatomies and substructures (for example, Figure 7). In the absence of a clear tertiary structure, it is difficult to identify any other residue that may have a role to play in the biology of 2-oxoglutarate transporters. The rather unique features of specific 2-oxoglutarate transporters represent a sound convergent adaptation, and genetically incorporating such a transporter into C4 rice should be a biochemical bonanza for the scientists working on C4 rice.

Figure 5:
The sequence alignment of all 28 sequences representing Cluster 2 and Cluster 3, showcasing the key cysteine→glycine mutation that is found only in the Oryza and Panicum general C4 transporters of Cluster 2 and is absent from the Zea mays transporters, which suggests that the preserved cysteine is functionally significant.

GOGAT - Glutamine oxoglutarate aminotransferase; GDH - Glutamate dehydrogenase; GS - Glutamine synthetase.

Conclusions
This study showcases the founding diversity of C4 transporters capable of transporting 2-oxoglutarate to the inside of the plastid, focusing on their divergence from general transporters through the incorporation of cysteines in their extracellular or membrane-interacting helical domains. The phylogeny of the unique 2-oxoglutarate-only transporters suggests that the C3 and C4 photosynthesis-based divisions share the same cysteine architecture, perhaps alluding to shared functionality. The Arabidopsis AtpOmt1 protein, although diverging at the dicot-monocot bifurcation, shares the three-cysteine (Cys-X-X-Cys-X(19)-Cys) architecture with this monophyletic clade. It is also shown that a cysteine conserved in all Cluster 2 2-oxoglutarate transporters of Zea mays is replaced by a glycine in non-Zea proteomes, which we infer to be a loss-of-function mutation. In all, learning the architecture of cysteines and their possible contributions to the transport of 2-oxoglutarate should be a valuable addition to the strategic tinkering of transporters inside a future C4 rice.