1Department of Biochemistry, University of Okara, Punjab, Pakistan
2Department of Zoology, University of Okara, Punjab, Pakistan
3Department of Molecular Biology, Faculty of Life Sciences, University of Okara, Punjab, Pakistan
The etiology and clinical manifestations of immune-mediated diseases are attributed to a variety of hereditary and environmental factors. On the genetic side, variations have been identified in a range of autoimmune disorders. Genome-wide association studies (GWAS) are a viable approach for determining the inherited basis of disease. By fully understanding their results, we can investigate the mutations or polymorphisms correlated with a disease, help identify it, and predict outcomes. Autoimmune diseases are distinguished by a lack of self-tolerance, which results in immune-mediated tissue damage and chronic inflammation. Although studies indicate associations between single nucleotide polymorphisms (SNPs) in the human genome and autoimmune disorders, the findings are still ambiguous. A literature review was conducted to identify the relationship between polymorphisms and autoimmune disease. In this review, we discuss the significant outcomes of GWAS in common autoimmune conditions such as multiple sclerosis, diabetes, migraine, Parkinson's disease, and Alzheimer's disease. We further explore the prospective role of these findings in disease forecasting, basic biology, and the development of possible therapies. Continuous developments in technology and analytic methods have boosted the effectiveness and efficiency of genetic mapping approaches.
Human genome technology is becoming more accessible owing to breakthroughs in DNA sequencing. Because of the massive variety of mutations, interpreting genetic variation is one of the main issues facing genomics research today [1]. The simplest form of polymorphism is a single-nucleotide polymorphism (SNP), which is caused by a single base mutation that changes one nucleotide for another. The insertion or deletion of a DNA sequence may result in a variety of further alterations [2, 3]. The 99.5% genomic similarity shared by humans suggests that epigenetic changes and the remaining 0.5% difference account for phenotypic variability [4]. Genetic variants are sometimes referred to as mutations and often as polymorphisms [3, 5]. Any change from the norm in a DNA sequence is referred to as a mutation [6]. Conversely, a polymorphism is a sequence variation that is present in the population. A particular genetic variant is depicted in Figure 1A as a polymorphic modification. The majority of SNPs are biallelic (two alternative bases occur) and must reach a minimum frequency in the population (>1%), although multiallelic SNPs do exist [7]. SNPs may fall in coding (exonic), non-coding (intronic), or intergenic regions [8, 9]. SNPs that are transitions (A ↔ G or C ↔ T) are more common than SNPs that are transversions (A ↔ C/T and G ↔ C/T) [10]. SNPs, which involve a single base change between alleles, account for the majority of the genetic differences in the human genome and number between 3 and 5 million in each person [11, 12]. Sequence variation arises from short and variable tandem repeats, insertion or deletion polymorphisms (InDels), and SNPs. Many InDels occur in functionally significant regions of the DNA and may thus play a role in disease progression [13].
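The transition/transversion distinction and the >1% frequency criterion described above can be made concrete with a minimal sketch. The function names and the example allele counts are hypothetical, chosen only for illustration; the sketch assumes substitutions are written on a single strand using the four DNA bases.

```python
# Purines and pyrimidines: a transition stays within one class,
# a transversion crosses between the two classes.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("ref and alt must each be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical; not a substitution")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def is_common_biallelic(allele_counts: dict[str, int], min_maf: float = 0.01) -> bool:
    """True if exactly two alleles are observed and the minor allele
    frequency exceeds min_maf (1% by default, as in the text)."""
    observed = {allele: n for allele, n in allele_counts.items() if n > 0}
    if len(observed) != 2:
        return False
    total = sum(observed.values())
    return min(observed.values()) / total > min_maf


if __name__ == "__main__":
    print(classify_substitution("A", "G"))
    print(classify_substitution("A", "C"))
    # Hypothetical allele counts from 1,000 sampled chromosomes.
    print(is_common_biallelic({"A": 950, "G": 50}))
```

The frequency check is deliberately simple: it treats the less frequent of the two observed alleles as the minor allele and does not model sampling error.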
Human genes correlated with various multifaceted disorders have been found using candidate gene studies or GWAS, as highlighted in Table 1 [14]. Across the entire genome, variants occur at a rate of about one per thousand base pairs (roughly one every 300-2,000 bp), while the majority of significant SNPs (about 50,000) occur in coding regions [12]. The human genome has an estimated 84.7 million polymorphisms [15]. In one study, around 50% of variants occurred in non-coding regions, 25% resulted in missense variants, and the remainder were silent mutations that may not affect the translated protein [16].
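As a rough consistency check, the figure of 3 to 5 million SNPs per person can be converted into an average spacing between variants and compared with the range of one variant every 300-2,000 base pairs. The genome length used here (about 3.1 billion base pairs for one haploid copy) is an assumption introduced for the calculation and is not taken from the cited sources.

```python
# Assumption: approximate length of one haploid copy of the human genome.
GENOME_LENGTH_BP = 3.1e9


def mean_spacing_bp(variant_count: float, genome_length_bp: float = GENOME_LENGTH_BP) -> float:
    """Average number of base pairs between variants, assuming they are
    spread evenly along the genome (a simplification)."""
    if variant_count <= 0:
        raise ValueError("variant_count must be positive")
    return genome_length_bp / variant_count


if __name__ == "__main__":
    closest = mean_spacing_bp(5e6)   # 5 million SNPs per person
    farthest = mean_spacing_bp(3e6)  # 3 million SNPs per person
    print(f"about one SNP every {closest:.0f} to {farthest:.0f} bp")
```

Under this assumption the per-person counts correspond to roughly one SNP every 600 to 1,000 bp, which sits inside the quoted range. Real variant density is uneven across the genome, so the even-spacing figure is only an average.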
Figure 1. (A) A diagram depicting a single nucleotide polymorphism (SNP). (B) The specific categorization of variations.
Among these variants, there are primarily two types of SNPs: synonymous SNPs (sSNPs) and non-synonymous SNPs (nsSNPs). The precise categorization of variants is illustrated in Figure 1B [2]. A large number of variations from the Human Genome Project (HGP) are currently accessible. In the last 5 years, over 500 GWAS have revealed thousands of functionally characterized DNA variants associated with more than 300 diseases and traits [17]. Prior studies showed that nsSNPs account for about half of the alterations correlated with many genetic disorders, including inflammatory and autoimmune disorders [17].
Table 1. A set of genes attributed to the probability of developing various multifaceted human disorders.
nsSNPs, notably missense SNPs, have gained attention as potential variables in association research on complex diseases and phenotypic characteristics. SNP technology is often used to detect genes associated with diseases and to examine inter-individual differences in therapy responsiveness [16]. The completion of the Human Genome Project has led to renewed interest in the biological sciences. It offers techniques for understanding the biological basis of genetic variation, the most frequent inherited traits, evolution, and intricate and common diseases such as obesity, hypertension, diabetes, and schizophrenia, as well as for producing gene-based therapeutic medicines [18]. Changes in genomic sequences are thought to be a factor in the evolution of the genome. By categorizing genetic variants across various groups and species, it may be possible to improve genome-based research on human risk for certain diseases, generate more effective and safer food and drugs, and comprehend how evolution works. Although researchers acknowledge that SNP technology will have an effect on medicine, it must first overcome a variety of limitations [16, 19].
Figure 2. Functional representation of SNPs.
Functional categorization of polymorphisms
Functional variants are categorized based on the region to which they map. Table 2 shows regulatory SNPs (rSNPs) and microRNA regulatory SNPs (miR-rSNPs), the types of functional SNPs located in coding and non-coding genes, respectively; both variants alter gene regulation [20, 21]. Functional changes found in the precursor mRNA (pre-mRNA) and mature mRNA are referred to as structural RNA SNPs (srSNPs), whilst microRNA srSNPs are termed miR-srSNPs. srSNPs regulate mRNA translation, splicing, structure and stability, protein maturation and function, and miRNA-mRNA interaction [20, 22]. miR-srSNPs affect the processing and splicing of primary miRNA and pre-miRNA transcripts, as well as miRNA-mRNA binding and function [23]. In contrast, nsSNPs are classified as nonsense or missense, as illustrated in Figure 2; the former results in a stop codon and a premature end to the amino acid chain, whereas the latter results in an amino acid substitution in the protein. Both have a significant impact, but the latter may not be as severe if the substituted amino acid has similar chemical properties and biological function. Both forms of variants have an impact on protein sequence, conformation, and activity [14].
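The distinction between synonymous, missense, and nonsense changes follows directly from the genetic code, and a minimal sketch can show how a single-base change in a codon is assigned to one of these classes. The function name and the example codons are hypothetical; the sketch assumes the standard nuclear genetic code, a codon written on the coding strand, and a reference codon that is not itself a stop codon.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (third base varying fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(ref_codon: str, position: int, alt_base: str) -> str:
    """Classify a single-base change inside a codon.

    position is 0, 1, or 2 (index within the codon).
    Returns 'synonymous', 'missense', or 'nonsense'.
    """
    ref_codon, alt_base = ref_codon.upper(), alt_base.upper()
    if ref_codon not in CODON_TABLE:
        raise ValueError("ref_codon must be three DNA bases")
    if position not in (0, 1, 2) or alt_base not in BASES:
        raise ValueError("position must be 0-2 and alt_base one of A, C, G, T")
    alt_codon = ref_codon[:position] + alt_base + ref_codon[position + 1:]
    if alt_codon == ref_codon:
        raise ValueError("alt_base equals the reference base")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == "*":
        # Stop-codon changes are outside the three classes discussed here.
        raise ValueError("reference codon is a stop codon")
    if alt_aa == ref_aa:
        return "synonymous"  # same amino acid encoded
    if alt_aa == "*":
        return "nonsense"    # premature stop codon
    return "missense"        # different amino acid encoded


if __name__ == "__main__":
    print(classify_coding_snp("CTT", 2, "C"))  # Leu codon changed to another Leu codon
    print(classify_coding_snp("CGG", 0, "G"))  # Arg codon changed to a Gly codon
    print(classify_coding_snp("TGG", 2, "A"))  # Trp codon changed to a stop codon
```

A real annotation pipeline would also need transcript coordinates, strand, and reading frame; this sketch starts from a codon that is already known.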
Table 2. Functional characterization of SNPs
Significance of SNPs in Genome Detection
Many complex traits exhibit quantitative variation, and a locus controlling such a trait is known as a quantitative trait locus (QTL) [24]. Identifying all the DNA loci that underlie complex trait variation is difficult without a genetic map built with genetic markers. With the advent of different methods for sequence variation detection and genotyping, studies utilizing SNPs have become more effective. Polymorphism detection can be divided into screening genetic sequences for unknown mutations and assessing (genotyping) people for identified variants. The pursuit of novel mutations can be divided into two approaches: global (random) and localized (targeted) methods [25]. Furthermore, SNPs must be prescreened before sequence determination; approaches for prescreening SNPs include single-strand structural variation, denatured dynamics, chemical cleavage, enzyme splitting, array hybridization, repair of mismatches, and bacteriophage Mu DNA transposition [16, 26-30]. The fundamental disadvantage of these approaches is the demand for prior sequence knowledge, which implies that the region used for primer design should be simple. Moreover, these approaches are limited to highly similar regions where the sequence is obtained through homologous cloning. Methods for discovering variants at random across the genome, such as representation shotgun sequencing, primer ligation-mediated PCR, and degenerate oligonucleotide-primed PCR, were therefore developed [31, 32]. Despite the benefit of having fewer duplicates, such approaches still require sequencing millions of clones to provide evidence suitable for mutation research, and their efficiency has been substantially lower than anticipated. As such, there remains significant room for improvement in the tools, precision, sensitivity, and effectiveness of SNP identification.
Autoimmune disease-associated genes
Immune-related diseases, including celiac disease, inflammatory bowel disease, heart disease, arthritis, diabetes, and multiple sclerosis, are a variety of biologically complex disorders that result from immune malfunction and share basic pathogenic processes. These processes remain unclear, although the diseases are commonly believed to develop as a result of an imbalanced interaction between environmental and genetic variables [33, 34]. They exhibit abnormalities of several significant regulatory processes, and methods such as GWAS and next-generation sequencing (NGS) have greatly enhanced our awareness of the biological factors involved [35]. Over decades, GWAS have identified a number of susceptibility alleles, both shared and disease-specific. In addition, the Immunochip platform analyzed 200,000 genetic variations across 186 autoimmune disease-associated loci, revealing shared susceptibility loci for several of the aforementioned diseases [36, 37]. Autoimmune disorders arise when the body's defense system is misdirected towards the host [38]; they include organ-specific disorders in which antibodies and T cells react to self-antigens [39]. Aberrant immune responses, including delayed immune reactions, are involved in the onset and progression of these diseases. Clinical studies have shown that autoimmune disorders affect around 5-8% of the world's population and have limited therapies. Over 80 immune-related diseases have been described, and they are seen as a serious global socioeconomic issue since they cause patients severe suffering and reduce their quality of life [38, 39]. Although the etiology of many such diseases is unknown, the discovery of the complex pathogenetic, inherited, and environmental factors involved could contribute to the emergence of novel therapies, earlier screening, and a better understanding of the autoimmune processes that underlie these disorders [40, 41].
New research shows that genomic data may assist in comprehending the etiology of immune-mediated diseases and reveal genetic predisposition by identifying the genes and particular processes responsible for a disease. The HLA genes, for example, were identified as the earliest and most potent risk genes associated with autoimmunity, and they help predict the severity of a disorder and its response to biological drugs [42]. In this review, we highlight the noteworthy outcomes of GWAS in the common diseases discussed below. We explore the potential relevance of these breakthroughs to disease forecasting and basic biology, and consider how many currently known genetic variants will probably be found to influence disease vulnerability.
Type 2 Diabetes Mellitus (T2DM)
Diabetes is a significant chronic condition that endangers human life. In 2017, there were an estimated 451 million people with diabetes globally [43]. According to the International Diabetes Federation (IDF), around 415 million people have diabetes, with T2DM accounting for 90% of these cases [44]. Globally, diabetes has been correlated with the CDKAL1, PRKAA2, ABCA1, FADS, HLA-B, TCF7L2, IGF2BP2, and EXT2 genes. Currently, 20 SNPs, including those in SLC30A8 (rs13266634), CDKN2A/2B (rs10811661), HHEX (rs1111875), and TCF7L2 (rs7903146), play major roles in the risk of T2DM in the European population [45-47]. The Human Leukocyte Antigen class B (HLA-B) gene was related to T2DM in one study cohort, and the data reveal a substantial link between SNP rs2308655 and T2DM in the Pashtun tribal population [48]. Other HLA alleles, mainly rs1051488, rs1131500, rs1050341, and rs1131285, were also identified. One study found that rs560887 in the G6PC2 gene was linked with fasting blood glucose (FBG) but not with T2DM risk, whereas multiple studies stated that variations in the G6PC2 gene were linked to T2DM susceptibility. Another study found that G6PC2 (rs16856187) and GCKR haplotypes were related to T2DM vulnerability [49], and the rs492594 and rs2232328 variants were associated with diabetes in Arab populations [50, 51]. The glucokinase (GCK) SNP rs1799831 was linked with gestational diabetes mellitus (GDM) in the Indian population. T2DM was related to rs1276891 and 44,184,184 in the 3'UTR of the GCK gene in the Indian population [52]. Also, the GCKR rs780094 and rs1260326 alleles have been linked to a lower risk of T2DM and obesity in the Han Chinese group. Fasting hyperglycemia was found to be shaped by a gene-gene interaction between rs780094 and rs1799884 [53]. Sadeghi et al. (2021) found a correlation between rs28514894 and rs2303044 in the NR1H2 gene and the chance of developing T2DM [54].
Mutations in the 3'UTR of SLC30A8 (rs2466293 and rs2466294) were shown to increase susceptibility to T2DM in the Iranian population by influencing the binding sites of certain miRNAs and decreasing the stability of SLC30A8 mRNA transcripts [55]. Early detection of such genetic markers would allow health professionals to provide adequate guidance and therapy to patients, with frequent follow-up and modification of lifestyle risk factors and daily meal intake to protect them from developing T2DM at an early stage [56].
Multiple sclerosis (MS)
Multiple sclerosis is a chronic, degenerative disorder of the central nervous system characterized by inflammation and persistent neuronal degeneration. The HLA class II, ApoE, IL-1ra, IL-1beta, TNF-alpha, TNF-beta, and CCR5 genes have received tremendous research interest for their disease-modifying implications in MS [57]. Currently, two polymorphisms in the IL2RA gene (rs2104286 and rs12722489), variants in SOCS-1 (rs243324), IL-16 (rs4072111 C/T), TNF-alpha -308 G/A (rs1800629), and IL-18 -607 C/A (rs1946518), and the rs352162 and rs187084 markers are regarded as potential risk factors for MS [58]. The SIRT1 gene, which has a part in DNA repair, mitochondrial formation, and cell death, was studied in patients with vision impairment and MS for its correlation with the rs3818292, rs3758391, and rs7895833 substitutions. The potential association of CD58 variants, particularly rs2300747, rs12044852, and rs1335532, with MS among the Malay inhabitants of Malaysia has also been studied [59]. An analysis of six variants in the MIR137HG (rs1625579), GAS5 (rs2067079), MIR3142HG (rs57095329), MIR146A (rs2910164), MIR155HG (rs767649), and IRAK1 (rs3027898) genes found an association with MS. A new study linked the variant rs12959006 in the myelin basic protein (MBP) gene to a greater probability of relapse and an adverse outcome [60].
Migraine
Migraine is a common neurovascular disorder with a complex etiology and an estimated heritability of 40-57%; migraine without aura (MO) and migraine with aura (MA) are the two most common subtypes [61]. Polymorphisms in the ACE, DBH, TRPM8, COMT, GABRQ, CALCA, TRPV1, and other genes were identified as affecting migraine susceptibility [62]. Modern GWAS research identified four variants at chromosomal locations 8q22.1, 2q37.4, 12q13.3, and 1p36.32 that have a strong association with migraine subtypes [63]. The most recent GWAS for migraine found 38 loci, and 46 SNPs at these loci (most commonly rs12135062, rs10166942, rs11031122, rs11172113, and rs17857135) were examined for a strong association with migraine. In the Women's Genome Health Study (WGHS), rs11031122 was identified as a contributing variant by examining its link with MA and MO in meta-analyses [64, 65]. The first meta-analysis, conducted by Anttila et al., included 29 research studies of 23,285 migraine sufferers and 95,425 controls [66]. The associated loci fall into five groups: 1) glutamatergic nerve signaling, with MTDH (rs1835740), LRP1 (rs11172113), and MEF2D (rs3790455); 2) synaptic growth and neural plasticity, with ASTN2 (rs6478241) and FHL5 (rs1320832); 3) pain sensing, with TRPM8 (rs10166942); 4) metalloproteinases, with MMP16 (rs10504861), AJAP1 (rs10915437), and TSPAN2 (rs12134493); and 5) circulation and metabolism, with C7orf10 (rs4379368), PRDM16 (rs2651899), PHACTR1 (rs9349379), and TGFBR2 (rs7640543) [67-69]. The rs1835740 SNP is located at 8q22.1, in the region of the MTDH and PGCP genes, which are related to glutamate metabolism; increased glutamate release or reduced uptake is thought to raise the probability of migraine attacks [70]. The MTHFR genetic variants rs1801133 (C677T) and rs1801131 (A1298C) are risk factors for migraine susceptibility and are primarily linked to high homocysteine concentrations.
The association of the rs7590387 and rs3754701 substitutions near the RAMP1 gene with migraine risk has been assessed in Asian and Caucasian individuals [71].
Parkinson's disease (PD)
Several researchers have found that multiple genes affect PD vulnerability, although the pathology was long believed to be solely environmental. Several polymorphisms were investigated in the Australian Parkinson's Disease Registration (APDR) cohort, such as CD14 (rs2569190), MUC1 (rs4072037), MUC2 (rs11825977), CLDN2 (rs12008279 and rs12014762), and CLDN4 (rs8629). The PGLYRP4 (rs10888557) genotype significantly increased the risk of PD in each group. The PGLYRP2 (rs892145), TLR1 (rs4833095) or TLR2 (rs3804099), and MUC2 (rs11825977) variations substantially increased PD susceptibility in the APDR cohort. Five previously unknown loci (ACMSD, STK39, MCCC1/LAMP3, SYT11, and CCDC62/HIP1R) were found, and six already known loci (MAPT, SNCA, HLA-DRB5, BST1, GAK, and LRRK2) were confirmed [72-77]. Another five PD-associated loci, PARK16, STX1B, FGF20, STBD1, and GPNMB, were identified by the International Parkinson's Disease Genomics Consortium (IPDGC) [63]. The most detailed meta-analyses included results for 7 million SNPs from GWAS and smaller-scale PD association studies. Future research must focus on identifying functional variants and comprehending the molecular implications of each risk locus.
Alzheimer's disease (AD)
By 2050, there will probably be 152 million cases of dementia worldwide, with Alzheimer's disease (AD) being the most prevalent form; it is regarded as the most frequent cause of cognitive impairment and contributes to 70% of dementia cases [78]. The minor allele at rs2455069 in CD33 changes the amino acid at position 69 from arginine to glycine (R69G). The CD33 SNP rs2455069 is strongly linked with dementia in Italy, and the R69G amino acid change modifies the structure of CD33 and affects its affinity for sialic acid residues [79]. Moreover, the ABCA7 variants rs3764650 and rs4147929 were found to be associated with AD. Up to 20 risk genes have been identified and described by prior GWAS investigations, including ABCA7, BIN1, CLU, CR1, PICALM, SORL1, and others [80]. New studies have revealed that mutations in GRN, TMEM106B, Complement C7, and RBFOX1 are linked to AD in different populations [81]. The rs3810950 polymorphism may have a small but significant effect on the risk of AD in the Czech population, as demonstrated by the link between the choline acetyltransferase (ChAT) genotype and a 1.25 times greater risk of AD [82]. Furthermore, the correlation between the APOC1 rs11568822 SNP and the probability of AD has been discussed considerably in many studies [83].
FUTURE PERSPECTIVE & CHALLENGES
The research described above gives a summary of significant variants related to autoimmune and multifaceted genetic disorders. Despite significant collaborative efforts, many disorders still cannot be explained genetically, and the heritability of complex diseases is only partially explained by GWAS. Risk gene prediction may be improved by more research, including novel approaches for finding rare variants and structural variants using DNA sequencing. To efficiently exploit GWAS to improve knowledge about cancer physiology and discover potential targets for clinical and therapeutic methods, it is necessary to ascertain the functional implications of the data. For neurodegeneration, heart disease, or immunological diseases, these results may point to a variety of inherited and mechanistic factors. Many researchers are exploring new filtering techniques, effective computational methods, and processing methods for studying SNP interactions. Although the collection of metagenomic data for massive cohorts has risen greatly, it is still difficult to analyze the microbiome in relation to host DNA. The challenges include the effects of environmental and external variables on the gut microbiome, which may obscure the effects of genetic variations; the intricate organization of microbiome data, which makes analysis tough; and the substantial number of tests performed, which necessitates the collection of large cohorts to address the issue of multiple testing. In future decades, we believe that applying the findings of GWAS will increase the probability of effective biological and therapeutic discoveries in neurology. High dimensionality, computing constraints, the absence of marginal effects, missing heritability, and genetic heterogeneity are among the limitations.
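The burden created by a large number of tests can be illustrated with the Bonferroni correction, one common and conservative way to control false positives; it is shown here as an example and is not drawn from the studies cited above. The SNP labels and p-values are hypothetical, and the figure of one million independent tests is an assumption often used to motivate the conventional genome-wide threshold of about 5 × 10^-8.

```python
from typing import Optional


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise
    error rate at or below alpha across n_tests tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def significant_hits(
    p_values: dict[str, float],
    alpha: float = 0.05,
    n_tests: Optional[int] = None,
) -> list[str]:
    """Return the labels whose p-value falls below the Bonferroni threshold.

    n_tests defaults to the number of p-values supplied; pass a larger
    value when the supplied p-values are a subset of all tests performed.
    """
    m = n_tests if n_tests is not None else len(p_values)
    threshold = bonferroni_threshold(alpha, m)
    return sorted(label for label, p in p_values.items() if p < threshold)


if __name__ == "__main__":
    # Hypothetical association p-values for three variants.
    p_values = {"snp_a": 1e-9, "snp_b": 1e-6, "snp_c": 0.03}
    print(bonferroni_threshold(0.05, 1_000_000))
    print(significant_hits(p_values, n_tests=1_000_000))
```

Because the threshold shrinks in proportion to the number of tests, only very strong signals survive at genome scale, which is one reason large cohorts are needed to detect variants with modest effects.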
Figure 3. Taxonomy of mutations according to a sequenced concept in the pairing and non-paired approach [84].
Several machine learning and data mining techniques have been used to address such issues. Research and development of economic or breeding traits related to these genes, as well as the invention of genetic breeding methods using DNA loci, are all significant to genetic engineering. In the era of next-generation sequencing (NGS) and precision medicine, whole-genome and transcriptome sequencing methods are projected to grow considerably less laborious and more cost-effective. Multiple global collaborations, such as ENCODE (Encyclopedia of DNA Elements) and HapMap (Haplotype Map), have been established to map the DNA variations and regulatory elements of the human genome [84]. To distinguish between mutation and polymorphism in genomic variation, we propose a clear distinction between an SNP and a somatic mutation, as shown in Figure 3. It is challenging to establish effective method-based strategies for predicting the risk of infectious diseases. Researchers can use powerful computational methods to analyze high-throughput sequencing data. Thus far, new genetic technology has allowed the merging of molecular genetics data with the qualitative genetic research formerly available in the discipline. Some processes, such as cytosine methylation and histone modification, now help to explain how certain active SNPs function under particular environmental conditions. Other approaches, such as DNA editing with engineered nucleases (CRISPR-Cas9, CRISPR/ScCas9, and CRISPR-Cas12) [85-87], may be used later on to modify genes using new targeted methods. The significance of gene mapping strategies has risen as technology and analytic tools have advanced. Such investigations complement genetic methods and are comparable to human genome sequencing because they allow the production of a vast store of 'functional' data that can be systematically explored to find genes encoding proteins that will likely have a key part in the progression of diseases.
CONCLUSION
Genetic mapping of mutant genes in diseases with more complex genetic aetiologies, as well as of modifier genes, has become more feasible. In the future, many genetic variants that are currently known will probably be found to affect the severity of immune-mediated disorders in both common and rare circumstances. Such understanding will allow for more accurate prediction of disease outcomes in individuals, as well as more targeted preventive measures. A systematic analysis of known genetic variants with no currently defined functional effects will almost certainly lead to the discovery of a subgroup that influences drug response. It is feasible to pinpoint genetic characteristics that increase the chance of adverse responses, as well as factors affecting the effectiveness and appropriate use of medication. These studies reinforce biological methods and are comparable to genome sequencing, as they enable the production of a massive store of functional data that can be systematically mined to find genes encoding proteins that are anticipated to play a critical role in disease progression.
ACKNOWLEDGMENTS
We thank all the authors who have contributed to this review paper.
AUTHOR CONTRIBUTION
All authors discussed the results and contributed to the final manuscript.
FUNDING STATEMENT
No specific funding for this research was received from any public, commercial, or not-for-profit sectors.
CONFLICT OF INTEREST
No conflict of interest
PARTICIPANT CONSENT AND ETHICAL APPROVAL
Not applicable.
RIGHTS FOR PUBLICATION
The work has been authorized for submission by all authors.
COMPETING INTERESTS
The authors disclose that they have no conflicts of interest.
REFERENCES
Gosavi, G., et al., Applications of CRISPR technology in studying plant-pathogen interactions: overview and perspective. Phytopathology Research, 2020. 2(1): p. 1-9.
Kainat Ramzan, Ali Noman, Saira Ramzan, Ali Haider Ali, Ayesha Waheed, Muhammad Bilal, Usama Tahir, Moeen Zulfiqar, Significance Of Genetic Polymorphisms In Autoimmune Diseases: A Comprehensive Review, Int. J. of Pharm. Sci., 2024, Vol 2, Issue 9, 399-412. https://doi.org/10.5281/zenodo.13731724