Research Article
Computational Analysis of Mycobacterium bovis Protein Sequence
Dauda A1*, Abbaya HY2 and Ndirpaya AD3
Corresponding Author: A Dauda, Department of Animal Science, University of Calabar, P.M.B. 1115 Calabar, Nigeria
Received: October 16, 2019; Accepted: November 11, 2019; Published: June 28, 2020
Citation: Dauda A, Abbaya HY & Ndirpaya AD. (2020) Computational Analysis of Mycobacterium bovis Protein Sequence. J Vet Marine Sci, 2(2): 88-93.
Copyrights: ©2020 Dauda A, Abbaya HY & Ndirpaya AD. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

A total of ten (10) Mycobacterium bovis (MBB) proteins were retrieved from GenBank. The GenBank accession numbers and the sequence variation of the proteins were used to investigate the molecular identity of the MBB proteins. The physico-chemical properties of the MBB proteins were computed using the ExPASy ProtParam tool. The amino acid sequences of the MBB proteins were subjected to secondary structure prediction using ExPASy’s SOPMA tool. The three-dimensional (3D) structure of the MBB proteins was predicted using the Phyre2 server and then validated using the VADAR server and a Ramachandran plot. Isoelectric point (pI), molecular weight (MW), extinction coefficient (EC), instability index (II), aliphatic index (AI) and grand average of hydropathicity (GRAVY) were computed. Based on their pI, some of the MBB proteins were acidic while others were basic. The EC of all the proteins was appreciable. The II of the majority of the MBB proteins was <40. The AI of some of the MBB proteins was >100 while that of others was <100. The GRAVY values of the MBB proteins were positive for some and negative for others. The amino acid composition of the MBB proteins indicates that they are rich in aliphatic amino acids. Some of the MBB proteins are predominantly alpha helix while others are predominantly random coil. The Ramachandran plot validated the 3D structure of the MBB protein. The present protein information may aid future research in mutagenesis and pharmacogenetics of cattle, especially in a developing economy such as Nigeria.


Keywords: Protein, Mycobacterium bovis, Sequences, Bioinformatics


Mycobacterium bovis is the classical causative agent of bovine tuberculosis and can also be responsible for human tuberculosis, which makes this bacterium an important zoonotic species. The Mycobacterium tuberculosis complex (MTBC) includes M. tuberculosis (the cause of most human tuberculosis), M. bovis, M. bovis bacillus Calmette-Guérin, M. africanum and M. microti [1]. M. bovis is the main cause of tuberculosis in cattle, deer and other mammals. The human bacillus M. tuberculosis may have evolved from M. bovis in the setting of animal domestication [2]. Human M. bovis infection generally occurs through the consumption of infected cow milk products. The combination of disease tracing and molecular typing is needed to understand the epidemiology of tuberculosis. In silico study of Mycobacterium bovis is significant because it helps to bring new insights into epidemiological questions [3]. Molecular typing has been instrumental in determining the population structure and evolution of pathogens. Since tuberculosis has economic and nutritional consequences, efforts should be intensified towards finding sustainable genomic solutions to this deadly disease, which continues to ravage the livestock industry [3]. New typing tools such as computational approaches may help to improve the surveillance and control of the disease, as well as to trace new epidemics. The aim of this study is to carry out an in silico analysis of the protein sequences of Mycobacterium bovis.


A total of ten (10) Mycobacterium bovis (MBB) protein sequences were retrieved from GenBank (NCBI). The GenBank accession numbers of the sequences are AAD09878, 3I7J, BAA05497, BAA05496, CAA44268, ANG90842, ANG90839, ANG90831, ANG90823 and ANG90818. The ExPASy ProtParam tool was used to compute various physical and chemical properties of the MBB proteins from their amino acid sequences. The computed parameters were isoelectric point (pI), molecular weight (MW), extinction coefficient (EC), instability index (II), aliphatic index (AI) and grand average of hydropathicity (GRAVY) [4].

The amino acid sequences of the MBB proteins were subjected to secondary structure prediction using ExPASy’s SOPMA tool, which is an improved SOPM method. It correctly predicts 69.5% of amino acids for a three-state description of the secondary structure (α-helix, β-sheet and coil). The Phyre2 server was used to predict the 3D structure of the MBB proteins. This server predicts the three-dimensional structure of a protein sequence using the principles and techniques of homology modeling [5]. Currently, the most powerful and accurate methods for detecting and aligning remotely related sequences rely on profiles or Hidden Markov Models (HMMs). The 3DLigandSite server, to which Phyre2 is coupled for protein binding site prediction, was used to predict the binding sites of the 3D structures of the MBB proteins [6].

The 3D structure was then validated using the VADAR server and a Ramachandran plot. VADAR (Volume, Area, Dihedral Angle Reporter) is a compilation of more than 15 different algorithms and programs for analyzing and assessing peptide and protein structures from their PDB coordinate data. The results have been validated through extensive comparison to published data and careful visual inspection. The VADAR web server supports the submission of either PDB-formatted files or PDB accession numbers. VADAR produces extensive tables and high-quality graphs for quantitatively and qualitatively assessing protein structures determined by X-ray crystallography, NMR spectroscopy, 3D threading or homology modeling [7].
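Two of the ProtParam parameters listed above, GRAVY and the aliphatic index, follow directly from the amino acid sequence, and the sketch below illustrates how they can be computed. It is a minimal illustration, not the ProtParam implementation used in this study: it assumes the Kyte-Doolittle hydropathy scale for GRAVY and Ikai's formula AI = X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)], where X is the mole percentage of each residue. The function names and the toy peptide are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Mean Kyte-Doolittle hydropathy per residue.

    Raises KeyError if the sequence contains a non-standard residue code.
    """
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index from the mole percentages of Ala, Val, Ile and Leu."""
    seq = sequence.upper()
    n = len(seq)
    mole_pct = {aa: 100.0 * seq.count(aa) / n for aa in "AVIL"}
    return (mole_pct["A"]
            + 2.9 * mole_pct["V"]
            + 3.9 * (mole_pct["I"] + mole_pct["L"]))


if __name__ == "__main__":
    toy_peptide = "MAVLGAIKDE"  # hypothetical peptide, not an MBB sequence
    print("GRAVY:", round(gravy(toy_peptide), 3))
    print("Aliphatic index:", round(aliphatic_index(toy_peptide), 1))
```

A positive GRAVY value indicates a hydrophobic protein on average and a negative value a hydrophilic one; a larger aliphatic index reflects a larger relative volume occupied by aliphatic side chains.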


The physical and chemical properties of the MBB proteins are shown in Table 1. The MBB proteins varied in their number of amino acids. The higher the number of amino acids, the higher the molecular weight of the protein. The isoelectric point (pI) indicates that some proteins were acidic, with pI values less than 7, while others were basic, with pI values greater than 7. The net charges of some proteins were negative while those of others were positive. The EC values ranged from 30940 to 147140 M⁻¹ cm⁻¹. The estimated half-life of all the proteins was 30 h, except for the protein with accession number 3I7J, for which it was not determined due to the presence of an N-terminal ambiguity. Only three proteins, with accession numbers BAA05496, CAA44268 and ANG90831, had instability index (II) values greater than 40. The AI values of AAD09878, ANG90842 and ANG90831 were greater than 100. The GRAVY values were negative for some MBB proteins and positive for others. The Ramachandran plot (Figure 1) showed that the protein favored glycine, an aliphatic amino acid.
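A Ramachandran plot places each residue at its backbone dihedral angles, phi (C of the previous residue, N, CA, C) and psi (N, CA, C, N of the next residue). As a hedged illustration of the underlying calculation, the sketch below computes a dihedral angle from four atomic coordinates in plain Python. It is not the VADAR implementation; the helper names and the test coordinates are hypothetical.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four points.

    For phi pass (C[i-1], N[i], CA[i], C[i]);
    for psi pass (N[i], CA[i], C[i], N[i+1]).
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Hypothetical coordinates forming a right-angle twist about the z axis.
    print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

In practice the four points would be read from the ATOM records of the PDB file of the predicted model, and the (phi, psi) pair of each residue would then be compared with the allowed regions of the plot.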

The amino acid composition of the MBB proteins is shown in Table 2. All the proteins have a high percentage composition of alanine, glycine and valine, which belong to the aliphatic amino acid group. All the proteins have zero percent composition of selenocysteine and pyrrolysine, which are encoded by codons that otherwise act as stop codons.
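The percentage composition reported above is a simple residue count divided by the sequence length. The sketch below illustrates this under stated assumptions: the aliphatic group is taken as alanine, glycine, valine, leucine and isoleucine, following the grouping used in this article, and U and O are the standard one-letter codes for selenocysteine and pyrrolysine. The function names and the toy sequence are hypothetical.

```python
from collections import Counter

# Aliphatic group as used in this article: Ala, Gly, Val, Leu, Ile.
ALIPHATIC = set("AGVLI")


def composition_percent(sequence: str) -> dict:
    """Percentage of each residue type present in the sequence."""
    seq = sequence.upper()
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def aliphatic_percent(sequence: str) -> float:
    """Combined percentage of the aliphatic residues."""
    comp = composition_percent(sequence)
    return sum(pct for aa, pct in comp.items() if aa in ALIPHATIC)


if __name__ == "__main__":
    toy = "AAGVKD"  # hypothetical peptide, not an MBB sequence
    comp = composition_percent(toy)
    print(comp)
    print("Aliphatic %:", round(aliphatic_percent(toy), 2))
    # Selenocysteine (U) and pyrrolysine (O) are absent, so they count as 0%.
    print("Sec %:", comp.get("U", 0.0), "Pyl %:", comp.get("O", 0.0))
```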

The secondary structure prediction of the MBB proteins is presented in Table 3. The proteins with accession numbers AAD09878, ANG90842, ANG90831 and ANG90823 had a high percentage of alpha-helix structure, while the rest of the proteins had a high percentage of random-coil structure.


The computed isoelectric points (pI) of the MBB proteins will be useful for developing buffer systems for purification by isoelectric focusing. The isoelectric point is significant in protein purification because it is the pH at which solubility is minimal and at which mobility in an electrofocusing system is zero, and therefore the point at which the protein will accumulate [8]. The extinction coefficient of a protein at 280 nm depends almost exclusively on the number of aromatic residues, particularly tryptophan [9]. This indicates that the higher the EC value of an MBB protein, the higher its number of aromatic residues [10,11]. Hydrophobic amino acids, in particular, can be involved in the binding or recognition of hydrophobic ligands such as lipids [12]. None of the MBB proteins contains selenocysteine or pyrrolysine, whose codons are otherwise interpreted as stop codons [13].
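The dependence of the extinction coefficient on aromatic residues can be made concrete with a short sketch. It assumes the commonly used molar coefficients at 280 nm of 5500 for tryptophan, 1490 for tyrosine and 125 for each cystine (disulfide bond), in M⁻¹ cm⁻¹; these values are an assumption of this example and are not taken from the present study. The function name and the toy sequence are hypothetical.

```python
# Assumed molar extinction coefficients at 280 nm (M^-1 cm^-1).
EPS_TRP = 5500
EPS_TYR = 1490
EPS_CYSTINE = 125


def extinction_coefficient_280(sequence: str, cystines: int = 0) -> int:
    """Estimate the molar extinction coefficient at 280 nm.

    `cystines` is the number of disulfide bonds; the default of 0
    corresponds to all cysteine residues being reduced.
    """
    seq = sequence.upper()
    return (EPS_TRP * seq.count("W")
            + EPS_TYR * seq.count("Y")
            + EPS_CYSTINE * cystines)


if __name__ == "__main__":
    toy = "MWWYCAC"  # hypothetical peptide, not an MBB sequence
    print(extinction_coefficient_280(toy))              # cysteines reduced
    print(extinction_coefficient_280(toy, cystines=1))  # one disulfide bond
```

Because each tryptophan contributes far more than each tyrosine or cystine, a sequence with no tryptophan or tyrosine gives an estimate close to zero, which is why such estimates are treated with caution for proteins poor in aromatic residues.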

Protein structure prediction from amino acid sequence is a fundamental scientific problem and it is regarded as a grand challenge in computational biology and chemistry. Given an amino acid sequence (i.e., the primary structure) which represents a monomeric globular protein in aqueous solution and at physiological temperatures, it is necessary to determine all helical segments and all beta-strands, all pairs of beta-strands which form beta-sheets (i.e., the beta-sheet topology), all disulfide bridges if cysteines are present, all loops that connect secondary structure elements, and the three-dimensional folded protein structure [14]. The accuracy of protein structure prediction depends critically on sequence similarity between the query and template as observed in the present study. If a template is detected with >30% sequence identity to the query, then usually most or all of the alignment will be accurate and the resulting relative positions of structural elements in the model will be reliable [5]. The practical applications of protein structure prediction are many and varied, including guiding the development of functional hypotheses about hypothetical proteins [15], improving phasing signals in crystallography [16], selecting sites for mutagenesis [17] and the rational design of drugs [18]. The present approach can be used to make functional predictions especially for newly discovered sequences [19].
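The 30% sequence-identity guideline mentioned above can be checked on any pairwise alignment of a query and a template. The sketch below is a minimal illustration that assumes the two sequences are already aligned to equal length with "-" as the gap character and that identity is computed over the columns in which neither sequence has a gap; other definitions of the denominator exist, and this is not the calculation performed by Phyre2. The function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_query: str, aligned_template: str) -> float:
    """Percent identity over the gap-free columns of a pairwise alignment."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for q, t in zip(aligned_query.upper(), aligned_template.upper()):
        if q == "-" or t == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if q == t:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Hypothetical alignment, not taken from the MBB proteins.
    identity = percent_identity("MKV-LA", "MRVGLA")
    print(identity, "above 30% guideline:", identity > 30.0)
```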


The study found that all the proteins have high EC values and that some of the proteins have high II and AI values. All the proteins have a high percentage composition of the aliphatic amino acids alanine, glycine, leucine, isoleucine and valine. Some of the proteins also showed a high percentage of alpha-helix structure. Therefore, it may be concluded that Mycobacterium bovis is resistant to mutation and is thermally stable. We therefore recommend research on mutagenesis of this causative agent.

1. Grange JM, Yates MD, De Kantor IN (1996) Guidelines for speciation within the Mycobacterium tuberculosis complex. 2nd Edn, WHO/EMC/ZOO/96.4.

2. Cole ST, Brosch R, Parkhill J (1998) Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature 393: 537.

3. Dauda A, Duwa H, Yaska JA, Egahi JO, Kollo YS (2017) Protein sequence analysis of Newcastle (Paramyxovirus) in poultry. Int J Interdiscip Res Innov 5: 19-24.

4. Gasteiger E (2005) The Proteomics Protocols Handbook. Humana Press, pp: 571-607.

5. Kelley LA, Sternberg MJE (2009) Protein structure prediction on the web: A case study using the Phyre server. Nature Protocols 4: 363-371.

6. Wass MN, Kelley LA, Sternberg MJ (2010) 3D ligand site: Predicting ligand-binding sites using similar structures. Nucleic Acids Res 38: 469-473.

7. Leigh R, Ellis R, Wattie J, Southam DS, De Hoogh M, et al. (2002) Dysfunction and remodeling of the mouse airway persist after resolution of acute allergen-induced airway inflammation. Am J Resp Cell Mol Biol 27: 526-535.

8. Fennema R (2008) Food Chemistry. 3rd Edn. CRC Press, pp: 327-328.

9. Gill SC, Von Hippel PH (1989) Calculation of protein extinction coefficients from amino acid sequence data. Anal Biochem 182: 319-326.

10. Gasteiger E, Gattiker A, Hoogland C, Ivanyi I, Appel RD, et al. (2003) ExPASy - The proteomics server for in-depth protein knowledge and analysis. Nucleic Acids Res 31: 3784-3788.

11. Munduganore DS, Mundaganore YD, Ashokan KV (2012) Sequence analysis of protein in pest des petits. Int J Food Agric Vet Sci.

12. Betts MJ, Russell RB (2003) Amino acid properties and consequences of substitutions. In: Bioinformatics for Geneticists. John Wiley and Sons Ltd., pp: 289-316.

13. Suchanek M, Radzikowska A, Thiele C (2005) Photo-leucine and photo-methionine allow identification of protein-protein interactions in living cells. Nat Methods 2: 261-267.

14. Floudas CA (2007) Computational methods in protein structure prediction. Biotechnol Bioeng 97: 207-213.

15. Watson JD, Baker TA, Bell SP, Gann A, Levine M, et al. (2004) Molecular biology of the gene. 5th Edn. Pearson Benjamin Cummings (Cold Spring Harbor Laboratory Press). ISBN: 08053-4635.

16. Qian BD (2007) High-resolution structure prediction and the crystallographic phase problem. Nature 450: 259-264.

17. Rava P, Hussain MM (2007) Acquisition of triacylglycerol transfer activity by microsomal triglyceride transfer protein during evolution. Biochemistry 46: 12263-12274.

18. Park H, Hwang KY, Oh KH, Kim YH, Lee JY, et al. (2008) Discovery of novel alpha-glucosidase inhibitors based on the virtual screening with the homology-modeled protein structure. Bioorg Med Chem 16: 284-292.

19. Ugbo SB, Yakubu A, Omeje JN, Musa IS, Bibinu BS, et al. (2015) Assessment of genetic relationship and application of computational algorithm to assess functionality of non-synonymous substitutions in DQA2 gene of cattle, sheep and goats. Open J Genet 5: 145-158.