Research Areas
The following provides overviews of some specific research areas, followed by descriptions of selected computational analyses and tools that have been developed or are under development:
Predicting subcellular localization of proteins, particularly cell surface proteins:
There is growing interest in both agriculture and medicine in computationally predicting the subcellular localization of proteins, in order to identify new, potentially accessible (surface-exposed) drug targets and vaccine candidates from genomic sequence data. However, previously only one comprehensive computational tool, named PSORT, was publicly available for predicting bacterial protein subcellular localization, and it needed to be improved and updated. Most notably, computational prediction of bacterial outer membrane proteins (OMPs) has been poor. We are therefore using our knowledge of OMPs and of computational analysis of protein subcellular localization to expand and improve the PSORT algorithm (the original author of the program, Dr. Kenta Nakai, kindly provided us with its source code). We have now improved PSORT's prediction and characterization of OMP-encoding genes, and have made further changes that improve prediction for other proteins. To do this, we built and analyzed a database of proteins of known subcellular localization and implemented new data mining approaches. Further laboratory experiments are also being conducted to determine more accurately which residues are compatible with the transmembrane beta-strands of OMPs; these properties may be incorporated into the prediction algorithm in combination with updated knowledge about protein sorting signals and protein structure. Our improved version of PSORT, named PSORTb (for "bacterial" PSORT; see also Tools below), is now more accurate than high-throughput proteomic methods for characterizing protein subcellular localization in bacteria. However, we believe we can still improve it further (higher recall, or sensitivity) and also expand its ability to make predictions for organisms other than bacteria.
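To make the distinction between precision and recall (sensitivity) concrete, the following is a minimal Python sketch of how the two could be computed per localization site from a set of proteins of known localization. It is an illustration only and not the PSORTb evaluation code: the function name, the protein identifiers and the convention of using `None` for "no prediction made" are all assumptions introduced here.

```python
from collections import Counter


def precision_recall(known, predicted):
    """Per-site precision and recall for localization predictions.

    known:     dict of protein id -> experimentally known localization
    predicted: dict of protein id -> predicted localization, or None when
               the predictor declines to make a call (assumed convention)
    Returns a dict of site -> (precision, recall); a value is None when it
    is undefined (no predictions for, or no known proteins at, that site).
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for protein, truth in known.items():
        guess = predicted.get(protein)
        if guess == truth:
            tp[truth] += 1          # correct call for this site
        else:
            fn[truth] += 1          # true site was missed
            if guess is not None:
                fp[guess] += 1      # wrong site was called

    results = {}
    for site in set(tp) | set(fp) | set(fn):
        called = tp[site] + fp[site]
        actual = tp[site] + fn[site]
        precision = tp[site] / called if called else None
        recall = tp[site] / actual if actual else None
        results[site] = (precision, recall)
    return results


# Hypothetical toy data, for illustration only.
known = {"p1": "outer membrane", "p2": "outer membrane",
         "p3": "cytoplasmic", "p4": "periplasmic"}
predicted = {"p1": "outer membrane", "p2": None,
             "p3": "cytoplasmic", "p4": "cytoplasmic"}

scores = precision_recall(known, predicted)
print(scores["outer membrane"])  # (1.0, 0.5)
```

In this toy example, every outer membrane call is correct (precision 1.0) but one of the two known outer membrane proteins receives no prediction (recall 0.5). A predictor that abstains when unsure can therefore be highly precise while still leaving room to improve recall.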
Our research is now providing new insights into the structure and prevalence of OMPs, revealing patterns in the proportion of proteins at a given subcellular localization, and contributing important structural data about the model proteins used in the laboratory studies. Data incorporated later into the program will facilitate the identification of surface-exposed regions in predicted proteins, and complement laboratory analysis of key surface-exposed bacterial membrane proteins and their structures. If you are interested in being notified about changes in the status of the PSORTb project, please subscribe to the psort-update mailing list by emailing maillist@sfu.ca with "subscribe psort-update" in the subject or body of the message. We also maintain PSORTdb, a database of information about the subcellular localization of proteins. Finally, we maintain psort.org, which acts as a central portal for access to all PSORT programs and related tools for prediction of protein subcellular localization.
Genome Canada Pathogenomics Project - Pathogenomics of Innate Immunity: The Genome Canada/Genome Prairie Pathogenomics Project, named "Functional Pathogenomics of Mucosal Immunity" (FPMI), was initiated in 2002-2003, and the follow-up project, named "Pathogenomics of Innate Immunity", started in 2006. The Brinkman Laboratory has been heading Bioinformatics and Informatics for these projects, which involve researchers at SFU, UBC, U Sask, BC Cancer Agency, Inimex, The Sanger Institute (UK), Trinity University (Ireland), and the National University of Singapore (Singapore). A website profiling the Pathogenomics Project is available at www.pathogenomics.ca. The overall objective of this project is to provide new information about the processes of disease and innate immunity to microbial pathogens. The results of this research will give researchers an increased understanding of how the mucosal surfaces of bovine, chicken, mouse and human hosts respond to the presence of infectious agents, and to the adjuvants, immuno-modulators and vaccines designed to combat these agents. The bioinformatics challenges in such a project are significant, since it involves a large amount of heterogeneous data from a variety of sources. However, these data also provide us with unique opportunities for global bioinformatics analyses and, hopefully, conclusions regarding innate immunity and how we may use this system more efficiently to boost our resistance to infectious agents. In addition to performing customized analysis of host and pathogen gene expression responses for this project, we have also developed some resources to aid microarray analysis, comparative genomics analysis, and innate immunity systems biology: see the ArrayPipe, ProbeLynx, Ortholuge and InnateDB
websites for more details. InnateDB has also been profiled by Cell Host and Microbe.
The BC Pathogenomics Project and other Pathogenomics research: The BC Pathogenomics Project was an interdisciplinary project involving 20 faculty and students from UBC and SFU in the fields of computer science, bioinformatics, microbiology, evolutionary theory and eukaryotic genetics. This project, coordinated by Dr. Brinkman, used informatic approaches to identify pathogen genes that are more similar to host genes than expected, and so may be more likely to interact with, or mimic, their hosts' gene functions. In addition, potential pathogenicity islands (genomic islands) in genomes were identified. Resources initially developed for this project include the BAE-watch database, PhyloBLAST and IslandPath. Genes identified through our analysis include an interesting case of potential horizontal gene transfer between the ancestor of a parasitic protozoan and the ancestor of Pasteurellaceae bacteria, which cause respiratory infections in humans and other animals (see our publication: de Koning, A. et al. (2000) Molecular Biology and Evolution 17:1769-1773), and a case of possible horizontal gene transfer of a demonstrated virulence factor from bacteria to fungi (Brinkman, F.S.L. et al. (2001) Infection and Immunity 69:5207-11). A publication in Genome Research (Brinkman, F.S.L., et al. (2002) Genome Research 12:1159-67) illustrates the use of the BAE-watch database. Work with IslandPath and associated genomic island analysis has included a study showing that genomic islands do disproportionately contain novel genes; further analysis of genomic islands and genomic island predictors is ongoing (for more information, see below).
Insight into the evolution of pathogens: prevalence of known virulence factors and role of horizontal gene transfer: A database of all known virulence factors can be analyzed to determine additional features in genome sequences that could be used to predict potential virulence factors in genomes. This informatics analysis can be complemented with functional studies of particular genes identified as potential virulence factors, and with PCR-based analyses of the prevalence of potential virulence factors in pathogens and their closely related non-pathogenic relatives. By using a population-based or global genomics-based approach to investigate the prevalence of particular known or putative virulence factors in a selected species, we hope to gain insights into the evolution of pathogenicity in that species. Computational methods for detecting potential horizontal gene transfer events are also being further developed, using both sequence composition (dinucleotide bias, etc.) and sequence similarity approaches, as horizontal gene transfer is an important mechanism in the evolution of new pathogens. The role of genomic islands in pathogen evolution is a particular current focus, as is improving the identification and characterization of genomic islands and pathogen-specific genes (see, for example, Langille et al. (2008) BMC Bioinformatics 9:329; Hsiao et al. (2005) PLoS Genetics 1:e62; Hsiao et al. (2003) Bioinformatics 19:418-420; and IslandViewer).
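As a rough illustration of what a sequence composition approach involves, the Python sketch below slides a window along a sequence and reports how far each window's G+C fraction and dinucleotide frequencies deviate from those of the whole sequence. It is a simplified sketch under stated assumptions, not the method implemented in IslandPath or IslandViewer: the function names, the window and step sizes, and the use of a summed absolute difference as the bias measure are all choices made here for illustration.

```python
from collections import Counter


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def dinucleotide_freqs(seq):
    """Relative frequency of each overlapping dinucleotide in a sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()} if total else {}


def dinucleotide_bias(window, genome):
    """Summed absolute difference between window and genome frequencies.

    A simple stand-in for a dinucleotide bias measure (assumption).
    """
    w = dinucleotide_freqs(window)
    g = dinucleotide_freqs(genome)
    return sum(abs(w.get(p, 0.0) - g.get(p, 0.0)) for p in set(w) | set(g))


def scan(genome, window=1000, step=500):
    """Return (start, G+C deviation, dinucleotide bias) for each window."""
    genome_gc = gc_fraction(genome)
    rows = []
    for start in range(0, len(genome) - window + 1, step):
        chunk = genome[start:start + window]
        rows.append((start,
                     gc_fraction(chunk) - genome_gc,
                     dinucleotide_bias(chunk, genome)))
    return rows


# Synthetic toy sequence: an AT-rich background with a GC-rich insert.
toy = "AT" * 50 + "GC" * 50 + "AT" * 50
for start, gc_dev, bias in scan(toy, window=100, step=100):
    print(start, round(gc_dev, 3), round(bias, 3))
```

On the synthetic sequence, the middle window stands out on both measures, which is the kind of signal that flags a region as compositionally atypical. Real analyses typically work on gene clusters rather than fixed windows and combine composition with other evidence, since atypical composition alone does not demonstrate horizontal origin.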
Continually-updated Genome Databases: The Pseudomonas aeruginosa Community Annotation Project (PseudoCAP) and Pseudomonas Genome Database: We are continuing to coordinate PseudoCAP, a community genome annotation project for the Pseudomonas Genome Project that was the first Internet-based community annotation approach developed for a bacterial genome project (see our papers in Nature: Brinkman et al. (2000) Nature 406:933 and Stover et al. (2000) Nature 406:959-964). Now that the genome sequence has been published, the project is in its next stage: we have set up a web-based system and methodology to continually update Pseudomonas aeruginosa genome annotations in the Pseudomonas Genome Database, and to allow the research community to correct them. We are also integrating more information into this website based on our analyses and those of others, using a Gbrowse/Distributed Annotation System (DAS) approach. Multiple genome sequences can now be compared, to aid analysis of P. aeruginosa. See also our data on Pseudomonas aeruginosa outer membrane proteins and the publication by Winsor et al. (2005) Nucleic Acids Research 33:D338-343. We have also recently set up a sister database for another important Cystic Fibrosis pathogen, the Burkholderia Genome Database.
Computational Analyses and Tools: Tools have been developed for the above projects, to aid analysis or to provide a base for further algorithm development. Below are short profiles of some tools developed, or under development.
PSORTb: The initial version of PSORTb was limited to the analysis of Gram-negative bacteria. However, a new version has been developed that can be used to investigate the subcellular localization of proteins from both Gram-negative and Gram-positive bacteria. This resource, plus PSORTdb mentioned below, is available from our psort.org portal for prediction of protein subcellular localization. See our list of publications for more information.
PSORTdb: PSORTdb is a database of bacterial protein subcellular localization that includes both proteins of known localization and those whose localization has been computationally predicted. It is available as part of the psort.org family of resources. It features a flexible, user-friendly web-based interface, and the source code for this database interface is being made freely available.
Pseudomonas Genome Database: The current Pseudomonas Genome Database at www.pseudomonas.com has been expanded to allow continual updates and more flexible queries of the data, along with a Gbrowse view of the genome data, which provides links to other data sources and distributed annotation system (DAS) capabilities. Multiple genomes may now be viewed to facilitate comparisons. This database contains genome data for all Pseudomonas genomes, though it is primarily focused on Pseudomonas aeruginosa and on providing a high-quality resource to aid research into improved methods of controlling this important Cystic Fibrosis pathogen.
Burkholderia Genome Database: The Burkholderia Genome Database is a sister database to the Pseudomonas Genome Database, with a similar interface and functionality, to facilitate analysis of Burkholderia species of interest as additional important Cystic Fibrosis pathogens.
InnateDB - facilitating systems based analysis of Innate Immunity: InnateDB is a publicly available database of the genes, proteins, experimentally-verified interactions and signaling pathways involved in the innate immune response of humans and mice to microbial infection. The database captures an improved coverage of the innate immunity interactome by integrating known interactions and pathways from major public databases together with manually-curated data into a centralised resource. The database can be mined as a knowledgebase or used with our integrated bioinformatics and visualization tools for the systems level analysis of the innate immune response.
IslandViewer: Facilitates the identification of "genomic islands" of genes in prokaryotic genomes that may have horizontal origins, through a whole-genome graphical display of genomic island predictions from the most accurate genomic island prediction methods (including our IslandPath and IslandPick methods). See http://www.pathogenomics.sfu.ca/islandviewer
IslandPath: Facilitates the identification of genomic islands in prokaryotic genomes that may have horizontal origins, through a whole-genome graphical display of the dinucleotide bias of gene clusters, the %G+C of genes, and other features relevant to the identification of pathogenicity islands (and other genomic islands). The graphical display is available at www.pathogenomics.sfu.ca/islandpath and the island predictions, using the IslandPath-DIMOB method, are available through IslandViewer, listed above.
Ortholuge: Ortholuge is a computational method that can generate precise ortholog predictions between two species on a genome-wide scale (using additional outgroup data for reference). It can either evaluate a previously constructed set of orthologs or generate an initial tentative set of orthologs that are subsequently evaluated. Precise ortholog prediction is important for a variety of analyses that utilize comparative genomics.
See the Ortholuge website and associated publication for more details.
Microarray Analysis Software: ArrayPipe and ProbeLynx were developed for the Genome Canada Pathogenomics Project. ArrayPipe provides a flexible approach to microarray analysis that is parallelizable, web-based or command-line based, and freely available under an open source licence. ProbeLynx allows you to update annotations for your microarray probes, using sequence data that is also investigated for its potential for cross-hybridization. InnateDB, mentioned above, can also provide tools to aid microarray analysis of diverse human and mouse gene datasets, since the database is not limited to Innate Immunity genes.
Bioperl Modules: Bio::Tools::SubLoc and the associated, required Algorithm::SVM module were developed for making SubLoc protein subcellular localization predictions as part of the PSORT-B project. They were developed by Cory Spencer and are also available from any CPAN mirror. For more information about SubLoc, which was developed by S Hua and Z Sun, see their paper in Bioinformatics (2001. 17:721-728).
PhyloBLAST: Allows the user to compare a sequence against a non-redundant Swiss-Prot-based database using BLAST, and then select sequences from the BLAST output for further user-defined phylogenetic analyses. See www.pathogenomics.bc.ca/phyloBLAST/. This tool was developed in collaboration as part of the Pathogenomics Project and a publication describing it is now available: Brinkman, F.S.L., I. Wan, R.E.W. Hancock, A.M. Rose and S.J. Jones. 2001. PhyloBLAST: facilitating phylogenetic analysis of BLAST results. Bioinformatics 17:385-387. Note that this tool is no longer updated; however, the source code is freely available and we encourage researchers to customize it for their own analytical use.
BAE-watch: Aids identification of genes that may have been horizontally transferred between the three domains of life (Bacteria, Archaea, and Eukarya), or that may otherwise have unusual similarity across different domains. This tool was developed in collaboration as part of the Pathogenomics Project. See www.pathogenomics.bc.ca/BAE-watch.html and the publication first demonstrating its use: Brinkman, F.S.L., et al. (2002) Genome Research 12:1159-67. As with PhyloBLAST, funding for this project is not currently ongoing, so this database represents a snapshot in time of genes with unusual similarity to bacterial proteins.