This web site was developed so that researchers can easily view and download genomic islands (GIs), predicted using the most accurate GI prediction methods currently available, for all published sequenced genomes. Users can also upload their own unpublished genomes for analysis. The source code and entire GI data sets are available for download, and acknowledgment information is available for those who use our resources. Please contact us with any questions or comments.
In a recent study (Langille et al., 2008), we described a new method called IslandPick that uses a comparative genomics approach to develop stringent data sets of GIs and non-GIs. These positive and negative data sets were used to evaluate several sequence composition GI prediction methods, and SIGI-HMM and IslandPath-DIMOB were shown to have the highest overall accuracy. In addition, IslandPick had the most agreement with an independent data set of previously published genomes, indicating that it is a highly precise method for GI prediction. IslandPick requires that several phylogenetically related genomes have been sequenced before it can make a prediction; therefore, predictions will not be available for many genomes. Non-default comparison genomes may be chosen using the IslandPick link. For more information, please see Langille et al., 2008 and/or our overview presentation.
SIGI-HMM (see Waack et al., 2006) is a sequence composition GI prediction method that is part of the Colombo package. This method uses a Hidden Markov Model (HMM) and measures codon usage to identify possible GIs. In a recent study, SIGI-HMM was shown to have the highest precision and overall accuracy out of six tested sequence composition GI prediction methods (Langille et al., 2008).
IslandPath (see Hsiao et al., 2003) was originally designed to aid in the identification of prokaryotic genomic islands (GIs) by visualizing several common characteristics of GIs, such as abnormal sequence composition or the presence of genes that are functionally related to mobile elements (termed mobility genes).
Our subsequent studies (see Langille et al., 2008) showed that dinucleotide sequence composition bias and the presence of mobility genes were good indicators for identifying GIs. In fact, IslandPath-DIMOB, which is based on these two indicators, was tied with SIGI-HMM for the highest overall accuracy, trading slightly lower precision for higher recall.
IslandViewer integrates two sequence composition GI prediction methods, SIGI-HMM and IslandPath-DIMOB, and a single comparative GI prediction method, IslandPick. These methods have varying advantages and disadvantages.
Sequence composition GI prediction methods may have difficulty detecting ancient GIs due to the amelioration of the foreign DNA to the host genome over time. These methods will also have difficulty detecting GIs that originated from a genome with a sequence composition similar to that of the host genome. Lastly, these methods can make false predictions due to the normal variation in sequence composition that can occur in bacterial genomes.
Comparative GI prediction methods depend heavily on the genomes that are chosen for comparison. For instance, the selection of very similar genomes will result in the prediction of only recently inserted GIs, while comparing with more distantly related genomes will detect many more GIs (including recent and ancient GIs), but may increase the chance of false prediction. IslandPick uses several different cutoffs to automatically select genomes for comparison, but users have the choice to select different comparison genomes based on their own insights.
In general, we recommend that users take advantage of the ability to view GI predictions from all three methods in a single integrated view. To aid this further, islands that are predicted by one or more tools are highlighted in red around the outer circle.
We strongly encourage researchers to conduct further analyses of any GIs displayed by IslandViewer. As outlined above, in Analysis Considerations, there are many factors that can lead to false prediction by any GI prediction program. In addition, any GIs that are very close to each other in the genome may actually be one large GI insertion (or vice versa) and the exact boundaries or insertion sites should be inspected in more detail. We recommend the use of the following computational tools to aid in further analysis of genomic islands.
Artemis is a popular and well-developed genome browser and annotation tool. IslandViewer allows users to download predicted GIs in a GenBank file format that can be opened directly in Artemis. The GIs will appear in Artemis using the same colour scheme used in IslandViewer (i.e. green for IslandPick, orange for SIGI-HMM, etc.). An easy-to-view list of islands can be produced in Artemis by using the "Feature Selector" under the "Select" menu and filling in:
- Key = "misc_feature"
- Qualifier = "note"
- Containing this text = "Genomic Island"
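The same selection can also be sketched outside Artemis. The following is a minimal illustration in Python, not part of IslandViewer or Artemis: it scans the feature table of a GenBank file for misc_feature entries whose note contains "Genomic Island". The function name, the sample record, and the exact note wording are assumptions made for this example, and the parser is deliberately simple (single-line notes and simple start..end locations only).

```python
import re

def list_genomic_islands(genbank_text, needle="Genomic Island"):
    """Return (start, end, note) for misc_feature entries whose /note contains needle.

    Simplified sketch: assumes single-line /note qualifiers and locations
    that contain a plain start..end range.
    """
    islands = []
    in_features = False
    feature = None
    for line in genbank_text.splitlines():
        # Section headers (LOCUS, FEATURES, ORIGIN, //) start in column 1.
        if line and not line[0].isspace():
            in_features = line.startswith("FEATURES")
            feature = None
            continue
        if not in_features:
            continue
        # Feature keys start in column 6; qualifier lines are indented further.
        if len(line) > 5 and not line[5].isspace():
            parts = line.split(None, 1)
            feature = {"key": parts[0],
                       "location": parts[1] if len(parts) > 1 else ""}
            continue
        qualifier = line.strip()
        if (feature and feature["key"] == "misc_feature"
                and qualifier.startswith("/note=")):
            note = qualifier[len("/note="):].strip('"')
            span = re.search(r"(\d+)\.\.(\d+)", feature["location"])
            if needle in note and span:
                islands.append((int(span.group(1)), int(span.group(2)), note))
    return islands

# Hypothetical record for illustration; real IslandViewer notes may be worded differently.
SAMPLE = """\
LOCUS       EXAMPLE
FEATURES             Location/Qualifiers
     gene            10..90
                     /note="ordinary gene"
     misc_feature    100..2000
                     /note="Genomic Island predicted by IslandPick"
     misc_feature    3000..3500
                     /note="some other feature"
ORIGIN
//
"""

for start, end, note in list_genomic_islands(SAMPLE):
    print(start, end, note)
```

For real files, a full GenBank parser is a safer choice than this column-based sketch, since notes and locations may span several lines.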
Along with the inspection of genes neighbouring and within GIs, sequence composition graphs, such as GC content and Karlin Signature Difference, can be added via the Graph menu. Users with their own GI predictions or other genome features, such as IS or direct repeat elements, can add their data to Artemis after constructing a fairly simple input file.
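To make the idea of a sequence composition graph concrete, the sketch below computes GC content in sliding windows along a sequence. It is an illustration only and does not reproduce how Artemis draws its graphs; the function name, window size, and step size are assumptions chosen for the example.

```python
def gc_content_windows(sequence, window=1000, step=500):
    """Return (1-based window start, GC fraction) pairs along the sequence.

    The window and step sizes are arbitrary illustrative defaults.
    """
    seq = sequence.upper()
    results = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        if not chunk:
            break  # empty input sequence
        gc = chunk.count("G") + chunk.count("C")
        results.append((start + 1, gc / len(chunk)))
    return results

# Toy sequence: a GC-rich stretch followed by an AT-rich stretch.
for position, fraction in gc_content_windows("GGCCAATT", window=4, step=4):
    print(position, fraction)
```

Windows whose GC fraction departs markedly from the genome-wide average are the kind of signal that such graphs make visible near a putative island, although normal variation within a genome can produce similar deviations.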
IslandPath displays many features that are often associated with genomic islands, such as various types of sequence composition bias, tRNAs, integrases, and transposases. These features are shown together in a single integrated view and may help in determining the location of GIs. IslandViewer contains a direct link to IslandPath for the genome being viewed in the left column of every results page.
Mauve is a whole-genome alignment tool that can be used to view genome rearrangements and large insertions in related genomes. By using the IslandViewer GenBank download file along with your choice of other closely related bacterial genomes, GIs can be viewed in the context of other genomes, and the conserved regions surrounding them can be inspected further.
Websites are welcome to link to island predictions in IslandViewer. The RefSeq accession can be used to link to the page in the format: http://www.pathogenomics.sfu.ca/islandviewer/results.php?query_input=NC_XXXXXX Note that the accession version is optional; if it is not given, the most recent version will be used (e.g. NC_003997 is equivalent to NC_003997.3).
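As a small illustration of the link format, the following Python sketch builds a results URL from a RefSeq accession. The function name and the validation pattern (NC_, six digits, optional version suffix) are assumptions based on the examples given here; they are not rules enforced by IslandViewer.

```python
import re
from urllib.parse import urlencode

BASE_URL = "http://www.pathogenomics.sfu.ca/islandviewer/results.php"

def islandviewer_link(accession):
    """Build a results link for a RefSeq accession, with or without a version."""
    accession = accession.strip()
    # Assumed pattern, taken from the NC_XXXXXX format and example shown in the text.
    if not re.fullmatch(r"NC_\d{6}(\.\d+)?", accession):
        raise ValueError(f"unexpected accession format: {accession!r}")
    return f"{BASE_URL}?{urlencode({'query_input': accession})}"

print(islandviewer_link("NC_003997"))
print(islandviewer_link("NC_003997.3"))
```

Omitting the version suffix leaves it to the site to resolve the most recent version, as described above.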