Gene Prediction in Bacteria, Archaea, Metagenomes and Metatranscriptomes
Novel genomic sequences can be analyzed either by the self-training program GeneMarkS (sequences longer than 50 kb) or by GeneMark.hmm with Heuristic models. For many species, pre-trained model parameters are available through the GeneMark.hmm page. Metagenomic sequences can be analyzed by MetaGeneMark, a program optimized for speed.
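The selection rule above can be sketched as a small helper. This is only an illustration: the function name, its parameters, and the reading of "50 kb" as 50,000 bp are assumptions, and routing shorter sequences to the Heuristic models is inferred from the text rather than stated by it. The helper does not call any GeneMark program.

```python
# Hypothetical helper, not part of any GeneMark distribution.
# Assumption: "50 kb" is read as 50,000 bp.
SELF_TRAINING_MIN_BP = 50_000


def suggest_prokaryotic_tool(length_bp: int, is_metagenome: bool = False) -> str:
    """Return the name of the program the overview suggests for a sequence."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    if is_metagenome:
        # Metagenomic sequences go to the speed-optimized program.
        return "MetaGeneMark"
    if length_bp > SELF_TRAINING_MIN_BP:
        # Long enough for self-training.
        return "GeneMarkS"
    # Assumption: sequences at or below the threshold use Heuristic models.
    return "GeneMark.hmm with Heuristic models"


if __name__ == "__main__":
    for length in (3_000, 50_000, 120_000):
        print(length, suggest_prokaryotic_tool(length))
```

If pre-trained parameters exist for the species in question, GeneMark.hmm with those parameters is a further option that this sketch does not model.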
Gene Prediction in Eukaryotes
Novel genomes can be analyzed by GeneMark-ES, an algorithm whose model parameters are estimated by unsupervised training. Notably, GeneMark-ES has a special option for fungal genomes to account for fungal-specific intron organization. The semi-supervised GeneMark-ET integrates information on mapped RNA-Seq reads into GeneMark-ES. More recently, we developed GeneMark-EP+, which uses homologous protein sequences of any evolutionary distance in both training and prediction.
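As a rough guide, the mapping from available evidence to the eukaryotic tools named above might be written as follows. The function and parameter names are hypothetical, and the sketch only lists which tools apply. It does not rank them or run anything.

```python
# Hypothetical helper, not part of any GeneMark distribution.
from typing import List


def applicable_eukaryotic_tools(
    has_rnaseq_alignments: bool = False,
    has_homologous_proteins: bool = False,
) -> List[str]:
    """List the tools from the overview that match the available evidence."""
    # GeneMark-ES needs only the genome (unsupervised training).
    tools = ["GeneMark-ES"]
    if has_rnaseq_alignments:
        # Semi-supervised training with mapped RNA-Seq reads.
        tools.append("GeneMark-ET")
    if has_homologous_proteins:
        # Homologous proteins used in both training and prediction.
        tools.append("GeneMark-EP+")
    return tools


if __name__ == "__main__":
    print(applicable_eukaryotic_tools(has_rnaseq_alignments=True))
```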
Gene Prediction in Transcripts
Sets of assembled eukaryotic transcripts can be analyzed by the modified GeneMarkS algorithm (the set should be large enough to permit self-training). A single transcript can be analyzed by a special version of GeneMark.hmm with Heuristic models. A new advanced algorithm, GeneMarkS-T, was developed recently (manuscript sent to publisher); the GeneMarkS-T software (beta version) is available for download.