{"category":{"categoryid":353,"name":"sci-biology","summary":"The sci-biology category contains software that can be used in biological and related scientific environments."},"packages":[{"categoryid":353,"description":"STAR aligner: align RNA-seq reads to reference genome uncompressed suffix arrays","firstseen":"2017-11-16T13:46:45.075214","name":"STAR","packageid":68457},{"categoryid":353,"description":"Amino acid indices and similarity matrices","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"aaindex","packageid":46645,"summary":"Amino acid indices and similarity matrices maintained at Kyoto University. An amino acid index is a set of 20 numerical values representing any of the different physicochemical and biological properties of amino acids. The AAindex1 section of the Amino Acid Index Database is a collection of published indices together with the result of cluster analysis using the correlation coefficient as the distance between two indices. This section currently contains 494 indices. Another important feature of amino acids that can be represented numerically is the similarity between amino acids. Thus, a similarity matrix, also called a mutation matrix, is a set of 210 numerical values, 20 diagonal and 20x19\/2 off-diagonal elements, used for sequence alignments and similarity searches. The AAindex2 section of the Amino Acid Index Database is a collection of published amino acid mutation matrices together with the result of cluster analysis. This section currently contains 83 matrices."},{"categoryid":353,"description":"Assembly By Short Sequences - a de novo, parallel, paired-end sequence assembler","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"abyss","packageid":50240},{"categoryid":353,"description":"Protein multiple-alignment-based sequence annealing","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"amap","packageid":50287},{"categoryid":353,"description":"Eukaryotic gene predictor","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"augustus","packageid":54585},{"categoryid":353,"description":"A programmer's API and an end-user's toolkit for handling BAM files","firstseen":"2012-02-26T14:35:52.932489","maintainer":"mmokrejs@gmail.com","maintainername":"Martin Mokrejs","name":"bamtools","packageid":58945,"summary":"BAM (Binary Alignment\/Map) format is useful for storing large DNA sequence alignments. It is closely related to the text-based SAM format, but optimized for random-access. BamTools provides a fast, flexible C++ API for reading and writing BAM files."},{"categoryid":353,"description":"Utilities for variant calling and manipulating VCF and BCF files","firstseen":"2017-09-02T12:52:02.026880","name":"bcftools","packageid":67995},{"categoryid":353,"description":"Tools for manipulation and analysis of BED, GFF\/GTF, VCF, SAM\/BAM file formats","firstseen":"2011-10-09T14:35:47.674450","maintainer":"mmokrejs@gmail.com","maintainername":"Martin Mokrejs","name":"bedtools","packageid":58019},{"categoryid":353,"description":"Blat-like Fast Accurate Search Tool","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"bfast","packageid":51486},{"categoryid":353,"description":"Multithreaded tool for matching large sets of patterns against biosequence DBs","firstseen":"2010-06-05T17:11:34.380149","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"biogrep","packageid":55865},{"categoryid":353,"description":"Perl tools for bioinformatics - Core modules","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"bioperl","packageid":42900},{"categoryid":353,"description":"Perl tools for bioinformatics - Perl API that accesses the BioSQL schema","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"bioperl-db","packageid":52408},{"categoryid":353,"description":"Perl tools for bioinformatics - Analysis of protein-protein interaction networks","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"bioperl-network","packageid":53008},{"categoryid":353,"description":"Perl wrapper modules for key bioinformatics applications","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"bioperl-run","packageid":45447},{"categoryid":353,"description":"Python modules for computational molecular biology","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"biopython","packageid":50727},{"categoryid":353,"description":"A generic bioinformatics relational database model","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"biosql","packageid":46941},{"categoryid":353,"description":"The BLAST-Like Alignment Tool, a fast genomic sequence aligner","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"blat","packageid":44029},{"categoryid":353,"description":"Popular short read aligner for Next-generation sequencing data","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"bowtie","packageid":51612},{"categoryid":353,"description":"Burrows-Wheeler Alignment Tool, a fast short genomic sequence aligner","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"bwa","packageid":49132},{"categoryid":353,"description":"Clustering Database at High Identity with Tolerance","firstseen":"2011-02-01T14:40:14.004226","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"cd-hit","packageid":56790,"summary":"CD-HIT is a very widely used program for clustering and comparing large sets of protein or nucleotide sequences. CD-HIT is very fast and can handle extremely large databases. CD-HIT helps to significantly reduce the computational and manual efforts in many sequence analysis tasks and aids in understanding the data structure and correct the bias within a dataset. The CD-HIT package has CD-HIT, CD-HIT-2D, CD-HIT-EST, CD-HIT-EST-2D, CD-HIT-454, CD-HIT-PARA, PSI-CD-HIT and over a dozen scripts. CD-HIT (CD-HIT-EST) clusters similar proteins (DNAs) into clusters that meet a user-defined similarity threshold. CD-HIT-2D (CD-HIT-EST-2D) compares 2 datasets and identifies the sequences in db2 that are similar to db1 above a threshold. CD-HIT-454 is a program to identify natural and artificial duplicates from pyrosequencing reads. The usage of other programs and scripts can be found in CD-HIT user's guide."},{"categoryid":353,"description":"Scalable multiple alignment of protein sequences","firstseen":"2011-09-26T14:35:31.392681","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"clustal-omega","packageid":57986},{"categoryid":353,"description":"General purpose multiple alignment program for DNA and proteins","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"clustalw","packageid":52977},{"categoryid":353,"description":"An MPI implemention of the ClustalW general purpose multiple alignment algorithm","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"clustalw-mpi","packageid":43775},{"categoryid":353,"description":"Codon usage tables calculated from GenBank","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"cutg","packageid":46991,"summary":"Codon usage tables maintained at the Kazusa DNA Research Institute. Codon usage in individual genes has been calculated using the nucleotide sequence data obtained from the GenBank Genetic Sequence Database. The compilation of codon usage is synchronized with each major release of GenBank."},{"categoryid":353,"description":"Greedy and progressive approaches for segment-based multiple sequence alignment","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"dialign-tx","packageid":51444},{"categoryid":353,"description":"Multiple sequence alignment","firstseen":"2015-03-30T13:35:43.075436","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"dialign2","packageid":63274},{"categoryid":353,"description":"Estimated Locations of Pattern Hits - Motif finder program","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"elph","packageid":55065,"summary":"ELPH is a general-purpose Gibbs sampler for finding motifs in a set of DNA or protein sequences. The program takes as input a set containing anywhere from a few dozen to thousands of sequences, and searches through them for the most common motif, assuming that each sequence contains one copy of the motif."},{"categoryid":353,"description":"A meta-package for installing all EMBASSY packages (EMBOSS add-ons)","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"embassy","packageid":47057},{"categoryid":353,"description":"EMBOSS integrated version of Applications from the CBS group","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"embassy-cbstools","packageid":51607},{"categoryid":353,"description":"EMBOSS integrated version of Clustal Omega - Multiple Sequence Alignment","firstseen":"2015-03-29T13:37:37.566496","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"embassy-clustalomega","packageid":63265},{"categoryid":353,"description":"EMBOSS integrated version of Protein domain analysis add-on package","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"embassy-domainatrix","packageid":43298},{"categoryid":353,"description":"EMBOSS integrated version of Protein domain alignment add-on package","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"embassy-domalign","packageid":47664},{"categoryid":353,"description":"EMBOSS integrated version of Protein domain search add-on package","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"embassy-domsearch","packageid":43581},{"categoryid":353,"description":"EMBOSS integrated version of Simple menu of EMBOSS applications","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"embassy-emnu","packageid":42614},{"categoryid":353,"description":"EMBOSS integrated version of sim4 - Alignment of cDNA and genomic DNA","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"embassy-esim4","packageid":46542},{"categoryid":353,"description":"EMBOSS integrated version of HMMER wrapper - sequence analysis with profile HMMs","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"embassy-hmmer","packageid":54710},{"categoryid":353,"description":"EMBOSS integrated version of InterProScan motif detection add-on package","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"embassy-iprscan","packageid":52662},{"categoryid":353,"description":"EMBOSS integrated version of MSE - Multiple Sequence Screen Editor","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"embassy-mse","packageid":43957},{"categoryid":353,"description":"EMBOSS integrated version of The Phylogeny Inference Package","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"embassy-phylipnew","packageid":44837},{"categoryid":353,"description":"EMBOSS integrated version of Protein signature add-on package","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"embassy-signature","packageid":49947},{"categoryid":353,"description":"EMBOSS integrated version of Protein structure add-on package","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"embassy-structure","packageid":50587},{"categoryid":353,"description":"EMBOSS integrated version of Transmembrane protein display","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"embassy-topo","packageid":53067},{"categoryid":353,"description":"EMBOSS integrated version of Vienna RNA package - RNA folding","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"embassy-vienna","packageid":50193},{"categoryid":353,"description":"The European Molecular Biology Open Software Suite - A sequence analysis package","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"emboss","packageid":54441,"summary":"EMBOSS is \"The European Molecular Biology Open Software Suite\". EMBOSS is a free Open Source software analysis package specially developed for the needs of the molecular biology (e.g. EMBnet) user community. The software automatically copes with data in a variety of formats and even allows transparent retrieval of sequence data from the web. Also, as extensive libraries are provided with the package, it is a platform to allow other scientists to develop and release software in true open source spirit. EMBOSS also integrates a range of currently available packages and tools for sequence analysis into a seamless whole. EMBOSS breaks the historical trend towards commercial software packages."},{"categoryid":353,"description":"Prokaryotic and Eukaryotic gene predictor","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"eugene","packageid":41818},{"categoryid":353,"description":"Generic tool for pairwise sequence comparison","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"exonerate","packageid":52203},{"categoryid":353,"description":"FASTA is a DNA and Protein sequence alignment software package","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"fasta","packageid":46474},{"categoryid":353,"description":"Fast inference of approximately-maximum-likelihood phylogenetic trees","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"fasttree","packageid":50611},{"categoryid":353,"description":"Tools for Short Read FASTA\/FASTQ file processing","firstseen":"2018-01-11T15:04:20.895537","name":"fastx_toolkit","packageid":68634},{"categoryid":353,"description":"Graphical viewer for chromatogram files","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"finchtv","packageid":51844},{"categoryid":353,"description":"Folding@Home is a distributed computing project for protein folding","firstseen":"2010-05-04T00:54:45.661860","maintainer":"axs@gentoo.org","maintainername":"Ian Stakenvicius","name":"foldingathome","packageid":45068},{"categoryid":353,"description":"An HMM-based microbial gene finding system from TIGR","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"glimmer","packageid":52050},{"categoryid":353,"description":"A eukaryotic gene finding system from TIGR","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"glimmerhmm","packageid":51757},{"categoryid":353,"description":"A Genomic Mapping and Alignment Program for mRNA and EST Sequences","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"gmap","packageid":51090},{"categoryid":353,"description":"Sequence analysis using profile hidden Markov models","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"hmmer","packageid":43083},{"categoryid":353,"description":"Subset seed design tool for DNA sequence alignment","firstseen":"2010-06-18T14:37:58.731517","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"iedera","packageid":55913},{"categoryid":353,"description":"Inference of RNA alignments","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"infernal","packageid":46781},{"categoryid":353,"description":"Important Quartet Puzzling and NNI Operation","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"iqpnni","packageid":49616},{"categoryid":353,"description":"Global and progressive multiple sequence alignment","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"kalign","packageid":48869},{"categoryid":353,"description":"Near-optimal RNA-Seq quantification","firstseen":"2017-11-16T13:46:45.075214","name":"kallisto","packageid":68458},{"categoryid":353,"description":"The LAGAN suite of tools for whole-genome multiple alignment of genomic DNA","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"lagan","packageid":49064},{"categoryid":353,"description":"Gordon Text utils Library","firstseen":"2018-01-11T15:04:20.895537","name":"libgtextutils","packageid":68635},{"categoryid":353,"description":"Multiple sequence alignments using a variety of algorithms","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"mafft","packageid":45811},{"categoryid":353,"description":"Mapping and Assembly with Qualities, mapping NGS reads to reference genomes","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"maq","packageid":46023},{"categoryid":353,"description":"GUI for sci-biology\/maq, a short read mapping assembler","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"maqview","packageid":49154},{"categoryid":353,"description":"A reference-guided aligner for next-generation sequencing technologies","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"mosaik","packageid":43693},{"categoryid":353,"description":"Suite of algorithms for ecological bioinformatics","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"mothur","packageid":45640},{"categoryid":353,"description":"Bayesian Inference of Phylogeny","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"mrbayes","packageid":49915,"summary":"MrBayes is a program for the Bayesian estimation of phylogeny. Bayesian inference of phylogeny is based upon a quantity called the posterior probability distribution of trees, which is the probability of a tree conditioned on the observations. The conditioning is accomplished using Bayes's theorem. The posterior probability distribution of trees is impossible to calculate analytically; instead, MrBayes uses a simulation technique called Markov chain Monte Carlo (or MCMC) to approximate the posterior probabilities of trees."},{"categoryid":353,"description":"A rapid whole genome aligner","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"mummer","packageid":43404},{"categoryid":353,"description":"Multiple sequence comparison by log-expectation","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"muscle","packageid":53549},{"categoryid":353,"description":"Tools for processing phylogenetic trees","firstseen":"2010-06-06T14:35:26.247608","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"newick-utils","packageid":55866},{"categoryid":353,"description":"Pairwise Aligner for Long Sequences","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"pals","packageid":46638},{"categoryid":353,"description":"Phylogenetic Analysis by Maximum Likelihood","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"paml","packageid":54907},{"categoryid":353,"description":"The PHYLogeny Inference Package","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"phylip","packageid":51206},{"categoryid":353,"description":"Estimation of large phylogenies by maximum likelihood","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"phyml","packageid":47059,"summary":"Phyml is a simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Given input sequence files, it estimates phylogenies using maximum likelihood, and is capable of processing large amounts of phylogenetic data."},{"categoryid":353,"description":"Analysis of repetitive DNA found in genome sequences","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"piler","packageid":43252},{"categoryid":353,"description":"Analysis of Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs)","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"pilercr","packageid":49670},{"categoryid":353,"description":"Whole genome association analysis toolset","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"plink","packageid":44464},{"categoryid":353,"description":"Fast multiple sequence alignments using partial-order graphs","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"poa","packageid":46946},{"categoryid":353,"description":"Probabilistic Alignment Kit","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"prank","packageid":49307},{"categoryid":353,"description":"Primer Design for PCR reactions","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"primer3","packageid":41845,"summary":"Primer3 picks primers for PCR reactions, considering: oligonucleotide melting temperature, size, GC content, and primer-dimer possibilities; PCR product size; positional constraints within the source sequence; and miscellaneous other constraints. All of these criteria are user-specifiable as constraints, and some are specifiable as terms in an objective function that characterizes an optimal primer pair."},{"categoryid":353,"description":"A protein motif fingerprint database","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"prints","packageid":42374,"summary":"A protein motif fingerprint database maintained at the University of Manchester. A fingerprint is a group of conserved motifs used to characterise a protein family; its diagnostic power is refined by iterative scanning of a SWISS-PROT\/TrEMBL composite. Usually the motifs do not overlap, but are separated along a sequence, though they may be contiguous in 3D-space. Fingerprints can encode protein folds and functionalities more flexibly and powerfully than can single motifs, full diagnostic potency deriving from the mutual context provided by motif neighbours."},{"categoryid":353,"description":"Probabilistic Consistency-based Multiple Alignment of Amino Acid Sequences","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"probcons","packageid":42333},{"categoryid":353,"description":"Prokaryotic Dynamic Programming Genefinding Algorithm","firstseen":"2011-07-17T14:39:23.374915","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"prodigal","packageid":57614},{"categoryid":353,"description":"Secondary structure and solvent accessibility predictor","firstseen":"2013-03-19T14:35:58.051770","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"profphd","packageid":60559},{"categoryid":353,"description":"A protein families and domains database","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"prosite","packageid":42442,"summary":"A protein families and domains database maintained at the Swiss Institude for Bioinformatics. It consists of biologically significant sites, patterns and profiles that help to reliably identify to which known protein family (if any) a new sequence belongs. PROSITE currently contains patterns and profiles specific for more than a thousand protein families or domains. Each of these signatures comes with documentation providing background information on the structure and function of these proteins."},{"categoryid":353,"description":"Python interface for the SAM\/BAM sequence alignment and mapping format","firstseen":"2012-01-29T14:36:43.052897","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"pysam","packageid":58773},{"categoryid":353,"description":"Sequential, Parallel & Distributed Inference of Large Phylogenetic Trees","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"raxml","packageid":53028},{"categoryid":353,"description":"A restriction enzyme database","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"rebase","packageid":42753,"summary":"The Restriction Enzyme data BASE is a collection of information about restriction enzymes and related proteins. It is maintained by New England Biolabs. It contains published and unpublished references, recognition and cleavage sites, isoschizomers, commercial availability, methylation sensitivity, crystal and sequence data. DNA methyltransferases, homing endonucleases, nicking enzymes, specificity subunits and control proteins are also included. More recently, putative DNA methyltransferases and restriction enzymes, as predicted from analysis of genomic sequences, are also listed."},{"categoryid":353,"description":"Automated de novo identification of repeat families from genomic sequences","firstseen":"2011-02-03T14:39:13.740633","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"recon","packageid":56795},{"categoryid":353,"description":"Utilities for analysing and manipulating the SAM\/BAM alignment formats","firstseen":"2010-12-22T14:41:05.491029","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"samtools","packageid":56631},{"categoryid":353,"description":"A graphical multiple sequence alignment editor","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"seaview","packageid":54725,"summary":"SeaView is a graphical multiple sequence alignment editor developped by Manolo Gouy. SeaView is able to read and write various alignment formats (NEXUS, MSF, CLUSTAL, FASTA, PHYLIP, MASE). It allows to manually edit the alignment, and also to run DOT-PLOT or CLUSTALW programs to locally improve the alignment."},{"categoryid":353,"description":"C++ Sequence Analysis Library","firstseen":"2013-03-14T14:36:03.414471","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"seqan","packageid":60538},{"categoryid":353,"description":"A rewrite and improvement upon sim4, a DNA-mRNA aligner","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"sibsim4","packageid":49161},{"categoryid":353,"description":"A program to align cDNA and genomic DNA","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"sim4","packageid":51727,"summary":"sim4 is a similarity-based tool for aligning an expressed DNA sequence (EST, cDNA, mRNA) with a genomic sequence for the gene. It also detects end matches when the two input sequences overlap at one end (i.e., the start of one sequence overlaps the end of the other).sim4 employs a blast-based technique to first determine the basic matching blocks representing the \"exon cores\". In this first stage, it detects all possible exact matches of W-mers (i.e., DNA words of size W) between the two sequences and extends them to maximal scoring gap-free segments. In the second stage, the exon cores are extended into the adjacent as-yet-unmatched fragments using greedy alignment algorithms, and heuristics are used to favor configurations that conform to the splice-site recognition signals (GT-AG, CT-AC). If necessary, the process is repeated with less stringent parameters on the unmatched fragments."},{"categoryid":353,"description":"Protein secondary structure assignment from atomic coordinates","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"stride","packageid":51967},{"categoryid":353,"description":"A multiple sequence alignment package","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"t-coffee","packageid":49483,"summary":"T-Coffee is a multiple sequence alignment package. Given a set of sequences (Proteins or DNA), T-Coffee generates a multiple sequence alignment. Version 2.00 and higher can mix sequences and structures. T-Coffee allows the combination of a collection of multiple\/pairwise, global or local alignments into a single model. It also allows to estimate the level of consistency of each position within the new alignment with the rest of the alignments."},{"categoryid":353,"description":"Maximum likelihood analysis for nucleotide, amino acid, and two-state data","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"tree-puzzle","packageid":46128,"summary":"TREE-PUZZLE is a computer program to reconstruct phylogenetic trees from molecular sequence data by maximum likelihood. It implements a fast tree search algorithm, quartet puzzling, that allows analysis of large data sets and automatically assigns estimations of support to each internal branch. TREE-PUZZLE also computes pairwise maximum likelihood distances as well as branch lengths for user specified trees. Branch lengths can be calculated under the clock-assumption. In addition, TREE-PUZZLE offers a novel method, likelihood mapping, to investigate the support of a hypothesized internal branch without computing an overall tree and to visualize the phylogenetic content of a sequence alignment. TREE-PUZZLE also conducts a number of statistical tests on the data set (chi-square test for homogeneity of base composition, likelihood ratio clock test, Kishino-Hasegawa test). The models of substitution provided by TREE-PUZZLE are TN, HKY, F84, SH for nucleotides, Dayhoff, JTT, mtREV24, VT, WAG, BLOSUM 62 for amino acids, and F81 for two-state data. Rate heterogeneity is modeled by a discrete Gamma distribution and by allowing invariable sites. The corresponding parameters can be inferred from the data set."},{"categoryid":353,"description":"A phylogenetic tree viewer","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"treeviewx","packageid":53311,"summary":"TreeView X is a program for displaying phylogenetic trees on Linux and UNIX platforms. It can read and display NEXUS and Newick format tree files (such as those output by PAUP*, ClustalX, TREE-PUZZLE, and other programs)."},{"categoryid":353,"description":"Tandem Repeats Finder","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"trf","packageid":52025},{"categoryid":353,"description":"tRNA detection in large-scale genome sequences","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"trnascan-se","packageid":49297,"summary":"tRNAscan-SE detects ~99% of eukaryotic nuclear or prokaryotic tRNA genes, with a false positive rate of less than one per 15 gigabases, and with a search speed of about 30 kb\/second. It was implemented for large-scale human genome sequence analysis, but is applicable to other DNAs as well."},{"categoryid":353,"description":"Fast, accurate chimera detection","firstseen":"2012-08-15T14:43:37.194721","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"uchime","packageid":59697,"summary":"UCHIME is a new algorithm for detecting chimeric sequences. It was developed in collaboration with Brian Haas, Jose Carlos Clemente, Chris Quince and Rob Knight. Chimeras are commonly created during DNA sample amplification by PCR, especially in community sequencing experiments using single regions such as the 16S rRNA gene in bacteria or the fungal ITS region. UCHIME can detect chimeras using a reference database or de novo using abundance information on the assumption that chimeras are less abundant than their parents because they must have undergone fewer rounds of amplification."},{"categoryid":353,"description":"The UCSC genome browser suite, also known as Jim Kent's library and GoldenPath","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"ucsc-genome-browser","packageid":53877},{"categoryid":353,"description":"Unified Nucleic Acid Folding and hybridization package","firstseen":"2011-01-14T14:42:30.477129","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"unafold","packageid":56702},{"categoryid":353,"description":"update_blastdb.pl for local blast db maintainance","firstseen":"2014-05-14T13:37:36.083422","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"update-blastdb","packageid":62199},{"categoryid":353,"description":"Tools for working with VCF (Variant Call Format) files","firstseen":"2011-10-03T14:35:06.100551","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"vcftools","packageid":58004},{"categoryid":353,"description":"A sequence assembler for very short reads","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"velvet","packageid":49259},{"categoryid":353,"description":"Intelligent algorithms for DNA searches","firstseen":"2010-05-04T00:54:45.661860","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"wise","packageid":55024},{"categoryid":353,"description":"Genomic similarity search with multiple transition constrained spaced seeds","firstseen":"2010-06-18T14:37:58.731517","maintainer":"sci-biology@gentoo.org","maintainername":"Gentoo Biology Project","name":"yass","packageid":55912}]}