WS264 release

C. elegans sORFs

sORFs.org is a public repository of small open reading frames (sORFs) identified by ribosome profiling (RIBO-seq).

It contains predicted sORF regions for several species, including C. elegans.

We have annotated 118 predicted sORF regions as coding (CDS) isoforms of existing genes. In the next release, those sORF regions that do not overlap existing isoforms will likely be converted into individual genes rather than kept as isoforms.

Of these annotated sORF regions, 52 do not start with the canonical AUG (methionine) initiation codon, so they may use a non-canonical initiation codon. Some of these are not the expected non-canonical isoleucine codons, but instead code for residues such as valine.
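As an illustration of the distinction drawn above, the following minimal Python sketch reports which residue the first codon of a CDS encodes under the standard genetic code and whether that codon is the canonical AUG. The function name and the example sequences are hypothetical, and this is not the procedure used to produce the WormBase annotation.

```python
from itertools import product
from typing import Tuple

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
_BASES = "TCAG"
_AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(_BASES, repeat=3), _AMINO)
}


def first_codon_residue(cds: str) -> Tuple[str, str, bool]:
    """Return (codon, one-letter residue, is_canonical) for the first codon.

    Accepts DNA or RNA; the sequence is upper-cased and U is read as T.
    Hypothetical helper, for illustration only.
    """
    seq = cds.upper().replace("U", "T")
    if len(seq) < 3:
        raise ValueError("sequence is shorter than one codon")
    codon = seq[:3]
    if codon not in CODON_TABLE:
        raise ValueError(f"unrecognised codon: {codon}")
    return codon, CODON_TABLE[codon], codon == "ATG"


if __name__ == "__main__":
    # Made-up example sequences, not real sORFs.
    for example in ("AUGAAAUAA", "AUUGGGUAA", "GUGCCCUAA"):
        print(example, first_codon_residue(example))
```

Here a first codon that encodes I (isoleucine) corresponds to the expected non-canonical case, while V (valine) corresponds to the less expected one.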

Trichuris muris

This release sees the integration of the Edinburgh strain of Trichuris muris, version TMUE3.1. This species has been fully integrated as a core species, meaning that it has stable IDs and ID tracking and is included in all additional pipelines and analyses.
The genome assembly and gene annotation have been taken directly from the Pathogen Genomics group at the WTSI. Gene merges, splits and ID transfers from TMUE2.2 have also been mapped to allow users to identify their genes of interest.

Caenorhabditis nigoni

This release includes the Caenorhabditis nigoni genome assembly and gene set described in “Rapid genome shrinkage in a self-fertile nematode reveals sperm competition proteins” by Da Yin et al. (Science 359, 55-61, 2018) as a non-core species.
This species should be of special interest because of its phylogenetic closeness to C. briggsae and its differences in sexual reproduction.

The data are available as files on the WormBase FTP site, as well as in the JBrowse genome browser.

Gene Transfer Format (GTF) files now available

WormBase now provides the canonical gene set for each species in Gene Transfer Format (GTF, http://mblab.wustl.edu/GTF22.html). These files can be used directly by a number of popular sequence analysis tools (e.g. Cufflinks).

The GTF files are available from the WormBase FTP site. For example, the GTF file for C. elegans, c_elegans.PRJNA13758.WS253.canonical_geneset.gtf.gz, is available here.
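For users who want to read these files without a dedicated tool, the following minimal Python sketch shows one way to parse GTF records. It assumes the standard nine tab-separated GTF columns and simple `key "value";` attributes. The class and function names, and the sample line with its IDs, are hypothetical and serve only as illustration.

```python
import gzip
from typing import Dict, Iterator, NamedTuple


class GtfRecord(NamedTuple):
    seqname: str
    source: str
    feature: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    score: str
    strand: str
    frame: str
    attributes: Dict[str, str]


def parse_attributes(field: str) -> Dict[str, str]:
    """Parse 'key "value"; key "value";' pairs into a dict.

    Simplification: assumes attribute values contain no semicolons.
    """
    attrs: Dict[str, str] = {}
    for part in field.split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition(" ")
        attrs[key] = value.strip().strip('"')
    return attrs


def parse_gtf_line(line: str) -> GtfRecord:
    """Parse one non-comment GTF line into a GtfRecord."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    return GtfRecord(
        cols[0], cols[1], cols[2], int(cols[3]), int(cols[4]),
        cols[5], cols[6], cols[7], parse_attributes(cols[8]),
    )


def read_gtf(path: str) -> Iterator[GtfRecord]:
    """Yield records from a plain or gzip-compressed GTF file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            # Skip blank lines and '#' comment lines.
            if line.strip() and not line.startswith("#"):
                yield parse_gtf_line(line)


if __name__ == "__main__":
    # Made-up sample line; the IDs are placeholders, not real WormBase IDs.
    sample = 'I\tWormBase\texon\t100\t200\t.\t+\t.\tgene_id "GENE1"; transcript_id "GENE1.1";'
    print(parse_gtf_line(sample))
```

After downloading a file, `read_gtf("c_elegans.PRJNA13758.WS253.canonical_geneset.gtf.gz")` would iterate over its records, for example to collect all exon features for a given `gene_id`.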

 

WS247: C. briggsae genes have descriptions!

In the previous WS246 release we introduced automated gene descriptions for C. elegans genes that lacked a manually written one. These gene descriptions include information related to orthology, process, function and sub-cellular localization (when these data types have been curated in the WormBase database), giving the user a quick overview of the gene. The current WS247 release includes automated descriptions for over 18,000 C. briggsae genes. Check out the C. briggsae gene pages to view these descriptions under ‘Overview’! In future releases, we will add descriptions for genes from many more species! WormBase is also working on user-friendly forms that you can use to edit and improve these descriptions.