WS273 version of WormBase

Please note that the WS273 version of WormBase is live on our website. The release notes for this version contain a detailed report of the various entities (e.g. gene, allele) and the number of sequence and biological annotations.

Some of the highlights are:

C. elegans Nanopore data has been integrated and is being used to improve UTR annotation. Further work on improving our automatic transcript annotation is ongoing.

This release includes a few hundred gene splits in Pristionchus pacificus predicted by the Sommer lab based on IsoSeq analysis. More will be included in future releases.

WormBase updated to the WS270 release

Alliance orthologs

Starting from WS270, we are using the Alliance of Genome Resources as the source of orthologs between model organism genes. This should be more comprehensive and include the latest data from our sister databases.

VC2010 genome

An assembly and gene set of the VC2010 C. elegans strain have been added, and work is ongoing to improve the integration and display of strain-specific data. Additionally, more mappings of N2 annotation will be included in future releases.

WS267 release of WormBase

Please note that the WS267 release of WormBase is now live. The complete release notes for WS267 can be viewed here. Some of the highlights include:

Trichuris muris update
Trinity-assembled RNASeq reads from publicly available short-read data at SRA have been added to Trichuris muris as an additional track and alignments. In addition, IsoSeq data from long-read PacBio sequencing (provided by the Berriman lab), corrected by genome alignment, has been used as an additional source to build transcript models.

In addition, the Trichuris muris ncRNA gene set has increased from 26 to 759 following the integration of data produced by the WormBase ParaSite ncRNA prediction pipeline. These transcripts have been fully integrated with stable IDs and associated naming and metadata.

Gene descriptions for T. muris will be coming in the WS268 release of WormBase.

Brugia malayi update
New gene models provided by the Beech lab for Parasitology at McGill University have been merged into the official gene set.

WS264 release

C. elegans sORFs

sORFs.org is a public repository of small open reading frames (sORFs) identified by ribosome profiling (RIBO-seq).

It contains predicted sORF regions for several species, including C. elegans.

We have annotated 118 predicted sORF regions as coding (CDS) isoforms of existing genes. In the next release, those sORF regions whose isoforms do not overlap existing isoforms will likely be changed to individual genes rather than isoforms.

52 of these annotated sORF regions do not start with the canonical Methionine initiation codon, AUG, and may instead use a non-canonical initiation codon. Some of these codons do not code for Isoleucine, the expected non-canonical initiation residue, but for residues such as Valine.
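As a rough illustration of the start-codon check described above, the following Python sketch reports the residue encoded by the first codon of a CDS. It is not the WormBase annotation pipeline; the function name `classify_start` and the example sequences are hypothetical, and the codon table is a small subset of the standard genetic code.

```python
# Residues encoded by a few possible first codons (standard genetic code,
# written as DNA). Only a subset is listed; any other codon is reported
# as "other". This table is an illustrative assumption, not WormBase data.
FIRST_CODON_RESIDUE = {
    "ATG": "Met",
    "ATT": "Ile", "ATC": "Ile", "ATA": "Ile",
    "GTG": "Val", "GTT": "Val", "GTC": "Val", "GTA": "Val",
    "CTG": "Leu", "TTG": "Leu",
}


def classify_start(cds):
    """Return (first codon, encoded residue, is_canonical) for a CDS.

    Accepts DNA or RNA letters in either case; the codon is normalised
    to upper-case DNA before lookup.
    """
    codon = cds[:3].upper().replace("U", "T")
    if len(codon) < 3:
        raise ValueError("sequence is shorter than one codon")
    residue = FIRST_CODON_RESIDUE.get(codon, "other")
    return codon, residue, codon == "ATG"


if __name__ == "__main__":
    # Made-up sequences, used only to exercise the function.
    examples = {
        "sorf_example_1": "ATGGCTAAATAA",
        "sorf_example_2": "ATTGCTAAATAA",
        "sorf_example_3": "gugGCUAAAUAA",
    }
    for name, seq in examples.items():
        codon, residue, canonical = classify_start(seq)
        label = "canonical" if canonical else "non-canonical"
        print(f"{name}: {codon} ({residue}), {label}")
```

Run over a set of CDS sequences, a check of this kind would separate the isoforms that start with AUG from those that start with an Isoleucine, Valine, or other codon.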

Trichuris muris

This release sees the integration of the Edinburgh strain of Trichuris muris, version TMUE3.1. This species has been fully integrated as a core species, meaning that it has stable IDs and tracking and is included in all additional pipelines and analyses.
The genome assembly and gene annotation have been taken directly from the Pathogen Genomics group at the WTSI. Gene mergers and splits have additionally been mapped, and IDs transferred from TMUE2.2, to allow users to identify their genes of interest.

Caenorhabditis nigoni

This release includes the Caenorhabditis nigoni genome assembly and gene set described in “Rapid genome shrinkage in a self-fertile nematode reveals sperm competition proteins” by Da Yin et al. (Science 359, 55-61, 2018) as a non-core species set.
This species should be of special interest because of its close phylogenetic relationship to C. briggsae and its differences in sexual reproduction.

The data is available as files on the WormBase FTP site, as well as in the JBrowse genome browser.

WormBase Release WS263

We would like to announce the availability of the WormBase WS263 release on the WormBase website and FTP.

Some of the highlights of this release are:

New Pristionchus Assembly and Gene Set

The new data made available through WormBase is described in “Single-Molecule Sequencing Reveals the Chromosome-Scale Genomic Architecture of the Nematode Model Organism Pristionchus pacificus” by Christian Roedelsperger et al. in Cell.

ID mapping was carried out to preserve the existing IDs where possible. The published IDs were also incorporated into the database so that they are searchable on wormbase.org.

New PantherDB data

The PantherDB orthology set included in WormBase has been updated to the latest version (13.0).

DNAseI hypersensitive Sites

We have created 42,728 Features with Method=DNAseI_hypersensitive_site to mark DNase hypersensitivity sites found in non-coding regions in the C. elegans ‘embryo’ and ‘L1 arrest’ life stages, as reported in the paper WBPaper00053259. These sites, together with Transcription Factor footprints, will be added to the tracks available in the genome browsers.

Genome Res. 2017 Dec;27(12):2108-2119. “Genome-wide discovery of active regulatory elements and transcription factor footprints in Caenorhabditis elegans using DNase-seq.” Ho MCW, Quintero-Cadena P, Sternberg PW.