New gene descriptions for the WS268 release of WormBase

The WS268 release of WormBase features new automated gene descriptions (displayed in the ‘Overview’ widget at the top of the gene page). In addition to rewriting the code entirely, we have added gene descriptions for a new worm species, Trichuris muris. These automated gene descriptions are highly structured and are based on curated data in WormBase, such as orthology data, Gene Ontology (GO) annotations, Disease Ontology (DO) annotations and gene expression data.

The following data types are included in the description of a given gene, when available: orthology to human genes, molecular function, biological processes, sub-cellular localization and tissue expression. A new addition to the gene descriptions is human disease relevance data, included in cases where the gene is used experimentally to study a disease or where its human orthologs are implicated in a disease.

Data content changes: For lesser-known, data-poor genes, we’ve included:
1. Protein domain data.
2. Orthologous human gene function data drawn from the Alliance of Genome Resources.
3. Perturbation by other genes and/or chemicals, and tissue enrichment data, based on large-scale data such as microarray, tiling array and RNA-seq data.
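As a rough illustration of what a structured, data-driven description means in practice, the sketch below assembles a sentence from whichever data types are present for a gene and falls back to the additional data types only when none of the primary ones exist. It is not the WormBase code: the field names, the sentence templates, the gene names and the fallback rule are all assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical field names and templates; the order follows the data types
# listed in this post, not the real WormBase implementation.
PRIMARY_FIELDS = [
    ("human_orthologs", "is an ortholog of human {}"),
    ("molecular_function", "exhibits {}"),
    ("biological_process", "is involved in {}"),
    ("localization", "localizes to the {}"),
    ("expression", "is expressed in the {}"),
    ("disease", "is used to study {}"),
]
# Assumed to be used only for data-poor genes (no primary data at all).
FALLBACK_FIELDS = [
    ("protein_domains", "is predicted to contain {}"),
    ("ortholog_function", "has human orthologs that exhibit {}"),
    ("regulation", "is affected by {}"),
]


def join_items(items: List[str]) -> str:
    """Join items as 'a', 'a and b' or 'a, b and c'."""
    if len(items) == 1:
        return items[0]
    return ", ".join(items[:-1]) + " and " + items[-1]


def build_clauses(fields, data: Dict[str, List[str]]) -> List[str]:
    """Fill in a template for every field that has at least one value."""
    return [tmpl.format(join_items(data[key])) for key, tmpl in fields if data.get(key)]


def describe(gene: str, data: Dict[str, List[str]]) -> str:
    """Return a one-sentence description, or '' when no data is available."""
    clauses = build_clauses(PRIMARY_FIELDS, data)
    if not clauses:
        clauses = build_clauses(FALLBACK_FIELDS, data)
    if not clauses:
        return ""
    return f"{gene} " + "; ".join(clauses) + "."


print(describe("gene-1", {"molecular_function": ["kinase activity"],
                          "expression": ["pharynx", "intestine"]}))
```

Keeping the templates in ordered lists means every description presents its data types in the same sequence, which is one simple way to get the consistent structure described above.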

Data display changes: The ‘Overview’ widget now displays only the automated gene description; however, our legacy manual gene descriptions (where they exist) are collapsed but still available for viewing.

Note that while the automated gene descriptions are not generated directly from the literature, most of the annotations on which they are based are manually curated from the primary literature by WormBase curators.

Paper of Interest: Comparative study of 81 genomes of parasitic and non-parasitic worms

Check out the paper ‘Comparative genomics of the major parasitic worms’ in Nature Genetics by the International Helminth Genomes Consortium and the ‘News and Views’ article by Paul Sternberg, ‘Opening up a large can of worms’.

Update on protein-protein interaction data in WormBase

Currently, WormBase contains over 28,000 physical protein-protein interactions, of which 1,500 have been curated by BioGRID in collaboration with WormBase. Within the data set, over 17,000 protein-protein interactions are unique, and over 6,000 unique genes are involved in these interactions. In WormBase, protein-protein interaction data can be found as a subclass of physical interaction data in the ‘Interactions’ widget on the gene report page. The Interactions widget provides all types of interaction data related to the gene of interest, such as physical, genetic, regulatory and predicted interactions. Check out the micropublication ‘2018 Update on Protein-Protein Interaction Data in WormBase’ by Jae Cho et al. to learn more about this data type in WormBase.
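To make the three counts above concrete (all interactions, unique interactions, unique genes), here is a minimal sketch of how such totals could be derived from a list of interaction records. It assumes that "unique" means a distinct unordered pair of genes, so A–B and B–A count once; the post does not define the term, and the record format and function name are hypothetical.

```python
from typing import FrozenSet, Iterable, Set, Tuple


def summarize_interactions(records: Iterable[Tuple[str, str]]) -> Tuple[int, int, int]:
    """Return (total records, unique unordered pairs, unique genes)."""
    records = list(records)
    # A frozenset ignores order, so ("a", "b") and ("b", "a") collapse to one pair.
    pairs: Set[FrozenSet[str]] = {frozenset(pair) for pair in records}
    genes: Set[str] = set()
    for pair in pairs:
        genes.update(pair)
    return len(records), len(pairs), len(genes)


# Toy data: one duplicate in reverse order and one self-interaction.
example = [("a", "b"), ("b", "a"), ("a", "c"), ("d", "d")]
print(summarize_interactions(example))
```

The same interaction can be reported by several experiments or papers, which is why the total number of records can be much larger than the number of unique pairs.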