WormBase workshop talks at the 2019 IWM available on YouTube

Talks delivered at the WormBase workshop as part of the International C. elegans meeting at UCLA in June 2019 are now available on YouTube. Please note that these recordings were made at the workshop with external cameras and microphones, so apologies if they are not of the highest quality. They are linked from their titles below:

  1. Introduction to the WormBase webpages and widgets
  2. WormBase data mining tools: SimpleMine, WormMine, BioMart
  3. WormBase ontologies and gene set enrichment analysis
  4. WormBase JBrowse: tutorial and demo
  5. Community curation
  6. Introduction to the Alliance of Genome Resources

 

Looking for naming conventions and guidelines?

If you have started a new worm lab or are looking for nomenclature guidelines for genes, alleles and other genetic entities, please consult this page of our online user guide: https://wormbase.org/about/userguide/nomenclature#f1il048b3g6e2597cmdjkh–10

Different types of properly named entities (genes, alleles, strains, transgenes, etc.) in published papers are identified by text-mining and other WormBase tools and/or via manual curation. Following official nomenclature guidelines makes your data discoverable by WormBase and thus by the whole community!

Getting DNA sequence from JBrowse

There are two ways that users can download DNA sequence from JBrowse: in relation to a feature, like a gene, or by specific sequence coordinates.

Getting Sequence for a Gene or Transcript

To get the sequence for a gene, open any of the “Curated Genes” tracks, right click or control click on the feature, and select “View Sequence” from the resulting popup menu. If you are using the main “Curated Genes” track, you’ll get a dialog box asking which transcript you want to view, because JBrowse needs to know which subfeatures (exons, introns and so on) to show. If you’re using one of the “type-specific” tracks (like “Curated Genes (noncoding)”), you won’t see this dialog because these tracks show individual transcripts rather than grouping the transcripts together like the “main” gene track does.

It’s possible (in fact, likely) that you’ll see a warning message about overlapping subfeatures (since exons and CDS overlap). After that, you’ll be presented with a dialog box containing a list of subfeatures with buttons, and the sequence below them. The buttons modify the sequence display to highlight, hide or change the case of the selected subfeatures. For example, you could do an “in silico mRNA processing” by hiding the introns. In addition, there are drop down menus for showing a set range of upstream or downstream sequence, up to 4kb.

This sequence can then be selected, copied and pasted into a word processing program with highlighting preserved. Hidden subfeatures and case changes are part of the text itself, so they are preserved wherever the sequence is pasted.
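
If you would rather do the same kind of “in silico mRNA processing” outside the browser, the idea can be sketched in a few lines of Python. This is only an illustration of hiding introns and changing case, not how JBrowse does it internally. The function names, the toy sequence and the exon coordinates are all made up for the example, and coordinates are assumed to be 1-based and inclusive, relative to the start of the feature.

```python
def splice(sequence, exons, feature_start=1):
    """Return only the exon sequence, i.e. the feature with introns hidden.

    exons is a list of (start, end) pairs, assumed 1-based and inclusive,
    counted from feature_start.
    """
    parts = []
    for start, end in sorted(exons):
        parts.append(sequence[start - feature_start:end - feature_start + 1])
    return "".join(parts)


def lowercase_introns(sequence, exons, feature_start=1):
    """Return the full sequence with exons in upper case and the rest in lower case."""
    chars = list(sequence.lower())
    for start, end in exons:
        for i in range(start - feature_start, end - feature_start + 1):
            chars[i] = chars[i].upper()
    return "".join(chars)


# Hypothetical toy transcript: two six-base exons around a four-base intron.
genomic = "ATGGCCGTAATTTTAA"
exons = [(1, 6), (11, 16)]

print(splice(genomic, exons))             # ATGGCCTTTTAA
print(lowercase_introns(genomic, exons))  # ATGGCCgtaaTTTTAA
```

As with the highlighting tool, the result is only as good as the subfeature boundaries you give it, so check the coordinates if it matters.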

Two Important Notes

  1. The highlighting tool doesn’t always get the subfeatures right! It sometimes gets confused when calculating boundaries and their overlaps. If it’s important, please take the time to check the results before using them.
  2. Highlighting doesn’t always get preserved when copying and pasting, and it seems to depend on which browser is being used. In my hands, it works in Firefox but not in Safari or Chrome.

Getting Sequence for Other (non-gene) Features

For features that are not gene related, the process is similar. Control or right click on a feature and select “View Details” from the popup menu. The resulting dialog contains all of the information that JBrowse “knows” about the feature and, as long as the feature isn’t too long (i.e., less than 250kb), will contain its sequence. In this instance you don’t even need to select and copy to get the sequence, since there is a “save as FASTA” button.
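
Once a FASTA file has been saved, it is easy to read back in for your own analysis. The sketch below is a minimal plain-Python reader with no extra libraries. The function name and the example header line are invented for illustration, and the exact header that JBrowse writes may look different.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical file contents; for a real file, pass open(path) instead.
example = """>example_feature II:20000..20011
ACGTACGT
ACGT
""".splitlines()

for name, seq in read_fasta(example):
    print(name, len(seq))
```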

Getting Sequence of a Specific Range

If you need some other range of sequence not directly associated with a feature, you can use the “Save track data” functionality that is built into every track. In this instance, we want the data from the reference sequence track (i.e., the DNA sequence). To do this, turn on the “Reference Sequence (DNA)” track. Then zoom, pan, search for a landmark or enter exact coordinates in the search field to navigate to the exact region that you want the sequence for. The coordinates are in the format “Chromosome:start..end” (as in GBrowse; for example, “II:20000..30000”). When you’re there, mouse over the track label (i.e., the words “Reference Sequence (DNA)” in the JBrowse view); the track label will become more opaque and a down triangle will appear, indicating that a menu is available.

Click on that triangle and select “Save track data” from the menu. You will be presented with a dialog that lets you either view or save the visible region as FASTA. There isn’t an option to pick where it gets saved, so look for it in your default download location, typically a Downloads folder. A useful side note is that this option is available for every track, so you can always download the data for the track you are looking at, regardless of the data type.
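
If you build region strings in a script before pasting them into the search field, a small check can catch typos early. The sketch below parses the “Chromosome:start..end” format described above. The function names are made up for this example, and the length calculation assumes the coordinates are 1-based and inclusive.

```python
import re

# Matches strings such as "II:20000..30000".
REGION = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>\d+)\.\.(?P<end>\d+)$")


def parse_region(text):
    """Split a "Chromosome:start..end" string into (chromosome, start, end)."""
    match = REGION.match(text.strip())
    if match is None:
        raise ValueError(f"not a Chromosome:start..end region: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    if start > end:
        raise ValueError(f"start is greater than end in {text!r}")
    return match["chrom"], start, end


def region_length(text):
    """Number of bases in the region, assuming 1-based inclusive coordinates."""
    _, start, end = parse_region(text)
    return end - start + 1


print(parse_region("II:20000..30000"))   # ('II', 20000, 30000)
print(region_length("II:20000..30000"))  # 10001
```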

Update on protein-protein interaction data in WormBase

Currently WormBase contains over 28,000 physical protein-protein interactions, of which 1,500 have been curated by BioGRID in collaboration with WormBase. Within the data set, over 17,000 protein-protein interactions are unique, and over 6,000 unique genes are involved in these interactions. In WormBase, protein-protein interaction data can be found as a subclass of physical interaction data in the ‘Interactions widget’ on the gene report page. The interactions widget provides all types of interaction data related to the gene of interest, such as physical, genetic, regulatory, and predicted interactions. Check out the micropublication ‘2018 Update on Protein-Protein Interaction Data in WormBase’ by Jae Cho et al. to learn more about this data type in WormBase.

New SPELL server: Fresh look and faster

Please note that we have moved the WormBase SPELL tool to a new server on the Amazon Cloud: https://spell.wormbase.org. This should not affect any of our users, as the ‘Tools’ link on the main WormBase menu bar will take you to this new URL.

SPELL (Serial Pattern of Expression Levels Locator) is a query-driven search engine for microarray, RNAseq and proteomics data. Given a small set of query genes, SPELL identifies which datasets are most informative for these genes; then, within those datasets, it identifies additional genes whose expression profiles are most similar to the query set. WormBase SPELL has collected over 6,000 experiments for 9 nematode species. Users can also download these datasets for their own analysis. See the WormBase SPELL Tutorial: http://wiki.wormbase.org/index.php/SPELL

The new WormBase SPELL server is twice as fast as the old version. Please give it a try and let us know what you think!
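
To give a feel for the two-step idea behind a query-driven search (weight the datasets by the query genes, then rank the other genes), here is a toy sketch in plain Python. It is a simplified stand-in and not the algorithm that SPELL actually runs: here a dataset’s weight is just the mean pairwise Pearson correlation among the query genes, and each remaining gene is scored by its weighted average correlation to them. All gene names, expression values and function names are invented for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def dataset_weight(dataset, query):
    """Mean pairwise correlation among the query genes, floored at zero."""
    present = [gene for gene in query if gene in dataset]
    pairs = list(combinations(present, 2))
    if not pairs:
        return 0.0
    mean_r = sum(pearson(dataset[a], dataset[b]) for a, b in pairs) / len(pairs)
    return max(mean_r, 0.0)


def rank_genes(datasets, query):
    """Return (dataset weights, [(gene, score), ...] sorted best first)."""
    weights = [dataset_weight(dataset, query) for dataset in datasets]
    scores, totals = {}, {}
    for dataset, weight in zip(datasets, weights):
        if weight == 0.0:
            continue  # uninformative dataset for this query
        present = [gene for gene in query if gene in dataset]
        for gene, profile in dataset.items():
            if gene in query:
                continue
            r = sum(pearson(profile, dataset[q]) for q in present) / len(present)
            scores[gene] = scores.get(gene, 0.0) + weight * r
            totals[gene] = totals.get(gene, 0.0) + weight
    ranked = sorted(((scores[g] / totals[g], g) for g in scores), reverse=True)
    return weights, [(gene, score) for score, gene in ranked]


# Hypothetical expression profiles (gene -> values across four conditions).
dataset1 = {"geneA": [1, 2, 3, 4], "geneB": [2, 4, 6, 8],
            "geneC": [1.5, 2.5, 3.5, 4.5], "geneD": [4, 3, 2, 1]}
dataset2 = {"geneA": [1, 2, 3, 4], "geneB": [4, 3, 2, 1],
            "geneC": [1, 1, 2, 1], "geneD": [3, 1, 4, 1]}

weights, ranked = rank_genes([dataset1, dataset2], ["geneA", "geneB"])
print(weights)
print(ranked)
```

In this toy example the query genes move together only in the first dataset, so the second one gets a weight of zero and geneC, which tracks the query in the first dataset, comes out on top.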