Another record year for WormBase: over 50 million pages served.

Curious about how many people use WormBase and how often? Me too.

I just finished compiling the access statistics for 2010. Last year, WormBase served 51,606,849 distinct pages. This is a dramatic increase over 2009 (34,106,168 pages), roughly 51% more, and continues the trend seen over the last few years. Note: spiders, web-crawling robots, and programmatic data-mining requests are excluded from these tallies.

We have some great things planned for our user community this year, including a ground-up redesign of the site, intended in part to meet this growing demand.

WormMart is under redevelopment

We know that several of our users have had problems using WormMart. Please be aware that WormMart is under redevelopment: some datasets are unavailable, and the tool has known bugs that we are actively working on. Developers aim to build a new release before the end of the year. We are sorry for this inconvenience.

Did you know that WormBase provides useful data files for download?

WormBase maintains a public FTP site where you can find many commonly requested files and datasets, the WormBase software, and prepackaged databases. DNA sequence data for the genomes of C. elegans, C. briggsae, C. remanei, etc., are available in FASTA format, as are protein data. Microarray data, such as the up-to-date mapping of microarray probes to WormBase genes for Affymetrix, Agilent, Washington University Genome Sequencing Center, and Stanford Microarray Database (SMD) chips, are also available. For C. elegans, the following files, among many others, can be downloaded from the FTP site: confirmed_genes, which lists curated C. elegans genes that have been confirmed by transcriptional data; and wormpep, FASTA-format files containing predicted and confirmed protein translations.

Take a look at our FTP site at ftp://ftp.wormbase.org/pub/wormbase/.  Be sure to look at the README file in each directory for a listing of the contents of that directory.
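
As an illustration of working with the FASTA-format files mentioned above, here is a minimal Python sketch that reads FASTA text into a dictionary keyed by record ID. The function name read_fasta and the two example records are made up for illustration and are not part of any WormBase software. The sketch assumes the record ID is the first whitespace-delimited token on each header line.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {record_id: sequence} dictionary."""
    records: Dict[str, str] = {}
    current_id = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current_id is not None:
                records[current_id] = "".join(chunks)
            # Assumption: the ID is the first token after ">".
            tokens = line[1:].split()
            current_id = tokens[0] if tokens else ""
            chunks = []
        elif current_id is not None:
            # Sequence lines are concatenated until the next header.
            chunks.append(line)
    if current_id is not None:
        records[current_id] = "".join(chunks)
    return records


# Made-up records for illustration only (not real WormBase data).
example = """>protA hypothetical example protein
MKT
LLV
>protB
MSD
""".splitlines()

proteins = read_fasta(example)
for record_id, sequence in proteins.items():
    print(record_id, len(sequence))
```

For a file downloaded from the FTP site, the same function can be given an open file handle in place of the example lines.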

Genomes in WormBase

In addition to C. elegans, WormBase provides several resources for viewing and obtaining genome information for different worm species. WormBase classifies genomes into tiers according to the amount of curation effort it is able to spend on maintaining them. For a description of the various genomes either in or coming to WormBase, their current status, and the resources available for each, please visit http://wiki.wormbase.org/index.php/WormBase_Genomes.

WormBase refines method to map RNAi targets to the genome

While curating RNA interference (RNAi) data, WormBase routinely maps the targets of each RNAi experiment to the genome, based on information in the paper that describes the experiment. Recently WormBase has refined this process and addressed inconsistencies in target determination. Previously, we did not filter out highly fragmented hits. That is, when many very short alignments occurred close together on the genome, our mapping script concatenated them, much as it does when it skips over introns. These hits caused errant primary and secondary targets to be displayed. Most targets for RNAi experiments remain unchanged, but the errant hits have been removed from WormBase.
The criteria for primary and secondary target determination (these descriptions are also on the individual RNAi report pages) are as follows:
Primary targets: These are targets that have sequence identity to the RNAi probe of at least 95%, over a stretch of at least 100 nucleotides, identified using a combination of BLAST and BLAT algorithms. These are usually the intended target genes of an RNAi experiment.
Secondary targets: These are targets that have between 80 and 94.99% sequence identity over a stretch of at least 200 nucleotides to the RNAi probe. Targets (and overlapping genes) that satisfy these criteria may or may not be susceptible to an RNAi effect with the given probe and represent secondary (unintended) genomic targets of an RNAi experiment.
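
The two sets of criteria above can be expressed as a simple threshold check. The following Python sketch is illustrative only and is not the WormBase mapping script. The function name classify_rnai_target and its parameters are hypothetical, and the sketch assumes that percent identity and aligned length have already been computed for a single hit. It treats the secondary range as 80% up to, but not including, 95%.

```python
def classify_rnai_target(percent_identity: float, aligned_length: int) -> str:
    """Classify one probe-to-genome hit as 'primary', 'secondary', or 'none'.

    percent_identity: sequence identity to the RNAi probe, in percent.
    aligned_length: length of the aligned stretch, in nucleotides.
    """
    # Primary: at least 95% identity over at least 100 nucleotides.
    if percent_identity >= 95.0 and aligned_length >= 100:
        return "primary"
    # Secondary: 80% to just under 95% identity over at least 200 nucleotides.
    # Assumption: "80 to 94.99%" is read as the half-open range [80, 95).
    if 80.0 <= percent_identity < 95.0 and aligned_length >= 200:
        return "secondary"
    # Anything else meets neither set of criteria.
    return "none"


# Made-up hits for illustration: (percent identity, aligned length).
hits = [(98.5, 150), (85.0, 250), (85.0, 150), (97.0, 60)]
labels = [classify_rnai_target(identity, length) for identity, length in hits]
print(labels)
```

Note that under these thresholds a hit with high identity over a short stretch (for example, 97% over 60 nucleotides) is neither a primary nor a secondary target.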