mining.wormbase.org: now online

May 28, 2010 by Todd Harris

We’re happy to report that the new data mining server is now online. aceserver.cshl.edu will be retired at the end of the first week of June 2010.

If you use aceserver.cshl.edu for programmatic access to WormBase, please update your scripts now with the following information:

host : mining.wormbase.org
port : 2005  # for acedb queries
port : 3306  # MySQL queries of sequence feature databases via Bio::DB::GFF/Bio::DB::SeqFeature
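For scripts that hard-code the old hostname, the switch can be sketched as a plain text substitution. This is only a minimal sketch, not an official WormBase tool: the helper name update_host is hypothetical, and it assumes the old hostname appears in your scripts as a literal string. It does not touch port numbers.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of Ace.pm or BioPerl): swap the retired
# hostname for the new one wherever it appears as a literal string.
sub update_host {
    my ($text) = @_;
    $text =~ s/\baceserver\.cshl\.edu\b/mining.wormbase.org/g;
    return $text;
}

# Example: a connection line as it might appear in an older script
my $old_line = q{my $db = Ace->connect(-host=>'aceserver.cshl.edu',-port=>2005);};
print update_host($old_line), "\n";
```

The same substitution can be applied to a file in place with perl -pi -e 's/aceserver\.cshl\.edu/mining.wormbase.org/g' followed by the script name. Keep a backup and review the result before relying on it.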

Here’s an example script using Ace.pm that lists all of the genes in the Unc gene class:

#!/usr/bin/perl
use Ace;
use strict;

my $db = Ace->connect(-host=>'mining.wormbase.org',-port=>'2005')
                      or die "Can't connect to the server: $!";

# Get all genes in the Unc gene_class
my $gene_class = $db->fetch(Gene_class=>'unc');
my @genes = $gene_class->Genes;
foreach (@genes) {
    print join("\t",$_, $_->Public_name),"\n";
}

And here’s a script that mines sequence features using Bio::DB::GFF. It fetches all coding exons and prints their sequences in FASTA format. Please note that access to the MySQL databases is pending a firewall reconfiguration, which should be complete within the next week.

#!/usr/bin/perl
use Bio::DB::GFF;
use strict;

my $dsn     = 'c_elegans:mining.wormbase.org';
my $feature = 'coding_exon';

my $db = Bio::DB::GFF->new(-dsn  => 'dbi:mysql:' . $dsn,
                           -user => 'remote-user',
                           -pass => '',)
  || die "Couldn't establish a connection to $dsn";

my $iterator = $db->get_seq_stream(-type => $feature);

# Iterate over all of the requested features
while (my $feature = $iterator->next_seq) {
    # Create a more informative header
    my $name   = $feature->name;
    my $type   = $feature->type;
    my $start  = $feature->start;
    my $stop   = $feature->stop;
    my $strand = $feature->strand;
    my $refseq = $feature->sourceseq; # This is the name of the chromosome
    my $header = ">$name ($type; strand: $strand; $refseq: $start..$stop)";

    # Fetch the sequence of the feature and convert it to FASTA.
    # $header already begins with ">", so it is printed as-is.
    my $seq = to_fasta($feature->dna);
    print "$header\n",$seq,"\n";
}

# This subroutine converts a DNA string into FASTA format
sub to_fasta {
  my $sequence = shift;

  # Return the sequence unchanged if it is already in FASTA format.
  return $sequence if ($sequence=~/^>(.+)$/m);

  # This is the business part of the subroutine.
  # Place a newline after every 80 characters
  $sequence =~ s/(.{80})/$1\n/g;
  return $sequence;
}
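The line-wrapping step can be checked without a database connection. The following is a minimal sketch using a synthetic 200-base sequence made up for illustration. The name wrap_sequence and its optional width argument are hypothetical; the default of 80 characters mirrors the subroutine above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-alone version of the wrapping step, for offline testing
sub wrap_sequence {
    my ($sequence, $width) = @_;
    $width ||= 80;

    # Leave the input alone if it already carries a FASTA header line
    return $sequence if $sequence =~ /^>/m;

    # Insert a newline after every $width characters
    $sequence =~ s/(.{$width})/$1\n/g;
    return $sequence;
}

my $dna   = 'ACGT' x 50;    # 200 synthetic bases
my @lines = split /\n/, wrap_sequence($dna);
printf "%d lines; lengths: %s\n", scalar(@lines), join(',', map { length } @lines);
# prints "3 lines; lengths: 80,80,40"
```

Note that a sequence whose length is an exact multiple of the width ends with a trailing newline, so callers that append their own newline will print one blank line after it.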

Questions? Hit me up at todd@wormbase.org.

Colocation problems resolved; service restored

May 27, 2010 by Todd Harris

This morning, a configuration error by our colocation facility blocked access to WormBase. This problem is now resolved. We apologize for the service disruption.

Aceserver retiring; new data mining server end of May

May 18, 2010 by Todd Harris

A friend of mine once told me, “If it isn’t grown, it has to be mined.” Maybe it was a bumper sticker.

Well, if it isn’t displayed, it has to be data-mined.

Do you use aceserver.cshl.edu for programmatic access to WormBase? Heads up. Aceserver will be retired at the end of May 2010 and replaced with a new server in a new location. Stay tuned. We’ll report here with additional details shortly.

By all means, you should subscribe to the WormBase blog feed if you aren’t already.

WormBase collaborates with 'Genetics' to markup papers

May 13, 2010 by Ranjana Kishore

WormBase has been collaborating with the journal ‘Genetics’ and Textpresso (www.textpresso.org) to mark up the online text and PDF versions of papers accepted for publication. The goal of this collaboration is to link entities/database objects within a paper (e.g., gene names, alleles, anatomy terms) to pages in WormBase. Entities from a total of nine data classes are currently being marked up; these classes include genes, proteins, variations, clones, anatomy terms, and authors. We will soon begin the markup of phenotypes. In particular, we will be linking phenotype short names (e.g., dpy), which are commonly used by the worm community and may be confusing for readers not familiar with the C. elegans nomenclature for phenotypes. This project, which has been in production since October 2009, has pioneered a markup pipeline that is starting to be used for marking up papers curated by other model organism databases. We have marked up 23 papers so far.

We ask authors to participate in this effort by alerting us to objects in their papers that don’t exist in WormBase.

Gene page "Location" image problems resolved

May 10, 2010 by Todd Harris

Wondering why your favorite gene page report is showing the wrong genomic location? Well, we were too. We’re happy to report that this issue is now resolved. We hope 🙂

WormBase -- The Blog

The Official Blog of WormBase
