
We’re happy to report that the new data mining server is now online. aceserver.cshl.edu will be retired at the end of the first week of June 2010.

If you use aceserver.cshl.edu for programmatic access to WormBase, please update your scripts now with the following information:

host : mining.wormbase.org
port : 2005  # for acedb queries
port : 3306  # MySQL queries of sequence feature databases via Bio::DB::GFF/Bio::DB::SeqFeature
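If several of your scripts talk to the server, it may help to keep these settings in one place. The following is a minimal sketch, not an official recipe: the %server hash and its key names are made up for illustration, and c_elegans is simply the database name used by the Bio::DB::GFF example in this post. It only builds the connection arguments and prints them; it does not contact the server.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Connection settings from this announcement, collected in one place.
# The hash name and keys are illustrative, not part of any WormBase API.
my %server = (
    host       => 'mining.wormbase.org',
    acedb_port => 2005,    # acedb queries
    mysql_port => 3306,    # MySQL queries of sequence feature databases
);

# Arguments in the form Ace->connect takes (-host and -port)
my @ace_args = (-host => $server{host}, -port => $server{acedb_port});

# DBI-style DSN for a MySQL database; 'c_elegans' is an assumed database name
my $dsn = "dbi:mysql:database=c_elegans;host=$server{host};port=$server{mysql_port}";

print "Ace: @ace_args\n";
print "DSN: $dsn\n";
```

With the settings gathered like this, a future host or port change means editing one hash rather than every script.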

Here’s an example script using Ace.pm that lists all of the genes in the Unc gene class:

#!/usr/bin/perl

use strict;
use Ace;

my $db = Ace->connect(-host => 'mining.wormbase.org', -port => 2005)
    or die "Can't connect to the server: ", Ace->error;

# Get all genes in the Unc gene_class
my $gene_class = $db->fetch(Gene_class => 'unc');
my @genes = $gene_class->Genes;
foreach (@genes) {
    print join("\t", $_, $_->Public_name), "\n";
}

And here’s a script that mines sequence features using Bio::DB::GFF. It fetches all coding exons and prints their sequences in FASTA format. Please note that access to the MySQL databases is pending firewall reconfiguration, which should be complete within the next week.

#!/usr/bin/perl

use strict;
use Bio::DB::GFF;

my $dsn     = 'c_elegans:mining.wormbase.org';
my $feature = 'coding_exon';

my $db = Bio::DB::GFF->new(-dsn  => 'dbi:mysql:' . $dsn,
                           -user => 'remote-user',
                           -pass => '')
  || die "Couldn't establish a connection to $dsn";

my $iterator = $db->get_seq_stream(-type => $feature);

# Iterate over all of the requested features
while (my $feature = $iterator->next_seq) {
    # Create a more informative header
    my $name   = $feature->name;
    my $type   = $feature->type;
    my $start  = $feature->start;
    my $stop   = $feature->stop;
    my $strand = $feature->strand;
    my $refseq = $feature->sourceseq; # This is the name of the chromosome
    my $header = ">$name ($type; strand: $strand; $refseq: $start..$stop)";

    # Fetch the sequence of the feature and convert it to fasta
    my $seq = to_fasta($feature->dna);
    print "$header\n", $seq, "\n";
}

# This subroutine converts a dna string into fasta format
sub to_fasta {
  my $sequence = shift;

  # Return the input unchanged if we are already in fasta format.
  return $sequence if ($sequence =~ /^>(.+)$/m);

  # This is the business part of the subroutine.
  # Place a newline after every 80 characters
  $sequence =~ s/(.{80})/$1\n/g;
  return $sequence;
}
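To see what the line-wrapping step in to_fasta does without connecting to the database, here is a small standalone sketch. The wrap_sequence helper, its optional width argument, and the 25-base test sequence are all made up for illustration; they are not part of the script above. Unlike to_fasta, it also trims the trailing newline left behind when the sequence length is an exact multiple of the width.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): wrap a raw
# sequence string at $width characters per line (default 80).
sub wrap_sequence {
    my ($sequence, $width) = @_;
    $width ||= 80;
    $sequence =~ s/(.{$width})/$1\n/g;
    chomp $sequence;    # drop the extra newline when the length is a multiple of $width
    return $sequence;
}

# Made-up 25-base sequence, wrapped at 10 characters per line
my $wrapped = wrap_sequence('ACGTACGTACGTACGTACGTACGTA', 10);
print ">example\n$wrapped\n";
```

A narrow width of 10 is used here only to keep the example short; FASTA files conventionally use 60 to 80 characters per line.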

Questions? Hit me up at [email protected].

WormBase -- The Blog

The Official Blog of WormBase
