Variation names restored to the Genome Browser

September 29, 2010 by Todd Harris

We’re happy to report that variations are now displayed on the Genome Browser and Gene Summary images using their familiar public names (e.g., m2). Public names were temporarily hidden when we introduced unique identifiers for variations. This behind-the-scenes change was made so that variations from resequencing and population genetics studies can be tracked more easily. We apologize for any difficulties this change may have caused.

Remote access to relational sequence feature databases

June 19, 2010 by Todd Harris

Power users: you can now remotely access our sequence feature databases.

Host: mining.wormbase.org
Port: 3306
User: remote-user
Pass: (none; leave empty)

[tharris@unkar: ~]> mysql -h mining.wormbase.org -u remote-user
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 14
Server version: 5.1.45-1-log (Debian)

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| b_malayi           |
| c_brenneri         |
| c_briggsae         |
| c_elegans          |
| c_elegans_gmap     |
| c_elegans_pmap     |
| c_japonica         |
| c_remanei          |
| clustal            |
| h_bacteriophora    |
| m_hapla            |
| m_incognita        |
| p_pacificus        |
| test               |
+--------------------+
15 rows in set (0.00 sec)
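
The connection details above can also be written as a DBI data source string with the port spelled out. This is only a minimal sketch: build_dsn is a hypothetical helper, not part of any WormBase tool, and it just assembles and prints the string without contacting the server.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): assemble a DBD::mysql
# data source name from its parts. The port is optional.
sub build_dsn {
    my (%args) = @_;
    my $dsn = "dbi:mysql:database=$args{database};host=$args{host}";
    $dsn .= ";port=$args{port}" if defined $args{port};
    return $dsn;
}

# Values taken from the connection details listed in this post
my $dsn = build_dsn(
    database => 'c_elegans',
    host     => 'mining.wormbase.org',
    port     => 3306,
);
print "$dsn\n";
```

The resulting string should be usable as the first argument to DBI->connect, together with the user remote-user and an empty password.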

Here’s an example script written in Perl using Bio::DB::GFF.

#!/usr/bin/perl

use Bio::DB::GFF;
use strict;

my $db = Bio::DB::GFF->new(-dsn  => 'dbi:mysql:c_elegans:mining.wormbase.org',
                           -user => 'remote-user',
                           -pass => '',)
  || die "Couldn't establish a connection to remote data mining server: $!";

my $iterator = $db->get_seq_stream(-type => ['coding:exon'] );

# Iterate over all of the requested features
while (my $feature = $iterator->next_seq) {

    # Create a more informative header
    my $name   = $feature->name;
    my $type   = $feature->type;
    my $start  = $feature->start;
    my $stop   = $feature->stop;
    my $strand = $feature->strand;
    my $refseq = $feature->sourceseq; # This is the name of the chromosome
    my $header = ">$name ($type; strand: $strand; $refseq: $start..$stop)";

    # Fetch the sequence of the feature and convert it to fasta.
    # $header already starts with ">", so it is printed as-is.
    my $seq = to_fasta($feature->dna);
    print "$header\n", $seq, "\n";
}

# This subroutine converts a dna string into fasta format
sub to_fasta {
  my $sequence = shift;

  # Return the input unchanged if it is already in fasta format.
  return $sequence if ($sequence =~ /^>(.+)$/m);

  # This is the business part of the subroutine.
  # Place a newline after every 80 characters
  $sequence =~ s/(.{80})/$1\n/g;
  return $sequence;
}
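
A side note on the wrapping step: the substitution in to_fasta appends a newline after every complete run of 80 characters, so a sequence whose length is an exact multiple of 80 is followed by an empty line once the script prints its own newline. One possible alternative, sketched here with hypothetical helper names (wrap_sequence, fasta_record) and made-up data, splits the string with unpack instead:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): wrap a raw DNA
# string into lines of at most $width characters, with no trailing newline.
sub wrap_sequence {
    my ($sequence, $width) = @_;
    $width ||= 80;
    return join "\n", unpack("(A$width)*", $sequence);
}

# Hypothetical helper: build one FASTA record from a header and a sequence.
sub fasta_record {
    my ($header, $sequence) = @_;
    return ">$header\n" . wrap_sequence($sequence) . "\n";
}

# Made-up 170-base sequence, for illustration only
my $dna = 'ACGT' x 42 . 'AC';
print fasta_record('example (made-up data)', $dna);
```

With this approach every record ends in exactly one newline, whatever the sequence length. The A template strips trailing whitespace from each chunk, which should be harmless for plain DNA strings.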

Questions? Hit me up at [email protected]. And remember, please play nice: this is a shared resource. Egregious use that significantly disrupts other users may be curtailed without warning.

mining.wormbase.org: now online

May 28, 2010 by Todd Harris

We’re happy to report that the new data mining server is now online. aceserver.cshl.edu will be retired at the end of the first week of June 2010.

If you use aceserver.cshl.edu for programmatic access to WormBase, please update your scripts now with the following information:

host : mining.wormbase.org
port : 2005  # for acedb queries
port : 3306  # MySQL queries of sequence feature databases via Bio::DB::GFF/Bio::DB::SeqFeature

Here’s an example script using Ace.pm that lists all of the genes in the Unc gene class:

#!/usr/bin/perl

use Ace;
use strict;

my $db = Ace->connect(-host=>'mining.wormbase.org',-port=>'2005')
  or die "Can't connect to the server: $!";

# Get all genes in the Unc gene_class
my $gene_class = $db->fetch(Gene_class=>'unc');
my @genes = $gene_class->Genes;
foreach (@genes) {
    print join("\t",$_, $_->Public_name),"\n";
}

And here’s a script that mines sequence features using Bio::DB::GFF. It fetches all coding exons and prints their sequences in FASTA format. Please note that access to the MySQL databases is pending firewall reconfiguration, which should be complete within the next week.

#!/usr/bin/perl

use Bio::DB::GFF;
use strict;

my $dsn     = 'c_elegans:mining.wormbase.org';
my $feature = 'coding_exon';

my $db = Bio::DB::GFF->new(-dsn  => 'dbi:mysql:' . $dsn,
                           -user => 'remote-user',
                           -pass => '',)
  || die "Couldn't establish a connection to $dsn";

my $iterator = $db->get_seq_stream(-type => $feature);

# Iterate over all of the requested features
while (my $feature = $iterator->next_seq) {

    # Create a more informative header
    my $name   = $feature->name;
    my $type   = $feature->type;
    my $start  = $feature->start;
    my $stop   = $feature->stop;
    my $strand = $feature->strand;
    my $refseq = $feature->sourceseq; # This is the name of the chromosome
    my $header = ">$name ($type; strand: $strand; $refseq: $start..$stop)";

    # Fetch the sequence of the feature and convert it to fasta.
    # $header already starts with ">", so it is printed as-is.
    my $seq = to_fasta($feature->dna);
    print "$header\n", $seq, "\n";
}

# This subroutine converts a dna string into fasta format
sub to_fasta {
  my $sequence = shift;

  # Return the input unchanged if it is already in fasta format.
  return $sequence if ($sequence =~ /^>(.+)$/m);

  # This is the business part of the subroutine.
  # Place a newline after every 80 characters
  $sequence =~ s/(.{80})/$1\n/g;
  return $sequence;
}

Questions? Hit me up at [email protected].

Aceserver retiring; new data mining server end of May

May 18, 2010 by Todd Harris

A friend of mine once told me, “If it isn’t grown, it has to be mined.” Maybe it was a bumper sticker.

Well, if it isn’t displayed, it has to be data-mined.

Do you use aceserver.cshl.edu for programmatic access to WormBase? Heads up. Aceserver will be retired at the end of May 2010 and replaced with a new server in a new location. Stay tuned. We’ll report here with additional details shortly.

By all means, you should subscribe to the WormBase blog feed if you aren’t already.

Beta testing new Genome Browser

March 28, 2010 by Todd Harris

We’re now beta testing a new version of the Genome Browser. Care to give it a whirl? Waltz on over and give it a spin.

Be sure to let me know if you encounter any problems.

WormBase -- The Blog

The Official Blog of WormBase
