Possible service interruptions, 31 August 2010

We’re relocating some services to a new hosting facility beginning at 10 AM ET (GMT -4) on Tuesday, August 31, 2010. We plan to keep systems at the old facility running during the transition, but because the move requires changes to the global domain name system (DNS) records, you may encounter intermittent service interruptions or “server not found” errors. We apologize in advance for any difficulties.

Updates like this typically propagate through the DNS within hours, but can take up to 48 hours to propagate fully.

Remote access to relational sequence feature databases

Power users: you can now remotely access our sequence feature databases.

Host: mining.wormbase.org
Port: 3306
User: remote-user
Pass: (none)
[tharris@unkar: ~]> mysql -h mining.wormbase.org -u remote-user
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 14
Server version: 5.1.45-1-log (Debian)

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| b_malayi           |
| c_brenneri         |
| c_briggsae         |
| c_elegans          |
| c_elegans_gmap     |
| c_elegans_pmap     |
| c_japonica         |
| c_remanei          |
| clustal            |
| h_bacteriophora    |
| m_hapla            |
| m_incognita        |
| p_pacificus        |
| test               |
+--------------------+
15 rows in set (0.00 sec)
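
The same kind of query can be scripted rather than typed interactively. The following is a minimal sketch, assuming the standard mysql command-line client is installed locally. The function name wb_query and the WB_DRY_RUN variable are hypothetical helpers made up for this illustration and are not part of WormBase. With WB_DRY_RUN=1 the function only prints the command it would run, which is a cheap way to check a script before pointing it at the shared server.

```shell
#!/bin/sh
# Hypothetical helper: run one SQL statement against the remote server.
# Host and user are taken from the connection details above; no password is set.
WB_HOST="mining.wormbase.org"
WB_USER="remote-user"

wb_query() {
    # $1 = database name (e.g. c_elegans), $2 = SQL statement
    if [ "${WB_DRY_RUN:-0}" = "1" ]; then
        # Dry run: print the command instead of executing it
        printf 'mysql -h %s -u %s --batch -e "%s" %s\n' \
            "$WB_HOST" "$WB_USER" "$2" "$1"
    else
        # --batch produces tab-separated output that is easy to parse
        mysql -h "$WB_HOST" -u "$WB_USER" --batch -e "$2" "$1"
    fi
}
```

For example, wb_query c_elegans 'SHOW TABLES;' would list the tables in the c_elegans database. Keeping each call to a single statement makes it easier to stay light on the shared resource.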

Here’s an example script written in Perl using Bio::DB::GFF.

#!/usr/bin/perl

use strict;
use warnings;
use Bio::DB::GFF;

my $db = Bio::DB::GFF->new(-dsn  => 'dbi:mysql:c_elegans:mining.wormbase.org',
                           -user => 'remote-user',
                           -pass => '',)
  || die "Couldn't establish a connection to remote data mining server: $!";

my $iterator = $db->get_seq_stream(-type => ['coding:exon'] );

# Iterate over all of the requested features
while (my $feature = $iterator->next_seq) {

    # Create a more informative header
    my $name   = $feature->name;
    my $type   = $feature->type;
    my $start  = $feature->start;
    my $stop   = $feature->stop;
    my $strand = $feature->strand;
    my $refseq = $feature->sourceseq; # This is the name of the chromosome
    my $header = ">$name ($type; strand: $strand; $refseq: $start..$stop)";

    # Fetch the sequence of the feature and convert it to fasta.
    # $header already begins with ">", so it is printed as-is.
    my $seq = to_fasta($feature->dna);
    print "$header\n", $seq, "\n";
}

# This subroutine converts a dna string into fasta format
sub to_fasta {
  my $sequence = shift;

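  # Return the input unchanged if it is already in fasta format.
  return $sequence if ($sequence =~ /^>(.+)$/m);

  # This is the business part of the subroutine.
  # Insert a newline after every 80 characters
  $sequence =~ s/(.{80})/$1\n/g;
  # Drop a trailing newline so the caller's own "\n" does not leave a blank line
  $sequence =~ s/\n\z//;
  return $sequence;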
}
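
Once the script's output has been saved to a file, a quick sanity check is to count the FASTA records it contains. This is a minimal sketch, assuming the output was redirected to a file. The helper name count_fasta_records and the file names in the usage note are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count FASTA records in a file.
# Each record starts with a header line beginning with ">".
count_fasta_records() {
    # $1 = path to a FASTA file; grep -c prints the number of matching lines
    grep -c '^>' "$1"
}
```

For example, after running perl fetch_exons.pl > exons.fa (both names are placeholders), count_fasta_records exons.fa prints how many features were written.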

Questions? Hit me up at [email protected]. And remember, please play nice: this is a shared resource. Egregious use that significantly disrupts other users may be curtailed without warning.