Power users: you can now remotely access our sequence feature databases.
Host: mining.wormbase.org
Port: 3306
User: remote-user
Pass: none
    [tharris@unkar: ~]> mysql -h mining.wormbase.org -u remote-user
    Welcome to the MySQL monitor.  Commands end with ; or \g.
    Your MySQL connection id is 14
    Server version: 5.1.45-1-log (Debian)

    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

    mysql> show databases;
    +--------------------+
    | Database           |
    +--------------------+
    | information_schema |
    | b_malayi           |
    | c_brenneri         |
    | c_briggsae         |
    | c_elegans          |
    | c_elegans_gmap     |
    | c_elegans_pmap     |
    | c_japonica         |
    | c_remanei          |
    | clustal            |
    | h_bacteriophora    |
    | m_hapla            |
    | m_incognita        |
    | p_pacificus        |
    | test               |
    +--------------------+
    15 rows in set (0.00 sec)
Here’s an example script written in Perl using Bio::DB::GFF.
    #!/usr/bin/perl
    use Bio::DB::GFF;
    use strict;

    my $db = Bio::DB::GFF->new(-dsn  => 'dbi:mysql:c_elegans:mining.wormbase.org',
                               -user => 'remote-user',
                               -pass => '',)
        || die "Couldn't establish a connection to remote data mining server: $!";

    my $iterator = $db->get_seq_stream(-type => ['coding:exon']);

    # Iterate over all of the requested features
    while (my $feature = $iterator->next_seq) {
        # Create a more informative header
        my $name   = $feature->name;
        my $type   = $feature->type;
        my $start  = $feature->start;
        my $stop   = $feature->stop;
        my $strand = $feature->strand;
        my $refseq = $feature->sourceseq;  # This is the name of the chromosome

        # The leading ">" is added once, in the print statement below
        my $header = "$name ($type; strand: $strand; $refseq: $start..$stop)";

        # Fetch the sequence of the feature and convert it to fasta
        my $seq = to_fasta($feature->dna);
        print ">$header\n", $seq, "\n";
    }

    # This subroutine converts a dna string into fasta format
    sub to_fasta {
        my $sequence = shift;

        # Return the input unchanged if we are already in fasta format.
        return $sequence if ($sequence =~ /^>(.+)$/m);

        # This is the business part of the subroutine.
        # Place a newline after every 80 characters
        $sequence =~ s/(.{80})/$1\n/g;
        return $sequence;
    }
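If you want to check the line-wrapping step without touching the server, here is a minimal standalone sketch. The name wrap_sequence and its optional width argument are mine, purely for illustration. They are not part of Bio::DB::GFF, and the sketch assumes the input is a bare sequence string with no embedded newlines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::DB::GFF): break a raw sequence
# string into lines of at most $width characters, each ending in a newline.
sub wrap_sequence {
    my ($sequence, $width) = @_;
    $width ||= 80;    # default to 80 columns, as in the script above

    return '' unless defined $sequence && length $sequence;

    # In list context, a global match returns every captured chunk;
    # the final chunk may be shorter than $width.
    my @lines = $sequence =~ /(.{1,$width})/g;
    return join("\n", @lines) . "\n";
}

# Dummy data for illustration only: 200 bases wrap to lines of 80, 80 and 40.
my $dna = 'ACGT' x 50;
print ">example_feature\n", wrap_sequence($dna);
```

Unlike the substitution in to_fasta, this version always ends the output with exactly one newline, so the caller should not print an extra one after it.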
Questions? Hit me up at todd@wormbase.org. And remember, please play nice: this is a shared resource. Egregious use that significantly disrupts other users may be curtailed without warning.
Michael says
I guess, the colon in the feature type should be an underscore? … and the greater sign in the print statement.
tharris says
There may be encoded characters (although I’m not seeing the examples you point out). Feature type should be “coding:exon”, and the greater sign in the print statement is just for generating a fasta header.