Private instances of WormBase in 10 minutes

Are you interested in running your own private instance of WormBase without having to work through a lengthy and complicated installation procedure? The Amazon Elastic Compute Cloud (EC2) Machine Images (AMIs) of WormBase are your answer.

Please see this Worm Breeder’s Gazette post for an overview of the process and this blog post for a detailed walkthrough.

Questions? Please drop me a line by email ([email protected]) or Twitter (@tharris).

ModENCODE Data Access Webinar, August 4, 12:00 EDT

“Understanding DNA replication and Nucleosome structure”. This webinar will introduce participants to the data obtained in studies of replication and nucleosome structure in D. melanogaster, and describe how to visualize and download these data from the modENCODE web site. Space is limited, so advance registration is necessary. Please email [email protected] to reserve your spot. More details and the schedule are available at modENCODE.

Upgraded Forums

This weekend, we upgraded the Worm Community Forums (http://forums.wormbase.org). The upgrade brings a cleaner, simpler look and feel and, for us, vastly better spam protection.

New users can register automatically without having to wait for administrator approval, provided they have a minimal level of C. elegans knowledge.

We know some things are broken. For example, some avatars may have been lost in the transition. Please let us know if you experience any odd behavior.

Restructured FTP site

We’ve recently completed a significant restructuring of our FTP site to better accommodate additional species and third-party data. We think the new structure is simple and intuitive.

Note: the FTP site is located at ftp.wormbase.org/pub/wormbase.

Some of the benefits of the new organization include:

  • Maintain a persistent archive of all releases:

    releases/

  • Download *all* the files for a given release:

    A specific release:
    releases/WS226/

    Current development release:
    releases/current-dev.wormbase.org-release/

    Current production release:
    releases/current-www.wormbase.org-release/

  • Explore by species:

    species/

  • Standardized file names and contents:

    All file names contain the species name, WormBase version, file content descriptor, and appropriate file suffix. This makes processing data downloaded from WormBase simple, and new files and species added to the database are easily discoverable. And if you are cherry-picking data, you won’t be overwhelmed with generic “genome.tgz, genome.tgz.1, genome.tgz.2” files.

    releases/WS226/species/c_elegans/c_elegans.WS226.annotations.gff2.gz

  • Current versions of files available at standardized URLs:

    species/gff/c_elegans.current.annotations.gff2.gz

  • Maintain third-party data seamlessly side-by-side with WormBase data:

    These data are available by species and abide by the same conventions mentioned above. You can process them identically to the data managed directly by WormBase.

    species/

  • Standard core files; additional files of commonly requested data:

    We’ve standardized the core files available for all WormBase-managed species to include the genomic sequence (plain, masked, and softmasked), annotations in GFF2 or GFF3, CDS and ncRNA transcripts, and conceptual translations. We’re also expanding the number of commonly requested datasets, found in the annotations/ directory of each species:

    e.g. species/c_elegans/annotation

    Please drop us a line if there is a file you’d like created for each WormBase release.
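
To illustrate how the standardized file names can simplify processing, here is a minimal Python sketch that builds and parses names of the form species.version.descriptor.suffix. The helper names (parse_file_name, release_file_path, release_file_url) and the regular expression are hypothetical, inferred only from the single example path shown above; they are not part of any WormBase software, and real file names may include cases this pattern does not cover.

```python
import re
from typing import NamedTuple

# FTP location as given in the note above.
FTP_ROOT = "ftp://ftp.wormbase.org/pub/wormbase"


class WormBaseFile(NamedTuple):
    species: str     # e.g. "c_elegans"
    release: str     # e.g. "WS226" or "current"
    descriptor: str  # e.g. "annotations"
    suffix: str      # e.g. "gff2.gz"


# Assumed pattern: <species>.<release>.<descriptor>.<suffix>
# This is a guess based on one example, not an official specification.
_NAME_RE = re.compile(
    r"^(?P<species>[a-z]+_[a-z0-9_]+)\."
    r"(?P<release>WS\d+|current)\."
    r"(?P<descriptor>[A-Za-z0-9_]+)\."
    r"(?P<suffix>.+)$"
)


def parse_file_name(name: str) -> WormBaseFile:
    """Split a standardized file name into its four components."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognized WormBase file name: {name!r}")
    return WormBaseFile(**match.groupdict())


def release_file_path(species: str, release: str,
                      descriptor: str, suffix: str) -> str:
    """Build the path of a file inside a specific release directory."""
    name = f"{species}.{release}.{descriptor}.{suffix}"
    return f"releases/{release}/species/{species}/{name}"


def release_file_url(species: str, release: str,
                     descriptor: str, suffix: str) -> str:
    """Prefix the release path with the FTP root."""
    return f"{FTP_ROOT}/{release_file_path(species, release, descriptor, suffix)}"


if __name__ == "__main__":
    info = parse_file_name("c_elegans.WS226.annotations.gff2.gz")
    print(info)
    print(release_file_url(*info))
```

Because every component is recoverable from the name alone, a script can group downloaded files by species or release without consulting a separate manifest.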

We hope these changes make it easier to find the data you are looking for.