Accessing the WormBase FTP site

You may have noticed that many browsers, including Chrome and Firefox, no longer support the File Transfer Protocol (FTP) for accessing files on remote servers. However, free programs such as FileZilla still support FTP. Please stay tuned: we are currently exploring ways to make the WormBase FTP site easier for our users to access.
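For readers who prefer scripted access over a graphical client, the following is a minimal Python sketch that lists a directory using the standard library's ftplib module. It assumes the server accepts anonymous logins, which is not confirmed here; the function names split_location and list_directory are hypothetical helpers introduced only for illustration.

```python
from ftplib import FTP


def split_location(location):
    """Split a 'host/some/path' string into (host, '/some/path')."""
    host, _, path = location.partition("/")
    return host, "/" + path


def list_directory(location, timeout=30):
    """Return the entry names found in an FTP directory.

    Assumption: the server permits anonymous login. Requires network access.
    """
    host, path = split_location(location)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()    # no arguments means an anonymous login
        ftp.cwd(path)  # change to the requested directory
        return ftp.nlst()
```

Given network access and a server that allows anonymous FTP, calling list_directory("ftp.wormbase.org/pub/wormbase") would return the names of the top-level entries of the site.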

Restructured FTP site

We’ve recently completed a significant restructuring of our FTP site to better accommodate additional species and third-party data. We think the new structure is simple and intuitive.

Note: the FTP site is located at ftp.wormbase.org/pub/wormbase. The paths shown below are relative to that location.

Some of the benefits of the new organization include:

  • Maintain a persistent archive of all releases:

    releases/

  • Download *all* the files for a given release:

    A specific release:
    releases/WS226/

    Current development release:
    releases/current-dev.wormbase.org-release/

    Current production release:
    releases/current-www.wormbase.org-release/

  • Explore by species:

    species/

  • Standardized file names and contents:

    All file names contain the species name, WormBase version, file content descriptor, and appropriate file suffix. This makes processing data downloaded from WormBase simple, and new files and species added to the database are easily discoverable. If you are cherry-picking data, you won’t be overwhelmed with generic “genome.tgz, genome.tgz.1, genome.tgz.2” files.

    releases/WS226/species/c_elegans/c_elegans.WS226.annotations.gff2.gz

  • Current versions of files available at standardized URLs:

    species/gff/c_elegans.current.annotations.gff2.gz

  • Maintain third-party data seamlessly side-by-side with WormBase data:

    These data are available by species and abide by the same conventions mentioned above. You can process them identically to the data managed directly by WormBase.

    species/

  • Standard core files; additional files of commonly requested data:

    We’ve standardized the core files available for all WormBase-managed species to include the genomic sequence (plain, masked, and softmasked), annotations in GFF2 or GFF3, CDS and ncRNA transcripts, and conceptual translations. We’re also expanding the number of commonly requested datasets, found in the annotation/ directory of each species:

    e.g., species/c_elegans/annotation

    Please drop us a line if there is a file you’d like created for each WormBase release.
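The file naming convention described in the list above lends itself to simple scripting. The following Python sketch builds and parses names of the form shown in the releases/WS226 example. The function names are hypothetical, and the sketch assumes that the species, version, and content descriptor never contain a period, so that everything after the third period is the file-type component (called "suffix" here).

```python
def release_file_path(species, release, descriptor, suffix):
    """Build the path of a file inside a specific release directory."""
    filename = f"{species}.{release}.{descriptor}.{suffix}"
    return f"releases/{release}/species/{species}/{filename}"


def parse_file_name(filename):
    """Split a standardized file name into its four components.

    Assumption: the first three components contain no '.' characters.
    """
    species, release, descriptor, suffix = filename.split(".", 3)
    return {
        "species": species,
        "release": release,
        "descriptor": descriptor,
        "suffix": suffix,
    }
```

For example, release_file_path("c_elegans", "WS226", "annotations", "gff2.gz") reproduces the path given above, and parse_file_name recovers the same four components from the file name.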

We hope these changes make it easier to find the data you are looking for.