Mercurial > repos > devteam > ncbi_blast_plus
comparison tool-data/blastdb_p.loc.sample @ 1:5e9d5e536b79 draft
Uploaded v0.1.02 preview 2, clarify sample blastdb loc files, etc
| author | peterjc |
|---|---|
| date | Tue, 03 Mar 2015 05:32:18 -0500 |
| parents | 432ea9614cc9 |
| children | |
comparison
| 0:432ea9614cc9 | 1:5e9d5e536b79 |
|---|---|
| 1 #This is a sample file distributed with Galaxy that is used to define a | 1 # This is a sample file distributed with Galaxy that is used to define a |
| 2 #list of protein BLAST databases, using three columns tab separated | 2 # list of protein BLAST databases, using three columns tab separated: |
| 3 #(longer whitespace are TAB characters): | |
| 4 # | 3 # |
| 5 #<unique_id> <database_caption> <base_name_path> | 4 # <unique_id>{tab}<database_caption>{tab}<base_name_path> |
| 6 # | 5 # |
| 7 #The captions typically contain spaces and might end with the build date. | 6 # The captions typically contain spaces and might end with the build date. |
| 8 #It is important that the actual database name does not have a space in | 7 # It is important that the actual database name does not have a space in |
| 9 #it, and that there are only two tabs on each line. | 8 # it, and that there are only two tabs on each line. |
| 10 # | 9 # |
| 11 #So, for example, if your database is NR and the path to your base name | 10 # You can download the NCBI provided protein databases like NR from here: |
| 12 #is /data/blastdb/nr, then the blastdb_p.loc entry would look like this: | 11 # ftp://ftp.ncbi.nlm.nih.gov/blast/db/ |
| 13 # | 12 # |
| 14 #nr{tab}NCBI NR (non redundant){tab}/data/blastdb/nr | 13 # For simplicity, many Galaxy servers are configured to offer just a live |
| | 14 # version of each NCBI BLAST database (updated with the NCBI provided |
| | 15 # Perl scripts or similar). In this case, we recommend using the case |
| | 16 # sensitive base-name of the NCBI BLAST databases as the unique id. |
| | 17 # Consistent naming is important for sharing workflows between Galaxy |
| | 18 # servers. |
| 15 # | 19 # |
| 16 #and your /data/blastdb directory would contain all of the files associated | 20 # For example, consider the NCBI "non-redundant" protein BLAST database |
| 17 #with the database, /data/blastdb/nr.*. | 21 # where you have downloaded and decompressed the files under /data/blastdb/ |
| | 22 # meaning at the command line BLAST+ would be run with something like |
| | 23 # which would look at the files /data/blastdb/nr.p*: |
| 18 # | 24 # |
| 19 #Your blastdb_p.loc file should include an entry per line for each "base name" | 25 # $ blastp -db /data/blastdb/nr -query ... |
| 20 #you have stored. For example: | |
| 21 # | 26 # |
| 22 #nr_05Jun2010 NCBI NR (non redundant) 05 Jun 2010 /data/blastdb/05Jun2010/nr | 27 # In this case use nr (lower case to match the NCBI file naming) as the |
| 23 #nr_15Aug2010 NCBI NR (non redundant) 15 Aug 2010 /data/blastdb/15Aug2010/nr | 28 # unique id in the first column of blastdb_p.loc, giving an entry like |
| 24 #...etc... | 29 # this: |
| 25 # | 30 # |
| 26 #You can download the NCBI provided protein databases like NR from here: | 31 # nr{tab}NCBI non-redundant (nr){tab}/data/blastdb/nr |
| 27 #ftp://ftp.ncbi.nlm.nih.gov/blast/db/ | |
| 28 # | 32 # |
| 29 #See also blastdb.loc which is for any nucleotide BLAST database, and | 33 # Alternatively, rather than a "live" mirror of the NCBI databases which |
| 30 #blastdb_d.loc which is for any protein domains databases (like CDD). | 34 # are updated automatically, for full reproducibility the Galaxy Team |
| | 35 # recommend saving date-stamped copies of the databases. In this case |
| | 36 # your blastdb_p.loc file should include an entry per line for each |
| | 37 # version you have stored. For example: |
| | 38 # |
| | 39 # nr_05Jun2010{tab}NCBI NR (non redundant) 05 Jun 2010{tab}/data/blastdb/05Jun2010/nr |
| | 40 # nr_15Aug2010{tab}NCBI NR (non redundant) 15 Aug 2010{tab}/data/blastdb/15Aug2010/nr |
| | 41 # ...etc... |
| | 42 # |
| | 43 # See also blastdb.loc which is for any nucleotide BLAST database, and |
| | 44 # blastdb_d.loc which is for any protein domains databases (like CDD). |
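
The format rules described in the sample file (three tab-separated columns, exactly two tabs per line, no space in the database path) can be checked mechanically. The following is a minimal sketch only, not part of the tool or of Galaxy: the function name `parse_loc` is hypothetical, and rejecting duplicate ids is an assumption drawn from the column being called `unique_id`.

```python
def parse_loc(text):
    """Parse blastdb_p.loc-style text into (unique_id, caption, base_name_path) tuples.

    Hypothetical helper for illustration; not part of Galaxy or the BLAST+ wrappers.
    """
    entries = []
    seen_ids = set()
    for number, line in enumerate(text.splitlines(), start=1):
        # Skip blank lines and comment lines such as the sample file's own header.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Exactly two tabs per line means exactly three columns.
        if len(fields) != 3:
            raise ValueError(
                f"line {number}: expected 3 tab-separated columns, found {len(fields)}"
            )
        unique_id, caption, base_name_path = fields
        # Captions may contain spaces, but the database path must not.
        if " " in base_name_path:
            raise ValueError(f"line {number}: database path contains a space")
        # Assumption: ids must not repeat within one file.
        if unique_id in seen_ids:
            raise ValueError(f"line {number}: duplicate id {unique_id!r}")
        seen_ids.add(unique_id)
        entries.append((unique_id, caption, base_name_path))
    return entries


if __name__ == "__main__":
    sample = "# comment line\nnr\tNCBI non-redundant (nr)\t/data/blastdb/nr\n"
    for entry in parse_loc(sample):
        print(entry)
```

Running such a check before restarting the server would catch the most common mistake the comments warn about, namely spaces typed where tabs are required.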
