comparison FastQC/rgFastQC.xml @ 0:42251cbdeeac draft
Initial commit of test for FastQC with installation of the java stuff
| author | fubar |
|---|---|
| date | Mon, 03 Jun 2013 20:30:24 -0400 |
| parents | |
| children | c13fa2748191 |
| -1:000000000000 | 0:42251cbdeeac |
|---|---|
| 1 <tool name="FastQC: Comprehensive QC" id="fastqc" version="0.53"> | |
| 2 <description>reporting for short read sequence</description> | |
| 3 <command interpreter="python"> | |
| 4 rgFastQC.py -i "$input_file" -d "$html_file.files_path" -o "$html_file" -n "$out_prefix" -f "$input_file.ext" -j "$input_file.name" -e "\$FASTQC_INSTALL_PATH/fastqc" | |
| 5 #if $contaminants.dataset and str($contaminants) > '' | |
| 6 -c "$contaminants" | |
| 7 #end if | |
| 8 </command> | |
| 9 <requirements> | |
| 10 <requirement type="set_environment">FASTQC_INSTALL_PATH</requirement> | |
| 11 <requirement type="package" version="0.10.1">FastQC</requirement> | |
| 12 </requirements> | |
| 13 <inputs> | |
| 14 <param format="fastqsanger,fastq,bam,sam" name="input_file" type="data" label="Short read data from your current history" /> | |
| 15 <param name="out_prefix" value="FastQC" type="text" label="Title for the output file - to remind you what the job was for" size="80" | |
| 16 help="Letters and numbers only please - other characters will be removed"> | |
| 17 <sanitizer invalid_char=""> | |
| 18 <valid initial="string.letters,string.digits"/> | |
| 19 </sanitizer> | |
| 20 </param> | |
| 21 <param name="contaminants" type="data" format="tabular" optional="true" label="Contaminant list" | |
| 22 help="Tab-delimited file with 2 columns: name and sequence. For example: Illumina Small RNA RT Primer CAAGCAGAAGACGGCATACGA"/> | |
| 23 </inputs> | |
| 24 <outputs> | |
| 25 <data format="html" name="html_file" label="${out_prefix}_${input_file.name}.html" /> | |
| 26 </outputs> | |
| 27 <tests> | |
| 28 <test> | |
| 29 <param name="input_file" value="1000gsample.fastq" /> | |
| 30 <param name="out_prefix" value="fastqc_out" /> | |
| 31 <param name="contaminants" value="fastqc_contaminants.txt" ftype="tabular" /> | |
| 32 <output name="html_file" file="fastqc_report.html" ftype="html" lines_diff="100"/> | |
| 33 </test> | |
| 34 </tests> | |
| 35 <help> | |
| 36 | |
| 37 .. class:: infomark | |
| 38 | |
| 39 **Purpose** | |
| 40 | |
| 41 FastQC aims to provide a simple way to do some quality control checks on raw | |
| 42 sequence data coming from high throughput sequencing pipelines. | |
| 43 It provides a modular set of analyses which you can use to give a quick | |
| 44 impression of whether your data has any problems of | |
| 45 which you should be aware before doing any further analysis. | |
| 46 | |
| 47 The main functions of FastQC are: | |
| 48 | |
| 49 - Import of data from BAM, SAM or FastQ files (any variant) | |
| 50 - Providing a quick overview to tell you in which areas there may be problems | |
| 51 - Summary graphs and tables to quickly assess your data | |
| 52 - Export of results to an HTML based permanent report | |
| 53 - Offline operation to allow automated generation of reports without running the interactive application | |
| 54 | |
| 55 | |
| 56 ----- | |
| 57 | |
| 58 | |
| 59 .. class:: infomark | |
| 60 | |
| 61 **FastQC** | |
| 62 | |
| 63 This is a Galaxy wrapper. It merely exposes the external package FastQC_, which is documented at FastQC_. | |
| 64 Kindly acknowledge it as well as this tool if you use it. | |
| 65 FastQC incorporates the Picard-tools_ libraries for sam/bam processing. | |
| 66 | |
| 67 The contaminants file parameter was borrowed from the independently developed | |
| 68 fastqcwrapper contributed to the Galaxy Community Tool Shed by J. Johnson. | |
| 69 | |
| 70 ----- | |
| 71 | |
| 72 .. class:: infomark | |
| 73 | |
| 74 **Inputs and outputs** | |
| 75 | |
| 76 FastQC_ is the best place to look for documentation - it's very good. | |
| 77 A summary follows below for those in a tearing hurry. | |
| 78 | |
| 79 This wrapper accepts a Galaxy fastq, sam or bam dataset as the input read file to check. | |
| 80 It also takes an optional contaminants list, in the form of | |
| 81 a tab-delimited file with 2 columns: name and sequence. | |
| 82 | |
| 83 The tool produces a single HTML output file that contains all of the results, including the following: | |
| 84 | |
| 85 - Basic Statistics | |
| 86 - Per base sequence quality | |
| 87 - Per sequence quality scores | |
| 88 - Per base sequence content | |
| 89 - Per base GC content | |
| 90 - Per sequence GC content | |
| 91 - Per base N content | |
| 92 - Sequence Length Distribution | |
| 93 - Sequence Duplication Levels | |
| 94 - Overrepresented sequences | |
| 95 - Kmer Content | |
| 96 | |
| 97 All except Basic Statistics and Overrepresented sequences are plots. | |
| 98 | |
| 99 .. _FastQC: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/ | |
| 100 .. _Picard-tools: http://picard.sourceforge.net/index.shtml | |
| 101 | |
| 102 </help> | |
| 103 </tool> | |
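
The tool definition above describes two input rules in prose: the `out_prefix` sanitizer keeps letters and digits only and removes everything else, and the contaminants file is tab-delimited with two columns, name and sequence. A minimal Python sketch of both rules follows. The helper names `sanitize_prefix` and `parse_contaminants` are hypothetical and are not part of `rgFastQC.py` or Galaxy. The sketch assumes ASCII letters as a stand-in for `string.letters`, and it assumes that blank lines, `#` comment lines and repeated tabs should be tolerated in the contaminants file.

```python
import string

# Hypothetical helpers for illustration only; not part of rgFastQC.py or Galaxy.
# Assumption: ASCII letters approximate the sanitizer's "string.letters".
ALLOWED = set(string.ascii_letters + string.digits)


def sanitize_prefix(title):
    """Remove every character that is not an ASCII letter or digit."""
    return "".join(ch for ch in title if ch in ALLOWED)


def parse_contaminants(text):
    """Parse 'name<TAB>sequence' lines into a list of (name, sequence) pairs."""
    pairs = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Assumption: blank lines and '#' comment lines are skipped.
        if not line.strip() or line.startswith("#"):
            continue
        # Assumption: runs of tabs between the two columns are tolerated.
        fields = [f.strip() for f in line.split("\t") if f.strip()]
        if len(fields) != 2:
            raise ValueError(
                "line %d: expected 2 tab-separated columns, got %d"
                % (lineno, len(fields))
            )
        pairs.append((fields[0], fields[1]))
    return pairs
```

For example, `sanitize_prefix("FastQC run #1")` gives `"FastQCrun1"`, because spaces and `#` are removed rather than replaced. The contaminant name may contain spaces (as in the help text's example, `Illumina Small RNA RT Primer`), which is why the sketch splits on tabs only.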
