diff pepquery2.xml @ 1:6b5ce9e2b0d0 draft
planemo upload for repository https://github.com/galaxyproteomics/tools-galaxyp/tree/master/tools/pepquery2 commit 3f50a508dbb9050be48de5685cec9a82683d8457
author | galaxyp |
---|---|
date | Sun, 02 Oct 2022 23:50:18 +0000 |
parents | 3c45645197f6 |
children | 3b2874c58bcd |
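Among other changes, this changeset replaces the hard-coded PepQueryDB `<option>` list on the `dataset` parameter with a regex validator. The sketch below exercises that pattern on its own, outside Galaxy, to show which inputs it accepts. The helper name `is_valid_dataset_spec` and the sample inputs are illustrative assumptions, as is the use of `re.match`; how Galaxy itself applies the expression is not shown in this diff.

```python
import re

# Pattern copied verbatim from the <validator type="regex"> element in the diff.
DATASET_SPEC = re.compile(r"^[a-zA-Z][^,]*(,[a-zA-Z][^,]*)*$")


def is_valid_dataset_spec(value: str) -> bool:
    """Return True if value is one or more comma-separated names,
    each starting with an ASCII letter.

    Hypothetical helper for illustration only; it is not part of the tool.
    """
    return DATASET_SPEC.match(value) is not None


if __name__ == "__main__":
    # Sample inputs are assumptions chosen to probe the pattern's edges.
    samples = [
        "CPTAC",
        "p",
        "CPTAC_LUAD_Discovery_Study_Proteome_PDC000153,CPTAC",
        "",
        "CPTAC,",
        ",CPTAC",
        "1abc",
        "a, b",
    ]
    for candidate in samples:
        print(repr(candidate), is_valid_dataset_spec(candidate))
```

Read this way, the pattern rejects an empty value, leading or trailing commas, and any name that does not start with a letter, so a space after a comma (`a, b`) also fails. It does not check that a name exists in PepQueryDB.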
--- a/pepquery2.xml Wed Sep 28 13:56:01 2022 +0000 +++ b/pepquery2.xml Sun Oct 02 23:50:18 2022 +0000 @@ -135,6 +135,9 @@ $fast -o pepquery_output | sed 's/No valid peptide/Error: No valid peptide/' | tee >(cat 1>&2) +#set $flist = str($outputs_selected).replace(',',' ') +&& for i in $flist; do for f in `find pepquery_output/*/* -name \$i`; do cat \$f >> pepquery_output/\${i}; done; done +&& for f in `find pepquery_output/*/ -name parameter.txt`; do cp \$f pepquery_output/parameter.txt; done ]]> </command> <inputs> @@ -253,61 +256,9 @@ </when> <when value="PepQueryDB"> <param name="dataset" argument="-b" type="text" value="" label="PepQueryDB dataset"> - <option value="all">all</option> - <option value="w">w:global proteome</option> - <option value="p">p:phosphorylation</option> - <option value="a">a:acetylation</option> - <option value="u">u:ubiquitination</option> - <option value="g">g:glycosylation</option> - <option value="Academia_Sinica_LUAD100_Phosphoproteome_PDC000220">Academia_Sinica_LUAD100_Phosphoproteome_PDC000220</option> - <option value="Academia_Sinica_LUAD100_Proteome_PDC000219">Academia_Sinica_LUAD100_Proteome_PDC000219</option> - <option value="CCLE_proteome_MSV000085836">CCLE_proteome_MSV000085836</option> - <option value="CPTAC">CPTAC</option> - <option value="CPTAC_CCRCC_Discovery_Study_Phosphoproteme_PDC000128">CPTAC_CCRCC_Discovery_Study_Phosphoproteme_PDC000128</option> - <option value="CPTAC_CCRCC_Discovery_Study_Proteome_PDC000127">CPTAC_CCRCC_Discovery_Study_Proteome_PDC000127</option> - <option value="CPTAC_GBM_Discovery_Study_Acetylome_PDC000245">CPTAC_GBM_Discovery_Study_Acetylome_PDC000245</option> - <option value="CPTAC_GBM_Discovery_Study_Phosphoproteome_PDC000205">CPTAC_GBM_Discovery_Study_Phosphoproteome_PDC000205</option> - <option value="CPTAC_GBM_Discovery_Study_Proteome_PDC000204">CPTAC_GBM_Discovery_Study_Proteome_PDC000204</option> - <option 
value="CPTAC_HNSCC_Discovery_Study_Phosphoproteome_PDC000222">CPTAC_HNSCC_Discovery_Study_Phosphoproteome_PDC000222</option> - <option value="CPTAC_HNSCC_Discovery_Study_Proteome_PDC000221">CPTAC_HNSCC_Discovery_Study_Proteome_PDC000221</option> - <option value="CPTAC_LSCC_Discovery_Study_Acetylome_PDC000233">CPTAC_LSCC_Discovery_Study_Acetylome_PDC000233</option> - <option value="CPTAC_LSCC_Discovery_Study_Phosphoproteome_PDC000232">CPTAC_LSCC_Discovery_Study_Phosphoproteome_PDC000232</option> - <option value="CPTAC_LSCC_Discovery_Study_Proteome_PDC000234">CPTAC_LSCC_Discovery_Study_Proteome_PDC000234</option> - <option value="CPTAC_LSCC_Discovery_Study_Ubiquitylome_PDC000237">CPTAC_LSCC_Discovery_Study_Ubiquitylome_PDC000237</option> - <option value="CPTAC_LUAD_Discovery_Study_Acetylome_PDC000224">CPTAC_LUAD_Discovery_Study_Acetylome_PDC000224</option> - <option value="CPTAC_LUAD_Discovery_Study_Phosphoproteome_PDC000149">CPTAC_LUAD_Discovery_Study_Phosphoproteome_PDC000149</option> - <option value="CPTAC_LUAD_Discovery_Study_Proteome_PDC000153">CPTAC_LUAD_Discovery_Study_Proteome_PDC000153</option> - <option value="CPTAC_PDA_Discovery_Study_Phosphoproteome_PDC000271">CPTAC_PDA_Discovery_Study_Phosphoproteome_PDC000271</option> - <option value="CPTAC_PDA_Discovery_Study_Proteome_PDC000270">CPTAC_PDA_Discovery_Study_Proteome_PDC000270</option> - <option value="CPTAC_Pediatric_Brain_Cancer_Pilot_Study_Phosphoproteome_PDC000176">CPTAC_Pediatric_Brain_Cancer_Pilot_Study_Phosphoproteome_PDC000176</option> - <option value="CPTAC_Pediatric_Brain_Cancer_Pilot_Study_Proteome_PDC000180">CPTAC_Pediatric_Brain_Cancer_Pilot_Study_Proteome_PDC000180</option> - <option value="CPTAC_Prospective_Breast_BI_Acetylome_PDC000239">CPTAC_Prospective_Breast_BI_Acetylome_PDC000239</option> - <option value="CPTAC_Prospective_Breast_BI_Phosphoproteome_PDC000121">CPTAC_Prospective_Breast_BI_Phosphoproteome_PDC000121</option> - <option 
value="CPTAC_Prospective_Breast_BI_Proteome_PDC000120">CPTAC_Prospective_Breast_BI_Proteome_PDC000120</option> - <option value="CPTAC_Prospective_Colon_PNNL_Phosphoproteome_PDC000117">CPTAC_Prospective_Colon_PNNL_Phosphoproteome_PDC000117</option> - <option value="CPTAC_Prospective_Colon_PNNL_Proteome_PDC000116">CPTAC_Prospective_Colon_PNNL_Proteome_PDC000116</option> - <option value="CPTAC_Prospective_Colon_VU_Proteome_PDC000109">CPTAC_Prospective_Colon_VU_Proteome_PDC000109</option> - <option value="CPTAC_Prospective_Ovarian_JHU_Glycoproteome_PDC000251">CPTAC_Prospective_Ovarian_JHU_Glycoproteome_PDC000251</option> - <option value="CPTAC_Prospective_Ovarian_JHU_Proteome_PDC000110">CPTAC_Prospective_Ovarian_JHU_Proteome_PDC000110</option> - <option value="CPTAC_Prospective_Ovarian_PNNL_Phosphoproteome_PDC000119">CPTAC_Prospective_Ovarian_PNNL_Phosphoproteome_PDC000119</option> - <option value="CPTAC_Prospective_Ovarian_PNNL_Proteome_Qeplus_PDC000118">CPTAC_Prospective_Ovarian_PNNL_Proteome_Qeplus_PDC000118</option> - <option value="CPTAC_TCGA_Breast_Cancer_Phosphoproteome_PDC000174">CPTAC_TCGA_Breast_Cancer_Phosphoproteome_PDC000174</option> - <option value="CPTAC_TCGA_Breast_Cancer_Proteome_PDC000173">CPTAC_TCGA_Breast_Cancer_Proteome_PDC000173</option> - <option value="CPTAC_TCGA_Colon_Cancer_Proteome_PDC000111">CPTAC_TCGA_Colon_Cancer_Proteome_PDC000111</option> - <option value="CPTAC_TCGA_Ovarian_Glycoproteome_PDC000112">CPTAC_TCGA_Ovarian_Glycoproteome_PDC000112</option> - <option value="CPTAC_TCGA_Ovarian_Phosphoproteome_PDC000115">CPTAC_TCGA_Ovarian_Phosphoproteome_PDC000115</option> - <option value="CPTAC_TCGA_Ovarian_Proteome_PDC000113_PDC000114">CPTAC_TCGA_Ovarian_Proteome_PDC000113_PDC000114</option> - <option value="CPTAC_UCEC_Discovery_Study_Acetylome_PDC000226">CPTAC_UCEC_Discovery_Study_Acetylome_PDC000226</option> - <option value="CPTAC_UCEC_Discovery_Study_Phosphoproteome_PDC000126">CPTAC_UCEC_Discovery_Study_Phosphoproteome_PDC000126</option> - 
<option value="CPTAC_UCEC_Discovery_Study_Proteome_PDC000125">CPTAC_UCEC_Discovery_Study_Proteome_PDC000125</option> - <option value="Deep_29_healthy_human_tissues_PXD010154">Deep_29_healthy_human_tissues_PXD010154</option> - <option value="GTEx_32_Tissues_Proteome_PXD016999">GTEx_32_Tissues_Proteome_PXD016999</option> - <option value="HBV_Related_Hepatocellular_Carcinoma_Phosphoproteome_PDC000199">HBV_Related_Hepatocellular_Carcinoma_Phosphoproteome_PDC000199</option> - <option value="HBV_Related_Hepatocellular_Carcinoma_Proteome_PDC000198">HBV_Related_Hepatocellular_Carcinoma_Proteome_PDC000198</option> - <option value="Oral_Squamous_Cell_Carcinoma_Study_Proteome_PDC000262">Oral_Squamous_Cell_Carcinoma_Study_Proteome_PDC000262</option> - <option value="Proteogenomics_of_Gastric_Cancer_Glycoproteome_PDC000216">Proteogenomics_of_Gastric_Cancer_Glycoproteome_PDC000216</option> - <option value="Proteogenomics_of_Gastric_Cancer_Phosphoproteome_PDC000215">Proteogenomics_of_Gastric_Cancer_Phosphoproteome_PDC000215</option> - <option value="Proteogenomics_of_Gastric_Cancer_Proteome_PDC000214">Proteogenomics_of_Gastric_Cancer_Proteome_PDC000214</option> + <help>PepQueryDB dataset IDs (separated by commas).</help> + <expand macro="pepquerydb_options" /> + <validator type="regex" message="PepQueryDB dataset_name(,dataset_name)">^[a-zA-Z][^,]*(,[a-zA-Z][^,]*)*$</validator> </param> </when> <when value="public"> @@ -323,31 +274,19 @@ </param> </section> - <param name="parameter_set" argument="-p" type="text" value="" optional="true" label="MS/MS searching parameter set name"> - <help>Currently supported set names start with: MS1 or TMT</help> - <option value="MS1_H_MS2_H_LF">MS1_H_MS2_H_LF</option> - <option value="MS1_H_MS2_L_LF">MS1_H_MS2_L_LF</option> - <option value="TMT10_11">TMT10_11</option> - <option value="TMT10_11_MS2_L">TMT10_11_MS2_L</option> - <option value="TMT10_11_MS2_L_phosphorylation">TMT10_11_MS2_L_phosphorylation</option> - <option 
value="TMT10_11_acetylation">TMT10_11_acetylation</option> - <option value="TMT10_11_glycosylation">TMT10_11_glycosylation</option> - <option value="TMT10_11_phosphorylation">TMT10_11_phosphorylation</option> - <option value="TMT10_11_ubiquitination">TMT10_11_ubiquitination</option> - </param> + <param name="parameter_set" argument="-p" type="text" value="" optional="true" label="MS/MS searching parameter set name"> + <help>Currently supported set names start with: MS1 or TMT</help> + <option value="MS1_H_MS2_H_LF">MS1_H_MS2_H_LF</option> + <option value="MS1_H_MS2_L_LF">MS1_H_MS2_L_LF</option> + <option value="TMT10_11">TMT10_11</option> + <option value="TMT10_11_MS2_L">TMT10_11_MS2_L</option> + <option value="TMT10_11_MS2_L_phosphorylation">TMT10_11_MS2_L_phosphorylation</option> + <option value="TMT10_11_acetylation">TMT10_11_acetylation</option> + <option value="TMT10_11_glycosylation">TMT10_11_glycosylation</option> + <option value="TMT10_11_phosphorylation">TMT10_11_phosphorylation</option> + <option value="TMT10_11_ubiquitination">TMT10_11_ubiquitination</option> + </param> -<!-- -TMT10_11_MS2_L_phosphorylation - Fixed modification: 1,11,12 = Carbamidomethylation of C,TMT 10-plex of K,TMT 10-plex of peptide N-term - Variable modification: 2,7,8,9 = Oxidation of M,Phosphorylation of S,Phosphorylation of T,Phosphorylation of Y - Enzyme: 1 - Max Missed cleavages: 1 - Precursor mass tolerance: 20.0 - Precursor ion mass tolerance unit: ppm - Fragment ion mass tolerance: 0.6 - Fragment ion mass tolerance unit: Da ---> - <section name="modifications" title="Modifications" expanded="false"> <param name="fixed_mod" argument="-fixMod" type="select" label="Fixed modification(s)" multiple="true" optional="true"> <help>default: 1: Carbamidomethylation of C [57.02146372057]</help> @@ -384,7 +323,7 @@ <option value="ppm" selected="true">ppm</option> <option value="Da">Da</option> </param> - <param name="tolerance" argument="-itol" type="float" value="0.6" optional="true" 
label="Tolerance" help="Error window for MS/MS fragment ion mass values in Da unit. Default: 0.6 Da" /> + <param name="tolerance" argument="-itol" type="float" value="" optional="true" label="Tolerance" help="Error window for MS/MS fragment ion mass values in Da unit. Default: 0.6 Da" /> </section> <section name="search" title="PSM" expanded="false"> @@ -398,7 +337,7 @@ </param> <param name="extra_score_validation" argument="-x" type="boolean" truevalue="-x" falsevalue="" checked="false" label="Add extra score validation" help="use two scoring algorithms for peptide identification" /> <param name="min_charge" argument="-minCharge" type="integer" value="" optional="true" label="Minimum Charge" help="The minimum charge to consider if the charge state is not available. Default: 2"/> - <param name="max_charge" argument="-maxCharge" type="integer" value="" optional="true" label="Maximum Charge" help="The maximum charge to consider if the charge state is not available. Default: 2" /> + <param name="max_charge" argument="-maxCharge" type="integer" value="" optional="true" label="Maximum Charge" help="The maximum charge to consider if the charge state is not available. Default: 3" /> <param name="min_peaks" argument="-minPeaks" type="integer" value="" optional="true" label="Minimum Peaks" help="Min peaks in spectrum. Default: 10" /> <param name="isotope_error" argument="-ti" type="text" value="" optional="true" label="Isotope peak error range"> <help>A comma-sepated range of integers from -2 to 2, e.g. 
'-1,0,1,2' Default: 0</help> @@ -412,7 +351,6 @@ </section> <param name="outputs_selected" type="select" multiple="true" optional="false" label="Select outputs"> - <option value="parameter.txt" selected="true">parameter.txt</option> <option value="psm.txt" selected="true">psm.txt</option> <option value="psm_rank.txt" selected="true">psm_rank.txt</option> <option value="psm_rank.mgf" selected="true">psm_rank.mgf</option> @@ -427,13 +365,11 @@ <param name="fast" argument="-fast" type="boolean" truevalue="-fast" falsevalue="" checked="false" label="Use fast mode for searching" help="In fast mode, only one better match from reference peptide-based competitive filtering steps will be returned. A peptide identified or not is not affected by this setting. For most applications, fast mode will speed up the analysis." /> </inputs> <outputs> + <data name="parameter_txt" format="txt" from_work_dir="pepquery_output/parameter.txt" label="${tool.name} on ${on_string}: parameter.txt"> + </data> <data name="ms_index" format="txt" label="${tool.name} on ${on_string}: index summary.txt" from_work_dir="index_dir/summary.txt"> <filter>'ms_index' in outputs_selected and req_inputs['ms_dataset']['ms_dataset_type'] == 'history'</filter> </data> - <data name="parameter_txt" format="txt" from_work_dir="pepquery_output/parameter.txt" label="${tool.name} on ${on_string}: parameter.txt"> - <filter>'parameter.txt' in outputs_selected</filter> - </data> - <data name="psm_txt" format="tabular" from_work_dir="pepquery_output/psm.txt" label="${tool.name} on ${on_string}: psm.txt"> <filter>'psm.txt' in outputs_selected</filter> <actions> @@ -492,9 +428,74 @@ </outputs> <tests> - <!-- Test-1 --> -<!-- + <!-- Test-1 PepQueryDB peptide gencode:human --> <test> + <conditional name="validation"> + <param name="task_type" value="novel"/> + </conditional> + <section name="req_inputs"> + <conditional name="input_type"> + <param name="input_type_selector" value="peptide"/> + <conditional name="multiple"> + 
<param name="peptide_input_selector" value="single" /> + <param name="input" value="LVVVGADGVGK"/> + </conditional> + </conditional> + <conditional name="db_type"> + <param name="db_type_selector" value="download" /> + <param name="db_id" value="gencode:human"/> + </conditional> + <conditional name="ms_dataset"> + <param name="ms_dataset_type" value="PepQueryDB"/> + <param name="dataset" value="CPTAC_LUAD_Discovery_Study_Proteome_PDC000153" /> + </conditional> + <param name="indexType" value="1"/> + </section> + <param name="parameter_set" value=""/> + <section name="modifications"> + <param name="fixed_mod" value="1"/> + <param name="var_mod" value="2"/> + <param name="max_mods" value="3"/> + <param name="unmodified" value="True"/> + <param name="aa" value="False"/> + </section> + <section name="digestion"> + <param name="enzyme" value="1"/> + <param name="max_missed_cleavages" value="2"/> + </section> + <section name="ms_params"> + <section name="tolerance_params"> + <param name="precursor_tolerance" value="10"/> + <param name="precursor_unit" value="ppm"/> + <param name="tolerance" value="0.6"/> + </section> + <section name="search"> + <param name="frag_method" value="1"/> + <param name="scoring_method" value="1"/> + <param name="extra_score_validation" value="False"/> + <param name="min_charge" value="2"/> + <param name="max_charge" value="3"/> + <param name="min_peaks" value="10"/> + <param name="isotope_error" value="0"/> + <param name="min_score" value="12"/> + <param name="min_length" value="7"/> + <param name="max_length" value="45"/> + <param name="num_random_peptides" value="1000"/> + </section> + </section> + <output name="psm_txt"> + <assert_contents> + <has_text text="LVVVGADGVGK" /> + <has_text text="02CPTAC_LUAD_W_BI_20180518_KR_f15:25149:2" /> + </assert_contents> + </output> + </test> + + <!-- Test-2 MGF peptide Uniprot.fasta --> + <test> + <conditional name="validation"> + <param name="task_type" value="novel"/> + </conditional> <section 
name="req_inputs"> <conditional name="input_type"> <param name="input_type_selector" value="peptide"/> @@ -503,170 +504,30 @@ <param name="input" value="ELGSSDLTAR"/> </conditional> </conditional> - <param name="db_file" ftype="fasta" value="Uniprot.fasta"/> - <param name="spectrum_file" ftype="mgf" value="iTRAQ_f4.mgf"/> - </section> - <section name="modifications"> - <param name="fixed_mod" value="6,103,157"/> - <param name="var_mod" value="117"/> - <param name="max_mods" value="3"/> - <param name="unmodified" value="False"/> - <param name="aa" value="True"/> + <conditional name="db_type"> + <param name="db_type_selector" value="history" /> + <param name="db_file" ftype="fasta" value="Uniprot.fasta"/> + </conditional> + <conditional name="ms_dataset"> + <param name="ms_dataset_type" value="history"/> + <param name="spectrum_files" ftype="mgf" value="iTRAQ_f4.mgf"/> + </conditional> + <param name="indexType" value="1"/> </section> - <section name="ms_params"> - <section name="tolerance_params"> - <param name="precursor_tolerance" value="10"/> - <param name="precursor_unit" value="ppm"/> - <param name="tolerance" value="0.6"/> - </section> - <section name="digestion"> - <param name="enzyme" value="0"/> - <param name="max_missed_cleavages" value="2"/> - </section> - <section name="search"> - <param name="frag_method" value="1"/> - <param name="scoring_method" value="1"/> - <param name="max_charge" value="3"/> - <param name="min_charge" value="2"/> - <param name="min_peaks" value="10"/> - <param name="min_score" value="12"/> - <param name="max_length" value="45"/> - <param name="num_random_peptides" value="1000"/> - </section> - </section> - <output name="psm_rank_txt"> - <assert_contents> - <has_text text="ELGSSDLTAR" /> - <has_line_matching expression="ELGSSDLTAR\tiTRAQ 4-plex of peptide N-term@0\S+\t2\tiTRAQ_f4.mgf\t2\t2\t1191.6\d+\t2.0\d+\t1191.62\d+\t596.81\d+\t18.68\d+\t0\t20\t5\t995\t0.006\d+\t1"/> - </assert_contents> - </output> - </test> ---> - - <!-- 
Test-2 --> -<!-- - <test> - <section name="req_inputs"> - <conditional name="input_type"> - <param name="input_type_selector" value="peptide"/> - <conditional name="multiple"> - <param name="peptide_input_selector" value="multiple" /> - <param name="input" ftype="tabular" value="novel_peptides"/> - </conditional> - </conditional> - <param name="db_file" ftype="fasta" value="Uniprot.fasta"/> - <param name="spectrum_file" ftype="mgf" value="iTRAQ_f4.mgf"/> - </section> + <param name="parameter_set" value=""/> <section name="modifications"> - <param name="fixed_mod" value="6,103,157"/> - <param name="var_mod" value="117"/> - <param name="max_mods" value="3"/> - <param name="unmodified" value="False"/> - <param name="aa" value="True"/> - </section> - <section name="ms_params"> - <section name="tolerance_params"> - <param name="precursor_tolerance" value="10"/> - <param name="precursor_unit" value="ppm"/> - <param name="tolerance" value="0.6"/> - </section> - <section name="digestion"> - <param name="enzyme" value="0"/> - <param name="max_missed_cleavages" value="2"/> - </section> - <section name="search"> - <param name="frag_method" value="1"/> - <param name="scoring_method" value="1"/> - <param name="max_charge" value="3"/> - <param name="min_charge" value="2"/> - <param name="min_peaks" value="10"/> - <param name="min_score" value="12"/> - <param name="max_length" value="45"/> - <param name="num_random_peptides" value="1000"/> - </section> - </section> - <output name="psm_rank_txt"> - <assert_contents> - <has_text text="ELGSSDLTAR" /> - <has_text text="SPYREFTDHLVK" /> - <has_line_matching expression="SPYREFTDHLVK\tiTRAQ 4-plex of K@12\S+;iTRAQ 4-plex of peptide N-term@0\S+\t1\tiTRAQ_f4.mgf\t4\t3\t1778.\d+\t3.02\d+\t1778.95\d+\t593.99\d+\t12.17\d+\t2\t14\t-1\t-1\t100.0\t1"/> - </assert_contents> - </output> - </test> ---> - - <!-- Test-3 --> -<!-- - <test> - <section name="req_inputs"> - <conditional name="input_type"> - <param name="input_type_selector" 
value="peptide"/> - <conditional name="multiple"> - <param name="peptide_input_selector" value="multiple" /> - <param name="input" ftype="tabular" value="novel_peptides"/> - </conditional> - </conditional> - <param name="db_file" ftype="fasta" value="Uniprot.fasta"/> - <param name="spectrum_file" ftype="mgf" value="iTRAQ_f4.mgf"/> - </section> - <section name="modifications"> - <param name="fixed_mod" value="6,103,157"/> - <param name="var_mod" value="117"/> + <!-- 21: iTRAQ 4-plex of K [144.1020624208] --> + <!-- 22: iTRAQ 4-plex of peptide N-term [144.1020624208] --> + <param name="fixed_mod" value="1,21,22"/> + <!-- 2: Oxidation of M [15.99491461956] --> + <param name="var_mod" value="2"/> <param name="max_mods" value="3"/> <param name="unmodified" value="True"/> - <param name="aa" value="True"/> - </section> - <section name="ms_params"> - <section name="tolerance_params"> - <param name="precursor_tolerance" value="10"/> - <param name="precursor_unit" value="ppm"/> - <param name="tolerance" value="0.6"/> - </section> - <section name="digestion"> - <param name="enzyme" value="0"/> - <param name="max_missed_cleavages" value="1"/> - </section> - <section name="search"> - <param name="frag_method" value="1"/> - <param name="scoring_method" value="1"/> - <param name="max_charge" value="3"/> - <param name="min_charge" value="2"/> - <param name="min_peaks" value="7"/> - <param name="min_score" value="10"/> - <param name="max_length" value="45"/> - <param name="num_random_peptides" value="1000"/> - </section> + <param name="aa" value="False"/> </section> - <output name="psm_rank_txt"> - <assert_contents> - <has_text text="ELGSSDLTAR" /> - <has_text text="SPYREFTDHLVK" /> - <has_line_matching expression="ELGSSDLTAR\tiTRAQ 4-plex of peptide N-term@0\S+\t2\t3\t2\t1191.6\d+\t-3.04\d+\t1191.62\d+\t596.8\d+\t24.18\d+\t0\t22\t1\t995\t0.002\d+\t1\t0\tYes" /> - </assert_contents> - </output> - </test> ---> - - <!-- Test-4 --> -<!-- - <test> - <section name="req_inputs"> - 
<conditional name="input_type"> - <param name="input_type_selector" value="1"/> - <conditional name="multiple"> - <param name="protein_input_selector" value="multiple" /> - <param name="input" ftype="fasta" value="novel_proteins.fa"/> - </conditional> - </conditional> - <param name="db_file" ftype="fasta" value="Uniprot.fasta"/> - <param name="spectrum_file" ftype="mgf" value="iTRAQ_f4.mgf"/> - </section> - <section name="modifications"> - <param name="fixed_mod" value="6,103,157"/> - <param name="var_mod" value="117"/> - <param name="max_mods" value="3"/> - <param name="unmodified" value="False"/> - <param name="aa" value="True"/> + <section name="digestion"> + <param name="enzyme" value="1"/> + <param name="max_missed_cleavages" value="2"/> </section> <section name="ms_params"> <section name="tolerance_params"> @@ -674,193 +535,36 @@ <param name="precursor_unit" value="ppm"/> <param name="tolerance" value="0.6"/> </section> - <section name="digestion"> - <param name="enzyme" value="0"/> - <param name="max_missed_cleavages" value="2"/> - </section> <section name="search"> <param name="frag_method" value="1"/> <param name="scoring_method" value="1"/> - <param name="max_charge" value="3"/> + <param name="extra_score_validation" value="False"/> <param name="min_charge" value="2"/> + <param name="max_charge" value="3"/> <param name="min_peaks" value="10"/> + <param name="isotope_error" value="0"/> <param name="min_score" value="12"/> + <param name="min_length" value="7"/> <param name="max_length" value="45"/> <param name="num_random_peptides" value="1000"/> </section> </section> - <output name="psm_rank_txt"> - <assert_contents> - <has_text text="ELGSSDLTAR" /> - <has_text text="SPYREFTDHLVK" /> - </assert_contents> - </output> - </test> ---> - - <!-- Test-5 --> -<!-- - <test> - <section name="req_inputs"> - <conditional name="input_type"> - <param name="input_type_selector" value="2"/> - <param name="input" 
value="gaactgggcagcagcgatctgaccgcgcgcagcccgtatcgcgaatttaccgatcatctggtgaaa"/> - </conditional> - <param name="db_file" ftype="fasta" value="Uniprot.fasta"/> - <param name="spectrum_file" ftype="mgf" value="iTRAQ_f4.mgf"/> - </section> - <section name="modifications"> - <param name="fixed_mod" value="6,103,157"/> - <param name="var_mod" value="117"/> - <param name="max_mods" value="3"/> - <param name="unmodified" value="False"/> - <param name="aa" value="True"/> - </section> - <section name="ms_params"> - <section name="tolerance_params"> - <param name="precursor_tolerance" value="10"/> - <param name="precursor_unit" value="ppm"/> - <param name="tolerance" value="0.6"/> - </section> - <section name="digestion"> - <param name="enzyme" value="0"/> - <param name="max_missed_cleavages" value="2"/> - </section> - <section name="search"> - <param name="frag_method" value="1"/> - <param name="scoring_method" value="1"/> - <param name="max_charge" value="3"/> - <param name="min_charge" value="2"/> - <param name="min_peaks" value="10"/> - <param name="min_score" value="12"/> - <param name="max_length" value="45"/> - <param name="num_random_peptides" value="1000"/> - </section> - </section> - <output name="psm_rank_txt"> + <output name="psm_txt"> <assert_contents> <has_text text="ELGSSDLTAR" /> </assert_contents> </output> - </test> ---> - - <!-- Test-6 --> -<!-- - <test> - <section name="req_inputs"> - <conditional name="input_type"> - <param name="input_type_selector" value="peptide"/> - <conditional name="multiple"> - <param name="peptide_input_selector" value="multiple" /> - <param name="input" ftype="tabular" value="novel_peptides"/> - </conditional> - </conditional> - <param name="db_file" ftype="fasta" value="Uniprot.fasta"/> - <param name="spectrum_file" ftype="mgf" value="immunopeptidomics.mgf"/> - <param name="indexType" value="2"/> - <conditional name="tags"> - <param name="tagType" value="PepQuery"/> - <param name="tag_file" ftype="tabular" value="test.tags"/> - 
<param name="tagIndexType" value="2"/> - </conditional> - </section> - <section name="modifications"> - <param name="fixed_mod" value=""/> - <param name="var_mod" value="117,114,118,128"/> - <param name="max_mods" value="3"/> - <param name="unmodified" value="True"/> - <param name="aa" value="True"/> - </section> - <section name="ms_params"> - <section name="tolerance_params"> - <param name="precursor_tolerance" value="20"/> - <param name="precursor_unit" value="ppm"/> - <param name="tolerance" value="0.02"/> - </section> - <section name="digestion"> - <param name="enzyme" value="0"/> - <param name="max_missed_cleavages" value="0"/> - </section> - <section name="search"> - <param name="frag_method" value="1"/> - <param name="scoring_method" value="1"/> - <param name="max_charge" value="3"/> - <param name="min_charge" value="2"/> - <param name="min_peaks" value="10"/> - <param name="min_score" value="12"/> - <param name="max_length" value="25"/> - <param name="num_random_peptides" value="1000"/> - </section> - </section> <output name="psm_rank_txt"> <assert_contents> - <has_text text="MTDRHAGTY" /> - <has_text text="controllerType=0 controllerNumber=1 scan=19905" /> + <has_text text="ELGSSDLTAR" /> + <has_line_matching expression="ELGSSDLTAR\tiTRAQ 4-plex of peptide N-term@0\[144.1\d+\]\t2\tiTRAQ_f4:3:2\t2\t1191.62\d+\t-3.04\d+\t-0.003\d+\t0.0\t1191.6\d+\t596.8\d+\t24.1\d+\t0\t0\t1\t995\t0.002\d+\t1\t0\tYes\t24.1\d+\t24.1\d+"/> </assert_contents> </output> </test> ---> - - <!-- Test-7 --> -<!-- - <test> - <section name="req_inputs"> - <conditional name="input_type"> - <param name="input_type_selector" value="peptide"/> - <conditional name="multiple"> - <param name="peptide_input_selector" value="multiple" /> - <param name="input" ftype="tabular" value="novel_peptides"/> - </conditional> - </conditional> - <param name="db_file" ftype="fasta" value="Uniprot.fasta"/> - <param name="spectrum_file" ftype="mgf" value="immunopeptidomics.mgf"/> - <param name="indexType" 
value="2"/> - <conditional name="tags"> - <param name="tagType" value="pFind"/> - <param name="qry_res" ftype="txt" value="pFind.qry.res"/> - </conditional> - </section> - <section name="modifications"> - <param name="fixed_mod" value=""/> - <param name="var_mod" value="117,114,118,128"/> - <param name="max_mods" value="3"/> - <param name="unmodified" value="True"/> - <param name="aa" value="True"/> - </section> - <section name="ms_params"> - <section name="tolerance_params"> - <param name="precursor_tolerance" value="20"/> - <param name="precursor_unit" value="ppm"/> - <param name="tolerance" value="0.02"/> - </section> - <section name="digestion"> - <param name="enzyme" value="0"/> - <param name="max_missed_cleavages" value="0"/> - </section> - <section name="search"> - <param name="frag_method" value="1"/> - <param name="scoring_method" value="1"/> - <param name="max_charge" value="3"/> - <param name="min_charge" value="2"/> - <param name="min_peaks" value="10"/> - <param name="min_score" value="12"/> - <param name="max_length" value="25"/> - <param name="num_random_peptides" value="1000"/> - </section> - </section> - <output name="psm_rank_txt"> - <assert_contents> - <has_text text="MTDRHAGTY" /> - <has_text text="controllerType=0 controllerNumber=1 scan=19905" /> - </assert_contents> - </output> - </test> ---> </tests> <help><![CDATA[ -**PepQuery** +**PepQuery2** PepQuery_ is a universal targeted peptide search engine for identifying or validating known and novel peptides of interest in any local or publicly available mass spectrometry-based proteomics datasets. 
@@ -872,23 +576,53 @@ **Inputs** - A sequence to match, one of the following: - - A peptide string or a history dataset with a list of peptides + - A peptide string (or strings separated by commas) + - A history dataset with a list of peptides - A protein string or a history dataset with a protein fasta - A DNA string that is at least 60 base pairs in length - - MS/MS data used for identification: + + - MS/MS data used for identification, one of the following: + + - Mass Spectrometry history datasets in MGF, mzML, or mzXML format + - An Indexed MS/MS dataset (from previous PepQuery2 run or from **PepQuery2 index** tool.) + - PepQueryDB dataset IDs + + .. - - A mass spectrometry history datasets in MGF, mzML, or mzXML format - - An Indexed MS/MS dataset - - PepQueryDB dataset IDs + Multiple datasets from PepQueryDB must be separated by comma. A pattern to match datasets in PepQueryDB is also supported, for example, use 'CPTAC' to search all datasets contain 'CPTAC'. In addition, dataset selection from PepQueryDB based on data type (w:global proteome, p:phosphorylation, a:acetylation, u:ubiquitination, g:glycosylation) is also supported. For example, use 'p' to search all phosphoproteomics datasets in PepQueryDB. The **PepQuery2 Show Sets** tool will list available PepQueryDB datasets. + + - Dataset IDs from public proteomics data repositories: PRIDE, MassIVE, jPOSTrepo and iProX + .. + + Dataset ID from public proteomics data repositories, one dataset is supported for each analysis. For example, use 'PXD000529' to use all MS/MS data from dataset PXD000529 or use 'PXD000529:LM3' to use data files containing LM3 from dataset PXD000529 + - A reference protein fasta database, novel peptides matching a reference sequence will be excluded. - A protein fasta file - The ID for a public reference protein database from RefSeq, GENCODE, Ensembl or UniProt. +**Options** + + - MS/MS searching parameter set name + + .. 
+ + Setting a *parameter set name* will change defaults for various options, These may be overridden by manually setting the option. + The **PepQuery2 Show Sets** tool *PepQuery Predefined Parameter Sets* will list those available along with the option values that will be set. + The **PepQuery2 Show Sets** tool *PepQuery Datasets* column *parameter_set* column for each PepQueryDB dataset. + + + - Override default options + + .. + + Values for modifications are provided in a select list. + The **PepQuery2 Show Sets** tool *PepQuery Modifications* lists all available modifications. + **Outputs** - Parameters: