Mercurial > repos > recetox > recetox_aplcms_generate_feature_table
comparison help.xml @ 0:17de8e7ce3ce draft
planemo upload for repository https://github.com/RECETOX/galaxytools/tree/master/tools/recetox_aplcms commit 506df2aef355b3791567283e1a175914f06b405a
| author | recetox |
|---|---|
| date | Mon, 13 Feb 2023 10:24:23 +0000 |
| parents | |
| children | a1f2df6ec4fb |
| -1:000000000000 | 0:17de8e7ce3ce |
|---|---|
| 1 <macros> | |
| 2 | |
| 3 <token name="@GENERAL_HELP@"> | |
| 4 General Information | |
| 5 =================== | |
| 6 | |
| 7 Overview | |
| 8 -------- | |
| 9 | |
| 10 recetox-aplcms is a software package for peak detection in high resolution mass spectrometry (HRMS) data. | |
| 11 It supports reading .mzml files in raw profile mode and uses a bi-Gaussian chromatographic peak shape for feature detection and quantification. | |
| 12 | |
| 13 recetox-aplcms is based on the apLCMS package developed by Tianwei Yu at Emory University - see the citations and the apLCMS section below. | |
| 14 This version includes various software updates and is actively developed and maintained on `GitHub`_. | |
| 15 Please submit any bug reports as `issues`_ on the repository. | |
| 16 | |
| 17 .. _GitHub: https://github.com/RECETOX/recetox-aplcms | |
| 18 .. _issues: https://github.com/RECETOX/recetox-aplcms/issues/new | |
| 19 | |
| 20 | |
| 21 Workflow | |
| 22 -------- | |
| 23 | |
| 24 .. image:: https://raw.githubusercontent.com/RECETOX/galaxytools/aee0dd6cf6c05936269efe4337c50e27cc68e86b/tools/recetox_aplcms/images/scheme.png | |
| 25 :width: 2560 | |
| 26 :height: 788 | |
| 27 :scale: 40 | |
| 28 :alt: A picture of a workflow diagram. | |
| 29 | |
| 30 The individual steps of the recetox-aplcms package can be combined into two separate workflows that process HRMS data either in an unsupervised manner or by including a priori knowledge. | |
| 31 The workflows consist of the following building blocks: | |
| 32 | |
| 33 (1) remove noise - denoise the raw data and extract the EIC | |
| 34 (2) generate feature table - group features in the EICs into peaks using a peak-shape model | |
| 35 (3) compute clusters - compute mz and rt clusters across samples | |
| 36 (4) compute template - find the template for rt correction | |
| 37 (5) correct time - correct the rt across samples using splines | |
| 38 (6) align features - align identical features across samples | |
| 39 (7) recover weaker signals - recover missed features in samples based on the aligned features | |
| 40 (8) merge known table - add known features to detected features table and vice versa | |
| 41 | |
| 42 For detailed documentation on the individual steps please see the individual tool wrappers. | |
| 43 | |
| 44 | |
| 45 apLCMS (Original Reference) | |
| 46 --------------------------- | |
| 47 | |
| 48 apLCMS is a software package that generates a feature table from a batch of LC/MS spectra. The m/z and retention time | |
| 49 tolerance levels are estimated from the data. A run-filter is used to detect peaks and remove noise. | |
| 50 Non-parametric statistical methods are used to fine-tune peak selection and grouping. After retention time | |
| 51 correction, a feature table is generated by aligning peaks across spectra. For further information on apLCMS | |
| 52 please refer to https://mypage.cuhk.edu.cn/academics/yutianwei/apLCMS/. | |
| 53 </token> | |
| 54 | |
| 55 <token name="@REMOVE_NOISE_HELP@"> | |
| 56 recetox-aplcms - remove noise | |
| 57 ============================= | |
| 58 | |
| 59 This tool is the first step of recetox-aplcms. | |
| 60 It removes noise from the raw data and performs a first clustering step that groups points with close m/z values into extracted ion chromatograms (EICs). | |
| 61 Only peaks with a minimum elution length of `min_run` seconds are kept. | |
| 62 | |
| 63 Example Output | |
| 64 -------------- | |
| 65 The raw data points contained in the scans of the `mzml` file are filtered for noise and grouped into clusters based on m/z values. | |
| 66 See an example output in the table below. The `group_number` column indicates the cluster index. | |
| 67 | |
| 68 +----------------------+-------------------+-----------------------+--------------------+ | |
| 69 | mz | rt | intensity | group_number | | |
| 70 +======================+===================+=======================+====================+ | |
| 71 | 70.01060119055192 | 350.58654 | 21178.330810546875 | 5 | | |
| 72 +----------------------+-------------------+-----------------------+--------------------+ | |
| 73 | 70.02334120404554 | 130.175262 | 287869.5478515625 | 10 | | |
| 74 +----------------------+-------------------+-----------------------+--------------------+ | |
| 75 | 70.0287408273165 | 134.801352 | 60883.15185546875 | 11 | | |
| 76 +----------------------+-------------------+-----------------------+--------------------+ | |
| 77 | 70.02872416715464 | 183.991896 | 9201.574584960938 | 11 | | |
| 78 +----------------------+-------------------+-----------------------+--------------------+ | |
| 79 | ... | ... | ... | ... | | |
| 80 +----------------------+-------------------+-----------------------+--------------------+ | |
| 81 </token> | |
| 82 | |
| 83 <token name="@GENERATE_FEATURE_TABLE_HELP@"> | |
| 84 recetox-aplcms - generate feature table | |
| 85 ======================================= | |
| 86 This tool is the second step in the recetox-aplcms workflow and performs peak shape parameter estimation. | |
| 87 | |
| 88 It takes the grouped features created with `recetox-aplcms-remove-noise`, computes the peak shape in the `rt` domain, and integrates the peak area. | |
| 89 | |
| 90 | |
| 91 Example Output | |
| 92 -------------- | |
| 93 The output contains the `mz` and `rt` of the peaks as well as the standard deviations on the left (`sd1`) and right (`sd2`) side of the peak for the bi-Gaussian peak shape. | |
| 94 | |
| 95 +----------------------+-------------------+-----------------+-------------------+----------------------+ | |
| 96 | mz | rt | sd1 | sd2 | area | | |
| 97 +======================+===================+=================+===================+======================+ | |
| 98 | 70.02317542938793 | 142.36033 | 11.436659559 | 14.592754933 | 4159269.24595184 | | |
| 99 +----------------------+-------------------+-----------------+-------------------+----------------------+ | |
| 100 | 70.02869594233522 | 205.48765 | 0.263230763 | 0.285101428707 | 8849767.11861127 | | |
| 101 +----------------------+-------------------+-----------------+-------------------+----------------------+ | |
| 102 | 78.04643252598305 | 294.01713 | 0.51677558617 | 1.317028944141 | 1333044.50659719 | | |
| 103 +----------------------+-------------------+-----------------+-------------------+----------------------+ | |
| 104 | ... | ... | ... | ... | ... | | |
| 105 +----------------------+-------------------+-----------------+-------------------+----------------------+ | |
| 106 </token> | |
| 107 | |
| 108 <token name="@COMPUTE_CLUSTERS_HELP@"> | |
| 109 recetox-aplcms - compute clusters | |
| 110 ================================= | |
| 111 | |
| 112 Group features whose `mz` and `rt` values lie within the tolerances into clusters, creating larger features from raw data points. | |
| 113 Custom tolerances for `mz` and `rt` are computed based on the given parameters. | |
| 114 The tool takes a collection of all detected features and computes the clusters over a global feature table, adding the `sample_id` and `cluster` columns to the table. | |
| 115 | |
| 116 Example Output | |
| 117 -------------- | |
| 118 | |
| 119 +----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+ | |
| 120 | mz | rt | sd1 | sd2 | area | sample_id | cluster | | |
| 121 +======================+===================+=================+===================+======================+=====================+===============+ | |
| 122 | 70.02317542938793 | 142.36033 | 11.436659559 | 14.592754933 | 4159269.245951841 | 21_qc_no_dil_milliq | 7 | | |
| 123 +----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+ | |
| 124 | 70.02869594233522 | 205.48765 | 0.263230763 | 0.285101428707 | 8849767.11861127 | 21_qc_no_dil_milliq | 9 | | |
| 125 +----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+ | |
| 126 | 78.04643252598305 | 294.01713 | 0.51677558617 | 1.317028944141 | 1333044.506597194 | 21_qc_no_dil_milliq | 13 | | |
| 127 +----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+ | |
| 128 | ... | ... | ... | ... | ... | ... | ... | | |
| 129 +----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+ | |
| 130 </token> | |
| 131 | |
| 132 <token name="@CORRECT_TIME_HELP@"> | |
| 133 recetox-aplcms - correct time | |
| 134 ============================= | |
| 135 | |
| 136 Apply spline-based retention time correction to a feature table given the template table and the computed `mz` and `rt` tolerances. | |
| 137 | |
| 138 Example Output | |
| 139 -------------- | |
| 140 The output has the same format as `compute clusters` but the retention time values are corrected based on the template table. | |
| 141 | |
| 142 +----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+ | |
| 143 | mz | rt | sd1 | sd2 | area | sample_id | cluster | | |
| 144 +======================+===================+=================+===================+======================+=====================+===============+ | |
| 145 | 70.02317542938793 | 142.36033 | 11.436659559 | 14.592754933 | 4159269.245951841 | 21_qc_no_dil_milliq | 7 | | |
| 146 +----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+ | |
| 147 | 70.02869594233522 | 205.48765 | 0.263230763 | 0.285101428707 | 8849767.11861127 | 21_qc_no_dil_milliq | 9 | | |
| 148 +----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+ | |
| 149 | 78.04643252598305 | 294.01713 | 0.51677558617 | 1.317028944141 | 1333044.506597194 | 21_qc_no_dil_milliq | 13 | | |
| 150 +----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+ | |
| 151 | ... | ... | ... | ... | ... | ... | ... | | |
| 152 +----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+ | |
| 153 </token> | |
| 154 <token name="@COMPUTE_TEMPLATE_HELP@"> | |
| 155 recetox-aplcms - compute template | |
| 156 ================================= | |
| 157 Compute the template from a set of feature tables, choosing the one with the most features as the template. | |
| 158 </token> | |
| 159 | |
| 160 <token name="@RECOVER_WEAKER_SIGNALS_HELP@"> | |
| 161 recetox-aplcms - recover weaker signals | |
| 162 ======================================= | |
| 163 Second-stage peak detection based on the aligned feature table from the `align features` step. | |
| 164 If a feature is contained in the aligned feature table, this step revisits the raw data and searches | |
| 165 for this feature at the retention time obtained by mapping the corrected retention time back to the original sample. | |
| 166 | |
| 167 This recovers features which are present in a sample but might have been filtered out initially as noise due to low signal intensity. | |
| 168 | |
| 169 Example Output | |
| 170 -------------- | |
| 171 The table has the same format as the `compute clusters` output but might contain additional features which have been extracted based | |
| 172 on their presence in the aligned feature table. | |
| 173 | |
| 174 +----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+ | |
| 175 | mz | rt | sd1 | sd2 | area | sample_id | cluster | | |
| 176 +======================+===================+=================+===================+======================+=====================+===============+ | |
| 177 | 70.02317542938793 | 142.36033 | 11.436659559 | 14.592754933 | 4159269.245951841 | 21_qc_no_dil_milliq | 7 | | |
| 178 +----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+ | |
| 179 | 70.02869594233522 | 205.48765 | 0.263230763 | 0.285101428707 | 8849767.11861127 | 21_qc_no_dil_milliq | 9 | | |
| 180 +----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+ | |
| 181 | 78.04643252598305 | 294.01713 | 0.51677558617 | 1.317028944141 | 1333044.506597194 | 21_qc_no_dil_milliq | 13 | | |
| 182 +----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+ | |
| 183 | ... | ... | ... | ... | ... | ... | ... | | |
| 184 +----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+ | |
| 185 </token> | |
| 186 | |
| 187 <token name="@ALIGN_FEATURES_HELP@"> | |
| 188 recetox-aplcms - align features | |
| 189 =============================== | |
| 190 This step performs feature alignment after clustering and retention time correction. | |
| 191 The peaks clustered across samples are grouped based on the given tolerances to create an aligned feature table, connecting identical features across samples. | |
| 192 A parameter controls the minimum number of samples in which a feature must be detected in order to be included in the aligned feature table. | |
| 193 | |
| 194 Example Output | |
| 195 -------------- | |
| 196 The tool outputs three tables: the peak-related `metadata`, the `retention times` and the `intensities` for all features across all samples. | |
| 197 | |
| 198 Metadata Table | |
| 199 ~~~~~~~~~~~~~~ | |
| 200 The `npeaks` column denotes the number of peaks which have been grouped into this feature. The columns with the sample names indicate whether this feature is present in the sample. | |
| 201 | |
| 202 +-------+--------------+--------------+---------------+----------------+---------------+---------------+-----------+------------------------+------------------------+------------------------+ | |
| 203 | id | mz | mzmin | mzmax | rt | rtmin | rtmax | npeaks | 21_qc_no_dil_milliq | 29_qc_no_dil_milliq | 8_qc_no_dil_milliq | | |
| 204 +=======+==============+==============+===============+================+===============+===============+===========+========================+========================+========================+ | |
| 205 | 1 | 70.03707021 | 70.037066 | 70.0370750 | 294.1038014 | 294.0634942 | 294.149985 | 3 | 1 | 1 | 1 | | |
| 206 +-------+--------------+--------------+---------------+----------------+---------------+---------------+-----------+------------------------+------------------------+------------------------+ | |
| 207 | 2 | 70.06505677 | 70.065045 | 70.0650676 | 141.9560055 | 140.5762528 | 143.335758 | 2 | 1 | 0 | 1 | | |
| 208 +-------+--------------+--------------+---------------+----------------+---------------+---------------+-----------+------------------------+------------------------+------------------------+ | |
| 209 | 57 | 78.04643252 | 78.046429 | 78.0464325 | 294.0063397 | 293.9406777 | 294.072001 | 2 | 1 | 1 | 0 | | |
| 210 +-------+--------------+--------------+---------------+----------------+---------------+---------------+-----------+------------------------+------------------------+------------------------+ | |
| 211 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | | |
| 212 +-------+--------------+--------------+---------------+----------------+---------------+---------------+-----------+------------------------+------------------------+------------------------+ | |
| 213 | |
| 214 Intensity Table | |
| 215 ~~~~~~~~~~~~~~~ | |
| 216 This table contains the peak area for aligned features in all samples. | |
| 217 | |
| 218 +-------+------------------------+------------------------+------------------------+ | |
| 219 | id | 21_qc_no_dil_milliq | 29_qc_no_dil_milliq | 8_qc_no_dil_milliq | | |
| 220 +=======+========================+========================+========================+ | |
| 221 | 1 | 13187487.20482895 | 7957395.699119729 | 11700594.397257797 | | |
| 222 +-------+------------------------+------------------------+------------------------+ | |
| 223 | 2 | 2075168.6398983458 | 0 | 2574362.159289044 | | |
| 224 +-------+------------------------+------------------------+------------------------+ | |
| 225 | 57 | 2934524.4406785755 | 1333044.5065971944 | 0 | | |
| 226 +-------+------------------------+------------------------+------------------------+ | |
| 227 | ... | ... | ... | ... | | |
| 228 +-------+------------------------+------------------------+------------------------+ | |
| 229 | |
| 230 Retention Time Table | |
| 231 ~~~~~~~~~~~~~~~~~~~~ | |
| 232 This table contains the retention times for all aligned features in all samples. | |
| 233 | |
| 234 +-------+------------------------+------------------------+------------------------+ | |
| 235 | id | 21_qc_no_dil_milliq | 29_qc_no_dil_milliq | 8_qc_no_dil_milliq | | |
| 236 +=======+========================+========================+========================+ | |
| 237 | 1 | 294.09792478513236 | 294.1499853056912 | 294.0634942428341 | | |
| 238 +-------+------------------------+------------------------+------------------------+ | |
| 239 | 2 | 140.57625284242982 | 0 | 143.33575827589172 | | |
| 240 +-------+------------------------+------------------------+------------------------+ | |
| 241 | 57 | 294.07200187644435 | 293.9406777222317 | 0 | | |
| 242 +-------+------------------------+------------------------+------------------------+ | |
| 243 | ... | ... | ... | ... | | |
| 244 +-------+------------------------+------------------------+------------------------+ | |
| 245 </token> | |
| 246 | |
| 247 <token name="@MERGE_KNOWN_TABLES_HELP@"> | |
| 248 recetox-aplcms - merge known table | |
| 249 ================================== | |
| 250 | |
| 251 This tool allows merging the detected features back into the table of known features and vice versa. | |
| 252 It is used in the hybrid version of recetox-aplcms to augment the aligned feature table with the suspect peaks | |
| 253 and to augment the table of known features with successfully detected features. | |
| 254 </token> | |
| 255 </macros> |
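
The `sd1`, `sd2` and `area` columns in the generate feature table help text can be read through the bi-Gaussian peak shape mentioned in the overview. The following is a minimal sketch, assuming the common textbook form of a bi-Gaussian (one Gaussian half with `sd1` left of the apex, another with `sd2` right of it). The function names are hypothetical and not part of recetox-aplcms, and the package's actual estimation and integration may differ.

```python
import math


def bi_gaussian(t, rt, sd1, sd2, height):
    """Evaluate an assumed bi-Gaussian peak: sd1 applies left of the apex, sd2 right of it."""
    sd = sd1 if t < rt else sd2
    return height * math.exp(-((t - rt) ** 2) / (2.0 * sd ** 2))


def bi_gaussian_area(sd1, sd2, height):
    """Closed-form area: each half contributes half of a full Gaussian's area."""
    return height * math.sqrt(2.0 * math.pi) * (sd1 + sd2) / 2.0


def height_from_area(area, sd1, sd2):
    """Invert the closed-form area to get the apex height implied by a table row."""
    return area / (math.sqrt(2.0 * math.pi) * (sd1 + sd2) / 2.0)


def numeric_area(rt, sd1, sd2, height, n=20000):
    """Trapezoidal integration over +/- 8 standard deviations around the apex."""
    lo, hi = rt - 8.0 * sd1, rt + 8.0 * sd2
    step = (hi - lo) / n
    total = 0.5 * (bi_gaussian(lo, rt, sd1, sd2, height)
                   + bi_gaussian(hi, rt, sd1, sd2, height))
    for i in range(1, n):
        total += bi_gaussian(lo + i * step, rt, sd1, sd2, height)
    return total * step


# Values copied from the third row of the example feature table.
rt, sd1, sd2, area = 294.01713, 0.51677558617, 1.317028944141, 1333044.50659719
height = height_from_area(area, sd1, sd2)
```

Under this assumed model, a peak with `sd2` larger than `sd1` tails to the right of its apex, and numerically integrating the curve reproduces the tabulated `area`.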

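
The sample-count threshold described for the align features step can be illustrated with the presence columns of the example metadata table. This is a minimal sketch, assuming the presence flags have already been loaded into a plain dictionary. The function name `filter_by_min_samples` and the argument name `min_samples` are hypothetical and do not correspond to the tool's actual parameter name.

```python
def filter_by_min_samples(presence, min_samples):
    """Return the ids of features detected in at least `min_samples` samples.

    `presence` maps a feature id to a dict of sample name -> 0/1 flag,
    mirroring the sample columns of the example metadata table.
    """
    return [
        feature_id
        for feature_id, flags in presence.items()
        if sum(flags.values()) >= min_samples
    ]


# Presence flags copied from the example metadata table.
presence = {
    1: {"21_qc_no_dil_milliq": 1, "29_qc_no_dil_milliq": 1, "8_qc_no_dil_milliq": 1},
    2: {"21_qc_no_dil_milliq": 1, "29_qc_no_dil_milliq": 0, "8_qc_no_dil_milliq": 1},
    57: {"21_qc_no_dil_milliq": 1, "29_qc_no_dil_milliq": 1, "8_qc_no_dil_milliq": 0},
}

kept_strict = filter_by_min_samples(presence, 3)
kept_relaxed = filter_by_min_samples(presence, 2)
```

With the example rows, requiring all three samples keeps only feature 1, while requiring two samples keeps features 1, 2 and 57, which matches their `npeaks` values.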