comparison ideas.xml @ 130:d088f25661d9 draft

Uploaded
author greg
date Mon, 11 Dec 2017 14:13:45 -0500
parents d064db60a06d
children 5150fcdcd0fa
comparison
129:d064db60a06d 130:d088f25661d9
163 <configfiles> 163 <configfiles>
164 <configfile name="gen_prep_input_config"><![CDATA[#if str($cell_type_epigenetic_factor_cond.cell_type_epigenetic_factor) == "extract": 164 <configfile name="gen_prep_input_config"><![CDATA[#if str($cell_type_epigenetic_factor_cond.cell_type_epigenetic_factor) == "extract":
165 #set input_name_positions = $cell_type_epigenetic_factor_cond.input_name_positions 165 #set input_name_positions = $cell_type_epigenetic_factor_cond.input_name_positions
166 #for $i in $cell_type_epigenetic_factor_cond.input: 166 #for $i in $cell_type_epigenetic_factor_cond.input:
167 #set file_name_with_ext = $i.name 167 #set file_name_with_ext = $i.name
168 #if str($file_name_with_ext).find("http") >= 0 or str($file_name_with_ext).find("ftp") >= 0:
169 #set file_name_with_ext = $file_name_with_ext.split('/')[-1]
170 #end if
168 #assert str($file_name_with_ext).find("-") >= 0, "The selected input '%s' is invalid because it does not include the '-' character which is required when setting cell type and epigenetic factor names by extracting them from the input file names." % $file_name_with_ext 171 #assert str($file_name_with_ext).find("-") >= 0, "The selected input '%s' is invalid because it does not include the '-' character which is required when setting cell type and epigenetic factor names by extracting them from the input file names." % $file_name_with_ext
169 #set file_name = $file_name_with_ext.split(".")[0] 172 #set file_name = $file_name_with_ext.split(".")[0]
170 #if str($input_name_positions) == "cell_first": 173 #if str($input_name_positions) == "cell_first":
171 #set cell_type_name = $file_name.split("-")[0] 174 #set cell_type_name = $file_name.split("-")[0]
172 #set epigenetic_factor_name = $file_name.split("-")[1] 175 #set epigenetic_factor_name = $file_name.split("-")[1]
354 357
355 * **Set cell type and epigenetic factor names by** - cell type and epigenetic factor names can be set manually or by extracting them from the names of the selected input datasets. The latter case requires all selected datasets to have names that contain a "-" character. 358 * **Set cell type and epigenetic factor names by** - cell type and epigenetic factor names can be set manually or by extracting them from the names of the selected input datasets. The latter case requires all selected datasets to have names that contain a "-" character.
356 359
357 * **BAM or BigWig files** - select one or more BAM or BigWig files from your history, making sure that the name of every selected input includes a "-" character (e.g., e001-h3k4me3.bigwig). 360 * **BAM or BigWig files** - select one or more BAM or BigWig files from your history, making sure that the name of every selected input includes a "-" character (e.g., e001-h3k4me3.bigwig).
358 * **Cell type, Epigenetic factor and Input** - manually select any number of inputs, setting the cell type and epigenetic factor name for each. The combination of "cell type name" and "epigenetic factor name" must be unique for each input. For example, if you have replicate data you may want to specify the cell name as "rep1", "rep2", etc and the factor name as "rep1", "rep2", etc. 361 * **Cell type, Epigenetic factor and Input** - manually select any number of inputs, setting the cell type and epigenetic factor name for each. The combination of "cell type name" and "epigenetic factor name" must be unique for each input. For example, if you have replicate data you may want to specify the cell name as "rep1", "rep2", etc and the factor name as "rep1", "rep2", etc.
359 362
360 * **Cell type name** - cell type name 363 * **Cell type name** - cell type name
361 * **Epigenetic factor name** - epigenetic factor name 364 * **Epigenetic factor name** - epigenetic factor name
362 * **BAM or BigWig file** - BAM or BigWig file 365 * **BAM or BigWig file** - BAM or BigWig file
363 366
364 * **Project name** - datasets produced by IDEAS will have this base name. 367 * **Project name** - datasets produced by IDEAS will have this base name.
389 * **Maximum number of cell type clusters allowed** - If you set the value to 1, then all cell types will be clustered in one group, which may be desirable if all cell types are homogeneous and you want IDEAS to use information in all cell types equally. 392 * **Maximum number of cell type clusters allowed** - If you set the value to 1, then all cell types will be clustered in one group, which may be desirable if all cell types are homogeneous and you want IDEAS to use information in all cell types equally.
390 * **Prior concentration** - specify the prior concentration parameter; default is A=sqrt(number of cell types). A smaller concentration parameter (e.g., 1 or less) places more emphasis on position specificity, and a larger concentration parameter (e.g., 10 * number of cell types) places more emphasis on global homogeneity. 393 * **Prior concentration** - specify the prior concentration parameter; default is A=sqrt(number of cell types). A smaller concentration parameter (e.g., 1 or less) places more emphasis on position specificity, and a larger concentration parameter (e.g., 10 * number of cell types) places more emphasis on global homogeneity.
391 * **Number of burnin steps** - specify the number of burnin steps; default is 20. Increasing the burnin and maximization steps will increase computation and only slightly increase accuracy, while decreasing them will reduce computing resources but may also reduce accuracy. We recommend running IDEAS with at least 20 burnin steps and 20 maximization steps. IDEAS will not stop even if it reaches a maximum mode. 394 * **Number of burnin steps** - specify the number of burnin steps; default is 20. Increasing the burnin and maximization steps will increase computation and only slightly increase accuracy, while decreasing them will reduce computing resources but may also reduce accuracy. We recommend running IDEAS with at least 20 burnin steps and 20 maximization steps. IDEAS will not stop even if it reaches a maximum mode.
392 * **Number of maximization steps** - specify the number of maximization steps; default is 20. 395 * **Number of maximization steps** - specify the number of maximization steps; default is 20.
393 * **Minimum standard deviation for the emission Gaussian distribution** - This number multiplied by the overall standard deviation of your data will be used as a lower bound for the standard deviation for each factor in each epigenetic state (the default is 0.5). This number is useful for removing very subtle clusters in the data. Setting this value near 0 will allow IDEAS to discover many subtle states, while setting it greater than 1 will result in IDEAS losing the ability to detect meaningful states. 396 * **Minimum standard deviation for the emission Gaussian distribution** - This number multiplied by the overall standard deviation of your data will be used as a lower bound for the standard deviation for each factor in each epigenetic state (the default is 0.5). This number is useful for removing very subtle clusters in the data. Setting this value near 0 will allow IDEAS to discover many subtle states, while setting it greater than 1 will result in IDEAS losing the ability to detect meaningful states.
394 * **Maximum standard deviation for the emission Gaussian distribution** - if you want to find fine-grained states you may use this option (if not used, IDEAS uses infinity), but it is rarely used unless you need more states to be inferred. 397 * **Maximum standard deviation for the emission Gaussian distribution** - if you want to find fine-grained states you may use this option (if not used, IDEAS uses infinity), but it is rarely used unless you need more states to be inferred.
395 398
396 </help> 399 </help>
397 <citations> 400 <citations>
398 <citation type="doi">10.1093/nar/gkw278</citation> 401 <citation type="doi">10.1093/nar/gkw278</citation>
399 </citations> 402 </citations>
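
The only functional change in this revision is the three inserted template lines (168-170 on the right-hand side), which reduce a dataset name that looks like a URL to its last path component before the name is split on "." and "-". The Python sketch below mirrors that logic outside of Cheetah so the effect is easier to see. The function name `extract_names` is hypothetical, and the branch for any ordering other than "cell_first" is an assumption, since that part of the template is not shown in the diff.

```python
def extract_names(dataset_name, input_name_positions="cell_first"):
    """Return (cell_type_name, epigenetic_factor_name) for a dataset name.

    Hypothetical helper that mirrors the Cheetah template lines in the diff.
    """
    file_name_with_ext = str(dataset_name)

    # Added in revision 130: a name containing "http" or "ftp" is treated
    # as a URL, and only the part after the last "/" is kept.
    if file_name_with_ext.find("http") >= 0 or file_name_with_ext.find("ftp") >= 0:
        file_name_with_ext = file_name_with_ext.split("/")[-1]

    # Same check as the template's #assert.
    if file_name_with_ext.find("-") < 0:
        raise ValueError(
            "The selected input '%s' is invalid because it does not "
            "include the '-' character." % file_name_with_ext
        )

    # Drop everything after the first "." and split the rest on "-".
    file_name = file_name_with_ext.split(".")[0]
    parts = file_name.split("-")

    if input_name_positions == "cell_first":
        return parts[0], parts[1]
    # Assumption: any other setting means the factor name comes first.
    return parts[1], parts[0]


if __name__ == "__main__":
    print(extract_names("e001-h3k4me3.bigwig"))
    print(extract_names("http://example.org/data/e001-h3k4me3.bigwig"))
```

Without the URL step, the second call would split "http://example.org/data/e001-h3k4me3.bigwig" on the first "." and keep only "http://example", which is presumably the case the change is meant to handle.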