# HG changeset patch
# User jdv
# Date 1534094490 14400
# Node ID 0cf41189f086fb55f9b5b0004a73307e541b198b
# Parent  f2081dc93880cb89b311daa18e81fcbc52c5efd5
planemo upload for repository https://github.com/jvolkening/galaxy-tools/tree/master/tools/nanopore_qc commit 0d8d1ec70b450f96a29a98e4dec9688b18170d32

diff -r f2081dc93880 -r 0cf41189f086 nanopore_qc.R
--- a/nanopore_qc.R	Mon Mar 12 19:55:54 2018 -0400
+++ b/nanopore_qc.R	Sun Aug 12 13:21:30 2018 -0400
@@ -56,11 +56,20 @@
     help="The cutoff value for the mean Q score of a read (default 7). Used to create separate plots for reads above and below this threshold"
     )
 
+parser <- add_option(parser,
+    opt_str = c("-d", "--discard_failed"),
+    type="logical",
+    default=FALSE,
+    dest = 'filt.failed',
+    help="Discard reads that failed Albacore filtering"
+    )
+
 opt = parse_args(parser)
 
-input.file = opt$input.file
-output.dir = opt$output.dir
-q = opt$q
+input.file  = opt$input.file
+output.dir  = opt$output.dir
+filt.failed = opt$filt.failed
+q           = opt$q
 
 # this is how we label the reads at least as good as q
 q_title = paste("Q>=", q, sep="")
@@ -123,6 +132,7 @@
     # by default the lowest value is -Inf, i.e. includes all reads. The
     # other value in min.q is set by the user at the command line
     d = read_tsv(filepath, col_types = cols_only(channel = 'i',
+        passes_filtering = 'c',
         num_events_template = 'i',
         sequence_length_template = 'i',
         mean_qscore_template = 'n',
@@ -146,6 +156,10 @@
 
     # ignore 0-length reads
     d <- d[d$sequence_length_template > 0,]
+    # ignore reads failing filtering
+    if (filt.failed) {
+        d <- d[d$passes_filtering == 'True',]
+    }
 
     d$events_per_base = d$num_events_template/d$sequence_length_template
 
@@ -173,7 +187,6 @@
     d = d[keep]
 
     d$start_bin = cut(d$start_time, 9,labels=c(1:9))
-    write.table(d,"foo.tsv",sep="\t",quote=F)
     return(d)
 }
 
diff -r f2081dc93880 -r 0cf41189f086 nanopore_qc.xml
--- a/nanopore_qc.xml	Mon Mar 12 19:55:54 2018 -0400
+++ b/nanopore_qc.xml	Sun Aug 12 13:21:30 2018 -0400
@@ -1,20 +1,20 @@
-
+
 Quality report for nanopore data
- r-base
- r-ggplot2
- r-plyr
- r-reshape2
- r-readr
- r-yaml
- r-scales
- r-futile.logger
- r-data.table
- r-optparse
- r-mgcv
- perl-yaml-libyaml
+ r-base
+ r-ggplot2
+ r-plyr
+ r-reshape2
+ r-readr
+ r-yaml
+ r-scales
+ r-futile.logger
+ r-data.table
+ r-optparse
+ r-mgcv
+ perl-yaml-libyaml
@@ -39,12 +39,14 @@
     -i '$input'
     -o '${html_file.files_path}'
     -q '$q_cutoff'
+    $discard_failed
     &&
     perl '${__tool_directory__}/yaml_to_html.pl'
     '${html_file.files_path}/summary.yaml'
+    '${html_file.files_path}'
     '$html_file'
     ]]>
@@ -53,21 +55,52 @@
+ + + + - + + + + + + + + - + + + + + + + + + + + + + + + + + + + + - - - - - - - - - -

NanoporeQC Report

-

Summary statistics

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
All reads
Total Yield (Gb)    0.0092773
Total Reads         9990
Mean Length         928.7
Median Length       941.0
Max Length          25740.0
Mean Q              11.7
Median Q            12.7
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Q>=10
Total Yield (Gb)    0.0079507
Total Reads         7952
Mean Length         999.8
Median Length       949.0
Max Length          6545.0
Mean Q              12.9
Median Q            13.1
-

QC plots

-

(Click on plot for hi-resolution version)

- -
- length_histogram -
Read length distribution
-
-
- -
- q_histogram -
Mean quality score distribution
-
-
- -
- reads_per_hour -
Yield over time
-
-
- -
- cumulative_yield -
Cumulative yield over time
-
-
- -
- yield_summary -
Yield by read length cutoff
-
-
- -
- flowcell_overview -
Median read quality per channel
-
-
- -
- length_by_hour -
Read length over time
-
-
- -
- q_by_hour -
Read quality over time
-
-
- -
- length_vs_q -
Read length vs. quality
-
-
- - -
diff -r f2081dc93880 -r 0cf41189f086 test-data/output.html.small.q6
--- a/test-data/output.html.small.q6	Mon Mar 12 19:55:54 2018 -0400
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,191 +0,0 @@
- - - - - - - - - -

NanoporeQC Report

-

Summary statistics

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
All reads
Total Yield (Gb)    0.0092773
Total Reads         9990
Mean Length         928.7
Median Length       941.0
Max Length          25740.0
Mean Q              11.7
Median Q            12.7
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Q>=6
Total Yield (Gb)    0.0090947
Total Reads         9254
Mean Length         982.8
Median Length       945.0
Max Length          6545.0
Mean Q              12.3
Median Q            12.9
-

QC plots

-

(Click on plot for hi-resolution version)

- -
- length_histogram -
Read length distribution
-
-
- -
- q_histogram -
Mean quality score distribution
-
-
- -
- reads_per_hour -
Yield over time
-
-
- -
- cumulative_yield -
Cumulative yield over time
-
-
- -
- yield_summary -
Yield by read length cutoff
-
-
- -
- flowcell_overview -
Median read quality per channel
-
-
- -
- length_by_hour -
Read length over time
-
-
- -
- q_by_hour -
Read quality over time
-
-
- -
- length_vs_q -
Read length vs. quality
-
-
- - -
diff -r f2081dc93880 -r 0cf41189f086 yaml_to_html.pl
--- a/yaml_to_html.pl	Mon Mar 12 19:55:54 2018 -0400
+++ b/yaml_to_html.pl	Sun Aug 12 13:21:30 2018 -0400
@@ -5,12 +5,13 @@
 use 5.012;
 
 use YAML::XS qw/LoadFile/;
+use MIME::Base64;
 use autodie;
 
-my ($fn_in, $fn_out) = @ARGV;
+my ($fn_yaml, $dir_in, $fn_out) = @ARGV;
 
 die "Can't find or read input file: $!\n"
-    if (! -r $fn_in);
+    if (! -r $fn_yaml);
 
 # set output filehandle based on arguments
 my $fh = \*STDOUT;
@@ -18,9 +19,9 @@
     open $fh, '>', $fn_out;
 }
 
-my $yaml = LoadFile($ARGV[0]);
+my $yaml = LoadFile($fn_yaml);
 
-convert($yaml);
+convert($yaml, $dir_in);
 
 sub convert {
 
@@ -99,16 +100,26 @@
     say {$fh} "QC plots";
-    say {$fh} "(Click on plot for hi-resolution version)";
+    say {$fh} "(Click on plot for high-resolution version, or in Chrome \"Open link in new tab\")";
     for my $base (@order) {
         my $caption = $figs{$base} // die "No caption found for $base";
+
+        # Base64-encode images
+        my $fn_img_full = "$dir_in/$base.png";
+        my $fn_img_screen = "$dir_in/$base.screen.png";
+        die "Failed to find or read $fn_img_full"
+            if (! -r $fn_img_full);
+        die "Failed to find or read $fn_img_screen"
+            if (! -r $fn_img_screen);
+        my $img_full = encode($fn_img_full);
+        my $img_screen = encode($fn_img_screen);
         print {$fh} <<"CONTENT"
-
+
- $base
+ $base
 $caption
@@ -120,7 +131,14 @@
 }
 
+sub encode {
+    my ($fn) = @_;
+    open my $in, '<:raw', $fn;
+    local($/) = undef;
+    return encode_base64(<$in>);
+
+}
 
 sub header {