# HG changeset patch
# User davidvanzessen
# Date 1473771305 14400
# Node ID afb0937ec0dc1ca50ef6254f1e4517c1db78193c
# Parent  ffd5462da9d1babf43aa4ce3c612dcc48fc114e5
Uploaded

diff -r ffd5462da9d1 -r afb0937ec0dc merge_and_filter.r
--- a/merge_and_filter.r	Mon Aug 29 04:20:30 2016 -0400
+++ b/merge_and_filter.r	Tue Sep 13 08:55:05 2016 -0400
@@ -195,7 +195,11 @@
 print(paste("Number of rows in result:", nrow(result)))
 print(paste("Number of rows in unmatched:", nrow(unmatched)))
-matched.sequences.count = sum(!grepl("^unmatched", result$best_match))
+matched.sequences = result[!grepl("^unmatched", result$best_match),]
+
+write.table(x=matched.sequences, file=gsub("merged.txt$", "filtered.txt", output), sep="\t",quote=F,row.names=F,col.names=T)
+
+matched.sequences.count = nrow(matched.sequences)
 unmatched.sequences.count = sum(grepl("^unmatched", result$best_match))
 filtering.steps = rbind(filtering.steps, c("Number of matched sequences", matched.sequences.count))
diff -r ffd5462da9d1 -r afb0937ec0dc new_imgt.r
--- a/new_imgt.r	Mon Aug 29 04:20:30 2016 -0400
+++ b/new_imgt.r	Tue Sep 13 08:55:05 2016 -0400
@@ -7,7 +7,9 @@
 merged = read.table(merged.file, header=T, sep="\t", fill=T, stringsAsFactors=F)
 if(gene != "-"){
- merged = merged[grepl(gene, merged$best_match),]
+ merged = merged[grepl(paste("^", gene, sep=""), merged$best_match),]
+} else {
+ merged = merged[!grepl("unmatched", merged$best_match),]
 }
 merged = merged[!grepl("unmatched", merged$best_match),]
diff -r ffd5462da9d1 -r afb0937ec0dc wrapper.sh
--- a/wrapper.sh	Mon Aug 29 04:20:30 2016 -0400
+++ b/wrapper.sh	Tue Sep 13 08:55:05 2016 -0400
@@ -372,7 +372,7 @@
 cd $outdir/change_o
-bash $dir/change_o/makedb.sh $input false false false $outdir/change_o/change-o-db.txt
+bash $dir/change_o/makedb.sh $outdir/new_IMGT.txz false false false $outdir/change_o/change-o-db.txt
 bash $dir/change_o/define_clones.sh bygroup $outdir/change_o/change-o-db.txt gene first ham none min complete 3.0 $outdir/change_o/change-o-db-defined_clones.txt $outdir/change_o/change-o-defined_clones-summary.txt
 Rscript $dir/merge.r $outdir/change_o/change-o-db-defined_clones.txt $outdir/merged.txt "all" "Sequence.ID,best_match" "SEQUENCE_ID" "Sequence.ID" $outdir/change_o/change-o-db-defined_clones.txt 2>&1
@@ -453,49 +453,50 @@
 echo "" >> $output
 echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
-echo "" >> $output
+echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
-echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
+echo "" >> $output
 echo "
info | link
The complete dataset | Download
The SHM Overview table as a dataset | Download
The data used to generate the first SHM Overview plot | Download
The data used to generate the sexond SHM Overview plot | Download
The data used to generate the third SHM Overview plot | Download
The alignment info on the unmatched sequences | Download
The complete dataset | Download
The filtered dataset | Download
The SHM Overview table as a dataset | Download
The data used to generate the first SHM Overview plot | Download
The data used to generate the second SHM Overview plot | Download
The data used to generate the third SHM Overview plot | Download
The alignment info on the unmatched sequences | Download
Motif data per sequence ID | Download
Mutation data per sequence ID | Download
AA mutation data per sequence ID | Download
Absent AA location data per sequence ID | Download
CDR1+FR2+CDR2+FR3+CDR3 sequences that show up more than once | Download
Motif data per sequence ID | Download
Mutation data per sequence ID | Download
AA mutation data per sequence ID | Download
Absent AA location data per sequence ID | Download
CDR1+FR2+CDR2+FR3+CDR3 sequences that show up more than once | View
Base count for every sequence | Download
Base count for every sequence | View
Baseline PDF (http://selection.med.yale.edu/baseline/) | Download
Baseline data | Download
Baseline ca PDF | Download
Baseline ca data | Download
Baseline cg PDF | Download
Baseline cg data | Download
Baseline cm PDF | Download
Baseline cm data | Download
Baseline PDF (http://selection.med.yale.edu/baseline/) | Download
Baseline data | Download
Baseline ca PDF | Download
Baseline ca data | Download
Baseline cg PDF | Download
Baseline cg data | Download
Baseline cm PDF | Download
Baseline cm data | Download
An IMGT archive with just the matched and filtered sequences | Download
An IMGT archive with just the matched and filtered ca sequences | Download
An IMGT archive with just the matched and filtered ca1 sequences | Download
An IMGT archive with just the matched and filtered ca2 sequences | Download
An IMGT archive with just the matched and filtered cg sequences | Download
An IMGT archive with just the matched and filtered cg1 sequences | Download
An IMGT archive with just the matched and filtered cg2 sequences | Download
An IMGT archive with just the matched and filtered cg3 sequences | Download
An IMGT archive with just the matched and filtered cg4 sequences | Download
An IMGT archive with just the matched and filtered cm sequences | Download
An IMGT archive with just the matched and filtered sequences | Download
An IMGT archive with just the matched and filtered ca sequences | Download
An IMGT archive with just the matched and filtered ca1 sequences | Download
An IMGT archive with just the matched and filtered ca2 sequences | Download
An IMGT archive with just the matched and filtered cg sequences | Download
An IMGT archive with just the matched and filtered cg1 sequences | Download
An IMGT archive with just the matched and filtered cg2 sequences | Download
An IMGT archive with just the matched and filtered cg3 sequences | Download
An IMGT archive with just the matched and filtered cg4 sequences | Download
An IMGT archive with just the matched and filtered cm sequences | Download
The Change-O DB file with defined clones and subclass annotation | Download
The Change-O DB defined clones summary file | Download
The Change-O DB file with defined clones of ca | Download
The Change-O DB defined clones summary file of ca | Download
The Change-O DB file with defined clones of cg | Download
The Change-O DB defined clones summary file of cg | Download
The Change-O DB file with defined clones of cm | Download
The Change-O DB defined clones summary file of cm | Download
The Change-O DB file with defined clones and subclass annotation | Download
The Change-O DB defined clones summary file | Download
The Change-O DB file with defined clones of ca | Download
The Change-O DB defined clones summary file of ca | Download
The Change-O DB file with defined clones of cg | Download
The Change-O DB defined clones summary file of cg | Download
The Change-O DB file with defined clones of cm | Download
The Change-O DB defined clones summary file of cm | Download
" >> $output