annotate optimizer_command @ 2:6e4eb4856874 draft

Uploaded
author elixir-it
date Wed, 22 Jul 2020 19:20:30 +0000
The command to launch it looks something like this:

perl optimizer.pl -leQTL qfile -similarD sfile -disease dilated_ -lgenes DCM_genes -fileR ALL_DCM_Final.vcf.annovar_ANNOT_B.hg19_multianno.vcf -fileC test_unaffected.vcf -ofile test_RES_2

The parameters are very similar to those of the previous script, although some differ slightly. In particular, this script takes 2 input files and produces a single output.

This is the list of parameters:

%arguments=
(
"fileR"=>"",                  #file: vcf file of affected individuals
"fileC"=>"",                  #file: vcf file of unaffected individuals
"ofile"=>"",                  #name: name of the output files
"disease_clinvar"=>[4,6],     #numeric mandatory, range
"score_AF"=>[2,4],            #numeric mandatory, range
"score_functional"=>[4,6],    #numeric mandatory, range
"score_NS"=>[2,4],            #numeric mandatory, range
"score_nIND"=>[2,4],          #numeric mandatory, range
"scoreeQTL"=>[2,4],           #numeric mandatory, range
"scoreG"=>[3,5],              #numeric mandatory, range
"disease"=>"",                #name optional
"similarD"=>"",               #file optional
"lgenes"=>"",                 #file optional
"leQTL"=>"qfile",             #file mandatory, but default value
"keywords"=>"kfile",          #file mandatory, but default value
"effects"=>"efile",           #file mandatory, but default value
"AF"=>0.0001                  #numeric, mandatory, but default value
);

The other fundamental difference is that numeric parameters must now be given as a range. For example, -param 4:6 means that the parameter will take the values 4, 5 and 6. If this way of encoding the parameters is inconvenient, we can find another encoding.
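
To make the range notation concrete, here is a small shell sketch that expands a lo:hi specification into the integers it stands for. The helper names expand_range and count_combinations are hypothetical and are not part of optimizer.pl, which does its own parsing. The second helper assumes that every value of every range is combined with every value of the others, which the text above does not state explicitly.

```shell
#!/bin/sh
# Hypothetical helper (not part of optimizer.pl):
# expand "lo:hi" into the space-separated integers lo..hi.
expand_range() {
    lo=${1%%:*}   # text before the colon
    hi=${1##*:}   # text after the colon
    out=""
    i=$lo
    while [ "$i" -le "$hi" ]; do
        out="${out:+$out }$i"
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}

# Hypothetical helper: number of parameter combinations, ASSUMING every
# range is crossed with every other one (a full grid).
count_combinations() {
    total=1
    for spec in "$@"; do
        lo=${spec%%:*}
        hi=${spec##*:}
        total=$((total * (hi - lo + 1)))
    done
    printf '%s\n' "$total"
}

expand_range 4:6                                   # prints: 4 5 6
# The seven default ranges from the parameter list, written in colon form.
count_combinations 4:6 2:4 4:6 2:4 2:4 2:4 3:5     # prints: 2187
```

Under that full-grid assumption, seven ranges of three values each already give 3^7 = 2187 combinations, so widening the ranges quickly increases the amount of work.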

Finally, the script launches 2 other scripts: one is the previous script that I already sent you, "score_complete_alt.pl", and the other is an R script called "wilcox.R". The scripts are called with absolute paths, so the simplest thing would be to keep a copy of them in the folder from which the main script is run or, alternatively, to copy all 3 scripts into the folder that holds the data and run them from there.
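
The second option (copying all 3 scripts next to the data and running from there) could be sketched in shell as follows. The function name stage_scripts and its two directory arguments are hypothetical; only the three file names come from the text.

```shell
#!/bin/sh
# Hypothetical helper: copy the three scripts into the data folder.
#   $1 = folder that currently holds the scripts
#   $2 = folder that holds the data
stage_scripts() {
    for s in optimizer.pl score_complete_alt.pl wilcox.R; do
        if [ ! -f "$1/$s" ]; then
            echo "missing: $1/$s" >&2
            return 1
        fi
        cp "$1/$s" "$2/" || return 1
    done
}
```

A typical use would be to call stage_scripts with the two folders, change into the data folder with cd, and then launch perl optimizer.pl from there with the options shown earlier.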

You can find everything at this link: http://159.149.160.53/coso/optimizer/ The two vcf files are the two inputs (see above). test_RES_2 is an example of the output, which is always tabular.

#Jan 2020
New version
1 incorporates additional scores and annotations
2 uses genetic algorithms to find the optimal weights
3 the optimality criterion is weighted "better" across the p-value, the total number of positives and the total number of presumed false positives
command:
perl optimizer_genetic.pl -fileR remapped_Sorrentino_DEV.vcf.hg38_multianno.vcf -fileC remapped_TSI_DEV.vcf.hg38_multianno.vcf -ofile TEST1 -keywords kfile -effects efile -AF 0.0001 -lgenes DCM_genes -leQTL qfile -similarD sfile -disease _cardiomyopathy_#_dilated#_hypertrophic -nind 4 -AD T -XL F -score_AF 1:10 -score_functional 1:10 -score_NS 1:10 -score_nIND 1:10 -scoreeQTL 1:10 -scoreG 1:10 -scoreT 1:10 -scoreM 1:10 -scoreR 1:10 -scoreSP 1:10 -scoreGW 1:10
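
Because all eleven score options in this command take the same range, the repetitive tail of the command line can be generated instead of typed. The sketch below is only a convenience for building the command. The function name score_options is hypothetical, while the option names are copied from the command above.

```shell
#!/bin/sh
# Hypothetical helper: print "-<name> <range>" for each of the eleven
# score options that appear in the optimizer_genetic.pl command.
#   $1 = range to apply to every score option, e.g. 1:10
score_options() {
    range=$1
    opts=""
    for name in score_AF score_functional score_NS score_nIND scoreeQTL \
                scoreG scoreT scoreM scoreR scoreSP scoreGW; do
        opts="${opts:+$opts }-$name $range"
    done
    printf '%s\n' "$opts"
}

score_options 1:10
```

The output can be spliced into the command line with command substitution, for example by appending $(score_options 1:10) after the file and filter options.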

The new reference files are in the folder.
The script GENEO_VINYL.R is called by default, and for now it does not need to be a configurable parameter.
The genetic algorithms run 100 generations of 100 individuals.
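
For a rough sense of scale, the arithmetic below compares the genetic-algorithm budget with an exhaustive search. It assumes that each individual is scored once per generation and that the eleven score weights from the command each range over 1:10. Neither assumption is confirmed by the text, so treat the numbers as an upper-level estimate rather than a description of GENEO_VINYL.R.

```shell
#!/bin/sh
generations=100
individuals=100
n_weights=11           # score options in the command (assumed to be the tuned weights)
values_per_weight=10   # the range 1:10

# Candidate weight sets scored by the genetic algorithm (assumed one per individual).
evaluations=$((generations * individuals))

# Size of the full grid: values_per_weight raised to the power n_weights.
grid=1
i=0
while [ "$i" -lt "$n_weights" ]; do
    grid=$((grid * values_per_weight))
    i=$((i + 1))
done

echo "GA evaluations: $evaluations"   # prints: GA evaluations: 10000
echo "full grid size: $grid"          # prints: full grid size: 100000000000
```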