annotate dartseq_seeduk.R @ 0:0da02ef4141a draft

Uploaded
author cropgeeks
date Mon, 16 Apr 2018 09:02:18 -0400
parents
children

> library("dartR")

# Read DArT data
> gl <- gl.read.dart(filename="/data/projects/seed/dart_calls/BBSRC-Panel-DArTSEQ-SNPs.csv", nas = "-", topskip = 5, lastmetric = "TotalPicRepSnpTest", probar = TRUE)
Trying to determine if one row or two row format...
Found 2 row(s) format. Proceed...
Added the following covmetrics:
AlleleID CloneID ClusterTempIndex AlleleSequence ClusterConsensusSequence ClusterSize AlleleSeqDist SNP SnpPosition CallRate OneRatioRef OneRatioSnp FreqHomRef FreqHomSnp FreqHets PICRef PICSnp AvgPIC AvgCountRef AvgCountSnp RatioAvgCountRefAvgCountSnp FreqHetsMinusFreqMinHom AlleleCountsCorrelation aggregateTagsTotal DerivedCorrMinusSeedCorr RepRef RepSNP RepAvg PicRepRef PicRepSNP TotalPicRepRefTest TotalPicRepSnpTest .
Number of rows per Clone. Should be only 2 s: 2
Recognised: 376 individuals and 113138 SNPs in a 2 row format using /data/projects/seed/dart_calls/BBSRC-Panel-DArTSEQ-SNPs.csv
Start conversion....
Format is 2 rows.
Please note conversion of bigger data sets will take some time!
Once finished, we recommend to save the object using save(object, file="object.rdata")
|======================================================================| 100%
>
> gl.report.callrate(gl)
Reporting for a genlight object
Note: Missing values most commonly arise from restriction site mutation.

Loci with no missing values = 499 [0.4%]
< 5% missing values = 23669 [20.9%]
< 10% missing values = 45298 [40%]
< 15% missing values = 60678 [53.6%]
< 20% missing values = 72478 [64.1%]
< 25% missing values = 81629 [72.1%]
< 30% missing values = 89227 [78.9%]
< 35% missing values = 95969 [84.8%]
< 40% missing values = 101973 [90.1%]
< 45% missing values = 107590 [95.1%]
< 50% missing values = 113138 [100%]
[1] "Completed"
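
The cumulative table printed by gl.report.callrate can be approximated in base R. The following is a minimal sketch, assuming genotypes are held in a plain matrix with individuals in rows, loci in columns, and NA for a missing call. The matrix `geno`, its values, and the thresholds are made up for illustration; this is not how dartR computes its report internally.

```r
# Toy genotype matrix (hypothetical data): rows = individuals, columns = loci,
# values are 0/1/2 allele counts and NA marks a missing call.
geno <- matrix(
  c(0, 1,  NA, 2,
    1, NA, NA, 0,
    2, 2,  NA, 1,
    0, 1,  1,  NA),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("ind", 1:4), paste0("loc", 1:4))
)

# Proportion of missing calls per locus (column-wise).
locus_missing <- colMeans(is.na(geno))

# Number of loci whose missingness falls strictly below each threshold,
# mirroring the "< x% missing values" rows of the report.
thresholds <- c(0.25, 0.50, 0.75, 1.00)
loci_below <- sapply(thresholds, function(t) sum(locus_missing < t))
names(loci_below) <- paste0("<", thresholds * 100, "%")

print(locus_missing)
print(loci_below)
```

Swapping colMeans for rowMeans gives the per-individual view that corresponds to method='ind'.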
> gl.report.callrate(gl,method='ind' )
Reporting for a genlight object
Note: Missing values most commonly arise from restriction site mutation.

Individuals no missing values = 0 [0%] across loci
Individuals with less than 5% missing values = 1 [0.3%]
Individuals with less than 10% missing values = 73 [19.4%]
Individuals with less than 15% missing values = 194 [51.6%]
Individuals with less than 20% missing values = 268 [71.3%]
Individuals with less than 25% missing values = 320 [85.1%]
Individuals with less than 30% missing values = 341 [90.7%]
Individuals with less than 35% missing values = 352 [93.6%]
Individuals with less than 40% missing values = 358 [95.2%]
Individuals with less than 45% missing values = 366 [97.3%]
Individuals with less than 50% missing values = 371 [98.7%]
Individuals with less than 55% missing values = 372 [98.9%]
Individuals with less than 60% missing values = 374 [99.5%]
Individuals with less than 65% missing values = 375 [99.7%]
[1] "Completed"

> gl_call_rate <- gl.filter.callrate(gl,method = 'loc', t=0.75)
Reporting for a genlight object
Note: Missing values most commonly arise from restriction site mutation.

Initial no. of loci = 113138
No. of loci deleted = 31509
Summary of filtered dataset
Call Rate > 0.75
No. of loci: 81629
No. of individuals: 376
No. of populations: 0

> gl_rep <- gl.filter.repavg(gl_call_rate,t=0.98)
Reporting for a genlight object
Note: RepAvg is a DArT statistic reporting reproducibility averaged across alleles for each locus.

Initial no. of loci = 81629
No. of loci deleted = 6446
Summary of filtered dataset
Reproducibility >= 0.98
No. of loci: 75183
No. of individuals: 376
No. of populations: 0

> gl.report.callrate(gl_rep,method='ind' )
Reporting for a genlight object
Note: Missing values most commonly arise from restriction site mutation.

Individuals no missing values = 0 [0%] across loci
Individuals with less than 5% missing values = 161 [42.8%]
Individuals with less than 10% missing values = 245 [65.2%]
Individuals with less than 15% missing values = 301 [80.1%]
Individuals with less than 20% missing values = 337 [89.6%]
Individuals with less than 25% missing values = 347 [92.3%]
Individuals with less than 30% missing values = 358 [95.2%]
Individuals with less than 35% missing values = 359 [95.5%]
Individuals with less than 40% missing values = 364 [96.8%]
Individuals with less than 45% missing values = 372 [98.9%]
Individuals with less than 50% missing values = 373 [99.2%]
Individuals with less than 55% missing values = 374 [99.5%]
Individuals with less than 60% missing values = 375 [99.7%]
[1] "Completed"

> gl_final <- gl.filter.callrate(gl_rep,method = 'ind', t=0.8)
Reporting for a genlight object
Note: Missing values most commonly arise from restriction site mutation.

Initial no. of individuals = 376
Filtering a genlight object
no. of individuals deleted = 39
Individuals retained = 337
List of individuals deleted because of low call rate
908017247001_E_5 908017247001_F_4 908017247002_A_10 908017247002_B_4 908017247002_B_5 908017247002_C_3 908017247002_D_12 908017247002_D_2 908017247002_D_6 908017247002_D_9 908017247002_E_6 908017247002_E_7 908017247002_E_9 908017247002_F_2 908017247002_F_6 908017247002_G_8 908017247002_H_10 908017247002_H_7 908017247002_H_8 908017247003_B_8 908017247003_C_8 908017247003_D_8 908017247003_E_8 908017247003_F_8 908017247003_G_6 908017247003_G_8 908017247003_H_7 908017247004_C_11 908017247004_D_11 908017247004_D_8 908017247004_D_9 908017247004_E_10 908017247004_E_11 908017247004_E_9 908017247004_F_11 908017247004_F_12 908017247004_F_6 908017247004_G_11 908017247004_H_11
from populations

Summary of filtered dataset
Call Rate > 0.8
No. of loci: 75183
No. of individuals: 337
No. of populations: 0

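
The session filters loci first and individuals afterwards, so per-individual call rates are recomputed over the surviving loci only. That ordering is why fewer individuals look poor after the locus filter. A minimal base-R sketch of this two-stage filter follows. The helper `filter_callrate`, the toy matrix, and the thresholds are hypothetical, and the inclusive `>=` boundary is an assumption rather than a statement about dartR's behaviour.

```r
# Hypothetical helper: keep loci (columns) or individuals (rows) whose
# call rate, i.e. the fraction of non-missing genotypes, meets a threshold.
filter_callrate <- function(geno, threshold, by = c("loc", "ind")) {
  by <- match.arg(by)
  if (by == "loc") {
    keep <- colMeans(!is.na(geno)) >= threshold
    geno[, keep, drop = FALSE]
  } else {
    keep <- rowMeans(!is.na(geno)) >= threshold
    geno[keep, , drop = FALSE]
  }
}

# Toy data (made up): 4 individuals x 5 loci, NA = missing call.
geno <- matrix(
  c(0,  1,  NA, 2,  1,
    1,  NA, NA, 0,  0,
    2,  2,  NA, 1,  NA,
    NA, NA, NA, NA, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("ind", 1:4), paste0("loc", 1:5))
)

# Stage 1: drop poorly called loci. Stage 2: drop poorly called individuals,
# judged only on the loci that survived stage 1.
by_locus <- filter_callrate(geno, threshold = 0.75, by = "loc")
filtered <- filter_callrate(by_locus, threshold = 0.60, by = "ind")

print(dim(filtered))
print(dimnames(filtered))
```

Running the two stages in the opposite order can remove individuals that would have passed once the worst loci were gone, so the order is a design choice worth stating alongside the thresholds.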
> gl2gds(gl_final,outfile="gl2gds.gds")
Converting gl object to gds formatted file gl2gds.gds

Structure of gds file

The file name: /data/projects/seed/dart_calls/gl2gds.gds
The total number of samples: 268
The total number of SNPs: 113138
SNP genotypes are stored in SNP-major mode (Sample X SNP).
The SNP positions are not in ascending order on chromosome 1.
File: /data/projects/seed/dart_calls/gl2gds.gds (32.8M)
+ [ ] *
|--+ sample.id { Str8 268 ZIP_ra(13.7%), 641B }
|--+ snp.id { Str8 113138 ZIP_ra(37.9%), 637.3K }
|--+ snp.rs.id { Int32 113138 ZIP_ra(78.4%), 346.6K }
|--+ snp.position { Float64 113138 ZIP_ra(14.9%), 131.5K }
|--+ snp.chromosome { Int32 113138 ZIP_ra(0.10%), 481B }
|--+ snp.allele { Str8 113138 ZIP_ra(14.4%), 63.6K }
|--+ genotype { Bit2 268x113138, 7.2M } *
\--+ loc.metrics [ data.frame ] *
   |--+ AlleleID { Int32,factor 113138 ZIP_ra(68.9%), 304.3K } *
   |--+ CloneID { Int32 113138 ZIP_ra(78.4%), 346.6K }
   |--+ ClusterTempIndex { Int32 113138 ZIP_ra(63.6%), 281.1K }
   |--+ AlleleSequence { Int32,factor 113138 ZIP_ra(68.9%), 304.4K } *
   |--+ ClusterConsensusSequence { Int32,factor 113138 ZIP_ra(66.2%), 292.5K } *
   |--+ ClusterSize { Int32 113138 ZIP_ra(7.27%), 32.1K }
   |--+ AlleleSeqDist { Int32 113138 ZIP_ra(8.49%), 37.5K }
   |--+ SNP { Int32,factor 113138 ZIP_ra(38.3%), 169.2K } *
   |--+ SnpPosition { Int32 113138 ZIP_ra(26.0%), 115.1K }
   |--+ CallRate { Float64 113138 ZIP_ra(2.84%), 25.1K }
   |--+ OneRatioRef { Float64 113138 ZIP_ra(32.7%), 289.2K }
   |--+ OneRatioSnp { Float64 113138 ZIP_ra(36.1%), 318.8K }
   |--+ FreqHomRef { Float64 113138 ZIP_ra(36.6%), 323.6K }
   |--+ FreqHomSnp { Float64 113138 ZIP_ra(32.6%), 288.4K }
   |--+ FreqHets { Float64 113138 ZIP_ra(20.0%), 177.2K }
   |--+ PICRef { Float64 113138 ZIP_ra(29.9%), 264.1K }
   |--+ PICSnp { Float64 113138 ZIP_ra(33.7%), 297.7K }
   |--+ AvgPIC { Float64 113138 ZIP_ra(44.0%), 388.6K }
   |--+ AvgCountRef { Float64 113138 ZIP_ra(55.3%), 489.1K }
   |--+ AvgCountSnp { Float64 113138 ZIP_ra(36.6%), 323.8K }
   |--+ RatioAvgCountRefAvgCountSnp { Float64 113138 ZIP_ra(57.6%), 509.2K }
   |--+ FreqHetsMinusFreqMinHom { Float64 113138 ZIP_ra(31.6%), 279.2K }
   |--+ AlleleCountsCorrelation { Float64 113138 ZIP_ra(48.2%), 425.8K }
   |--+ aggregateTagsTotal { Int32 113138 ZIP_ra(0.10%), 481B }
   |--+ DerivedCorrMinusSeedCorr { Int32 113138 ZIP_ra(0.10%), 478B }
   |--+ RepRef { Float64 113138 ZIP_ra(2.50%), 22.1K }
   |--+ RepSNP { Float64 113138 ZIP_ra(2.56%), 22.7K }
   |--+ RepAvg { Float64 113138 ZIP_ra(0.38%), 3.4K }
   |--+ PicRepRef { Float64 113138 ZIP_ra(3.02%), 26.7K }
   |--+ PicRepSNP { Float64 113138 ZIP_ra(3.59%), 31.7K }
   |--+ TotalPicRepRefTest { Int32 113138 ZIP_ra(9.95%), 44.0K }
   |--+ TotalPicRepSnpTest { Int32 113138 ZIP_ra(10.2%), 45.2K }
   |--+ clone { Int32,factor 113138 ZIP_ra(67.8%), 299.5K } *
   \--+ uid { Int32,factor 113138 ZIP_ra(68.9%), 304.3K } *
NULL

# Workaround to convert DArT format to 0-1-2 format
> library("SNPRelate")
> genofile <- snpgdsOpen("./gl2gds.gds")
> snpgdsGDS2BED(genofile, bed.fn="test", snp.id=snpset)
Error in .InitFile(gdsobj, sample.id = sample.id, snp.id = snp.id) :
object 'snpset' not found
> snpgdsGDS2BED(genofile, bed.fn="test")
Converting from GDS to PLINK binary PED:
Working space: 268 samples, 113138 SNPs
Output a BIM file.
Output a BED file ...
Fri Jan 5 17:30:03 2018 0%
Fri Jan 5 17:30:03 2018 100%
Done.
santosb@triticum:/data/projects/seed/dart_calls$ PLINK v1.90b4.9 64-bit (13 Oct 2017) www.cog-genomics.org/plink/1.9/
(C) 2005-2017 Shaun Purcell, Christopher Chang GNU General Public License v3
Note: --recodeA flag deprecated. Use 'recode A ...'.
Logging to plink.log.
Options in effect:
--bfile test
--recode A

3102532 MB RAM detected; reserving 1551266 MB for main workspace.
75183 variants loaded from .bim file.
337 people (0 males, 0 females, 337 ambiguous) loaded from .fam.
Ambiguous sex IDs written to plink.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 337 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.927072.
75183 variants and 337 people pass filters and QC.
Note: No phenotypes present.
--recode A to plink.raw ... done.

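
The plink.raw file written by --recode A is a whitespace-delimited table with one header row, six leading columns (FID, IID, PAT, MAT, SEX, PHENOTYPE), and one 0/1/2 allele-count column per variant, with NA for missing genotypes. Below is a minimal sketch of loading such a file with its header, so that sample and SNP names are kept and genotype columns stay numeric. The tiny file written here is fabricated purely so the example runs on its own; the real plink.raw from this session is not reproduced.

```r
# Write a tiny stand-in for plink.raw (hypothetical contents).
raw_path <- tempfile(fileext = ".raw")
writeLines(c(
  "FID IID PAT MAT SEX PHENOTYPE snp1_A snp2_G snp3_T",
  "s1 s1 0 0 0 -9 0 1 2",
  "s2 s2 0 0 0 -9 1 NA 0",
  "s3 s3 0 0 0 -9 2 2 1"
), raw_path)

# header = TRUE keeps the column names out of the data rows, so the
# genotype columns are parsed as numbers rather than text.
raw <- read.table(raw_path, header = TRUE, stringsAsFactors = FALSE)

# Drop the six leading pedigree/phenotype columns and label rows by sample ID.
geno <- as.matrix(raw[, -(1:6)])
rownames(geno) <- raw$IID
storage.mode(geno) <- "numeric"

print(geno)
```

With the header parsed this way, the genotype block can be selected by dropping the first six columns instead of by hard-coded row and column ranges.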
santosb@triticum:/data/projects/seed/dart_calls$ R
# PCO analysis
> library("amap")
> data <- read.table("plink.raw")
> distances <- Dist(data[2:338,7:75189], method = "euclidean", nbproc = 144)
> pco_results <- pco(distances,k=10)
# Variance explained by first three PCOs
> pco_results$eig[1]/sum(pco_results$eig[pco_results$eig>0])
[1] 0.2565937
> pco_results$eig[2]/sum(pco_results$eig[pco_results$eig>0])
[1] 0.06878127
> pco_results$eig[3]/sum(pco_results$eig[pco_results$eig>0])
[1] 0.04340111
> write.csv(pco_results$points,file="PCO_analysis_gl_final.csv")
> write.csv(data[,2],file="PCO_sample_names.csv")
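
The transcript does not show which package supplies pco(). As an illustration of the same idea, the sketch below runs a principal coordinates analysis with base R only, using dist() and cmdscale() from the stats package, and computes the share of variance per axis as each eigenvalue divided by the sum of the positive eigenvalues, as in the session above. The genotype matrix is invented, and this is a stand-in for the analysis rather than a reproduction of it.

```r
# Toy 0/1/2 genotype matrix (hypothetical data): 5 samples x 4 SNPs.
geno <- matrix(
  c(0, 0, 1, 2,
    0, 1, 1, 2,
    2, 2, 0, 0,
    2, 1, 0, 1,
    1, 1, 1, 1),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("s", 1:5), paste0("snp", 1:4))
)

# Euclidean distances between samples, then classical multidimensional
# scaling (principal coordinates analysis) keeping two axes.
d <- dist(geno, method = "euclidean")
pco_fit <- cmdscale(d, k = 2, eig = TRUE)

# Variance explained per axis: eigenvalue over the sum of positive eigenvalues.
pos_eig <- pco_fit$eig[pco_fit$eig > 0]
var_explained <- pco_fit$eig[1:2] / sum(pos_eig)

print(round(var_explained, 3))
print(pco_fit$points)
```

Restricting the denominator to positive eigenvalues matters mainly for non-Euclidean distances, where negative eigenvalues can appear; with Euclidean distances it only guards against tiny negative values from rounding.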