comparison plink.xml @ 3:4c3690a9d729 draft default tip

Fix help text formatting
author blankenberg
date Tue, 19 Nov 2019 21:35:42 +0000
parents ed946e888494
2:ed946e888494 3:4c3690a9d729
10409 <data name="OUTPUT_plink_var_ranges" format="plink.var.ranges" label="${tool.name} on ${on_string}: plink.var.ranges" from_work_dir="plink.var.ranges" hidden="True"/> 10409 <data name="OUTPUT_plink_var_ranges" format="plink.var.ranges" label="${tool.name} on ${on_string}: plink.var.ranges" from_work_dir="plink.var.ranges" hidden="True"/>
10410 <data name="OUTPUT_plink_vcf" format="vcf" label="${tool.name} on ${on_string}: plink.vcf" from_work_dir="plink.vcf" hidden="True"/> 10410 <data name="OUTPUT_plink_vcf" format="vcf" label="${tool.name} on ${on_string}: plink.vcf" from_work_dir="plink.vcf" hidden="True"/>
10411 <data name="OUTPUT_plink_log" format="plink.log" label="${tool.name} on ${on_string}: plink.log" from_work_dir="plink.log" hidden="False"/> 10411 <data name="OUTPUT_plink_log" format="plink.log" label="${tool.name} on ${on_string}: plink.log" from_work_dir="plink.log" hidden="False"/>
10412 </outputs> 10412 </outputs>
10413 <help><![CDATA[ 10413 <help><![CDATA[
10414 10414 ::
10415 PLINK v1.90b4 64-bit (20 Mar 2017) www.cog-genomics.org/plink/1.9/ 10415
10416 (C) 2005-2017 Shaun Purcell, Christopher Chang GNU General Public License v3 10416
10417 10417 PLINK v1.90b4 64-bit (20 Mar 2017) www.cog-genomics.org/plink/1.9/
10418 In the command line flag definitions that follow, 10418 (C) 2005-2017 Shaun Purcell, Christopher Chang GNU General Public License v3
10419 * [square brackets] denote a required parameter, where the text between the 10419
10420 brackets describes its nature. 10420 In the command line flag definitions that follow,
10421 * <angle brackets> denote an optional modifier (or if '|' is present, a set 10421 * [square brackets] denote a required parameter, where the text between the
10422 of mutually exclusive optional modifiers). Use the EXACT text in the 10422 brackets describes its nature.
10423 definition, e.g. '--dummy acgt'. 10423 * <angle brackets> denote an optional modifier (or if '|' is present, a set
10424 * There's one exception to the angle brackets/exact text rule: when an angle 10424 of mutually exclusive optional modifiers). Use the EXACT text in the
10425 bracket term ends with '=[value]', '[value]' designates a variable 10425 definition, e.g. '--dummy acgt'.
10426 parameter. 10426 * There's one exception to the angle brackets/exact text rule: when an angle
10427 * {curly braces} denote an optional parameter, where the text between the 10427 bracket term ends with '=[value]', '[value]' designates a variable
10428 braces describes its nature. 10428 parameter.
10429 * An ellipsis (...) indicates that you may enter multiple parameters of the 10429 * {curly braces} denote an optional parameter, where the text between the
10430 specified type. 10430 braces describes its nature.
10431 10431 * An ellipsis (...) indicates that you may enter multiple parameters of the
10432 plink [input flag(s)...] {command flag(s)...} {other flag(s)...} 10432 specified type.
10433 plink --help {flag name(s)...} 10433
10434 10434 plink [input flag(s)...] {command flag(s)...} {other flag(s)...}
10435 Most PLINK runs require exactly one main input fileset. The following flags 10435 plink --help {flag name(s)...}
10436 are available for defining its form and location: 10436
10437 10437 Most PLINK runs require exactly one main input fileset. The following flags
10438 --bfile {prefix} : Specify .bed + .bim + .fam prefix (default 'plink'). 10438 are available for defining its form and location:
10439 --bed [filename] : Specify full name of .bed file. 10439
10440 --bim [filename] : Specify full name of .bim file. 10440 --bfile {prefix} : Specify .bed + .bim + .fam prefix (default 'plink').
10441 --fam [filename] : Specify full name of .fam file. 10441 --bed [filename] : Specify full name of .bed file.
10442 10442 --bim [filename] : Specify full name of .bim file.
10443 --keep-autoconv : With --file/--tfile/--lfile/--vcf/--bcf/--data/--23file, 10443 --fam [filename] : Specify full name of .fam file.
10444 don't delete autogenerated binary fileset at end of run. 10444
10445 10445 --keep-autoconv : With --file/--tfile/--lfile/--vcf/--bcf/--data/--23file,
10446 --file {prefix} : Specify .ped + .map filename prefix (default 'plink'). 10446 don't delete autogenerated binary fileset at end of run.
10447 --ped [filename] : Specify full name of .ped file. 10447
10448 --map [filename] : Specify full name of .map file. 10448 --file {prefix} : Specify .ped + .map filename prefix (default 'plink').
10449 10449 --ped [filename] : Specify full name of .ped file.
10450 --no-fid : .fam/.ped file does not contain column 1 (family ID). 10450 --map [filename] : Specify full name of .map file.
10451 --no-parents : .fam/.ped file does not contain columns 3-4 (parents). 10451
10452 --no-sex : .fam/.ped file does not contain column 5 (sex). 10452 --no-fid : .fam/.ped file does not contain column 1 (family ID).
10453 --no-pheno : .fam/.ped file does not contain column 6 (phenotype). 10453 --no-parents : .fam/.ped file does not contain columns 3-4 (parents).
10454 10454 --no-sex : .fam/.ped file does not contain column 5 (sex).
10455 --tfile {prefix} : Specify .tped + .tfam filename prefix (default 'plink'). 10455 --no-pheno : .fam/.ped file does not contain column 6 (phenotype).
10456 --tped [fname] : Specify full name of .tped file. 10456
10457 --tfam [fname] : Specify full name of .tfam file. 10457 --tfile {prefix} : Specify .tped + .tfam filename prefix (default 'plink').
10458 10458 --tped [fname] : Specify full name of .tped file.
10459 --lfile {prefix} : Specify .lgen + .map + .fam (long-format fileset) prefix. 10459 --tfam [fname] : Specify full name of .tfam file.
10460 --lgen [fname] : Specify full name of .lgen file. 10460
10461 --reference [fn] : Specify default allele file accompanying .lgen input. 10461 --lfile {prefix} : Specify .lgen + .map + .fam (long-format fileset) prefix.
10462 --allele-count : When used with --lfile/--lgen + --reference, specifies 10462 --lgen [fname] : Specify full name of .lgen file.
10463 that the .lgen file contains reference allele counts. 10463 --reference [fn] : Specify default allele file accompanying .lgen input.
10464 10464 --allele-count : When used with --lfile/--lgen + --reference, specifies
10465 --vcf [filename] : Specify full name of .vcf or .vcf.gz file. 10465 that the .lgen file contains reference allele counts.
10466 --bcf [filename] : Specify full name of BCF2 file. 10466
10467 10467 --vcf [filename] : Specify full name of .vcf or .vcf.gz file.
10468 --data {prefix} : Specify Oxford .gen + .sample prefix (default 'plink'). 10468 --bcf [filename] : Specify full name of BCF2 file.
10469 --gen [filename] : Specify full name of .gen or .gen.gz file. 10469
10470 --bgen [f] <snpid-chr> : Specify full name of .bgen file. 10470 --data {prefix} : Specify Oxford .gen + .sample prefix (default 'plink').
10471 --sample [fname] : Specify full name of .sample file. 10471 --gen [filename] : Specify full name of .gen or .gen.gz file.
10472 10472 --bgen [f] <snpid-chr> : Specify full name of .bgen file.
10473 --23file [fname] {FID} {IID} {sex} {pheno} {pat. ID} {mat. ID} : 10473 --sample [fname] : Specify full name of .sample file.
10474 Specify 23andMe input file. 10474
10475 10475 --23file [fname] {FID} {IID} {sex} {pheno} {pat. ID} {mat. ID} :
10476 --grm-gz {prfx} : Specify .grm.gz + .grm.id (GCTA rel. matrix) prefix. 10476 Specify 23andMe input file.
10477 --grm-bin {prfx} : Specify .grm.bin + .grm.N.bin + .grm.id (GCTA triangular 10477
10478 binary relationship matrix) filename prefix. 10478 --grm-gz {prfx} : Specify .grm.gz + .grm.id (GCTA rel. matrix) prefix.
10479 10479 --grm-bin {prfx} : Specify .grm.bin + .grm.N.bin + .grm.id (GCTA triangular
10480 --dummy [sample ct] [SNP ct] {missing geno freq} {missing pheno freq} 10480 binary relationship matrix) filename prefix.
10481 <acgt | 1234 | 12> <scalar-pheno> 10481
10482 This generates a fake input dataset with the specified number of samples 10482 --dummy [sample ct] [SNP ct] {missing geno freq} {missing pheno freq}
10483 and SNPs. By default, the missing genotype and phenotype frequencies are 10483 <acgt | 1234 | 12> <scalar-pheno>
10484 zero, and genotypes are As and Bs (change the latter with 10484 This generates a fake input dataset with the specified number of samples
10485 'acgt'/'1234'/'12'). The 'scalar-pheno' modifier causes a normally 10485 and SNPs. By default, the missing genotype and phenotype frequencies are
10486 distributed scalar phenotype to be generated instead of a binary one. 10486 zero, and genotypes are As and Bs (change the latter with
10487 10487 'acgt'/'1234'/'12'). The 'scalar-pheno' modifier causes a normally
10488 --simulate [simulation parameter file] <tags | haps> <acgt | 1234 | 12> 10488 distributed scalar phenotype to be generated instead of a binary one.
10489 --simulate-qt [simulation parameter file] <tags | haps> <acgt | 1234 | 12> 10489
10490 --simulate generates a fake input dataset with disease-associated SNPs, 10490 --simulate [simulation parameter file] <tags | haps> <acgt | 1234 | 12>
10491 while --simulate-qt generates a dataset with quantitative trait loci. 10491 --simulate-qt [simulation parameter file] <tags | haps> <acgt | 1234 | 12>
10492 10492 --simulate generates a fake input dataset with disease-associated SNPs,
10493 Output files have names of the form 'plink.{extension}' by default. You can 10493 while --simulate-qt generates a dataset with quantitative trait loci.
10494 change the 'plink' prefix with 10494
10495 10495 Output files have names of the form 'plink.{extension}' by default. You can
10496 --out [prefix] : Specify prefix for output files. 10496 change the 'plink' prefix with
10497 10497
10498 Most runs also require at least one of the following commands: 10498 --out [prefix] : Specify prefix for output files.
10499 10499
10500 --make-bed 10500 Most runs also require at least one of the following commands:
10501 Create a new binary fileset. Unlike the automatic text-to-binary 10501
10502 converters (which only heed chromosome filters), this supports all of 10502 --make-bed
10503 PLINK's filtering flags. 10503 Create a new binary fileset. Unlike the automatic text-to-binary
10504 --make-just-bim 10504 converters (which only heed chromosome filters), this supports all of
10505 --make-just-fam 10505 PLINK's filtering flags.
10506 Variants of --make-bed which only write a new .bim or .fam file. Can be 10506 --make-just-bim
10507 used with only .bim/.fam input. 10507 --make-just-fam
10508 USE THESE CAUTIOUSLY. It is very easy to desynchronize your binary 10508 Variants of --make-bed which only write a new .bim or .fam file. Can be
10509 genotype data and your .bim/.fam indexes if you use these commands 10509 used with only .bim/.fam input.
10510 improperly. If you have any doubt, stick with --make-bed. 10510 USE THESE CAUTIOUSLY. It is very easy to desynchronize your binary
10511 10511 genotype data and your .bim/.fam indexes if you use these commands
10512 --recode [output format] <01 | 12> <tab | tabx | spacex | bgz | gen-gz> 10512 improperly. If you have any doubt, stick with --make-bed.
10513 <include-alt> <omit-nonmale-y> 10513
10514 Create a new text fileset with all filters applied. The following output 10514 --recode [output format] <01 | 12> <tab | tabx | spacex | bgz | gen-gz>
10515 formats are supported: 10515 <include-alt> <omit-nonmale-y>
10516 * '23': 23andMe 4-column format. This can only be used on a single 10516 Create a new text fileset with all filters applied. The following output
10517 sample's data (--keep may be handy), and does not support multicharacter 10517 formats are supported:
10518 allele codes. 10518 * '23': 23andMe 4-column format. This can only be used on a single
10519 * 'A': Sample-major additive (0/1/2) coding, suitable for loading from R. 10519 sample's data (--keep may be handy), and does not support multicharacter
10520 If you need uncounted alleles to be named in the header line, add the 10520 allele codes.
10521 'include-alt' modifier. 10521 * 'A': Sample-major additive (0/1/2) coding, suitable for loading from R.
10522 * 'AD': Sample-major additive (0/1/2) + dominant (het=1/hom=0) coding. 10522 If you need uncounted alleles to be named in the header line, add the
10523 Also supports 'include-alt'. 10523 'include-alt' modifier.
10524 * 'A-transpose': Variant-major 0/1/2. 10524 * 'AD': Sample-major additive (0/1/2) + dominant (het=1/hom=0) coding.
10525 * 'beagle': Unphased per-autosome .dat and .map files, readable by early 10525 Also supports 'include-alt'.
10526 BEAGLE versions. 10526 * 'A-transpose': Variant-major 0/1/2.
10527 * 'beagle-nomap': Single .beagle.dat file. 10527 * 'beagle': Unphased per-autosome .dat and .map files, readable by early
10528 * 'bimbam': Regular BIMBAM format. 10528 BEAGLE versions.
10529 * 'bimbam-1chr': BIMBAM format, with a two-column .pos.txt file. Does not 10529 * 'beagle-nomap': Single .beagle.dat file.
10530 support multiple chromosomes. 10530 * 'bimbam': Regular BIMBAM format.
10531 * 'fastphase': Per-chromosome fastPHASE files, with 10531 * 'bimbam-1chr': BIMBAM format, with a two-column .pos.txt file. Does not
10532 .chr-[chr #].recode.phase.inp filename extensions. 10532 support multiple chromosomes.
10533 * 'fastphase-1chr': Single .recode.phase.inp file. Does not support 10533 * 'fastphase': Per-chromosome fastPHASE files, with
10534 multiple chromosomes. 10534 .chr-[chr #].recode.phase.inp filename extensions.
10535 * 'HV': Per-chromosome Haploview files, with .chr-[chr #][.ped + .info] 10535 * 'fastphase-1chr': Single .recode.phase.inp file. Does not support
10536 filename extensions. 10536 multiple chromosomes.
10537 * 'HV-1chr': Single Haploview .ped + .info file pair. Does not support 10537 * 'HV': Per-chromosome Haploview files, with .chr-[chr #][.ped + .info]
10538 multiple chromosomes. 10538 filename extensions.
10539 * 'lgen': PLINK 1 long-format (.lgen + .fam + .map), loadable with --lfile. 10539 * 'HV-1chr': Single Haploview .ped + .info file pair. Does not support
10540 * 'lgen-ref': .lgen + .fam + .map + .ref, loadable with --lfile + 10540 multiple chromosomes.
10541 --reference. 10541 * 'lgen': PLINK 1 long-format (.lgen + .fam + .map), loadable with --lfile.
10542 * 'list': Single genotype-based list, up to 4 lines per variant. To omit 10542 * 'lgen-ref': .lgen + .fam + .map + .ref, loadable with --lfile +
10543 nonmale genotypes on the Y chromosome, add the 'omit-nonmale-y' modifier. 10543 --reference.
10544 * 'rlist': .rlist + .fam + .map fileset, where the .rlist file is a 10544 * 'list': Single genotype-based list, up to 4 lines per variant. To omit
10545 genotype-based list which omits the most common genotype for each 10545 nonmale genotypes on the Y chromosome, add the 'omit-nonmale-y' modifier.
10546 variant. Also supports 'omit-nonmale-y'. 10546 * 'rlist': .rlist + .fam + .map fileset, where the .rlist file is a
10547 * 'oxford': Oxford-format .gen + .sample. With the 'gen-gz' modifier, the 10547 genotype-based list which omits the most common genotype for each
10548 .gen file is gzipped. 10548 variant. Also supports 'omit-nonmale-y'.
10549 * 'ped': PLINK 1 sample-major (.ped + .map), loadable with --file. 10549 * 'oxford': Oxford-format .gen + .sample. With the 'gen-gz' modifier, the
10550 * 'compound-genotypes': Same as 'ped', except that the space between each 10550 .gen file is gzipped.
10551 pair of same-variant allele codes is removed. 10551 * 'ped': PLINK 1 sample-major (.ped + .map), loadable with --file.
10552 * 'structure': Structure-format. 10552 * 'compound-genotypes': Same as 'ped', except that the space between each
10553 * 'transpose': PLINK 1 variant-major (.tped + .tfam), loadable with 10553 pair of same-variant allele codes is removed.
10554 --tfile. 10554 * 'structure': Structure-format.
10555 * 'vcf', 'vcf-fid', 'vcf-iid': VCFv4.2. 'vcf-fid' and 'vcf-iid' cause 10555 * 'transpose': PLINK 1 variant-major (.tped + .tfam), loadable with
10556 family IDs or within-family IDs respectively to be used for the sample 10556 --tfile.
10557 IDs in the last header row, while 'vcf' merges both IDs and puts an 10557 * 'vcf', 'vcf-fid', 'vcf-iid': VCFv4.2. 'vcf-fid' and 'vcf-iid' cause
10558 underscore between them. If the 'bgz' modifier is added, the VCF file is 10558 family IDs or within-family IDs respectively to be used for the sample
10559 block-gzipped. 10559 IDs in the last header row, while 'vcf' merges both IDs and puts an
10560 The A2 allele is saved as the reference and normally flagged as not based 10560 underscore between them. If the 'bgz' modifier is added, the VCF file is
10561 on a real reference genome (INFO:PR). When it is important for reference 10561 block-gzipped.
10562 alleles to be correct, you'll also want to include --a2-allele and 10562 The A2 allele is saved as the reference and normally flagged as not based
10563 --real-ref-alleles in your command. 10563 on a real reference genome (INFO:PR). When it is important for reference
10564 In addition, 10564 alleles to be correct, you'll also want to include --a2-allele and
10565 * The '12' modifier causes A1 (usually minor) alleles to be coded as '1' 10565 --real-ref-alleles in your command.
10566 and A2 alleles to be coded as '2', while '01' maps A1 -> 0 and A2 -> 1. 10566 In addition,
10567 * The 'tab' modifier makes the output mostly tab-delimited instead of 10567 * The '12' modifier causes A1 (usually minor) alleles to be coded as '1'
10568 mostly space-delimited. 'tabx' and 'spacex' force all tabs and all 10568 and A2 alleles to be coded as '2', while '01' maps A1 -> 0 and A2 -> 1.
10569 spaces, respectively. 10569 * The 'tab' modifier makes the output mostly tab-delimited instead of
10570 10570 mostly space-delimited. 'tabx' and 'spacex' force all tabs and all
10571 --flip-scan <verbose> 10571 spaces, respectively.
10572 (alias: --flipscan) 10572
10573 LD-based scan for case/control strand inconsistency. 10573 --flip-scan <verbose>
10574 10574 (alias: --flipscan)
10575 --write-covar 10575 LD-based scan for case/control strand inconsistency.
10576 If a --covar file is loaded, --make-bed/--make-just-fam and --recode 10576
10577 automatically generate an updated version (with all filters applied). 10577 --write-covar
10578 However, if you do not wish to simultaneously generate a new genotype file, 10578 If a --covar file is loaded, --make-bed/--make-just-fam and --recode
10579 you can use --write-covar to just produce a pruned covariate file. 10579 automatically generate an updated version (with all filters applied).
10580 10580 However, if you do not wish to simultaneously generate a new genotype file,
10581 --write-cluster <omit-unassigned> 10581 you can use --write-covar to just produce a pruned covariate file.
10582 If clusters are specified with --within/--family, this generates a new 10582
10583 cluster file (with all filters applied). The 'omit-unassigned' modifier 10583 --write-cluster <omit-unassigned>
10584 causes unclustered samples to be omitted from the file; otherwise their 10584 If clusters are specified with --within/--family, this generates a new
10585 cluster is 'NA'. 10585 cluster file (with all filters applied). The 'omit-unassigned' modifier
10586 10586 causes unclustered samples to be omitted from the file; otherwise their
10587 --write-set 10587 cluster is 'NA'.
10588 --set-table 10588
10589 If sets have been defined, --write-set dumps 'END'-terminated set 10589 --write-set
10590 membership lists to {output prefix}.set, while --set-table writes a 10590 --set-table
10591 variant-by-set membership table to {output prefix}.set.table. 10591 If sets have been defined, --write-set dumps 'END'-terminated set
10592 10592 membership lists to {output prefix}.set, while --set-table writes a
10593 --merge [.ped filename] [.map filename] 10593 variant-by-set membership table to {output prefix}.set.table.
10594 --merge [text fileset prefix] 10594
10595 --bmerge [.bed filename] [.bim filename] [.fam filename] 10595 --merge [.ped filename] [.map filename]
10596 --bmerge [binary fileset prefix] 10596 --merge [text fileset prefix]
10597 Merge the given fileset with the initially loaded fileset, writing the 10597 --bmerge [.bed filename] [.bim filename] [.fam filename]
10598 result to {output prefix}.bed + .bim + .fam. (It is no longer necessary to 10598 --bmerge [binary fileset prefix]
10599 simultaneously specify --make-bed.) 10599 Merge the given fileset with the initially loaded fileset, writing the
10600 --merge-list [filename] 10600 result to {output prefix}.bed + .bim + .fam. (It is no longer necessary to
10601 Merge all filesets named in the text file with the reference fileset, if 10601 simultaneously specify --make-bed.)
10602 one was specified. (However, this can also be used *without* a reference; 10602 --merge-list [filename]
10603 in that case, the newly created fileset is then treated as the reference by 10603 Merge all filesets named in the text file with the reference fileset, if
10604 most other PLINK operations.) The text file is interpreted as follows: 10604 one was specified. (However, this can also be used *without* a reference;
10605 * If a line contains only one name, it is assumed to be the prefix for a 10605 in that case, the newly created fileset is then treated as the reference by
10606 binary fileset. 10606 most other PLINK operations.) The text file is interpreted as follows:
10607 * If a line contains exactly two names, they are assumed to be the full 10607 * If a line contains only one name, it is assumed to be the prefix for a
10608 filenames for a text fileset (.ped first, then .map). 10608 binary fileset.
10609 * If a line contains exactly three names, they are assumed to be the full 10609 * If a line contains exactly two names, they are assumed to be the full
10610 filenames for a binary fileset (.bed, then .bim, then .fam). 10610 filenames for a text fileset (.ped first, then .map).
10611 10611 * If a line contains exactly three names, they are assumed to be the full
10612 --write-snplist 10612 filenames for a binary fileset (.bed, then .bim, then .fam).
10613 --list-23-indels 10613
10614 --write-snplist writes a .snplist file listing the names of all variants 10614 --write-snplist
10615 which pass the filters and inclusion thresholds you've specified, while 10615 --list-23-indels
10616 --list-23-indels writes the subset with 23andMe-style indel calls (D/I 10616 --write-snplist writes a .snplist file listing the names of all variants
10617 allele codes). 10617 which pass the filters and inclusion thresholds you've specified, while
10618 10618 --list-23-indels writes the subset with 23andMe-style indel calls (D/I
10619 --list-duplicate-vars <require-same-ref> <ids-only> <suppress-first> 10619 allele codes).
10620 --list-duplicate-vars writes a .dupvar file describing all groups of 10620
10621 variants with matching positions and allele codes. 10621 --list-duplicate-vars <require-same-ref> <ids-only> <suppress-first>
10622 * By default, A1/A2 allele assignments are ignored; use 'require-same-ref' 10622 --list-duplicate-vars writes a .dupvar file describing all groups of
10623 to override this. 10623 variants with matching positions and allele codes.
10624 * Normally, the report contains position and allele codes. To remove them 10624 * By default, A1/A2 allele assignments are ignored; use 'require-same-ref'
10625 (and produce a file directly usable with e.g. --extract/--exclude), use 10625 to override this.
10626 'ids-only'. Note that this command will fail in 'ids-only' mode if any 10626 * Normally, the report contains position and allele codes. To remove them
10627 of the reported IDs are not unique. 10627 (and produce a file directly usable with e.g. --extract/--exclude), use
10628 * 'suppress-first' causes the first variant ID in each group to be omitted 10628 'ids-only'. Note that this command will fail in 'ids-only' mode if any
10629 from the report. 10629 of the reported IDs are not unique.
10630 10630 * 'suppress-first' causes the first variant ID in each group to be omitted
10631 --freq <counts | case-control> <gz> 10631 from the report.
10632 --freqx <gz> 10632
10633 --freq generates a basic allele frequency (or count, if the 'counts' 10633 --freq <counts | case-control> <gz>
10634 modifier is present) report. This can be combined with --within/--family 10634 --freqx <gz>
10635 to produce a cluster-stratified allele frequency/count report instead, or 10635 --freq generates a basic allele frequency (or count, if the 'counts'
10636 the 'case-control' modifier to report case and control allele frequencies 10636 modifier is present) report. This can be combined with --within/--family
10637 separately. 10637 to produce a cluster-stratified allele frequency/count report instead, or
10638 --freqx generates a more detailed genotype count report, designed for use 10638 the 'case-control' modifier to report case and control allele frequencies
10639 with --read-freq. 10639 separately.
10640 10640 --freqx generates a more detailed genotype count report, designed for use
10641 --missing <gz> 10641 with --read-freq.
10642 Generate sample- and variant-based missing data reports. If clusters are 10642
10643 defined, the variant-based report is cluster-stratified. 'gz' causes the 10643 --missing <gz>
10644 output files to be gzipped. 10644 Generate sample- and variant-based missing data reports. If clusters are
10645 10645 defined, the variant-based report is cluster-stratified. 'gz' causes the
10646 --test-mishap 10646 output files to be gzipped.
10647 Check for association between missing calls and flanking haplotypes. 10647
10648 10648 --test-mishap
10649 --hardy <midp> <gz> 10649 Check for association between missing calls and flanking haplotypes.
10650 Generate a Hardy-Weinberg exact test p-value report. (This does NOT 10650
10651 simultaneously filter on the p-value any more; use --hwe for that.) With 10651 --hardy <midp> <gz>
10652 the 'midp' modifier, the test applies the mid-p adjustment described in 10652 Generate a Hardy-Weinberg exact test p-value report. (This does NOT
10653 Graffelman J, Moreno V (2013) The mid p-value in exact tests for 10653 simultaneously filter on the p-value any more; use --hwe for that.) With
10654 Hardy-Weinberg Equilibrium. 10654 the 'midp' modifier, the test applies the mid-p adjustment described in
10655 10655 Graffelman J, Moreno V (2013) The mid p-value in exact tests for
10656 --mendel <summaries-only> 10656 Hardy-Weinberg Equilibrium.
10657 Generate a Mendel error report. The 'summaries-only' modifier causes the 10657
10658 .mendel file (listing every single error) to be skipped. 10658 --mendel <summaries-only>
10659 10659 Generate a Mendel error report. The 'summaries-only' modifier causes the
10660 --het <small-sample> <gz> 10660 .mendel file (listing every single error) to be skipped.
10661 --ibc 10661
10662 Estimate inbreeding coefficients. --het reports method-of-moments 10662 --het <small-sample> <gz>
10663 estimates, while --ibc calculates all three values described in Yang J, Lee 10663 --ibc
10664 SH, Goddard ME and Visscher PM (2011) GCTA: A Tool for Genome-wide Complex 10664 Estimate inbreeding coefficients. --het reports method-of-moments
10665 Trait Analysis. (That paper also describes the relationship matrix 10665 estimates, while --ibc calculates all three values described in Yang J, Lee
10666 computation we reimplement.) 10666 SH, Goddard ME and Visscher PM (2011) GCTA: A Tool for Genome-wide Complex
10667 * These functions require decent MAF estimates. If there are very few 10667 Trait Analysis. (That paper also describes the relationship matrix
10668 samples in your immediate fileset, --read-freq is practically mandatory 10668 computation we reimplement.)
10669 since imputed MAFs are wildly inaccurate in that case. 10669 * These functions require decent MAF estimates. If there are very few
10670 * They also assume the marker set is in approximate linkage equilibrium. 10670 samples in your immediate fileset, --read-freq is practically mandatory
10671 * By default, --het omits the n/(n-1) multiplier in Nei's expected 10671 since imputed MAFs are wildly inaccurate in that case.
10672 homozygosity formula. The 'small-sample' modifier causes it to be 10672 * They also assume the marker set is in approximate linkage equilibrium.
10673 included, while forcing --het to use MAFs imputed from founders in the 10673 * By default, --het omits the n/(n-1) multiplier in Nei's expected
10674 immediate dataset. 10674 homozygosity formula. The 'small-sample' modifier causes it to be
10675 10675 included, while forcing --het to use MAFs imputed from founders in the
10676 --check-sex {female max F} {male min F} 10676 immediate dataset.
10677 --check-sex ycount {female max F} {male min F} {female max Y obs} 10677
10678 {male min Y obs} 10678 --check-sex {female max F} {male min F}
10679 --check-sex y-only {female max Y obs} {male min Y obs} 10679 --check-sex ycount {female max F} {male min F} {female max Y obs}
10680 --impute-sex {female max F} {male min F} 10680 {male min Y obs}
10681 --impute-sex ycount {female max F} {male min F} {female max Y obs} 10681 --check-sex y-only {female max Y obs} {male min Y obs}
10682 {male min Y obs} 10682 --impute-sex {female max F} {male min F}
10683 --impute-sex y-only {female max Y obs} {male min Y obs} 10683 --impute-sex ycount {female max F} {male min F} {female max Y obs}
10684 --check-sex normally compares sex assignments in the input dataset with 10684 {male min Y obs}
10685 those imputed from X chromosome inbreeding coefficients. 10685 --impute-sex y-only {female max Y obs} {male min Y obs}
10686 * Make sure that the X chromosome pseudo-autosomal region has been split 10686 --check-sex normally compares sex assignments in the input dataset with
10687 off (with e.g. --split-x) before using this. 10687 those imputed from X chromosome inbreeding coefficients.
10688 * You also need decent MAF estimates (so, with very few samples in your 10688 * Make sure that the X chromosome pseudo-autosomal region has been split
10689 immediate fileset, use --read-freq), and your marker set should be in 10689 off (with e.g. --split-x) before using this.
10690 approximate linkage equilibrium. 10690 * You also need decent MAF estimates (so, with very few samples in your
10691 * By default, F estimates smaller than 0.2 yield female calls, and values 10691 immediate fileset, use --read-freq), and your marker set should be in
10692 larger than 0.8 yield male calls. If you pass numeric parameter(s) to 10692 approximate linkage equilibrium.
10693 --check-sex, the first two control these thresholds. 10693 * By default, F estimates smaller than 0.2 yield female calls, and values
10694 There are now two modes which consider Y chromosome data. 10694 larger than 0.8 yield male calls. If you pass numeric parameter(s) to
10695 * In 'ycount' mode, gender is still imputed from the X chromosome, but 10695 --check-sex, the first two control these thresholds.
10696 female calls are downgraded to ambiguous whenever more than 0 nonmissing 10696 There are now two modes which consider Y chromosome data.
10697 Y genotypes are present, and male calls are downgraded when fewer than 0 10697 * In 'ycount' mode, gender is still imputed from the X chromosome, but
10698 are present. (Note that these are counts, not rates.) These thresholds 10698 female calls are downgraded to ambiguous whenever more than 0 nonmissing
10699 are controllable with --check-sex ycount's optional 3rd and 4th numeric 10699 Y genotypes are present, and male calls are downgraded when fewer than 0
10700 parameters. 10700 are present. (Note that these are counts, not rates.) These thresholds
10701 * In 'y-only' mode, gender is imputed from nonmissing Y genotype counts. 10701 are controllable with --check-sex ycount's optional 3rd and 4th numeric
10702 The male minimum threshold defaults to 1 instead of zero in this case. 10702 parameters.
10703 --impute-sex changes sex assignments to the imputed values, and is 10703 * In 'y-only' mode, gender is imputed from nonmissing Y genotype counts.
10704 otherwise identical to --check-sex. It must be used with 10704 The male minimum threshold defaults to 1 instead of zero in this case.
10705 --make-bed/--recode/--write-covar. 10705 --impute-sex changes sex assignments to the imputed values, and is
10706 10706 otherwise identical to --check-sex. It must be used with
10707 --fst <case-control> 10707 --make-bed/--recode/--write-covar.
10708 (alias: --Fst) 10708
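The default --check-sex thresholds described above can be illustrated with a small sketch. The function name and the "ambiguous" label for in-between values are assumptions made for illustration; this is not PLINK's own code.

```python
def call_sex_from_f(f_estimate, female_max=0.2, male_min=0.8):
    """Hypothetical helper: map an X chromosome F estimate to a sex call.

    The defaults mirror the help text: F below 0.2 gives a female call and
    F above 0.8 gives a male call. Labelling everything in between as
    "ambiguous" is an assumption of this sketch.
    """
    if f_estimate < female_max:
        return "female"
    if f_estimate > male_min:
        return "male"
    return "ambiguous"


if __name__ == "__main__":
    for f in (0.05, 0.5, 0.95):
        print(f, call_sex_from_f(f))
```

Passing different values for female_max and male_min corresponds to supplying the first two numeric parameters to --check-sex.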
10709 Estimate Wright's Fst for each autosomal diploid variant using the method 10709 --fst <case-control>
10710 introduced in Weir BS, Cockerham CC (1984) Estimating F-statistics for the 10710 (alias: --Fst)
10711 analysis of population structure, given a set of subpopulations defined via 10711 Estimate Wright's Fst for each autosomal diploid variant using the method
10712 --within. Raw and weighted global means are also reported. 10712 introduced in Weir BS, Cockerham CC (1984) Estimating F-statistics for the
10713 * If you're interested in the global means, it is usually best to perform 10713 analysis of population structure, given a set of subpopulations defined via
10714 this calculation on a marker set in approximate linkage equilibrium. 10714 --within. Raw and weighted global means are also reported.
10715 * If you have only two subpopulations, you can represent them with 10715 * If you're interested in the global means, it is usually best to perform
10716 case/control status and use the 'case-control' modifier. 10716 this calculation on a marker set in approximate linkage equilibrium.
10717 10717 * If you have only two subpopulations, you can represent them with
10718 --indep [window size]<kb> [step size (variant ct)] [VIF threshold] 10718 case/control status and use the 'case-control' modifier.
10719 --indep-pairwise [window size]<kb> [step size (variant ct)] [r^2 threshold] 10719
10720 --indep-pairphase [window size]<kb> [step size (variant ct)] [r^2 threshold] 10720 --indep [window size]<kb> [step size (variant ct)] [VIF threshold]
10721 Generate a list of markers in approximate linkage equilibrium. With the 10721 --indep-pairwise [window size]<kb> [step size (variant ct)] [r^2 threshold]
10722 'kb' modifier, the window size is in kilobase instead of variant count 10722 --indep-pairphase [window size]<kb> [step size (variant ct)] [r^2 threshold]
10723 units. (Pre-'kb' space is optional, i.e. '--indep-pairwise 500 kb 5 0.5' 10723 Generate a list of markers in approximate linkage equilibrium. With the
10724 and '--indep-pairwise 500kb 5 0.5' have the same effect.) 10724 'kb' modifier, the window size is in kilobase instead of variant count
10725 Note that you need to rerun PLINK using --extract or --exclude on the 10725 units. (Pre-'kb' space is optional, i.e. '--indep-pairwise 500 kb 5 0.5'
10726 .prune.in/.prune.out file to apply the list to another computation. 10726 and '--indep-pairwise 500kb 5 0.5' have the same effect.)
10727 10727 Note that you need to rerun PLINK using --extract or --exclude on the
10728 --r <square | square0 | triangle | inter-chr> <gz | bin | bin4> <spaces> 10728 .prune.in/.prune.out file to apply the list to another computation.
10729 <in-phase> <d | dprime | dprime-signed> <with-freqs> <yes-really> 10729
10730 --r2 <square | square0 | triangle | inter-chr> <gz | bin | bin4> <spaces> 10730 --r <square | square0 | triangle | inter-chr> <gz | bin | bin4> <spaces>
10731 <in-phase> <d | dprime | dprime-signed> <with-freqs> <yes-really> 10731 <in-phase> <d | dprime | dprime-signed> <with-freqs> <yes-really>
10732 LD statistic reports. --r yields raw inter-variant correlations, while 10732 --r2 <square | square0 | triangle | inter-chr> <gz | bin | bin4> <spaces>
10733 --r2 reports their squares. You can request results for all pairs in 10733 <in-phase> <d | dprime | dprime-signed> <with-freqs> <yes-really>
10734 matrix format (if you specify 'bin' or one of the shape modifiers), all 10734 LD statistic reports. --r yields raw inter-variant correlations, while
10735 pairs in table format ('inter-chr'), or a limited window in table format 10735 --r2 reports their squares. You can request results for all pairs in
10736 (default). 10736 matrix format (if you specify 'bin' or one of the shape modifiers), all
10737 * The 'gz' modifier causes the output text file to be gzipped. 10737 pairs in table format ('inter-chr'), or a limited window in table format
10738 * 'bin' causes the output matrix to be written in double-precision binary 10738 (default).
10739 format, while 'bin4' specifies single-precision binary. The matrix is 10739 * The 'gz' modifier causes the output text file to be gzipped.
10740 square if no shape is explicitly specified. 10740 * 'bin' causes the output matrix to be written in double-precision binary
10741 * By default, text matrices are tab-delimited; 'spaces' switches this. 10741 format, while 'bin4' specifies single-precision binary. The matrix is
10742 * 'in-phase' adds a column with in-phase allele pairs to table-formatted 10742 square if no shape is explicitly specified.
10743 reports. (This cannot be used with very long allele codes.) 10743 * By default, text matrices are tab-delimited; 'spaces' switches this.
10744 * 'dprime' adds the absolute value of Lewontin's D-prime statistic to 10744 * 'in-phase' adds a column with in-phase allele pairs to table-formatted
10745 table-formatted reports, and forces both r/r^2 and D-prime to be based on 10745 reports. (This cannot be used with very long allele codes.)
10746 the maximum likelihood solution to the cubic equation discussed in Gaunt 10746 * 'dprime' adds the absolute value of Lewontin's D-prime statistic to
10747 T, Rodriguez S, Day I (2007) Cubic exact solutions for the estimation of 10747 table-formatted reports, and forces both r/r^2 and D-prime to be based on
10748 pairwise haplotype frequencies. 10748 the maximum likelihood solution to the cubic equation discussed in Gaunt
10749 'dprime-signed' keeps the sign, while 'd' skips division by D_{max}. 10749 T, Rodriguez S, Day I (2007) Cubic exact solutions for the estimation of
10750 * 'with-freqs' adds MAF columns to table-formatted reports. 10750 pairwise haplotype frequencies.
10751 * Since the resulting file can easily be huge, you're required to add the 10751 'dprime-signed' keeps the sign, while 'd' skips division by D_{max}.
10752 'yes-really' modifier when requesting an unfiltered, non-distributed all 10752 * 'with-freqs' adds MAF columns to table-formatted reports.
10753 pairs computation on more than 400k variants. 10753 * Since the resulting file can easily be huge, you're required to add the
10754 * These computations can be subdivided with --parallel (even when the 10754 'yes-really' modifier when requesting an unfiltered, non-distributed all
10755 'square' modifier is active). 10755 pairs computation on more than 400k variants.
10756 --ld [variant ID] [variant ID] <hwe-midp> 10756 * These computations can be subdivided with --parallel (even when the
10757 This displays haplotype frequencies, r^2, and D' for a single pair of 10757 'square' modifier is active).
10758 variants. When there are multiple biologically possible solutions to the 10758 --ld [variant ID] [variant ID] <hwe-midp>
10759 haplotype frequency cubic equation, all are displayed (instead of just the 10759 This displays haplotype frequencies, r^2, and D' for a single pair of
10760 maximum likelihood solution identified by --r/--r2), along with HWE exact 10760 variants. When there are multiple biologically possible solutions to the
10761 test statistics. 10761 haplotype frequency cubic equation, all are displayed (instead of just the
10762 10762 maximum likelihood solution identified by --r/--r2), along with HWE exact
10763 --show-tags [filename] 10763 test statistics.
10764 --show-tags all 10764
10765 * If a file is specified, list all variants which tag at least one variant 10765 --show-tags [filename]
10766 named in the file. (This will normally be a superset of the original 10766 --show-tags all
10767 list, since a variant is considered to tag itself here.) 10767 * If a file is specified, list all variants which tag at least one variant
10768 * If 'all' mode is specified, for each variant, each *other* variant which 10768 named in the file. (This will normally be a superset of the original
10769 tags it is reported. 10769 list, since a variant is considered to tag itself here.)
10770 10770 * If 'all' mode is specified, for each variant, each *other* variant which
10771 --blocks <no-pheno-req> <no-small-max-span> 10771 tags it is reported.
10772 Estimate haplotype blocks, via Haploview's interpretation of the block 10772
10773 definition suggested by Gabriel S et al. (2002) The Structure of Haplotype 10773 --blocks <no-pheno-req> <no-small-max-span>
10774 Blocks in the Human Genome. 10774 Estimate haplotype blocks, via Haploview's interpretation of the block
10775 * Normally, samples with missing phenotypes are not considered by this 10775 definition suggested by Gabriel S et al. (2002) The Structure of Haplotype
10776 computation; the 'no-pheno-req' modifier lifts this restriction. 10776 Blocks in the Human Genome.
10777 * Normally, size-2 blocks may not span more than 20kb, and size-3 blocks 10777 * Normally, samples with missing phenotypes are not considered by this
10778 are limited to 30kb. The 'no-small-max-span' modifier removes these 10778 computation; the 'no-pheno-req' modifier lifts this restriction.
10779 limits. 10779 * Normally, size-2 blocks may not span more than 20kb, and size-3 blocks
10780 The .blocks file is valid input for PLINK 1.07's --hap command. However, 10780 are limited to 30kb. The 'no-small-max-span' modifier removes these
10781 the --hap... family of flags has not been reimplemented in PLINK 1.9 due to 10781 limits.
10782 poor phasing accuracy relative to other software; for now, we recommend 10782 The .blocks file is valid input for PLINK 1.07's --hap command. However,
10783 using BEAGLE instead of PLINK for case/control haplotype association 10783 the --hap... family of flags has not been reimplemented in PLINK 1.9 due to
10784 analysis. (You can use '--recode beagle' to export data to BEAGLE 3.3.) 10784 poor phasing accuracy relative to other software; for now, we recommend
10785 We apologize for the inconvenience, and plan to develop variants of the 10785 using BEAGLE instead of PLINK for case/control haplotype association
10786 --hap... flags which handle pre-phased data effectively. 10786 analysis. (You can use '--recode beagle' to export data to BEAGLE 3.3.)
10787 10787 We apologize for the inconvenience, and plan to develop variants of the
10788 --distance <square | square0 | triangle> <gz | bin | bin4> <ibs> <1-ibs> 10788 --hap... flags which handle pre-phased data effectively.
10789 <allele-ct> <flat-missing> 10789
10790 Write a lower-triangular tab-delimited table of (weighted) genomic 10790 --distance <square | square0 | triangle> <gz | bin | bin4> <ibs> <1-ibs>
10791 distances in allele count units to {output prefix}.dist, and a list of the 10791 <allele-ct> <flat-missing>
10792 corresponding sample IDs to {output prefix}.dist.id. The first row of the 10792 Write a lower-triangular tab-delimited table of (weighted) genomic
10793 .dist file contains a single {genome 1-genome 2} distance, the second row 10793 distances in allele count units to {output prefix}.dist, and a list of the
10794 has the {genome 1-genome 3} and {genome 2-genome 3} distances in that 10794 corresponding sample IDs to {output prefix}.dist.id. The first row of the
10795 order, etc. 10795 .dist file contains a single {genome 1-genome 2} distance, the second row
10796 * It is usually best to perform this calculation on a marker set in 10796 has the {genome 1-genome 3} and {genome 2-genome 3} distances in that
10797 approximate linkage equilibrium. 10797 order, etc.
10798 * If the 'square' or 'square0' modifier is present, a square matrix is 10798 * It is usually best to perform this calculation on a marker set in
10799 written instead; 'square0' fills the upper right triangle with zeroes. 10799 approximate linkage equilibrium.
10800 * If the 'gz' modifier is present, a compressed .dist.gz file is written 10800 * If the 'square' or 'square0' modifier is present, a square matrix is
10801 instead of a plain text file. 10801 written instead; 'square0' fills the upper right triangle with zeroes.
10802 * If the 'bin' modifier is present, a binary (square) matrix of 10802 * If the 'gz' modifier is present, a compressed .dist.gz file is written
10803 double-precision floating point values, suitable for loading from R, is 10803 instead of a plain text file.
10804 instead written to {output prefix}.dist.bin. ('bin4' specifies 10804 * If the 'bin' modifier is present, a binary (square) matrix of
10805 single-precision numbers instead.) This can be combined with 'square0' 10805 double-precision floating point values, suitable for loading from R, is
10806 if you still want the upper right zeroed out, or 'triangle' if you don't 10806 instead written to {output prefix}.dist.bin. ('bin4' specifies
10807 want to pad the upper right at all. 10807 single-precision numbers instead.) This can be combined with 'square0'
10808 * If the 'ibs' modifier is present, an identity-by-state matrix is written 10808 if you still want the upper right zeroed out, or 'triangle' if you don't
10809 to {output prefix}.mibs. '1-ibs' causes distances expressed as genomic 10809 want to pad the upper right at all.
10810 proportions (i.e. 1 - IBS) to be written to {output prefix}.mdist. 10810 * If the 'ibs' modifier is present, an identity-by-state matrix is written
10811 Combine with 'allele-ct' if you want to generate the usual .dist file as 10811 to {output prefix}.mibs. '1-ibs' causes distances expressed as genomic
10812 well. 10812 proportions (i.e. 1 - IBS) to be written to {output prefix}.mdist.
10813 * By default, distance rescaling in the presence of missing genotype calls 10813 Combine with 'allele-ct' if you want to generate the usual .dist file as
10814 is sensitive to allele count distributions: if variant A contributes, on 10814 well.
10815 average, twice as much to other pairwise distances as variant B, a 10815 * By default, distance rescaling in the presence of missing genotype calls
10816 missing call at variant A will result in twice as large of a missingness 10816 is sensitive to allele count distributions: if variant A contributes, on
10817 correction. To turn this off (because e.g. your missing calls are highly 10817 average, twice as much to other pairwise distances as variant B, a
10818 nonrandom), use the 'flat-missing' modifier. 10818 missing call at variant A will result in twice as large of a missingness
10819 * The computation can be subdivided with --parallel. 10819 correction. To turn this off (because e.g. your missing calls are highly
10820 --distance-matrix 10820 nonrandom), use the 'flat-missing' modifier.
10821 --ibs-matrix 10821 * The computation can be subdivided with --parallel.
10822 These deprecated commands are equivalent to '--distance 1-ibs flat-missing 10822 --distance-matrix
10823 square' and '--distance ibs flat-missing square', respectively, except that 10823 --ibs-matrix
10824 they generate space- instead of tab-delimited text matrices. 10824 These deprecated commands are equivalent to '--distance 1-ibs flat-missing
10825 10825 square' and '--distance ibs flat-missing square', respectively, except that
10826 --make-rel <square | square0 | triangle> <gz | bin | bin4> 10826 they generate space- instead of tab-delimited text matrices.
10827 <cov | ibc2 | ibc3> 10827
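The default lower-triangular .dist layout described for --distance (row 1 holds the {genome 1-genome 2} distance, row 2 holds {1-3} and {2-3}, and so on) can be expanded into a full symmetric matrix. The following is a minimal sketch assuming the plain-text, tab-delimited default output; the function name is hypothetical.

```python
def read_lower_triangle(lines):
    """Hypothetical helper: expand .dist-style rows into a symmetric matrix.

    Row k (1-based) is expected to hold k distances, from sample k+1 to
    samples 1..k, so n samples produce n-1 rows. The diagonal is left at 0.
    """
    rows = [[float(x) for x in line.split()] for line in lines if line.strip()]
    n = len(rows) + 1
    matrix = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(rows, start=1):
        if len(row) != i:
            raise ValueError(f"row {i} has {len(row)} values, expected {i}")
        for j, dist in enumerate(row):
            matrix[i][j] = dist
            matrix[j][i] = dist
    return matrix


if __name__ == "__main__":
    example = ["1.0", "2.0\t3.0"]  # made-up distances for three samples
    for row in read_lower_triangle(example):
        print(row)
```

The row order of the resulting matrix follows the sample order in the accompanying .dist.id file.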
10828 Write a lower-triangular variance-standardized realized relationship matrix 10828 --make-rel <square | square0 | triangle> <gz | bin | bin4>
10829 to {output prefix}.rel, and corresponding IDs to {output prefix}.rel.id. 10829 <cov | ibc2 | ibc3>
10830 * It is usually best to perform this calculation on a marker set in 10830 Write a lower-triangular variance-standardized realized relationship matrix
10831 approximate linkage equilibrium. 10831 to {output prefix}.rel, and corresponding IDs to {output prefix}.rel.id.
10832 * 'square', 'square0', 'triangle', 'gz', 'bin', and 'bin4' act as they do 10832 * It is usually best to perform this calculation on a marker set in
10833 on --distance. 10833 approximate linkage equilibrium.
10834 * The 'cov' modifier removes the variance standardization step, causing a 10834 * 'square', 'square0', 'triangle', 'gz', 'bin', and 'bin4' act as they do
10835 covariance matrix to be calculated instead. 10835 on --distance.
10836 * By default, the diagonal elements in the relationship matrix are based on 10836 * The 'cov' modifier removes the variance standardization step, causing a
10837 --ibc's Fhat1; use the 'ibc2' or 'ibc3' modifiers to base them on Fhat2 10837 covariance matrix to be calculated instead.
10838 or Fhat3 instead. 10838 * By default, the diagonal elements in the relationship matrix are based on
10839 * The computation can be subdivided with --parallel. 10839 --ibc's Fhat1; use the 'ibc2' or 'ibc3' modifiers to base them on Fhat2
10840 --make-grm-gz <no-gz> <cov | ibc2 | ibc3> 10840 or Fhat3 instead.
10841 --make-grm-bin <cov | ibc2 | ibc3> 10841 * The computation can be subdivided with --parallel.
10842 --make-grm-gz writes the relationships in GCTA's original gzipped list 10842 --make-grm-gz <no-gz> <cov | ibc2 | ibc3>
10843 format, which describes one pair per line, while --make-grm-bin writes them 10843 --make-grm-bin <cov | ibc2 | ibc3>
10844 in GCTA 1.1+'s single-precision triangular binary format. Note that these 10844 --make-grm-gz writes the relationships in GCTA's original gzipped list
10845 formats explicitly report the number of valid observations (where neither 10845 format, which describes one pair per line, while --make-grm-bin writes them
10846 sample has a missing call) for each pair, which is useful input for some 10846 in GCTA 1.1+'s single-precision triangular binary format. Note that these
10847 scripts. 10847 formats explicitly report the number of valid observations (where neither
10848 These computations can be subdivided with --parallel. 10848 sample has a missing call) for each pair, which is useful input for some
10849 10849 scripts.
10850 --rel-cutoff {val} 10850 These computations can be subdivided with --parallel.
10851 (alias: --grm-cutoff) 10851
10852 Exclude one member of each pair of samples with relatedness greater than 10852 --rel-cutoff {val}
10853 the given cutoff value (default 0.025). If no later operation will cause 10853 (alias: --grm-cutoff)
10854 the list of remaining samples to be written to disk, this will save it to 10854 Exclude one member of each pair of samples with relatedness greater than
10855 {output prefix}.rel.id. 10855 the given cutoff value (default 0.025). If no later operation will cause
10856 Note that maximizing the remaining sample size is equivalent to the NP-hard 10856 the list of remaining samples to be written to disk, this will save it to
10857 maximum independent set problem, so we use a greedy algorithm instead of 10857 {output prefix}.rel.id.
10858 guaranteeing optimality. (Use the --make-rel and --keep/--remove flags if 10858 Note that maximizing the remaining sample size is equivalent to the NP-hard
10859 you want to try to do better.) 10859 maximum independent set problem, so we use a greedy algorithm instead of
10860 10860 guaranteeing optimality. (Use the --make-rel and --keep/--remove flags if
10861 --ibs-test {permutation count} 10861 you want to try to do better.)
10862 --groupdist {iters} {d} 10862
10863 Given case/control phenotype data, these commands consider three subsets of 10863 --ibs-test {permutation count}
10864 the distance matrix: pairs of affected samples, affected-unaffected pairs, 10864 --groupdist {iters} {d}
10865 and pairs of unaffected samples. Each of these subsets has a distribution 10865 Given case/control phenotype data, these commands consider three subsets of
10866 of pairwise genomic distances; --ibs-test uses permutation to estimate 10866 the distance matrix: pairs of affected samples, affected-unaffected pairs,
10867 p-values re: which types of pairs are most similar, while --groupdist 10867 and pairs of unaffected samples. Each of these subsets has a distribution
10868 focuses on the differences between the centers of these distributions and 10868 of pairwise genomic distances; --ibs-test uses permutation to estimate
10869 estimates standard errors via delete-d jackknife. 10869 p-values re: which types of pairs are most similar, while --groupdist
10870 10870 focuses on the differences between the centers of these distributions and
10871 --regress-distance {iters} {d} 10871 estimates standard errors via delete-d jackknife.
10872 Linear regression of pairwise genomic distances on pairwise average 10872
10873 phenotypes and vice versa, using delete-d jackknife for standard errors. A 10873 --regress-distance {iters} {d}
10874 scalar phenotype is required. 10874 Linear regression of pairwise genomic distances on pairwise average
10875 * With fewer than two parameters, d is set to {number of people}^0.6 rounded 10875 phenotypes and vice versa, using delete-d jackknife for standard errors. A
10876 down. With no parameters, 100k iterations are run. 10876 scalar phenotype is required.
10877 --regress-rel {iters} {d} 10877 * With fewer than two parameters, d is set to {number of people}^0.6 rounded
10878 Linear regression of pairwise genomic relationships on pairwise average 10878 down. With no parameters, 100k iterations are run.
10879 phenotypes, and vice versa. Defaults for iters and d are the same as for 10879 --regress-rel {iters} {d}
10880 --regress-distance. 10880 Linear regression of pairwise genomic relationships on pairwise average
10881 10881 phenotypes, and vice versa. Defaults for iters and d are the same as for
10882 --genome <gz> <rel-check> <full> <unbounded> <nudge> 10882 --regress-distance.
10883 Generate an identity-by-descent report. 10883
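The --regress-distance and --regress-rel defaults stated above (d set to {number of people}^0.6 rounded down, and 100k iterations) can be written out as a short sketch. The names below are hypothetical, and the arithmetic only restates the help text.

```python
import math

DEFAULT_ITERS = 100_000  # "100k iterations" from the help text


def default_jackknife_d(sample_count):
    """Hypothetical helper: default delete-d size, sample_count^0.6 rounded down."""
    return math.floor(sample_count ** 0.6)


if __name__ == "__main__":
    for n in (100, 1000):
        print(n, default_jackknife_d(n), DEFAULT_ITERS)
```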
10884 * It is usually best to perform this calculation on a marker set in 10884 --genome <gz> <rel-check> <full> <unbounded> <nudge>
10885 approximate linkage equilibrium. 10885 Generate an identity-by-descent report.
10886 * The 'rel-check' modifier excludes pairs of samples with different FIDs 10886 * It is usually best to perform this calculation on a marker set in
10887 from the final report. 10887 approximate linkage equilibrium.
10888 * 'full' adds raw pairwise comparison data to the report. 10888 * The 'rel-check' modifier excludes pairs of samples with different FIDs
10889 * The P(IBD=0/1/2) estimator employed by this command sometimes yields 10889 from the final report.
10890 numbers outside the range [0,1]; by default, these are clipped. The 10890 * 'full' adds raw pairwise comparison data to the report.
10891 'unbounded' modifier turns off this clipping. 10891 * The P(IBD=0/1/2) estimator employed by this command sometimes yields
10892 * Then, when PI_HAT^2 < P(IBD=2), 'nudge' adjusts the final P(IBD=0/1/2) 10892 numbers outside the range [0,1]; by default, these are clipped. The
10893 estimates to a theoretically possible configuration. 10893 'unbounded' modifier turns off this clipping.
10894 * The computation can be subdivided with --parallel. 10894 * Then, when PI_HAT^2 < P(IBD=2), 'nudge' adjusts the final P(IBD=0/1/2)
10895 10895 estimates to a theoretically possible configuration.
10896 --homozyg <group | group-verbose> <consensus-match> <extend> 10896 * The computation can be subdivided with --parallel.
10897 <subtract-1-from-lengths> 10897
10898 --homozyg-snp [min var count] 10898 --homozyg <group | group-verbose> <consensus-match> <extend>
10899 --homozyg-kb [min length] 10899 <subtract-1-from-lengths>
10900 --homozyg-density [max inverse density (kb/var)] 10900 --homozyg-snp [min var count]
10901 --homozyg-gap [max internal gap kb length] 10901 --homozyg-kb [min length]
10902 --homozyg-het [max hets] 10902 --homozyg-density [max inverse density (kb/var)]
10903 --homozyg-window-snp [scanning window size] 10903 --homozyg-gap [max internal gap kb length]
10904 --homozyg-window-het [max hets in scanning window hit] 10904 --homozyg-het [max hets]
10905 --homozyg-window-missing [max missing calls in scanning window hit] 10905 --homozyg-window-snp [scanning window size]
10906 --homozyg-window-threshold [min scanning window hit rate] 10906 --homozyg-window-het [max hets in scanning window hit]
10907 These commands request a set of run-of-homozygosity reports, and allow you 10907 --homozyg-window-missing [max missing calls in scanning window hit]
10908 to customize how they are generated. 10908 --homozyg-window-threshold [min scanning window hit rate]
10909 * If you're satisfied with all the default settings described below, just 10909 These commands request a set of run-of-homozygosity reports, and allow you
10910 use --homozyg with no modifiers. Otherwise, --homozyg lets you change a 10910 to customize how they are generated.
10911 few binary settings: 10911 * If you're satisfied with all the default settings described below, just
10912 * 'group{-verbose}' adds a report on pools of overlapping runs of 10912 use --homozyg with no modifiers. Otherwise, --homozyg lets you change a
10913 homozygosity. (Automatically set when --homozyg-match is present.) 10913 few binary settings:
10914 * With 'group{-verbose}', 'consensus-match' causes pairwise segmental 10914 * 'group{-verbose}' adds a report on pools of overlapping runs of
10915 matches to be called based on the variants in the pool's consensus 10915 homozygosity. (Automatically set when --homozyg-match is present.)
10916 segment, rather than the variants in the pairwise intersection. 10916 * With 'group{-verbose}', 'consensus-match' causes pairwise segmental
10917 * Due to how the scanning window algorithm works, it is possible for a 10917 matches to be called based on the variants in the pool's consensus
10918 reported ROH to be adjacent to a few homozygous variants. The 'extend' 10918 segment, rather than the variants in the pairwise intersection.
10919 modifier causes them to be included in the reported ROH if that 10919 * Due to how the scanning window algorithm works, it is possible for a
10920 wouldn't cause a violation of the --homozyg-density bound. 10920 reported ROH to be adjacent to a few homozygous variants. The 'extend'
10921 * By default, segment bp lengths are calculated as [end bp position] - 10921 modifier causes them to be included in the reported ROH if that
10922 [start bp position] + 1. Therefore, reports normally differ slightly 10922 wouldn't cause a violation of the --homozyg-density bound.
10923 from PLINK 1.07, which does not add 1 at the end. For testing 10923 * By default, segment bp lengths are calculated as [end bp position] -
10924 purposes, you can use the 'subtract-1-from-lengths' modifier to apply 10924 [start bp position] + 1. Therefore, reports normally differ slightly
10925 the old formula. 10925 from PLINK 1.07, which does not add 1 at the end. For testing
10926 * By default, only runs of homozygosity containing at least 100 variants, 10926 purposes, you can use the 'subtract-1-from-lengths' modifier to apply
10927 and of total length >= 1000 kilobases, are noted. You can change these 10927 the old formula.
10928 minimums with --homozyg-snp and --homozyg-kb, respectively. 10928 * By default, only runs of homozygosity containing at least 100 variants,
10929 * By default, a ROH must have at least one variant per 50 kb on average; 10929 and of total length >= 1000 kilobases, are noted. You can change these
10930 change this bound with --homozyg-density. 10930 minimums with --homozyg-snp and --homozyg-kb, respectively.
10931 * By default, if two consecutive variants are more than 1000 kb apart, they 10931 * By default, a ROH must have at least one variant per 50 kb on average;
10932 cannot be in the same ROH; change this bound with --homozyg-gap. 10932 change this bound with --homozyg-density.
10933 * By default, a ROH can contain an unlimited number of heterozygous calls; 10933 * By default, if two consecutive variants are more than 1000 kb apart, they
10934 you can impose a limit with --homozyg-het. 10934 cannot be in the same ROH; change this bound with --homozyg-gap.
10935 * By default, the scanning window contains 50 variants; change this with 10935 * By default, a ROH can contain an unlimited number of heterozygous calls;
10936 --homozyg-window-snp. 10936 you can impose a limit with --homozyg-het.
10937 * By default, a scanning window hit can contain at most 1 heterozygous 10937 * By default, the scanning window contains 50 variants; change this with
10938 call and 5 missing calls; change these limits with --homozyg-window-het 10938 --homozyg-window-snp.
10939 and --homozyg-window-missing, respectively. 10939 * By default, a scanning window hit can contain at most 1 heterozygous
10940 * By default, for a variant to be eligible for inclusion in a ROH, the hit 10940 call and 5 missing calls; change these limits with --homozyg-window-het
10941 rate of all scanning windows containing the variant must be at least 10941 and --homozyg-window-missing, respectively.
10942 0.05; change this threshold with --homozyg-window-threshold. 10942 * By default, for a variant to be eligible for inclusion in a ROH, the hit
10943 10943 rate of all scanning windows containing the variant must be at least
10944 --cluster <cc> <group-avg | old-tiebreaks> <missing> <only2> 10944 0.05; change this threshold with --homozyg-window-threshold.
10945 Cluster samples using a pairwise similarity statistic (normally IBS). 10945
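The two --homozyg segment length conventions described above differ by exactly one base pair. A minimal sketch, with hypothetical function and parameter names:

```python
def roh_length_bp(start_bp, end_bp, subtract_one=False):
    """Hypothetical helper: segment length in base pairs.

    The default follows the help text: [end bp position] - [start bp
    position] + 1. Setting subtract_one=True mimics the
    'subtract-1-from-lengths' modifier, i.e. the PLINK 1.07 formula
    without the trailing + 1.
    """
    length = end_bp - start_bp + 1
    return length - 1 if subtract_one else length


if __name__ == "__main__":
    print(roh_length_bp(1000, 2000))
    print(roh_length_bp(1000, 2000, subtract_one=True))
```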
10946 * The 'cc' modifier forces every cluster to have at least one case and one 10946 --cluster <cc> <group-avg | old-tiebreaks> <missing> <only2>
10947 control. 10947 Cluster samples using a pairwise similarity statistic (normally IBS).
10948 * The 'group-avg' modifier causes clusters to be joined based on average 10948 * The 'cc' modifier forces every cluster to have at least one case and one
10949 instead of minimum pairwise similarity. 10949 control.
10950 * The 'missing' modifier causes clustering to be based on 10950 * The 'group-avg' modifier causes clusters to be joined based on average
10951 identity-by-missingness instead of identity-by-state, and writes a 10951 instead of minimum pairwise similarity.
10952 space-delimited identity-by-missingness matrix to disk. 10952 * The 'missing' modifier causes clustering to be based on
10953 * The 'only2' modifier causes only a .cluster2 file (which is valid input 10953 identity-by-missingness instead of identity-by-state, and writes a
10954 for --within) to be written; otherwise 2 other files will be produced. 10954 space-delimited identity-by-missingness matrix to disk.
10955 * By default, IBS ties are not broken in the same manner as PLINK 1.07, so 10955 * The 'only2' modifier causes only a .cluster2 file (which is valid input
10956 final cluster solutions tend to differ. This is generally harmless. 10956 for --within) to be written; otherwise 2 other files will be produced.
10957 However, to simplify testing, you can use the 'old-tiebreaks' modifier to 10957 * By default, IBS ties are not broken in the same manner as PLINK 1.07, so
10958 force emulation of the old algorithm. 10958 final cluster solutions tend to differ. This is generally harmless.
10959 10959 However, to simplify testing, you can use the 'old-tiebreaks' modifier to
10960 --pca {count} <header> <tabs> <var-wts> 10960 force emulation of the old algorithm.
10961 Calculates a variance-standardized relationship matrix (use 10961
10962 --make-rel/--make-grm-gz/--make-grm-bin to dump it), and extracts the top 10962 --pca {count} <header> <tabs> <var-wts>
10963 20 principal components. 10963 Calculates a variance-standardized relationship matrix (use
10964 * It is usually best to perform this calculation on a marker set in 10964 --make-rel/--make-grm-gz/--make-grm-bin to dump it), and extracts the top
10965 approximate linkage equilibrium. 10965 20 principal components.
10966 * You can change the number of PCs by passing a numeric parameter. 10966 * It is usually best to perform this calculation on a marker set in
10967 * The 'header' modifier adds a header line to the .eigenvec output file. 10967 approximate linkage equilibrium.
10968 (For compatibility with the GCTA flag of the same name, the default is no 10968 * You can change the number of PCs by passing a numeric parameter.
10969 header line.) 10969 * The 'header' modifier adds a header line to the .eigenvec output file.
10970 * The 'tabs' modifier causes the .eigenvec file(s) to be tab-delimited. 10970 (For compatibility with the GCTA flag of the same name, the default is no
10971 * The 'var-wts' modifier requests an additional .eigenvec.var file with PCs 10971 header line.)
10972 expressed as variant weights instead of sample weights. 10972 * The 'tabs' modifier causes the .eigenvec file(s) to be tab-delimited.
10973 10973 * The 'var-wts' modifier requests an additional .eigenvec.var file with PCs
10974 --neighbour [n1] [n2] 10974 expressed as variant weights instead of sample weights.
10975 (alias: --neighbor) 10975
10976 Report IBS distances from each sample to their n1th- to n2th-nearest 10976 --neighbour [n1] [n2]
10977 neighbors, associated Z-scores, and the identities of those neighbors. 10977 (alias: --neighbor)
10978 Useful for outlier detection. 10978 Report IBS distances from each sample to their n1th- to n2th-nearest
10979 10979 neighbors, associated Z-scores, and the identities of those neighbors.
10980 --assoc <perm | mperm=[value]> <perm-count> <fisher | fisher-midp> <counts> 10980 Useful for outlier detection.
10981 <set-test> 10981
10982 --assoc <perm | mperm=[value]> <perm-count> <qt-means> <lin> <set-test> 10982 --assoc <perm | mperm=[value]> <perm-count> <fisher | fisher-midp> <counts>
10983 --model <perm | mperm=[value]> <perm-count> 10983 <set-test>
10984 <fisher | fisher-midp | trend-only> <set-test> 10984 --assoc <perm | mperm=[value]> <perm-count> <qt-means> <lin> <set-test>
10985 <dom | rec | gen | trend> 10985 --model <perm | mperm=[value]> <perm-count>
10986 Basic association analysis report. 10986 <fisher | fisher-midp | trend-only> <set-test>
10987 Given a case/control phenotype, --assoc performs a 1df chi-square allelic 10987 <dom | rec | gen | trend>
10988 test, while --model performs 4 other tests as well (1df dominant gene 10988 Basic association analysis report.
10989 action, 1df recessive gene action, 2df genotypic, Cochran-Armitage trend). 10989 Given a case/control phenotype, --assoc performs a 1df chi-square allelic
10990 * With 'fisher'/'fisher-midp', Fisher's exact test is used to generate 10990 test, while --model performs 4 other tests as well (1df dominant gene
10991 p-values. 'fisher-midp' also applies Lancaster's mid-p adjustment. 10991 action, 1df recessive gene action, 2df genotypic, Cochran-Armitage trend).
10992 * 'perm' causes an adaptive permutation test to be performed. 10992 * With 'fisher'/'fisher-midp', Fisher's exact test is used to generate
10993 * 'mperm=[value]' causes a max(T) permutation test with the specified 10993 p-values. 'fisher-midp' also applies Lancaster's mid-p adjustment.
10994 number of replications to be performed. 10994 * 'perm' causes an adaptive permutation test to be performed.
10995 * 'perm-count' causes the permutation test report to include counts instead 10995 * 'mperm=[value]' causes a max(T) permutation test with the specified
10996 of frequencies. 10996 number of replications to be performed.
10997 * 'counts' causes --assoc to report allele counts instead of frequencies. 10997 * 'perm-count' causes the permutation test report to include counts instead
10998 * 'set-test' tests the significance of variant sets. Requires permutation; 10998 of frequencies.
10999 can be customized with --set-p/--set-r2/--set-max. 10999 * 'counts' causes --assoc to report allele counts instead of frequencies.
11000 * 'dom', 'rec', 'gen', and 'trend' force the corresponding test to be used 11000 * 'set-test' tests the significance of variant sets. Requires permutation;
11001 as the basis for --model permutation. (By default, the most significant 11001 can be customized with --set-p/--set-r2/--set-max.
11002 result among the allelic, dominant, and recessive tests is used.) 11002 * 'dom', 'rec', 'gen', and 'trend' force the corresponding test to be used
11003 * 'trend-only' causes only the trend test to be performed. 11003 as the basis for --model permutation. (By default, the most significant
11004 Given a quantitative phenotype, --assoc normally performs a Wald test. 11004 result among the allelic, dominant, and recessive tests is used.)
11005 * In this case, the 'qt-means' modifier causes trait means and standard 11005 * 'trend-only' causes only the trend test to be performed.
11006 deviations stratified by genotype to be reported as well. 11006 Given a quantitative phenotype, --assoc normally performs a Wald test.
11007 * 'lin' causes the Lin statistic to be computed, and makes it the basis for 11007 * In this case, the 'qt-means' modifier causes trait means and standard
11008 multiple-testing corrections and permutation tests. 11008 deviations stratified by genotype to be reported as well.
11009 Several other flags (most notably, --aperm) can be used to customize the 11009 * 'lin' causes the Lin statistic to be computed, and makes it the basis for
11010 permutation test. 11010 multiple-testing corrections and permutation tests.
11011 11011 Several other flags (most notably, --aperm) can be used to customize the
11012 --mh <perm | mperm=[value]> <perm-count> <set-test> 11012 permutation test.
11013 (alias: --cmh) 11013
11014 --bd <perm | perm-bd | mperm=[value]> <perm-count> <set-test> 11014 --mh <perm | mperm=[value]> <perm-count> <set-test>
11015 --mh2 11015 (alias: --cmh)
11016 --homog 11016 --bd <perm | perm-bd | mperm=[value]> <perm-count> <set-test>
11017 Given a case/control phenotype and a set of clusters, --mh computes 2x2xK 11017 --mh2
11018 Cochran-Mantel-Haenszel statistics for each variant, while --bd also 11018 --homog
11019 performs the Breslow-Day test for odds ratio homogeneity. Permutation and 11019 Given a case/control phenotype and a set of clusters, --mh computes 2x2xK
11020 variant set testing based on the CMH (default) or Breslow-Day (when 11020 Cochran-Mantel-Haenszel statistics for each variant, while --bd also
11021 'perm-bd' is present) statistic are supported. 11021 performs the Breslow-Day test for odds ratio homogeneity. Permutation and
11022 The following similar analyses are also available: 11022 variant set testing based on the CMH (default) or Breslow-Day (when
11023 * --mh2 swaps the roles of case/control status and cluster membership, 11023 'perm-bd' is present) statistic are supported.
11024 performing a phenotype-stratified IxJxK Cochran-Mantel-Haenszel test on 11024 The following similar analyses are also available:
11025 association between cluster assignments and genotypes. 11025 * --mh2 swaps the roles of case/control status and cluster membership,
11026 * --homog executes an alternative to the Breslow-Day test, based on 11026 performing a phenotype-stratified IxJxK Cochran-Mantel-Haenszel test on
11027 partitioning of the chi-square statistic. 11027 association between cluster assignments and genotypes.
11028 11028 * --homog executes an alternative to the Breslow-Day test, based on
11029 --gxe {covariate index} 11029 partitioning of the chi-square statistic.
11030 Given both a quantitative phenotype and a case/control covariate loaded 11030
11031 with --covar defining two groups, --gxe compares the regression coefficient 11031 --gxe {covariate index}
11032 derived from considering only members of one group to the regression 11032 Given both a quantitative phenotype and a case/control covariate loaded
11033 coefficient derived from considering only members of the other. By 11033 with --covar defining two groups, --gxe compares the regression coefficient
11034 default, the first covariate in the --covar file defines the groups; use 11034 derived from considering only members of one group to the regression
11035 e.g. '--gxe 3' to base them on the third covariate instead. 11035 coefficient derived from considering only members of the other. By
11036 11036 default, the first covariate in the --covar file defines the groups; use
11037 --linear <perm | mperm=[value]> <perm-count> <set-test> 11037 e.g. '--gxe 3' to base them on the third covariate instead.
11038 <genotypic | hethom | dominant | recessive | no-snp> <hide-covar> 11038
11039 <sex | no-x-sex> <interaction> <beta> <standard-beta> <intercept> 11039 --linear <perm | mperm=[value]> <perm-count> <set-test>
11040 --logistic <perm | mperm=[value]> <perm-count> <set-test>
11041 <genotypic | hethom | dominant | recessive | no-snp> <hide-covar> 11040 <genotypic | hethom | dominant | recessive | no-snp> <hide-covar>
11042 <sex | no-x-sex> <interaction> <beta> <intercept> 11041 <sex | no-x-sex> <interaction> <beta> <standard-beta> <intercept>
11043 Multi-covariate association analysis on a quantitative (--linear) or 11042 --logistic <perm | mperm=[value]> <perm-count> <set-test>
11044 case/control (--logistic) phenotype. Normally used with --covar. 11043 <genotypic | hethom | dominant | recessive | no-snp> <hide-covar>
11045 * 'perm' normally causes an adaptive permutation test to be performed on 11044 <sex | no-x-sex> <interaction> <beta> <intercept>
11046 the main effect, while 'mperm=[value]' starts a max(T) permutation test. 11045 Multi-covariate association analysis on a quantitative (--linear) or
11047 * 'perm-count' causes the permutation test report to include counts instead 11046 case/control (--logistic) phenotype. Normally used with --covar.
11048 of frequencies. 11047 * 'perm' normally causes an adaptive permutation test to be performed on
11049 * 'set-test' tests the significance of variant sets. Requires permutation; 11048 the main effect, while 'mperm=[value]' starts a max(T) permutation test.
11050 can be customized with --set-p/--set-r2/--set-max. 11049 * 'perm-count' causes the permutation test report to include counts instead
11051 * The 'genotypic' modifier adds an additive effect/dominance deviation 2df 11050 of frequencies.
11052 joint test (0/1/2 and 0/1/0 coding), while 'hethom' uses 0/0/1 and 0/1/0 11051 * 'set-test' tests the significance of variant sets. Requires permutation;
11053 coding instead. If permutation is also requested, these modifiers cause 11052 can be customized with --set-p/--set-r2/--set-max.
11054 permutation to be based on the joint test. 11053 * The 'genotypic' modifier adds an additive effect/dominance deviation 2df
11055 * 'dominant' and 'recessive' specify a model assuming full dominance or 11054 joint test (0/1/2 and 0/1/0 coding), while 'hethom' uses 0/0/1 and 0/1/0
11056 recessiveness, respectively, for the A1 allele. 11055 coding instead. If permutation is also requested, these modifiers cause
11057 * 'no-snp' causes regression to be performed only on the phenotype and the 11056 permutation to be based on the joint test.
11058 covariates, without reference to genomic data. If permutation is also 11057 * 'dominant' and 'recessive' specify a model assuming full dominance or
11059 requested, results are reported for all covariates. 11058 recessiveness, respectively, for the A1 allele.
11060 * 'hide-covar' removes covariate-specific lines from the report. 11059 * 'no-snp' causes regression to be performed only on the phenotype and the
11061 * By default, sex (male = 1, female = 0) is automatically added as a 11060 covariates, without reference to genomic data. If permutation is also
11062 covariate on X chromosome variants, and nowhere else. The 'sex' modifier 11061 requested, results are reported for all covariates.
11063 causes it to be added everywhere, while 'no-x-sex' excludes it. 11062 * 'hide-covar' removes covariate-specific lines from the report.
11064 * 'interaction' adds genotype x covariate interactions to the model. This 11063 * By default, sex (male = 1, female = 0) is automatically added as a
11065 cannot be used with the usual permutation tests; use --tests to define 11064 covariate on X chromosome variants, and nowhere else. The 'sex' modifier
11066 the permutation test statistic instead. 11065 causes it to be added everywhere, while 'no-x-sex' excludes it.
11067 * 'intercept' causes intercepts to be included in the main report. 11066 * 'interaction' adds genotype x covariate interactions to the model. This
11068 * For logistic regressions, the 'beta' modifier causes regression 11067 cannot be used with the usual permutation tests; use --tests to define
11069 coefficients instead of odds ratios to be reported. 11068 the permutation test statistic instead.
11070 * With --linear, the 'standard-beta' modifier standardizes the phenotype 11069 * 'intercept' causes intercepts to be included in the main report.
11071 and all predictors to zero mean and unit variance before regression. 11070 * For logistic regressions, the 'beta' modifier causes regression
11072 11071 coefficients instead of odds ratios to be reported.
11073 --dosage [allele dosage file] <noheader> <skip0=[i]> <skip1=[j]> <skip2=[k]> 11072 * With --linear, the 'standard-beta' modifier standardizes the phenotype
11074 <dose1> <format=[m]> <Zout> <occur | standard-beta> <sex> 11073 and all predictors to zero mean and unit variance before regression.
11075 <case-control-freqs> 11074
11076 --dosage [list file] list <sepheader | noheader> <skip0=[i]> <skip1=[j]> 11075 --dosage [allele dosage file] <noheader> <skip0=[i]> <skip1=[j]> <skip2=[k]>
11077 <skip2=[k]> <dose1> <format=[m]> <Zout> <occur | standard-beta> 11076 <dose1> <format=[m]> <Zout> <occur | standard-beta> <sex>
11078 <sex> <case-control-freqs> 11077 <case-control-freqs>
11079 --write-dosage 11078 --dosage [list file] list <sepheader | noheader> <skip0=[i]> <skip1=[j]>
11080 Process (possibly gzipped) text files with variant-major allelic dosage 11079 <skip2=[k]> <dose1> <format=[m]> <Zout> <occur | standard-beta>
11081 data. This cannot be used with a regular input fileset; instead, you must 11080 <sex> <case-control-freqs>
11082 *only* specify a .fam and possibly a .map file, and you can't specify any 11081 --write-dosage
11083 other commands. 11082 Process (possibly gzipped) text files with variant-major allelic dosage
11084 * PLINK 2.0 will have first-class support for genotype probabilities. An 11083 data. This cannot be used with a regular input fileset; instead, you must
11085 equivalent data import flag will be provided then, and --dosage will be 11084 *only* specify a .fam and possibly a .map file, and you can't specify any
11086 retired. 11085 other commands.
11087 * By default, --dosage assumes that only one allelic dosage file should be 11086 * PLINK 2.0 will have first-class support for genotype probabilities. An
11088 loaded. To specify multiple files, 11087 equivalent data import flag will be provided then, and --dosage will be
11089 1. create a master list with one entry per line. There are normally two 11088 retired.
11090 supported formats for this list: just a filename per line, or variant 11089 * By default, --dosage assumes that only one allelic dosage file should be
11091 batch numbers in the first column and filenames in the second. 11090 loaded. To specify multiple files,
11092 2. Provide the name of that list as the first --dosage parameter. 11091 1. create a master list with one entry per line. There are normally two
11093 3. Add the 'list' modifier. 11092 supported formats for this list: just a filename per line, or variant
11094 * By default, --dosage assumes the allelic dosage file(s) contain a header 11093 batch numbers in the first column and filenames in the second.
11095 line, which has 'SNP' in column i+1, 'A1' in column i+j+2, 'A2' in column 11094 2. Provide the name of that list as the first --dosage parameter.
11096 i+j+3, and sample FID/IIDs starting from column i+j+k+4. (i/j/k are 11095 3. Add the 'list' modifier.
11097 normally zero, but can be changed with 'skip0', 'skip1', and 'skip2' 11096 * By default, --dosage assumes the allelic dosage file(s) contain a header
11098 respectively.) If such a header line is not present, 11097 line, which has 'SNP' in column i+1, 'A1' in column i+j+2, 'A2' in column
11099 * when all samples appear in the same order as they do in the .fam file, 11098 i+j+3, and sample FID/IIDs starting from column i+j+k+4. (i/j/k are
11100 you can use the 'noheader' modifier. 11099 normally zero, but can be changed with 'skip0', 'skip1', and 'skip2'
11101 * Otherwise, use the 'sepheader' modifier, and append sample ID filenames 11100 respectively.) If such a header line is not present,
11102 to your 'list' file entries. 11101 * when all samples appear in the same order as they do in the .fam file,
11103 * The 'format' modifier lets you specify the number of values used to 11102 you can use the 'noheader' modifier.
11104 represent each dosage. 'format=1' normally indicates a single 0..2 A1 11103 * Otherwise, use the 'sepheader' modifier, and append sample ID filenames
11105 expected count; 'dose1' modifies this to a 0..1 frequency. 'format=2' 11104 to your 'list' file entries.
11106 (the default) indicates a 0..1 homozygous A1 likelihood followed by a 11105 * The 'format' modifier lets you specify the number of values used to
11107 0..1 het likelihood, while 'format=3' indicates 0..1 hom A1, 0..1 het, 11106 represent each dosage. 'format=1' normally indicates a single 0..2 A1
11108 0..1 hom A2. 11107 expected count; 'dose1' modifies this to a 0..1 frequency. 'format=2'
11109 * 'Zout' causes the output file to be gzipped. 11108 (the default) indicates a 0..1 homozygous A1 likelihood followed by a
11110 * Normally, an association analysis is performed. 'standard-beta' and 11109 0..1 het likelihood, while 'format=3' indicates 0..1 hom A1, 0..1 het,
11111 'sex' behave as they are supposed to with --linear/--logistic. 11110 0..1 hom A2.
11112 'case-control-freqs' causes case and control allele frequencies to be 11111 * 'Zout' causes the output file to be gzipped.
11113 reported separately. 11112 * Normally, an association analysis is performed. 'standard-beta' and
11114 * There are three alternate modes which cause the association analysis to 11113 'sex' behave as they are supposed to with --linear/--logistic.
11115 be skipped. 11114 'case-control-freqs' causes case and control allele frequencies to be
11116 * 'occur' requests a simple variant occurrence report. 11115 reported separately.
11117 * --write-dosage causes a simple merged file matching the 'format' 11116 * There are three alternate modes which cause the association analysis to
11118 specification (not including 'dose1') to be generated. 11117 be skipped.
11119 * --score applies a linear scoring system to the dosages. 11118 * 'occur' requests a simple variant occurrence report.
11120 11119 * --write-dosage causes a simple merged file matching the 'format'
11121 --lasso [h2 estimate] {min lambda} <report-zeroes> 11120 specification (not including 'dose1') to be generated.
11122 Estimate variant effect sizes via LASSO regression. You must provide an 11121 * --score applies a linear scoring system to the dosages.
11123 additive heritability estimate to calibrate the regression. 11122
11124 Note that this method may require a very large sample size (e.g. hundreds 11123 --lasso [h2 estimate] {min lambda} <report-zeroes>
11125 of thousands) to be effective on complex polygenic traits. 11124 Estimate variant effect sizes via LASSO regression. You must provide an
11126 11125 additive heritability estimate to calibrate the regression.
11127 --test-missing <perm | mperm=[value]> <perm-count> <midp> 11126 Note that this method may require a very large sample size (e.g. hundreds
11128 Check for association between missingness and case/control status, using 11127 of thousands) to be effective on complex polygenic traits.
11129 Fisher's exact test. The 'midp' modifier causes Lancaster's mid-p 11128
11130 adjustment to be applied. 11129 --test-missing <perm | mperm=[value]> <perm-count> <midp>
11131 11130 Check for association between missingness and case/control status, using
11132 --make-perm-pheno [ct] 11131 Fisher's exact test. The 'midp' modifier causes Lancaster's mid-p
11133 Generate phenotype permutations and write them to disk, without invoking an 11132 adjustment to be applied.
11134 association test. 11133
11135 11134 --make-perm-pheno [ct]
11136 --tdt <exact | exact-midp | poo> <perm | mperm=[value]> <perm-count> 11135 Generate phenotype permutations and write them to disk, without invoking an
11137 <parentdt1 | parentdt2 | pat | mat> <set-test> 11136 association test.
11138 Report transmission disequilibrium test statistics, given case/control 11137
11139 phenotypes and pedigree information. 11138 --tdt <exact | exact-midp | poo> <perm | mperm=[value]> <perm-count>
11140 * A Mendel error check is performed before the main tests; offending 11139 <parentdt1 | parentdt2 | pat | mat> <set-test>
11141 genotypes are treated as missing by this analysis. 11140 Report transmission disequilibrium test statistics, given case/control
11142 * By default, the basic TDT p-value is based on a chi-square test unless 11141 phenotypes and pedigree information.
11143 you request the exact binomial test with 'exact' or 'exact-midp'. 11142 * A Mendel error check is performed before the main tests; offending
11144 * 'perm'/'mperm=[value]' requests a family-based adaptive or max(T) 11143 genotypes are treated as missing by this analysis.
11145 permutation test. By default, the permutation test statistic is the 11144 * By default, the basic TDT p-value is based on a chi-square test unless
11146 basic TDT p-value; 'parentdt1'/'parentdt2' cause parenTDT or combined 11145 you request the exact binomial test with 'exact' or 'exact-midp'.
11147 test p-values, respectively, to be considered instead. 11146 * 'perm'/'mperm=[value]' requests a family-based adaptive or max(T)
11148 * 'set-test' tests the significance of variant sets. This cannot be used 11147 permutation test. By default, the permutation test statistic is the
11149 with exact tests for now. 11148 basic TDT p-value; 'parentdt1'/'parentdt2' cause parenTDT or combined
11150 The 'poo' modifier causes a parent-of-origin analysis to be performed 11149 test p-values, respectively, to be considered instead.
11151 instead, with transmissions from heterozygous fathers and heterozygous 11150 * 'set-test' tests the significance of variant sets. This cannot be used
11152 mothers considered separately. 11151 with exact tests for now.
11153 * The parent-of-origin analysis does not currently support exact tests. 11152 The 'poo' modifier causes a parent-of-origin analysis to be performed
11154 * By default, the permutation test statistic is the absolute 11153 instead, with transmissions from heterozygous fathers and heterozygous
11155 parent-of-origin test Z score; 'pat'/'mat' cause paternal or maternal TDT 11154 mothers considered separately.
11156 chi-square statistics, respectively, to be considered instead. 11155 * The parent-of-origin analysis does not currently support exact tests.
11157 11156 * By default, the permutation test statistic is the absolute
11158 --qfam <perm | mperm=[value]> <perm-count> <emp-se> 11157 parent-of-origin test Z score; 'pat'/'mat' cause paternal or maternal TDT
11159 --qfam-parents <perm | mperm=[value]> <perm-count> <emp-se> 11158 chi-square statistics, respectively, to be considered instead.
11160 --qfam-between <perm | mperm=[value]> <perm-count> <emp-se> 11159
11161 --qfam-total <perm | mperm=[value]> <perm-count> <emp-se> 11160 --qfam <perm | mperm=[value]> <perm-count> <emp-se>
11162 QFAM family-based association test for quantitative traits. 11161 --qfam-parents <perm | mperm=[value]> <perm-count> <emp-se>
11163 * A Mendel error check is performed before the main tests; offending 11162 --qfam-between <perm | mperm=[value]> <perm-count> <emp-se>
11164 genotypes are treated as missing by this analysis. 11163 --qfam-total <perm | mperm=[value]> <perm-count> <emp-se>
11165 * This procedure requires permutation. 'perm' and 'perm-count' have the 11164 QFAM family-based association test for quantitative traits.
11166 usual meanings. However, 'mperm=[value]' just specifies a fixed number 11165 * A Mendel error check is performed before the main tests; offending
11167 of permutations; the method does not support a proper max(T) test. 11166 genotypes are treated as missing by this analysis.
11168 * The 'emp-se' modifier adds BETA and EMP_SE (empirical standard error for 11167 * This procedure requires permutation. 'perm' and 'perm-count' have the
11169 beta) fields to the .perm output file. 11168 usual meanings. However, 'mperm=[value]' just specifies a fixed number
11170 11169 of permutations; the method does not support a proper max(T) test.
11171 --annotate [PLINK report] <attrib=[file]> <ranges=[file]> <filter=[file]> 11170 * The 'emp-se' modifier adds BETA and EMP_SE (empirical standard error for
11172 <snps=[file]> <NA | prune> <block> <subset=[file]> <minimal> 11171 beta) fields to the .perm output file.
11173 <distance> 11172
11174 Add annotations to a variant-based PLINK report. This requires an 11173 --annotate [PLINK report] <attrib=[file]> <ranges=[file]> <filter=[file]>
11175 annotation source: 11174 <snps=[file]> <NA | prune> <block> <subset=[file]> <minimal>
11176 * 'attrib=[file]' specifies a (possibly gzipped) attribute file. 11175 <distance>
11177 * 'ranges=[file]' specifies a gene/range list file. 11176 Add annotations to a variant-based PLINK report. This requires an
11178 (Both source types can be specified simultaneously.) The following options 11177 annotation source:
11179 are also supported: 11178 * 'attrib=[file]' specifies a (possibly gzipped) attribute file.
11180 * 'filter=[file]' causes only variants within one of the ranges in the file 11179 * 'ranges=[file]' specifies a gene/range list file.
11181 to be included in the new report. 11180 (Both source types can be specified simultaneously.) The following options
11182 * 'snps=[file]' causes only variants named in the file to be included in 11181 are also supported:
11183 the new report. 11182 * 'filter=[file]' causes only variants within one of the ranges in the file
11184 * The 'NA' modifier causes unannotated variants to have 'NA' instead of '.' 11183 to be included in the new report.
11185 in the new report's ANNOT column, while the 'prune' modifier excludes 11184 * 'snps=[file]' causes only variants named in the file to be included in
11186 them entirely. 11185 the new report.
11187 * The 'block' modifier replaces the single ANNOT column with a 0/1-coded 11186 * The 'NA' modifier causes unannotated variants to have 'NA' instead of '.'
11188 column for each possible annotation. 11187 in the new report's ANNOT column, while the 'prune' modifier excludes
11189 * With 'ranges', 11188 them entirely.
11190 * 'subset=[file]' causes only intervals named in the subset file to be 11189 * The 'block' modifier replaces the single ANNOT column with a 0/1-coded
11191 loaded from the ranges file. 11190 column for each possible annotation.
11192 * interval annotations normally come with a parenthesized signed distance 11191 * With 'ranges',
11193 to the interval boundary (0 if the variant is located inside the 11192 * 'subset=[file]' causes only intervals named in the subset file to be
11194 interval; this is always true without --border). They can be excluded 11193 loaded from the ranges file.
11195 with the 'minimal' modifier. 11194 * interval annotations normally come with a parenthesized signed distance
11196 * the 'distance' modifier adds 'DIST' and 'SGN' columns describing signed 11195 to the interval boundary (0 if the variant is located inside the
11197 distance to the nearest interval. 11196 interval; this is always true without --border). They can be excluded
11198 * When --pfilter is present, high p-values are filtered out. 11197 with the 'minimal' modifier.
11199 11198 * the 'distance' modifier adds 'DIST' and 'SGN' columns describing signed
11200 --clump [PLINK report filename(s)...] 11199 distance to the nearest interval.
11201 Process association analysis report(s) with 'SNP' and p-value columns, 11200 * When --pfilter is present, high p-values are filtered out.
11202 organizing results by LD-based clumps. Multiple filenames can be separated 11201
11203 by spaces or commas. 11202 --clump [PLINK report filename(s)...]
11204 11203 Process association analysis report(s) with 'SNP' and p-value columns,
11205 --gene-report [PLINK report] [gene range file] 11204 organizing results by LD-based clumps. Multiple filenames can be separated
11206 Generate a gene-based report from a variant-based report. 11205 by spaces or commas.
11207 * When --pfilter is present, high p-values are filtered out. 11206
11208 * When --extract (without 'range') is present, only variants named in the 11207 --gene-report [PLINK report] [gene range file]
11209 --extract file are considered. 11208 Generate a gene-based report from a variant-based report.
11210 11209 * When --pfilter is present, high p-values are filtered out.
11211 --meta-analysis [PLINK report filenames...] 11210 * When --extract (without 'range') is present, only variants named in the
11212 --meta-analysis [PLINK report filenames...] + <logscale | qt> 11211 --extract file are considered.
11213 <no-map | no-allele> <study> <report-all> <weighted-z> 11212
11214 Perform a meta-analysis on several variant-based reports with 'SNP' and 11213 --meta-analysis [PLINK report filenames...]
11215 'SE' fields. 11214 --meta-analysis [PLINK report filenames...] + <logscale | qt>
11216 * Normally, an 'OR' odds ratio field must also be present in each input 11215 <no-map | no-allele> <study> <report-all> <weighted-z>
11217 file. With 'logscale', 'BETA' log-odds values/regression coefficients 11216 Perform a meta-analysis on several variant-based reports with 'SNP' and
11218 are expected instead, but the generated report will still contain odds 11217 'SE' fields.
11219 ratio estimates. With 'qt', both input and output values are regression 11218 * Normally, an 'OR' odds ratio field must also be present in each input
11220 betas. 11219 file. With 'logscale', 'BETA' log-odds values/regression coefficients
11221 * 'CHR', 'BP', and 'A1' fields are also normally required. 'no-map' causes 11220 are expected instead, but the generated report will still contain odds
11222 them to all be ignored, while 'no-allele' causes just 'A1' to be ignored. 11221 ratio estimates. With 'qt', both input and output values are regression
11223 * If 'A2' fields are present, and neither 'no-map' nor 'no-allele' was 11222 betas.
11224 specified, A1/A2 allele flips are handled properly. Otherwise, A1 11223 * 'CHR', 'BP', and 'A1' fields are also normally required. 'no-map' causes
11225 mismatches are thrown out. 11224 them to all be ignored, while 'no-allele' causes just 'A1' to be ignored.
11226 * 'study' causes study-specific effect estimates to be collated in the 11225 * If 'A2' fields are present, and neither 'no-map' nor 'no-allele' was
11227 meta-analysis report. 11226 specified, A1/A2 allele flips are handled properly. Otherwise, A1
11228 * 'report-all' causes variants present in only a single input file to be 11227 mismatches are thrown out.
11229 included in the meta-analysis report. 11228 * 'study' causes study-specific effect estimates to be collated in the
11230 * 'weighted-z' requests weighted Z-score-based p-values (as computed by the 11229 meta-analysis report.
11231 Abecasis Lab's METAL software) in addition to the usual inverse 11230 * 'report-all' causes variants present in only a single input file to be
11232 variance-based analysis. This requires P and effective sample size 11231 included in the meta-analysis report.
11233 fields. 11232 * 'weighted-z' requests weighted Z-score-based p-values (as computed by the
11234 * When --extract (without 'range') is present, only variants named in the 11233 Abecasis Lab's METAL software) in addition to the usual inverse
11235 --extract file are considered. 11234 variance-based analysis. This requires P and effective sample size
11236 * Unless 'no-map' is specified, chromosome filters are also respected. 11235 fields.
11237 11236 * When --extract (without 'range') is present, only variants named in the
11238 --fast-epistasis <boost | joint-effects | no-ueki> <case-only> 11237 --extract file are considered.
11239 <set-by-set | set-by-all> <nop> 11238 * Unless 'no-map' is specified, chromosome filters are also respected.
11240 --epistasis <set-by-set | set-by-all> 11239
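For orientation, the "usual inverse variance-based analysis" named in the --meta-analysis notes can be sketched with the textbook fixed-effects formula, where each study is weighted by 1/SE^2. This is an illustration only, not PLINK's implementation. The function name, the two example studies, and the assumption that 'SE' is on the log-odds scale are hypothetical.

```python
import math


def inverse_variance_meta(effects, std_errors):
    """Textbook fixed-effects pooling: weight each study by 1 / SE^2.

    effects    -- per-study log-odds values or regression betas
    std_errors -- standard errors on the same scale as the effects
    Returns (pooled_effect, pooled_se).
    """
    if not effects or len(effects) != len(std_errors):
        raise ValueError("need at least one study and one SE per effect")
    weights = [1.0 / (se * se) for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, effects)) / total
    return pooled, math.sqrt(1.0 / total)


# Two made-up studies reporting odds ratios; pool on the log scale,
# then convert the pooled estimate back to an odds ratio.
odds_ratios = [1.20, 1.35]
std_errors = [0.10, 0.10]
log_effects = [math.log(o) for o in odds_ratios]
pooled_beta, pooled_se = inverse_variance_meta(log_effects, std_errors)
pooled_or = math.exp(pooled_beta)
```

With equal standard errors the pooled log-odds is the plain mean of the inputs, so the pooled odds ratio is the geometric mean of the two study odds ratios.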
11241 Scan for epistatic interactions. --fast-epistasis inspects 3x3 joint 11240 --fast-epistasis <boost | joint-effects | no-ueki> <case-only>
11242 genotype count tables and only applies to case/control phenotypes, while 11241 <set-by-set | set-by-all> <nop>
11243 --epistasis performs linear or logistic regression. 11242 --epistasis <set-by-set | set-by-all>
11244 * By default, --fast-epistasis uses the PLINK 1.07 allele-based test. Two 11243 Scan for epistatic interactions. --fast-epistasis inspects 3x3 joint
11245 newer tests are now supported: 'boost' invokes the likelihood ratio test 11244 genotype count tables and only applies to case/control phenotypes, while
11246 introduced by Wan X et al. (2010) BOOST: A Fast Approach to Detecting 11245 --epistasis performs linear or logistic regression.
11247 Gene-Gene Interactions in Genome-wide Case-Control Studies, while 11246 * By default, --fast-epistasis uses the PLINK 1.07 allele-based test. Two
11248 'joint-effects' applies the joint effects test introduced in Ueki M, 11247 newer tests are now supported: 'boost' invokes the likelihood ratio test
11249 Cordell HJ (2012) Improved statistics for genome-wide interaction 11248 introduced by Wan X et al. (2010) BOOST: A Fast Approach to Detecting
11250 analysis. 11249 Gene-Gene Interactions in Genome-wide Case-Control Studies, while
11251 * The original --fast-epistasis test normally applies the variance and 11250 'joint-effects' applies the joint effects test introduced in Ueki M,
11252 empty cell corrections suggested by Ueki and Cordell's paper. To disable 11251 Cordell HJ (2012) Improved statistics for genome-wide interaction
11253 them, use the 'no-ueki' modifier. 11252 analysis.
11254 * 'case-only' requests a case-only instead of a case/control test. 11253 * The original --fast-epistasis test normally applies the variance and
11255 * By default, all pairs of variants across the entire genome are tested. 11254 empty cell corrections suggested by Ueki and Cordell's paper. To disable
11256 To just test pairs of variants within a single set, add the 'set-by-set' 11255 them, use the 'no-ueki' modifier.
11257 modifier and load exactly one set with --set/--make-set; with exactly two 11256 * 'case-only' requests a case-only instead of a case/control test.
11258 sets loaded, all variants in one set are tested against all variants in 11257 * By default, all pairs of variants across the entire genome are tested.
11259 the other. 'set-by-all' tests all variants in one set against the entire 11258 To just test pairs of variants within a single set, add the 'set-by-set'
11260 genome instead. 11259 modifier and load exactly one set with --set/--make-set; with exactly two
11261 * 'nop' strips p-values from the main report. 11260 sets loaded, all variants in one set are tested against all variants in
11262 * These computations can be subdivided with --parallel; however... 11261 the other. 'set-by-all' tests all variants in one set against the entire
11263 --epistasis-summary-merge [common file prefix] [ct] 11262 genome instead.
11264 When a --{fast-}epistasis job is subdivided with --parallel, the main 11263 * 'nop' strips p-values from the main report.
11265 report can be assembled at the end by applying Unix 'cat' in the usual 11264 * These computations can be subdivided with --parallel; however...
11266 manner, but the .summary.1, .summary.2, ... files may require a specialized 11265 --epistasis-summary-merge [common file prefix] [ct]
11267 merge. --epistasis-summary-merge takes care of the latter. 11266 When a --{fast-}epistasis job is subdivided with --parallel, the main
11268 11267 report can be assembled at the end by applying Unix 'cat' in the usual
11269 --twolocus [variant ID] [variant ID] 11268 manner, but the .summary.1, .summary.2, ... files may require a specialized
11270 Two-locus joint genotype count report. 11269 merge. --epistasis-summary-merge takes care of the latter.
11271 11270
11272 --score [filename] {i} {j} {k} <header> <sum | no-sum> 11271 --twolocus [variant ID] [variant ID]
11273 <no-mean-imputation | center> <include-cnt> <double-dosage> 11272 Two-locus joint genotype count report.
11274 Apply a linear scoring system to each sample. 11273
11275 The input file should have one line per scored variant. Variant IDs are 11274 --score [filename] {i} {j} {k} <header> <sum | no-sum>
11276 read from column #i, allele codes are read from column #j, and scores are 11275 <no-mean-imputation | center> <include-cnt> <double-dosage>
11277 read from column #k, where i defaults to 1, j defaults to i+1, and k 11276 Apply a linear scoring system to each sample.
11278 defaults to j+1. 11277 The input file should have one line per scored variant. Variant IDs are
11279 * The 'header' modifier causes the first nonempty line of the input file to 11278 read from column #i, allele codes are read from column #j, and scores are
11280 be ignored; otherwise, --score assumes there is no header line. 11279 read from column #k, where i defaults to 1, j defaults to i+1, and k
11281 * By default, final scores are averages of the valid per-variant scores. 11280 defaults to j+1.
11282 The 'sum' modifier causes sums to be reported instead. (This cannot be 11281 * The 'header' modifier causes the first nonempty line of the input file to
11283 used with 'no-mean-imputation'. And for backward compatibility, 'sum' is 11282 be ignored; otherwise, --score assumes there is no header line.
11284 automatically on with dosage data unless 'no-sum' is specified.) 11283 * By default, final scores are averages of the valid per-variant scores.
11285 * By default, copies of the unnamed allele contribute zero to score, while 11284 The 'sum' modifier causes sums to be reported instead. (This cannot be
11286 missing genotypes contribute an amount proportional to the loaded (via 11285 used with 'no-mean-imputation'. And for backward compatibility, 'sum' is
11287 --read-freq) or imputed allele frequency. To throw out missing 11286 automatically on with dosage data unless 'no-sum' is specified.)
11288 observations instead (decreasing the denominator in the final average 11287 * By default, copies of the unnamed allele contribute zero to score, while
11289 when this happens), use the 'no-mean-imputation' modifier. 11288 missing genotypes contribute an amount proportional to the loaded (via
11290 * Alternatively, you can use the 'center' modifier to shift all scores to 11289 --read-freq) or imputed allele frequency. To throw out missing
11291 mean zero. 11290 observations instead (decreasing the denominator in the final average
11292 * This command can be used with dosage data. By default, the 'CNT' column 11291 when this happens), use the 'no-mean-imputation' modifier.
11293 is omitted from the output file in this case; use 'include-cnt' to keep 11292 * Alternatively, you can use the 'center' modifier to shift all scores to
11294 it. Also, note that scores are multiplied by 0..1 dosages, not 0..2 11293 mean zero.
11295 diploid allele counts, unless the 'double-dosage' modifier is present. 11294 * This command can be used with dosage data. By default, the 'CNT' column
11296 11295 is omitted from the output file in this case; use 'include-cnt' to keep
11297 --write-var-ranges [block ct] 11296 it. Also, note that scores are multiplied by 0..1 dosages, not 0..2
11298 Divide the set of variants into equal-size blocks. (Can be used with 11297 diploid allele counts, unless the 'double-dosage' modifier is present.
11299 --snps to split a job across multiple machines.) 11298
11300 11299 --write-var-ranges [block ct]
11301 The following other flags are supported. (Order of operations is described at 11300 Divide the set of variants into equal-size blocks. (Can be used with
11302 https://www.cog-genomics.org/plink2/order .) 11301 --snps to split a job across multiple machines.)
11303 --script [fname] : Include command-line options from file. 11302
11304 --rerun {log} : Rerun commands in log (default 'plink.log'). 11303 The following other flags are supported. (Order of operations is described at
11305 --version : Display only version number before exiting. 11304 https://www.cog-genomics.org/plink2/order .)
11306 --silent : Suppress output to console. 11305 --script [fname] : Include command-line options from file.
11307 --gplink : Reserved for interoperation with gPLINK. 11306 --rerun {log} : Rerun commands in log (default 'plink.log').
11308 --missing-genotype [char] : Set missing genotype code (normally '0'). 11307 --version : Display only version number before exiting.
11309 --double-id : Set both FIDs and IIDs to the VCF/BCF sample ID. 11308 --silent : Suppress output to console.
11310 --const-fid {ID} : Set all FIDs to the given constant (default '0'). 11309 --gplink : Reserved for interoperation with gPLINK.
11311 --id-delim {d} : Parse sample IDs as [FID][d][IID] (default delim '_'). 11310 --missing-genotype [char] : Set missing genotype code (normally '0').
11312 --vcf-idspace-to [c] : Convert spaces in sample IDs to the given character. 11311 --double-id : Set both FIDs and IIDs to the VCF/BCF sample ID.
11313 --biallelic-only <strict> <list> : Skip VCF variants with 2+ alt. alleles. 11312 --const-fid {ID} : Set all FIDs to the given constant (default '0').
11314 --vcf-min-qual [val] : Skip VCF variants with low/missing QUAL. 11313 --id-delim {d} : Parse sample IDs as [FID][d][IID] (default delim '_').
11315 --vcf-filter {exception(s)...} : Skip variants which have FILTER failures. 11314 --vcf-idspace-to [c] : Convert spaces in sample IDs to the given character.
11316 --vcf-require-gt : Skip variants with no GT field. 11315 --biallelic-only <strict> <list> : Skip VCF variants with 2+ alt. alleles.
11317 --vcf-min-gq [val] : No-call a genotype when GQ is below the 11316 --vcf-min-qual [val] : Skip VCF variants with low/missing QUAL.
11318 given threshold. 11317 --vcf-filter {exception(s)...} : Skip variants which have FILTER failures.
11319 --vcf-min-gp [val] : No-call a genotype when 0-1 scaled GP is 11318 --vcf-require-gt : Skip variants with no GT field.
11320 below the given threshold. 11319 --vcf-min-gq [val] : No-call a genotype when GQ is below the
11321 --vcf-half-call [m] : Specify how '0/.' and similar VCF GT values should be 11320 given threshold.
11322 handled. The following four modes are supported: 11321 --vcf-min-gp [val] : No-call a genotype when 0-1 scaled GP is
11323 * 'error'/'e' (default) errors out and reports line #. 11322 below the given threshold.
11324 * 'haploid'/'h' treats them as haploid calls. 11323 --vcf-half-call [m] : Specify how '0/.' and similar VCF GT values should be
11325 * 'missing'/'m' treats them as missing. 11324 handled. The following four modes are supported:
11326 * 'reference'/'r' treats the missing value as 0. 11325 * 'error'/'e' (default) errors out and reports line #.
11327 --oxford-single-chr [chr nm] : Specify single-chromosome .gen file with 11326 * 'haploid'/'h' treats them as haploid calls.
11328 ignorable first column. 11327 * 'missing'/'m' treats them as missing.
11329 --oxford-pheno-name [col nm] : Import named phenotype from the .sample file. 11328 * 'reference'/'r' treats the missing value as 0.
11330 --hard-call-threshold [val] : When an Oxford-format fileset is loaded, calls 11329 --oxford-single-chr [chr nm] : Specify single-chromosome .gen file with
11331 --hard-call-threshold random with uncertainty level greater than 0.1 are 11330 ignorable first column.
11332 normally treated as missing. You can adjust 11331 --oxford-pheno-name [col nm] : Import named phenotype from the .sample file.
11333 this threshold by providing a numeric 11332 --hard-call-threshold [val] : When an Oxford-format fileset is loaded, calls
11334 parameter, or randomize all calls with 11333 --hard-call-threshold random with uncertainty level greater than 0.1 are
11335 'random'. 11334 normally treated as missing. You can adjust
11336 --missing-code {string list} : Comma-delimited list of missing phenotype 11335 this threshold by providing a numeric
11337 (alias: --missing_code) values for Oxford-format filesets (def. 'NA'). 11336 parameter, or randomize all calls with
11338 --simulate-ncases [num] : Set --simulate case count (default 1000). 11337 'random'.
11339 --simulate-ncontrols [n] : Set --simulate control count (default 1000). 11338 --missing-code {string list} : Comma-delimited list of missing phenotype
11340 --simulate-prevalence [p] : Set --simulate disease prevalence (default 0.01). 11339 (alias: --missing_code) values for Oxford-format filesets (def. 'NA').
11341 --simulate-n [num] : Set --simulate-qt sample count (default 1000). 11340 --simulate-ncases [num] : Set --simulate case count (default 1000).
11342 --simulate-label [prefix] : Set --simulate{-qt} FID/IID name prefix. 11341 --simulate-ncontrols [n] : Set --simulate control count (default 1000).
11343 --simulate-missing [freq] : Set --simulate{-qt} missing genotype frequency. 11342 --simulate-prevalence [p] : Set --simulate disease prevalence (default 0.01).
11344 --allow-extra-chr <0> : Permit unrecognized chromosome codes. The '0' 11343 --simulate-n [num] : Set --simulate-qt sample count (default 1000).
11345 (alias: --aec) modifier causes them to be treated as if they had 11344 --simulate-label [prefix] : Set --simulate{-qt} FID/IID name prefix.
11346 been set to zero. 11345 --simulate-missing [freq] : Set --simulate{-qt} missing genotype frequency.
11347 --chr-set [autosome ct] <no-x> <no-y> <no-xy> <no-mt> : 11346 --allow-extra-chr <0> : Permit unrecognized chromosome codes. The '0'
11348 Specify a nonhuman chromosome set. The first parameter sets the number of 11347 (alias: --aec) modifier causes them to be treated as if they had
11349 diploid autosome pairs if positive, or haploid chromosomes if negative. 11348 been set to zero.
11350 Given diploid autosomes, the remaining modifiers indicate the absence of 11349 --chr-set [autosome ct] <no-x> <no-y> <no-xy> <no-mt> :
11351 the named non-autosomal chromosomes. 11350 Specify a nonhuman chromosome set. The first parameter sets the number of
11352 --cow/--dog/--horse/--mouse/--rice/--sheep : Shortcuts for those species. 11351 diploid autosome pairs if positive, or haploid chromosomes if negative.
11353 --autosome-num [value] : Alias for '--chr-set [value] no-y no-xy no-mt'. 11352 Given diploid autosomes, the remaining modifiers indicate the absence of
11354 --cm-map [fname pattern] {chr} : Use SHAPEIT-format recombination maps to set 11353 the named non-autosomal chromosomes.
11355 centimorgan positions. To process more than 11354 --cow/--dog/--horse/--mouse/--rice/--sheep : Shortcuts for those species.
11356 one chromosome, include a '@' in the first 11355 --autosome-num [value] : Alias for '--chr-set [value] no-y no-xy no-mt'.
11357 parameter where the chrom. number belongs, 11356 --cm-map [fname pattern] {chr} : Use SHAPEIT-format recombination maps to set
11358 e.g. 'genetic_map_chr@_combined_b37.txt'. 11357 centimorgan positions. To process more than
11359 --zero-cms : Zero out centimorgan positions. 11358 one chromosome, include a '@' in the first
11360 --pheno [fname] : Load phenotype data from the specified file, instead of 11359 parameter where the chrom. number belongs,
11361 using the values in the main input fileset. 11360 e.g. 'genetic_map_chr@_combined_b37.txt'.
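As a rough illustration of the '@' placeholder described for --cm-map, the snippet below expands a filename pattern into one name per chromosome. It is a sketch of the naming convention only. The helper name is hypothetical, and how PLINK treats a pattern with more than one '@' is not covered here.

```python
def cm_map_filenames(pattern, chromosomes):
    """Substitute each chromosome code for the '@' placeholder in pattern."""
    if "@" not in pattern:
        raise ValueError("pattern needs an '@' where the chromosome belongs")
    return [pattern.replace("@", str(c)) for c in chromosomes]


# Pattern taken from the help text; chromosomes 1-3 chosen for brevity.
names = cm_map_filenames("genetic_map_chr@_combined_b37.txt", range(1, 4))
```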
11362 --all-pheno : For basic association tests, loop through all phenotypes 11361 --zero-cms : Zero out centimorgan positions.
11363 in --pheno file. 11362 --pheno [fname] : Load phenotype data from the specified file, instead of
11364 --mpheno [n] : Load phenotype from column (n+2) in --pheno file. 11363 using the values in the main input fileset.
11365 --pheno-name [c] : If --pheno file has a header row, use column with the 11364 --all-pheno : For basic association tests, loop through all phenotypes
11366 given name. 11365 in --pheno file.
11367 --pheno-merge : When the main input fileset contains a phenotype value 11366 --mpheno [n] : Load phenotype from column (n+2) in --pheno file.
11368 for a sample, but the --pheno file does not, use the 11367 --pheno-name [c] : If --pheno file has a header row, use column with the
11369 original value instead of treating the phenotype as 11368 given name.
11370 missing. 11369 --pheno-merge : When the main input fileset contains a phenotype value
11371 --missing-phenotype [v] : Set missing phenotype value (normally -9). 11370 for a sample, but the --pheno file does not, use the
11372 --1 : Expect case/control phenotypes to be coded as 11371 original value instead of treating the phenotype as
11373 0 = control, 1 = case, instead of the usual 11372 missing.
11374 0 = missing, 1 = control, 2 = case. 11373 --missing-phenotype [v] : Set missing phenotype value (normally -9).
11375 --make-pheno [fn] [val] : Define a new case/control phenotype. If the val 11374 --1 : Expect case/control phenotypes to be coded as
11376 parameter is '*', all samples listed in the given 11375 0 = control, 1 = case, instead of the usual
11377 file are cases, and everyone else is a control. 11376 0 = missing, 1 = control, 2 = case.
11378 (Note that, in some shells, it is necessary to 11377 --make-pheno [fn] [val] : Define a new case/control phenotype. If the val
11379 surround the * with quotes.) 11378 parameter is '*', all samples listed in the given
11380 Otherwise, all samples with third column entry 11379 file are cases, and everyone else is a control.
11381 equal to the val parameter are cases, and all other 11380 (Note that, in some shells, it is necessary to
11382 samples mentioned in the file are controls. 11381 surround the * with quotes.)
11383 --tail-pheno [Lt] {Hbt} : Downcode a scalar phenotype to a case/control 11382 Otherwise, all samples with third column entry
11384 phenotype. All samples with phenotype values 11383 equal to the val parameter are cases, and all other
11385 greater than Hbt are cases, and all with values 11384 samples mentioned in the file are controls.
11386 less than or equal to Lt are controls. If Hbt is 11385 --tail-pheno [Lt] {Hbt} : Downcode a scalar phenotype to a case/control
11387 unspecified, it is equal to Lt; otherwise, 11386 phenotype. All samples with phenotype values
11388 in-between phenotype values are set to missing. 11387 greater than Hbt are cases, and all with values
11389 --covar [filename] <keep-pheno-on-missing-cov> : Specify covariate file. 11388 less than or equal to Lt are controls. If Hbt is
11390 --covar-name [...] : Specify covariate(s) in --covar file by name. 11389 unspecified, it is equal to Lt; otherwise,
11391 Separate multiple names with spaces or commas, and 11390 in-between phenotype values are set to missing.
11392 use dashes to designate ranges. 11391 --covar [filename] <keep-pheno-on-missing-cov> : Specify covariate file.
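The --tail-pheno rule can be read as a three-way classification of each scalar phenotype. The following is a minimal sketch of that rule as stated in the help text, not PLINK's code. The function name and the string labels are illustrative assumptions.

```python
def tail_pheno(value, lt, hbt=None):
    """Downcode a scalar phenotype following the --tail-pheno description.

    value > Hbt          -> "case"
    value <= Lt          -> "control"
    Lt < value <= Hbt    -> "missing"
    When Hbt is omitted it equals Lt, so nothing falls in between.
    """
    if hbt is None:
        hbt = lt
    if hbt < lt:
        raise ValueError("Hbt must not be smaller than Lt")
    if value > hbt:
        return "case"
    if value <= lt:
        return "control"
    return "missing"


# Single threshold: every sample becomes a case or a control.
single = [tail_pheno(v, 3.0) for v in (2.5, 3.0, 3.1)]
# Two thresholds: values in (3.0, 6.0] are set to missing.
double = [tail_pheno(v, 3.0, 6.0) for v in (2.5, 4.0, 6.0, 6.5)]
```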
11393 --covar-number [...] : Specify covariate(s) in --covar file by index. 11392 --covar-name [...] : Specify covariate(s) in --covar file by name.
11394 --no-const-covar : Exclude constant covariates. 11393 Separate multiple names with spaces or commas, and
11395 --within [f] <keep-NA> : Specify initial cluster assignments. 11394 use dashes to designate ranges.
11396 --mwithin [n] : Load cluster assignments from column n+2. 11395 --covar-number [...] : Specify covariate(s) in --covar file by index.
11397 --family : Create a cluster for each family ID. 11396 --no-const-covar : Exclude constant covariates.
11398 --loop-assoc [f] <keep-NA> : Run specified case/control association 11397 --within [f] <keep-NA> : Specify initial cluster assignments.
11399 commands once for each cluster in the file, 11398 --mwithin [n] : Load cluster assignments from column n+2.
11400 using cluster membership as the phenotype. 11399 --family : Create a cluster for each family ID.
11401 --set [filename] : Load sets from a .set file. 11400 --loop-assoc [f] <keep-NA> : Run specified case/control association
11402 --set-names [name(s)...] : Load only sets named on the command line. 11401 commands once for each cluster in the file,
11403 Use spaces to separate multiple names. 11402 using cluster membership as the phenotype.
11404 --subset [filename] : Load only sets named in the given text file. 11403 --set [filename] : Load sets from a .set file.
11405 --set-collapse-all [set name] : Merge all sets. 11404 --set-names [name(s)...] : Load only sets named on the command line.
11406 --complement-sets : Invert all sets. (Names gain 'C_' prefixes.) 11405 Use spaces to separate multiple names.
11407 --make-set-complement-all [s] : --set-collapse-all + inversion. 11406 --subset [filename] : Load only sets named in the given text file.
11408 --make-set [filename] : Define sets from a list of named bp ranges. 11407 --set-collapse-all [set name] : Merge all sets.
11409 --make-set-border [kbs] : Stretch regions in --make-set file. 11408 --complement-sets : Invert all sets. (Names gain 'C_' prefixes.)
11410 --make-set-collapse-group : Define sets from groups instead of sets in 11409 --make-set-complement-all [s] : --set-collapse-all + inversion.
11411 --make-set file. 11410 --make-set [filename] : Define sets from a list of named bp ranges.
11412 --keep [filename] : Exclude all samples not named in the file. 11411 --make-set-border [kbs] : Stretch regions in --make-set file.
11413 --remove [filename] : Exclude all samples named in the file. 11412 --make-set-collapse-group : Define sets from groups instead of sets in
11414 --keep-fam [filename] : Exclude all families not named in the file. 11413 --make-set file.
11415 --remove-fam [fname] : Exclude all families named in the file. 11414 --keep [filename] : Exclude all samples not named in the file.
11416 --extract <range> [f] : Exclude all variants not named in the file. 11415 --remove [filename] : Exclude all samples named in the file.
11417 --exclude <range> [f] : Exclude all variants named in the file. 11416 --keep-fam [filename] : Exclude all families not named in the file.
11418 --keep-clusters [filename] : These can be used individually or in 11417 --remove-fam [fname] : Exclude all families named in the file.
11419 --keep-cluster-names [name(s)...] combination to define a list of 11418 --extract <range> [f] : Exclude all variants not named in the file.
11420 clusters to keep; all samples not in a 11419 --exclude <range> [f] : Exclude all variants named in the file.
11421 cluster in that list are then excluded. 11420 --keep-clusters [filename] : These can be used individually or in
11422 Use spaces to separate cluster names 11421 --keep-cluster-names [name(s)...] combination to define a list of
11423 for --keep-cluster-names. 11422 clusters to keep; all samples not in a
11424 --remove-clusters [filename] : Exclude all clusters named in the file. 11423 cluster in that list are then excluded.
11425 --remove-cluster-names [name(s)...] : Exclude the named clusters. 11424 Use spaces to separate cluster names
11426 --gene [sets...] : Exclude variants not in a set named on the command line. 11425 for --keep-cluster-names.
11427 (Separate multiple set names with spaces.) 11426 --remove-clusters [filename] : Exclude all clusters named in the file.
11428 --gene-all : Exclude variants which aren't a member of any set. (PLINK 11427 --remove-cluster-names [name(s)...] : Exclude the named clusters.
11429 1.07 automatically did this under some circumstances.) 11428 --gene [sets...] : Exclude variants not in a set named on the command line.
11430 --attrib [f] {att lst} : Given a file assigning attributes to variants, and a 11429 (Separate multiple set names with spaces.)
11431 --attrib-indiv [f] {a} comma-delimited list (with no whitespace) of 11430 --gene-all : Exclude variants which aren't a member of any set. (PLINK
11432 attribute names, remove variants/samples which are 11431 1.07 automatically did this under some circumstances.)
11433 either missing from the file or don't have any of 11432 --attrib [f] {att lst} : Given a file assigning attributes to variants, and a
11434 the listed attributes. If some attribute names in 11433 --attrib-indiv [f] {a} comma-delimited list (with no whitespace) of
11435 the list are preceded by '-', they are treated as 11434 attribute names, remove variants/samples which are
11436 'negative match conditions' instead: variants with 11435 either missing from the file or don't have any of
11437 at least one negative match attribute are removed. 11436 the listed attributes. If some attribute names in
11438 The first character in the list cannot be a '-', due 11437 the list are preceded by '-', they are treated as
11439 to how command-line parsing works; add a comma in 11438 'negative match conditions' instead: variants with
11440 front to get around this. 11439 at least one negative match attribute are removed.
11441 --chr [chrs...] : Exclude all variants not on the given chromosome(s). 11440 The first character in the list cannot be a '-', due
11442 Valid choices for humans are 0 (unplaced), 1-22, X, Y, XY, 11441 to how command-line parsing works; add a comma in
11443 and MT. Separate multiple chromosomes with spaces and/or 11442 front to get around this.
11444 commas, and use a dash (no adjacent spaces permitted) to 11443 --chr [chrs...] : Exclude all variants not on the given chromosome(s).
11445 denote a range, e.g. '--chr 1-4, 22, xy'. 11444 Valid choices for humans are 0 (unplaced), 1-22, X, Y, XY,
11446 --not-chr [...] : Reverse of --chr (exclude variants on listed chromosomes). 11445 and MT. Separate multiple chromosomes with spaces and/or
11447 --autosome : Exclude all non-autosomal variants. 11446 commas, and use a dash (no adjacent spaces permitted) to
11448 --autosome-xy : Exclude all non-autosomal variants, except those with 11447 denote a range, e.g. '--chr 1-4, 22, xy'.
11449 chromosome code XY (pseudo-autosomal region of X). 11448 --not-chr [...] : Reverse of --chr (exclude variants on listed chromosomes).
11450 --snps-only <just-acgt> : Exclude non-SNP variants. By default, SNP = both 11449 --autosome : Exclude all non-autosomal variants.
11451 allele codes are single-character; 'just-acgt' 11450 --autosome-xy : Exclude all non-autosomal variants, except those with
11452 restricts SNP codes to {A,C,G,T,a,c,g,t,[missing]}. 11451 chromosome code XY (pseudo-autosomal region of X).
11453 --from [var ID] : Use ID(s) to specify a variant range to load. When used 11452 --snps-only <just-acgt> : Exclude non-SNP variants. By default, SNP = both
11454 --to [var ID] together, both variants must be on the same chromosome. 11453 allele codes are single-character; 'just-acgt'
11455 --snp [var ID] : Specify a single variant to load. 11454 restricts SNP codes to {A,C,G,T,a,c,g,t,[missing]}.
11456 --exclude-snp [] : Specify a single variant to exclude. 11455 --from [var ID] : Use ID(s) to specify a variant range to load. When used
11457 --window [kbs] : With --snp or --exclude-snp, loads/excludes all variants 11456 --to [var ID] together, both variants must be on the same chromosome.
11458 within half the specified kb distance of the named one. 11457 --snp [var ID] : Specify a single variant to load.
11459 --from-bp [pos] : Use physical position(s) to define a variant range to 11458 --exclude-snp [] : Specify a single variant to exclude.
11460 --to-bp [pos] load. --from-kb/--to-kb/--from-mb/--to-mb allow decimal 11459 --window [kbs] : With --snp or --exclude-snp, loads/excludes all variants
11461 --from-kb [pos] values. You must also specify a single chromosome (using 11460 within half the specified kb distance of the named one.
11462 --to-kb [pos] e.g. --chr) when using these flags. 11461 --from-bp [pos] : Use physical position(s) to define a variant range to
11463 --from-mb [pos] 11462 --to-bp [pos] load. --from-kb/--to-kb/--from-mb/--to-mb allow decimal
11464 --to-mb [pos] 11463 --from-kb [pos] values. You must also specify a single chromosome (using
11465 --snps [var IDs...] : Use IDs to specify variant range(s) to load or 11464 --to-kb [pos] e.g. --chr) when using these flags.
11466 --exclude-snps [...] exclude. E.g. '--snps rs1111-rs2222, rs3333, rs4444'. 11465 --from-mb [pos]
11467 --thin [p] : Randomly remove variants, retaining each with prob. p. 11466 --to-mb [pos]
11468 --thin-count [n] : Randomly remove variants until n of them remain. 11467 --snps [var IDs...] : Use IDs to specify variant range(s) to load or
11469 --bp-space [bps] : Remove variants so that each pair is no closer than the 11468 --exclude-snps [...] exclude. E.g. '--snps rs1111-rs2222, rs3333, rs4444'.
11470 given bp distance. (Equivalent to VCFtools --thin.) 11469 --thin [p] : Randomly remove variants, retaining each with prob. p.
11471 --thin-indiv [p] : Randomly remove samples, retaining with prob. p. 11470 --thin-count [n] : Randomly remove variants until n of them remain.
11472 --thin-indiv-count [n] : Randomly remove samples until n of them remain. 11471 --bp-space [bps] : Remove variants so that each pair is no closer than the
11473 --filter [f] [val(s)...] : Exclude all samples without a 3rd column entry in 11472 given bp distance. (Equivalent to VCFtools --thin.)
11474 the given file matching one of the given 11473 --thin-indiv [p] : Randomly remove samples, retaining with prob. p.
11475 space-separated value(s). 11474 --thin-indiv-count [n] : Randomly remove samples until n of them remain.
11476 --mfilter [n] : Match against (n+2)th column instead. 11475 --filter [f] [val(s)...] : Exclude all samples without a 3rd column entry in
11477 --geno {val} : Exclude variants with missing call frequencies greater 11476 the given file matching one of the given
11478 than a threshold (default 0.1). (Note that the default 11477 space-separated value(s).
11479 threshold is only applied if --geno is invoked without a 11478 --mfilter [n] : Match against (n+2)th column instead.
11480 parameter; when --geno is not invoked, no per-variant 11479 --geno {val} : Exclude variants with missing call frequencies greater
11481 missing call frequency ceiling is enforced at all. Other 11480 than a threshold (default 0.1). (Note that the default
11482 inclusion/exclusion default thresholds work the same way.) 11481 threshold is only applied if --geno is invoked without a
11483 --mind {val} : Exclude samples with missing call frequencies greater than 11482 parameter; when --geno is not invoked, no per-variant
11484 a threshold (default 0.1). 11483 missing call frequency ceiling is enforced at all. Other
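The note on --geno distinguishes three situations: flag absent (no ceiling), flag present without a value (default 0.1), and flag present with a value. A small sketch of that behaviour, under the assumption that a variant is kept when its missing call frequency does not exceed the ceiling; the function and parameter names are hypothetical.

```python
DEFAULT_GENO_CEILING = 0.1  # default quoted in the help text


def passes_geno(missing_freq, geno_invoked, threshold=None):
    """Return True if a variant survives the --geno filter as described.

    geno_invoked -- whether --geno appears on the command line at all
    threshold    -- the numeric parameter, or None when --geno is bare
    """
    if not geno_invoked:
        return True  # no per-variant ceiling is enforced at all
    ceiling = DEFAULT_GENO_CEILING if threshold is None else threshold
    return missing_freq <= ceiling


no_flag = passes_geno(0.50, geno_invoked=False)
bare_flag = passes_geno(0.11, geno_invoked=True)
explicit = passes_geno(0.04, geno_invoked=True, threshold=0.05)
```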
11485 --oblig-missing [f1] [f2] : Specify blocks of missing genotype calls for 11484 inclusion/exclusion default thresholds work the same way.)
11486 --geno/--mind to ignore. The first file should 11485 --mind {val} : Exclude samples with missing call frequencies greater than
11487 have variant IDs in the first column and block 11486 a threshold (default 0.1).
11488 IDs in the second, while the second file should 11487 --oblig-missing [f1] [f2] : Specify blocks of missing genotype calls for
11489 have FIDs in the first column, IIDs in the 11488 --geno/--mind to ignore. The first file should
11490 second, and block IDs in the third. 11489 have variant IDs in the first column and block
11491 --prune : Remove samples with missing phenotypes. 11490 IDs in the second, while the second file should
11492 --maf {freq} : Exclude variants with minor allele frequency lower than 11491 have FIDs in the first column, IIDs in the
11493 a threshold (default 0.01). 11492 second, and block IDs in the third.
11494 --max-maf [freq] : Exclude variants with MAF greater than the threshold. 11493 --prune : Remove samples with missing phenotypes.
11495 --mac [ct] : Exclude variants with minor allele count lower than the 11494 --maf {freq} : Exclude variants with minor allele frequency lower than
11496 (alias: --min-ac) given threshold. 11495 a threshold (default 0.01).
11497 --max-mac [ct] : Exclude variants with minor allele count greater than 11496 --max-maf [freq] : Exclude variants with MAF greater than the threshold.
11498 (alias: --max-ac) the given threshold. 11497 --mac [ct] : Exclude variants with minor allele count lower than the
11499 --maf-succ : Rule of succession MAF estimation (used in EIGENSOFT). 11498 (alias: --min-ac) given threshold.
11500 Given j observations of one allele and k >= j observations 11499 --max-mac [ct] : Exclude variants with minor allele count greater than
11501 of the other, infer a MAF of (j+1) / (j+k+2), rather than 11500 (alias: --max-ac) the given threshold.
11502 the default j / (j+k). 11501 --maf-succ : Rule of succession MAF estimation (used in EIGENSOFT).
11503 --read-freq [fn] : Estimate MAFs and heterozygote frequencies from the given 11502 Given j observations of one allele and k >= j observations
11504 --freq{x} report, instead of the input fileset. 11503 of the other, infer a MAF of (j+1) / (j+k+2), rather than
11505 --hwe [p] <midp> <include-nonctrl> : Exclude variants with Hardy-Weinberg 11504 the default j / (j+k).
11506 equilibrium exact test p-values below a 11505 --read-freq [fn] : Estimate MAFs and heterozygote frequencies from the given
11507 threshold. 11506 --freq{x} report, instead of the input fileset.
11508 --me [t] [v] <var-first> : Filter out trios and variants with Mendel error 11507 --hwe [p] <midp> <include-nonctrl> : Exclude variants with Hardy-Weinberg
11509 rates exceeding the given thresholds. 11508 equilibrium exact test p-values below a
11510 --me-exclude-one {ratio} : Make --me exclude only one sample per trio. 11509 threshold.
11511 --qual-scores [f] {qcol} {IDcol} {skip} : Filter out variants with 11510 --me [t] [v] <var-first> : Filter out trios and variants with Mendel error
11512 out-of-range quality scores. 11511 rates exceeding the given thresholds.
11513 Default range is now [0, \infty ). 11512 --me-exclude-one {ratio} : Make --me exclude only one sample per trio.
11514 --qual-threshold [min qual score] : Set --qual-scores range floor. 11513 --qual-scores [f] {qcol} {IDcol} {skip} : Filter out variants with
11515 --qual-max-threshold [max qual score] : Set --qual-scores range ceiling. 11514 out-of-range quality scores.
11516 --allow-no-sex : Do not treat ambiguous-sex samples as having missing 11515 Default range is now [0, \infty ).
11517 phenotypes in analysis commands. (Automatic w/ --no-sex.) 11516 --qual-threshold [min qual score] : Set --qual-scores range floor.
11518 --must-have-sex : Force ambiguous-sex phenotypes to missing on 11517 --qual-max-threshold [max qual score] : Set --qual-scores range ceiling.
11519 --make-bed/--make-just-fam/--recode/--write-covar. 11518 --allow-no-sex : Do not treat ambiguous-sex samples as having missing
11520 --filter-cases : Include only cases in the current analysis. 11519 phenotypes in analysis commands. (Automatic w/ --no-sex.)
11521 --filter-controls : Include only controls. 11520 --must-have-sex : Force ambiguous-sex phenotypes to missing on
11522 --filter-males : Include only males. 11521 --make-bed/--make-just-fam/--recode/--write-covar.
11523 --filter-females : Include only females. 11522 --filter-cases : Include only cases in the current analysis.
11524 --filter-founders : Include only founders. 11523 --filter-controls : Include only controls.
11525 --filter-nonfounders : Include only nonfounders. 11524 --filter-males : Include only males.
11526 --nonfounders : Include nonfounders in allele freq/HWE calculations. 11525 --filter-females : Include only females.
11527 --make-founders <require-2-missing> <first> : Clear parental IDs for those 11526 --filter-founders : Include only founders.
11528 with 1+ missing parent(s). 11527 --filter-nonfounders : Include only nonfounders.
11529 --recode-allele [fn] : With --recode A/A-transpose/AD, count alleles named in 11528 --nonfounders : Include nonfounders in allele freq/HWE calculations.
11530 the file (otherwise A1 alleles are always counted). 11529 --make-founders <require-2-missing> <first> : Clear parental IDs for those
11531 --output-chr [MT code] : Set chromosome coding scheme in output files by 11530 with 1+ missing parent(s).
11532 providing the desired human mitochondrial code. 11531 --recode-allele [fn] : With --recode A/A-transpose/AD, count alleles named in
11533 (Options are '26', 'M', 'MT', '0M', 'chr26', 'chrM', 11532 the file (otherwise A1 alleles are always counted).
11534 and 'chrMT'.) 11533 --output-chr [MT code] : Set chromosome coding scheme in output files by
11535 --output-missing-genotype [ch] : Set the code used to represent missing 11534 providing the desired human mitochondrial code.
11536 genotypes in output files (normally the 11535 (Options are '26', 'M', 'MT', '0M', 'chr26', 'chrM',
11537 --missing-genotype value). 11536 and 'chrMT'.)
11538 --output-missing-phenotype [s] : Set the string used to represent missing 11537 --output-missing-genotype [ch] : Set the code used to represent missing
11539 phenotypes in output files (normally the 11538 genotypes in output files (normally the
11540 --missing-phenotype value). 11539 --missing-genotype value).
11541 --zero-cluster [f] : In combination with --within/--family, set blocks of 11540 --output-missing-phenotype [s] : Set the string used to represent missing
11542 genotype calls to missing. The input file should have 11541 phenotypes in output files (normally the
11543 variant IDs in the first column and cluster IDs in the 11542 --missing-phenotype value).
11544 second. This must now be used with --make-bed and no 11543 --zero-cluster [f] : In combination with --within/--family, set blocks of
11545 other output commands. 11544 genotype calls to missing. The input file should have
11546 --set-hh-missing : Cause --make-bed and --recode to set heterozygous 11545 variant IDs in the first column and cluster IDs in the
11547 haploid genotypes to missing. 11546 second. This must now be used with --make-bed and no
11548 --set-mixed-mt-missing : Cause --make-bed and --recode to set mixed MT 11547 other output commands.
11549 genotypes to missing. 11548 --set-hh-missing : Cause --make-bed and --recode to set heterozygous
11550 --split-x [bp1] [bp2] <no-fail> : Changes chromosome code of all X chromosome 11549 haploid genotypes to missing.
11551 --split-x [build] <no-fail> variants with bp position <= bp1 or >= bp2 11550 --set-mixed-mt-missing : Cause --make-bed and --recode to set mixed MT
11552 to XY. The following build codes are 11551 genotypes to missing.
11553 supported as shorthand: 11552 --split-x [bp1] [bp2] <no-fail> : Changes chromosome code of all X chromosome
11554 * 'b36'/'hg18' = NCBI 36, 2709521/154584237 11553 --split-x [build] <no-fail> variants with bp position <= bp1 or >= bp2
11555 * 'b37'/'hg19' = GRCh37, 2699520/154931044 11554 to XY. The following build codes are
11556 * 'b38'/'hg38' = GRCh38, 2781479/155701383 11555 supported as shorthand:
11557 By default, PLINK errors out when no 11556 * 'b36'/'hg18' = NCBI 36, 2709521/154584237
11558 variants would be affected by --split-x; 11557 * 'b37'/'hg19' = GRCh37, 2699520/154931044
11559 the 'no-fail' modifier (useful in scripts) 11558 * 'b38'/'hg38' = GRCh38, 2781479/155701383
11560 overrides this. 11559 By default, PLINK errors out when no
11561 --merge-x <no-fail> : Merge XY chromosome back with X. 11560 variants would be affected by --split-x;
11562 --set-me-missing : Cause --make-bed to set Mendel errors to missing. 11561 the 'no-fail' modifier (useful in scripts)
11563 --fill-missing-a2 : Cause --make-bed to replace all missing calls with 11562 overrides this.
11564 homozygous A2 calls. 11563 --merge-x <no-fail> : Merge XY chromosome back with X.
11565 --set-missing-var-ids [t] : Given a template string with a '@' where the 11564 --set-me-missing : Cause --make-bed to set Mendel errors to missing.
11566 chromosome code should go and '#' where the bp 11565 --fill-missing-a2 : Cause --make-bed to replace all missing calls with
11567 coordinate belongs, --set-missing-var-ids 11566 homozygous A2 calls.
11568 assigns chromosome-and-bp-based IDs to unnamed 11567 --set-missing-var-ids [t] : Given a template string with a '@' where the
11569 variants. 11568 chromosome code should go and '#' where the bp
11570 You may also use '$1' and '$2' to refer to 11569 coordinate belongs, --set-missing-var-ids
11571 allele names in the template string, and in 11570 assigns chromosome-and-bp-based IDs to unnamed
11572 fact this becomes essential when multiple 11571 variants.
11573 variants share the same coordinate. 11572 You may also use '$1' and '$2' to refer to
11574 --new-id-max-allele-len [n] : Specify maximum number of leading characters 11573 allele names in the template string, and in
11575 from allele names to include in new variant IDs 11574 fact this becomes essential when multiple
11576 (default 23). 11575 variants share the same coordinate.
11577 --missing-var-code [string] : Change unnamed variant code (default '.'). 11576 --new-id-max-allele-len [n] : Specify maximum number of leading characters
11578 --update-chr [f] {chrcol} {IDcol} {skip} : Update variant chromosome codes. 11577 from allele names to include in new variant IDs
11579 --update-cm [f] {cmcol} {IDcol} {skip} : Update centimorgan positions. 11578 (default 23).
11580 --update-map [f] {bpcol} {IDcol} {skip} : Update variant bp positions. 11579 --missing-var-code [string] : Change unnamed variant code (default '.').
11581 --update-name [f] {newcol} {oldcol} {skip} : Update variant IDs. 11580 --update-chr [f] {chrcol} {IDcol} {skip} : Update variant chromosome codes.
11582 --update-alleles [fname] : Update variant allele codes. 11581 --update-cm [f] {cmcol} {IDcol} {skip} : Update centimorgan positions.
11583 --allele1234 <multichar> : Interpret/recode A/C/G/T alleles as 1/2/3/4. 11582 --update-map [f] {bpcol} {IDcol} {skip} : Update variant bp positions.
11584 With 'multichar', converts all A/C/G/Ts in allele 11583 --update-name [f] {newcol} {oldcol} {skip} : Update variant IDs.
11585 names to 1/2/3/4s. 11584 --update-alleles [fname] : Update variant allele codes.
11586 --alleleACGT <multichar> : Reverse of --allele1234. 11585 --allele1234 <multichar> : Interpret/recode A/C/G/T alleles as 1/2/3/4.
11587 --update-ids [f] : Update sample IDs. 11586 With 'multichar', converts all A/C/G/Ts in allele
11588 --update-parents [f] : Update parental IDs. 11587 names to 1/2/3/4s.
11589 --update-sex [f] {n} : Update sexes. Sex (1 or M = male, 2 or F = female, 0 11588 --alleleACGT <multichar> : Reverse of --allele1234.
11590 = missing) is loaded from column n+2 (default n is 1). 11589 --update-ids [f] : Update sample IDs.
11591 --flip [filename] : Flip alleles (A<->T, C<->G) for SNP IDs in the file. 11590 --update-parents [f] : Update parental IDs.
11592 --flip-subset [fn] : Only apply --flip to samples in --flip-subset file. 11591 --update-sex [f] {n} : Update sexes. Sex (1 or M = male, 2 or F = female, 0
11593 --flip-scan-window [ct+1] : Set --flip-scan max variant ct dist. (def. 10). 11592 = missing) is loaded from column n+2 (default n is 1).
11594 --flip-scan-window-kb [x] : Set --flip-scan max kb distance (default 1000). 11593 --flip [filename] : Flip alleles (A<->T, C<->G) for SNP IDs in the file.
11595 --flip-scan-threshold [x] : Set --flip-scan min correlation (default 0.5). 11594 --flip-subset [fn] : Only apply --flip to samples in --flip-subset file.
11596 --keep-allele-order : Keep the allele order defined in the .bim file, 11595 --flip-scan-window [ct+1] : Set --flip-scan max variant ct dist. (def. 10).
11597 --real-ref-alleles instead of forcing A2 to be the major allele. 11596 --flip-scan-window-kb [x] : Set --flip-scan max kb distance (default 1000).
11598 --real-ref-alleles also removes 'PR' from the INFO 11597 --flip-scan-threshold [x] : Set --flip-scan min correlation (default 0.5).
11599 values emitted by --recode vcf{-fid/-iid}. 11598 --keep-allele-order : Keep the allele order defined in the .bim file,
11600 --a1-allele [f] {a1col} {IDcol} {skip} : Force alleles in the file to A1. 11599 --real-ref-alleles instead of forcing A2 to be the major allele.
11601 --a2-allele [filename] {a2col} {IDcol} {skip} : 11600 --real-ref-alleles also removes 'PR' from the INFO
11602 Force alleles in the file to A2. ("--a2-allele [VCF filename] 4 3 '#'", 11601 values emitted by --recode vcf{-fid/-iid}.
11603 which scrapes reference allele assignments from a VCF file, is especially 11602 --a1-allele [f] {a1col} {IDcol} {skip} : Force alleles in the file to A1.
11604 useful.) 11603 --a2-allele [filename] {a2col} {IDcol} {skip} :
11605 --indiv-sort [m] {f} : Specify FID/IID sort order. The following four modes 11604 Force alleles in the file to A2. ("--a2-allele [VCF filename] 4 3 '#'",
11606 are supported: 11605 which scrapes reference allele assignments from a VCF file, is especially
11607 * 'none'/'0' keeps samples in the order they were 11606 useful.)
11608 loaded. Default for non-merge operations. 11607 --indiv-sort [m] {f} : Specify FID/IID sort order. The following four modes
11609 * 'natural'/'n' invokes 'natural sort', e.g. 11608 are supported:
11610 'id2' < 'ID3' < 'id10'. Default when merging. 11609 * 'none'/'0' keeps samples in the order they were
11611 * 'ascii'/'a' sorts in ASCII order, e.g. 11610 loaded. Default for non-merge operations.
11612 'ID3' < 'id10' < 'id2'. 11611 * 'natural'/'n' invokes 'natural sort', e.g.
11613 * 'file'/'f' uses the order in the given file (named 11612 'id2' < 'ID3' < 'id10'. Default when merging.
11614 in the second parameter). 11613 * 'ascii'/'a' sorts in ASCII order, e.g.
11615 For now, only --merge/--bmerge/--merge-list and 11614 'ID3' < 'id10' < 'id2'.
11616 --make-bed/--make-just-fam respect this flag. 11615 * 'file'/'f' uses the order in the given file (named
11617 --with-phenotype <no-parents> <no-sex | female-2> : Include more sample info 11616 in the second parameter).
11618 in new .cov file. 11617 For now, only --merge/--bmerge/--merge-list and
11619 --dummy-coding {N} <no-round> : Split categorical variables (n categories, 11618 --make-bed/--make-just-fam respect this flag.
11620 2 < n <= N, default N is 49) into n-1 binary 11619 --with-phenotype <no-parents> <no-sex | female-2> : Include more sample info
11621 dummy variables when writing covariate file. 11620 in new .cov file.
11622 --merge-mode [n] : Adjust --{b}merge/--merge-list behavior based on a 11621 --dummy-coding {N} <no-round> : Split categorical variables (n categories,
11623 numeric code. 11622 2 < n <= N, default N is 49) into n-1 binary
11624 1 (default) = ignore missing calls, otherwise difference 11623 dummy variables when writing covariate file.
11625 -> missing 11624 --merge-mode [n] : Adjust --{b}merge/--merge-list behavior based on a
11626 2 = only overwrite originally missing calls 11625 numeric code.
11627 3 = only overwrite when nonmissing in new file 11626 1 (default) = ignore missing calls, otherwise difference
11628 4/5 = never overwrite and always overwrite, respectively 11627 -> missing
11629 6 = report all mismatching calls without merging 11628 2 = only overwrite originally missing calls
11630 7 = report mismatching nonmissing calls without merging 11629 3 = only overwrite when nonmissing in new file
11631 --merge-equal-pos : With --merge/--bmerge/--merge-list, merge variants with 11630 4/5 = never overwrite and always overwrite, respectively
11632 different names but identical positions. (Exception: 11631 6 = report all mismatching calls without merging
11633 same-position chromosome code 0 variants aren't merged.) 11632 7 = report mismatching nonmissing calls without merging
11634 --mendel-duos : Make Mendel error checks consider samples with only one 11633 --merge-equal-pos : With --merge/--bmerge/--merge-list, merge variants with
11635 parent in the dataset. 11634 different names but identical positions. (Exception:
11636 --mendel-multigen : Make Mendel error checks consider (great-)grandparental 11635 same-position chromosome code 0 variants aren't merged.)
11637 genotypes when parental genotype data is missing. 11636 --mendel-duos : Make Mendel error checks consider samples with only one
11638 --ld-window [ct+1] : Set --r/--r2 max variant ct pairwise distance (usu. 10). 11637 parent in the dataset.
11639 --ld-window-kb [x] : Set --r/--r2 max kb pairwise distance (usually 1000). 11638 --mendel-multigen : Make Mendel error checks consider (great-)grandparental
11640 --ld-window-cm [x] : Set --r/--r2 max centimorgan pairwise distance. 11639 genotypes when parental genotype data is missing.
11641 --ld-window-r2 [x] : Set threshold for --r2 report inclusion (usually 0.2). 11640 --ld-window [ct+1] : Set --r/--r2 max variant ct pairwise distance (usu. 10).
11642 --ld-snp [var ID] : Set first variant in all --r/--r2 pairs. 11641 --ld-window-kb [x] : Set --r/--r2 max kb pairwise distance (usually 1000).
11643 --ld-snps [vID...] : Restrict first --r/--r2 variant to the given ranges. 11642 --ld-window-cm [x] : Set --r/--r2 max centimorgan pairwise distance.
11644 --ld-snp-list [f] : Restrict first --r/--r2 var. to those named in the file. 11643 --ld-window-r2 [x] : Set threshold for --r2 report inclusion (usually 0.2).
11645 --list-all : Generate the 'all' mode report when using --show-tags in 11644 --ld-snp [var ID] : Set first variant in all --r/--r2 pairs.
11646 file mode. 11645 --ld-snps [vID...] : Restrict first --r/--r2 variant to the given ranges.
11647 --tag-kb [kbs] : Set --show-tags max tag kb distance (default 250). 11646 --ld-snp-list [f] : Restrict first --r/--r2 var. to those named in the file.
11648 --tag-r2 [val] : Set --show-tags min tag r-squared (default 0.8). 11647 --list-all : Generate the 'all' mode report when using --show-tags in
11649 --tag-mode2 : Use two-column --show-tags (file mode) I/O format. 11648 file mode.
11650 --ld-xchr [code] : Set Xchr model for --indep{-pairwise}, --r/--r2, 11649 --tag-kb [kbs] : Set --show-tags max tag kb distance (default 250).
11651 --flip-scan, and --show-tags. 11650 --tag-r2 [val] : Set --show-tags min tag r-squared (default 0.8).
11652 1 (default) = males coded 0/1, females 0/1/2 (A1 dosage) 11651 --tag-mode2 : Use two-column --show-tags (file mode) I/O format.
11653 2 = males coded 0/2 11652 --ld-xchr [code] : Set Xchr model for --indep{-pairwise}, --r/--r2,
11654 3 = males coded 0/2, but females given double weighting 11653 --flip-scan, and --show-tags.
11655 --blocks-max-kb [kbs] : Set --blocks maximum haploblock span (def. 200). 11654 1 (default) = males coded 0/1, females 0/1/2 (A1 dosage)
11656 --blocks-min-maf [cutoff] : Adjust --blocks MAF minimum (default 0.05). 11655 2 = males coded 0/2
11657 --blocks-strong-lowci [x] : Set --blocks 'strong LD' CI thresholds (defaults 11656 3 = males coded 0/2, but females given double weighting
11658 --blocks-strong-highci [x] 0.70 and 0.98). 11657 --blocks-max-kb [kbs] : Set --blocks maximum haploblock span (def. 200).
11659 --blocks-recomb-highci [x] : Set 'recombination' CI threshold (default 0.90). 11658 --blocks-min-maf [cutoff] : Adjust --blocks MAF minimum (default 0.05).
11660 --blocks-inform-frac [x] : Force haploblock [strong LD pairs]:[total 11659 --blocks-strong-lowci [x] : Set --blocks 'strong LD' CI thresholds (defaults
11661 informative pairs] ratios to be larger than this 11660 --blocks-strong-highci [x] 0.70 and 0.98).
11662 value (default 0.95). 11661 --blocks-recomb-highci [x] : Set 'recombination' CI threshold (default 0.90).
11663 --distance-wts exp=[x] : When computing genomic distances, assign each 11662 --blocks-inform-frac [x] : Force haploblock [strong LD pairs]:[total
11664 variant a weight of (2q(1-q))^{-x}, where q 11663 informative pairs] ratios to be larger than this
11665 is the loaded or inferred MAF. 11664 value (default 0.95).
11666 --read-dists [dist file] {id file} : Load a triangular binary distance matrix 11665 --distance-wts exp=[x] : When computing genomic distances, assign each
11667 instead of recalculating from scratch. 11666 variant a weight of (2q(1-q))^{-x}, where q
11668 --ppc-gap [val] : Minimum number of base pairs, in thousands, between 11667 is the loaded or inferred MAF.
11669 informative pairs of markers used in --genome PPC test. 11668 --read-dists [dist file] {id file} : Load a triangular binary distance matrix
11670 500 if unspecified. 11669 instead of recalculating from scratch.
11671 --min [cutoff] : Specify minimum PI_HAT for inclusion in --genome report. 11670 --ppc-gap [val] : Minimum number of base pairs, in thousands, between
11672 --max [cutoff] : Specify maximum PI_HAT for inclusion in --genome report. 11671 informative pairs of markers used in --genome PPC test.
11673 --homozyg-match [] : Set minimum concordance across jointly homozygous 11672 500 if unspecified.
11674 variants for a pairwise allelic match to be declared. 11673 --min [cutoff] : Specify minimum PI_HAT for inclusion in --genome report.
11675 --pool-size [ct] : Set minimum size of pools in '--homozyg group' report. 11674 --max [cutoff] : Specify maximum PI_HAT for inclusion in --genome report.
11676 --read-genome [fn] : Load --genome report for --cluster/--neighbour, instead 11675 --homozyg-match [] : Set minimum concordance across jointly homozygous
11677 of recalculating IBS and PPC test p-values from scratch. 11676 variants for a pairwise allelic match to be declared.
11678 --ppc [p-val] : Specify minimum PPC test p-value within a cluster. 11677 --pool-size [ct] : Set minimum size of pools in '--homozyg group' report.
11679 --mc [max size] : Specify maximum cluster size. 11678 --read-genome [fn] : Load --genome report for --cluster/--neighbour, instead
11680 --mcc [c1] [c2] : Specify maximum case and control counts per cluster. 11679 of recalculating IBS and PPC test p-values from scratch.
11681 --K [min count] : Specify minimum cluster count. 11680 --ppc [p-val] : Specify minimum PPC test p-value within a cluster.
11682 --ibm [val] : Specify minimum identity-by-missingness. 11681 --mc [max size] : Specify maximum cluster size.
11683 --match [f] {mv} : Use covariate values to restrict clustering. Without 11682 --mcc [c1] [c2] : Specify maximum case and control counts per cluster.
11684 --match-type, two samples can only be in the same cluster 11683 --K [min count] : Specify minimum cluster count.
11685 if all covariates match. The optional second parameter 11684 --ibm [val] : Specify minimum identity-by-missingness.
11686 specifies a covariate value to treat as missing. 11685 --match [f] {mv} : Use covariate values to restrict clustering. Without
11687 --match-type [f] : Refine interpretation of --match file. The --match-type 11686 --match-type, two samples can only be in the same cluster
11688 file is expected to be a single line with as many entries 11687 if all covariates match. The optional second parameter
11689 as the --match file has covariates; '0' entries specify 11688 specifies a covariate value to treat as missing.
11690 'negative matches' (i.e. samples with equal covariate 11689 --match-type [f] : Refine interpretation of --match file. The --match-type
11691 values cannot be in the same cluster), '1' entries specify 11690 file is expected to be a single line with as many entries
11692 'positive matches' (default), and '-1' causes the 11691 as the --match file has covariates; '0' entries specify
11693 corresponding covariate to be ignored. 11692 'negative matches' (i.e. samples with equal covariate
11694 --qmatch [f] {m} : Force all members of a cluster to have similar 11693 values cannot be in the same cluster), '1' entries specify
11695 --qt [fname] quantitative covariate values. The --qmatch file contains 11694 'positive matches' (default), and '-1' causes the
11696 the covariate values, while the --qt file is a list of 11695 corresponding covariate to be ignored.
11697 nonnegative tolerances (and '-1's marking covariates to 11696 --qmatch [f] {m} : Force all members of a cluster to have similar
11698 skip). 11697 --qt [fname] quantitative covariate values. The --qmatch file contains
11699 --pca-cluster-names [...] : These can be used individually or in combination 11698 the covariate values, while the --qt file is a list of
11700 --pca-clusters [fname] to define a list of clusters to use in the basic 11699 nonnegative tolerances (and '-1's marking covariates to
11701 --pca computation. (--pca-cluster-names expects 11700 skip).
11702 a space-delimited sequence of cluster names, 11701 --pca-cluster-names [...] : These can be used individually or in combination
11703 while --pca-clusters expects a file with one 11702 --pca-clusters [fname] to define a list of clusters to use in the basic
11704 cluster name per line.) All samples outside 11703 --pca computation. (--pca-cluster-names expects
11705 those clusters will then be projected on to the 11704 a space-delimited sequence of cluster names,
11706 calculated PCs. 11705 while --pca-clusters expects a file with one
11707 --mds-plot [dims] <by-cluster> <eigendecomp> <eigvals> : 11706 cluster name per line.) All samples outside
11708 Multidimensional scaling analysis. Requires --cluster. 11707 those clusters will then be projected on to the
11709 --cell [thresh] : Skip some --model tests when a contingency table entry is 11708 calculated PCs.
11710 smaller than the given threshold. 11709 --mds-plot [dims] <by-cluster> <eigendecomp> <eigvals> :
11711 --condition [var ID] <dominant | recessive> : Add one variant as a --linear 11710 Multidimensional scaling analysis. Requires --cluster.
11712 or --logistic covariate. 11711 --cell [thresh] : Skip some --model tests when a contingency table entry is
11713 --condition-list [f] <dominant | recessive> : Add variants named in the file 11712 smaller than the given threshold.
11714 as --linear/--logistic covs. 11713 --condition [var ID] <dominant | recessive> : Add one variant as a --linear
11715 --parameters [...] : Include only the given covariates/interactions in the 11714 or --logistic covariate.
11716 --linear/--logistic models, identified by a list of 11715 --condition-list [f] <dominant | recessive> : Add variants named in the file
11717 1-based indices and/or ranges of them. 11716 as --linear/--logistic covs.
11718 --tests <all> {...} : Perform a (joint) test on the specified term(s) in the 11717 --parameters [...] : Include only the given covariates/interactions in the
11719 --linear/--logistic model, identified by 1-based 11718 --linear/--logistic models, identified by a list of
11720 indices and/or ranges of them. If permutation was 11719 1-based indices and/or ranges of them.
11721 requested, it is based on this test. 11720 --tests <all> {...} : Perform a (joint) test on the specified term(s) in the
11722 * Note that, when --parameters is also present, the 11721 --linear/--logistic model, identified by 1-based
11723 indices refer to the terms remaining AFTER pruning by 11722 indices and/or ranges of them. If permutation was
11724 --parameters. 11723 requested, it is based on this test.
11725 * You can use '--tests all' to include all terms. 11724 * Note that, when --parameters is also present, the
11726 --vif [max VIF] : Set VIF threshold for --linear multicollinearity check 11725 indices refer to the terms remaining AFTER pruning by
11727 (default 50). 11726 --parameters.
11728 --xchr-model [code] : Set the X chromosome --linear/--logistic model. 11727 * You can use '--tests all' to include all terms.
11729 0 = skip sex and haploid chromosomes 11728 --vif [max VIF] : Set VIF threshold for --linear multicollinearity check
11730 1 (default) = add sex as a covariate on X chromosome 11729 (default 50).
11731 2 = code male genotypes 0/2 instead of 0/1 11730 --xchr-model [code] : Set the X chromosome --linear/--logistic model.
11732 3 = test for interaction between genotype and sex 11731 0 = skip sex and haploid chromosomes
11733 --lasso-select-covars {cov(s)...} : Subject some or all covariates to LASSO 11732 1 (default) = add sex as a covariate on X chromosome
11734 model selection. 11733 2 = code male genotypes 0/2 instead of 0/1
11735 --adjust <gc> <log10> <qq-plot> : Report some multiple-testing corrections. 11734 3 = test for interaction between genotype and sex
11736 --lambda [val] : Set genomic control lambda for --adjust. 11735 --lasso-select-covars {cov(s)...} : Subject some or all covariates to LASSO
11737 --ci [size] : Report confidence intervals for odds ratios. 11736 model selection.
11738 --pfilter [val] : Filter out association test results with higher p-values. 11737 --adjust <gc> <log10> <qq-plot> : Report some multiple-testing corrections.
11739 --aperm [min perms - 1] {max perms} {alpha} {beta} {init interval} {slope} : 11738 --lambda [val] : Set genomic control lambda for --adjust.
11740 Set up to six parameters controlling adaptive permutation tests. 11739 --ci [size] : Report confidence intervals for odds ratios.
11741 * The first two control the minimum and maximum number of permutations that 11740 --pfilter [val] : Filter out association test results with higher p-values.
11742 may be run for each variant; default values are 5 and 1000000. 11741 --aperm [min perms - 1] {max perms} {alpha} {beta} {init interval} {slope} :
11743 * The next two control the early termination condition. A 11742 Set up to six parameters controlling adaptive permutation tests.
11744 100% * (1 - beta/2T) confidence interval is calculated for each empirical 11743 * The first two control the minimum and maximum number of permutations that
11745 p-value, where T is the total number of variants; whenever this 11744 may be run for each variant; default values are 5 and 1000000.
11746 confidence interval doesn't contain alpha, the variant is exempted from 11745 * The next two control the early termination condition. A
11747 further permutation testing. Default values are 0 and 1e-4. 11746 100% * (1 - beta/2T) confidence interval is calculated for each empirical
11748 * The last two control when the early termination condition is checked. If 11747 p-value, where T is the total number of variants; whenever this
11749 a check occurs at permutation #p, the next check occurs after 11748 confidence interval doesn't contain alpha, the variant is exempted from
11750 [slope]p + [init interval] more permutations (rounded down). Default 11749 further permutation testing. Default values are 0 and 1e-4.
11751 initial interval is 1, and default slope is 0.001. 11750 * The last two control when the early termination condition is checked. If
11752 --mperm-save : Save best max(T) permutation test statistics. 11751 a check occurs at permutation #p, the next check occurs after
11753 --mperm-save-all : Save all max(T) permutation test statistics. 11752 [slope]p + [init interval] more permutations (rounded down). Default
11754 --set-p [p-val] : Adjust set test significant variant p-value ceiling 11753 initial interval is 1, and default slope is 0.001.
11755 (default 0.05). 11754 --mperm-save : Save best max(T) permutation test statistics.
11756 --set-r2 {v} <write> : Adjust set test significant variant pairwise r^2 11755 --mperm-save-all : Save all max(T) permutation test statistics.
11757 ceiling (default 0.5). 'write' causes violating 11756 --set-p [p-val] : Adjust set test significant variant p-value ceiling
11758 pairs to be dumped to {output prefix}.ldset. 11757 (default 0.05).
11759 --set-max [ct] : Adjust set test maximum # of significant variants 11758 --set-r2 {v} <write> : Adjust set test significant variant pairwise r^2
11760 considered per set (default 5). 11759 ceiling (default 0.5). 'write' causes violating
11761 --set-test-lambda [v] : Specify genomic control correction for set test. 11760 pairs to be dumped to {output prefix}.ldset.
11762 --border [kbs] : Extend --annotate range intervals by given # kbs. 11761 --set-max [ct] : Adjust set test maximum # of significant variants
11763 --annotate-snp-field [nm] : Set --annotate variant ID field name. 11762 considered per set (default 5).
11764 --clump-p1 [pval] : Set --clump index var. p-value ceiling (default 1e-4). 11763 --set-test-lambda [v] : Specify genomic control correction for set test.
11765 --clump-p2 [pval] : Set --clump secondary p-value threshold (default 0.01). 11764 --border [kbs] : Extend --annotate range intervals by given # kbs.
11766 --clump-r2 [r^2] : Set --clump r^2 threshold (default 0.5). 11765 --annotate-snp-field [nm] : Set --annotate variant ID field name.
11767 --clump-kb [kbs] : Set --clump kb radius (default 250). 11766 --clump-p1 [pval] : Set --clump index var. p-value ceiling (default 1e-4).
11768 --clump-snp-field [n...] : Set --clump variant ID field name (default 11767 --clump-p2 [pval] : Set --clump secondary p-value threshold (default 0.01).
11769 'SNP'). With multiple field names, earlier names 11768 --clump-r2 [r^2] : Set --clump r^2 threshold (default 0.5).
11770 take precedence over later ones. 11769 --clump-kb [kbs] : Set --clump kb radius (default 250).
11771 --clump-field [name...] : Set --clump p-value field name (default 'P'). 11770 --clump-snp-field [n...] : Set --clump variant ID field name (default
11772 --clump-allow-overlap : Let --clump non-index vars. join multiple clumps. 11771 'SNP'). With multiple field names, earlier names
11773 --clump-verbose : Request extended --clump report. 11772 take precedence over later ones.
11774 --clump-annotate [hdr...] : Include named extra fields in --clump-verbose and 11773 --clump-field [name...] : Set --clump p-value field name (default 'P').
11775 --clump-best reports. (Field names can be 11774 --clump-allow-overlap : Let --clump non-index vars. join multiple clumps.
11776 separated with spaces or commas.) 11775 --clump-verbose : Request extended --clump report.
11777 --clump-range [filename] : Report overlaps between clumps and regions. 11776 --clump-annotate [hdr...] : Include named extra fields in --clump-verbose and
11778 --clump-range-border [kb] : Stretch regions in --clump-range file. 11777 --clump-best reports. (Field names can be
11779 --clump-index-first : Extract --clump index vars. from only first file. 11778 separated with spaces or commas.)
11780 --clump-replicate : Exclude clumps which contain secondary results 11779 --clump-range [filename] : Report overlaps between clumps and regions.
11781 from only one file. 11780 --clump-range-border [kb] : Stretch regions in --clump-range file.
11782 --clump-best : Report best proxy for each --clump index var. 11781 --clump-index-first : Extract --clump index vars. from only first file.
11783 --meta-analysis-snp-field [n...] : Set --meta-analysis variant ID, A1/A2 11782 --clump-replicate : Exclude clumps which contain secondary results
11784 --meta-analysis-a1-field [n...] allele, p-value, and/or effective sample 11783 from only one file.
11785 --meta-analysis-a2-field [n...] size field names. Defaults are 'SNP', 11784 --clump-best : Report best proxy for each --clump index var.
11786 --meta-analysis-p-field [n...] 'A1', 'A2', 'P', and 'NMISS', 11785 --meta-analysis-snp-field [n...] : Set --meta-analysis variant ID, A1/A2
11787 --meta-analysis-ess-field [n...] respectively. When multiple parameters 11786 --meta-analysis-a1-field [n...] allele, p-value, and/or effective sample
11788 are given to these flags, earlier names 11787 --meta-analysis-a2-field [n...] size field names. Defaults are 'SNP',
11789 take precedence over later ones. 11788 --meta-analysis-p-field [n...] 'A1', 'A2', 'P', and 'NMISS',
11790 Note that, if the numbers of cases and 11789 --meta-analysis-ess-field [n...] respectively. When multiple parameters
11791 controls are unequal, effective sample 11790 are given to these flags, earlier names
11792 size should be 11791 take precedence over later ones.
11793 4 / (1/[# cases] + 1/[# controls]). 11792 Note that, if the numbers of cases and
11794 --meta-analysis-report-dups : When a variant appears multiple times in 11793 controls are unequal, effective sample
11795 the same file, report that. 11794 size should be
11796 --gene-list-border [kbs] : Extend --gene-report regions by given # of kbs. 11795 4 / (1/[# cases] + 1/[# controls]).
11797 --gene-subset [filename] : Specify gene name subset for --gene-report. 11796 --meta-analysis-report-dups : When a variant appears multiple times in
11798 --gene-report-snp-field [] : Set --gene-report variant ID field name (default 11797 the same file, report that.
11799 'SNP'). Only relevant with --extract. 11798 --gene-list-border [kbs] : Extend --gene-report regions by given # of kbs.
11800 --gap [kbs] : Set '--fast-epistasis case-only' min. gap (default 1000). 11799 --gene-subset [filename] : Specify gene name subset for --gene-report.
11801 --epi1 [p-value] : Set --{fast-}epistasis reporting threshold (default 11800 --gene-report-snp-field [] : Set --gene-report variant ID field name (default
11802 5e-6 for 'boost', 1e-4 otherwise). 11801 'SNP'). Only relevant with --extract.
11803 --epi2 [p-value] : Set threshold for contributing to SIG_E count (def. 0.01). 11802 --gap [kbs] : Set '--fast-epistasis case-only' min. gap (default 1000).
11804 --je-cellmin [n] : Set required number of observations per 3x3x2 contingency 11803 --epi1 [p-value] : Set --{fast-}epistasis reporting threshold (default
11805 table cell for joint-effects test (default 5). 11804 5e-6 for 'boost', 1e-4 otherwise).
11806 --q-score-range [range file] [data file] {i} {j} <header> : 11805 --epi2 [p-value] : Set threshold for contributing to SIG_E count (def. 0.01).
11807 Apply --score to subset(s) of variants in the primary score list based 11806 --je-cellmin [n] : Set required number of observations per 3x3x2 contingency
11808 on e.g. p-value ranges. 11807 table cell for joint-effects test (default 5).
11809 * The first file should have range labels in the first column, p-value 11808 --q-score-range [range file] [data file] {i} {j} <header> :
11810 lower bounds in the second column, and upper bounds in the third column. 11809 Apply --score to subset(s) of variants in the primary score list based
11811 Lines with too few entries, or nonnumeric values in the second or third 11810 on e.g. p-value ranges.
11812 column, are ignored. 11811 * The first file should have range labels in the first column, p-value
11813 * The second file should contain a variant ID and a p-value on each 11812 lower bounds in the second column, and upper bounds in the third column.
11814 nonempty line (except possibly the first). Variant IDs are read from 11813 Lines with too few entries, or nonnumeric values in the second or third
11815 column #i and p-values are read from column #j, where i defaults to 1 and 11814 column, are ignored.
11816 j defaults to i+1. The 'header' modifier causes the first nonempty line 11815 * The second file should contain a variant ID and a p-value on each
11817 of this file to be skipped. 11816 nonempty line (except possibly the first). Variant IDs are read from
11818 --parallel [k] [n] : Divide the output matrix into n pieces, and only compute 11817 column #i and p-values are read from column #j, where i defaults to 1 and
11819 the kth piece. The primary output file will have the 11818 j defaults to i+1. The 'header' modifier causes the first nonempty line
11820 piece number included in its name, e.g. plink.rel.13 or 11819 of this file to be skipped.
11821 plink.rel.13.gz if k is 13. Concatenating these files 11820 --parallel [k] [n] : Divide the output matrix into n pieces, and only compute
11822 in order will yield the full matrix of interest. (Yes, 11821 the kth piece. The primary output file will have the
11823 this can be done before unzipping.) 11822 piece number included in its name, e.g. plink.rel.13 or
11824 N.B. This generally cannot be used to directly write a 11823 plink.rel.13.gz if k is 13. Concatenating these files
11825 symmetric square matrix. Choose square0 or triangle 11824 in order will yield the full matrix of interest. (Yes,
11826 shape instead, and postprocess as necessary. 11825 this can be done before unzipping.)
11827 --memory [val] : Set size, in MB, of initial workspace malloc attempt. 11826 N.B. This generally cannot be used to directly write a
11828 (Practically mandatory when using GNU parallel.) 11827 symmetric square matrix. Choose square0 or triangle
11829 --threads [val] : Set maximum number of concurrent threads. 11828 shape instead, and postprocess as necessary.
11830 This has one known limitation: some BLAS/LAPACK linear 11829 --memory [val] : Set size, in MB, of initial workspace malloc attempt.
11831 algebra operations are multithreaded in a way that PLINK 11830 (Practically mandatory when using GNU parallel.)
11832 cannot control. If this is problematic, you should 11831 --threads [val] : Set maximum number of concurrent threads.
11833 recompile against single-threaded BLAS/LAPACK. 11832 This has one known limitation: some BLAS/LAPACK linear
11834 --d [char] : Change variant/covariate range delimiter (normally '-'). 11833 algebra operations are multithreaded in a way that PLINK
11835 --seed [val...] : Set random number seed(s). Each value must be an 11834 cannot control. If this is problematic, you should
11836 integer between 0 and 4294967295 inclusive. 11835 recompile against single-threaded BLAS/LAPACK.
11837 --perm-batch-size [val] : Set number of permutations per batch for some 11836 --d [char] : Change variant/covariate range delimiter (normally '-').
11838 permutation tests. 11837 --seed [val...] : Set random number seed(s). Each value must be an
11839 --output-min-p [p] : Specify minimum p-value to write to reports. 11838 integer between 0 and 4294967295 inclusive.
11840 --debug : Use slower, more crash-resistant logging method. 11839 --perm-batch-size [val] : Set number of permutations per batch for some
11841 11840 permutation tests.
11842 Primary methods paper: 11841 --output-min-p [p] : Specify minimum p-value to write to reports.
11843 Chang CC, Chow CC, Tellier LCAM, Vattikuti S, Purcell SM, Lee JJ (2015) 11842 --debug : Use slower, more crash-resistant logging method.
11844 Second-generation PLINK: rising to the challenge of larger and richer datasets. 11843
11845 GigaScience, 4. 11844 Primary methods paper:
11846 11845 Chang CC, Chow CC, Tellier LCAM, Vattikuti S, Purcell SM, Lee JJ (2015)
11847 For further documentation and support, consult the main webpage 11846 Second-generation PLINK: rising to the challenge of larger and richer datasets.
11848 (https://www.cog-genomics.org/plink2 ) and/or the mailing list 11847 GigaScience, 4.
11849 (https://groups.google.com/d/forum/plink2-users ). 11848
11849 For further documentation and support, consult the main webpage
11850 (https://www.cog-genomics.org/plink2 ) and/or the mailing list
11851 (https://groups.google.com/d/forum/plink2-users ).
11852
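The --aperm check schedule described above (after a check at permutation #p, the next check comes [slope]p + [init interval] permutations later, rounded down) can be sketched in Python. This is only an illustration of that arithmetic, not PLINK's implementation: the function names are hypothetical, the position of the first check is treated as a caller-supplied assumption, and the error on a zero-length step is a guard added for the sketch.

```python
import math


def next_check(p, init_interval=1.0, slope=0.001):
    """Return the permutation index of the check that follows a check at p."""
    # Gap to the next check: slope * p + init_interval, rounded down.
    step = math.floor(slope * p + init_interval)
    if step < 1:
        # Sketch-only guard: a zero step would never advance.
        raise ValueError("slope * p + init_interval must be at least 1")
    return p + step


def check_schedule(first_check, max_perms, init_interval=1.0, slope=0.001):
    """List check points from first_check (an assumption) up to max_perms."""
    points = []
    p = first_check
    while p <= max_perms:
        points.append(p)
        p = next_check(p, init_interval, slope)
    return points
```

With the default interval of 1 and slope of 0.001, checks are spaced one permutation apart at first and grow sparser as p increases.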
11853
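The effective sample size formula quoted for the --meta-analysis field flags, 4 / (1/[# cases] + 1/[# controls]), can be computed as below. This is a minimal sketch: the function name is hypothetical, and rejecting non-positive counts is an assumption added here rather than documented PLINK behavior.

```python
def effective_sample_size(n_cases, n_controls):
    """Effective sample size for unequal case/control counts."""
    if n_cases <= 0 or n_controls <= 0:
        # Sketch-only guard against division by zero.
        raise ValueError("case and control counts must be positive")
    # 4 / (1/cases + 1/controls), as given in the help text.
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)
```

When cases and controls are equal the result is simply their sum; with unequal groups it falls below the total, so 1000 cases and 3000 controls give about 3000 rather than 4000.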
11850 11854
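The --parallel note says that concatenating the per-piece output files in order yields the full matrix, and that this also works on the gzipped pieces. A hedged Python sketch of that postprocessing step follows. The function name and the assumption that pieces are numbered 1 through n are illustrative; the file naming follows the plink.rel.13 example in the help text.

```python
import shutil
from pathlib import Path


def concat_pieces(prefix, n, out_path, suffix=""):
    """Append prefix.1 ... prefix.n (plus optional suffix) into out_path."""
    with open(out_path, "wb") as out:
        # Iterate numerically so that piece 13 follows piece 12, not piece 1.
        for k in range(1, n + 1):
            piece = Path(f"{prefix}.{k}{suffix}")
            with piece.open("rb") as src:
                # Byte-wise copy; no decompression is attempted.
                shutil.copyfileobj(src, out)
```

For example, `concat_pieces("plink.rel", 20, "plink.rel.full")` would join twenty uncompressed pieces, and passing `suffix=".gz"` would join the compressed ones without unzipping them first.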
11851 ]]></help> 11855 ]]></help>
11852 <citations> 11856 <citations>
11853 <citation type="doi">10.1186/s13742-015-0047-8</citation> 11857 <citation type="doi">10.1186/s13742-015-0047-8</citation>
11854 <citation type="bibtex">@ARTICLE{Blankenberg19-plink, 11858 <citation type="bibtex">@ARTICLE{Blankenberg19-plink,