1{
2 "class": "Workflow",
3 "cwlVersion": "v1.2",
4 "doc": "**Aggregate Association Testing workflow** runs aggregate association tests, using Burden, SKAT [1], fastSKAT [2], SMMAT [3], or SKAT-O [4] to aggregate a user-defined set of variants. Association tests are parallelized by segments within chromosomes.\n\nThe define segments step splits the genome into segments and assigns each aggregate unit to a segment based on the position of its first variant. Note that the number of segments refers to the whole genome, not to the number of segments per chromosome. Association testing is then performed for each segment in parallel, and the results are combined at the chromosome level. Finally, the last step creates QQ and Manhattan plots.\n\nAggregate tests are typically used to jointly test rare variants. The **Alt freq max** parameter sets the maximum alternate allele frequency allowed for inclusion in the test. Included variants are usually weighted using either a function of allele frequency (specified via the **Weight Beta** parameter) or some other annotation information (specified via the **Variant weight file** and **Weight user** parameters). \n\nWhen running a burden test, the effect estimate is for each additional unit of burden; there are no effect size estimates for the other tests. Multiple alternate alleles for a single variant are treated separately.\n\nThis workflow utilizes the *assocTestAggregate* function from the GENESIS software.\n\n\n### Common Use Cases\n * This workflow is designed to perform multi-variant association testing on user-defined groups of variants.\n\n\n### Common Issues and important notes:\n* This pipeline expects that **GDS Files**, **Variant Include Files**, and **Variant group files** are separated per chromosome, and that files are properly named. The chromosome is expected to be included in the filename in the following format: chr##, where ## is the name of the chromosome (1-24 or X, Y). The chromosome can be included at any part of the filename. 
Examples: data_subset_chr1.vcf, data_chr1_subset.vcf, chr1_data_subset.vcf.\n\n* If the **Weight Beta** parameter is set, it must follow the expected convention: two space-delimited floating-point numbers.\n\n* The **Number of segments** parameter, if provided, needs to be equal to or higher than the number of chromosomes.\n\n* Testing showed that the default parameters for **CPU** and **memory GB** (8GB) are sufficient for the tested studies (up to 50k samples); however, different null models might increase the requirements.\n\n### Performance Benchmarking\n\nThe following table gives estimates of running time and cost. \n\n| Samples | | Rel. matrix in NM | Test | Parallel instances | Instance type | Instance | CPU | RAM (GB) | Time | Cost |\n| ------- | -------- | -------------------- | ------ | --------------------- | ---------------- | ------------- | --- | -------- | ----------- | ---- |\n| 10K | | w/o | Burden | 8 | On Dm | r5.12xlarge | 1 | 8 | 1 h, 11 min | 16$ |\n| 10K | | Sparse | Burden | 8 | On Dm | r5.12xlarge | 1 | 8 | 1 h, 10 min | 17$ |\n| 10K | | Dense | Burden | 8 | On Dm | r5.12xlarge | 1 | 8 | 1 h, 12 min | 16$ |\n| 36K | | w/o | Burden | 8 | On Dm | r5.12xlarge | 1 | 8 | 1 h, 46 min | 28$ |\n| 36K | | Sparse | Burden | 8 | On Dm | r5.12xlarge | 1 | 9 | 1 h, 50 min | 31$ |\n| 36K | | Dense | Burden | 8 | On Dm | r5.12xlarge | 4 | 36 | 2 h, 59 min | 66$ |\n| 50K | | w/o | Burden | 8 | On Dm | r5.12xlarge | 1 | 8 | 2 h, 28 min | 40$ |\n| 50K | | Sparse | Burden | 8 | On Dm | r5.12xlarge | 1 | 9 | 2 h, 11 min | 43$ |\n| 50K | | Dense | Burden | 8 | On Dm | r5.24xlarge | 8 | 70 | 4 h, 47 min | 208$ |\n| 50K | | Dense | Burden | 8 | On Dm | r5.24xlarge | 8 | 70 | 4 h, 30 min | 218$ |\n| 50K | | Dense | Burden | 8 | On Dm | r5.12xlarge | 8 | 70 | 9 h | 218$ |\n| 10K | | w/o | Burden | 8 | On Dm | n1-highmem-32 | 1 | 8 | 1 h, 55 min | 16$ |\n| 10K | | Sparse | Burden | 8 | On Dm | n1-highmem-32 | 1 | 8 | 2 h | 17$ |\n| 10K | | Dense | Burden | 8 | On Dm | n1-highmem-32 
| 1 | 8 | 2 h, 40 min | 16$ |\n| 36K | | w/o | Burden | 8 | On Dm | n1-highmem-32 | 1 | 8 | 2 h, 17 min | 30$ |\n| 36K | | Sparse | Burden | 8 | On Dm | n1-highmem-32 | 1 | 9 | 2 h, 30 min | 30$ |\n| 36K | | Dense | Burden | 8 | On Dm | n1-highmem-32 | 4 | 36 | 6 h | 91$ |\n| 50K | | w/o | Burden | 8 | On Dm | n1-highmem-32 | 1 | 8 | 5 h, 50 min | 43$ |\n| 50K | | Sparse | Burden | 8 | On Dm | n1-highmem-32 | 1 | 9 | 5 h, 50 min | 40$ |\n| 50K | | Dense | Burden | 8 | On Dm | n1-highmem-96 | 8 | 70 | 6 h | 270$ | \n| 10K | | w/o | SKAT | 8 | On Dm | r5.12xlarge | 1 | 8 | 1 h, 15 min | 16$ |\n| 10K | | Sparse | SKAT | 8 | On Dm | r5.12xlarge | 1 | 8 | 1 h, 15 min | 17$ |\n| 10K | | Dense | SKAT | 8 | On Dm | r5.12xlarge | 1 | 8 | 1 h, 17 min | 17$ |\n| 36K | | w/o | SKAT | 8 | On Dm | r5.12xlarge | 1 | 8 | 1 h, 47 min | 27$ |\n| 36K | | Sparse | SKAT | 8 | On Dm | r5.12xlarge | 1 | 9 | 1 h, 50 min | 31$ |\n| 36K | | Dense | SKAT | 8 | On Dm | r5.12xlarge | 6 | 48 | 5 h, 5 min | 110$ |\n| 50K | | w/o | SKAT | 8 | On Dm | r5.12xlarge | 1 | 8 | 2 h, 27 min | 40$ |\n| 50K | | Sparse | SKAT | 8 | On Dm | r5.12xlarge | 1 | 9 | 2 h, 23 min | 44$ |\n| 50K | | Dense | SKAT | 8 | On Dm | r5.24xlarge | 13 | 100 | 11 h, 2 min | 500$ |\n| 50K | | Dense | SKAT | 8 | On Dm | r5.24xlarge | 12 | 90 | 9 h | 435$ |\n| 50K | | Dense | SKAT | 8 | On Dm | r5.12xlarge | 12 | 90 | 18 h | 435$ |\n| 10K | | w/o | SKAT | 8 | On Dm | n1-highmem-32 | 1 | 8 | 1 h, 50 min | 17$ |\n| 10K | | Sparse | SKAT | 8 | On Dm | n1-highmem-32 | 1 | 8 | 2 h | 16$ |\n| 10K | | Dense | SKAT | 8 | On Dm | n1-highmem-32 | 1 | 8 | 2 h, 50 min | 17$ |\n| 36K | | w/o | SKAT | 8 | On Dm | n1-highmem-32 | 1 | 8 | 2 h, 45 min | 30$ |\n| 36K | | Sparse | SKAT | 8 | On Dm | n1-highmem-32 | 1 | 9 | 2 h, 20 min | 30$ |\n| 36K | | Dense | SKAT | 8 | On Dm | n1-highmem-32 | 6 | 48 | 12 h | 162$ |\n| 50K | | w/o | SKAT | 8 | On Dm | n1-highmem-32 | 1 | 8 | 5 h | 45$ |\n| 50K | | Sparse | SKAT | 8 | On Dm | n1-highmem-32 | 1 
| 9 | 5 h | 45$ |\n| 50K | | Dense | SKAT | 8 | On Dm | n1-highmem-96 | 13 | 100 | 14 h | 620$ |\n| 50K | | Dense | SKAT | 8 | On Dm | n1-highmem-96 | 12 | 90 | 14 h | 620$ |\n| 10K | | w/o | SMMAT | 8 | On Dm | r5.12xlarge | 1 | 8 | 1 h, 15 min | 16$ |\n| 10K | | Sparse | SMMAT | 8 | On Dm | r5.12xlarge | 1 | 8 | 1 h, 16 min | 16$ |\n| 10K | | Dense | SMMAT | 8 | On Dm | r5.12xlarge | 1 | 8 | 1 h, 18 min | 17$ |\n| 36K | | w/o | SMMAT | 8 | On Dm | r5.12xlarge | 1 | 8 | 1 h, 45 min | 28$ |\n| 36K | | Sparse | SMMAT | 8 | On Dm | r5.12xlarge | 1 | 9 | 1 h, 48 min | 32$ |\n| 36K | | Dense | SMMAT | 8 | On Dm | r5.12xlarge | 6 | 48 | 5h | 111$ |\n| 50K | | w/o | SMMAT | 8 | On Dm | r5.12xlarge | 1 | 8 | 2 h, 30 min | 40$ |\n| 50K | | Sparse | SMMAT | 8 | On Dm | r5.12xlarge | 1 | 9 | 2 h, 47 min | 44$ |\n| 50K | | Dense | SMMAT | 8 | On Dm | r5.24xlarge | 13 | 100 | 11 h, 30 min | 500$ |\n| 50K | | Dense | SMMAT | 8 | On Dm | r5.24xlarge | 12 | 90 | 9 h | 435$ |\n| 50K | | Dense | SMMAT | 8 | On Dm | r5.12xlarge | 12 | 90 | 18 h | 435$ |\n| 10K | | w/o | SMMAT | 8 | On Dm | n1-highmem-32 | 1 | 8 | 1 h 30 min | 15$ |\n| 10K | | Sparse | SMMAT | 8 | On Dm | n1-highmem-32 | 1 | 8 | 2 h | 16$ |\n| 10K | | Dense | SMMAT | 8 | On Dm | n1-highmem-32 | 1 | 8 | 2 h, 50 min | 17$ |\n| 36K | | w/o | SMMAT | 8 | On Dm | n1-highmem-32 | 1 | 8 | 2 h, 43 min | 30$ |\n| 36K | | Sparse | SMMAT | 8 | On Dm | n1-highmem-32 | 1 | 9 | 2 h, 25 min | 30$ |\n| 36K | | Dense | SMMAT | 8 | On Dm | n1-highmem-32 | 6 | 48 | 12 h | 160$ |\n| 50K | | w/o | SMMAT | 8 | On Dm | n1-highmem-32 | 1 | 8 | 5 h | 42$ |\n| 50K | | Sparse | SMMAT | 8 | On Dm | n1-highmem-32 | 1 | 9 | 5 h | 50$ |\n| 50K | | Dense | SMMAT | 8 | On Dm | n1-highmem-96 | 13 | 100 | 14 h | 620$ |\n| 50K | | Dense | SMMAT | 8 | On Dm | n1-highmem-96 | 12 | 90 | 14 h | 620$ |\n| 10K | | w/o | Fast SKAT | 8 | On Dm | r5.12xlarge | 1 | 8 | 1 h, 14 min | 16$ |\n| 10K | | Sparse | Fast SKAT | 8 | On Dm | r5.12xlarge | 1 | 8 | 1 h, 15 
min | 16$ |\n| 10K | | Dense | Fast SKAT | 8 | On Dm | r5.12xlarge | 1 | 8 | 1 h, 17 min | 17$ |\n| 36K | | w/o | Fast SKAT | 8 | On Dm | r5.12xlarge | 1 | 8 | 1 h, 50 min | 28$ |\n| 36K | | Sparse | Fast SKAT | 8 | On Dm | r5.12xlarge | 1 | 9 | 1 h, 40 min | 34$ |\n| 36K | | Dense | Fast SKAT | 8 | On Dm | r5.12xlarge | 6 | 50 | 5 h, 30 min | 135$ |\n| 50K | | w/o | Fast SKAT | 8 | On Dm | r5.12xlarge | 1 | 10 | 1 h, 30 min | 40$ |\n| 50K | | Sparse | Fast SKAT | 8 | On Dm | r5.12xlarge | 1 | 10 | 1 h, 30 min | 43$ |\n| 50K | | Dense | Fast SKAT | 8 | On Dm | r5.24xlarge | 13 | 100 | 11 h, 41 min | 501$ |\n| 10K | | w/o | Fast SKAT | 8 | On Dm | n1-highmem-32 | 1 | 8 | 1 h, 30 min | 16$ |\n| 10K | | Sparse | Fast SKAT | 8 | On Dm | n1-highmem-32 | 1 | 8 | 1 h, 30 min | 16$ |\n| 10K | | Dense | Fast SKAT | 8 | On Dm | n1-highmem-32 | 1 | 8 | 2 h, 50 min | 17$ |\n| 36K | | w/o | Fast SKAT | 8 | On Dm | n1-highmem-32 | 1 | 8 | 3 h | 30$ |\n| 36K | | Sparse | Fast SKAT | 8 | On Dm | n1-highmem-32 | 1 | 9 | 4 h, 30min | 32$ |\n| 36K | | Dense | Fast SKAT | 8 | On Dm | n1-highmem-32 | 6 | 50 | 11 h | 160$ |\n| 50K | | w/o | Fast SKAT | 8 | On Dm | n1-highmem-32 | 1 | 10 | 3 h | 45$ |\n| 50K | | Sparse | Fast SKAT | 8 | On Dm | n1-highmem-32 | 1 | 10 | 3 h | 45$ |\n| 50K | | Dense | Fast SKAT | 8 | On Dm | n1-highmem-96 | 13 | 100 | 14 h | 650$ |\n\nIn tests performed we used **1000G** (tasks with 2.5k participants) and **TOPMed freeze5** datasets (tasks with 10k or more participants). All these tests are done with applied **MAF < 1% filter.** There are **70 mio** variants with MAF <= 1% in **1000G** and **460 mio** in **TOPMed freeze5 dataset**. Typically, aggregate tests only use a subset of these variants; e.g. grouped by gene. 
Computational performance will vary depending on how many total variants are tested, and how many variants are included in each aggregation unit.\n\n*For more details on **spot/preemptible instances** please visit the [Knowledge Center](https://docs.sevenbridges.com/docs/about-spot-instances).* \n\n### API Python Implementation\n\nThe app's draft task can also be submitted via the **API**. To learn how to get your **Authentication token** and **API endpoint** for the corresponding Platform, visit our [documentation](https://github.com/sbg/sevenbridges-python#authentication-and-configuration).\n\n```python\nfrom sevenbridges import Api\n\nauthentication_token, api_endpoint = \"enter_your_token\", \"enter_api_endpoint\"\napi = Api(token=authentication_token, url=api_endpoint)\n# Get project_id/app_id from your address bar. Example: https://f4c.sbgenomics.com/u/your_username/project/app\nproject_id, app_id = \"your_username/project\", \"your_username/project/app\"\n# Get file names from files in your project. Example: Names are taken from Data/Public Reference Files.\ninputs = {\n # List one GDS file and one variant group file per chromosome.\n \"input_gds_files\": api.files.query(project=project_id, names=[\"basename_chr1.gds\", \"basename_chr2.gds\"]),\n \"variant_group_files\": api.files.query(project=project_id, names=[\"variant_group_chr1.RData\", \"variant_group_chr2.RData\"]),\n \"phenotype_file\": api.files.query(project=project_id, names=[\"name_of_phenotype_file\"])[0],\n \"null_model_file\": api.files.query(project=project_id, names=[\"name_of_null_model_file\"])[0]\n}\n# run=False creates a draft task without starting it.\ntask = api.tasks.create(name='Aggregate Association Testing - API Run', project=project_id, app=app_id, inputs=inputs, run=False)\n```\nInstructions for installing and configuring the API Python client are provided on [GitHub](https://github.com/sbg/sevenbridges-python#installation). For more information about using the API Python client, consult the [sevenbridges-python documentation](http://sevenbridges-python.readthedocs.io/en/latest/). 
**More examples** are available [here](https://github.com/sbg/okAPI).\n\nAdditionally, [API R](https://github.com/sbg/sevenbridges-r) and [API Java](https://github.com/sbg/sevenbridges-java) clients are available. To learn more about using these API clients, please refer to the [API R client documentation](https://sbg.github.io/sevenbridges-r/) and the [API Java client documentation](https://docs.sevenbridges.com/docs/java-library-quickstart).\n\n\n### References\n[1] [SKAT](https://dx.doi.org/10.1016%2Fj.ajhg.2011.05.029) \n[2] [fastSKAT](https://doi.org/10.1002/gepi.22136) \n[3] [SMMAT](https://doi.org/10.1016/j.ajhg.2018.12.012) \n[4] [SKAT-O](https://doi.org/10.1093/biostatistics/kxs014) \n[5] [GENESIS](https://doi.org/10.1093/bioinformatics/btz567)",
5 "label": "GENESIS Aggregate Association Testing",
6 "$namespaces": {
7 "sbg": "https://sevenbridges.com"
8 },
9 "inputs": [
10 {
11 "id": "segment_length",
12 "type": "int?",
13 "label": "Segment length",
14 "doc": "Segment length in kb, used for parallelization.",
15 "sbg:toolDefaultValue": "10000kb",
16 "sbg:x": -518,
17 "sbg:y": -142
18 },
19 {
20 "id": "n_segments",
21 "type": "int?",
22 "label": "Number of segments",
23 "doc": "Number of segments, used for parallelization (overrides Segment length). Note that this parameter defines the number of segments for the entire genome, so using this argument with selected chromosomes may result in fewer segments than you expect (and the minimum is one segment per chromosome).",
24 "sbg:x": -691.4876708984375,
25 "sbg:y": -61
26 },
27 {
28 "id": "genome_build",
29 "type": [
30 "null",
31 {
32 "type": "enum",
33 "symbols": [
34 "hg19",
35 "hg38"
36 ],
37 "name": "genome_build"
38 }
39 ],
40 "label": "Genome build",
41 "doc": "Genome build for the genotypes in the GDS file (hg19 or hg38). Used to divide the genome into segments for parallel processing.",
42 "sbg:toolDefaultValue": "hg38",
43 "sbg:x": -517,
44 "sbg:y": 5
45 },
46 {
47 "id": "variant_group_files",
48 "sbg:fileTypes": "RDATA",
49 "type": "File[]",
50 "label": "Variant group files",
51 "doc": "RData file with data.frame defining aggregate groups. If aggregate_type is allele, columns should be group_id, chr, pos, ref, alt. If aggregate_type is position, columns should be group_id, chr, start, end. Files may be separated by chromosome with ‘chr##’ string corresponding to each GDS file. If Variant Include file is specified, groups will be subset to included variants.",
52 "sbg:x": -282,
53 "sbg:y": 94
54 },
55 {
56 "id": "group_id",
57 "type": "string?",
58 "label": "Group ID",
59 "doc": "Alternate name for group_id column in Variant Group file.",
60 "sbg:x": -397,
61 "sbg:y": 190
62 },
63 {
64 "id": "aggregate_type",
65 "type": [
66 "null",
67 {
68 "type": "enum",
69 "symbols": [
70 "position",
71 "allele"
72 ],
73 "name": "aggregate_type"
74 }
75 ],
76 "label": "Aggregate type",
77 "doc": "Type of aggregate grouping. Options are to select variants by allele (unique variants) or position (regions of interest).",
78 "sbg:toolDefaultValue": "allele",
79 "sbg:x": -281,
80 "sbg:y": 269
81 },
82 {
83 "id": "weight_user",
84 "type": "string?",
85 "label": "Weight user",
86 "doc": "Name of column in variant_weight_file or variant_group_file containing the weight for each variant. Overrides Weight beta.",
87 "sbg:x": 184.77578735351562,
88 "sbg:y": 321.9588928222656
89 },
90 {
91 "id": "weight_beta",
92 "type": "string?",
93 "label": "Weight Beta",
94 "doc": "Parameters of the Beta distribution used to determine variant weights based on minor allele frequency; two space delimited values. \"1 1\" is flat (uniform) weights, \"0.5 0.5\" is proportional to the Madsen-Browning weights, and \"1 25\" gives the Wu weights. This parameter is ignored if weight_user is provided.",
95 "sbg:toolDefaultValue": "1 1",
96 "sbg:x": 84,
97 "sbg:y": 369.5714416503906
98 },
99 {
100 "id": "variant_weight_file",
101 "sbg:fileTypes": "RDATA",
102 "type": "File?",
103 "label": "Variant Weight file",
104 "doc": "RData file(s) with data.frame specifying variant weights. Columns should contain either variant.id or all of (chr, pos, ref, alt). Files may be separated by chromosome with ‘chr##’ string corresponding to each GDS file. If not provided, all variants will be given equal weight in the test.",
105 "sbg:x": -11.714285850524902,
106 "sbg:y": 415.8571472167969
107 },
108 {
109 "id": "test",
110 "type": [
111 "null",
112 {
113 "type": "enum",
114 "symbols": [
115 "burden",
116 "skat",
117 "smmat",
118 "fastskat",
119 "skato"
120 ],
121 "name": "test"
122 }
123 ],
124 "label": "Test",
125 "doc": "Test to perform. Options are burden, SKAT, SMMAT, fast-SKAT, or SKAT-O.",
126 "sbg:toolDefaultValue": "Burden",
127 "sbg:x": 189.85714721679688,
128 "sbg:y": 460
129 },
130 {
131 "id": "rho",
132 "type": "float[]?",
133 "label": "Rho",
134 "doc": "A numeric value or list of values in range [0,1] specifying the rho parameter when test is SKAT-O. 0 is a standard SKAT test, 1 is a burden test, and intermediate values are a weighted combination of both.",
135 "sbg:toolDefaultValue": "0",
136 "sbg:x": 91.28571319580078,
137 "sbg:y": 511.7142639160156
138 },
139 {
140 "id": "phenotype_file",
141 "sbg:fileTypes": "RDATA",
142 "type": "File",
143 "label": "Phenotype file",
144 "doc": "RData file with an AnnotatedDataFrame of phenotypes and covariates. Sample identifiers must be in column named “sample.id”. It is recommended to use the phenotype file output by the GENESIS Null Model app.",
145 "sbg:x": -10.142857551574707,
146 "sbg:y": 556.7142944335938
147 },
148 {
149 "id": "pass_only",
150 "type": "boolean?",
151 "label": "Pass only",
152 "doc": "TRUE to select only variants with FILTER=PASS. If FALSE, variants that failed the quality filter will be included in the test.",
153 "sbg:toolDefaultValue": "TRUE",
154 "sbg:x": 193.14285278320312,
155 "sbg:y": 603.1428833007812
156 },
157 {
158 "id": "null_model_file",
159 "sbg:fileTypes": "RDATA",
160 "type": "File",
161 "label": "Null model file",
162 "doc": "RData file containing a null model object. Run the GENESIS Null Model app to create this file.",
163 "sbg:x": 95.42857360839844,
164 "sbg:y": 650.4285888671875
165 },
166 {
167 "id": "memory_gb",
168 "type": "float?",
169 "label": "Memory GB",
170 "doc": "Memory in GB per job.",
171 "sbg:toolDefaultValue": "8",
172 "sbg:x": -8.857142448425293,
173 "sbg:y": 694
174 },
175 {
176 "id": "cpu",
177 "type": "int?",
178 "label": "CPU",
179 "doc": "Number of CPUs for each job.",
180 "sbg:toolDefaultValue": "1",
181 "sbg:x": 194.2857208251953,
182 "sbg:y": 741.5714111328125
183 },
184 {
185 "id": "alt_freq_max",
186 "type": "float?",
187 "label": "Alt Freq Max",
188 "doc": "Maximum alternate allele frequency of variants to include in the test. Default: 1 (no filtering of variants by frequency).",
189 "sbg:toolDefaultValue": "1",
190 "sbg:x": 100.28571319580078,
191 "sbg:y": 786.1428833007812
192 },
193 {
194 "id": "thin_npoints",
195 "type": "int?",
196 "label": "Thin N points",
197 "doc": "Number of points in each bin after thinning.",
198 "sbg:toolDefaultValue": "10000",
199 "sbg:x": 1115.013427734375,
200 "sbg:y": 307.66204833984375
201 },
202 {
203 "id": "thin_nbins",
204 "type": "int?",
205 "label": "Thin N bins",
206 "doc": "Number of bins to use for thinning.",
207 "sbg:toolDefaultValue": "10",
208 "sbg:x": 954.8375854492188,
209 "sbg:y": 338.2701110839844
210 },
211 {
212 "id": "known_hits_file",
213 "sbg:fileTypes": "RDATA",
214 "type": "File?",
215 "label": "Known hits file",
216 "doc": "RData file with data.frame containing columns chr and pos. If provided, 1 Mb regions surrounding each variant listed will be omitted from the QQ and manhattan plots.",
217 "sbg:x": 1191.838134765625,
218 "sbg:y": 453.8511962890625
219 },
220 {
221 "id": "disable_thin",
222 "type": "boolean?",
223 "label": "Disable Thin",
224 "doc": "Logical for whether to thin points in the QQ and Manhattan plots. By default, points are thinned in dense regions to reduce plotting time. If this parameter is set to TRUE, all variant p-values will be included in the plots, and the plotting will be very long and memory intensive.",
225 "sbg:x": 1195.4864501953125,
226 "sbg:y": 579.4459228515625
227 },
228 {
229 "id": "input_gds_files",
230 "sbg:fileTypes": "GDS",
231 "type": "File[]",
232 "label": "GDS files",
233 "doc": "GDS files with genotype data for variants to be tested for association. If multiple files are selected, they will be run in parallel. Files separated by chromosome are expected to have ‘chr##’ strings indicating chromosome number, where ‘##’ can be (1-24, X, Y). Output files for each chromosome will include the corresponding chromosome number.",
234 "sbg:x": -516.9198608398438,
235 "sbg:y": -340.3941650390625
236 },
237 {
238 "id": "out_prefix",
239 "type": "string",
240 "label": "Output prefix",
241 "doc": "Prefix that will be included in all output files.",
242 "sbg:x": 236.25506591796875,
243 "sbg:y": 888.9534301757812
244 },
245 {
246 "id": "truncate_pval_threshold",
247 "type": "float?",
248 "label": "Truncate pval threshold",
249 "doc": "Maximum p-value to display in truncated QQ and manhattan plots.",
250 "sbg:x": 974.3208618164062,
251 "sbg:y": 581.1929931640625
252 },
253 {
254 "id": "plot_mac_threshold",
255 "type": "int?",
256 "label": "Plot MAC threshold",
257 "doc": "Minimum minor allele count for variants or aggregate units to include in plots (if different from MAC threshold).",
258 "sbg:x": 976.2490234375,
259 "sbg:y": 719.1896362304688
260 },
261 {
262 "id": "variant_include_files",
263 "sbg:fileTypes": "RData",
264 "type": "File[]?",
265 "label": "Variant Include Files",
266 "doc": "RData file containing ids of variants to be included.",
267 "sbg:x": -105.97315979003906,
268 "sbg:y": -521.52490234375
269 }
270 ],
271 "outputs": [
272 {
273 "id": "assoc_combined",
274 "outputSource": [
275 "assoc_combine_r/assoc_combined"
276 ],
277 "sbg:fileTypes": "RDATA",
278 "type": "File[]?",
279 "label": "Association test results",
280 "doc": "RData file with data.frame of association test results (test statistic, p-value, etc.) See the documentation of the GENESIS R package for detailed description of output.",
281 "sbg:x": 1454.7757568359375,
282 "sbg:y": 0.5113019943237305
283 },
284 {
285 "id": "assoc_plots",
286 "outputSource": [
287 "assoc_plots_r/assoc_plots"
288 ],
289 "sbg:fileTypes": "PNG",
290 "type": "File[]?",
291 "label": "Association test plots",
292 "doc": "QQ and Manhattan Plots of p-values in association test results.",
293 "sbg:x": 1609.0615234375,
294 "sbg:y": 196.2255859375
295 }
296 ],
297 "steps": [
298 {
299 "id": "define_segments_r",
300 "in": [
301 {
302 "id": "segment_length",
303 "source": "segment_length"
304 },
305 {
306 "id": "n_segments",
307 "source": "n_segments"
308 },
309 {
310 "id": "genome_build",
311 "source": "genome_build"
312 }
313 ],
314 "out": [
315 {
316 "id": "config"
317 },
318 {
319 "id": "define_segments_output"
320 }
321 ],
322 "run": {
323 "class": "CommandLineTool",
324 "cwlVersion": "v1.1",
325 "$namespaces": {
326 "sbg": "https://sevenbridges.com"
327 },
328 "id": "boris_majic/genesis-toolkit-demo/define-segments-r/6",
329 "baseCommand": [],
330 "inputs": [
331 {
332 "sbg:altPrefix": "-s",
333 "sbg:toolDefaultValue": "10000",
334 "sbg:category": "Optional parameters",
335 "id": "segment_length",
336 "type": "int?",
337 "inputBinding": {
338 "prefix": "--segment_length",
339 "shellQuote": false,
340 "position": 1
341 },
342 "label": "Segment length",
343 "doc": "Segment length in kb, used for parallelization."
344 },
345 {
346 "sbg:altPrefix": "-n",
347 "sbg:category": "Optional parameters",
348 "id": "n_segments",
349 "type": "int?",
350 "inputBinding": {
351 "prefix": "--n_segments",
352 "shellQuote": false,
353 "position": 2
354 },
355 "label": "Number of segments",
356 "doc": "Number of segments, used for parallelization (overrides Segment length). Note that this parameter defines the number of segments for the entire genome, so using this argument with selected chromosomes may result in fewer segments than you expect (and the minimum is one segment per chromosome)."
357 },
358 {
359 "sbg:toolDefaultValue": "hg38",
360 "sbg:category": "Configs",
361 "id": "genome_build",
362 "type": [
363 "null",
364 {
365 "type": "enum",
366 "symbols": [
367 "hg19",
368 "hg38"
369 ],
370 "name": "genome_build"
371 }
372 ],
373 "label": "Genome build",
374 "doc": "Genome build for the genotypes in the GDS file (hg19 or hg38). Used to divide the genome into segments for parallel processing.",
375 "default": "hg38"
376 }
377 ],
378 "outputs": [
379 {
380 "id": "config",
381 "doc": "Config file.",
382 "label": "Config file",
383 "type": "File?",
384 "outputBinding": {
385 "glob": "*.config"
386 },
387 "sbg:fileTypes": "CONFIG"
388 },
389 {
390 "id": "define_segments_output",
391 "doc": "Segments txt file.",
392 "label": "Segments file",
393 "type": "File?",
394 "outputBinding": {
395 "glob": "segments.txt"
396 },
397 "sbg:fileTypes": "TXT"
398 }
399 ],
400 "label": "define_segments.R",
401 "arguments": [
402 {
403 "prefix": "",
404 "separate": false,
405 "shellQuote": false,
406 "position": 100,
407 "valueFrom": "define_segments.config"
408 },
409 {
410 "prefix": "",
411 "shellQuote": false,
412 "position": 0,
413 "valueFrom": "Rscript /usr/local/analysis_pipeline/R/define_segments.R"
414 },
415 {
416 "prefix": "",
417 "shellQuote": false,
418 "position": 100,
419 "valueFrom": "${\n return ' >> job.out.log'\n}"
420 }
421 ],
422 "requirements": [
423 {
424 "class": "ShellCommandRequirement"
425 },
426 {
427 "class": "DockerRequirement",
428 "dockerPull": "uwgac/topmed-master:2.10.0"
429 },
430 {
431 "class": "InitialWorkDirRequirement",
432 "listing": [
433 {
434 "entryname": "define_segments.config",
435 "entry": "${\n var argument = [];\n argument.push('out_file \"segments.txt\"')\n if(inputs.genome_build){\n argument.push('genome_build \"' + inputs.genome_build + '\"')\n }\n return argument.join('\\n')\n}",
436 "writable": false
437 }
438 ]
439 },
440 {
441 "class": "InlineJavascriptRequirement"
442 }
443 ],
444 "hints": [
445 {
446 "class": "sbg:SaveLogs",
447 "value": "job.out.log"
448 }
449 ],
450 "sbg:projectName": "GENESIS Toolkit - DEMO",
451 "sbg:image_url": null,
452 "sbg:revisionsInfo": [
453 {
454 "sbg:revision": 0,
455 "sbg:modifiedBy": "boris_majic",
456 "sbg:modifiedOn": 1577360777,
457 "sbg:revisionNotes": null
458 },
459 {
460 "sbg:revision": 1,
461 "sbg:modifiedBy": "boris_majic",
462 "sbg:modifiedOn": 1577360800,
463 "sbg:revisionNotes": "Import from F4C"
464 },
465 {
466 "sbg:revision": 2,
467 "sbg:modifiedBy": "dajana_panovic",
468 "sbg:modifiedOn": 1594132905,
469 "sbg:revisionNotes": "Docker image update to 2.0.8"
470 },
471 {
472 "sbg:revision": 3,
473 "sbg:modifiedBy": "dajana_panovic",
474 "sbg:modifiedOn": 1602155769,
475 "sbg:revisionNotes": "Import from BDC 2.8.1 version"
476 },
477 {
478 "sbg:revision": 4,
479 "sbg:modifiedBy": "dajana_panovic",
480 "sbg:modifiedOn": 1603798568,
481 "sbg:revisionNotes": "BDC import"
482 },
483 {
484 "sbg:revision": 5,
485 "sbg:modifiedBy": "dajana_panovic",
486 "sbg:modifiedOn": 1608907204,
487 "sbg:revisionNotes": "CWLtool prep"
488 },
489 {
490 "sbg:revision": 6,
491 "sbg:modifiedBy": "dajana_panovic",
492 "sbg:modifiedOn": 1616077263,
493 "sbg:revisionNotes": "Docker updated to uwgac/topmed-master:2.10.0"
494 }
495 ],
496 "sbg:appVersion": [
497 "v1.1"
498 ],
499 "sbg:id": "h-68485524/h-f60a3b4b/h-9fa49491/0",
500 "sbg:revision": 6,
501 "sbg:revisionNotes": "Docker updated to uwgac/topmed-master:2.10.0",
502 "sbg:modifiedOn": 1616077263,
503 "sbg:modifiedBy": "dajana_panovic",
504 "sbg:createdOn": 1577360777,
505 "sbg:createdBy": "boris_majic",
506 "sbg:project": "boris_majic/genesis-toolkit-demo",
507 "sbg:sbgMaintained": false,
508 "sbg:validationErrors": [],
509 "sbg:contributors": [
510 "dajana_panovic",
511 "boris_majic"
512 ],
513 "sbg:latestRevision": 6,
514 "sbg:publisher": "sbg",
515 "sbg:content_hash": "abcb7884a4e9f96eab06afefcfd6ac9a971605d3a26b810578009f05e0f63455d"
516 },
517 "label": "define_segments.R",
518 "sbg:x": -246.3984375,
519 "sbg:y": -60
520 },
521 {
522 "id": "aggregate_list",
523 "in": [
524 {
525 "id": "variant_group_file",
526 "source": "variant_group_files"
527 },
528 {
529 "id": "aggregate_type",
530 "source": "aggregate_type"
531 },
532 {
533 "id": "group_id",
534 "source": "group_id"
535 }
536 ],
537 "out": [
538 {
539 "id": "aggregate_list"
540 },
541 {
542 "id": "config_file"
543 }
544 ],
545 "run": {
546 "class": "CommandLineTool",
547 "cwlVersion": "v1.1",
548 "$namespaces": {
549 "sbg": "https://sevenbridges.com"
550 },
551 "id": "boris_majic/genesis-toolkit-demo/aggregate-list/6",
552 "baseCommand": [],
553 "inputs": [
554 {
555 "sbg:category": "Input File",
556 "id": "variant_group_file",
557 "type": "File",
558 "label": "Variant group file",
559 "doc": "RData file with data frame defining aggregate groups. If aggregate_type is allele, columns should be group_id, chromosome, position, ref, alt. If aggregate_type is position, columns should be group_id, chromosome, start, end.",
560 "sbg:fileTypes": "RDATA"
561 },
562 {
563 "sbg:toolDefaultValue": "allele",
564 "sbg:category": "Input Options",
565 "id": "aggregate_type",
566 "type": [
567 "null",
568 {
569 "type": "enum",
570 "symbols": [
571 "position",
572 "allele"
573 ],
574 "name": "aggregate_type"
575 }
576 ],
577 "label": "Aggregate type",
578 "doc": "Type of aggregate grouping. Options are to select variants by allele (unique variants) or position (regions of interest). Default is allele."
579 },
580 {
581 "sbg:toolDefaultValue": "aggregate_list.RData",
582 "id": "out_file",
583 "type": "string?",
584 "label": "Out file",
585 "doc": "Out file."
586 },
587 {
588 "sbg:category": "General",
589 "sbg:toolDefaultValue": "group_id",
590 "id": "group_id",
591 "type": "string?",
592 "label": "Group ID",
593 "doc": "Alternate name for group_id column."
594 }
595 ],
596 "outputs": [
597 {
598 "id": "aggregate_list",
599 "label": "Aggregate list",
600 "type": "File?",
601 "outputBinding": {
602 "glob": "${\n var comm;\n \n if (!inputs.variant_group_file.basename.includes('chr'))\n {\n return '*RData'\n }\n if (!inputs.out_file) {\n comm = \"aggregate_list*.RData\"\n } else {\n \tcomm = inputs.out_file + \".RData\"\n }\n return comm\n}"
603 },
604 "sbg:fileTypes": "RDATA"
605 },
606 {
607 "id": "config_file",
608 "label": "Config file",
609 "type": "File?",
610 "outputBinding": {
611 "glob": "*.config"
612 },
613 "sbg:fileTypes": "CONFIG"
614 }
615 ],
616 "label": "aggregate_list",
617 "arguments": [
618 {
619 "prefix": "",
620 "shellQuote": false,
621 "position": 0,
622 "valueFrom": "${ \n var cmd_line;\n if (inputs.variant_group_file)\n {\n cmd_line = \"cp \" + inputs.variant_group_file.path + \" \" + inputs.variant_group_file.basename\n \n return cmd_line\n }\n else\n {\n return \"echo variant group file not provided\"\n }\n}"
623 },
624 {
625 "prefix": "",
626 "shellQuote": false,
627 "position": 0,
628 "valueFrom": "${\n var cmd_line = \"\";\n if (inputs.variant_group_file)\n {\n cmd_line = \"&& Rscript /usr/local/analysis_pipeline/R/aggregate_list.R aggregate_list.config \"\n }\n\n function find_chromosome(file){\n var chr_array = [];\n var chrom_num = file.split(\"chr\")[1];\n \n if(isNumeric(chrom_num.charAt(1)))\n {\n chr_array.push(chrom_num.substr(0,2))\n }\n else\n {\n chr_array.push(chrom_num.substr(0,1))\n }\n return chr_array.toString()\n }\n \n function isNumeric(s) {\n return !isNaN(s - parseFloat(s));\n }\n \n var chromosome;\n \n if (inputs.variant_group_file.basename.includes('chr'))\n {\n\t // Use the basename so a 'chr' in a directory name is not mistaken for the chromosome tag.\n\t chromosome = find_chromosome(inputs.variant_group_file.basename)\n cmd_line += \"--chromosome \" + chromosome \n return cmd_line\n }\n return ''\n \n \n}"
629 },
630 {
631 "prefix": "",
632 "shellQuote": false,
633 "position": 100,
634 "valueFrom": "${\n return ' >> job.out.log'\n}"
635 }
636 ],
637 "requirements": [
638 {
639 "class": "ShellCommandRequirement"
640 },
641 {
642 "class": "DockerRequirement",
643 "dockerPull": "uwgac/topmed-master:2.10.0"
644 },
645 {
646 "class": "InitialWorkDirRequirement",
647 "listing": [
648 {
649 "entryname": "aggregate_list.config",
650 "entry": "${\n function find_chromosome(file){\n var chr_array = [];\n var chrom_num = file.split(\"/\").pop();\n chrom_num = chrom_num.split(\"chr\")[1]\n \n if(isNumeric(chrom_num.charAt(1)))\n {\n chr_array.push(chrom_num.substr(0,2))\n }\n else\n {\n chr_array.push(chrom_num.substr(0,1))\n }\n return chr_array.toString();\n }\n \n function isNumeric(s) {\n return !isNaN(s - parseFloat(s));\n }\n \n var argument = [];\n \n \n if (inputs.variant_group_file.basename.includes('chr'))\n { \n var chr = find_chromosome(inputs.variant_group_file.path);\n var chromosomes_basename = inputs.variant_group_file.path.slice(0,-6).replace(/\\/.+\\//g,\"\");\n \n var i; \n for(i = chromosomes_basename.length - 1; i > 0; i--)\n if(chromosomes_basename[i] != 'X' && chromosomes_basename[i] != \"Y\" && isNaN(chromosomes_basename[i]))\n break;\n chromosomes_basename = inputs.variant_group_file.basename.split('chr'+chr)[0]+\"chr \"+ inputs.variant_group_file.basename.split('chr'+chr)[1]\n \n argument.push('variant_group_file \"' + chromosomes_basename + '\"')\n }\n else\n {\n argument.push('variant_group_file \"' + inputs.variant_group_file.basename + '\"')\n }\n \n if(inputs.out_file)\n if (inputs.variant_group_file.basename.includes('chr')){\n \n argument.push('out_file \"' + inputs.out_file + ' .RData\"')}\n else\n argument.push('out_file \"' + inputs.out_file + '.RData\"')\n else\n {\n if (inputs.variant_group_file.basename.includes('chr'))\n argument.push('out_file \"aggregate_list_chr .RData\"')\n else\n argument.push('out_file aggregate_list.RData')\n }\n \n\n if(inputs.aggregate_type)\n argument.push('aggregate_type \"' + inputs.aggregate_type + '\"')\n \n if(inputs.group_id)\n argument.push('group_id \"' + inputs.group_id + '\"')\n \n return argument.join('\\n') + '\\n'\n\n}",
651 "writable": false
652 }
653 ]
654 },
655 {
656 "class": "InlineJavascriptRequirement"
657 }
658 ],
659 "hints": [
660 {
661 "class": "sbg:SaveLogs",
662 "value": "job.out.log"
663 }
664 ],
665 "sbg:projectName": "GENESIS Toolkit - DEMO",
666 "sbg:image_url": null,
667 "sbg:revisionsInfo": [
668 {
669 "sbg:revision": 0,
670 "sbg:modifiedBy": "boris_majic",
671 "sbg:modifiedOn": 1577360995,
672 "sbg:revisionNotes": null
673 },
674 {
675 "sbg:revision": 1,
676 "sbg:modifiedBy": "boris_majic",
677 "sbg:modifiedOn": 1577361017,
678 "sbg:revisionNotes": "Import from F4C"
679 },
680 {
681 "sbg:revision": 2,
682 "sbg:modifiedBy": "dajana_panovic",
683 "sbg:modifiedOn": 1584373405,
684 "sbg:revisionNotes": "GDS filename corrected"
685 },
686 {
687 "sbg:revision": 3,
688 "sbg:modifiedBy": "dajana_panovic",
689 "sbg:modifiedOn": 1594133165,
690 "sbg:revisionNotes": "Docker image update to 2.8.0"
691 },
692 {
693 "sbg:revision": 4,
694 "sbg:modifiedBy": "dajana_panovic",
695 "sbg:modifiedOn": 1602155556,
696 "sbg:revisionNotes": "Import from BDC 2.8.1 version"
697 },
698 {
699 "sbg:revision": 5,
700 "sbg:modifiedBy": "dajana_panovic",
701 "sbg:modifiedOn": 1608906991,
702 "sbg:revisionNotes": "CWLtool prep"
703 },
704 {
705 "sbg:revision": 6,
706 "sbg:modifiedBy": "dajana_panovic",
707 "sbg:modifiedOn": 1616077425,
708 "sbg:revisionNotes": "Docker updated to uwgac/topmed-master:2.10.0"
709 }
710 ],
711 "sbg:appVersion": [
712 "v1.1"
713 ],
714 "sbg:id": "h-19602971/h-1017d745/h-c50a0c99/0",
715 "sbg:revision": 6,
716 "sbg:revisionNotes": "Docker updated to uwgac/topmed-master:2.10.0",
717 "sbg:modifiedOn": 1616077425,
718 "sbg:modifiedBy": "dajana_panovic",
719 "sbg:createdOn": 1577360995,
720 "sbg:createdBy": "boris_majic",
721 "sbg:project": "boris_majic/genesis-toolkit-demo",
722 "sbg:sbgMaintained": false,
723 "sbg:validationErrors": [],
724 "sbg:contributors": [
725 "dajana_panovic",
726 "boris_majic"
727 ],
728 "sbg:latestRevision": 6,
729 "sbg:publisher": "sbg",
730 "sbg:content_hash": "af4437d8ee5d6c1a9ea398abbfd0aedc04573ae93290c94d5be44b1ab3615c39a"
731 },
732 "label": "Aggregate List",
733 "scatter": [
734 "variant_group_file"
735 ],
736 "sbg:x": -96,
737 "sbg:y": 191
738 },
739 {
740 "id": "assoc_aggregate",
741 "in": [
742 {
743 "id": "gds_file",
744 "linkMerge": "merge_flattened",
745 "source": [
746 "sbg_prepare_segments_1/gds_output"
747 ],
748 "valueFrom": "$(self ? [].concat(self)[0] : self)"
749 },
750 {
751 "id": "null_model_file",
752 "source": "null_model_file"
753 },
754 {
755 "id": "phenotype_file",
756 "source": "phenotype_file"
757 },
758 {
759 "id": "aggregate_variant_file",
760 "linkMerge": "merge_flattened",
761 "source": [
762 "sbg_prepare_segments_1/aggregate_output"
763 ],
764 "valueFrom": "$(self ? [].concat(self)[0] : self)"
765 },
766 {
767 "id": "out_prefix",
768 "source": "out_prefix"
769 },
770 {
771 "id": "rho",
772 "source": [
773 "rho"
774 ]
775 },
776 {
777 "id": "segment_file",
778 "linkMerge": "merge_flattened",
779 "source": [
780 "define_segments_r/define_segments_output"
781 ],
782 "valueFrom": "$(self ? [].concat(self)[0] : self)"
783 },
784 {
785 "id": "test",
786 "source": "test"
787 },
788 {
789 "id": "variant_include_file",
790 "linkMerge": "merge_flattened",
791 "source": [
792 "sbg_prepare_segments_1/variant_include_output"
793 ],
794 "valueFrom": "$(self ? [].concat(self)[0] : self)"
795 },
796 {
797 "id": "weight_beta",
798 "source": "weight_beta"
799 },
800 {
801 "id": "segment",
802 "linkMerge": "merge_flattened",
803 "source": [
804 "sbg_prepare_segments_1/segments"
805 ],
806 "valueFrom": "$(self ? [].concat(self)[0] : self)"
807 },
808 {
809 "id": "aggregate_type",
810 "source": "aggregate_type"
811 },
812 {
813 "id": "alt_freq_max",
814 "source": "alt_freq_max"
815 },
816 {
817 "id": "pass_only",
818 "source": "pass_only"
819 },
820 {
821 "id": "variant_weight_file",
822 "source": "variant_weight_file"
823 },
824 {
825 "id": "weight_user",
826 "source": "weight_user"
827 },
828 {
829 "id": "cpu",
830 "source": "cpu"
831 },
832 {
833 "id": "memory_gb",
834 "source": "memory_gb"
835 },
836 {
837 "id": "genome_build",
838 "source": "genome_build"
839 }
840 ],
841 "out": [
842 {
843 "id": "assoc_aggregate"
844 },
845 {
846 "id": "config"
847 }
848 ],
849 "run": {
850 "class": "CommandLineTool",
851 "cwlVersion": "v1.1",
852 "$namespaces": {
853 "sbg": "https://sevenbridges.com"
854 },
855 "id": "boris_majic/genesis-toolkit-demo/assoc-aggregate/7",
856 "baseCommand": [],
857 "inputs": [
858 {
859 "sbg:category": "Input File",
860 "id": "gds_file",
861 "type": "File",
862 "label": "GDS file",
863 "doc": "GDS File.",
864 "sbg:fileTypes": "GDS"
865 },
866 {
867 "sbg:category": "Input File",
868 "id": "null_model_file",
869 "type": "File",
870 "label": "Null model file",
871 "doc": "Null model file.",
872 "sbg:fileTypes": "RDATA"
873 },
874 {
875 "sbg:category": "Input File",
876 "id": "phenotype_file",
877 "type": "File",
878 "label": "Phenotype file",
879 "doc": "RData file with AnnotatedDataFrame of phenotypes. Used for plotting kinship estimates separately by study.",
880 "sbg:fileTypes": "RDATA"
881 },
882 {
883 "id": "aggregate_variant_file",
884 "type": "File",
885 "label": "Aggregate variant file",
886 "doc": "File with regions that we want to test.",
887 "sbg:fileTypes": "RDATA"
888 },
889 {
890 "sbg:toolDefaultValue": "assoc_aggregate",
891 "sbg:category": "Inputs",
892 "id": "out_prefix",
893 "type": "string?",
894 "label": "Output prefix",
895 "doc": "Output prefix."
896 },
897 {
898 "sbg:toolDefaultValue": "0",
899 "id": "rho",
900 "type": "float[]?",
901 "label": "Rho",
902 "doc": "A numeric value or list of values in range [0,1] specifying the rho parameter when test is skat. 0 is a standard SKAT test, 1 is a score burden test, and multiple values is a SKAT-O test."
903 },
904 {
905 "sbg:category": "Input files",
906 "id": "segment_file",
907 "type": "File?",
908 "label": "Segment File",
909 "doc": "Segment File.",
910 "sbg:fileTypes": "TXT"
911 },
912 {
913 "sbg:toolDefaultValue": "burden",
914 "id": "test",
915 "type": [
916 "null",
917 {
918 "type": "enum",
919 "symbols": [
920 "burden",
921 "skat",
922 "smmat",
923 "fastskat",
924 "skato"
925 ],
926 "name": "test"
927 }
928 ],
929 "label": "Test type",
930 "doc": "Test to perform. Options are burden, skat, smmat, fastskat, or skato."
931 },
932 {
933 "id": "variant_include_file",
934 "type": "File?",
935 "label": "Variant Include File",
936 "doc": "Variants to include when performing the tests.",
937 "sbg:fileTypes": "RDATA"
938 },
939 {
940 "sbg:toolDefaultValue": "1 1",
941 "id": "weight_beta",
942 "type": "string?",
943 "label": "Weight Beta",
944 "doc": "Parameters of the Beta distribution used to determine variant weights, two space delimited values. \"1 1\" is flat weights, \"0.5 0.5\" is proportional to the Madsen-Browning weights, and \"1 25\" gives the Wu weights. This parameter is ignored if weight_user is provided."
945 },
946 {
947 "sbg:category": "Input Options",
948 "id": "segment",
949 "type": "int?",
950 "inputBinding": {
951 "prefix": "--segment",
952 "shellQuote": false,
953 "position": 10
954 },
955 "label": "Segment number",
956 "doc": "Segment Number"
957 },
958 {
959 "sbg:toolDefaultValue": "allele",
960 "id": "aggregate_type",
961 "type": [
962 "null",
963 {
964 "type": "enum",
965 "symbols": [
966 "allele",
967 "position"
968 ],
969 "name": "aggregate_type"
970 }
971 ],
972 "label": "Aggregate type",
973 "doc": "Type of aggregate grouping. Options are to select variants by allele (unique variants) or position (regions of interest)."
974 },
975 {
976 "sbg:toolDefaultValue": "1",
977 "id": "alt_freq_max",
978 "type": "float?",
979 "label": "Alt Freq Max",
980 "doc": "Maximum alternate allele frequency to consider."
981 },
982 {
983 "sbg:toolDefaultValue": "TRUE",
984 "id": "pass_only",
985 "type": "boolean?",
986 "label": "Pass only",
987 "doc": "TRUE to select only variants with FILTER=PASS."
988 },
989 {
990 "id": "variant_weight_file",
991 "type": "File?",
992 "label": "Variant Weight file",
993 "doc": "Variant Weight file."
994 },
995 {
996 "id": "weight_user",
997 "type": "string?",
998 "label": "Weight user",
999 "doc": "Name of column in variant_weight_file or variant_group_file containing the weight for each variant."
1000 },
1001 {
1002 "sbg:category": "Input options",
1003 "sbg:toolDefaultValue": "1",
1004 "id": "cpu",
1005 "type": "int?",
1006 "label": "CPU",
1007 "doc": "Number of CPUs for each tool job. Default value: 1."
1008 },
1009 {
1010 "sbg:category": "Input options",
1011 "sbg:toolDefaultValue": "8",
1012 "id": "memory_gb",
1013 "type": "float?",
1014 "label": "memory GB",
1015 "doc": "Memory in GB per job. Default value: 8."
1016 },
1017 {
1018 "id": "genome_build",
1019 "type": [
1020 "null",
1021 {
1022 "type": "enum",
1023 "symbols": [
1024 "hg19",
1025 "hg38"
1026 ],
1027 "name": "genome_build"
1028 }
1029 ],
1030 "label": "Genome build",
1031 "doc": "Genome build for the genotypes in the GDS file (hg19 or hg38). Used to divide the genome into segments for parallel processing."
1032 }
1033 ],
1034 "outputs": [
1035 {
1036 "id": "assoc_aggregate",
1037 "label": "Assoc aggregate output",
1038 "type": "File?",
1039 "outputBinding": {
1040 "glob": "${\n /*\n if (!inputs.out_prefix) {\n comm = \"assoc_aggregate*.RData\"\n } else {\n \tcomm = inputs.out_prefix + \".RData\"\n }\n return comm */\n return \"*.RData\"\n}"
1041 },
1042 "sbg:fileTypes": "RDATA"
1043 },
1044 {
1045 "id": "config",
1046 "label": "Config file",
1047 "type": "File?",
1048 "outputBinding": {
1049 "glob": "*.config"
1050 },
1051 "sbg:fileTypes": "CONFIG"
1052 }
1053 ],
1054 "label": "assoc_aggregate",
1055 "arguments": [
1056 {
1057 "prefix": "",
1058 "shellQuote": false,
1059 "position": 1,
1060 "valueFrom": "${\n var cmd_line = \"Rscript /usr/local/analysis_pipeline/R/assoc_aggregate.R assoc_aggregate.config \";\n\n function isNumeric(s) {\n\n return !isNaN(s - parseFloat(s));\n }\n\n function find_chromosome(file){\n\n var chr_array = [];\n var chrom_num = file.split(\"/\").pop();\n chrom_num = chrom_num.split(\".\")[0]\n \n if(isNumeric(chrom_num.charAt(chrom_num.length-2)))\n {\n chr_array.push(chrom_num.substr(chrom_num.length - 2))\n }\n else\n {\n chr_array.push(chrom_num.substr(chrom_num.length - 1))\n }\n\n return chr_array.toString()\n }\n \n// \tchromosome = find_chromosome(inputs.gds_file.path)\n// cmd_line += \"--chromosome \" + chromosome \n return cmd_line\n \n}"
1061 },
1062 {
1063 "prefix": "",
1064 "shellQuote": false,
1065 "position": 0,
1066 "valueFrom": "${\n if (inputs.cpu)\n return 'export NSLOTS=' + inputs.cpu + ' &&'\n else\n return ''\n}"
1067 },
1068 {
1069 "prefix": "",
1070 "shellQuote": false,
1071 "position": 100,
1072 "valueFrom": "${\n return ' >> job.out.log'\n}"
1073 }
1074 ],
1075 "requirements": [
1076 {
1077 "class": "ShellCommandRequirement"
1078 },
1079 {
1080 "class": "ResourceRequirement",
1081 "ramMin": "${\n if(inputs.memory_gb)\n return parseFloat(inputs.memory_gb * 1024)\n else\n return 8*1024\n}",
1082 "coresMin": "${ if(inputs.cpu)\n return inputs.cpu \n else \n return 1\n}"
1083 },
1084 {
1085 "class": "DockerRequirement",
1086 "dockerPull": "uwgac/topmed-master:2.10.0"
1087 },
1088 {
1089 "class": "InitialWorkDirRequirement",
1090 "listing": [
1091 {
1092 "entryname": "assoc_aggregate.config",
1093 "entry": "${\n function isNumeric(s) {\n return !isNaN(s - parseFloat(s));\n }\n function find_chromosome(file){\n var chr_array = [];\n var chrom_num = file.split(\"chr\")[1];\n \n if(isNumeric(chrom_num.charAt(1)))\n {\n chr_array.push(chrom_num.substr(0,2))\n }\n else\n {\n chr_array.push(chrom_num.substr(0,1))\n }\n return chr_array.toString()\n }\n \n \n var chr = find_chromosome(inputs.gds_file.path);\n \n var config = \"\";\n \n if(inputs.out_prefix)\n config += \"out_prefix \\\"\" + inputs.out_prefix + \"_chr\"+chr + \"\\\"\\n\"\n else\n {\n var data_prefix = inputs.gds_file.basename.split('chr');\n var data_prefix2 = inputs.gds_file.basename.split('.chr');\n \n if (data_prefix.length == data_prefix2.length)\n config += 'out_prefix \"' + data_prefix2[0] + '_aggregate_chr' + chr + inputs.gds_file.basename.split('chr'+chr)[1].split('.gds')[0] +'\"'+ \"\\n\";\n else\n config += 'out_prefix \"' + data_prefix[0] + 'aggregate_chr' + chr +inputs.gds_file.basename.split('chr'+chr)[1].split('.gds')[0]+'\"' + \"\\n\";\n } \n if(inputs.gds_file)\n config += \"gds_file \\\"\" + inputs.gds_file.path + \"\\\"\\n\"\n if(inputs.phenotype_file)\n config += \"phenotype_file \\\"\" + inputs.phenotype_file.path + \"\\\"\\n\"\n if(inputs.aggregate_variant_file)\n config += \"aggregate_variant_file \\\"\" + inputs.aggregate_variant_file.path + \"\\\"\\n\"\n if(inputs.null_model_file)\n config += \"null_model_file \\\"\" + inputs.null_model_file.path + \"\\\"\\n\"\n if(inputs.null_model_params)\n config += \"null_model_params \\\"\" + inputs.null_model_params.path + \"\\\"\\n\"\n if(inputs.rho)\n {\n config += \"rho \\\"\";\n for(var i=0; i<inputs.rho.length; i++)\n {\n config += inputs.rho[i].toString() + \" \";\n }\n config += \"\\\"\\n\";\n }\n if(inputs.segment_file)\n config += \"segment_file \\\"\" + inputs.segment_file.path + \"\\\"\\n\"\n if(inputs.test)\n config += \"test \\\"\" + inputs.test + \"\\\"\\n\"\n if(inputs.test_type)\n config += \"test_type \\\"\" + inputs.test_type + \"\\\"\\n\"\n if(inputs.variant_include_file)\n config +=\"variant_include_file \\\"\" + inputs.variant_include_file.path + \"\\\"\\n\"\n if(inputs.weight_beta)\n config +=\"weight_beta \\\"\" + inputs.weight_beta + \"\\\"\\n\"\n if(inputs.aggregate_type)\n config +=\"aggregate_type \\\"\" + inputs.aggregate_type + \"\\\"\\n\"\n if(inputs.alt_freq_max)\n config +=\"alt_freq_max \" + inputs.alt_freq_max + \"\\n\"\n if(!inputs.pass_only)\n config += \"pass_only FALSE\"+ \"\\n\"\n if(inputs.variant_weight_file)\n config +=\"variant_weight_file \\\"\" + inputs.variant_weight_file.path + \"\\\"\\n\"\n if(inputs.weight_user)\n config +=\"weight_user \\\"\" + inputs.weight_user + \"\\\"\\n\"\n if(inputs.genome_build)\n config +=\"genome_build \\\"\" + inputs.genome_build + \"\\\"\\n\" \n \n return config\n}",
1094 "writable": false
1095 }
1096 ]
1097 },
1098 {
1099 "class": "InlineJavascriptRequirement"
1100 }
1101 ],
1102 "hints": [
1103 {
1104 "class": "sbg:SaveLogs",
1105 "value": "job.out.log"
1106 }
1107 ],
1108 "sbg:revisionsInfo": [
1109 {
1110 "sbg:revision": 0,
1111 "sbg:modifiedBy": "boris_majic",
1112 "sbg:modifiedOn": 1577360633,
1113 "sbg:revisionNotes": null
1114 },
1115 {
1116 "sbg:revision": 1,
1117 "sbg:modifiedBy": "boris_majic",
1118 "sbg:modifiedOn": 1577360681,
1119 "sbg:revisionNotes": "Import from F4C"
1120 },
1121 {
1122 "sbg:revision": 2,
1123 "sbg:modifiedBy": "dajana_panovic",
1124 "sbg:modifiedOn": 1584373824,
1125 "sbg:revisionNotes": "GDS filename corrected"
1126 },
1127 {
1128 "sbg:revision": 3,
1129 "sbg:modifiedBy": "dajana_panovic",
1130 "sbg:modifiedOn": 1594133421,
1131 "sbg:revisionNotes": "Docker image update 2.8.0"
1132 },
1133 {
1134 "sbg:revision": 4,
1135 "sbg:modifiedBy": "dajana_panovic",
1136 "sbg:modifiedOn": 1602155215,
1137 "sbg:revisionNotes": "Import from BDC 2.8.1 version"
1138 },
1139 {
1140 "sbg:revision": 5,
1141 "sbg:modifiedBy": "dajana_panovic",
1142 "sbg:modifiedOn": 1603797838,
1143 "sbg:revisionNotes": "BDC import"
1144 },
1145 {
1146 "sbg:revision": 6,
1147 "sbg:modifiedBy": "dajana_panovic",
1148 "sbg:modifiedOn": 1608907090,
1149 "sbg:revisionNotes": "CWLtool prep"
1150 },
1151 {
1152 "sbg:revision": 7,
1153 "sbg:modifiedBy": "dajana_panovic",
1154 "sbg:modifiedOn": 1616077315,
1155 "sbg:revisionNotes": "Docker updated to uwgac/topmed-master:2.10.0"
1156 }
1157 ],
1158 "sbg:projectName": "GENESIS Toolkit - DEMO",
1159 "sbg:image_url": null,
1160 "sbg:appVersion": [
1161 "v1.1"
1162 ],
1163 "sbg:id": "h-e8ef821e/h-085ef037/h-8b1376d9/0",
1164 "sbg:revision": 7,
1165 "sbg:revisionNotes": "Docker updated to uwgac/topmed-master:2.10.0",
1166 "sbg:modifiedOn": 1616077315,
1167 "sbg:modifiedBy": "dajana_panovic",
1168 "sbg:createdOn": 1577360633,
1169 "sbg:createdBy": "boris_majic",
1170 "sbg:project": "boris_majic/genesis-toolkit-demo",
1171 "sbg:sbgMaintained": false,
1172 "sbg:validationErrors": [],
1173 "sbg:contributors": [
1174 "dajana_panovic",
1175 "boris_majic"
1176 ],
1177 "sbg:latestRevision": 7,
1178 "sbg:publisher": "sbg",
1179 "sbg:content_hash": "a1fc9237f50620307a4def5c7ca276c1f610d84c729d5399e0e6d5b8a3da2ae5b"
1180 },
1181 "label": "Association Testing Aggregate",
1182 "scatter": [
1183 "gds_file",
1184 "aggregate_variant_file",
1185 "variant_include_file",
1186 "segment"
1187 ],
1188 "scatterMethod": "dotproduct",
1189 "sbg:x": 656,
1190 "sbg:y": 120
1191 },
1192 {
1193 "id": "assoc_combine_r",
1194 "in": [
1195 {
1196 "id": "chromosome",
1197 "source": [
1198 "sbg_group_segments_1/chromosome"
1199 ],
1200 "valueFrom": "$(self ? [].concat(self) : self)"
1201 },
1202 {
1203 "id": "assoc_type",
1204 "default": "aggregate"
1205 },
1206 {
1207 "id": "assoc_files",
1208 "source": [
1209 "sbg_group_segments_1/grouped_assoc_files"
1210 ],
1211 "valueFrom": "$(self ? [].concat(self) : self)"
1212 }
1213 ],
1214 "out": [
1215 {
1216 "id": "assoc_combined"
1217 },
1218 {
1219 "id": "configs"
1220 }
1221 ],
1222 "run": {
1223 "class": "CommandLineTool",
1224 "cwlVersion": "v1.1",
1225 "$namespaces": {
1226 "sbg": "https://sevenbridges.com"
1227 },
1228 "id": "boris_majic/genesis-toolkit-demo/assoc-combine-r/7",
1229 "baseCommand": [],
1230 "inputs": [
1231 {
1232 "sbg:altPrefix": "-c",
1233 "sbg:category": "Optional inputs",
1234 "id": "chromosome",
1235 "type": "string[]?",
1236 "inputBinding": {
1237 "prefix": "--chromosome",
1238 "shellQuote": false,
1239 "position": 10
1240 },
1241 "label": "Chromosome",
1242 "doc": "Chromosome (1-24 or X,Y)."
1243 },
1244 {
1245 "id": "assoc_type",
1246 "type": {
1247 "type": "enum",
1248 "symbols": [
1249 "single",
1250 "aggregate",
1251 "window"
1252 ],
1253 "name": "assoc_type"
1254 },
1255 "label": "Association Type",
1256 "doc": "Type of association test: single, window or aggregate."
1257 },
1258 {
1259 "id": "assoc_files",
1260 "type": "File[]",
1261 "label": "Association files",
1262 "doc": "Association files to be combined.",
1263 "sbg:fileTypes": "RDATA"
1264 },
1265 {
1266 "id": "out_prefix",
1267 "type": "string?",
1268 "label": "Out Prefix",
1269 "doc": "Output prefix."
1270 },
1271 {
1272 "sbg:category": "Input options",
1273 "sbg:toolDefaultValue": "4",
1274 "id": "memory_gb",
1275 "type": "float?",
1276 "label": "memory GB",
1277 "doc": "Memory in GB per job. Default value: 4."
1278 },
1279 {
1280 "sbg:category": "Input Options",
1281 "sbg:toolDefaultValue": "1",
1282 "id": "cpu",
1283 "type": "int?",
1284 "label": "CPU",
1285 "doc": "Number of CPUs for each tool job. Default value: 1."
1286 },
1287 {
1288 "sbg:category": "General",
1289 "id": "conditional_variant_file",
1290 "type": "File?",
1291 "label": "Conditional variant file",
1292 "doc": "RData file with a data frame of conditional variants. Columns should include chromosome (or chr) and variant.id. The alternate allele dosage of these variants will be included as covariates in the analysis.",
1293 "sbg:fileTypes": "RData, RDATA"
1294 }
1295 ],
1296 "outputs": [
1297 {
1298 "id": "assoc_combined",
1299 "doc": "Assoc combined.",
1300 "label": "Assoc combined",
1301 "type": "File?",
1302 "outputBinding": {
1303 "glob": "${\n \n //var input_files = [].concat(inputs.assoc_files);\n //var first_filename = input_files[0].basename;\n \n //var chr = first_filename.split('_chr')[1].split('_')[0].split('.RData')[0];\n \n //return first_filename.split('chr')[0]+'chr'+chr+'.RData';\n \n return '*.RData'\n}",
1304 "outputEval": "$(inheritMetadata(self, inputs.assoc_files))"
1305 },
1306 "sbg:fileTypes": "RDATA"
1307 },
1308 {
1309 "id": "configs",
1310 "doc": "Config files.",
1311 "label": "Config files",
1312 "type": "File[]?",
1313 "outputBinding": {
1314 "glob": "*config*"
1315 },
1316 "sbg:fileTypes": "CONFIG"
1317 }
1318 ],
1319 "label": "assoc_combine.R",
1320 "arguments": [
1321 {
1322 "prefix": "",
1323 "shellQuote": false,
1324 "position": 100,
1325 "valueFrom": "assoc_combine.config"
1326 },
1327 {
1328 "prefix": "",
1329 "shellQuote": false,
1330 "position": 5,
1331 "valueFrom": "Rscript /usr/local/analysis_pipeline/R/assoc_combine.R"
1332 },
1333 {
1334 "prefix": "",
1335 "shellQuote": false,
1336 "position": 1,
1337 "valueFrom": "${\n var command = '';\n var i;\n for(i=0; i<inputs.assoc_files.length; i++)\n command += \"ln -s \" + inputs.assoc_files[i].path + \" \" + inputs.assoc_files[i].path.split(\"/\").pop() + \" && \"\n \n return command\n}"
1338 },
1339 {
1340 "prefix": "",
1341 "shellQuote": false,
1342 "position": 100,
1343 "valueFrom": "${\n return ' >> job.out.log'\n}"
1344 }
1345 ],
1346 "requirements": [
1347 {
1348 "class": "ShellCommandRequirement"
1349 },
1350 {
1351 "class": "ResourceRequirement",
1352 "ramMin": "${\n if(inputs.memory_gb)\n return parseInt(inputs.memory_gb * 1024)\n else\n return 4*1024\n}",
1353 "coresMin": "${ if(inputs.cpu)\n return inputs.cpu \n else \n return 1\n}"
1354 },
1355 {
1356 "class": "DockerRequirement",
1357 "dockerPull": "uwgac/topmed-master:2.10.0"
1358 },
1359 {
1360 "class": "InitialWorkDirRequirement",
1361 "listing": [
1362 {
1363 "entryname": "assoc_combine.config",
1364 "entry": "${\n var argument = [];\n argument.push('assoc_type \"'+ inputs.assoc_type + '\"');\n var data_prefix = inputs.assoc_files[0].basename.split('_chr')[0];\n if (inputs.out_prefix)\n {\n argument.push('out_prefix \"' + inputs.out_prefix+ '\"');\n }\n else\n {\n argument.push('out_prefix \"' + data_prefix+ '\"');\n }\n \n if(inputs.conditional_variant_file){\n argument.push('conditional_variant_file \"' + inputs.conditional_variant_file.path + '\"');\n }\n //if(inputs.assoc_files)\n //{\n // arguments.push('assoc_files \"' + inputs.assoc_files[0].path + '\"')\n //}\n return argument.join('\\n') + '\\n'\n}",
1365 "writable": false
1366 }
1367 ]
1368 },
1369 {
1370 "class": "InlineJavascriptRequirement",
1371 "expressionLib": [
1372 "\nvar setMetadata = function(file, metadata) {\n if (!('metadata' in file))\n file['metadata'] = metadata;\n else {\n for (var key in metadata) {\n file['metadata'][key] = metadata[key];\n }\n }\n return file\n};\n\nvar inheritMetadata = function(o1, o2) {\n var commonMetadata = {};\n if (!Array.isArray(o2)) {\n o2 = [o2]\n }\n for (var i = 0; i < o2.length; i++) {\n var example = o2[i]['metadata'];\n for (var key in example) {\n if (i == 0)\n commonMetadata[key] = example[key];\n else {\n if (!(commonMetadata[key] == example[key])) {\n delete commonMetadata[key]\n }\n }\n }\n }\n if (!Array.isArray(o1)) {\n o1 = setMetadata(o1, commonMetadata)\n } else {\n for (var i = 0; i < o1.length; i++) {\n o1[i] = setMetadata(o1[i], commonMetadata)\n }\n }\n return o1;\n};"
1373 ]
1374 }
1375 ],
1376 "hints": [
1377 {
1378 "class": "sbg:SaveLogs",
1379 "value": "job.out.log"
1380 }
1381 ],
1382 "sbg:projectName": "GENESIS Toolkit - DEMO",
1383 "sbg:image_url": null,
1384 "sbg:revisionsInfo": [
1385 {
1386 "sbg:revision": 0,
1387 "sbg:modifiedBy": "boris_majic",
1388 "sbg:modifiedOn": 1577360839,
1389 "sbg:revisionNotes": null
1390 },
1391 {
1392 "sbg:revision": 1,
1393 "sbg:modifiedBy": "boris_majic",
1394 "sbg:modifiedOn": 1577360864,
1395 "sbg:revisionNotes": "Import from F4C"
1396 },
1397 {
1398 "sbg:revision": 2,
1399 "sbg:modifiedBy": "dajana_panovic",
1400 "sbg:modifiedOn": 1584373599,
1401 "sbg:revisionNotes": "GDS filename correction"
1402 },
1403 {
1404 "sbg:revision": 3,
1405 "sbg:modifiedBy": "dajana_panovic",
1406 "sbg:modifiedOn": 1594133318,
1407 "sbg:revisionNotes": "Docker image update 2.8.0"
1408 },
1409 {
1410 "sbg:revision": 4,
1411 "sbg:modifiedBy": "dajana_panovic",
1412 "sbg:modifiedOn": 1602155372,
1413 "sbg:revisionNotes": "Import from BDC 2.8.1 version"
1414 },
1415 {
1416 "sbg:revision": 5,
1417 "sbg:modifiedBy": "dajana_panovic",
1418 "sbg:modifiedOn": 1603797891,
1419 "sbg:revisionNotes": "BDC import"
1420 },
1421 {
1422 "sbg:revision": 6,
1423 "sbg:modifiedBy": "dajana_panovic",
1424 "sbg:modifiedOn": 1608907124,
1425 "sbg:revisionNotes": "CWLtool prep"
1426 },
1427 {
1428 "sbg:revision": 7,
1429 "sbg:modifiedBy": "dajana_panovic",
1430 "sbg:modifiedOn": 1616077298,
1431 "sbg:revisionNotes": "Docker updated to uwgac/topmed-master:2.10.0"
1432 }
1433 ],
1434 "sbg:appVersion": [
1435 "v1.1"
1436 ],
1437 "sbg:id": "h-a1458e1b/h-2be2f0ca/h-6c650723/0",
1438 "sbg:revision": 7,
1439 "sbg:revisionNotes": "Docker updated to uwgac/topmed-master:2.10.0",
1440 "sbg:modifiedOn": 1616077298,
1441 "sbg:modifiedBy": "dajana_panovic",
1442 "sbg:createdOn": 1577360839,
1443 "sbg:createdBy": "boris_majic",
1444 "sbg:project": "boris_majic/genesis-toolkit-demo",
1445 "sbg:sbgMaintained": false,
1446 "sbg:validationErrors": [],
1447 "sbg:contributors": [
1448 "dajana_panovic",
1449 "boris_majic"
1450 ],
1451 "sbg:latestRevision": 7,
1452 "sbg:publisher": "sbg",
1453 "sbg:content_hash": "a9441836c8bc986fc185a4d0cacafb79eee2380bb33c56e1b49de6a4cabdbf4b8"
1454 },
1455 "label": "Association Combine",
1456 "scatter": [
1457 "chromosome",
1458 "assoc_files"
1459 ],
1460 "scatterMethod": "dotproduct",
1461 "sbg:x": 1267,
1462 "sbg:y": 180.71429443359375
1463 },
1464 {
1465 "id": "assoc_plots_r",
1466 "in": [
1467 {
1468 "id": "assoc_files",
1469 "linkMerge": "merge_flattened",
1470 "source": [
1471 "assoc_combine_r/assoc_combined"
1472 ],
1473 "valueFrom": "$(self ? [].concat(self) : self)"
1474 },
1475 {
1476 "id": "assoc_type",
1477 "default": "aggregate"
1478 },
1479 {
1480 "id": "plots_prefix",
1481 "source": "out_prefix"
1482 },
1483 {
1484 "id": "disable_thin",
1485 "source": "disable_thin"
1486 },
1487 {
1488 "id": "known_hits_file",
1489 "source": "known_hits_file"
1490 },
1491 {
1492 "id": "thin_npoints",
1493 "source": "thin_npoints"
1494 },
1495 {
1496 "id": "thin_nbins",
1497 "source": "thin_nbins"
1498 },
1499 {
1500 "id": "plot_mac_threshold",
1501 "source": "plot_mac_threshold"
1502 },
1503 {
1504 "id": "truncate_pval_threshold",
1505 "source": "truncate_pval_threshold"
1506 }
1507 ],
1508 "out": [
1509 {
1510 "id": "assoc_plots"
1511 },
1512 {
1513 "id": "configs"
1514 },
1515 {
1516 "id": "Lambdas"
1517 }
1518 ],
1519 "run": {
1520 "class": "CommandLineTool",
1521 "cwlVersion": "v1.2",
1522 "$namespaces": {
1523 "sbg": "https://sevenbridges.com"
1524 },
1525 "id": "boris_majic/genesis-toolkit-demo/assoc-plots-r/21",
1526 "baseCommand": [],
1527 "inputs": [
1528 {
1529 "sbg:category": "Input Files",
1530 "id": "assoc_files",
1531 "type": "File[]",
1532 "label": "Results from association testing",
1533 "doc": "RData files with results from the association testing workflow.",
1534 "sbg:fileTypes": "RDATA"
1535 },
1536 {
1537 "sbg:category": "Input options",
1538 "id": "assoc_type",
1539 "type": {
1540 "type": "enum",
1541 "symbols": [
1542 "single",
1543 "window",
1544 "aggregate"
1545 ],
1546 "name": "assoc_type"
1547 },
1548 "label": "Association Type",
1549 "doc": "Type of association test: single, window or aggregate"
1550 },
1551 {
1552 "sbg:toolDefaultValue": "1-23",
1553 "sbg:category": "Input options",
1554 "id": "chromosomes",
1555 "type": "string?",
1556 "label": "Chromosomes",
1557 "doc": "List of chromosomes. If not provided and multiple files are given, the list is generated automatically, assuming that files are named in the format *chr*.RData.\nExample: 1 2 3"
1558 },
1559 {
1560 "sbg:toolDefaultValue": "plots",
1561 "sbg:category": "Input Options",
1562 "id": "plots_prefix",
1563 "type": "string?",
1564 "label": "Plots prefix",
1565 "doc": "Prefix for output files."
1566 },
1567 {
1568 "sbg:category": "Input Options",
1569 "id": "disable_thin",
1570 "type": "boolean?",
1571 "label": "Disable Thin",
1572 "doc": "Logical for whether to thin points in the QQ and Manhattan plots. By default, points are thinned in dense regions to reduce plotting time. If this parameter is set to TRUE, all variant p-values will be included in the plots, and plotting will be slow and memory-intensive."
1573 },
1574 {
1575 "sbg:category": "Inputs",
1576 "id": "known_hits_file",
1577 "type": "File?",
1578 "label": "Known hits file",
1579 "doc": "RData file with a data.frame containing columns chr and pos. If provided, 1 Mb regions surrounding each variant listed will be omitted from the QQ and Manhattan plots.",
1580 "sbg:fileTypes": "RData, RDATA"
1581 },
1582 {
1583 "sbg:category": "General",
1584 "sbg:toolDefaultValue": "10000",
1585 "id": "thin_npoints",
1586 "type": "int?",
1587 "label": "Number of points in each bin after thinning",
1588 "doc": "Number of points in each bin after thinning."
1589 },
1590 {
1591 "sbg:toolDefaultValue": "10",
1592 "sbg:category": "General",
1593 "id": "thin_nbins",
1594 "type": "int?",
1595 "label": "Number of bins to use for thinning",
1596 "doc": "Number of bins to use for thinning."
1597 },
1598 {
1599 "id": "plot_mac_threshold",
1600 "type": "int?",
1601 "label": "Plot MAC threshold",
1602 "doc": "Minimum minor allele count for variants, or minimum cumulative minor allele count for aggregate units, to include in plots (if different from the threshold used to run tests; see `mac_threshold`)."
1603 },
1604 {
1605 "id": "truncate_pval_threshold",
1606 "type": "float?",
1607 "label": "Truncate pval threshold",
1608 "doc": "Truncate pval threshold."
1609 },
1610 {
1611 "sbg:toolDefaultValue": "FALSE",
1612 "id": "plot_qq_by_chrom",
1613 "type": "boolean?",
1614 "label": "Plot qq by chromosome",
1615 "doc": "Logical indicator for whether to generate QQ plots faceted by chromosome."
1616 },
1617 {
1618 "id": "plot_include_file",
1619 "type": "File?",
1620 "label": "Plot include file",
1621 "doc": "RData file with vector of ids to include. See `TopmedPipeline::assocFilterByFile` for format requirements.",
1622 "sbg:fileTypes": "RDATA"
1623 },
1624 {
1625 "id": "signif_type",
1626 "type": [
1627 "null",
1628 {
1629 "type": "enum",
1630 "symbols": [
1631 "fixed",
1632 "bonferroni",
1633 "none"
1634 ],
1635 "name": "signif_type"
1636 }
1637 ],
1638 "label": "Significance type",
1639 "doc": "Character string specifying how to calculate the significance threshold: `fixed`, `bonferroni`, or `none`. Default is `fixed` for single variant analysis and `bonferroni` for other analysis types."
1640 },
1641 {
1642 "sbg:toolDefaultValue": "5e-9",
1643 "id": "signif_line_fixed",
1644 "type": "float?",
1645 "label": "Significance line",
1646 "doc": "P-value for the significance line. Only used if `signif_type = fixed`."
1647 },
1648 {
1649 "id": "qq_mac_bins",
1650 "type": "string?",
1651 "label": "QQ MAC bins",
1652 "doc": "Space separated string of integers (e.g., `\"5 20 50\"`). If set, generate a QQ plot binned by the specified MAC thresholds. 0 and Infinity will automatically be added."
1653 },
1654 {
1655 "id": "qq_maf_bins",
1656 "type": "string?",
1657 "label": "QQ MAF bins",
1658 "doc": "Space separated string of minor allele frequencies (e.g., \"0.01 0.05 0.1\"). If set, generate a QQ plot binned by the specified minor allele frequencies. 0 and Infinity will automatically be added. Single variant tests only."
1659 },
1660 {
1661 "id": "lambda_quantiles",
1662 "type": "string?",
1663 "label": "Lambda quantiles",
1664 "doc": "Space separated string of quantiles at which to calculate genomic inflation lambda (e.g., \"0.25 0.5 0.75\"). If set, lambda is calculated at the specified quantiles and written to the text file named by `out_file_lambdas`."
1665 },
1666 {
1667 "sbg:toolDefaultValue": "lambda.txt",
1668 "id": "out_file_lambdas",
1669 "type": "string?",
1670 "label": "Lambda outfile name",
1671 "doc": "Name of the file in which to store lambda calculated at different quantiles. The default is `lambda.txt`."
1672 },
1673 {
1674 "sbg:toolDefaultValue": "1",
1675 "id": "plot_max_p",
1676 "type": "float?",
1677 "label": "Plot max p",
1678 "doc": "Maximum p-value to plot in QQ and Manhattan plots. Expected QQ values are still calculated using the full set of p-values."
1679 },
1680 {
1681 "id": "plot_maf_threshold",
1682 "type": "float?",
1683 "label": "Plot MAF threshold",
1684 "doc": "Minimum minor allele frequency for variants to include in plots. Ignored if `plot_mac_threshold` is specified. Single variant association tests only."
1685 }
1686 ],
1687 "outputs": [
1688 {
1689 "id": "assoc_plots",
1690 "doc": "QQ and Manhattan plots generated by the assoc_plots.R script.",
1691 "label": "Assoc plots",
1692 "type": "File[]?",
1693 "outputBinding": {
1694 "glob": "*.png"
1695 },
1696 "sbg:fileTypes": "PNG"
1697 },
1698 {
1699 "id": "configs",
1700 "doc": "Config files.",
1701 "label": "Config files",
1702 "type": "File[]?",
1703 "outputBinding": {
1704 "glob": "*config*"
1705 },
1706 "sbg:fileTypes": "CONFIG"
1707 },
1708 {
1709 "id": "Lambdas",
1710 "doc": "File to store lambda calculated at different quantiles.",
1711 "label": "File to store lambda calculated at different quantiles",
1712 "type": "File?",
1713 "outputBinding": {
1714 "glob": "*.txt"
1715 },
1716 "sbg:fileTypes": "TXT"
1717 }
1718 ],
1719 "doc": "### Description\n\nThe UW-GAC GENESIS Association Result Plotting standalone app creates Manhattan and QQ plots from GENESIS association test results with additional filtering and stratification options available. This app is run automatically with default options set by the GENESIS Association Testing Workflows. Users can fine-tune the Manhattan and QQ plots by running this app separately, after one of the association testing workflows. The available options are:\n - Create QQ plots by chromosome.\n - Include a user-specified subset of the results in the plots.\n - Filter results to only those with MAC or MAF greater than a specified threshold.\n - Calculate genomic inflation lambda at various quantiles.\n - Specify the significance type and level.\n - Create QQ plots stratified by MAC or MAF.\n - Specify a maximum p-value to display on the plots.\n\n### Common use cases\n\nThe UW-GAC GENESIS Association Result Plotting standalone app creates Manhattan and QQ plots from GENESIS association test results with additional filtering and stratification options available.\n\n### Changes introduced by Seven Bridges\n\nNo changes introduced by Seven Bridges.",
1720 "label": "GENESIS Association results plotting",
1721 "arguments": [
1722 {
1723 "prefix": "",
1724 "shellQuote": false,
1725 "position": 5,
1726 "valueFrom": "assoc_file.config"
1727 },
1728 {
1729 "prefix": "",
1730 "shellQuote": false,
1731 "position": 3,
1732 "valueFrom": "Rscript /usr/local/analysis_pipeline/R/assoc_plots.R"
1733 },
1734 {
1735 "prefix": "",
1736 "shellQuote": false,
1737 "position": 1,
1738 "valueFrom": "${\n var command = '';\n var i;\n for(i=0; i<inputs.assoc_files.length; i++)\n command += \"ln -s \" + inputs.assoc_files[i].path + \" \" + inputs.assoc_files[i].path.split(\"/\").pop() + \" && \"\n \n return command\n}"
1739 },
1740 {
1741 "prefix": "",
1742 "shellQuote": false,
1743 "position": 100,
1744 "valueFrom": "${\n return ' >> job.out.log'\n}"
1745 }
1746 ],
1747 "requirements": [
1748 {
1749 "class": "ShellCommandRequirement"
1750 },
1751 {
1752 "class": "ResourceRequirement",
1753 "ramMin": 64000,
1754 "coresMin": 1
1755 },
1756 {
1757 "class": "DockerRequirement",
1758 "dockerPull": "uwgac/topmed-master:2.10.0"
1759 },
1760 {
1761 "class": "InitialWorkDirRequirement",
1762 "listing": [
1763 {
1764 "entryname": "assoc_file.config",
1765 "entry": "${\n function isNumeric(s) {\n return !isNaN(s - parseFloat(s));\n }\n \n function find_chromosome(file){\n var chr_array = [];\n var chrom_num = file.split(\"chr\")[1];\n \n if(isNumeric(chrom_num.charAt(1)))\n {\n chr_array.push(chrom_num.substr(0,2))\n }\n else\n {\n chr_array.push(chrom_num.substr(0,1))\n }\n return chr_array.toString()\n }\n \n var argument = [];\n argument.push('out_prefix \"assoc_single\"');\n var a_file = [].concat(inputs.assoc_files)[0];\n var chr = find_chromosome(a_file.basename);\n var path = a_file.path.split('chr'+chr);\n var extension = path[1].split('.')[1];\n \n \n if(inputs.plots_prefix){\n argument.push('plots_prefix ' + inputs.plots_prefix);\n argument.push('out_file_manh ' + inputs.plots_prefix + '_manh.png');\n argument.push('out_file_qq ' + inputs.plots_prefix + '_qq.png');\n }\n else{\n var data_prefix = path[0].split('/').pop();\n argument.push('out_file_manh ' + data_prefix + 'manh.png');\n argument.push('out_file_qq ' + data_prefix + 'qq.png');\n argument.push('plots_prefix \"plots\"')\n }\n if(inputs.assoc_type){\n argument.push('assoc_type ' + inputs.assoc_type)\n }\n \n argument.push('assoc_file ' + '\"' + path[0].split('/').pop() + 'chr ' +path[1] + '\"')\n\n if(inputs.chromosomes){\n argument.push('chromosomes \"' + inputs.chromosomes + '\"')\n }\n else {\n var chr_array = [];\n var chrom_num;\n var i;\n for (var i = 0; i < inputs.assoc_files.length; i++) \n {\n chrom_num = inputs.assoc_files[i].path.split(\"/\").pop()\n chrom_num = find_chromosome(chrom_num)\n \n chr_array.push(chrom_num)\n }\n \n chr_array = chr_array.sort(function(a, b) { return a.localeCompare(b, 'en', {numeric: true, ignorePunctuation: true})})\n \n var chrs = \"\";\n for (var i = 0; i < chr_array.length; i++) \n {\n chrs += chr_array[i] + \" \"\n }\n argument.push('chromosomes \"' + chrs + '\"')\n }\n if(inputs.disable_thin){\n argument.push('thin FALSE')\n }\n if(inputs.thin_npoints)\n argument.push('thin_npoints ' + inputs.thin_npoints)\n if(inputs.thin_nbins)\n argument.push('thin_nbins ' + inputs.thin_nbins)\n if(inputs.known_hits_file)\n argument.push('known_hits_file \"' + inputs.known_hits_file.path + '\"')\n if(inputs.plot_mac_threshold)\n argument.push('plot_mac_threshold ' + inputs.plot_mac_threshold) \n if(inputs.truncate_pval_threshold)\n argument.push('truncate_pval_threshold ' + inputs.truncate_pval_threshold) \n if(inputs.plot_qq_by_chrom){\n argument.push('plot_qq_by_chrom ' + inputs.plot_qq_by_chrom)\n }\n if(inputs.plot_include_file){\n argument.push('plot_include_file ' + '\"'+ inputs.plot_include_file.path + '\"')\n }\n if(inputs.signif_type){\n argument.push('signif_type ' + inputs.signif_type)\n } \n if(inputs.signif_line_fixed){\n argument.push('signif_line_fixed ' + inputs.signif_line_fixed)\n } \n if(inputs.qq_mac_bins){\n argument.push('qq_mac_bins ' + inputs.qq_mac_bins)\n }\n if(inputs.qq_maf_bins){\n argument.push('qq_maf_bins ' + inputs.qq_maf_bins)\n } \n if(inputs.lambda_quantiles){\n argument.push('lambda_quantiles ' + inputs.lambda_quantiles)\n } \n if(inputs.out_file_lambdas){\n argument.push('out_file_lambdas ' + inputs.out_file_lambdas)\n } \n if(inputs.plot_max_p){\n argument.push('plot_max_p ' + inputs.plot_max_p)\n } \n if(inputs.plot_maf_threshold){\n argument.push('plot_maf_threshold ' + inputs.plot_maf_threshold)\n }\n \n \n argument.push('\\n')\n return argument.join('\\n')\n}",
1766 "writable": false
1767 }
1768 ]
1769 },
1770 {
1771 "class": "InlineJavascriptRequirement"
1772 }
1773 ],
1774 "hints": [
1775 {
1776 "class": "sbg:SaveLogs",
1777 "value": "job.out.log"
1778 },
1779 {
1780 "class": "sbg:AzureInstanceType",
1781 "value": "Standard_D8s_v4;PremiumSSD;512"
1782 }
1783 ],
1784 "sbg:projectName": "GENESIS Toolkit - DEMO",
1785 "sbg:image_url": null,
1786 "sbg:revisionsInfo": [
1787 {
1788 "sbg:revision": 0,
1789 "sbg:modifiedBy": "boris_majic",
1790 "sbg:modifiedOn": 1577360892,
1791 "sbg:revisionNotes": null
1792 },
1793 {
1794 "sbg:revision": 1,
1795 "sbg:modifiedBy": "boris_majic",
1796 "sbg:modifiedOn": 1577360921,
1797 "sbg:revisionNotes": "Import from F4C"
1798 },
1799 {
1800 "sbg:revision": 2,
1801 "sbg:modifiedBy": "dajana_panovic",
1802 "sbg:modifiedOn": 1584373539,
1803 "sbg:revisionNotes": "GDS filename correction"
1804 },
1805 {
1806 "sbg:revision": 3,
1807 "sbg:modifiedBy": "dajana_panovic",
1808 "sbg:modifiedOn": 1594133277,
1809 "sbg:revisionNotes": "Docker image update to 2.8.0"
1810 },
1811 {
1812 "sbg:revision": 4,
1813 "sbg:modifiedBy": "dajana_panovic",
1814 "sbg:modifiedOn": 1602155455,
1815 "sbg:revisionNotes": "Import from BDC 2.8.1 version"
1816 },
1817 {
1818 "sbg:revision": 5,
1819 "sbg:modifiedBy": "dajana_panovic",
1820 "sbg:modifiedOn": 1603797944,
1821 "sbg:revisionNotes": "BDC import"
1822 },
1823 {
1824 "sbg:revision": 6,
1825 "sbg:modifiedBy": "dajana_panovic",
1826 "sbg:modifiedOn": 1608907158,
1827 "sbg:revisionNotes": "CWLtool prep"
1828 },
1829 {
1830 "sbg:revision": 7,
1831 "sbg:modifiedBy": "dajana_panovic",
1832 "sbg:modifiedOn": 1616077280,
1833 "sbg:revisionNotes": "Docker updated to uwgac/topmed-master:2.10.0"
1834 },
1835 {
1836 "sbg:revision": 8,
1837 "sbg:modifiedBy": "dajana_panovic",
1838 "sbg:modifiedOn": 1616144891,
1839 "sbg:revisionNotes": "Input descriptions updated"
1840 },
1841 {
1842 "sbg:revision": 9,
1843 "sbg:modifiedBy": "dajana_panovic",
1844 "sbg:modifiedOn": 1617264622,
1845 "sbg:revisionNotes": "Description updated"
1846 },
1847 {
1848 "sbg:revision": 10,
1849 "sbg:modifiedBy": "dajana_panovic",
1850 "sbg:modifiedOn": 1617267791,
1851 "sbg:revisionNotes": "Name update"
1852 },
1853 {
1854 "sbg:revision": 11,
1855 "sbg:modifiedBy": "dajana_panovic",
1856 "sbg:modifiedOn": 1617282860,
1857 "sbg:revisionNotes": "Description updated"
1858 },
1859 {
1860 "sbg:revision": 12,
1861 "sbg:modifiedBy": "dajana_panovic",
1862 "sbg:modifiedOn": 1617282904,
1863 "sbg:revisionNotes": "Description updated"
1864 },
1865 {
1866 "sbg:revision": 13,
1867 "sbg:modifiedBy": "dajana_panovic",
1868 "sbg:modifiedOn": 1617976100,
1869 "sbg:revisionNotes": "Description updated"
1870 },
1871 {
1872 "sbg:revision": 14,
1873 "sbg:modifiedBy": "dajana_panovic",
1874 "sbg:modifiedOn": 1617976150,
1875 "sbg:revisionNotes": "Description updated"
1876 },
1877 {
1878 "sbg:revision": 15,
1879 "sbg:modifiedBy": "dajana_panovic",
1880 "sbg:modifiedOn": 1617976183,
1881 "sbg:revisionNotes": "Description updated"
1882 },
1883 {
1884 "sbg:revision": 16,
1885 "sbg:modifiedBy": "dajana_panovic",
1886 "sbg:modifiedOn": 1617976234,
1887 "sbg:revisionNotes": "Description updated"
1888 },
1889 {
1890 "sbg:revision": 17,
1891 "sbg:modifiedBy": "dajana_panovic",
1892 "sbg:modifiedOn": 1617978175,
1893 "sbg:revisionNotes": "Description update"
1894 },
1895 {
1896 "sbg:revision": 18,
1897 "sbg:modifiedBy": "dajana_panovic",
1898 "sbg:modifiedOn": 1617985142,
1899 "sbg:revisionNotes": "Description updated"
1900 },
1901 {
1902 "sbg:revision": 19,
1903 "sbg:modifiedBy": "milan.domazet",
1904 "sbg:modifiedOn": 1617986285,
1905 "sbg:revisionNotes": "Description update"
1906 },
1907 {
1908 "sbg:revision": 20,
1909 "sbg:modifiedBy": "dajana_panovic",
1910 "sbg:modifiedOn": 1620727319,
1911 "sbg:revisionNotes": "Labels update"
1912 },
1913 {
1914 "sbg:revision": 21,
1915 "sbg:modifiedBy": "dajana_panovic",
1916 "sbg:modifiedOn": 1622800608,
1917 "sbg:revisionNotes": "Azure instance type"
1918 }
1919 ],
1920 "sbg:appVersion": [
1921 "v1.2"
1922 ],
1923 "sbg:id": "h-b85fb043/h-57a726e7/h-0f68f310/0",
1924 "sbg:revision": 21,
1925 "sbg:revisionNotes": "Azure instance type",
1926 "sbg:modifiedOn": 1622800608,
1927 "sbg:modifiedBy": "dajana_panovic",
1928 "sbg:createdOn": 1577360892,
1929 "sbg:createdBy": "boris_majic",
1930 "sbg:project": "boris_majic/genesis-toolkit-demo",
1931 "sbg:sbgMaintained": false,
1932 "sbg:validationErrors": [],
1933 "sbg:contributors": [
1934 "milan.domazet",
1935 "dajana_panovic",
1936 "boris_majic"
1937 ],
1938 "sbg:latestRevision": 21,
1939 "sbg:publisher": "sbg",
1940 "sbg:content_hash": "ac69853666a83b66464719ab99d0f3422d6986b55831a7556e56b6d8311c6d61b"
1941 },
1942 "label": "Association test plots",
1943 "sbg:x": 1462.4285888671875,
1944 "sbg:y": 357.4285583496094
1945 },
1946 {
1947 "id": "sbg_gds_renamer",
1948 "in": [
1949 {
1950 "id": "in_variants",
1951 "source": "input_gds_files"
1952 }
1953 ],
1954 "out": [
1955 {
1956 "id": "renamed_variants"
1957 }
1958 ],
1959 "run": {
1960 "class": "CommandLineTool",
1961 "cwlVersion": "v1.1",
1962 "$namespaces": {
1963 "sbg": "https://sevenbridges.com"
1964 },
1965 "id": "sevenbridges/sbgtools-cwl1-0-demo/sbg-gds-renamer/3",
1966 "baseCommand": [
1967 "cp"
1968 ],
1969 "inputs": [
1970 {
1971 "id": "in_variants",
1972 "type": "File",
1973 "label": "GDS input",
1974 "doc": "GDS file to rename. Any suffix after 'chr##' in the filename is removed, where ## stands for the chromosome name (1-22, X, Y).",
1975 "sbg:fileTypes": "GDS"
1976 }
1977 ],
1978 "outputs": [
1979 {
1980 "id": "renamed_variants",
1981 "doc": "Renamed GDS file.",
1982 "label": "Renamed GDS",
1983 "type": "File",
1984 "outputBinding": {
1985 "glob": "${\n return '*'+inputs.in_variants.nameext\n}"
1986 },
1987 "sbg:fileTypes": "GDS"
1988 }
1989 ],
1990 "doc": "This tool renames GDS files in GENESIS pipelines if their filenames contain a suffix after the chromosome (chr##).\nFor example, if a GDS file is named data_chr1_subset.gds, the tool renames it to data_chr1.gds.",
1991 "label": "SBG GDS renamer",
1992 "arguments": [
1993 {
1994 "prefix": "",
1995 "shellQuote": false,
1996 "position": 0,
1997 "valueFrom": "${\n if(inputs.in_variants){\n return inputs.in_variants.path}\n}"
1998 },
1999 {
2000 "prefix": "",
2001 "shellQuote": false,
2002 "position": 0,
2003 "valueFrom": "${\n function isNumeric(s) {\n return !isNaN(s - parseFloat(s));\n }\n \n function find_chromosome(file){\n var chr_array = [];\n var chrom_num = file.split(\"chr\")[1];\n \n if(isNumeric(chrom_num.charAt(1)))\n {\n chr_array.push(chrom_num.substr(0,2))\n }\n else\n {\n chr_array.push(chrom_num.substr(0,1))\n }\n return chr_array.toString()\n }\n \n var chr = find_chromosome(inputs.in_variants.nameroot)\n var base = inputs.in_variants.nameroot.split('chr'+chr)[0]\n \n return base+'chr' + chr + inputs.in_variants.nameext\n \n}"
2004 },
2005 {
2006 "prefix": "",
2007 "shellQuote": false,
2008 "position": 100,
2009 "valueFrom": "${\n return ' >> job.out.log' \n}"
2010 }
2011 ],
2012 "requirements": [
2013 {
2014 "class": "ShellCommandRequirement"
2015 },
2016 {
2017 "class": "DockerRequirement",
2018 "dockerPull": "uwgac/topmed-master:2.8.1"
2019 },
2020 {
2021 "class": "InlineJavascriptRequirement"
2022 }
2023 ],
2024 "hints": [
2025 {
2026 "class": "sbg:SaveLogs",
2027 "value": "job.out.log"
2028 }
2029 ],
2030 "sbg:projectName": "SBGTools - CWL1.0 - Demo",
2031 "sbg:revisionsInfo": [
2032 {
2033 "sbg:revision": 0,
2034 "sbg:modifiedBy": "dajana_panovic",
2035 "sbg:modifiedOn": 1584358811,
2036 "sbg:revisionNotes": null
2037 },
2038 {
2039 "sbg:revision": 1,
2040 "sbg:modifiedBy": "dajana_panovic",
2041 "sbg:modifiedOn": 1584358844,
2042 "sbg:revisionNotes": "Initial wrap"
2043 },
2044 {
2045 "sbg:revision": 2,
2046 "sbg:modifiedBy": "dajana_panovic",
2047 "sbg:modifiedOn": 1584359010,
2048 "sbg:revisionNotes": "Description updated"
2049 },
2050 {
2051 "sbg:revision": 3,
2052 "sbg:modifiedBy": "dajana_panovic",
2053 "sbg:modifiedOn": 1608907259,
2054 "sbg:revisionNotes": "CWLtool prep"
2055 }
2056 ],
2057 "sbg:image_url": null,
2058 "sbg:appVersion": [
2059 "v1.1"
2060 ],
2061 "sbg:id": "h-f65cf041/h-578dc940/h-4447a4c0/0",
2062 "sbg:revision": 3,
2063 "sbg:revisionNotes": "CWLtool prep",
2064 "sbg:modifiedOn": 1608907259,
2065 "sbg:modifiedBy": "dajana_panovic",
2066 "sbg:createdOn": 1584358811,
2067 "sbg:createdBy": "dajana_panovic",
2068 "sbg:project": "sevenbridges/sbgtools-cwl1-0-demo",
2069 "sbg:sbgMaintained": false,
2070 "sbg:validationErrors": [],
2071 "sbg:contributors": [
2072 "dajana_panovic"
2073 ],
2074 "sbg:latestRevision": 3,
2075 "sbg:publisher": "sbg",
2076 "sbg:content_hash": "ab721cbd39c33d272c5c42693fb02e02e43d95a3f421f40615cbf79ed023c35cc"
2077 },
2078 "label": "SBG GDS renamer",
2079 "scatter": [
2080 "in_variants"
2081 ],
2082 "sbg:x": -372.694091796875,
2083 "sbg:y": -244.2437744140625
2084 },
2085 {
2086 "id": "sbg_flatten_lists",
2087 "in": [
2088 {
2089 "id": "input_list",
2090 "source": [
2091 "assoc_aggregate/assoc_aggregate"
2092 ],
2093 "valueFrom": "${ var out = []; for (var i = 0; i<self.length; i++){ if (self[i]) out.push(self[i]) } return out }"
2094 }
2095 ],
2096 "out": [
2097 {
2098 "id": "output_list"
2099 }
2100 ],
2101 "run": {
2102 "class": "CommandLineTool",
2103 "cwlVersion": "v1.1",
2104 "$namespaces": {
2105 "sbg": "https://sevenbridges.com"
2106 },
2107 "id": "sevenbridges/sbgtools-cwl1-0-demo/sbg-flatten-lists/3",
2108 "baseCommand": [
2109 "echo"
2110 ],
2111 "inputs": [
2112 {
2113 "sbg:category": "File inputs",
2114 "id": "input_list",
2115 "type": "File[]?",
2116 "label": "Input list of files and lists",
2117 "doc": "List of inputs; can be any combination of lists of files and single files. All of them are combined into a single list of files at the output."
2118 }
2119 ],
2120 "outputs": [
2121 {
2122 "id": "output_list",
2123 "doc": "Single list of files that combines all files from all inputs.",
2124 "label": "Output list of files",
2125 "type": "File[]?",
2126 "outputBinding": {
2127 "outputEval": "${\n function flatten(files) {\n var a = [];\n for (var i = 0; i < files.length; i++) {\n if (files[i]) {\n if (files[i].constructor == Array) a = a.concat(flatten(files[i]))\n else a = a.concat(files[i])\n }\n }\n return a\n }\n\n {\n if (inputs.input_list) {\n var arr = [].concat(inputs.input_list);\n var return_array = [];\n return_array = flatten(arr)\n return return_array\n }\n }\n}"
2128 }
2129 }
2130 ],
2131 "doc": "###**Overview** \n\nSBG FlattenLists merges any combination of single-file and list-of-files inputs into a single list of files. This is important because most tools and the CWL specification don't allow array-of-array types, so combinations of single files and arrays need to be converted into a single list for tools that can process a list of files.\n\n###**Input** \n\nAny combination of input nodes of type File or array of File, and any tool outputs of type File or array of File.\n\n###**Output** \n\nA single array of Files containing all files from all inputs combined, provided there are no duplicate files in those lists.\n\n###**Usage example** \n\nA typical use is combining the outputs of two tools, one producing a single file and the other producing an array of files, so that the next tool, which takes an array of files, can process them together.",
2132 "label": "SBG FlattenLists",
2133 "arguments": [
2134 {
2135 "shellQuote": false,
2136 "position": 0,
2137 "valueFrom": "\"Output"
2138 },
2139 {
2140 "shellQuote": false,
2141 "position": 1,
2142 "valueFrom": "is"
2143 },
2144 {
2145 "shellQuote": false,
2146 "position": 2,
2147 "valueFrom": "now"
2148 },
2149 {
2150 "shellQuote": false,
2151 "position": 3,
2152 "valueFrom": "a"
2153 },
2154 {
2155 "shellQuote": false,
2156 "position": 4,
2157 "valueFrom": "single"
2158 },
2159 {
2160 "shellQuote": false,
2161 "position": 5,
2162 "valueFrom": "list\""
2163 }
2164 ],
2165 "requirements": [
2166 {
2167 "class": "ShellCommandRequirement"
2168 },
2169 {
2170 "class": "ResourceRequirement",
2171 "ramMin": 1000,
2172 "coresMin": 1
2173 },
2174 {
2175 "class": "DockerRequirement",
2176 "dockerPull": "uwgac/topmed-master:2.8.1"
2177 },
2178 {
2179 "class": "InitialWorkDirRequirement",
2180 "listing": [
2181 "$(inputs.input_list)"
2182 ]
2183 },
2184 {
2185 "class": "InlineJavascriptRequirement",
2186 "expressionLib": [
2187 "var updateMetadata = function(file, key, value) {\n file['metadata'][key] = value;\n return file;\n};\n\n\nvar setMetadata = function(file, metadata) {\n if (!('metadata' in file))\n file['metadata'] = metadata;\n else {\n for (var key in metadata) {\n file['metadata'][key] = metadata[key];\n }\n }\n return file\n};\n\nvar inheritMetadata = function(o1, o2) {\n var commonMetadata = {};\n if (!Array.isArray(o2)) {\n o2 = [o2]\n }\n for (var i = 0; i < o2.length; i++) {\n var example = o2[i]['metadata'];\n for (var key in example) {\n if (i == 0)\n commonMetadata[key] = example[key];\n else {\n if (!(commonMetadata[key] == example[key])) {\n delete commonMetadata[key]\n }\n }\n }\n }\n if (!Array.isArray(o1)) {\n o1 = setMetadata(o1, commonMetadata)\n } else {\n for (var i = 0; i < o1.length; i++) {\n o1[i] = setMetadata(o1[i], commonMetadata)\n }\n }\n return o1;\n};\n\nvar toArray = function(file) {\n return [].concat(file);\n};\n\nvar groupBy = function(files, key) {\n var groupedFiles = [];\n var tempDict = {};\n for (var i = 0; i < files.length; i++) {\n var value = files[i]['metadata'][key];\n if (value in tempDict)\n tempDict[value].push(files[i]);\n else tempDict[value] = [files[i]];\n }\n for (var key in tempDict) {\n groupedFiles.push(tempDict[key]);\n }\n return groupedFiles;\n};\n\nvar orderBy = function(files, key, order) {\n var compareFunction = function(a, b) {\n if (a['metadata'][key].constructor === Number) {\n return a['metadata'][key] - b['metadata'][key];\n } else {\n var nameA = a['metadata'][key].toUpperCase();\n var nameB = b['metadata'][key].toUpperCase();\n if (nameA < nameB) {\n return -1;\n }\n if (nameA > nameB) {\n return 1;\n }\n return 0;\n }\n };\n\n files = files.sort(compareFunction);\n if (order == undefined || order == \"asc\")\n return files;\n else\n return files.reverse();\n};"
2188 ]
2189 }
2190 ],
2191 "hints": [
2192 {
2193 "class": "sbg:SaveLogs",
2194 "value": "job.out.log"
2195 }
2196 ],
2197 "sbg:projectName": "SBGTools - CWL1.0 - Demo",
2198 "sbg:toolAuthor": "Seven Bridges",
2199 "sbg:cmdPreview": "echo \"Output is now a single list\"",
2200 "sbg:image_url": null,
2201 "sbg:revisionsInfo": [
2202 {
2203 "sbg:revision": 0,
2204 "sbg:modifiedBy": "nens",
2205 "sbg:modifiedOn": 1566552375,
2206 "sbg:revisionNotes": null
2207 },
2208 {
2209 "sbg:revision": 1,
2210 "sbg:modifiedBy": "nens",
2211 "sbg:modifiedOn": 1566552393,
2212 "sbg:revisionNotes": "v2-dev"
2213 },
2214 {
2215 "sbg:revision": 2,
2216 "sbg:modifiedBy": "dajana_panovic",
2217 "sbg:modifiedOn": 1588599015,
2218 "sbg:revisionNotes": "Updated to CWL1.0"
2219 },
2220 {
2221 "sbg:revision": 3,
2222 "sbg:modifiedBy": "dajana_panovic",
2223 "sbg:modifiedOn": 1608907303,
2224 "sbg:revisionNotes": "CWLtool prep"
2225 }
2226 ],
2227 "sbg:license": "Apache License 2.0",
2228 "sbg:categories": [
2229 "Other"
2230 ],
2231 "sbg:toolkit": "SBGTools",
2232 "sbg:toolkitVersion": "1.0",
2233 "sbg:appVersion": [
2234 "v1.1"
2235 ],
2236 "sbg:id": "h-30b109c5/h-2a3b9c4b/h-c5dc3136/0",
2237 "sbg:revision": 3,
2238 "sbg:revisionNotes": "CWLtool prep",
2239 "sbg:modifiedOn": 1608907303,
2240 "sbg:modifiedBy": "dajana_panovic",
2241 "sbg:createdOn": 1566552375,
2242 "sbg:createdBy": "nens",
2243 "sbg:project": "sevenbridges/sbgtools-cwl1-0-demo",
2244 "sbg:sbgMaintained": false,
2245 "sbg:validationErrors": [],
2246 "sbg:contributors": [
2247 "dajana_panovic",
2248 "nens"
2249 ],
2250 "sbg:latestRevision": 3,
2251 "sbg:publisher": "sbg",
2252 "sbg:content_hash": "a8ab04a2a11a3f02f5cb29025dbeebbe3bb71cc8f1eb7caafb6e2140373cc62f3"
2253 },
2254 "label": "SBG FlattenLists",
2255 "sbg:x": 915.6107788085938,
2256 "sbg:y": 182.4495849609375
2257 },
2258 {
2259 "id": "sbg_group_segments_1",
2260 "in": [
2261 {
2262 "id": "assoc_files",
2263 "source": [
2264 "sbg_flatten_lists/output_list"
2265 ]
2266 }
2267 ],
2268 "out": [
2269 {
2270 "id": "grouped_assoc_files"
2271 },
2272 {
2273 "id": "chromosome"
2274 }
2275 ],
2276 "run": {
2277 "class": "CommandLineTool",
2278 "cwlVersion": "v1.1",
2279 "$namespaces": {
2280 "sbg": "https://sevenbridges.com"
2281 },
2282 "id": "sevenbridges/sbgtools-cwl1-0-demo/sbg-group-segments/1",
2283 "baseCommand": [
2284 "echo",
2285 "\"Grouping\""
2286 ],
2287 "inputs": [
2288 {
2289 "sbg:category": "Inputs",
2290 "id": "assoc_files",
2291 "type": "File[]",
2292 "label": "Assoc files",
2293 "doc": "Assoc files.",
2294 "sbg:fileTypes": "RDATA"
2295 }
2296 ],
2297 "outputs": [
2298 {
2299 "id": "grouped_assoc_files",
2300 "type": [
2301 "null",
2302 {
2303 "type": "array",
2304 "items": [
2305 {
2306 "type": "array",
2307 "items": [
2308 "File",
2309 "null"
2310 ]
2311 },
2312 "null"
2313 ]
2314 }
2315 ],
2316 "outputBinding": {
2317 "outputEval": "${\n function isNumeric(s) {\n return !isNaN(s - parseFloat(s));\n }\n function find_chromosome(file){\n var chr_array = [];\n var chrom_num = file.split(\"/\").pop();\n chrom_num = chrom_num.substr(0,chrom_num.lastIndexOf(\".\")).split('_').slice(0,-1).join('_')\n if(isNumeric(chrom_num.charAt(chrom_num.length-2)))\n {\n chr_array.push(chrom_num.substr(chrom_num.length - 2))\n }\n else\n {\n chr_array.push(chrom_num.substr(chrom_num.length - 1))\n }\n return chr_array.toString()\n }\n \n var assoc_files_dict = {};\n var grouped_assoc_files = [];\n var chr;\n for(var i=0; i<inputs.assoc_files.length; i++){\n chr = find_chromosome(inputs.assoc_files[i].path)\n if(chr in assoc_files_dict){\n assoc_files_dict[chr].push(inputs.assoc_files[i])\n }\n else{\n assoc_files_dict[chr] = [inputs.assoc_files[i]]\n }\n }\n for(var key in assoc_files_dict){\n grouped_assoc_files.push(assoc_files_dict[key])\n }\n return grouped_assoc_files\n \n}"
2318 }
2319 },
2320 {
2321 "id": "chromosome",
2322 "doc": "Chromosomes.",
2323 "label": "Chromosomes",
2324 "type": "string[]?",
2325 "outputBinding": {
2326 "outputEval": "${\n function isNumeric(s) {\n return !isNaN(s - parseFloat(s));\n }\n function find_chromosome(file){\n var chr_array = [];\n var chrom_num = file.split(\"/\").pop();\n chrom_num = chrom_num.substr(0,chrom_num.lastIndexOf(\".\")).split('_').slice(0,-1).join('_')\n if(isNumeric(chrom_num.charAt(chrom_num.length-2)))\n {\n chr_array.push(chrom_num.substr(chrom_num.length - 2))\n }\n else\n {\n chr_array.push(chrom_num.substr(chrom_num.length - 1))\n }\n return chr_array.toString()\n }\n \n var assoc_files_dict = {};\n var output_chromosomes = [];\n var chr;\n for(var i=0; i<inputs.assoc_files.length; i++){\n chr = find_chromosome(inputs.assoc_files[i].path)\n if(chr in assoc_files_dict){\n assoc_files_dict[chr].push(inputs.assoc_files[i])\n }\n else{\n assoc_files_dict[chr] = [inputs.assoc_files[i]]\n }\n }\n for(var key in assoc_files_dict){\n output_chromosomes.push(key)\n }\n return output_chromosomes\n \n}"
2327 }
2328 }
2329 ],
2330 "label": "SBG Group Segments",
2331 "requirements": [
2332 {
2333 "class": "DockerRequirement",
2334 "dockerPull": "uwgac/topmed-master:2.8.1"
2335 },
2336 {
2337 "class": "InlineJavascriptRequirement"
2338 }
2339 ],
2340 "hints": [
2341 {
2342 "class": "sbg:SaveLogs",
2343 "value": "job.out.log"
2344 }
2345 ],
2346 "sbg:revisionsInfo": [
2347 {
2348 "sbg:revision": 0,
2349 "sbg:modifiedBy": "dajana_panovic",
2350 "sbg:modifiedOn": 1608907549,
2351 "sbg:revisionNotes": null
2352 },
2353 {
2354 "sbg:revision": 1,
2355 "sbg:modifiedBy": "dajana_panovic",
2356 "sbg:modifiedOn": 1608907559,
2357 "sbg:revisionNotes": "CWLtool prep"
2358 }
2359 ],
2360 "sbg:image_url": null,
2361 "sbg:projectName": "SBGTools - CWL1.0 - Demo",
2362 "sbg:appVersion": [
2363 "v1.1"
2364 ],
2365 "sbg:id": "h-e9e18633/h-044876a8/h-347e1ccf/0",
2366 "sbg:revision": 1,
2367 "sbg:revisionNotes": "CWLtool prep",
2368 "sbg:modifiedOn": 1608907559,
2369 "sbg:modifiedBy": "dajana_panovic",
2370 "sbg:createdOn": 1608907549,
2371 "sbg:createdBy": "dajana_panovic",
2372 "sbg:project": "sevenbridges/sbgtools-cwl1-0-demo",
2373 "sbg:sbgMaintained": false,
2374 "sbg:validationErrors": [],
2375 "sbg:contributors": [
2376 "dajana_panovic"
2377 ],
2378 "sbg:latestRevision": 1,
2379 "sbg:publisher": "sbg",
2380 "sbg:content_hash": "a515be0f5124c62e65c743e3ca9940a2d4d90f71217b08949ce69537195ad562c"
2381 },
2382 "label": "SBG Group Segments",
2383 "sbg:x": 1075.814208984375,
2384 "sbg:y": 178.85438537597656
2385 },
2386 {
2387 "id": "sbg_prepare_segments_1",
2388 "in": [
2389 {
2390 "id": "input_gds_files",
2391 "source": [
2392 "sbg_gds_renamer/renamed_variants"
2393 ]
2394 },
2395 {
2396 "id": "segments_file",
2397 "source": "define_segments_r/define_segments_output"
2398 },
2399 {
2400 "id": "aggregate_files",
2401 "source": [
2402 "aggregate_list/aggregate_list"
2403 ]
2404 },
2405 {
2406 "id": "variant_include_files",
2407 "source": [
2408 "variant_include_files"
2409 ]
2410 }
2411 ],
2412 "out": [
2413 {
2414 "id": "gds_output"
2415 },
2416 {
2417 "id": "segments"
2418 },
2419 {
2420 "id": "aggregate_output"
2421 },
2422 {
2423 "id": "variant_include_output"
2424 }
2425 ],
2426 "run": {
2427 "class": "CommandLineTool",
2428 "cwlVersion": "v1.1",
2429 "$namespaces": {
2430 "sbg": "https://sevenbridges.com"
2431 },
2432 "id": "sevenbridges/sbgtools-cwl1-0-demo/sbg-prepare-segments/1",
2433 "baseCommand": [],
2434 "inputs": [
2435 {
2436 "sbg:category": "Inputs",
2437 "id": "input_gds_files",
2438 "type": "File[]",
2439 "label": "GDS files",
2440 "doc": "GDS files.",
2441 "sbg:fileTypes": "GDS"
2442 },
2443 {
2444 "sbg:category": "Inputs",
2445 "id": "segments_file",
2446 "type": "File",
2447 "label": "Segments file",
2448 "doc": "Segments file.",
2449 "sbg:fileTypes": "TXT"
2450 },
2451 {
2452 "sbg:category": "Inputs",
2453 "id": "aggregate_files",
2454 "type": "File[]?",
2455 "label": "Aggregate files",
2456 "doc": "Aggregate files.",
2457 "sbg:fileTypes": "RDATA"
2458 },
2459 {
2460 "sbg:category": "Inputs",
2461 "id": "variant_include_files",
2462 "type": "File[]?",
2463 "label": "Variant Include Files",
2464 "doc": "RData file containing ids of variants to be included.",
2465 "sbg:fileTypes": "RData"
2466 }
2467 ],
        "outputs": [
          {
            "id": "gds_output",
            "doc": "GDS files.",
            "label": "GDS files",
            "type": "File[]?",
            "outputBinding": {
              "loadContents": true,
              "glob": "*.txt",
              "outputEval": "${\n function isNumeric(s) {\n return !isNaN(s - parseFloat(s));\n }\n \n function find_chromosome(file){\n var chr_array = [];\n var chrom_num = file.split(\"chr\")[1];\n \n if(isNumeric(chrom_num.charAt(1)))\n {\n chr_array.push(chrom_num.substr(0,2))\n }\n else\n {\n chr_array.push(chrom_num.substr(0,1))\n }\n return chr_array.toString()\n }\n \n \n \n function pair_chromosome_gds(file_array){\n var gdss = {};\n for(var i=0; i<file_array.length; i++){\n gdss[find_chromosome(file_array[i].path)] = file_array[i]\n }\n return gdss\n }\n\n var input_gdss = pair_chromosome_gds(inputs.input_gds_files)\n var output_gdss = [];\n var segments = self[0].contents.split('\\n');\n var chr;\n \n segments = segments.slice(1)\n for(var i=0;i<segments.length;i++){\n chr = segments[i].split('\\t')[0]\n if(chr in input_gdss){\n output_gdss.push(input_gdss[chr])\n }\n }\n return output_gdss\n}"
            },
            "sbg:fileTypes": "GDS"
          },
          {
            "id": "segments",
            "doc": "Segments.",
            "label": "Segments",
            "type": "int[]?",
            "outputBinding": {
              "loadContents": true,
              "glob": "*.txt",
              "outputEval": "${\n function isNumeric(s) {\n return !isNaN(s - parseFloat(s));\n }\n \n function find_chromosome(file){\n var chr_array = [];\n var chrom_num = file.split(\"chr\")[1];\n \n if(isNumeric(chrom_num.charAt(1)))\n {\n chr_array.push(chrom_num.substr(0,2))\n }\n else\n {\n chr_array.push(chrom_num.substr(0,1))\n }\n return chr_array.toString()\n }\n \n function pair_chromosome_gds(file_array){\n var gdss = {};\n for(var i=0; i<file_array.length; i++){\n gdss[find_chromosome(file_array[i].path)] = file_array[i]\n }\n return gdss\n }\n \n var input_gdss = pair_chromosome_gds(inputs.input_gds_files)\n var output_segments = []\n var segments = self[0].contents.split('\\n');\n segments = segments.slice(1)\n var chr;\n \n for(var i=0;i<segments.length;i++){\n chr = segments[i].split('\\t')[0]\n if(chr in input_gdss){\n output_segments.push(i+1)\n }\n }\n return output_segments\n \n}"
            }
          },
          {
            "id": "aggregate_output",
            "doc": "Aggregate output.",
            "label": "Aggregate output",
            "type": [
              "null",
              {
                "type": "array",
                "items": [
                  "null",
                  "File"
                ]
              }
            ],
            "outputBinding": {
              "loadContents": true,
              "glob": "*.txt",
              "outputEval": "${\n function isNumeric(s) {\n return !isNaN(s - parseFloat(s));\n }\n \n function find_chromosome(file){\n var chr_array = [];\n var chrom_num = file.split(\"chr\")[1];\n \n if(isNumeric(chrom_num.charAt(1)))\n {\n chr_array.push(chrom_num.substr(0,2))\n }\n else\n {\n chr_array.push(chrom_num.substr(0,1))\n }\n return chr_array.toString()\n }\n \n function pair_chromosome_gds(file_array){\n var gdss = {};\n for(var i=0; i<file_array.length; i++){\n gdss[find_chromosome(file_array[i].path)] = file_array[i]\n }\n return gdss\n }\n // Map every GDS chromosome to the same single aggregate file.\n function pair_chromosome_gds_special(file_array, agg_file){\n var gdss = {};\n for(var i=0; i<file_array.length; i++){\n gdss[find_chromosome(file_array[i].path)] = agg_file\n }\n return gdss\n }\n var input_gdss = pair_chromosome_gds(inputs.input_gds_files)\n var segments = self[0].contents.split('\\n');\n segments = segments.slice(1)\n var chr;\n \n // Fall back to nulls when no aggregate files are supplied, including an empty or null-led list.\n if(inputs.aggregate_files && inputs.aggregate_files[0] != null){\n var input_aggregate_files;\n // indexOf instead of includes: CWL expressions are evaluated as ECMAScript 5.1.\n if (inputs.aggregate_files[0].basename.indexOf('chr') !== -1)\n input_aggregate_files = pair_chromosome_gds(inputs.aggregate_files);\n else\n // Pass the File object, not its path, so the output matches the declared File type.\n input_aggregate_files = pair_chromosome_gds_special(inputs.input_gds_files, inputs.aggregate_files[0]);\n var output_aggregate_files = []\n for(var i=0;i<segments.length;i++){\n chr = segments[i].split('\\t')[0]\n if(chr in input_aggregate_files){\n output_aggregate_files.push(input_aggregate_files[chr])\n }\n else if(chr in input_gdss){\n output_aggregate_files.push(null)\n }\n }\n return output_aggregate_files\n }\n else{\n var null_outputs = []\n for(var i=0; i<segments.length; i++){\n chr = segments[i].split('\\t')[0]\n if(chr in input_gdss){\n null_outputs.push(null)\n }\n }\n return null_outputs\n }\n}"
            }
          },
          {
            "id": "variant_include_output",
            "doc": "Variant Include Output",
            "label": "Variant Include Output",
            "type": [
              "null",
              {
                "type": "array",
                "items": [
                  "null",
                  "File"
                ]
              }
            ],
            "outputBinding": {
              "loadContents": true,
              "glob": "*.txt",
              "outputEval": "${\n function isNumeric(s) {\n return !isNaN(s - parseFloat(s));\n }\n \n function find_chromosome(file){\n var chr_array = [];\n var chrom_num = file.split(\"chr\")[1];\n \n if(isNumeric(chrom_num.charAt(1)))\n {\n chr_array.push(chrom_num.substr(0,2))\n }\n else\n {\n chr_array.push(chrom_num.substr(0,1))\n }\n return chr_array.toString()\n }\n \n function pair_chromosome_gds(file_array){\n var gdss = {};\n for(var i=0; i<file_array.length; i++){\n gdss[find_chromosome(file_array[i].path)] = file_array[i]\n }\n return gdss\n }\n var input_gdss = pair_chromosome_gds(inputs.input_gds_files)\n var segments = self[0].contents.split('\\n');\n segments = segments.slice(1)\n var chr;\n \n // Fall back to nulls when no variant include files are supplied, including an empty or null-led list.\n if(inputs.variant_include_files && inputs.variant_include_files[0] != null){\n var input_variant_files = pair_chromosome_gds(inputs.variant_include_files)\n var output_variant_files = []\n for(var i=0;i<segments.length;i++){\n chr = segments[i].split('\\t')[0]\n if(chr in input_variant_files){\n output_variant_files.push(input_variant_files[chr])\n }\n else if(chr in input_gdss){\n output_variant_files.push(null)\n }\n }\n return output_variant_files\n }\n else{\n var null_outputs = [];\n for(var i=0; i<segments.length; i++){\n chr = segments[i].split('\\t')[0]\n if(chr in input_gdss){\n null_outputs.push(null)\n }\n }\n return null_outputs\n }\n}"
            }
          }
        ],
        "label": "SBG Prepare Segments",
        "arguments": [
          {
            "prefix": "",
            "shellQuote": false,
            "position": 0,
            "valueFrom": "${\n return \"cp \" + inputs.segments_file.path + \" .\"\n}"
          }
        ],
        "requirements": [
          {
            "class": "ShellCommandRequirement"
          },
          {
            "class": "DockerRequirement",
            "dockerPull": "uwgac/topmed-master:2.8.1"
          },
          {
            "class": "InlineJavascriptRequirement"
          }
        ],
        "hints": [
          {
            "class": "sbg:SaveLogs",
            "value": "job.out.log"
          }
        ],
        "sbg:revisionsInfo": [
          {
            "sbg:revision": 0,
            "sbg:modifiedBy": "dajana_panovic",
            "sbg:modifiedOn": 1608907510,
            "sbg:revisionNotes": null
          },
          {
            "sbg:revision": 1,
            "sbg:modifiedBy": "dajana_panovic",
            "sbg:modifiedOn": 1608907520,
            "sbg:revisionNotes": "CWLtool prep"
          }
        ],
        "sbg:projectName": "SBGTools - CWL1.0 - Demo",
        "sbg:image_url": null,
        "sbg:appVersion": [
          "v1.1"
        ],
        "sbg:id": "h-173ea8ca/h-86758b20/h-df4a6d04/0",
        "sbg:revision": 1,
        "sbg:revisionNotes": "CWLtool prep",
        "sbg:modifiedOn": 1608907520,
        "sbg:modifiedBy": "dajana_panovic",
        "sbg:createdOn": 1608907510,
        "sbg:createdBy": "dajana_panovic",
        "sbg:project": "sevenbridges/sbgtools-cwl1-0-demo",
        "sbg:sbgMaintained": false,
        "sbg:validationErrors": [],
        "sbg:contributors": [
          "dajana_panovic"
        ],
        "sbg:latestRevision": 1,
        "sbg:publisher": "sbg",
        "sbg:content_hash": "af5431cfdc789d53445974b82b534a1ba1c6df2ac79d7b39af88dce65def8cb34"
      },
      "label": "SBG Prepare Segments",
      "sbg:x": 95.99354553222656,
      "sbg:y": -143.77420043945312
    }
  ],
  "hints": [
    {
      "class": "sbg:AWSInstanceType",
      "value": "c5.2xlarge;ebs-gp2;1024"
    },
    {
      "class": "sbg:maxNumberOfParallelInstances",
      "value": "8"
    },
    {
      "class": "sbg:AzureInstanceType",
      "value": "Standard_D8s_v4;PremiumSSD;1024"
    }
  ],
  "requirements": [
    {
      "class": "ScatterFeatureRequirement"
    },
    {
      "class": "InlineJavascriptRequirement"
    },
    {
      "class": "StepInputExpressionRequirement"
    }
  ],
  "sbg:projectName": "SBG Public data",
  "sbg:revisionsInfo": [
    {
      "sbg:revision": 0,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1602755622,
      "sbg:revisionNotes": null
    },
    {
      "sbg:revision": 1,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1602755622,
      "sbg:revisionNotes": "Revision for publishing"
    },
    {
      "sbg:revision": 2,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1602755622,
      "sbg:revisionNotes": "Modify description in accordance to reviews"
    },
    {
      "sbg:revision": 3,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1602755622,
      "sbg:revisionNotes": "Modify description"
    },
    {
      "sbg:revision": 4,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1602755622,
      "sbg:revisionNotes": "CWL1.0"
    },
    {
      "sbg:revision": 5,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1602755623,
      "sbg:revisionNotes": "Latest"
    },
    {
      "sbg:revision": 6,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1602755623,
      "sbg:revisionNotes": "SBG GDS Renamer added"
    },
    {
      "sbg:revision": 7,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1602755623,
      "sbg:revisionNotes": "Output prefix required"
    },
    {
      "sbg:revision": 8,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1602755623,
      "sbg:revisionNotes": "Description updated"
    },
    {
      "sbg:revision": 9,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1602755623,
      "sbg:revisionNotes": "SBG FlattenList updated to CWL1.0"
    },
    {
      "sbg:revision": 10,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1602755623,
      "sbg:revisionNotes": "Input descriptions update"
    },
    {
      "sbg:revision": 11,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1602755623,
      "sbg:revisionNotes": "Import from BDC 2.8.1 version"
    },
    {
      "sbg:revision": 12,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1602755623,
      "sbg:revisionNotes": "Input and output descriptions updated"
    },
    {
      "sbg:revision": 13,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1602755623,
      "sbg:revisionNotes": "Input and output update"
    },
    {
      "sbg:revision": 14,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1602755623,
      "sbg:revisionNotes": "Plot prefix = output prefix"
    },
    {
      "sbg:revision": 15,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1602755623,
      "sbg:revisionNotes": "Apps ordering in app settings"
    },
    {
      "sbg:revision": 16,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1602755623,
      "sbg:revisionNotes": "Description updated"
    },
    {
      "sbg:revision": 17,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1604053011,
      "sbg:revisionNotes": "Config cleaning"
    },
    {
      "sbg:revision": 18,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1604053011,
      "sbg:revisionNotes": "Config cleaning"
    },
    {
      "sbg:revision": 19,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1604053011,
      "sbg:revisionNotes": "Config cleaning"
    },
    {
      "sbg:revision": 20,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1616425658,
      "sbg:revisionNotes": "CWLtool compatible"
    },
    {
      "sbg:revision": 21,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1616425658,
      "sbg:revisionNotes": "Docker updated to uwgac/topmed-master:2.10.0"
    },
    {
      "sbg:revision": 22,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1616425658,
      "sbg:revisionNotes": "Plot update"
    },
    {
      "sbg:revision": 23,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1616425658,
      "sbg:revisionNotes": "Benchmarking table updated"
    },
    {
      "sbg:revision": 24,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1617276222,
      "sbg:revisionNotes": "Plot update"
    },
    {
      "sbg:revision": 25,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1617276222,
      "sbg:revisionNotes": "Plot update"
    },
    {
      "sbg:revision": 26,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1621514949,
      "sbg:revisionNotes": "Assoc plot labels updated"
    },
    {
      "sbg:revision": 27,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1621514949,
      "sbg:revisionNotes": "Azure instance hint added"
    },
    {
      "sbg:revision": 28,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1624463195,
      "sbg:revisionNotes": "Azure hint change"
    },
    {
      "sbg:revision": 29,
      "sbg:modifiedBy": "admin",
      "sbg:modifiedOn": 1624463195,
      "sbg:revisionNotes": "Azure hint change"
    }
  ],
  "sbg:image_url": "https://cgc.sbgenomics.com/ns/brood/images/admin/sbg-public-data/aggregate-association-testing/29.png",
  "sbg:toolAuthor": "TOPMed DCC",
  "sbg:license": "MIT",
  "sbg:categories": [
    "GWAS",
    "CWL1.0"
  ],
  "sbg:links": [
    {
      "id": "https://github.com/UW-GAC/analysis_pipeline",
      "label": "Source Code, Download"
    },
    {
      "id": "https://academic.oup.com/bioinformatics/advance-article-abstract/doi/10.1093/bioinformatics/btz567/5536872?redirectedFrom=fulltext",
      "label": "Publication"
    },
    {
      "id": "https://www.bioconductor.org/packages/release/bioc/vignettes/GENESIS/inst/doc/assoc_test.html",
      "label": "Home Page"
    },
    {
      "id": "https://bioconductor.org/packages/devel/bioc/manuals/GENESIS/man/GENESIS.pdf",
      "label": "Documentation"
    }
  ],
  "sbg:expand_workflow": false,
  "sbg:appVersion": [
    "v1.2",
    "v1.1"
  ],
  "id": "https://cgc-api.sbgenomics.com/v2/apps/admin/sbg-public-data/aggregate-association-testing/29/raw/",
  "sbg:id": "admin/sbg-public-data/aggregate-association-testing/29",
  "sbg:revision": 29,
  "sbg:revisionNotes": "Azure hint change",
  "sbg:modifiedOn": 1624463195,
  "sbg:modifiedBy": "admin",
  "sbg:createdOn": 1602755622,
  "sbg:createdBy": "admin",
  "sbg:project": "admin/sbg-public-data",
  "sbg:sbgMaintained": false,
  "sbg:validationErrors": [],
  "sbg:contributors": [
    "admin"
  ],
  "sbg:latestRevision": 29,
  "sbg:publisher": "sbg",
  "sbg:content_hash": "ac81fe895ef38755ab2ee94f94f6b251caaa60792340273d325627b08e0a15baa"
}