v1 to v2 Migration Guide
This guide maps legacy snpArcher v1-style inputs to the v2 contracts used by the current workflow.
Sample sheet migration
v2 uses a minimal manifest:
- required: `sample_id`, `input_type`, `input`
- optional: `library_id`, `mark_duplicates`
Column mapping
| v1 column | v2 column | Notes |
|---|---|---|
| | | Must be alphanumeric plus |
| | | Use with |
| | | Join as |
| | | Optional; defaults to |
| | | Optional; defaults to |
| | config | v2 keeps the reference in the config, not the sample sheet. |
Example: v1 fastq row -> v2 row
v1:
BioSample,LibraryName,Run,fq1,fq2
bird_1,bird_1_lib,1,/data/bird_1_R1.fastq.gz,/data/bird_1_R2.fastq.gz
v2:
sample_id,input_type,input,library_id,mark_duplicates
bird_1,fastq,/data/bird_1_R1.fastq.gz;/data/bird_1_R2.fastq.gz,bird_1_lib,true
Example: v1 SRA row -> v2 row
v1:
BioSample,Run
bird_2,SRR12345678
v2:
sample_id,input_type,input
bird_2,srr,SRR12345678
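The two examples above can be sketched as a small conversion script. This is a minimal illustration, not part of the workflow: the function names `v1_row_to_v2` and `convert_sheet` are hypothetical, only paired FASTQ rows and SRA rows are handled, and `mark_duplicates` is left out because it is optional in v2.

```python
import csv
import io


def v1_row_to_v2(row):
    """Convert one v1 sample-sheet row (a dict) into a v2 manifest row."""
    out = {"sample_id": row["BioSample"]}
    fq1 = (row.get("fq1") or "").strip()
    fq2 = (row.get("fq2") or "").strip()
    if fq1 and fq2:
        # Paired FASTQ paths are joined with ";" in the v2 input column.
        out["input_type"] = "fastq"
        out["input"] = f"{fq1};{fq2}"
    else:
        # Assumption: a row without FASTQ paths carries an SRA accession in Run.
        out["input_type"] = "srr"
        out["input"] = row["Run"].strip()
    library = (row.get("LibraryName") or "").strip()
    if library:
        out["library_id"] = library
    return out


def convert_sheet(v1_text):
    """Convert a whole v1 CSV (given as text) into v2 CSV text."""
    rows = [v1_row_to_v2(r) for r in csv.DictReader(io.StringIO(v1_text))]
    fields = ["sample_id", "input_type", "input"]
    # Only emit the optional library_id column when some row provides it.
    if any("library_id" in r for r in rows):
        fields.append("library_id")
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Calling `convert_sheet` on the text of a v1 sheet returns the v2 manifest as text, which can then be reviewed and extended with `mark_duplicates` by hand if needed.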
Config migration
v2 uses nested config keys. Core keys are:
- `samples`
- `reference.name`
- `reference.source`
- `variant_calling.*`
- `intervals.*`
- `callable_sites.*`
- `modules.*`
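To make the dotted notation concrete, the sketch below shows how a nested config maps onto dotted key paths such as `reference.name`. The helper `flatten_keys` is hypothetical, and the values in `example_config` are placeholders chosen for illustration, not settings taken from a real configuration.

```python
def flatten_keys(config, prefix=""):
    """Return the dotted paths of all leaf values in a nested config dict."""
    paths = []
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into nested sections such as "reference".
            paths.extend(flatten_keys(value, path))
        else:
            paths.append(path)
    return paths


# Placeholder values only (assumptions for illustration).
example_config = {
    "samples": "path/to/manifest.csv",
    "reference": {
        "name": "my_reference",
        "source": "path/to/reference.fa",
    },
}

print(flatten_keys(example_config))
# ['samples', 'reference.name', 'reference.source']
```

Comparing the flattened paths of a migrated config against the core keys listed above is one quick way to spot leftover flat v1 keys.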
Common legacy -> v2 config mapping
| v1 key | v2 key | Notes |
|---|---|---|
| | | Required in v2. |
| | | Can be local path, URL, or accession. |
| | | Set to |
| | | String path or license value. |
| | | Integer. |
| | | Boolean. |
| | | Integer. |
| | | Number in [0,1]. |
| | | Integer. |
Notes
- v2 no longer expects per-sample reference columns in the manifest.
- Optional metadata columns from v1 (for example `lat`, `long`, `SampleType`) are not part of the v2 core manifest contract.
- If a sample appears in multiple rows, `library_id` identifies the library and may repeat across rows for the same library (for example, multiple lanes/runs); it only needs to differ when a sample truly has multiple distinct libraries.
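The `library_id` rule can be illustrated by grouping manifest rows per sample and library. This is a sketch under stated assumptions: the helper `libraries_by_sample` and the file names in `rows` are hypothetical and only show how repeated and distinct `library_id` values would be interpreted.

```python
from collections import defaultdict


def libraries_by_sample(rows):
    """Map each sample_id to {library_id: [inputs]} for a list of manifest rows."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        grouped[row["sample_id"]][row["library_id"]].append(row["input"])
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {sample: dict(libs) for sample, libs in grouped.items()}


rows = [
    # Two lanes of the same library: library_id repeats across rows.
    {"sample_id": "bird_1", "library_id": "bird_1_lib",
     "input": "lane1_R1.fastq.gz;lane1_R2.fastq.gz"},
    {"sample_id": "bird_1", "library_id": "bird_1_lib",
     "input": "lane2_R1.fastq.gz;lane2_R2.fastq.gz"},
    # A sample with two distinct libraries: library_id differs.
    {"sample_id": "bird_3", "library_id": "bird_3_libA",
     "input": "libA_R1.fastq.gz;libA_R2.fastq.gz"},
    {"sample_id": "bird_3", "library_id": "bird_3_libB",
     "input": "libB_R1.fastq.gz;libB_R2.fastq.gz"},
]

grouped = libraries_by_sample(rows)
```

Here `grouped["bird_1"]` holds one library with two inputs, while `grouped["bird_3"]` holds two libraries with one input each.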