# Phantom MRI study dataset
This dataset tracks all phantom MRI acquisitions performed for the TRR379
to validate the Q01 protocol at all sites.
## Key aspects of this setup
### Session labels are pseudonymized identifiers
This first layer of personal data protection reduces the chances
of participant identifiers appearing as part of file/path names.

Each session or acquisition is placed into a directory/dataset
in `sessions/` that is given a project-internal pseudonymous identifier
as its directory name.
### Multi-project ID mapping
The top-level `id_map.tsv` is a tab-separated table that maps
session source identifiers to any number of contexts. The source
identifier corresponds to the directory name of a DICOM dataset in the
`sessions/` directory. This is the value in the first column of each
table row. Every subsequent column defines the ID mapping to a different
context. The context label is defined in the header row.

A script to perform "re-identification" from a particular context
is provided at `code/reidentify.py`. It can be used like this:
```bash
python3 code/reidentify.py id_map.tsv q01 AP001
```
The script returns the source identifier linked to the `q01` identifier
`AP001`.
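The lookup described above can be sketched in a few lines of Python. This is
an illustration only and not the contents of `code/reidentify.py`: the function
name `reidentify`, the sample table, and the identifiers in it (such as
`sub-x7k2`) are hypothetical. It assumes the layout described above, with the
source identifier in the first column and the context labels in the header row.
To use it with a real table, read the contents of `id_map.tsv` and pass them in
as `table_text`.

```python
import csv
import io


def reidentify(table_text, context, context_id):
    """Return the source identifier mapped to `context_id` in `context`.

    Assumes a tab-separated table whose first column holds the source
    identifier and whose header row names each context column.
    """
    reader = csv.reader(io.StringIO(table_text), delimiter="\t")
    header = next(reader)
    # Only columns after the first one are contexts.
    if context not in header[1:]:
        raise KeyError(f"unknown context: {context!r}")
    column = header.index(context, 1)
    for row in reader:
        # Skip rows that are too short to have a value in this column.
        if len(row) > column and row[column] == context_id:
            return row[0]
    raise LookupError(f"{context_id!r} not found in context {context!r}")


# Hypothetical table contents; real identifiers will differ.
SAMPLE = (
    "source\tq01\tother\n"
    "sub-x7k2\tAP001\tZ-17\n"
    "sub-m4q9\tAP002\t\n"
)

print(reidentify(SAMPLE, "q01", "AP001"))  # prints sub-x7k2
```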
The file `id_map.tsv` is an annexed file. Once the last copy of this file is
destroyed, identifier-based re-identification is no longer possible
(a precondition for data anonymization).