Aachen: Determine where/how Q01 DICOMs will be stored for TRR379 access #10

Open
opened 2025-10-18 11:46:20 +00:00 by mih · 0 comments
Owner

If there can be a persistent location, this location should be indexed in the DataLad datasets that are tracking the individual DICOM acquisitions.

Beyond mere tracking, this availability metadata would enable a BIDS conversion pipeline to run against the true storage location (rather than a temporary location), thereby verifying the integrity and completeness of the stored data. This initial verification also tests the ability to rerun a BIDS conversion (e.g. after an issue with a DICOM conversion step was found and fixed).

Repeated reconversion (or transformation) of a BIDS dataset may also be necessary to apply project- or site-based re-pseudonymization.

One aspect to consider is whether to store DICOMs as individual files or to store an entire acquisition as a single tarball.

Reasons in favor of tarball storage:

  • heudiconv can handle tarballs directly
  • datalad could index the tarball content, preserving single-file tracking while only a single file needs to be archived. In any case, individual DICOMs rarely need to be accessed on their own (compared to the whole acquisition), and retrieving a single file that contains all of them can be faster than retrieving thousands of files for a subset
  • tarballs can be gzipped, saving about 30% (0.7 GB) of a single Q01 protocol run (~2.4 GB total uncompressed)
  • the NFS4 file system at the BIF struggles with large numbers of files and a single DICOM set is >21k files
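
To make the tarball option more concrete, here is a minimal sketch of packing one acquisition directory into a gzipped tarball while recording a per-file checksum, and of checking a stored tarball against that record later. It uses only the Python standard library. The function names `pack_acquisition` and `verify_tarball` and the directory layout are assumptions for illustration; this is not how heudiconv or datalad handle tarballs internally.

```python
import hashlib
import tarfile
from pathlib import Path


def pack_acquisition(dicom_dir: Path, tarball: Path) -> dict[str, str]:
    """Pack all files under dicom_dir into a gzipped tarball.

    Returns a mapping of archive member name -> SHA-256 hex digest.
    The tarball path must be located outside of dicom_dir.
    """
    checksums = {}
    with tarfile.open(tarball, "w:gz") as tar:
        # Sort for a stable member order across runs.
        for path in sorted(p for p in dicom_dir.rglob("*") if p.is_file()):
            name = path.relative_to(dicom_dir).as_posix()
            tar.add(path, arcname=name)
            checksums[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return checksums


def verify_tarball(tarball: Path, expected: dict[str, str]) -> bool:
    """Check that the tarball holds exactly the expected files and content."""
    seen = {}
    with tarfile.open(tarball, "r:gz") as tar:
        for member in tar:
            if member.isfile():
                data = tar.extractfile(member).read()
                seen[member.name] = hashlib.sha256(data).hexdigest()
    return seen == expected
```

A check like `verify_tarball` covers both completeness (no missing or extra files) and integrity (unchanged content), and it reads a single file from storage rather than tens of thousands.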
mih changed title from Determine where/how Q01 DICOMs will be stored for TRR379 access to Aachen: Determine where/how Q01 DICOMs will be stored for TRR379 access 2025-10-18 11:48:24 +00:00
Reference
q02/phantom-mri-bids#10