Make namespace (prefixes) configurable #5

Open
opened 2026-02-20 18:29:26 +00:00 by msz · 0 comments
Member

As already hinted at and discussed [elsewhere](https://hub.psychoinformatics.de/www/www-from-model/pulls/5#issuecomment-7143), although the Research Information schemas used by the TRR and by the Psychoinformatics group are equivalent (sans the prefixes used), the code is not 100% reusable. One reason is hardcoded values.

These should become configurable -- likely through CLI arguments (worth considering [Click's environment variables support](https://click.palletsprojects.com/en/stable/options/#values-from-environment-variables) while we're at it).
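To make the idea concrete, here is a minimal sketch of how the prefixes could be exposed as Click options with environment-variable fallbacks. The option names (`--root-prefix`, `--schema-prefix`, `--base-url`) and the variable names (`RI_ROOT_PREFIX` etc.) are assumptions made for illustration only; the defaults simply mirror the values that are hardcoded today.

```python
import click


@click.command()
@click.option(
    "--root-prefix",
    envvar="RI_ROOT_PREFIX",  # hypothetical variable name
    default="trr379root",
    show_default=True,
    help="CURIE prefix of the PID namespace.",
)
@click.option(
    "--schema-prefix",
    envvar="RI_SCHEMA_PREFIX",  # hypothetical variable name
    default="trr379ri",
    show_default=True,
    help="CURIE prefix used in schema_type values.",
)
@click.option(
    "--base-url",
    envvar="RI_BASE_URL",  # hypothetical variable name
    default="https://trr379.de/",
    show_default=True,
    help="Expanded URL form of the PID namespace.",
)
def main(root_prefix, schema_prefix, base_url):
    """Echo the effective configuration (stand-in for the real page builders)."""
    click.echo(f"{root_prefix} {schema_prefix} {base_url}")
```

With this layout, a value given on the command line takes precedence over the environment variable, which in turn takes precedence over the default.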

Searching for hardcoded "trr379" currently yields the following hits, mostly regular expressions that work on PIDs:

```
❱ rg -i --type py trr379
project.py
53:    ctb_pat = re.compile(r"trr379root:contributors/([\w\-]+)")
63:        if obj.get("schema_type") == "trr379ri:TRR379Organization" and "marcrel:sht" in roles:
67:        elif obj.get("schema_type") == "trr379ri:TRR379Person":
76:    topic_pat = re.compile(r"trr379root:topics/([\w\-]+)")
95:    pat = re.compile(r"trr379root:roles/([\w\-]+)")
108:    pat = re.compile(r"trr379root:projects/([a-cq])(\d+)")
155:    pat = re.compile(r"(https://trr379\.de/|trr379root:)projects/([\w\-]+)")
163:            # stick to trr379root:projects/ namespace (excludes root project)

person.py
58:            and delegation["object"].get("schema_type") == "trr379ri:TRR379Organization"
76:            identifier.get("schema_type") == "trr379ri:ORCID"
92:        if pn is not None and pn != "TRR379":
112:    pat = re.compile(r"(https://trr379\.de/|trr379root:)roles/([\w\-]+)")
144:    pat = re.compile(r"(https://trr379\.de/|trr379root:)contributors/([\w\-]+)")
152:            # only build pages for trr379root:/contributors prefix

publication.py
73:    If the PID is in the trr379.de/publications/ namespace (default in
82:    pat = r"(https://trr379\.de/|trr379root:)publications/([\w\-]+)"
156:        if generation.get("object").startswith("trr379root:projects")
161:        if topic.startswith("trr379root:topics")
197:        if ctb["pid"].startswith("trr379root:contributors/")

filters/enrich-via-doi.py
188:            identifier.get("schema_type") == "trr379ri:ORCID"
```
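Since most of the hits are PID patterns that differ only in the collection name, one possible approach is to build them from the configured prefix in a single place. The helper below is a hypothetical sketch, not existing code in this repository: `pid_pattern` and its parameters are invented names. Note that it uses a non-capturing group for the prefix alternation, so the captured name is `group(1)` rather than `group(2)` as in some of the current patterns.

```python
import re


def pid_pattern(collection, root_prefix="trr379root",
                base_url="https://trr379.de/", tail=r"([\w\-]+)"):
    """Compile a pattern for PIDs in `collection`, in CURIE or full-URL form.

    All parameter names are illustrative; the defaults mirror the values
    that are currently hardcoded.
    """
    # The CURIE form is always accepted, e.g. "trr379root:".
    heads = [re.escape(root_prefix + ":")]
    # Optionally also accept the expanded URL form, e.g. "https://trr379.de/".
    if base_url:
        heads.insert(0, re.escape(base_url))
    return re.compile(
        "(?:" + "|".join(heads) + ")" + re.escape(collection) + "/" + tail
    )


# Same intent as the contributors pattern in person.py, but prefix-agnostic.
ctb_pat = pid_pattern("contributors")

# A second deployment would pass its own prefix ("piroot" is a made-up example).
other_pat = pid_pattern("projects", root_prefix="piroot", base_url=None)
```

The `startswith("trr379root:projects")` checks and the `schema_type` comparisons could be derived from the same configured values, for example via f-strings such as `f"{root_prefix}:projects"`.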
Reference
q02/pool-publication-page#5