Ticket #417 (closed defect: fixed)

Opened 3 years ago

Last modified 3 years ago

Improve 'simple wizard' to also be able to automatically infer study design

Reported by: business@… Owned by: business@…
Priority: major Milestone: 0.8.4
Component: Import wizard Version: 0.8.0
Keywords: Cc: t.w.abma@…
Product: Operating system:
URL: Hardware:

Description

For NMC studies, it is important that the study design can be inferred automatically from an Excel file that supplies all entities of the study design at once, much like the simple wizard does.
The only improvement the simple wizard needs is the ability to infer the study design.
This can be done as follows:
- Unique subjects, events, sampling events and samples need to be recognized (e.g. if all given fields for a subject are exactly the same in multiple rows, this should be considered the same subject, possibly with multiple different events attached - same for events etc)
- Eventgroup generation will then be done by first determining for each subject all connected events, sampling events and samples, and then creating an event group for each unique combination of events and sampling events (and their field values), thereby hopefully minimizing the number of resulting event groups. Groups can be named by numbers. Finally the subjects, events, sampling events and samples should be linked to those event groups, and the samples to their parent sampling event and subject.

The result of this should be 2 options in the simple wizard:
- Import 'as is' (with the advantage that the study also will be updateable via Excel)
- Import with inferring study design.

After this is implemented we probably will need to take another look at the different create and import options and group them in a logical way for the user.

Change History

Changed 3 years ago by business@…

  • cc t.w.abma@… added
  • status changed from new to assigned
  • owner changed from business@… to s.h.sikkema@…

Changed 3 years ago by business@…

Obviously I have to clarify the above definition. Specifically, we have to separate out the sampling event logic and the event logic, so it would make sense to create different eventgroups for linking events and linking sampling events.

The following procedure would to the best of my knowledge work to re-create the study design as closely as possible:

Sampling event logic


Assume that each row in the input table is one unique sample.
Iterate over all rows and generate these samples (an error can be thrown if a sample name is not unique, or we could just number them; I'm not sure which is the most elegant solution, so use the first option for now)

Iterate over all rows and extract the sampling event data
Generate a sampling event for each unique combination of sampling event field data
For each row aka sample, link it to its unique sampling event by setting parentEvent (which is the sampling event that is mentioned in the same row)

Iterate over all rows and extract the subject data
Generate a subject for each unique combination of subject data
For each row aka sample, link it to its unique subject by setting parentSubject (which is the subject that is mentioned in the same row)
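
The sample, sampling event and subject steps above could be sketched roughly as follows. This is only an illustration in Python, not the wizard's actual code: the dict-based rows, the function names and the way field lists are passed in are all assumptions.

```python
def unique_key(row, fields):
    """Return a hashable key built from the given fields of one row."""
    return tuple(row[f] for f in fields)


def infer_samples(rows, sample_name_field, subject_fields, sampling_event_fields):
    """Create one sample per row, linked to its deduplicated subject and sampling event.

    All parameter names are hypothetical; rows are plain dicts (one per Excel row).
    """
    subjects = {}         # subject field values -> subject
    sampling_events = {}  # sampling event field values -> sampling event
    samples = []
    seen_names = set()

    for row in rows:
        name = row[sample_name_field]
        if name in seen_names:
            # First option from the ticket: throw an error on a duplicate name.
            raise ValueError(f"Sample name is not unique: {name!r}")
        seen_names.add(name)

        # Identical field values in several rows are treated as the same entity.
        subject = subjects.setdefault(
            unique_key(row, subject_fields),
            {f: row[f] for f in subject_fields},
        )
        sampling_event = sampling_events.setdefault(
            unique_key(row, sampling_event_fields),
            {f: row[f] for f in sampling_event_fields},
        )

        samples.append({
            "name": name,
            "parentSubject": subject,       # subject mentioned in the same row
            "parentEvent": sampling_event,  # sampling event mentioned in the same row
        })

    return samples, list(subjects.values()), list(sampling_events.values())
```

Because subjects and sampling events are looked up by their field values, two rows with identical subject data end up pointing at the same subject object.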

Now we have to create event groups in such a way that they link each sample correctly to its one and only parent subject and parent sampling event.
Correct me if I'm wrong, but the easiest way to do this (although possibly not the most sophisticated) would be to simply create an EventGroup for each unique sampling event, then iterate over all samples in that group and find out which subjects should be linked to the group. Concretely:

Iterate over all created unique sampling events
Create an eventgroup named Sampling_[Sampling event template name]_[Sampling event start time] (e.g. Sampling_BloodSampling_0w)
For each row aka sample, link it to its unique event group by setting parentEventGroup (which is the group created for the sampling event that is mentioned in the same row)
Now, for each created event group, iterate over all its child samples, collect all the mentioned subjects (parentSubject) from those samples and add them to the event group
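
A rough Python sketch of this grouping step is given here. It assumes samples are dicts that already carry parentSubject and parentEvent, and that a sampling event exposes hypothetical `template` and `start_time` entries for building the group name; none of these names come from the actual code base.

```python
def build_sampling_event_groups(samples):
    """Create one event group per unique sampling event and link samples and subjects."""
    groups = {}  # id of the sampling event object -> event group

    # One group per unique sampling event; link each sample to its group.
    for sample in samples:
        event = sample["parentEvent"]
        group = groups.get(id(event))
        if group is None:
            group = {
                # Naming scheme from the ticket, e.g. Sampling_BloodSampling_0w
                "name": f"Sampling_{event['template']}_{event['start_time']}",
                "samplingEvents": [event],
                "samples": [],
                "subjects": [],
            }
            groups[id(event)] = group
        sample["parentEventGroup"] = group
        group["samples"].append(sample)

    # For each group, collect the subjects mentioned by its child samples.
    for group in groups.values():
        for sample in group["samples"]:
            subject = sample["parentSubject"]
            if not any(s is subject for s in group["subjects"]):
                group["subjects"].append(subject)

    return list(groups.values())
```

Groups are keyed on the sampling event object rather than on the generated name, so two distinct sampling events that happen to share a template and start time would still get separate groups (with the same name).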

Event logic


Iterate over all rows and extract the event data
Generate an event for each unique combination of event field data

Iterate over all rows to find out the event - subject links (make a map M of subject --> events, where for each subject all associated events are listed)
Find all unique combinations of events in M and generate an event group for each combination, named by concatenating all event data (e.g. Placebo_0w_Fenofibrate_6w_Fishoil_12w)
Now, for each of these generated event groups, link the events (the ones that the group is based on) and the subjects (the ones which were linked in M)
This should connect each subject to all of its associated events while creating a minimum number of event groups!
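
As an illustration, the event logic could look something like the Python sketch below. The row layout, field lists and function name are assumptions, and the group name simply follows the order in which the events first appear for the first subject with that combination.

```python
def build_event_groups(rows, subject_fields, event_fields):
    """Create one event group per unique combination of events found per subject."""
    # M: subject key -> list of that subject's unique event keys, in row order
    subject_events = {}
    for row in rows:
        subject_key = tuple(row[f] for f in subject_fields)
        event_key = tuple(row[f] for f in event_fields)
        events = subject_events.setdefault(subject_key, [])
        if event_key not in events:
            events.append(event_key)

    groups = {}  # frozenset of event keys -> event group
    for subject_key, events in subject_events.items():
        combination = frozenset(events)  # order of events does not matter here
        group = groups.get(combination)
        if group is None:
            group = {
                # All event data concatenated, e.g. Placebo_0w_Fenofibrate_6w
                "name": "_".join(str(v) for event in events for v in event),
                "events": list(events),
                "subjects": [],
            }
            groups[combination] = group
        group["subjects"].append(subject_key)

    return list(groups.values())
```

Subjects that share exactly the same set of events land in the same group, so the number of groups equals the number of distinct event combinations.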

Changed 3 years ago by business@…

Sorry, the last two paragraphs of the Sampling event logic section were posted twice; ignore the repeat.

Changed 3 years ago by work@…

  • milestone changed from 0.8.1 to 0.8.2

Changed 3 years ago by business@…

Event import still needs to be added; this can be done separately by also including an 'attach to subjects' option when Event is chosen as the import entity.

Changed 3 years ago by work@…

  • milestone changed from 0.8.2 to 0.8.3

Changed 3 years ago by s.h.sikkema@…

  • owner changed from s.h.sikkema@… to business@…

Added event-to-subject linking. I'll leave it to Kees to decide whether to close this ticket.

Changed 3 years ago by work@…

  • component changed from Unknown to Import wizard

Changed 3 years ago by work@…

  • milestone changed from 0.8.3 to 0.8.4

Changed 3 years ago by business@…

  • status changed from assigned to closed
  • resolution set to fixed

Done now via gdtimporter.
