Ticket #159 (closed enhancement: fixed)

Opened 6 years ago

Last modified 6 years ago

Excel exporter

Reported by: business@… Owned by: s.h.sikkema@…
Priority: major Milestone: 0.7.0
Component: Export functionality Version:
Keywords: Cc:
Hardware: Operating system:
Product: URL:

Description (last modified by business@…) (diff)

We need to implement the querying functionality as specified in the mockups on https://wiki.nbic.nl/index.php/DbNPFeatures#User_Interface_to_query_both_metadata_and_omics_data.

This is a big task, that we need to do in iterations. The first task, which is also what is needed for the DSP most at this stage, is to be able to export a study assay with a combination of the metadata and the measurement data from the module. In the DSP case, those are metabolomics peaks, but since the query REST interface that has to be implemented is general, we might as well program it general so that it will also work with other modules. We will for this ticket however test with the metabolomics module.

So to be clear, the outcome of the wizard should be an Excel file with for every sample in a certain assay, some metadata information and measurement information from the clean data module in which the assay data resides.

This functionality should reside under a new menu called 'Assays', and it should be named 'Export data to Excel'. When you click this, you should get a wizard with the following steps:

First step: Select assay

  • Dropdown: Select study (if a person is not logged in, redirect to login first, see View studies page how to do that)
  • Dropdown dependent on chosen study: Select assay

After first step: check if assay exists, clean data module is reachable, and actually has data for that assay, if not display an error message.

Second step: Which metadata to include

  • Checkbox: Subject data (optionally: show checkboxes for all fields that actually contain data for any sample, you can determine this by checking for all sample.parentSubject.giveFields() field if (sample.parentSubject.getField(field)) (hint: have a look at study/show, that one also only shows fields which actually have data)
  • Checkbox: Sampling Event data (optionally: show checkboxes for all Sampling Event fields that contain any data for sample.parentEvent)
  • Checkbox: Event data (also here optionally give the possibility to tick only specific fields) --> you can get the events by iterating over Sample.parentEventGroup.events
  • Checkbox: Sample data (same here)
  • Checkbox: Sample metadata from assay module (check if there is any via the REST method getSampleMetadataFields() of the assay module, in this testcase the metabolomics module)

Note: Sample parentEvent, parentSubject and parentEventGroup are nullable, so in case e.g. no subject is available for any sample, you can totally gray out Subject information.

All of these options should be on by default, the most likely case is that the user just wants a dump of all available information.

Third step: which assay data to include
This step requires some more calls to the assay module (in this testcase, the metabolomics module). At this point, call getMeasurements(assayToken) to determine which measurements were done in the assay.

Initially, just present the user with two choices here:

  • Optionbox: Get all [getMeasurements().count] measurements
  • Optionbox: Get a specific measurement: [dropdown with all measurements]

Final step: display confirmation

In this step, display the first five resulting rows of the Excel file. If there are more than 5 measurements in the assay module, also only show the first five columns.

The rows in this table obviously are all the samples in the assay.
The columns in the table are:

  • Sample name (always the first column)
  • Sample group (Sample.parentEventGroup.name) (if given for any sample)
  • Optionally, if specified, additional GSCF Sample properties
  • Optionally, if specified, additional assay module sample properties
  • Optionally, if specified, GSCF Subject properties
  • Optionally, if specified, GSCF Sampling Event properties
  • Optionally, if specified, GSCF Event properties
  • Then, the specified measurements from the assay module (all or just one) --> you can get this data by calling getMeasurementData()

Change History

comment:1 Changed 6 years ago by business@…

  • Description modified (diff)

comment:2 Changed 6 years ago by business@…

Jildau had a few comments and remarks, which I reflect here:

  • The Excel sheet might get very big. Partly that's why everything is optional (you can uncheck information that is not needed), but also we might have to generate R and MATLAB exports as well at some point.
  • The study list may grow large. That's why we also should add an 'Export' button to the full text search function (now in simpleQuery) which should lead straight to step 1 of this wizard with the study in question selected
  • Subject data might contain many fields: we partly circumvent this by only outputting fields that actually contain data. Also, if this gets into a problem, we could uncheck Subject information by default
  • check for Event information: maybe you also want to select on which Events you want information: indeed, if you implement the field checkboxes you already have to iterate over all the events (because the event templates of the events may differ)
  • Measurement dropdown might contain many measurements: that is a valid point, we probably need some sort of search function there. But let's first get this first version in place.
  • Possibility to output the different field categories to multiple Excel sheets within the output Excel file, instead of in one big table in the first sheet? @Jildau: I think this is a valid point, could you please point out which sheets should be generated and which information should be on which sheet?

comment:3 Changed 6 years ago by e.vansomeren@…

Currently, all studies except for the standard PPS3 and PPSH give an error when trying to export.

For example, login as user "user" and try to export WP5.1E1, WP5.1E2, WP5.3E1, GSE5509 or GSE5509_2.

Obviously, this part is still under construction. I just want to offer some examples for testing. Note: GSE5509 has other problems as well, so you could skip it. WPxxEx contain only study level info. GSE5509_2 is a fully filled-in study that is excellent for testing.

Hope this helps

comment:4 Changed 6 years ago by business@…

  • Milestone changed from 0.7 to 0.6.4

comment:5 Changed 6 years ago by s.h.sikkema@…

  • Status changed from new to closed
  • Resolution set to fixed

comment:6 Changed 6 years ago by business@…

  • Status changed from closed to reopened
  • Resolution fixed deleted
  • Milestone changed from 0.6.5 to 0.6.6

A few changes:

  • add info the steps:
    • Select the assay you want to export data from
    • Select the columns that you want to be included in the resulting Excel file
    • Below you see a preview of the resulting Excel file, click OK to generate it.
  • field select step:
    • change 'measurement tokens' to 'measurements'
    • put checkbox before the fields and put every field on a new line to improve layout
    • add (?) info boxes behind each field which give a popup with the field description on mouse over
    • multiple select box for measurements
  • test export of 1 or more measurements

comment:7 Changed 6 years ago by business@…

  • Status changed from reopened to assigned

comment:8 Changed 6 years ago by s.h.sikkema@…

  • Owner changed from s.h.sikkema@… to kees.vanbochove@…

Works on ci now. To test: log in, click 'Assays', 'Export Assay to Excel', select 'Assay Export Test Study' and click submit ...
There might be an issue where it says: Assay has no value (null). Try again and please try to figure out under what circumstances that occurs (if at all).

comment:9 Changed 6 years ago by business@…

  • Owner changed from kees.vanbochove@… to s.h.sikkema@…

When I log in as user on CI, and test exactly like you said, under all circumstances it gives me the error Assay has no value (null). What does this mean?

comment:10 Changed 6 years ago by business@…

I think it's a good idea to multiselect all measurements by default, makes it easier for the user.

comment:11 Changed 6 years ago by work@…

  • Milestone changed from 0.6.6 to 0.7

comment:12 Changed 6 years ago by s.h.sikkema@…

  • Status changed from assigned to closed
  • Resolution set to fixed
Note: See TracTickets for help on using tickets.