Changes between Initial Version and Version 1 of 1.0CodeCleanup


Timestamp:
Feb 7, 2011, 4:26:09 PM (12 years ago)
Author:
jannekevdp@…
Comment:

moved from nbic wiki

     1== GSCF 1.0 Code Cleanup ==
     2
     3To professionalize our code base and to truly allow others to collaborate in this open source project easily, we have to make a few efforts. This page is intended to collect those efforts.
     4
     5=== Add authentication ===
     6
     7Without authentication, deploying GSCF on a public internet site is of limited use.
     8
     9=== Refactor the code in study/show ===
     10
     11The study/show code is in bad shape: controller and view logic are intertwined, and almost everything lives in the view. The data model and business logic should be moved to the controller, and the view should be split up into multiple templates.
     12
     13=== Ability to exchange templates ===
     14
     15To be able to copy example templates or specific template sets without modifying the code, we should agree on a template description format (probably XML, RDF, or XMI). [https://gforge.nbic.nl/tracker/index.php?func=detail&aid=121&group_id=53&atid=273 Feature request #121] is a prerequisite for this.
     16
     17=== Ability to export/import studies ===
     18
     19This feature could arguably be saved for a later version, but sooner or later we will need an open exchange format, preferably something that is easily parseable, such as XML. The templates could act as XML descriptors, but this is not necessary. See also [https://gforge.nbic.nl/tracker/index.php?func=detail&aid=80&group_id=53&atid=273 Feature request #80].
     20
     21=== Improve test coverage ===
     22
     23The test coverage of large portions of the code is very low. This undermines the reliability and maintainability of the code.
     24
     25==== Test study create wizard systematically ====
     26
     27To test the study create wizard, we need to implement some form of web testing (e.g. by using the Grails webtest plugin) to check whether the data in the flow scope remains consistent under all possible changes.
     28
     29E.g. what happens in the following scenario:
     30* create study
     31* create template
     32* select template
     33* enter some values
     34* change template: delete the fields you just entered values in
     35...
     36
     37=== Turn GSCF into a Grails plugin ===
     38
     39To really encourage people to build their own GSCF application, and to facilitate the development of omics modules, we could turn GSCF into a Grails plugin. This plugin should contain three key elements:
     40
     41* the templating structure
     42* the wizard
     43* the importer
     44
     45The first goal, encouraging people to build their own application, may be a little far-fetched, but if you look at the way MOLGENIS works, for example, this is a good approach. At the very least, it would help us clean up our own code by having a plugin project with the core elements and an application project with the specifics (in the end we may even need different application projects for the different consortia).
     46
     47The second goal is even more obvious. If you look at the current implementation of SAM, both the wizard and the importer code are copied there. Of course, that is not good programming practice. To refactor at that level, the nicest way of sharing this code would be to create a Grails plugin containing those elements.
     48
     49The metabolomics module could then also reuse this plugin and profit from the wizard, the importer, and the templating structure, because Excel-like editing is needed there as well.
     50
     51== GSCF 1.0 - Hurdles to take from a user perspective ==
     52
     53From a user perspective, the biggest obstacles to smoothly entering a medium-sized nutrigenomics study (let's say 80 subjects, 40 events, 1000 samples) are:
     54
     55=== Import of data in general ===
     56
     57The importer should recognize the header names right away, even if they don't match exactly (best guess). Setting all the dropdown boxes to the right property by hand can be very tedious. This applies even more to the SAM module.
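
The "best guess" matching described above could be sketched as follows. This is only an illustration in Python, not GSCF code: the function names, the normalization rule, and the 0.6 similarity cutoff are all assumptions, and the real importer would need its own tuning.

```python
import difflib


def normalize(name):
    """Lowercase a name and drop everything except letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def guess_properties(headers, properties, cutoff=0.6):
    """Map each spreadsheet header to the most similar property name.

    Headers without a sufficiently similar property map to None, so the
    user only has to set those dropdown boxes by hand.
    The cutoff value is an assumption, not a tested default.
    """
    by_normalized = {normalize(prop): prop for prop in properties}
    guesses = {}
    for header in headers:
        matches = difflib.get_close_matches(
            normalize(header), list(by_normalized), n=1, cutoff=cutoff
        )
        guesses[header] = by_normalized[matches[0]] if matches else None
    return guesses


if __name__ == "__main__":
    print(guess_properties(["Start Time", "Speceis", "xyz"],
                           ["startTime", "species", "gender"]))
```

Pre-selecting a guess in each dropdown, rather than committing it silently, would still let the user correct a wrong match before importing.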
     58
     59=== Entering of events ===
     60
     61Due to complex time-based schemas, the flat representation of all possible events x times grows quickly (to something like 40-50 events). Right now, the only viable way to enter all the events in a reasonable manner is to write them down in Excel and then import them, because you have to click 'Add' for each new event. It would already help if we could just say 'add 20 events of type X', as is already possible with Subjects.
     62But the real killer feature would be to enter the events in an event types x time (+ group?) table and have them added automatically. That would save a lot of entering, copying and group clicking.
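
As a rough sketch of the table idea, the function below flattens an event types x time table into individual events, optionally repeated per group. It is written in Python for illustration only; the table layout, the field names, and `events_from_table` itself are hypothetical and do not reflect the GSCF domain classes.

```python
def events_from_table(table, groups=(None,)):
    """Flatten an {event type: [start times]} table into event records.

    One record is created per (event type, start time, group) combination.
    With the default groups=(None,), each table cell yields one ungrouped event.
    """
    events = []
    for event_type, start_times in table.items():
        for start in start_times:
            for group in groups:
                events.append({"type": event_type, "start": start, "group": group})
    return events


if __name__ == "__main__":
    table = {
        "Blood sampling": ["0h", "2h", "4h"],
        "Diet challenge": ["0h"],
    }
    for event in events_from_table(table, groups=("A", "B")):
        print(event)
```

Four filled-in table cells and two groups already produce eight events here, which is the kind of repetitive clicking the table would remove.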
     63
     64=== Entering of samples ===
     65
     66With samples, it's basically the same story as with events, but the numbers are even worse. It's no use scrolling through 1000 samples to check whether each one has the right template. The template should be set automatically, depending on the parent SamplingEvent.
     67It would certainly help if the table could be reviewed after that, especially for smaller studies, but the automatic generation is the most important part.
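
A minimal sketch of the auto-set rule, assuming each sampling event knows which template its samples should use. The `sample_template` field and the simplified classes below are assumptions for illustration in Python; they are not the actual GSCF domain model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SamplingEvent:
    name: str
    sample_template: str  # hypothetical: template that samples of this event should get


@dataclass
class Sample:
    name: str
    parent_event: SamplingEvent
    template: Optional[str] = None


def assign_default_templates(samples):
    """Set the template of every sample that has none from its parent event.

    Templates that were already chosen are left alone, so a later manual
    review can still override the default. Returns the number of samples changed.
    """
    assigned = 0
    for sample in samples:
        if sample.template is None:
            sample.template = sample.parent_event.sample_template
            assigned += 1
    return assigned


if __name__ == "__main__":
    blood = SamplingEvent("Blood sampling", "Plasma sample")
    samples = [Sample("S1", blood), Sample("S2", blood, template="Custom")]
    print(assign_default_templates(samples), [s.template for s in samples])
```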
     68
     69=== Entering data into SAM ===
     70
     71As mentioned earlier, the column headers in the SAM importer should be recognized automatically, since there can be hundreds of them. It should also be possible to choose between different sheet layouts, the most common one being samples in rows and the different measurements (called MeasurementTypes) in columns.
     72Recognizing the headers is fine, but if a MeasurementType is not already in the database, you cannot import its data without importing the MeasurementType first. This is a little tedious as well. Maybe we should allow new MeasurementTypes to be added on the fly. It would be even fancier if SAM could connect to some public compound databases (HMDB, DrugBank, etc.) and search there to identify what you are importing.