1) Is all of the data kept behind a firewall at all times?
All “identified” data is kept behind a firewall, but reports that have been de-identified reside on a public DMZ server.
2) Would all of the de-identification process be done at my institution?
The TIES de-identification process happens solely at the individual institution. No data is transferred to the cloud. Each institution is responsible for having the proper data formats, and the de-identification pipeline runs on its internal servers. Licensing the De-ID Corp software is recommended, but TIES does include open-source de-identification software from MGH. Most organizations have chosen to license the commercial option for its accuracy and processing of the reports. Again, all “identified” data is processed and stored behind the individual institution’s database servers and firewall, but reports that have been de-identified reside on a public DMZ server.
3) Where is the data storage done (i.e. would the data storage be local, on the cloud, or at Pitt)?
All “identified” data is processed and stored behind the individual institution’s database servers and firewall, but reports that have been de-identified reside on a public DMZ server. The only data that resides at Pitt is authentication data, which is just the username (no passwords; they reside on your node), and host metadata (DNS/IP).
4) What is the typical workflow for extraction from old reports?
It can vary by institution. For more information, see http://ties.dbmi.pitt.edu/documentation/for-installers/loading-your-data/
5) Is there any way for TIES to update itself, by intercepting the data from one database to another?
In general, TIES can be set up to ingest data “automatically”. I’ll briefly explain our setup. We have an automatic feed from our MARS data repository, which the MARS team has set up to push reports nightly to a specific data server running our custom data-processing importer. The reports are pushed to a directory that the importer periodically polls. When it detects files in the directory, it processes them and loads them into TIES automatically, with no human intervention.
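The polling pattern described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual TIES importer: the function names, the directory layout, and the `load_report` callback (standing in for whatever loads a report into TIES) are all hypothetical.

```python
import shutil
import time
from pathlib import Path
from typing import Callable


def poll_once(inbox: Path, processed: Path,
              load_report: Callable[[str], None]) -> int:
    """Load every file currently in `inbox`, then move it to `processed`.

    `load_report` is a hypothetical stand-in for the step that loads one
    report into TIES. Returns the number of files handled.
    """
    processed.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(inbox.iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        load_report(path.read_text(encoding="utf-8"))
        # Move the file out of the inbox so it is not loaded twice.
        shutil.move(str(path), str(processed / path.name))
        count += 1
    return count


def run_forever(inbox: Path, processed: Path,
                load_report: Callable[[str], None],
                interval_seconds: float = 60.0) -> None:
    """Poll the inbox at a fixed interval (assumed value) with no human intervention."""
    while True:
        poll_once(inbox, processed, load_report)
        time.sleep(interval_seconds)
```

A real importer would also need to cope with files that are still being written when the poll runs, for example by having the sender write under a temporary name and rename the file once the transfer completes.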
6) For extraction from a database, what is the typical time commitment that the IT staff would have to invest? This is both for extraction of past reports and for ongoing extraction to keep it up to date.
It will vary by institution and by the skill and capabilities of the IT staff in working with Cerner, HL7, etc. The process will need to undergo several iterations of testing and validation of data between the two systems. Depending on the number of reports, the complexity of the extract interfaces, getting the HL7 formats correct, etc., this could be a significant time commitment.
7) Who decides the level of access for each individual user?
Each institution governs its own policies; who gets access to which studies is solely the institution’s responsibility. Even the sharing of studies with other TCRN network participants is controlled by the individual organization. Pitt has no control over node sites at all. We only host the central server, which is used to look up where each site resides and to perform basic authentication of user information so that users can be routed to the appropriate node.
8) Rare diseases - would the quarantining system stop diseases that only have a few matches from showing up when searching?
TIES has a feature that will not return result sets smaller than a certain “threshold”. This will most likely prevent those rare-disease cases from being returned.
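As a rough illustration of how such a threshold behaves, the following Python sketch suppresses any result set smaller than a minimum size. It is not TIES code: the function name and the threshold value of 5 are assumptions made for the example, and the real threshold would be whatever the site has configured.

```python
from typing import List, Sequence

# Hypothetical minimum result-set size; the real value is site-specific.
MIN_RESULT_COUNT = 5


def suppress_small_result_sets(results: Sequence[str],
                               threshold: int = MIN_RESULT_COUNT) -> List[str]:
    """Return the results only if there are at least `threshold` of them.

    Smaller result sets are withheld entirely, so a query that matches
    only a handful of cases returns nothing rather than a short list.
    """
    if len(results) < threshold:
        return []
    return list(results)
```

With this rule, a search matching three reports would return nothing, while a search matching five or more would return the full list.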
9) How does TIES handle molecular data files?
TIES doesn’t do anything with whole exome data at this time. There will most likely NOT be support for this data in the immediate future. The only molecular data that would be in there would be summarized data from the pathology report.