Section Configuration

Learn why and how to configure section headers for TIES

Why do you need to configure section headers

There are two primary reasons to configure section headers in TIES:

  1. To improve the accuracy of your search results. For example, when searching pathology reports, you can restrict the search to the Final Diagnosis section and so filter out reports that contain your search term only in the Clinical History section. Without section configuration, you cannot limit a search to specific sections of the report.
  2. To standardize section headers by mapping them to their semantic equivalents. For example, reports might contain section headers such as FINAL DIAGNOSIS and FINAL PATHOLOGICAL DIAGNOSIS, both of which you would want to map to a “Final Diagnosis” section. This gives your researchers a more manageable list of sections to choose from when searching.

SectionHeaderConfig.txt Syntax

Section header configuration is used by the initial data importers (HL7ImportPipeController or DelimitedFileImporter) and the DeidPipeController. Create a SectionHeaderConfig.txt file and save it in the classes/config directory of each of these pipelines; this is the default location where the pipelines look for the section configuration. To use a different location, edit the caties.deid.sectionheader.cfg property in the caTIES.properties file.

The file must be a bar (|) delimited file. To add a comment line, start the line with the hash (#) character. The file has the following columns:

  1. Synthesized Name – The standard name used by TIES. (E.g. Final Diagnosis)
  2. Source Label – The section header as it appears in the input file. (E.g. FIN or FINAL DIAGNOSIS:)
  3. Document Type – The report type (e.g. PATHOLOGY or RADIOLOGY).
  4. Code – Whether TIES should detect concept codes in this section.
  5. Index – Whether TIES should index this section so that it can be searched on.
  6. Histogram – Not used. Should always be set to false.
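
As an illustration of the format described above, the following Python sketch parses a small sample configuration. The sample rows, the true/false spelling of the Code, Index and Histogram columns, and the names SectionRule and parse_section_config are assumptions made for this example; they are not taken from the TIES source.

```python
from typing import List, NamedTuple

# Hypothetical sample file content; rows and flag spelling are assumptions.
SAMPLE_CONFIG = """\
# Synthesized Name|Source Label|Document Type|Code|Index|Histogram
Final Diagnosis|FIN|PATHOLOGY|true|true|false
Final Diagnosis|FINAL PATHOLOGICAL DIAGNOSIS|PATHOLOGY|true|true|false
Clinical History|CLINICAL HISTORY|PATHOLOGY|false|true|false
"""


class SectionRule(NamedTuple):
    synthesized_name: str
    source_label: str
    document_type: str
    code: bool
    index: bool
    histogram: bool


def parse_section_config(text: str) -> List[SectionRule]:
    """Parse bar-delimited rows, skipping blank lines and # comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 6:
            raise ValueError(f"expected 6 columns, got {len(fields)}: {line!r}")
        name, label, doc_type, code, index, histogram = fields
        rules.append(SectionRule(
            name, label, doc_type,
            code.lower() == "true",
            index.lower() == "true",
            histogram.lower() == "true",
        ))
    return rules


rules = parse_section_config(SAMPLE_CONFIG)
```

Both FIN and FINAL PATHOLOGICAL DIAGNOSIS map to the same Synthesized Name here, which is the standardization described earlier in this page.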

 

Configure for input file containing the report text as separate sections

For HL7ImportPipeController

Per the specification, the HL7ImportPipeController looks in the OBX-3.1 field for the source section type. Create a row in the section header config file for every possible value of the OBX-3.1 field: list the value under the Source Label column and the section it maps to under the Synthesized Name column. When assembling the full report text, the HL7ImportPipeController uses the Synthesized Name as the section header. For example, if FIN is mapped to Final Diagnosis in the configuration file, the FIN section text is preceded by Final Diagnosis on a separate line.
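
The assembly step described above can be sketched in Python as follows. This is not the importer's actual code: the mapping values, the CLN label, the function name assemble_report, and the fallback to the raw label for unmapped values are all assumptions for illustration.

```python
# Hypothetical mapping from OBX-3.1 values (Source Label) to Synthesized Name.
SECTION_MAP = {"FIN": "Final Diagnosis", "CLN": "Clinical History"}


def assemble_report(obx_rows):
    """Build report text from (obx_3_1_value, section_text) pairs.

    Each section's text is preceded by its Synthesized Name on its own line.
    """
    parts = []
    for source_label, text in obx_rows:
        # Falling back to the raw label for unmapped values is an assumption.
        header = SECTION_MAP.get(source_label, source_label)
        parts.append(header)
        parts.append(text)
    return "\n".join(parts)


report = assemble_report([
    ("CLN", "Patient with a breast mass."),
    ("FIN", "Invasive ductal carcinoma."),
])
print(report)
```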

For DelimitedFileImporter

To import delimited input files that have a separate row per section of the report, include the ties.model.section_type column in the input file configuration. Create a row in the section header config file for every possible value of this column: list the value under the Source Label column and the section it maps to under the Synthesized Name column. The DelimitedFileImporter uses the Synthesized Name column differently than the HL7ImportPipeController does. While the HL7ImportPipeController uses the value as the section header in the report text, the DelimitedFileImporter uses it only to map the different source headers to their standardized section. The actual section header in the text is generated from the source header column itself. If you set the caties.delimimporter.pretty.headers property to true in the caTIES.properties file for the DelimitedFileImporter, the importer upper-cases the source header and uses it as the section header in the text; otherwise it uses the source header as is.
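
The difference in header handling can be sketched in Python. The function render_row, the sample mapping, and the placement of the header on its own line before the section text are assumptions for illustration, not the DelimitedFileImporter's actual implementation.

```python
# Hypothetical Source Label -> Synthesized Name mapping (sample values only).
SECTION_MAP = {"Final Dx": "Final Diagnosis", "Clin Hx": "Clinical History"}


def render_row(source_header, text, pretty_headers):
    """Return (standardized_section, report_text_fragment) for one input row."""
    # The Synthesized Name is used only to map the row to its section.
    standardized = SECTION_MAP.get(source_header)
    # The header written into the text comes from the source header itself.
    header = source_header.upper() if pretty_headers else source_header
    return standardized, f"{header}\n{text}"


pretty = render_row("Final Dx", "Invasive ductal carcinoma.", True)
plain = render_row("Final Dx", "Invasive ductal carcinoma.", False)
```

In both calls the row maps to the Final Diagnosis section; only the header text written into the report differs.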

 

 

Configure for input file containing the report text as a whole

For CaTIES_NonSectionedHL7ImportPipeController

Use the CaTIES_NonSectionedHL7ImportPipeController class instead of the CaTIES_Hl7ImportPipeController class. This special-case HL7 importer is for input in which the OBX rows do not identify different sections but are individual lines of the report. The CaTIES_NonSectionedHL7ImportPipeController simply concatenates all the OBX row values to create the report text. You still need to provide a SectionHeaderConfig.txt so that TIES can identify the sections within the text. Create a row in the section header config file for every section header that may appear in the incoming report text: list the header under the Source Label column and the section it maps to under the Synthesized Name column.
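
A rough Python sketch of this flow is shown below. How TIES actually detects headers inside the text is not described here, so the newline separator between OBX values, the exact whole-line header match, and the names build_text and split_sections are all assumptions.

```python
# Hypothetical Source Label -> Synthesized Name mapping (sample values only).
SECTION_MAP = {
    "CLINICAL HISTORY:": "Clinical History",
    "FINAL DIAGNOSIS:": "Final Diagnosis",
}


def build_text(obx_values):
    """Concatenate OBX row values; one value per report line is assumed."""
    return "\n".join(obx_values)


def split_sections(text):
    """Group lines under the most recent line that matches a Source Label."""
    sections = {}
    current = None
    for line in text.splitlines():
        name = SECTION_MAP.get(line.strip())
        if name is not None:
            current = name
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(lines) for name, lines in sections.items()}


text = build_text([
    "CLINICAL HISTORY:",
    "Patient with a breast mass.",
    "FINAL DIAGNOSIS:",
    "Invasive ductal carcinoma.",
])
sections = split_sections(text)
```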

For DelimitedFileImporter

If you do not include the ties.model.section_type column in the input file configuration, the DelimitedFileImporter assumes that the text in the ties.model.text column is the entire report text. You still need to provide a SectionHeaderConfig.txt so that TIES can identify the sections within the text. Create a row in the section header config file for every section header that may appear in the incoming report text: list the header under the Source Label column and the section it maps to under the Synthesized Name column.