The XPS file format does not easily allow data to be extracted. The Royal Liverpool and Broadgreen University Hospitals NHS Trust (rlbuht) uses a system containing patient records which allows the export of data in the XPS file format.
The XPS file format is perfect for printing, but provided a major problem for complex data analysis. The XPS file format was not suitable for data to be interrogated or exported.
The task was to build a software application which could be executed on the secure and restricted hospital computer systems which was capable of reading and analysing the data in the exported XPS files.
The solution to the problem was to build a Microsoft Access application for interfacing and with XPS files containing patient records. This solution would allow the data within the XPS file to be read into a relational database and queried using SQL. This application used the Microsoft platform as it was possible to execute on the NHS computer systems.
The main functionality of the application was coded in Visual Basic for Applications (VBA). On start up of the MS Access DB AutoExec runs and parsed the specified XPS file, which contains a list of patient appointments and other clinical data. This file was read into a set of tables in a relational database. Once the relational database is built, this data can then be interrogated using SQL, or can be exported into excel / CSV format for further analysis in other software packages.
This section outlines some interesting and novel aspects of the software application. The particular aspects of interest in this application was the function for parsing the single long string of data from the XPS file.
The parsing function processes a string of data from the XPS file. The format of the XPS file has a structure such that there are some cues that are consistent. These cues can be used to parse the patient records from the string. A snippet of the code used in the main function is given below.
Public Function ParseFile(fileName As String) As Boolean On Error GoTo ErrHandle 'Debug on(true)/off(false) debugMode False MsgBox "Importing file" If dbg Then MsgBox "Debug mode on" 'drop table if exists dropTable 'create table to contain parsed input createTable If dbg Then MsgBox "Reading File: " + fileName Dim varArray As Variant intFileNum = FreeFile Open fileName For Input Access Read As intFileNum 'Loop for each line in input file Do While Not EOF(intFileNum) Line Input #intFileNum, strLine If dbg Then MsgBox "Line " + strLine varArray = Split(strLine, " ") Dim counter As Integer Dim endOfLine As Integer Dim resetCounter As Boolean Dim nhsnum As String Dim patientid As String 'data init nhsnum = "" patientid = "" 'init counter = 0 resetCounter = False endOfLine = -1 'Loop over each item For Each Item In varArray If dbg Then MsgBox "ItemNum:[" + CStr(counter) + "] endOfLine:[" + CStr(endOfLine) + "] Item:[" + Item + "]" 'Item: nhs num If counter >= 0 And counter < 3 Then If dbg Then MsgBox "NHSnum[" + CStr(counter) + "]: " + Item 'nhsnum = nhsnum + Item + " " nhsnum = nhsnum + Item + "" End If 'Item: patient id If counter = 3 Then If dbg Then MsgBox "PID[" + CStr(counter) + "]: " + Item patientid = "" + Item + "" End If 'Determine EOL If Item = "VISION" Then endOfLine = counter + 2 If dbg Then MsgBox "Item=VISION " + CStr(endOfLine) + " " + CStr(counter) End If 'Reset at end of line If endOfLine > -1 And counter >= endOfLine Then resetCounter = True endOfLine = -1 If dbg Then MsgBox "ResetCounter " + CStr(endOfLine) End If 'resetCounter If resetCounter Then resetCounter = False counter = -1 'Add current values to table ' MsgBox "Untrim [" + nhsnum + "]" 'nhsnum = Trim(nhsnum) 'patientid = Trim(patientid) ' MsgBox " trim [" + nhsnum + "]" addRecord nhsnum, patientid 'addRecord Trim(nhsnum), Trim(patientid) 'reset values nhsnum = "" patientid = "" End If 'increment counter counter = counter + 1 Next Loop MsgBox "Import complete" Exit Function
For more information about this project contact me.