Microsoft Access Database interface for reading XPS files

The XPS file format does not easily allow data to be extracted. The Royal Liverpool and Broadgreen University Hospitals NHS Trust (rlbuht) uses a system containing patient records which allows the export of data in the XPS file format.

Problem

The XPS file format is perfect for printing, but provided a major problem for complex data analysis. The XPS file format was not suitable for data to be interrogated or exported.

The task was to build a software application which could be executed on the secure and restricted hospital computer systems which was capable of reading and analysing the data in the exported XPS files.

Solution

The solution to the problem was to build a Microsoft Access application for interfacing and with XPS files containing patient records. This solution would allow the data within the XPS file to be read into a relational database and queried using SQL. This application used the Microsoft platform as it was possible to execute on the NHS computer systems.

The main functionality of the application was coded in Visual Basic for Applications (VBA). On start up of the MS Access DB AutoExec runs and parsed the specified XPS file, which contains a list of patient appointments and other clinical data. This file was read into a set of tables in a relational database. Once the relational database is built, this data can then be interrogated using SQL, or can be exported into excel / CSV format for further analysis in other software packages.

Software

This section outlines some interesting and novel aspects of the software application. The particular aspects of interest in this application was the function for parsing the single long string of data from the XPS file.

Parsing Function

The parsing function processes a string of data from the XPS file. The format of the XPS file has a structure such that there are some cues that are consistent. These cues can be used to parse the patient records from the string. A snippet of the code used in the main function is given below.

Public Function ParseFile(fileName As String) As Boolean
On Error GoTo ErrHandle

    'Debug on(true)/off(false)
    debugMode False
    
    MsgBox "Importing file"
    If dbg Then MsgBox "Debug mode on"

    'drop table if exists
    dropTable
    'create table to contain parsed input
    createTable

    If dbg Then MsgBox "Reading File: " + fileName
    Dim varArray As Variant
    
    intFileNum = FreeFile
    Open fileName For Input Access Read As intFileNum
    

    
    'Loop for each line in input file
    Do While Not EOF(intFileNum)
        Line Input #intFileNum, strLine
        If dbg Then MsgBox "Line " + strLine
        varArray = Split(strLine, " ")
        
        Dim counter As Integer
        Dim endOfLine As Integer
        Dim resetCounter As Boolean
 
        Dim nhsnum As String
        Dim patientid As String

        'data init
        nhsnum = ""
        patientid = ""
        
        'init
        counter = 0
        resetCounter = False
        endOfLine = -1
        
        'Loop over each item
        For Each Item In varArray
            If dbg Then MsgBox "ItemNum:[" + CStr(counter) + "] endOfLine:[" + CStr(endOfLine) + "] Item:[" + Item + "]"
            
            'Item: nhs num
            If counter >= 0 And counter < 3 Then
                If dbg Then MsgBox "NHSnum[" + CStr(counter) + "]: " + Item
                'nhsnum = nhsnum + Item + " "
                nhsnum = nhsnum + Item + ""
            End If
                                   
            'Item: patient id
            If counter = 3 Then
                If dbg Then MsgBox "PID[" + CStr(counter) + "]: " + Item
                patientid = "" + Item + ""
            End If
                        
                        
            'Determine EOL
            If Item = "VISION" Then
                endOfLine = counter + 2
                If dbg Then MsgBox "Item=VISION " + CStr(endOfLine) + " " + CStr(counter)
            End If

            'Reset at end of line
            If endOfLine > -1 And counter >= endOfLine Then
                resetCounter = True
                endOfLine = -1
                If dbg Then MsgBox "ResetCounter " + CStr(endOfLine)
            End If
                        
            'resetCounter
            If resetCounter Then
                resetCounter = False
                counter = -1
                
                'Add current values to table
              '  MsgBox "Untrim [" + nhsnum + "]"
                
                'nhsnum = Trim(nhsnum)
                'patientid = Trim(patientid)
                
              '  MsgBox " trim [" + nhsnum + "]"
                
                addRecord nhsnum, patientid
                'addRecord Trim(nhsnum), Trim(patientid)
                
                'reset values
                nhsnum = ""
                patientid = ""
            End If
                 
            'increment counter
            counter = counter + 1
                        
        Next
    Loop
    
    MsgBox "Import complete"
Exit Function

For more information about this project contact me.