Towards Web-based Representation And Processing Of Health Information

Sheng Gao; Darka Mioc; Xiaolun Yi; Francois Anton; Eddie Oldfield; David J Coleman


Int J Health Geogr 



XML and OGC Web Services

The sharing of health information is critical for preventing diseases, responding to emergencies, and educating the public and policy makers. However, many health professionals and authorities lack tools to map health information. In some cases they cannot visualize health information quickly enough to make time-sensitive decisions, because they lack the time, money, or skills to statistically analyze vast amounts of distributed data and render the aggregated results in a geographic interface for interpretation. XML, web services, and related standards have matured, yet confidence in using such technology to visualize or share health information is only beginning to emerge.

XML, as a platform-independent language, can support information interchange and representation through the Web. XML has many advantages, such as platform and application independence, extensibility, user-driven development, and its status as an open standard for data interchange via the Internet.[13] Health Level 7 (HL7) standards promote health care information exchange through XML.[14] HL7 Clinical Document Architecture (CDA) is an XML standard used to exchange clinical documents. For example, an XML document can record a patient's allergy to certain medicines. However, the primary domain of HL7 standards is clinical and administrative data, and they do not cover explicit spatial information or health data mapping. Therefore, a standard format for sharing the representation of health information in time and space is needed.

To overcome the disadvantages of tightly coupled systems and improve their reusability, the concept of Service Oriented Architecture (SOA) has gained popularity recently. SOA provides a flexible way to share data as well as processing functions over the Internet to reduce the costs of building complex systems. SOA has many benefits, such as better return on investment, better maintainability, higher availability, flexible service assembly, more security, and support for multiple client types.[15] The Open Geospatial Consortium (OGC) initiated the Open Web Service (OWS) program based on service-oriented architectures and web services (a common implementation of service-oriented architectures), and has proposed several geospatial specifications to support geospatial data sharing and interoperation, such as Web Map Service (WMS), Web Feature Service (WFS), and Web Processing Service (WPS). WMS publishes its ability to produce maps rather than its ability to access specific data holdings, and generates spatially referenced maps dynamically.[16] WFS defines the interfaces for the access and manipulation of geographical features and elements through Geography Markup Language (GML).[17] WPS provides standardized interfaces to facilitate publishing, discovering and binding geospatial services that enable spatial processing functions across a network.[18] It regulates the connection rules of input request and output response that govern the geospatial processing event. The interfaces (GetCapabilities, DescribeProcess, and Execute) define how the client and server can cooperate in the execution of a process and generate the processing results. The data used in the WPS can be stored at the server side or acquired from a network. Accessing health information through standard interfaces is important to achieve data access and interoperability.
Using the standard geospatial service interfaces, the wide access of health information can improve the ability to intervene in health issues, inform the public of the availability of resources, reduce the number of people affected by illness, strengthen the cooperation between different health organizations, and therefore reduce costs to the health care system.
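
As a rough illustration of the three WPS interfaces named above, the following Python sketch builds key/value-pair GET URLs for GetCapabilities, DescribeProcess, and Execute. The endpoint `https://example.org/wps` and the process identifier `HealthStatistics` are placeholders invented for this example, and the parameter names assume the KVP encoding of WPS 1.0.0; a real deployment may expose different identifiers.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and process identifier, for illustration only.
WPS_ENDPOINT = "https://example.org/wps"
PROCESS_ID = "HealthStatistics"


def wps_request_url(operation, **extra):
    """Build a key/value-pair GET URL for one WPS operation."""
    params = {"service": "WPS", "request": operation}
    if operation != "GetCapabilities":
        # DescribeProcess and Execute require an explicit version.
        params["version"] = "1.0.0"
    params.update(extra)
    return f"{WPS_ENDPOINT}?{urlencode(params)}"


# 1. Discover which processes the server offers.
capabilities_url = wps_request_url("GetCapabilities")
# 2. Ask for the inputs and outputs of one process.
describe_url = wps_request_url("DescribeProcess", identifier=PROCESS_ID)
# 3. Run the process (inputs omitted here for brevity).
execute_url = wps_request_url("Execute", identifier=PROCESS_ID)

for url in (capabilities_url, describe_url, execute_url):
    print(url)
```

A client would normally issue these requests in this order, using the DescribeProcess response to decide which inputs to send with Execute.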

HEalth Representation XML (HERXML)

The HEalth Representation XML (HERXML) schema is designed for sharing, via the Web, the cartographic representation of health data together with descriptions of the data sources and the statistical methodologies used. There are different kinds of health activities, such as hospital observation, laboratory tests and results, healthcare and medication services, and training and education for patients. Since these activities are social events tied to spatial locations, the proper way to support mapping them is a foremost concern in geographic health applications. Statistical methods connect health-related activities with maps, so the methods used to generate maps from these activities need to be considered. The following statistical methods are applied in this research: Crude Morbidity Rate (CMR), Normalized Morbidity Ratio (NMR), Age-Specific Morbidity Ratio (ASMR), Age-Adjusted Morbidity Ratio (AAMR), Standardized Morbidity Ratio (SMR), Summation, Mean, Standard Deviation, Variance, Skewness, and Kurtosis. These statistical methods consider spatial, temporal, and demographic factors and their influence on health-related activities, and can therefore show how health information is distributed across space, time, age, and gender. Other statistical methods can be introduced to analyze other influential factors.
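
To make two of the listed measures concrete, the sketch below computes a crude rate and a standardized ratio in Python. The formulas are the common textbook definitions (cases divided by population, and observed cases divided by the cases expected under reference age-specific rates); the article does not spell out its exact formulas, so these should be read as assumptions. All numbers and function names are invented for illustration.

```python
def crude_rate(cases, population, per=100_000):
    """Crude rate: cases per `per` people, ignoring age structure."""
    return cases / population * per


def standardized_morbidity_ratio(observed, population, reference_rates):
    """Observed cases divided by cases expected under reference rates.

    All three arguments are dicts keyed by age group (indirect
    standardization, textbook form; assumed, not taken from the article).
    """
    expected = sum(population[g] * reference_rates[g] for g in population)
    return sum(observed.values()) / expected


# Invented example data for one region with two age groups.
observed = {"0-39": 4, "40+": 16}
population = {"0-39": 10_000, "40+": 5_000}
reference_rates = {"0-39": 0.0002, "40+": 0.002}  # cases per person

cmr = crude_rate(sum(observed.values()), sum(population.values()))
smr = standardized_morbidity_ratio(observed, population, reference_rates)
print(f"crude rate per 100,000: {cmr:.1f}")
print(f"standardized ratio: {smr:.2f}")
```

A ratio above 1 indicates more cases than the reference rates would predict for the region's age structure, which is the kind of value a thematic map would then classify and color.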

The intention is for the HERXML schema to support the Web-based representation of health information so that users can interpret the statistical results. Three dimensions of representation are related to spatial data: semantic, geometric and graphical.[19] Therefore, we include these three kinds of representation in the HERXML schema. Semantic representation describes the health-related activities, data sources, and the statistical methods used. Geometric representation specifies what type of geometry (point, line, or polygon) will be used to represent these health data. Graphic representation defines what styles or symbols are used to generate health maps.

The design of the HERXML follows an iterative process, as shown in Figure 1. It starts with user requirement collection and analysis, covering aspects such as the range of health information, related influential factors, and ways of representation. With the consideration of policy, privacy and security issues, the main concepts used in the representation of health information are determined. Next, an XML design software tool, Altova XMLSpy,[20] is used to encode the HERXML schema. After that, the HERXML schema is tested in applications to validate it against user requirements. The iteration continues with a new version of the HERXML schema until the end users are satisfied. Through this cyclic development process, the preliminary HERXML schema used in this project was defined (refer to Additional file 1).

Figure 1.

HERXML schema design process. The HERXML schema design process follows a cyclic development. The steps include user requirement collection and analysis, conceptual design, schema implementation, and schema validation in applications.

As shown in Figure 2, the designed HERXML schema includes three parts: health, mapping data, and representation.

Figure 2.

The HERXML schema. The HERXML includes a "Health" part, a "MappingData" part, and a "Representation" part.

The health part includes the basic information of the health-related activities, with the name, title, description, and keyword list elements, and a type attribute. HealthType is an abstract complex type. It can be extended to support disease observation or other activities.

The mapping data part mainly records the data used for mapping. As shown in Figure 3, it includes the bounding box of the data, the spatial data, the relation between spatial data and mapping values, and the mapping values.

  • "BoundingBox" represents the spatial range of the mapping data.

  • "SpatialData" could be GML from WFS services, GML records, or Xlink to GML databases. The data source item is used to show the metadata of the spatial data. The health data are statistical values and are linked with the spatial data through the joining attribute.

  • "Relation" records the linking attribute and the matching ID value of both spatial data and mapping values.

  • "MappingValues" includes the health data source description, the statistical method used, and the mapping value lists. The statistical method part describes the name, title, description, data source, and statistical parameters of the statistical method used. The data source description shows metadata of health information, such as the source of the data, the time range of the data, and the contact information. Statistical methods are used to generate classification maps and charts for health-related activities. We predefined some parameters from the spatial, temporal, and demographic aspects for public health, such as AgeFrom, AgeTo, and StartTime, which can show health distributions with spatial, temporal, age and gender differences. Users can add additional parameters in the parameter group to support advanced statistical methods.

Figure 3.

The mapping data part schema. The "mapping data" part schema includes a "BoundingBox" component, a "SpatialData" component, a "Relation" component and a "MappingValues" component.
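
The three-part layout described above can be sketched as a small document built with Python's standard `xml.etree.ElementTree`. Only the names Health, MappingData, Representation, BoundingBox, SpatialData, Relation, MappingValues, AgeFrom, AgeTo, StartTime, and PolygonStyleType come from the text; every other element name, attribute, URL, and value here is a guess made for illustration, and the actual schema in Additional file 1 may be organized differently.

```python
import xml.etree.ElementTree as ET


def build_herxml(values_by_region, bbox):
    """Assemble a hypothetical HERXML-like document (not the real schema)."""
    root = ET.Element("HERXML")

    # Health part: basic description of the health-related activity.
    health = ET.SubElement(root, "Health", type="DiseaseObservation")
    ET.SubElement(health, "Name").text = "example-disease"
    ET.SubElement(health, "Title").text = "Example disease observation"

    # MappingData part: extent, spatial data, join rule, and values.
    mapping = ET.SubElement(root, "MappingData")
    corners = dict(zip(("minx", "miny", "maxx", "maxy"), map(str, bbox)))
    ET.SubElement(mapping, "BoundingBox", corners)
    ET.SubElement(mapping, "SpatialData",
                  href="https://example.org/wfs?typename=regions")
    ET.SubElement(mapping, "Relation",
                  spatialAttribute="region_id", valueAttribute="id")

    values = ET.SubElement(mapping, "MappingValues")
    method = ET.SubElement(values, "StatisticalMethod", name="CMR")
    for name, text in (("AgeFrom", "20"), ("AgeTo", "64"),
                       ("StartTime", "2005-01-01")):
        ET.SubElement(method, name).text = text
    for region_id, value in values_by_region.items():
        ET.SubElement(values, "Value", id=str(region_id)).text = str(value)

    # Representation part: how the values should be drawn.
    representation = ET.SubElement(root, "Representation")
    ET.SubElement(representation, "Style", type="PolygonStyleType")
    return root


doc = build_herxml({"R1": 12.5, "R2": 30.1}, (0.0, 0.0, 10.0, 10.0))
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

The `Relation` element in this sketch plays the role described in the list above: it names the attribute on each side that a client would use to join a mapping value to its geometry.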

The Representation part defines the style used to represent health maps. It describes the default representation bounding box and style description. Depending on the kind of representation, the StyleType is extended to ChartStyleType, PointStyleType, LineStyleType, and PolygonStyleType. For instance, the PolygonStyleType includes the border and fill elements. The type of filling in a polygon can be gradient fill or range-based fill. For the range-based fill, the fill method can use color, pattern, and texture. The border element contains the color, line style and line weight of the border.
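
A range-based color fill of the kind mentioned above amounts to looking up which class interval a value falls into. The following Python sketch shows one way to do that; the break values, the color codes, and the function name `range_fill` are illustrative assumptions rather than part of HERXML.

```python
from bisect import bisect_right


def range_fill(value, breaks, colors):
    """Return the fill color of the class interval containing `value`.

    `breaks` are ascending class boundaries; `colors` needs one more
    entry than `breaks` (one color per interval). A value equal to a
    break falls into the upper interval.
    """
    if len(colors) != len(breaks) + 1:
        raise ValueError("need exactly len(breaks) + 1 colors")
    return colors[bisect_right(breaks, value)]


# Invented classification: three classes split at 50 and 100.
breaks = [50, 100]
colors = ["#ffffcc", "#fd8d3c", "#bd0026"]

rates = {"R1": 20, "R2": 50, "R3": 150}
fills = {region: range_fill(rate, breaks, colors)
         for region, rate in rates.items()}
print(fills)
```

A gradient fill would instead interpolate between two colors in proportion to the value, so no class breaks would be needed.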

WPS for Health Data Processing with HERXML

The procedure of our WPS design is shown in Figure 4. The input includes health data and parameters. The health data for the Web-based processing could be stored on the server (in databases or files) or acquired through remote access (through web services or remote transfers). The parameters can be encoded as Key/Value pairs or XML, and include the disease type, gender, age group, statistical method, time interval, spatial layer, and thematic mapping variables. The output of the processing could be either in a raster data format (JPEG, PNG, GIF) or in the vector data format (HERXML). The use of HERXML in processing can enhance people's understanding of the resulting mapped health information. In the configuration of the WPS, access to the WPS can be limited to certain domains or IP addresses. The WPS can be further divided into finer-grained services, with one processing service for the statistical calculation and another for the thematic mapping.

Figure 4.

A WPS for health data processing. The flow shows the input data, output data, and processing components of the designed WPS.
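
The Key/Value-pair input and the IP-based access limit described above might be handled on the server side along the following lines. This Python sketch assumes the semicolon-separated `DataInputs` encoding used by WPS 1.0.0; the input names (`method`, `gender`, `ageFrom`, `ageTo`), the allowed network, and both function names are hypothetical and not taken from the authors' implementation.

```python
from ipaddress import ip_address, ip_network
from urllib.parse import parse_qs

# Method abbreviations listed in the article; used only for validation.
KNOWN_METHODS = {"CMR", "NMR", "ASMR", "AAMR", "SMR"}
# Hypothetical allow-list (192.0.2.0/24 is a documentation-only range).
ALLOWED_NETWORKS = [ip_network("192.0.2.0/24")]


def is_allowed(client_ip):
    """Check the client address against the configured networks."""
    address = ip_address(client_ip)
    return any(address in network for network in ALLOWED_NETWORKS)


def parse_execute_query(query):
    """Extract the process inputs from a KVP Execute query string."""
    kvp = {key.lower(): values[0] for key, values in parse_qs(query).items()}
    inputs = {}
    for item in kvp.get("datainputs", "").split(";"):
        name, separator, value = item.partition("=")
        if separator:  # skip empty or malformed fragments
            inputs[name] = value
    if inputs.get("method") not in KNOWN_METHODS:
        raise ValueError(f"unknown statistical method: {inputs.get('method')!r}")
    return inputs


# The DataInputs value is percent-encoded, as a client would send it.
query = ("service=WPS&request=Execute&identifier=HealthStatistics"
         "&datainputs=method%3DCMR%3Bgender%3DF%3BageFrom%3D20%3BageTo%3D64")
inputs = parse_execute_query(query)
print(inputs, is_allowed("192.0.2.15"))
```

Splitting the service into a statistical step and a mapping step, as suggested above, would mean the first process returns values like these results for the second process to style.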

Architecture for Health Data Processing and Sharing

To implement a Web-based application for statistical exploration of health information, service oriented architecture is an effective solution.[21] In this research, we implement the standard OGC services including WMS, WFS, and WPS. The proposed architecture (see Figure 5) includes three tiers: a data tier, a service tier, and a web portal tier.

Figure 5.

Implemented health data processing and sharing architecture. The architecture contains a data tier, a service tier, and a web portal tier.

The data tier stores all the health data and related data for health studies. These data could be available from databases or web services.

The service tier implements WMS, WFS, and WPS for health studies.

  • WMS provides standard interfaces to generate maps and charts for visualization of health information. It utilizes the health mapping module to generate maps that show the distribution of events or facilities. The input data could be obtained from HERXML, GML, WFS, WPS, DBMS, or files.

  • WFS uses the GML transformation module to share spatial data through GML. It can be linked with the mapping values (part of HERXML) to create thematic health maps.

  • WPS is used to analyze spatio-temporal health data. The health data analysis supports rolling data up from a low spatial level to a high spatial level. WPS uses the health mapping module and statistical procedures. The input data of WPS could be obtained through WFS, GML, DBMS, or files.
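
Rolling data up from a low spatial level to a high one, as mentioned in the WPS item above, can be sketched as a grouped sum over a child-to-parent lookup. The unit and region names below are invented, and the function name `roll_up` is only illustrative.

```python
from collections import defaultdict


def roll_up(counts_by_unit, parent_of):
    """Sum case counts of small units into their parent regions."""
    totals = defaultdict(int)
    for unit, count in counts_by_unit.items():
        totals[parent_of[unit]] += count
    return dict(totals)


# Invented example: four small units nested in two larger regions.
cases = {"unit-a": 3, "unit-b": 5, "unit-c": 2, "unit-d": 7}
parent_of = {"unit-a": "region-1", "unit-b": "region-1",
             "unit-c": "region-2", "unit-d": "region-2"}

regional_cases = roll_up(cases, parent_of)
print(regional_cases)
```

Counts can be summed this way, but rates cannot: a rate at the higher level has to be recomputed from the rolled-up cases and the rolled-up population rather than averaged from the lower-level rates.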

The web portal tier is a client for the visualization of disease data and maps. It can bring together different facets of health information into one location to improve health promotion, health care research, education, and policy making.