High Performance GML to SVG Transformation for the Visual Presentation of Geographic Data in Web-Based Mapping Systems

Kenneth S. Herdy

Graduate Student
Simon Fraser University School of Computing Science


Surrey
250-13450 102nd Avenue
V3T 0A3
Canada

Kenneth S. Herdy completed an Advanced Diploma of Technology in Geographical Information Systems at the British Columbia Institute of Technology in 2003 and earned a Bachelor of Science in Computing Science with a Certificate in Spatial Information Systems at Simon Fraser University in 2005. He is currently pursuing graduate studies in Computing Science at Simon Fraser University with industrial scholarship support from the Natural Sciences and Engineering Research Council of Canada, the Mathematics of Information Technology and Complex Systems NCE, and the BC Innovation Council. His research focus is an analysis of the principal techniques that may be used to improve XML processing performance in the context of the Geography Markup Language (GML).

David S. Burggraf

Director of Research and Development
Galdos Systems Inc. Research and Development


Vancouver
Suite 1300-409 Granville Street
V6C 1T2
Canada

David S. Burggraf earned his Ph.D. at the University of British Columbia (Mathematics) in 2003. He is presently the Director of Research and Development at Galdos Systems Inc. with research interests in the application of semantic, mathematical and spatial object modeling techniques to distributed geographic information systems. David is an active member of the Open Geospatial Consortium (OGC) and is a key contributor to the development of GeoWeb standards such as OGC KML, GML and GMLJP2.

Robert D. Cameron

Professor of Computing Science
Simon Fraser University School of Computing Science


Surrey
250-13450 102nd Avenue
V3T 0A3
Canada

Robert D. Cameron earned his Ph.D. from the University of British Columbia in 1983. He is presently a Professor of Computing Science at Simon Fraser University with research interests in programming languages, software engineering, data compression, digital libraries and sociotechnical design of public computing infrastructure. Since completing a six-year term as Associate Dean of Applied Sciences in 2004, his research focus has been on high-performance XML processing using the SIMD capabilities of commodity processors.


Abstract


A performance study considering several alternatives for the visualization of geographic information (GI) in on-demand web-based mapping systems focused primarily on server-side generation of SVG from data encoded in Geography Markup Language (GML). In that context, a detailed performance comparison of GML to SVG transformation using several XML transformation technologies was carried out, including Java-based XSLT technologies, direct SAX implementations in both Java and C++, as well as high performance implementations using the recently released Intel XML Software Suite 1.0 and the new high performance XML technology based on parallel bit streams (Parabix). Other alternatives considered in the traditional 3-tier architecture for on-demand web-based mapping systems include the use of data parallel, AJAX-based architectures, as an alternative to traditional multi-threaded, server-side approaches in the generation of SVG map layers. The possibilities of using streaming SVG technologies and progressive rendering to reduce latency are also investigated.

Whereas XSLT technologies were found to be competitive with direct SAX-based implementations, performing within a factor of two, implementations using the high-performance Parabix framework offered the best performance, with a speed-up over worst-case XSLT of well over an order of magnitude. This performance improvement is primarily due to the reduction of XML parsing cost using parallel bit stream technology, but care is required in using the framework in order to avoid other bottlenecks in the transformation process. For example, an initial C-based implementation on top of Parabix was found to have a significant remaining bottleneck in formatted output, which was eliminated by careful reimplementation.

The client-side translation and scaling features of SVG were found to be of substantial value in addressing server-side performance. While the initial naive XSLT implementations performed transformations from world to screen using XSLT extension functions within the GML to SVG software, it proved possible to avoid this work by providing the client with the appropriate transformation matrix parameters. With some GML data sets dominated by long lists of coordinate data, this optimization proved quite valuable in all implementations of the GML to SVG transformation benchmarks. Overall, the highly efficient parallel bit stream based scanning routines of the Parabix engine provided the best performance results in the parsing of long coordinate lists. No apparent client-side performance degradation due to client side coordinate transformation was observed on any of the test platforms.

Although programming within the Parabix framework is similar in nature to programming under SAX, it would ultimately be desirable to provide high-performance alternatives using high-level programming paradigms such as that of XSLT. Implementation of a high-performance XSLT processor using Parabix is an active focus of our ongoing research. A further research direction worth considering is the possibility of deploying Parabix on the client side to improve SVG rendering performance.


Table of Contents

Introduction
XML Transformation Technologies
XSLT Design Patterns
GML to SVG Document Transformation
Source GML Document Structure
Destination SVG Document Structure
Multiple Source GML Documents
Performance Evaluation
High Performance XML Technologies
Test Environment
Hardware Events
Benchmark Data Characteristics
GML to SVG Benchmarks
Performance Results
Discussion
Data Parallelism
Conclusion
Acknowledgements
Bibliography

The visualization of geographic information is one of the primary goals of on-demand web-based mapping systems [1]. Web-based mapping systems commonly encode spatial data with GML for transmission and with SVG for display [1][2][3]. GML is an XML grammar defined by the Open Geospatial Consortium (OGC) to encode geographical features [4]. As an XML grammar, GML is platform neutral and is well suited to facilitate the exchange of spatial data over the Internet. GML, however, is not a visualization format. Rather, GML relies on commercially available viewers for data visualization, with Scalable Vector Graphics (SVG) viewers being one of the most common [1]. Large volumes of GML data are typical in on-demand web-based mapping, and as a consequence, the visualization of GML as SVG requires high-performance GML to SVG translation.

In general, a three step approach is required to model and display spatial data using GML and SVG. The following steps describe this process.

  1. Model and persist spatial data.

    The modelling and persistence of GML data is not often based on GML feature collections, but rather geographic features are typically mapped and stored in relational database tables. WFS (Web Feature Services) feature collections are used to aggregate and encode query results as GML for transmission to the client.

    By convention, many application types such as Open Geospatial Consortium (OGC) Component Web Map Services (FPS), Map Style Editors, and query results handlers for WFS organize spatial data at the GML document level as collections of GML features of the same GML feature type. That is, geographic features are first classified by GML feature type, and then grouped and stored as collections of GML features within separate GML documents. In general, a set of GML encoded query results may be obtained from several different network locations.

  2. Transform and assemble a set of source GML documents into a single SVG document for display.

    In general, the transformation and assembly of a set of GML documents into a single SVG document involves the extraction and translation of GML encoded spatial data to SVG. Conventionally, each GML data store query result consists of a single GML feature collection, with each feature collection corresponding to a distinct SVG map layer in the rendering of the final SVG document.

    For example, a collection of GML encoded river features may be rendered after other feature types, such as surface topography, or before additional feature types, such as bridges or boats. Consequently, the display order of the set of source GML documents must be represented in the final SVG encoding. In SVG, rendering order is defined by the Painter's Model, as described in the SVG 1.1 Specification [5]. SVG rendering order follows pre-order document traversal. Overall, map layer rendering order is application specific.

  3. Parse and render the generated SVG map document.

    An SVG viewer parses and renders the SVG map document for display.

Several well-known technologies exist to parse and extract the spatial data contained within GML documents for translation to SVG. Commonly used approaches include Java or C++ implementations using SAX (Simple API for XML), DOM (Document Object Model) or pull parser interfaces, and declarative implementations using XSLT (Extensible Stylesheet Language Transformations) or XQuery processors. For the traditional programming approaches, we have confined our study to several SAX alternatives, avoiding the performance impact of tree-building with DOM and choosing SAX over the similarly performing pull parser model because it is more widely known and used. In addition, whereas XQuery may be easier to learn than XSLT, GML to SVG translation represents an XML to XML translation problem commonly solved with XSLT based software. So, among declarative language approaches we have chosen XSLT over XQuery.

SAX is a streaming interface [6] that provides serial access to the contents of an XML document. In general, a SAX parser functions as a stream parser, which provides an event-driven API to the application developer. Applications receive information from XML documents in a continuous stream, without backtracking or navigation [6]. A SAX parser does not maintain application level parse state context information. Instead, the maintenance of state information is the responsibility of the application.
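The application-side state tracking described above can be sketched with Python's standard xml.sax module. This is only an illustration and not one of the benchmarked implementations; the class name CoordinateCollector, the function collect_coordinates and the sample input are assumptions made for the example.

```python
import io
import xml.sax
from xml.sax.handler import feature_namespaces

GML_NS = "http://www.opengis.net/gml"


class CoordinateCollector(xml.sax.ContentHandler):
    """Hypothetical handler that collects the text of gml:coordinates elements."""

    def __init__(self):
        super().__init__()
        self.in_coordinates = False  # parse state is kept by the application
        self.buffer = []
        self.coordinates = []

    def startElementNS(self, name, qname, attrs):
        # With namespace processing on, name is a (uri, localname) pair.
        if name == (GML_NS, "coordinates"):
            self.in_coordinates = True
            self.buffer = []

    def characters(self, content):
        # A single text node may arrive in several chunks.
        if self.in_coordinates:
            self.buffer.append(content)

    def endElementNS(self, name, qname):
        if name == (GML_NS, "coordinates"):
            self.coordinates.append("".join(self.buffer).strip())
            self.in_coordinates = False


def collect_coordinates(gml_bytes):
    """Parse GML bytes serially and return every coordinates string found."""
    handler = CoordinateCollector()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(gml_bytes))
    return handler.coordinates
```

The parser itself holds no notion of "inside a coordinates element"; the in_coordinates flag is the kind of state the application has to maintain on its own.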

SAX has a reputation as an efficient XML parsing model but often requires additional implementation effort and greater software development skill [6][7]. SAX is not an open standard and is not portable across programming languages. Despite these limitations, in many scenarios the efficiency of SAX together with the capability to process large XML documents in linear time and near-constant memory makes SAX a favored choice [6].

According to the W3C XSLT 1.0 Specification, XSLT is primarily designed for XML to XML document transformation [8]. Document Object Model (DOM) based XSLT processors provide random memory access and maintain parse state information but typically at the cost of increased memory usage. This additional memory requirement tends to eliminate DOM-based XSLT processors as a viable transformation alternative in the processing of large GML documents. Interestingly, despite its memory requirements, XSLT is commonly presented as the technology of choice to perform GML to SVG translation [1][2][3][9]. As a declarative language, the appeal of XSLT may be attributable to a perceived ease of use for non-programmers, or alternatively from the perspective of system architects, the appeal of XSLT may be enhanced portability and flexibility offered by open standard compliant XSLT processors.

Michael Kay, author of XSLT: Programmer's Reference, describes the "fill-in-the-blanks" stylesheet pattern as a common XSLT stylesheet design pattern in which an XSLT stylesheet acts largely as an output template but with the addition of extra tags used to retrieve and insert variable data at particular points in the destination document [10]. GML to SVG translation corresponds to this "fill-in-the-blanks" pattern but with the additional characteristic that the source GML document is traversed serially and without backtracking. Kay's "fill-in-the-blanks" pattern together with serial source document traversal is straightforward to implement using the SAX event based API. Consequently, despite minor additional implementation complexity, SAX-based GML to SVG translation provides a reasonable alternative to XSLT.

The focus of this paper is the evaluation of GML to SVG transformation performance. Section 2 of this paper describes the GML to SVG translation problem. Section 3 then moves on to describe the methodology of our performance analysis. Section 4 presents the performance results. Section 5 provides an analysis of the performance results and describes data parallel GML to SVG translation with particular emphasis on system architecture and the reduction of request/response latency. Section 6 concludes the paper with a summary of the results and directions for future work.

GML to SVG document transformation involves the extraction and translation of source GML encoded features to equivalent destination SVG encodings. A basic understanding of the GML primitives and their equivalent SVG counterparts is necessary to perform this translation.

GML contains a rich set of primitives. The GML feature primitive and the GML geometry primitives are required to map GML to SVG.

In GML, a feature is an application defined object that represents a physical entity such as a bridge, river, or road [11]. In general, GML models real world concepts as geographic features, which are delivered as feature collections by feature services and organized by component web map services into feature layers, commonly referred to in the mapping world as GML feature layers. Each GML feature can contain multiple GML geometries. A set of transformed GML feature layers comprise the layers of an SVG encoded map.

The GML version 3.1.1 specification encodes coordinate data as child elements of GML geometry elements. The following code fragments demonstrate the various GML coordinate data encodings shown as child elements of GML geometry elements [12].

<gml:Point>
    <gml:coordinates cs="," decimal="." ts="  ">491837.890625,5459107.421875</gml:coordinates>
</gml:Point>
                                
<gml:LineString>
    <gml:pos>491837.890625 5459107.421875</gml:pos>
</gml:LineString>

<gml:LineString>
    <gml:coordinates cs="," decimal="." ts="  ">491837.890625,5459107.421875</gml:coordinates>
</gml:LineString>

<gml:Polygon>
    <gml:posList dimension="2">491837.890625 5459107.421875 492837.890625 5429107.42187</gml:posList>
</gml:Polygon>

SVG is an XML grammar used to encode 2-dimensional vector graphics. In the translation of GML to SVG, a destination SVG document is generated which contains a root SVG element. This root element in turn contains one or more group elements. Each group element corresponds to a distinct GML layer and contains the necessary information to render the set of spatial features contained within that layer. Layer specific styling rules are applied to each SVG map layer.

The following source GML document values are sufficient to generate a corresponding SVG destination document.

  1. GML bounding box coordinate pair values.

    Minimum and maximum bounding box coordinate pair values facilitate the transformation of a GML world coordinate system to an equivalent SVG screen coordinate system.

  2. GML geometry object identifiers.

    Unique GML geometry object attribute values map to unique SVG path attribute values. Unique path identifier values provide a means to select and identify GML geometry objects as the mapped SVG representation within SVG client viewers.

  3. GML geometry object coordinate data.

    Geometry object coordinate data is modified and assigned to the data or 'd' attribute of the corresponding SVG path element. This process involves the translation of GML encoded coordinate data to SVG encoded path data. SVG coordinate path data must be prepended with a single 'M' (absolute move to) command letter. If required, a single 'L' (absolute line to) command letter is also inserted after the first coordinate pair.
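The coordinate rewriting in step 3 can be sketched as follows. The function name gml_coordinates_to_path_data is hypothetical, and the sketch assumes the encoding used in the fragments above: a comma between x and y, whitespace between coordinate pairs, and a period as the decimal separator.

```python
def gml_coordinates_to_path_data(text):
    """Turn GML 'x,y x,y ...' text into SVG path data 'Mx,y Lx,y x,y ...'."""
    pairs = text.split()  # coordinate pairs are separated by whitespace
    if not pairs:
        return ""
    data = "M" + pairs[0]  # single absolute move-to on the first pair
    if len(pairs) > 1:
        # One 'L' after the first pair; later pairs reuse it implicitly.
        data += " L" + " ".join(pairs[1:])
    return data
```

For instance, gml_coordinates_to_path_data("1,2 3,4 5,6") yields "M1,2 L3,4 5,6". Because the coordinate text is copied rather than parsed into numbers, no precision is lost in this step.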

To achieve world to screen coordinate system translation, a global scaling operation, followed by a global translation operation is applied to each SVG feature layer group element. Transformation operations are based on the world coordinate system bounding box values and the SVG screen coordinate system bounding box values. The following figure illustrates the GML to SVG, world to screen coordinate system transformation process in terms of basic scaling and translation matrix operations. Alternatively these individual transformations may be combined to yield an equivalent SVG transformation matrix.


The above diagram demonstrates GML world coordinate reference system to SVG screen coordinate reference system transformation via the SVG transform attribute. The area represented by the blue rectangle labelled '1' illustrates a GML region containing a single triangle object expressed with respect to a world coordinate reference system. In this example, world coordinate y-values increase upwards and screen coordinate y-values increase downwards. As a consequence, the triangle appears inverted. Applying an SVG scaling operation produces the blue area labelled '2'. In this case, GML coordinates are now scaled to the resolution of SVG screen coordinates with y-values reflected across the x-axis. The scaled area is then shifted to the gold location labelled '3' via the SVG translate operation. The gold location represents the viewable on-screen region of the SVG viewbox.
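A minimal sketch of how the transform attribute parameters might be derived from a bounding box is given below. The function names and the sample extent (a 23000-unit-wide world region drawn on a 600-pixel-wide screen, with its upper-left corner at 482000, 5472000) are assumptions chosen for illustration, and a uniform scale in both directions is assumed.

```python
def layer_transform(min_x, max_y, world_width, screen_width):
    """Return (scale, transform attribute value) for a layer group element."""
    scale = screen_width / world_width
    # The negative y scale flips the axis: world y grows up, screen y grows down.
    attribute = (f"scale({scale},{-scale}) "
                 f"translate({-min_x:.6f},{-max_y:.6f})")
    return scale, attribute


def world_to_screen(x, y, min_x, max_y, scale):
    """Apply the same mapping by hand: shift to the origin, then scale and flip."""
    return scale * (x - min_x), -scale * (y - max_y)


scale, attribute = layer_transform(482000, 5472000, 23000, 600)
```

With these values the upper-left world corner maps to the screen origin and the point (505000, 5449000) maps to approximately (600, 600). The server only emits the attribute string; the arithmetic in world_to_screen is what the SVG viewer performs on the client.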

The following GML document fragment illustrates the basic structure of a source GML document. In particular, this GML fragment models a 'bridge' feature collection.

<?xml version="1.0" encoding="UTF-8"?>
<gml:FeatureCollection xmlns:gml="http://www.opengis.net/gml" xmlns:van="http://www.galdosinc.com/vancouver" xmlns:xlink="http://www.w3.org/1999/xlink">
    <gml:boundedBy>
        <gml:Envelope srsName="EPSG:32610">
            <gml:lowerCorner>485831.999999062 5449693.02392436</gml:lowerCorner>
            <gml:upperCorner>499865.999999993 5471525.99963723</gml:upperCorner>
        </gml:Envelope>
    </gml:boundedBy>
    <gml:featureMember>
        <van:Bridge gml:id="Bridge294">
            <gml:description xlink:type="simple">RoadStructures bridge</gml:description>
            <gml:name/>
            <gml:centerLineOf xlink:type="simple">
                <gml:LineString gml:id="GML_LG_30779" srsName="EPSG:32610">
                    <gml:coordinates cs="," decimal="." ts=" ">498279.999999882,5456326.99963311 
                      498282.999999884,5456332.99963311 498285.999999884,5456338.99963311</gml:coordinates>
                </gml:LineString>
            </gml:centerLineOf>
        </van:Bridge>
    </gml:featureMember>
    <gml:featureMember>
        <van:Bridge gml:id="Bridge278">
          <gml:description xlink:type="simple">RoadStructures bridge</gml:description>
          <gml:name/>
          <gml:centerLineOf xlink:type="simple">
              <gml:LineString gml:id="GML_LG_30763" srsName="EPSG:32610">
                  <gml:coordinates cs="," decimal="." ts=" ">498313.999999892,5456417.99963313 
                    498296.999999881,5456435.99963315 498279.999999882,5456453.99963314</gml:coordinates>
              </gml:LineString>
          </gml:centerLineOf>
        </van:Bridge>
      </gml:featureMember>
</gml:FeatureCollection>

The following SVG document fragment illustrates the basic structure and contents of the resultant SVG document in the translation of the source GML 'bridge' document fragment to SVG. Of interest, a reduction in relative document size is observed due to the relatively flat SVG document structure as compared to the source GML.

<?xml version="1.0" encoding="UTF-8"?>
<svg width="600" height="600" xmlns="http://www.w3.org/2000/svg" version="1.1" baseProfile="tiny">
    <g transform="scale(0.02608695652173913,-0.02608695652173913) translate(-482000.000000,-5472000.000000)" 
                               style="stroke-width:38.333333333333336;stroke:rgb(69,34,118);fill:none;">
        <path id="Bridge294" d="M498279.999999882,5456326.99963311 
                                                     L498282.999999884,5456332.99963311 
                                                     498285.999999884,5456338.99963311"/>
        <path id="Bridge278" d="M498313.999999892,5456417.99963313 
                                                     L498296.999999881,5456435.99963315 
                                                     498279.999999882,5456453.99963314"/>
    </g>
</svg>

The algorithm to transform GML to SVG does not increase in complexity with the addition of multiple source GML documents. In the context of a single threaded process, transforming a set of source GML documents simply lengthens the total transformation time, with the results of each source GML document transformation appended to a single destination document. GML to SVG transformation decisions are simply based on a potentially larger set of GML feature element names, GML geometry element names and GML coordinates element names.

In the case that GML source documents are transformed and assembled in parallel, each GML to SVG transformation thread can be initialized with the minimal layer-specific transformation information. Multiple source GML documents can then be transformed and appended independently to the resulting SVG document tree. The completion of the final transformation marks the completion of the overall transformation process.

The assembly of the final SVG map document must follow the global logical rendering order of all transformed source GML documents. This ordering is expressed in the final SVG document structure. The following figure demonstrates the relationship between global GML layer order and the SVG document structure. Lower logically Z-indexed GML layers are located earlier in the SVG tree structure with respect to a pre-order traversal and rendered prior to higher Z-indexed layers.


The structure of the SVG document corresponding to the above SVG tree structure is illustrated with the following abbreviated SVG document fragment.

<?xml version="1.0" encoding="UTF-8"?>
<svg width="600" height="600" xmlns="http://www.w3.org/2000/svg" version="1.1" baseProfile="tiny">
    <!-- Layer 1 -->
    <g transform="..." style="...">
        <path id="L1-1" d="..."/>
        <path id="L1-2" d="..."/>
        .
        .
        .
        <path id="L1-i" d="..."/>                        
    </g>
    <!-- Layer 2 -->
    <g transform="..." style="...">
        <path id="L2-1" d="..."/>
        <path id="L2-2" d="..."/>
        .
        .
        .
        <path id="L2-j" d="..."/>                        
    </g>
    .
    .
    .
    <!-- Layer N -->
    <g transform="..." style="...">
        <path id="LN-1" d="..."/>
        <path id="LN-2" d="..."/>
        .
        .
        .
        <path id="LN-k" d="..."/>                        
    </g>
</svg>
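The layer ordering requirement can be sketched as follows: layers are transformed concurrently, but the resulting groups are written in ascending Z-index order. The functions transform_layer and assemble_svg and the (z_index, name, path_data) tuple layout are hypothetical stand-ins, and the real GML parsing step is replaced by simple string formatting.

```python
from concurrent.futures import ThreadPoolExecutor


def transform_layer(layer):
    """Stand-in for one GML to SVG layer transformation (hypothetical)."""
    z_index, name, path_data = layer
    paths = "".join(
        f'<path id="{name}-{i}" d="{d}"/>' for i, d in enumerate(path_data, 1)
    )
    return f'<g id="{name}">{paths}</g>'


def assemble_svg(layers, width=600, height=600):
    """Transform layers concurrently and emit groups in Z-index order."""
    ordered = sorted(layers, key=lambda layer: layer[0])  # lowest Z-index first
    with ThreadPoolExecutor() as pool:
        # map() returns results in input order, whatever order the work finishes in.
        groups = list(pool.map(transform_layer, ordered))
    return (
        f'<svg width="{width}" height="{height}" '
        'xmlns="http://www.w3.org/2000/svg" version="1.1">'
        + "".join(groups)
        + "</svg>"
    )
```

The sketch only shows the structure of the assembly step. Python threads do not run CPU-bound work in parallel, so it makes no claim about speed-up.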

In this section we present a performance evaluation of a wide spectrum of GML to SVG transformation technologies. Where available, the SAX 2.0 API was selected in the evaluation of each of the SAX-based parsers. Direct SAX-based implementations include the following candidates.

  • Xerces-J 2.9.1 Release [13]

  • Xerces-C++ 2.8.0 Release [14]

  • Crimson 1.1 Release [15]

  • Intel XML Software Suite 1.0 for Java Environments (JAXP SAX API)

  • Intel XML Software Suite 1.0 for C/C++ (SAX API)

  • Parabix [16]

Several Java-based XSLT processors were also evaluated.

  • Saxon XSLT 2.0 9.1.0.1 Release [17]

  • Intel XSLT Accelerator for Java Environments

In addition, external world to screen coordinate system transformation methods are implemented to convert GML coordinate data to SVG path data. The Style Extension Functions project entitled the "OpenGIS XSLT Map Style Sheet Specification", implemented in Java, provides the necessary functionality to transform GML coordinate data to SVG path data [18]. Equivalent functions were implemented in C for use in the evaluation of the C/C++ based benchmarks.

The Intel XML Software Suite 1.0 and Parabix (Parallel bit streams for XML) represent new and high performance XML processing alternatives which merit additional description.

The Intel XML Software Suite (XSS) is a set of high-performance run-time libraries for processing XML. This software suite supplies application developers with a C/C++ SAX interface, a Java API for XML Processing (JAXP) SAX interface, and a JAXP XSLT interface. According to Intel, this software leverages the Intel Core™ microarchitecture and provides thread-safe and efficient memory utilization, scalable stream-to-stream processing, and large XML file processing capabilities, with continuous workload support [19].

XSS is based on the Intel XML Core. The Intel XML Core features accelerated XML parsing, XPath expression, Schema validation, and XSLT processing functionality. The Intel XML Software Suite for Java Environments provides indirect access to the underlying Intel XML core via the Java Native Interface (JNI) technology. The Intel XML Software Suite for C/C++ avoids JNI overhead and provides more direct access to the XML core through a custom C/C++ SAX API. The Intel XSLT Accelerator for Java Environments transforms XML data via the XSL language and the additional capabilities of XSLT extension functions. Intel XSLT functionality is XSLT 1.0 compliant.

Parabix is an open-source XML processing technology that uses a fundamentally new way to perform high-speed parsing of XML documents [16]. Parabix leverages the SIMD (Single Instruction, Multiple Data) capabilities of commodity processors to deliver dramatic performance improvements over traditional byte-at-a-time parsing technologies. Byte-oriented character data is first transformed to a set of 8 parallel bit streams with each stream comprising one bit per character code unit. Critical XML parsing operations are then carried out in parallel using bitwise logic and shifting operations. Traditional byte-at-a-time scanning loops are replaced with bit scan operations. Each bit scan operation can potentially advance by as many as 64 byte positions with a single instruction [21]. Since the core bitstream algorithms of Parabix are expected to be highly parallelizable, future directions for the Parabix engine include work on leveraging the performance benefits of parallel processing on multicore technology [21].
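The parallel bit stream idea can be illustrated with a small sketch that uses Python integers as arbitrarily long bit vectors. This is not the Parabix implementation: the function names are hypothetical, and the transposition below runs a byte at a time, whereas Parabix performs it with SIMD instructions. The sketch only shows how a character class becomes a bit stream through bitwise logic and how a bit scan then finds the matching positions.

```python
def transpose_to_bit_streams(data):
    """Return 8 integers; bit i of streams[k] is bit k (MSB first) of data[i]."""
    streams = [0] * 8
    for pos, byte in enumerate(data):
        for k in range(8):
            if (byte >> (7 - k)) & 1:
                streams[k] |= 1 << pos
    return streams


def match_byte(streams, value, length):
    """Bit stream with a 1 at every position whose byte equals value."""
    all_ones = (1 << length) - 1
    result = all_ones
    for k in range(8):
        if (value >> (7 - k)) & 1:
            result &= streams[k]
        else:
            result &= ~streams[k] & all_ones
    return result


def set_positions(stream):
    """Yield positions of 1 bits, lowest first, using repeated bit scans."""
    while stream:
        lowest = stream & -stream  # isolate the lowest set bit
        yield lowest.bit_length() - 1
        stream ^= lowest  # clear it and continue scanning


data = b"<a><b/></a>"
streams = transpose_to_bit_streams(data)
tag_opens = list(set_positions(match_byte(streams, ord("<"), len(data))))
```

Here tag_opens holds the positions of every '<' in the input, found with eight bitwise operations over the whole buffer rather than one comparison per byte.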

To further leverage the high performance bit scan operations of Parabix, the Parabix engine provided a pull-based GML coordinate conversion method. This pull-based method allows an application to advance the underlying Parabix parsing engine directly and leverages the underlying bit scan operations of the engine.

For the purpose of this performance study, the GML to SVG benchmark implementation based on the Parabix pull parsing feature is referred to as Parabix ILAX (Pull). The standard serial access and event-based Parabix implementation, equivalent to each of the SAX benchmarks, is described simply as Parabix ILAX. ILAX is an acronym which stands for In-Line API for XML and is functionally equivalent to a SAX event-based API in which an application registers event handlers at compile time.

The key hardware event evaluated in this performance analysis is processor cycles. This metric is reported as the number of processor cycles per source GML byte. The Performance Application Programming Interface (PAPI) facilitated the collection of hardware cycle data directly via the PAPI C API [20]. A JNI wrapper to the PAPI C API enabled the collection of hardware cycle counts for the Java implementations. Performance results are adjusted to account for the additional cycle overhead introduced by performance monitoring instrumentation, and specifically for the effects of JNI function calls crossing the Java/C boundary.
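The reported metric amounts to a simple normalization, sketched below. The function name cycles_per_byte and all three inputs are hypothetical; how the instrumentation overhead was actually estimated is not described here, so it is simply passed in as a number.

```python
def cycles_per_byte(total_cycles, overhead_cycles, source_bytes):
    """Processor cycles per source GML byte, net of measurement overhead."""
    if source_bytes <= 0:
        raise ValueError("source_bytes must be positive")
    return (total_cycles - overhead_cycles) / source_bytes
```

For example, a hypothetical run that counts 1,200,000 cycles, of which 200,000 are attributed to instrumentation, over a 100,000-byte document would be reported as 10 cycles per byte.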

GML to SVG data translations are executed on GML source data modelling the city of Vancouver, British Columbia, Canada. This data set consists of 46 distinct GML feature layers ranging in size from approximately 1.5 KB to 12 MB. In this performance study, approximately 21 MB of source GML data generates approximately 8.8 MB of destination SVG data.

The following table displays several test data set characteristics of interest. The Z-index property is application specific and indicates the SVG document assembly and rendering order. Lower Z-index value layers are rendered before higher Z-index layers. GML source feature element, geometry element and coordinates element tag names allow SAX implementations to build parse state information and locate GML coordinate and feature identifier data.

In external conversion function based benchmarks, GML coordinate data size impacts overall GML to SVG transformation performance. In this data set, water body feature layers, such as the Ocean and Lake GML layers, contain a relatively small number of geometry objects (polygons), but each geometry object contains a large volume of coordinate data. In contrast, the road layers, RL1U to RP6U, contain a large number of geometry objects (line segments), but each geometry object contains relatively few coordinate pairs.

The 'van' namespace prefix corresponds to a GML application specific namespace URI. The gml namespace prefix corresponds to the GML namespace URI value, 'http://www.opengis.net/gml'. Fully qualified element names are used to match and identify source document elements.

| Z-Index | GML Feature Element Tag Name | GML Geometry Element Tag Name | GML Coordinates Element Tag Name | GML Document Size (bytes) | GML Coordinates Size (bytes) | GML Coordinates Element Count | GML Coordinates Average Length (bytes) | SVG Document Size (bytes) |
|---|---|---|---|---|---|---|---|---|
| 0 | van:Ocean | gml:extentOf | gml:coordinates | 139217 | 132994 | 13 | 10230.31 | 131823 |
| 1 | van:Lake | gml:extentOf | gml:coordinates | 108566 | 82220 | 60 | 1370.33 | 75729 |
| 2 | van:Buildup | gml:extentOf | gml:coordinates | 4554 | 3502 | 1 | 3502 | 3492 |
| 3 | van:TailingPond | gml:extentOf | gml:coordinates | 1546 | 481 | 1 | 481 | 505 |
| 4 | van:Reservoir | gml:extentOf | gml:coordinates | 7446 | 3723 | 7 | 531.86 | 3055 |
| 5 | van:UnspecifiedBuilding | gml:extentOf | gml:coordinates | 18250 | 10889 | 14 | 777.79 | 9342 |
| 6 | van:Church | gml:extentOf | gml:coordinates | 227787 | 106560 | 276 | 386.09 | 78304 |
| 7 | van:CityHall | gml:extentOf | gml:coordinates | 5179 | 2349 | 5 | 469.8 | 2006 |
| 8 | van:College | gml:extentOf | gml:coordinates | 23703 | 12528 | 24 | 522 | 10228 |
| 9 | van:Communications | gml:extentOf | gml:coordinates | 4058 | 1599 | 4 | 399.75 | 1360 |
| 10 | van:CourtHouse | gml:extentOf | gml:coordinates | 2494 | 981 | 2 | 490.5 | 947 |
| 11 | van:FerryTerminal | gml:extentOf | gml:coordinates | 1789 | 716 | 1 | 716 | 787 |
| 12 | van:FireStation | gml:extentOf | gml:coordinates | 20115 | 9104 | 23 | 395.83 | 6922 |
| 13 | van:Greenhouse | gml:extentOf | gml:coordinates | 6643 | 2885 | 7 | 412.14 | 2336 |
| 14 | van:Hospital | gml:extentOf | gml:coordinates | 46964 | 28186 | 41 | 687.46 | 24138 |
| 15 | van:PoliceStation | gml:extentOf | gml:coordinates | 12174 | 5605 | 13 | 431.15 | 4367 |
| 16 | van:PostOffice | gml:extentOf | gml:coordinates | 14404 | 6605 | 16 | 412.81 | 5058 |
| 17 | van:School | gml:extentOf | gml:coordinates | 276203 | 151043 | 285 | 529.98 | 121861 |
| 18 | van:University | gml:extentOf | gml:coordinates | 2280 | 1216 | 1 | 1216 | 1260 |
| 19 | van:TransmissionTower | gml:extentOf | gml:coordinates | 200836 | 114394 | 183 | 242.96 | 93157 |
| 20 | van:Riverb | gml:extentOf | gml:coordinates | 26473 | 13206 | 29 | 455.38 | 10139 |
| 21 | van:Trestle | gml:centerLineOf | gml:coordinates | 13759 | 4825 | 24 | 201.04 | 4050 |
| 22 | van:Bridge | gml:centerLineOf | gml:coordinates | 237007 | 94350 | 412 | 229 | 80801 |
| 23 | van:Tunnel | gml:centerLineOf | gml:coordinates | 3554 | 1224 | 5 | 244.8 | 1195 |
| 24 | van:SingleTrack | gml:centerLineOf | gml:coordinates | 319383 | 151778 | 464 | 327.11 | 138880 |
| 25 | van:DoubleTrack | gml:centerLineOf | gml:coordinates | 9090 | 3435 | 14 | 245.36 | 3188 |
| 26 | van:MultipleTrack | gml:centerLineOf | gml:coordinates | 18078 | 9411 | 22 | 427.77 | 8941 |
| 27 | van:AbandonedTrack | gml:centerLineOf | gml:coordinates | 2122 | 769 | 2 | 384.5 | 860 |
| 28 | van:LightRailTransit | gml:centerLineOf | gml:coordinates | 48360 | 18870 | 77 | 245.06 | 16860 |
| 29 | van:Spur | gml:centerLineOf | gml:coordinates | 46725 | 24075 | 65 | 370.38 | 22400 |
| 30 | van:Pipeline | gml:centerLineOf | gml:coordinates | 8756 | 3580 | 13 | 275.38 | 3191 |
| 31 | van:TransmissionLine | gml:centerLineOf | gml:coordinates | 127808 | 50049 | 206 | 242.96 | 41644 |
| 32 | van:CutEarthwork | gml:centerLineOf | gml:coordinates | 11486 | 6889 | 11 | 626.27 | 6473 |
| 33 | van:FerryRoute | gml:centerLineOf | gml:coordinates | 3674 | 1635 | 4 | 716 | 1576 |
| 34 | van:FillEmbankment | gml:centerLineOf | gml:coordinates | 10453 | 5054 | 13 | 388.77 | 4539 |
| 35 | van:Footbridge | gml:centerLineOf | gml:coordinates | 24266 | 9411 | 40 | 235.28 | 7517 |
| 36 | van:RetainingWall | gml:centerLineOf | gml:coordinates | 6928 | 2672 | 10 | 267.2 | 2298 |
| 37 | van:RL1U | gml:centerLineOf | gml:coordinates | 4503543 | 2150517 | 6908 | 311.31 | 1997301 |
| 38 | van:RL2U | gml:centerLineOf | gml:coordinates | 295692 | 145015 | 441 | 328.83 | 135318 |
| 39 | van:RP2U | gml:centerLineOf | gml:coordinates | 12523559 | 5701739 | 20051 | 284.36 | 5252389 |
| 40 | van:RP2U1W | gml:centerLineOf | gml:coordinates | 178199 | 89815 | 285 | 353.6 | 84146 |
| 41 | van:RP3U | gml:centerLineOf | gml:coordinates | 6995 | 4011 | 7 | 573 | 3989 |
| 42 | van:RP4D | gml:centerLineOf | gml:coordinates | 265199 | 121473 | 421 | 288.53 | 112176 |
| 43 | van:RP4U | gml:centerLineOf | gml:coordinates | 1320751 | 587904 | 2151 | 273.32 | 540190 |
| 44 | van:RP6U | gml:centerLineOf | gml:coordinates | 1798 | 504 | 2 | 252 | 603 |
| 45 | van:RROUGH | gml:centerLineOf | gml:coordinates | 227813 | 140774 | 250 | 563.1 | 135200 |

Table 3. Benchmark Data Characteristics


In each benchmark, GML feature element and GML geometry element tags are matched. GML coordinate data are then extracted and transformed to SVG path data encodings. Equivalent SVG path elements are generated and output to the destination SVG document.

The following pair of XSLT stylesheet fragments demonstrate the per GML feature layer logic necessary to translate source GML to SVG. The XSLT fragment presented below illustrates world to screen coordinate reference system conversion through SVG transform attribute parameterization.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform version="1.0"
               xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
               xmlns="http://www.w3.org/2000/svg"
               xmlns:van="http://www.galdosinc.com/vancouver"
               xmlns:gml="http://www.opengis.net/gml"
               xmlns:ext="/org.opengis.gml.StyleExt">
    .
    .
    .
    <!-- Root Node -->
    <xsl:template match="/">
        <svg width="{$width}" height="{$height}" version="1.1" baseProfile="tiny">
            <!-- Attribute value templates: each XPath expression is wrapped in braces -->
            <g transform="scale({$scaleX}, {-1 * $scaleY}) translate({-1 * $x1}, {-1 * $y2})"
               style="fill:rgb(79,166,255);fill-opacity:1.0;stroke:none;stroke-opacity:1.0;">
                <xsl:apply-templates select="//van:Ocean/gml:extentOf"/>
            </g>
        </svg>
    </xsl:template>

    <!-- Layer (Polygon): one SVG path per GML geometry element -->
    <xsl:template match="van:Ocean/gml:extentOf">
        <xsl:variable name="id" select="../@gml:id"/>
        <xsl:variable name="d">
            <xsl:apply-templates mode="Path" select=".//gml:coordinates"/>
        </xsl:variable>
        <path id="{$id}" d="{$d}" onmouseover="showTooltip(evt)" onmouseout="hideTooltip(evt)"/>
    </xsl:template>

    <!-- Coordinate data: 'M' before the first pair, 'L' before the remainder -->
    <xsl:template match="gml:coordinates" mode="Path">
        <xsl:variable name="svg-prefix" select="substring-before(string(text()),' ')"/>
        <xsl:variable name="svg-suffix" select="substring-after(string(text()),' ')"/>
        <xsl:choose>
            <xsl:when test="string-length($svg-suffix) > 0">
                <xsl:value-of select="concat('M',$svg-prefix,' L',$svg-suffix)"/>
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="''"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:transform>

Note: The above XSLT coordinate data transformation assumes valid input GML coordinate data.
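For comparison with the stylesheet above, the same matching and extraction steps can be sketched in an event-based (SAX) style. The sketch below uses Python's standard xml.sax module purely for illustration; it is not the C++ or Java code used in the benchmarks, and the class name GmlToSvgHandler and the function gml_to_svg_paths are hypothetical. It assumes qualified element names such as van:Ocean and gml:extentOf, and valid GML coordinate data with space-separated tuples.

```python
import xml.sax
from xml.sax.saxutils import quoteattr


class GmlToSvgHandler(xml.sax.ContentHandler):
    """Collects one SVG <path> per matched GML geometry element."""

    def __init__(self, feature="van:Ocean", geometry="gml:extentOf"):
        super().__init__()
        self.feature = feature      # qualified feature element name to match
        self.geometry = geometry    # qualified geometry element name to match
        self.paths = []             # generated SVG path elements
        self._id = None             # gml:id of the current feature, if any
        self._subpaths = None       # path data collected for the current geometry
        self._text = None           # text chunks of the current gml:coordinates

    def startElement(self, name, attrs):
        if name == self.feature:
            self._id = attrs.get("gml:id", "")
        elif name == self.geometry and self._id is not None:
            self._subpaths = []
        elif name == "gml:coordinates" and self._subpaths is not None:
            self._text = []

    def characters(self, content):
        # A parser may deliver one text node in several chunks.
        if self._text is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "gml:coordinates" and self._text is not None:
            first, _, rest = "".join(self._text).strip().partition(" ")
            self._text = None
            if rest:  # 'M' before the first pair, 'L' before the remainder
                self._subpaths.append("M" + first + " L" + rest)
        elif name == self.geometry and self._subpaths is not None:
            path_data = "".join(self._subpaths)
            self.paths.append("<path id=%s d=%s/>"
                              % (quoteattr(self._id), quoteattr(path_data)))
            self._subpaths = None
        elif name == self.feature:
            self._id = None


def gml_to_svg_paths(gml_bytes, feature="van:Ocean", geometry="gml:extentOf"):
    """Parse a GML document (bytes) and return the generated path elements."""
    handler = GmlToSvgHandler(feature, geometry)
    xml.sax.parseString(gml_bytes, handler)
    return handler.paths
```

Each parsing event triggers one callback, which is the property that later becomes relevant when the callback has to cross a language boundary.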

This section presents the benchmarking methodology and GML to SVG performance results. Of interest, the Intel XSLT Accelerator can transform an XML document using multiple threads. This feature is disabled by default and is enabled by configuring a maximum number of parallel threads. GML to SVG experiments with settings of two, four and eight parallel threads produced a best-case improvement of approximately 2 cycles per source GML byte overall. The following figures present default configuration Intel XML Software Suite results.

The performance results presented in the following figure demonstrate the processor cycle per source GML byte cost of translating the complete GML benchmark data set to SVG. These results illustrate the case in which the SVG transform attribute is used to convert the GML world coordinate system to an equivalent SVG screen coordinate system. As previously described, GML coordinate data values are first manipulated by prepending a single 'M' command letter to each GML coordinate data string and, if required, inserting a single 'L' command letter after the first coordinate pair. Beyond this lightweight manipulation, source GML coordinate data is not directly altered by the benchmark applications. In addition, to achieve visual presentation results equivalent to those of the external function based GML to SVG transformation benchmarks, an inverse scaling is applied to each scale-dependent style attribute value, such as the stroke-width CSS property. Since SVG styles are applied on a per GML source document basis, the additional cost of inverse CSS property scaling is not significant.
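The inverse scaling of scale-dependent style values can be illustrated with a short sketch. The function name svg_group_attributes and its parameters are hypothetical, and the sketch assumes a uniform scale factor (scale_x equal to scale_y), so that a single divisor restores the intended on-screen stroke width.

```python
def svg_group_attributes(scale_x, scale_y, x1, y2, stroke_width_px):
    """Return (transform, style) strings for a layer <g> element.

    (x1, y2) is assumed to be the world coordinate of the top-left
    corner of the viewport; stroke_width_px is the desired on-screen width.
    """
    # World to screen mapping: shift the world origin, then scale and flip Y.
    transform = "scale({}, {}) translate({}, {})".format(
        scale_x, -scale_y, -x1, -y2)
    # Stroke widths are given in user units, which the transform also scales,
    # so the desired width is divided by the scale factor to compensate.
    stroke_width = stroke_width_px / scale_x
    style = "stroke-width:{}".format(stroke_width)
    return transform, style
```

For example, with a scale factor of 0.5 a desired 2 pixel stroke is written as a stroke-width of 4 user units. Because the style is computed once per layer rather than once per coordinate, its cost does not grow with the coordinate data volume.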

Overall, the following figure demonstrates a wide performance spectrum across GML to SVG translation technologies. Parabix ILAX (Pull) demonstrates the best performance, requiring only approximately 15 cycles per source GML byte to complete the GML to SVG translation task. A head-to-head comparison of the high performance Parabix ILAX (Pull) processor versus the Intel Software Suite for C++ (SAX API) reveals that Parabix ILAX (Pull) significantly outperforms the Intel C++ SAX implementation.

Parabix ILAX and the Intel Software Suite for C++ demonstrate similar levels of performance, requiring only approximately 21 cycles and 25 cycles per source GML byte respectively. Both Parabix and the Intel Software Suite for C++ dramatically outperform the Xerces-C parser, each completing the transformation task over 5 times faster than this traditional byte-at-a-time, single-core XML parsing technology.

Surprisingly, the Intel XSLT Accelerator for Java Environments outperforms each of the JAXP SAX implementations evaluated. These performance results indicate that the Intel XSLT Accelerator for Java Environments is over 3.5 times faster than the Intel XML Software Suite for Java, Xerces-J and Crimson. The inferior performance of the Intel XML Software Suite for Java (JAXP SAX API) implementation versus the Intel XML Software Suite JAXP XSLT implementation is particularly unexpected. This performance discrepancy may be attributable to a requirement to cross the Java/C boundary more frequently under a JAXP SAX processing model. In a SAX event-based approach, a callback is required for each registered XML parsing event. In the case of the Intel XML Software Suite, each callback requires crossing the Java/C boundary via JNI. In this case, the additional cost of JNI may limit the overall performance benefits of the Intel XML core. Similar reasoning explains the relatively superior performance of the Intel XSLT based implementation. In this case, the overall impact of JNI overhead may be significantly less for the simple reason that an XSLT processor is not required to generate an event callback per XML parsing event. As a result, the XSLT processor is not required to cross the Java/C boundary as frequently.

Overall, Parabix ILAX (Pull) is over 4 times faster than the Intel XML Software Suite JAXP XSLT Templates implementation and well over an order of magnitude faster than the Saxon JAXP Templates implementation. In addition, in comparison to each of the Java-based SAX implementations evaluated, Parabix ILAX (Pull) is again well over an order of magnitude faster.


For the purpose of comparison, the performance results presented in the following figure demonstrate the processor cycle per source GML byte cost of translating the complete GML benchmark data set to SVG. These results represent the translation scenario in which external C and Java functions are used to explicitly convert GML world coordinate reference system data to SVG screen coordinate reference system data. This conversion process requires GML coordinate string tokenization, string to numeric data type conversion, explicit transformation of GML coordinate data, and numeric to string data type conversion. At minimum, the actual numeric conversion requires the inversion of the Y-axis. That is, each source GML Y coordinate value must be scaled by a factor of negative one.
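The four conversion steps can be sketched as follows. This is a minimal illustration rather than the external C or Java functions used in the benchmarks: the function name convert_coordinates, the parameters scale_x, scale_y, x1 and y2, and the two-decimal output precision are all assumptions. The arithmetic mirrors a transform of the form scale(scaleX, -scaleY) translate(-x1, -y2), assuming comma-separated coordinates within each tuple and space-separated tuples.

```python
def convert_coordinates(gml_coordinates, scale_x, scale_y, x1, y2):
    """Convert a GML coordinate string to SVG path data in screen units."""
    pairs = []
    for token in gml_coordinates.split():        # 1. tokenize the tuples
        x_text, y_text = token.split(",")[:2]    #    split "x,y" (ignore any z)
        x, y = float(x_text), float(y_text)      # 2. string to numeric
        screen_x = (x - x1) * scale_x            # 3. explicit transformation,
        screen_y = (y2 - y) * scale_y            #    with the Y-axis inverted
        pairs.append("%.2f,%.2f" % (screen_x, screen_y))  # 4. numeric to string
    if len(pairs) < 2:
        return ""                                # nothing to draw
    return "M" + pairs[0] + " L" + " ".join(pairs[1:])
```

Every coordinate value passes through two data type conversions here, whereas the transform attribute approach leaves the coordinate text untouched apart from the inserted command letters.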

A comparison of the SVG transform attribute based scenario versus the external function based transformation scenario reveals that each of the C++ based SAX API implementations incurs an additional cost of approximately 90 cycles per source GML byte. This cost is attributable to the external C library conversion functions. A comparison of the JAXP SAX API implementations indicates that external coordinate data conversion adds approximately 300 cycles per source GML byte to the overall cost. In the case of the JAXP XSLT implementations, external functions add approximately 450 cycles per source GML byte of overhead.

The C-based GML coordinate conversion methods are based on the string to double (strtod) C library routine. This C-based approach provides a relatively efficient means to complete the GML coordinate data tokenization and has the added benefit of simultaneous string to numeric data type conversion. In contrast, an examination of the Java library functions revealed the high cost of the Java StringTokenizer object. The Java StringTokenizer allocates a new Java String object for each GML coordinate value. This small object memory allocation occurs with each call to the nextToken method of the Java StringTokenizer class. In addition, a further Java object is created in the conversion of each coordinate Java String to the Java BigDecimal type. Excessive small object creation is a well-known performance bottleneck and explains the poor Java versus C based performance.

Both the JAXP SAX API and the JAXP XSLT implementations rely on the same set of external Java library conversion methods. However, the JAXP XSLT external functions incurred an additional cost of 150 cycles per source GML byte. This additional XSLT processing cost is attributable to the overhead of external function calls in XSLT.

Unfortunately, the Intel XML Software Suite generates signal 11 errors with the use of the Java extension functions of this performance study. This runtime error prevented the inclusion of Intel XSLT Accelerator extension function based GML to SVG translation results. In addition, a function that performs explicit GML to SVG coordinate data conversion for the Parabix ILAX (Pull) implementation was not developed for this performance study, and consequently performance results for the Parabix ILAX (Pull) implementation are also not presented in the following figure.


The investigation of GML to SVG transformation performance was motivated in part by the claims of the authors of the SVG Open 2003 paper entitled "SVG Explorer of GML Data" [24]. In this paper, the authors claim low XSLT performance in the case of geographical elements with large numbers of coordinates. A further investigation of this claim was conducted in the context of GML to SVG transformation based on both external XSLT extension functions and the SVG transform attribute.

A linear regression analysis of the proportion of GML document coordinate data versus GML to SVG translation cycles per byte for the Saxon XSLT JAXP Transformers benchmark based on XSLT extension functions exhibited a correlation coefficient value of 0.79. In contrast, the Saxon XSLT JAXP Transformers benchmark based on the SVG transform attribute exhibited a correlation coefficient value of -0.44. Similar results were obtained for each of the other technologies. Since extension function based conversion of GML coordinate data is a relatively expensive operation, this analysis confirms that large volumes of coordinate data significantly decrease overall GML to SVG transformation performance. Further, the elimination of explicit server-side coordinate data conversion removes this performance bottleneck.
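The correlation coefficient used in this kind of analysis can be computed as in the sketch below. The function pearson_r is a generic textbook formulation, not the authors' analysis script. The coordinate proportions are derived from three Table 3 rows (coordinate size divided by document size), while the cost values are synthetic placeholders generated from a straight line and are not measurements from this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# (GML coordinates size, GML document size) in bytes, taken from Table 3.
layers = {
    "van:Ocean": (132994, 139217),
    "van:Church": (106560, 227787),
    "van:RL1U": (2150517, 4503543),
}
proportions = [coords / total for coords, total in layers.values()]

# Synthetic placeholder costs (NOT benchmark results): a perfectly linear
# relation, used only to show that the function returns 1.0 in that case.
placeholder_costs = [2.0 * p + 1.0 for p in proportions]
```

A value near 1 indicates that cost per byte rises with the share of coordinate data, which is the pattern reported for the extension function based benchmark.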

The most important performance criterion for interactive applications is responsiveness (latency). Latency determines the performance perceived by the end user. The possibility of combining streaming SVG technologies and progressive rendering with parallel transformation provides a further direction for reducing system latency in on-demand web-based mapping systems.

Data parallelism is a form of parallelism in which the same transformation is applied to each piece of data. In the case of GML to SVG translation, data parallelism is exhibited at the GML document level and the GML feature level. As demonstrated in the following figure, it is natural to organize, transform and assemble source GML data in parallel at the GML document level without explicit synchronization. As a result, parallel transformation overhead is low. Consequently, it is not a question of whether to parallelize GML to SVG translation but rather whether to locate the GML to SVG parallel translation logic at the server-side or client-side.
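Document-level data parallelism can be sketched with a worker pool that applies the same translation to every layer and assembles the results in z-index order. The function transform_layer is a hypothetical stand-in for a real per-layer GML to SVG translation, and the layer tuples are illustrative. Note that in CPython a thread pool does not speed up CPU-bound work; a process pool could be substituted without changing the structure.

```python
from concurrent.futures import ThreadPoolExecutor


def transform_layer(layer):
    """Stand-in for a per-layer GML to SVG translation (hypothetical)."""
    z_index, name, coordinates = layer
    first, _, rest = coordinates.partition(" ")
    path_data = "M" + first + " L" + rest if rest else ""
    return '<g id="%s"><path d="%s"/></g>' % (name, path_data)


def transform_layers_in_parallel(layers, max_workers=4):
    """Translate independent layers concurrently and assemble one document."""
    ordered = sorted(layers, key=lambda layer: layer[0])  # drawing order
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order, so no explicit
        # synchronization between layers is needed.
        groups = list(pool.map(transform_layer, ordered))
    return "<svg>" + "".join(groups) + "</svg>"
```

Because each layer is translated independently, the only coordination point is the final assembly step.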


In a traditional server-side threading approach, an individual thread is instantiated to translate each source GML document request to SVG. The complete document is assembled and then transmitted back to the client. In general, this may result in additional request/response latency, as the overall GML to SVG transformation performance is dominated by the slowest GML layer. In addition, this approach eliminates the potential of per GML source layer progressive rendering.

In an AJAX-based approach, multiple simultaneous client-side layer requests are issued to the server by the SVG client viewer. Client-side rendering logic is then able to progressively render individual map layers as SVG results are transmitted back to the client and become available for display, while respecting the map layer rendering order. GML document level progressive rendering is straightforward to implement at the client. Nevertheless, large source GML layers have the potential to bottleneck client-side layer rendering. Additional techniques such as compression, line generalization and the splitting of large GML documents are also necessary to reduce request/response latency.
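One possible ordering policy for progressive rendering is to buffer layers that arrive early and draw each layer as soon as all layers beneath it have been drawn. The sketch below models only that policy and is not the client viewer's code, which would run in the browser; the function name progressive_render is hypothetical, and z-indexes are assumed to run from 0 upward without gaps.

```python
def progressive_render(arrivals):
    """Yield batches of layers to draw, in z-index order, as responses arrive.

    arrivals: iterable of (z_index, svg_fragment) pairs in arrival order.
    """
    pending = {}   # layers received but not yet drawable
    next_z = 0     # lowest z-index that has not been drawn yet
    for z_index, fragment in arrivals:
        pending[z_index] = fragment
        batch = []
        # Draw every buffered layer whose predecessors are already drawn.
        while next_z in pending:
            batch.append(pending.pop(next_z))
            next_z += 1
        if batch:
            yield batch
```

Under this policy a slow layer with a low z-index delays every layer above it, which is one way the bottleneck caused by large source GML layers shows up at the client.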

As mentioned, data parallelism also exists at the GML feature level. However, parallel processing at the GML feature level may introduce high threading overhead. Instead, experimentation with SVG 1.2 progressive rendering demonstrates the potential for smoother layer rendering and a further reduction in request/response latency.

The open source, high performance Parabix technology offers the prospect of dramatic performance improvement in XML to XML transformation applications. As illustrated by the GML to SVG transformation benchmark analysis presented, Parabix delivers demonstrably superior XML processing performance.

The streaming event-based approaches offered by the Parabix processor and the Intel XML Software Suite for C++ SAX API make it possible to process large documents efficiently and avoid the construction of an underlying DOM, reducing request/response latency. Consequently, despite some additional implementation effort, a SAX-based GML to SVG implementation provides a simple, high performance alternative to XSLT.

From an architectural perspective, it would ultimately be desirable to provide high-performance alternatives using high-level programming paradigms such as that of XSLT. The Intel XSLT Accelerator for Java Environments demonstrates improved processing performance in this area, outperforming both traditional JAXP SAX and XSLT implementations. Implementation of a high-performance XSLT processor using Parabix is also an active focus of our ongoing research. A further research direction worth considering is the possibility of deploying Parabix on the client-side to improve SVG rendering performance.

This work was supported in part by an Industrial Post Graduate scholarship provided by the Natural Sciences and Engineering Research Council and the Mathematics of Information Technology and Complex Systems of Canada. Additional support was supplied by the British Columbia Innovation Council via a British Columbia Industrial Innovation Scholarship. GML Vancouver data set resources were provided by Galdos Systems Inc.