WMS Service Mining

Services such as Jeremy’s Mapdex have certainly raised the profile of geospatial web services, or rather made them easier to find. Although it is limited to picking up ArcIMS services at the moment, I eagerly await his next installment with WMS service support *hint hint*.

In the meantime, I had a query from a colleague about how to go about finding some open WMS services so he could test out one of his apps. While CS-W (Catalogue Services – Web) is brewing away in the never never, let’s use Mr Google,

http://www.google.com/search?q=allinurl:+”Request=getcapabilities”

12,100 results. A good enough start. Be wary, though, that PageRank doesn’t really help us,

  1. in determining the more “popular” services
  2. if the service is still alive
  3. if the service is even supposed to be public
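
Whether a service is still alive can only be answered by the server itself. Here is a minimal Python sketch of that check, assuming the server answers GetCapabilities with a document whose root element is `WMT_MS_Capabilities` (WMS 1.1.1) or `WMS_Capabilities` (WMS 1.3.0). The function names, the example.com URL and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def capabilities_url(base_url):
    """Append the standard GetCapabilities parameters to a server's base URL."""
    sep = "&" if "?" in base_url else "?"
    return base_url + sep + urlencode({"SERVICE": "WMS", "REQUEST": "GetCapabilities"})


def looks_like_wms(xml_text):
    """Return True if the text parses as a WMS capabilities document."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    # Drop any "{namespace}" prefix that ElementTree puts on the tag.
    tag = root.tag.rsplit("}", 1)[-1]
    return tag in ("WMT_MS_Capabilities", "WMS_Capabilities")


# Hypothetical response body, just enough to exercise the check.
sample = (
    "<WMT_MS_Capabilities version='1.1.1'>"
    "<Service><Name>OGC:WMS</Name></Service>"
    "</WMT_MS_Capabilities>"
)
print(capabilities_url("http://example.com/cgi-bin/mapserv"))
print(looks_like_wms(sample))
print(looks_like_wms("<html><body>Not Found</body></html>"))
```

In practice you would fetch the body with something like `urllib.request.urlopen(capabilities_url(base), timeout=10)` and pass the decoded text to `looks_like_wms`. It tells you nothing about whether the service is meant to be public.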

I’d highly advise contacting the owner before using any servers in a non-test environment :)

Let’s narrow it down by country,

(Aus) http://www.google.com/search?q=allinurl:+”Request=getcapabilities”&meta=cr=countryAU

  • 30 results. Ouch

(US) http://www.google.com/search?q=allinurl:+”Request=getcapabilities”&meta=cr=countryUS

  • 980 results … better

(UK) http://www.google.com/search?q=allinurl:+”Request=getcapabilities”&meta=cr=countryUK

  • 16 results. Surprising.
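
If you want to repeat this for other countries, the search URLs are easy to generate. This is a rough Python sketch that simply reuses the `q` and `meta=cr=country..` parameters from the links above; whether Google keeps honouring them is not something I can promise, and the function name is my own.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

SEARCH = "http://www.google.com/search"
QUERY = 'allinurl: "Request=getcapabilities"'


def search_url(country=None):
    """Build the search URL, optionally restricted to a two-letter country code."""
    params = {"q": QUERY}
    if country:
        # Same country restriction as used in the links above.
        params["meta"] = "cr=country" + country
    return SEARCH + "?" + urlencode(params)


for code in ("AU", "US", "UK"):
    print(code, search_url(code))
```

`urlencode` percent-encodes the quotes and the `=` inside the values, so the printed URLs look busier than the hand-typed ones but decode to the same query.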

Sure it’s hit and miss but at least it gives you somewhere to start. Who knows, you might even find a service which is documented and actually has metadata!

Skylab Mobilesystems has a rather large list available, collected using their “WMS-Crawler”. I have no idea how it actually collects the servers, but I suspect it’s much the same as what we did with Google.

If there are further resources on public services, please leave a comment for others to use.

It raises an interesting point on WMS security though. Mapdex honours any ACL restrictions on ArcIMS services … but with no such alternative for most WMS apps, anyone who visits a getCapabilities document and then visits Google is potentially opening up the server to the Googlebot crawler.

I guess if you’re really that worried, you should look into adding your WMS applications to your robots.txt, or using some rewrite rule to deny access to identified crawlers.
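
As a quick sanity check on the robots.txt idea, Python’s standard `urllib.robotparser` can tell you whether a given rule would actually keep a crawler away from the WMS endpoint. This is a sketch only: the `/cgi-bin/mapserv` path and the example.com URLs are assumptions, so substitute wherever your own WMS lives.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that hides a CGI-based WMS from all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/mapserv
"""


def crawler_allowed(robots_txt, agent, url):
    """Return True if robots_txt lets the given user agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


wms = "http://example.com/cgi-bin/mapserv?SERVICE=WMS&REQUEST=GetCapabilities"
print(crawler_allowed(ROBOTS_TXT, "Googlebot", wms))
print(crawler_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/index.html"))
```

Keep in mind that robots.txt is only a request to well-behaved crawlers. It is not access control, so a service that really must stay private needs proper restrictions on the server.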

Interesting point nonetheless.

ArcIMS stress test

ArcScripts can be a funny place. It can annoy you with commercial “shareware” or 8-year-old Avenue code, but sometimes, just sometimes, you stumble upon an extremely useful tool. I have been involved in an upgrade of ArcIMS from 4 to 9.1, and obviously I was looking into performance testing before we moved the new infrastructure into production.

I was *this* close to writing my own JMeter script when I found ArcIMS Stress Test, which did effectively everything I was looking for,

  • Parses an existing image/queryserver logfile to re-use for the test
  • Ability to use the same time between requests as in the log file or a preset delay
  • Ability to view the output of the sent requests
  • Ability to log test summary information to file

Stress dialog

Sure, it doesn’t log the hardware performance on the server or use any special threading / multi-user system, but in my case all I wanted to do was test a “typical day” on the new rig. Highly recommended.
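
To make the two timing options concrete (same gaps as the log file, or a preset delay), here is a tiny Python sketch. It only illustrates the idea and is not the tool’s actual code; the timestamps are invented and the function name is mine.

```python
def replay_delays(timestamps, preset_delay=None):
    """Seconds to wait before each request after the first one.

    timestamps   -- request times in seconds, in the order they were logged
    preset_delay -- if given, use this fixed gap instead of the logged gaps
    """
    if preset_delay is not None:
        return [preset_delay] * max(len(timestamps) - 1, 0)
    # Gap between each logged request and the one before it.
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


logged = [0.0, 1.5, 1.75, 4.0]     # made-up request times
print(replay_delays(logged))       # replay with the gaps from the log
print(replay_delays(logged, 0.5))  # replay with a fixed half-second gap
```

A replay loop would then `time.sleep()` for each delay before sending the next logged request.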

Kudos to the author, Milos van Leeuwen

Spatial indexing

Update: Further investigation revealed that my GiST indexes weren’t built correctly, so I have since updated the timings in the PostGIS results. Thanks to Sean for pointing out my boo boo.

I have noticed a bit of interest lately, especially in PostGIS’s new implementation of a GiST spatial index to speed up its performance. What I think is lacking is documentation: there is little to nothing on how “much” faster spatial indexing (not to be confused with attribute indexes) actually performs.

So here we go, a simple benchmark in less than 15 mins :)

First up, my datasets. For the base data, I will use the roads dataset from my previous routing article.

The benchmark in this case is a simple WMS request with an arbitrary bbox against each of the following datasets, served through Mapserver 4.6 (win32).

Word of warning: Please don’t take these results as gospel; they are merely meant to highlight the performance differences across the board.

FYI:

  • All data is in CRS 4326 (WGS84)
  • The timings will be extracted using the debug output from mapserver
  • Note the number of features in each query, the first obviously not being very realistic
  • All WMS requests are for single layers only with the BBOX values below
  1. 115.69402,-32.1273,116.09642,-31.86770 Features: 57564
  2. 115.82508,-32.0358,115.93653,-31.97567 Features: 8408
  3. 115.81400,-32.0493,115.86604,-32.02131 Features: 1389
  4. 115.82171,-32.0405,115.84112,-32.03009 Features: 338

  BBOX  Shapefile (no index)  Shapefile (qix index^)  PostGIS 8.1 (no index)  PostGIS 8.1 (GiST index)*
  1.    2.938s                2.201s                  5.297s                  4.275s
  2.    0.694s                0.294s                  2.812s                  1.656s
  3.    0.601s                0.140s                  1.796s                  0.987s
  4.    0.219s                0.032s                  0.914s                  0.223s

  • ^ Created with the standard quadtree sizes determined by the shptree utility
  • * Created with “CREATE INDEX roadsindex ON roadsindex USING GIST( the_geom GIST_GEOMETRY_OPS );” as per postgis documentation
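
For anyone wanting to repeat the test, the four requests can be generated along these lines. This is a Python sketch that assumes WMS 1.1.1 parameter names; the base URL, the layer name `roads`, the image size and the output format are placeholders and not my actual setup.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# The four test extents listed above (minx,miny,maxx,maxy in EPSG:4326).
BBOXES = [
    "115.69402,-32.1273,116.09642,-31.86770",
    "115.82508,-32.0358,115.93653,-31.97567",
    "115.81400,-32.0493,115.86604,-32.02131",
    "115.82171,-32.0405,115.84112,-32.03009",
]


def getmap_url(base_url, layer, bbox, width=600, height=400):
    """Build a single-layer WMS 1.1.1 GetMap request for one bbox."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",
        "SRS": "EPSG:4326",
        "BBOX": bbox,
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    # Leave commas, colons and slashes readable in the query string.
    return base_url + "?" + urlencode(params, safe=",:/")


for bbox in BBOXES:
    print(getmap_url("http://localhost/cgi-bin/mapserv.exe", "roads", bbox))
```

Point the same four URLs at each data source in turn and read the timings from the Mapserver debug output.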

I think the table speaks for itself, but I’m a little cautious about drawing too many conclusions. Most people are aware that PostGIS is slower than shapefiles; that’s a given. Most users who use PostGIS are using it for other reasons (such as its geoprocessing functions).

The PostGIS guys have claimed about a 10% performance loss over shapefiles. In this little test the gap was indeed larger than that, but I have no doubt that a more experienced PostGIS expert could narrow it further.

Fact of the matter is, spatial indexing is an important part of performance optimisation, but it should be considered alongside other methods such as view scale limiting, simplified symbolisation and feature generalisation. Unfortunately I don’t have access to a Mapserver hooked up to ArcSDE 9 or Oracle Spatial … that could have made things interesting :)
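
For what it’s worth, the ratios hiding in the results table can be pulled out with a few lines of Python. The timings are copied straight from the table; the function and variable names are just for illustration.

```python
# Seconds per request, keyed by bbox number:
# (shapefile, shapefile + qix, PostGIS, PostGIS + GiST)
TIMINGS = {
    1: (2.938, 2.201, 5.297, 4.275),
    2: (0.694, 0.294, 2.812, 1.656),
    3: (0.601, 0.140, 1.796, 0.987),
    4: (0.219, 0.032, 0.914, 0.223),
}


def ratios(shp, shp_qix, pg, pg_gist):
    """Return (qix speedup, GiST speedup, indexed PostGIS / indexed shapefile)."""
    return shp / shp_qix, pg / pg_gist, pg_gist / shp_qix


for bbox, row in TIMINGS.items():
    qix, gist, gap = ratios(*row)
    print(f"bbox {bbox}: qix {qix:.1f}x faster, GiST {gist:.1f}x faster, "
          f"indexed PostGIS takes {gap:.1f}x the indexed shapefile time")
```

The same caveat applies as for the table itself: these are ratios from one quick test on one machine, nothing more.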

Please, I make no claim to being the master of all that is PostGIS and Mapserver. If there is something glaringly obvious that I did not configure or did not include, please let me know and I will be happy to update the results.

What’s your background?

After having been through the usual hunt for job candidates, I was quite surprised and alarmed by

  1. the lack of suitable, technical GIS applicants and
  2. the general IT professionals who think they are suitable when they have no spatial understanding

Do you think it’s more beneficial to hire people with spatial understanding and then train them up technically, or the opposite: hire IT people and train them in spatial knowledge?

I’d love to hear from the public as to what your experiences have been and how you got into the industry. In my travels, it seems as though people with IT backgrounds + spatial training outnumber the pure spatial background people quite significantly … which is a shame, as these people are usually the ones who end up with the Spatial Analyst type roles.

It seems that most GIS-related graduates get sucked into the vacuum that is data capture/manipulation, never to be seen again.

In answer to my own question, I graduated with a BSc (GIS) but was lucky enough to have a significant technical background in web development and programming to get me through my current work. Perhaps the problem is two-fold, in that the current geospatial degrees also have too much spatial and not enough technical work.

I have been in more than one situation where an IT professional by trade has scared the living daylights out of me with their lack of any spatial understanding. What’s a projection? What’s topology?