Generic Web Proxies

In my quest for increased adoption of geospatial web services, I would constantly bash my head against the wall trying to debug GIS applications. So if you have suffered from “what the” behaviour such as …

  • weird uri encoding
  • apps pretending to talk SSL but only on some requests
  • not supporting BASIC authentication when they say they do
  • clients not sending the required STYLES WMS kvp
  • sending hundreds upon hundreds of chunked requests …

then these scripts/apps may be for you. They are pretty generic and can be applied to any AJAX-type cross-domain restriction. The only OGC-specific line is the string replace that swaps the online resource for the proxy URI, so that the URLs advertised in the GetCapabilities document point back at the proxy.

Other recommendations:

  1. For desktop-based apps, I highly recommend fiddler2 as a man-in-the-middle proxy interceptor for debugging HTTP. It even does HTTPS mitm :)
  2. If you want to enable HTTPS/BASIC authentication on a desktop client that doesn’t support it, check out InteProxy or email me for my own “Gismo” command line version. This will allow apps such as GRASS or QGIS, which only have standard WMS support, to magically start working on these services.

But if you are just trying to get your poor OpenLayers application talking to that lonesome WFS server sitting on the interweb, these might come in handy!

Note that these are open proxies by default!

<?php
	// Upstream service to reflect to and its BASIC auth credentials (placeholders, set your own)
	$base = "https://www.wms.com/server/to/reflect/to?";
	$user = "user";
	$pass = "password";
	$urlparams = urldecode($_SERVER['QUERY_STRING']);
	// Assumed: the target URL is the upstream base plus the incoming query string, as in the Python version
	$url = $base.$urlparams;
	$ch = curl_init();
	curl_setopt($ch, CURLOPT_URL, $url."&Styles="); // append the empty STYLES kvp that some clients forget
	curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
	curl_setopt($ch, CURLOPT_USERAGENT, "Openlayers proxy - CTweedie hax"); // Set a different user-agent so we can track usage easier
	curl_setopt($ch, CURLOPT_FAILONERROR, 1);
	//curl_setopt($ch, CURLOPT_VERBOSE, 1);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
	curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE); // the next 3 lines make it work through https SSL3 with authorization
	curl_setopt($ch, CURLOPT_SSLVERSION, 3);
	curl_setopt($ch, CURLOPT_USERPWD, $user.":".$pass);
	$data = curl_exec($ch); // Execute query
	// Point the online resource at the proxy so follow-up requests come back through it
	$data = str_replace("https://www.wms.com/server/to/reflect/to?", "https://www.wms.com/server/proxy?", $data);
	$content_type = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
	header('Content-Type: '.$content_type);
	echo $data;
	curl_close($ch);
?>

Python equivalent … almost identical to the OpenLayers version. In most situations, py urllib runs hands down quicker than php curl but it could well be my dodgy code!

#!/usr/bin/env python -u

import urllib
import urllib2
import cgi
import socket
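import msvcrt  # Windows only
import os
import sys
# Put stdout into binary mode so image responses are not mangled by newline translation (Windows only)
msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)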
# timeout in seconds
timeout = 15
socket.setdefaulttimeout(timeout)

fs = cgi.FieldStorage()
urlt = "https://www.wms.com/server/to/reflect/to?"

for i in fs.keys():
  urlt += i+"="+fs[i].value+"&"
url = urllib.unquote(urlt)
try:
    if url.startswith("http://") or url.startswith("https://"):
        passman = urllib2.HTTPPasswordMgrWithDefaultRealm()      # this creates a password manager
        passman.add_password(None, urlt, 'user', 'password')     # because we have put None at the start it will always use this username/password combination
        authhandler = urllib2.HTTPBasicAuthHandler(passman)      # create the AuthHandler
        opener = urllib2.build_opener(authhandler)
        urllib2.install_opener(opener)
        y = urllib2.urlopen(url)

        headers = str(y.info()).split('\n')
        for h in headers:
            if h.startswith("Content-Type:"):
                print h
        print
        print y.read().replace("https://www.wms.com/server/to/reflect/to?","https://www.wms.com/server/proxy?")
        y.close()
    else:
        print "Content-Type: text/plain"
        print
        print "Illegal request."
except Exception, E:
    print "Status: 500 Unexpected Error"
    print "Content-Type: text/plain"
    print
    print url
    print "Some unexpected error occurred. Error text was:", E
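
Since these proxies are open by default, one way to tighten them up is to check the target against an allow-list before fetching anything. The following is only a sketch in Python 3 (not the Python 2 of the script above); the ALLOWED_HOSTS set and the is_allowed helper are made-up names for illustration and are not part of either script.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list: only these upstream hosts may be reached through the proxy.
ALLOWED_HOSTS = {"www.wms.com"}


def is_allowed(url, allowed_hosts=ALLOWED_HOSTS):
    """Return True if url is http(s) and its host is on the allow-list."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    # urlsplit lower-cases the hostname and strips any port or credentials
    return parts.hostname in allowed_hosts


if __name__ == "__main__":
    print(is_allowed("https://www.wms.com/server/to/reflect/to?request=GetCapabilities"))
    print(is_allowed("http://some.other.host/anything"))
```

A proxy script would call is_allowed(url) just before opening the URL and answer with an error page when it returns False.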

Geoserver KML output

As mentioned a while back, Geoserver had some experimental code for KML output. The latest PR1 release has vastly improved KML support, largely submitted by James MacGill.

There was a recent question on the GS-Users list about how to use the sucker inside Google Earth. My personal preference is still for WMS overlays, but if for some reason you’d like your live data outputted as KML, read on.

1. First things first, grab the PR1 release.

2. Set up your desired datastore using the GS web interface. In my case I will configure a new ArcSDE datastore.

3. Add your new featuretype, making sure you set the SRS as 4326 and generate the corresponding bounding box.

Featuretype config

4. Do the old Apply/Save/Load trick to load your changes.

5. Now that our data is ready to go, we’d better check that KML output is supported. Send a WMS (yes, WMS) GetCapabilities request to your service and check that the following output formats are listed:


image/png
image/jpeg
image/svg+XML
image/gif
application/vnd.google-earth.kml+XML
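
If you would rather not read the capabilities document by eye, the check could be scripted. This is a rough Python 3 sketch; it assumes the format names appear as the text of Format elements, and the SAMPLE document and the advertised_formats helper are made up for illustration rather than taken from Geoserver.

```python
import xml.etree.ElementTree as ET

KML_FORMAT = "application/vnd.google-earth.kml+XML"


def advertised_formats(capabilities_xml):
    """Collect the text of every Format element, ignoring namespaces."""
    root = ET.fromstring(capabilities_xml)
    formats = set()
    for element in root.iter():
        # Drop any "{namespace}" prefix so the check works with or without one
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "Format" and element.text:
            formats.add(element.text.strip())
    return formats


# Trimmed, made-up capabilities fragment for illustration only
SAMPLE = """<WMT_MS_Capabilities>
  <Capability><Request><GetMap>
    <Format>image/png</Format>
    <Format>application/vnd.google-earth.kml+XML</Format>
  </GetMap></Request></Capability>
</WMT_MS_Capabilities>"""

if __name__ == "__main__":
    print(KML_FORMAT in advertised_formats(SAMPLE))
```

In practice you would feed it the body of the real GetCapabilities response instead of SAMPLE.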

6. We’re almost done. Now all we need to do is set up a corresponding network link to point to the “KML document” (which is, in fact, just a WMS call to the KML output format).

Add the following in the location box for a new network link,

http://localhost:8080/geoserverpr1/wms?service=WMS&version=1.0.0&request=GetMap&format=application/vnd.google-earth.kml+XML&width=500&height=500&srs=EPSG:4326&layers=topp:lga&styles=green

and set the refresh parameters to fly-based refresh after “4 secs”

Refresh params
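
If you have more than a couple of layers to link, the network link location can be generated rather than typed. A minimal Python 3 sketch, assuming the same parameters as the link above; kml_getmap_url is a made-up helper name. Note that urlencode percent-encodes characters such as ":" and "+", so the result will not look identical to the hand-typed link even though it decodes to the same parameter values.

```python
from urllib.parse import urlencode


def kml_getmap_url(base, layers, styles, width=500, height=500):
    """Build a WMS GetMap URL that asks for the KML output format."""
    params = [
        ("service", "WMS"),
        ("version", "1.0.0"),
        ("request", "GetMap"),
        ("format", "application/vnd.google-earth.kml+XML"),
        ("width", width),
        ("height", height),
        ("srs", "EPSG:4326"),
        ("layers", layers),
        ("styles", styles),
    ]
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(kml_getmap_url("http://localhost:8080/geoserverpr1/wms", "topp:lga", "green"))
```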

7. Assuming all went ok, you should now have a feature for each polygon which can be toggled individually.

Final

If you are feeling lucky, try adding label definitions and view scales to your SLD. Otherwise you may be unintentionally trying to retrieve a KML file containing your whole road dataset :)

Be aware that, because GS extracts each feature whole, the polygon extents can and will extend beyond the requested BBOX, which can be a good or a bad thing I guess.

Things that could well be added in the future: KMZ support, more customisable KML output (such as Z/height attributes) … the list goes on. The flexibility in using the available Geoserver datastores certainly makes this a viable alternative to using the 100 different “arc exporters”. You just can’t beat live data getting sucked straight from your database.

If this article interests you, please swing by the GS-Users list and say g’day; they are always keen to get more contributors on board.

Adding routing overlays to kamap

Finally, my promised follow-up to my build-your-own routing solution article. For those who have had success massaging their data to work with the pgDijkstra module, let’s rock and roll. I’m writing this on the fly, so hopefully by the end we can get a usable, user-friendly routing solution into Kamap.

1. Kamap install.

Grab the latest stable (or CVS if you’re feeling lucky) release and follow the instructions to get it up and running. Paul and the rest of the crew have made this possibly the easiest frontend to get going in a flash, but if you do run into problems please contact the list.

Ignore my setup / mess; this was an existing mapfile used for benchmarks. It is possibly what you shouldn’t have set up, but it’s got the road centrelines so it’ll do for our purposes.
kamap

2. Create a database handler

Since we would like the users to be able to interface with our db, we need to create a little interface to query the roads and execute the shortest_path_as_geometry call. For the sake of simplicity, the following should give you somewhere to start. (Source: querydb.php, if the output below gets munged)

Don’t worry if the output format doesn’t really make much sense at the moment; we’ll touch on this in the next step.

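<?php
// Start and end edge ids arrive as the ST and EN request parameters; cast to int before they go into the SQL
$start = (int) $_GET['ST'];
$end = (int) $_GET['EN'];
// SQL query (assumes a PostgreSQL connection was already opened with pg_connect)
$query = "SELECT astext(the_geom) FROM shortest_path_as_geometry('roads', ".$start.", ".$end.");";
$result = pg_query($query) or die('Query failed: ' . pg_last_error()); // output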
echo "n";
echo "n";
echo "	

3. Integrating PG’s kaXmlOverlay code

PG has done some great work looking into how best to integrate vector overlays on kamap, much like google maps does. Technically, there are lots of different options, but PG’s latest code uses a mix of the wz_jsgraphics and the PHP GD library.

This option has arguably the best cross browser support in that the route is actually rendered and thus positioned, as a PNG/gif image. But more on that later.

PG has posted a demo of the capabilities at his site (http://sistel.dyndns.info/ka-map/indext.html).

You will need to download the kaXmlOverlay.js, drawGeom.php and the wz_jsgraphics library.

Edit your existing kamap index.html and add,

The code is pretty self-explanatory: we simply define the path to the XML (querydb.php) and attach it to the map-initialised handler. PG has a slightly different setup on his demo website, adding a refresh function to automatically reload the XML doc at a set period. This is a real handy feature if you’re tracking a live GPS feed, but in our case it just adds extra overhead.

4. Time to test

Since the code contains a few points of failure, it’s best to start at the beginning,

  1. Choose a start and end edge id from your postgis table and try running http://your.host/kamap/querydb.php?ST=startid&EN=endid. You should get a well-formatted text/xml response with the routing coordinates. If the edge ids don’t exist or the geometry function did not work, you will get an error here.
  2. Now that you have determined that you are getting the coords, it’s time to try kamap. Load up the modified index from step #3 and tail the apache logs. Amongst all the js queries, there should eventually be one request for querydb, and then one for drawgeom (something like drawgeom.php?gt=L&st=5&bp=5&sc=25&cl=15,1800,0..). If after 30 seconds or so you don’t see any overlays, copy the exact request from your log and try to run it in isolation, e.g. http://localhost/kamap/drawgeom.php?gt=L&st=5&bp=5&… If GD is configured properly you should get an image output like the one below (PNG)
  3. gdoutput.png

  4. If it’s still not working, there’s probably just a URL that’s throwing a 404. Keep checking the logs for what it’s requesting, just to make sure it’s not trying to find a file in / rather than /kamap

5. The result

Apologies in advance, as I just didn’t have time to implement a more dynamic approach such as a user entering a start/end address. I hope someone else out there has the time and the inclination to extend this stuff. The possibilities are endless.

6. Problems and future additions?

  1. Needs a better way of converting geo2pix. With the existing js function you could potentially have 40+ coordinate pairs being converted to pixel space (DB->XML->JS->Drawgeom->Image) and then passed via a URL parameter to drawgeom.php. That is very inefficient, and can also go beyond the URL size limit … maybe in the short term the use of POST would be more suitable?
  2. Line simplification. Further work is needed to see how suitable the PostGIS simplify() function is. Conceptually, a function that guesstimates a suitable tolerance for the zoom level and then retrieves the simplified coords would do the job, but how easy it is to guess said tolerance remains to be seen.
  3. JS bloat. I’d prefer to move as much code as possible server side, especially for “calculations”. Being able to pass the current client params such as pixel/cell size, coords, scale etc. would mean much of the xmlOverlay.js code done by PG could be done server side, and potentially drawn in the same thread (i.e. no need for a separate drawgeom.php … the initial query would pass the results direct)
  4. An extension to the current querying abilities, where users can click on the map for their start and end points, and the click points would be translated into geo and then fed back into a postgis function to grab the closest road edge. This was what I wanted to do for this article, but alas there’s never time.
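
To give an idea of what moving the geo2pix conversion and the tolerance guess server side might look like, here is a small Python 3 sketch. It is not PG’s code: the function names are made up, the conversion assumes a north-up image whose top-left corner and cell size (map units per pixel) are known, and the two-pixel tolerance is just a guess to start from.

```python
def geo_to_pixel(x, y, min_x, max_y, cell_size):
    """Convert a map coordinate to pixel space.

    min_x / max_y are the map coordinates of the image's top-left corner and
    cell_size is map units per pixel. Pixel y grows downwards.
    """
    px = int(round((x - min_x) / cell_size))
    py = int(round((max_y - y) / cell_size))
    return px, py


def simplify_tolerance(cell_size, pixels=2):
    """Guess a simplification tolerance: detail smaller than `pixels` pixels is dropped."""
    return cell_size * pixels


def route_to_pixels(coords, min_x, max_y, cell_size):
    """Convert a whole route (list of (x, y) pairs) in one pass."""
    return [geo_to_pixel(x, y, min_x, max_y, cell_size) for x, y in coords]


if __name__ == "__main__":
    route = [(100.0, 200.0), (120.0, 180.0), (150.0, 150.0)]
    print(route_to_pixels(route, min_x=100.0, max_y=200.0, cell_size=10.0))
    print(simplify_tolerance(10.0))
```

The tolerance would be handed to simplify() in the query, and the pixel list could go straight to the drawing code instead of through a URL parameter.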

WMS Service Mining

Services such as Jeremy’s mapdex have certainly raised the profile, or rather the ease, of finding geospatial web services. Although it is limited to picking up ArcIMS at the moment, I eagerly await his next installment of WMS service support *hint hint*.

In the meantime, I had a query from a colleague about how to go about finding some open WMS services so he could test out one of his apps. While CS-W (Catalogue Services – Web) is brewing away in the never never, let’s use Mr Google,

http://www.google.com/search?q=allinurl:+"Request=getcapabilities"

12,100 results. A good enough start. Be wary, though, that pagerank doesn’t really tell us,

  1. which services are the more “popular” ones
  2. if the service is still alive
  3. if the service is even supposed to be public

I’d highly advise contacting the owner before using any servers in a non-test environment :)
Let’s narrow it down by country,

(Aus) http://www.google.com/search?q=allinurl:+"Request=getcapabilities"&meta=cr=countryAU

  • 30 results. Ouch

(US) http://www.google.com/search?q=allinurl:+"Request=getcapabilities"&meta=cr=countryUS

  • 980 results … better

(UK) http://www.google.com/search?q=allinurl:+"Request=getcapabilities"&meta=cr=countryUK

  • 16 results. Surprising.

Sure it’s hit and miss but at least it gives you somewhere to start. Who knows, you might even find a service which is documented and actually has metadata!

Skylab Mobilesystems has a rather large list available which was collected using their “WMS-Crawler”. No idea how it actually collects the servers, but I suspect it’s much the same as what we did with Google.

If there are further resources on public services, please leave a comment for others to use.

It raises an interesting point on WMS security though. Mapdex honours any ACL restrictions on ArcIMS services … but with no such alternative for most WMS apps, anyone who visits a getCapabilities document and then visits google is potentially opening up the server to the googlebot crawler.

I guess if you’re really that worried, you should look into adding your WMS applications to your robots.txt, or using some rewrite rule to deny access to identified crawlers.
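
If you go the robots.txt route, it is worth checking that the rules actually cover your WMS path. A quick Python 3 sketch using the standard library’s robots.txt parser; the /wms path, the example.com host and the crawler_may_fetch helper are made up for illustration. Keep in mind that robots.txt is only a request to well-behaved crawlers, not access control.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking all crawlers to stay away from a WMS endpoint at /wms
ROBOTS_TXT = """\
User-agent: *
Disallow: /wms
"""


def crawler_may_fetch(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(crawler_may_fetch(ROBOTS_TXT, "Googlebot", "http://example.com/wms?Request=getcapabilities"))
    print(crawler_may_fetch(ROBOTS_TXT, "Googlebot", "http://example.com/index.html"))
```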

Interesting point nonetheless.

ArcIMS stress test

Arcscripts can be a funny place. It can annoy you with commercial “shareware” or 8-year-old Avenue code, but sometimes, just sometimes, you stumble upon an extremely useful tool. I have been involved in an upgrade of ArcIMS from 4 to 9.1, and obviously I was looking into performance testing before we moved the new infrastructure into production.

I was *this* close to writing my own jmeter script when I found ArcIMS Stress Test, which effectively did everything I was looking for,

  • Parses an existing image/queryserver logfile to re-use for the test
  • Ability to use the same time between requests as in the log file or a preset delay
  • Ability to view the output of the sent requests
  • Ability to log test summary information to file

Stress dialog

Sure, it doesn’t log the hardware performance on the server or use any special threading / multi-user system, but in my case all I wanted to do was test a “typical day” on the new rig. Highly recommended.

Kudos to the author, Milos van Leeuwen