WFS Feature paging … yes please

Sean posted his thoughts in response to Chris’s, and all I can say is: yes please!

My random thoughts,

  • Why this functionality was never built into WFS I will never know. After playing with CSW for the last 6 months, where similar “pagination” is available, it just makes sense. What the average Joe Bloggs should set maxFeatures to is beside the point if the user cannot even determine how many *total* features match the query. OGC CS-W handles this quite nicely, almost identically to how Chris H. described it. If I search for “hydro”, it will give me numberOfRecordsMatched="340" but also tell me that I’m just viewing the first 10 records.
  • Paging has been linked to server performance, particularly caching a set number of features. This, IMO, would only hold true if the given features are retrieved in the same manner each time. How this would handle filters (beyond the simple bbox) I’m a little unsure of. Just because search engines index http://sigma.openplans.org/geoserver/water_shorelines/100 doesn’t mean that the same features will appear on the same page 10 days later, for example. Checksum? HTTP Last-Modified? *shrug*
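
To make the CSW-style paging concrete, here is a minimal sketch of the arithmetic a client could do once the server reports a total. It assumes a 1-based start position (as with the startPosition parameter of CSW GetRecords) and reuses the 340 matches and page size of 10 from the example above. The function names are made up for illustration, and WFS itself offers nothing equivalent here.

```python
from math import ceil


def page_count(matched, page_size):
    """Number of pages needed to show `matched` records."""
    return ceil(matched / page_size)


def page_windows(matched, page_size):
    """Yield (start_position, count) pairs covering all matched records.

    start_position is 1-based, mirroring CSW's startPosition parameter.
    """
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    start = 1
    while start <= matched:
        # The last page may be shorter than page_size.
        count = min(page_size, matched - start + 1)
        yield start, count
        start += count


# 340 matches viewed 10 at a time, as in the "hydro" search.
windows = list(page_windows(340, 10))
print(page_count(340, 10), windows[0], windows[-1])
```

None of this is possible without the total: the client needs something like numberOfRecordsMatched before it can tell the user which page they are on.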

Looks like I need to pay more attention to the OGC boards :)

Geoserver testing ..

If you haven’t already heard, GS1.5 has been released and offers lots of little goodies hidden in the changelog. After lurking in the #geoserver channel and picking up tidbits here and there, I wanted to run some quick tests to confirm these magical WFS improvements. Refer to the following threads re: performance,

FYI, the test interface is PHP/cURL (local) -> Geoserver 1.5 (local) -> ArcSDE (remote). cURL just allows finer control of the WFS POST requests.
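
For anyone wanting to repeat this without PHP, here is a rough Python sketch of the same idea: build a WFS 1.0.0 GetFeature POST body and time the round trip. The type name, the endpoint URL and the output format strings (GML2, GML2-GZIP, SHAPE-ZIP) are assumptions for illustration, not copied from my test script, so check them against your server's capabilities document.

```python
import time
import urllib.request
import xml.etree.ElementTree as ET

WFS_NS = "http://www.opengis.net/wfs"


def build_getfeature(type_name, output_format="GML2", max_features=None):
    """Return a WFS 1.0.0 GetFeature request body as UTF-8 bytes."""
    ET.register_namespace("wfs", WFS_NS)
    root = ET.Element(
        f"{{{WFS_NS}}}GetFeature",
        {"service": "WFS", "version": "1.0.0", "outputFormat": output_format},
    )
    if max_features is not None:
        root.set("maxFeatures", str(max_features))
    # One Query element per feature type; no filter, so all features match.
    ET.SubElement(root, f"{{{WFS_NS}}}Query", {"typeName": type_name})
    return ET.tostring(root, encoding="utf-8")


def timed_post(url, body):
    """POST the body and return (elapsed_seconds, response_size_in_bytes)."""
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "text/xml"}
    )
    started = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        payload = response.read()
    return time.perf_counter() - started, len(payload)


# Hypothetical layer name; nothing is sent until timed_post() is called.
body = build_getfeature("topp:cadastre", "GML2-GZIP", max_features=15000)
print(body.decode("utf-8"))
```

Calling timed_post("http://localhost:8080/geoserver/wfs", body) once per output format would give numbers comparable in spirit to the ones below, assuming that URL matches your install.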

Partial-Buffer

15k Cadastral features
Tomcat 5.5 + …

* JDK 1.4.2

ZIP: 40.33 sec (775 KB)
GML: 6.85 sec (11.5 MB)
GML-GZ: 24.52 sec (618 KB)

Partial-Buffer2

15k Cadastral features
Tomcat 5.5 + …

* JDK 1.4.2

ZIP: 36.43 sec (775 KB)
GML: 6.52 sec (11.5 MB)
GML-GZ: 24.11 sec (618 KB)

* JDK 1.6u1

ZIP: 34.31 sec (775 KB)
GML: 5.88 sec (12.0 MB)
GML-GZ: 19.32 sec (618 KB)

:-(

Unfortunately I did not see a significant improvement from either the newer JDK or the new service strategies. Perhaps the bottleneck is in the I/O to the SDE datastore and not Geoserver itself. No time to test further on local datastores, but I promise I’ll post a follow-up comment with these later…
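
To put a number on the JDK comparison, a small sketch that computes the percentage change between the two Partial-Buffer2 runs. The timings are copied from the results above. They are single runs, so treat the percentages as indicative only.

```python
# Seconds per output format for the Partial-Buffer2 strategy, from the runs above.
jdk_142 = {"ZIP": 36.43, "GML": 6.52, "GML-GZ": 24.11}
jdk_16u1 = {"ZIP": 34.31, "GML": 5.88, "GML-GZ": 19.32}


def percent_faster(before, after):
    """Percentage reduction in elapsed time going from `before` to `after`."""
    return (before - after) / before * 100.0


improvement = {
    name: percent_faster(jdk_142[name], jdk_16u1[name]) for name in jdk_142
}

for name, pct in improvement.items():
    print(f"{name}: {pct:.1f}% faster on JDK 1.6u1")
```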

What’s surprising is the time required to create the zips. Everyone knows there is a cost involved in saving and writing the zip (instead of streaming the GML), but I didn’t realise it was that much. It’s a shame GML is not as commonplace as the old shapefile; otherwise I would certainly be pushing gml2-gzip or, even better, native “Accept-Encoding: gzip” headers to the container when requesting GML2 output. In my experience, as soon as you tell anyone that you can output a shapefile from Geoserver WFS, you can forget them ever considering GML again. These numbers may sway some of them at least.
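
On the “Accept-Encoding: gzip” idea, here is a minimal client-side sketch of what that negotiation looks like. It assumes the servlet container is configured to compress responses, which is a container setting rather than anything Geoserver-specific. The helper names and the URL in the usage note are hypothetical.

```python
import gzip
import urllib.request


def build_request(url, body):
    """Build a POST request that advertises gzip support to the server."""
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "text/xml", "Accept-Encoding": "gzip"},
    )


def decode_body(raw, content_encoding):
    """Undo gzip transfer compression if the server actually applied it."""
    if (content_encoding or "").strip().lower() == "gzip":
        return gzip.decompress(raw)
    return raw


def fetch_gml(url, body):
    """POST a WFS request and return the uncompressed response bytes."""
    with urllib.request.urlopen(build_request(url, body)) as response:
        return decode_body(
            response.read(), response.headers.get("Content-Encoding")
        )
```

Something like fetch_gml("http://localhost:8080/geoserver/wfs", request_body) would then return plain GML2 either way. The server is free to ignore the header, which is why decode_body checks Content-Encoding instead of assuming compression happened.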

Further testing to come ..

Someone’s doing some sniffing …

Sean is on the trail of something on MS-USERS ..

I couldn’t be bothered checking whether the software has the required MIT License enclosed, but let’s hope they do! While nothing is hindering piggy-backing on OSS projects, it certainly strikes a chord when it comes to doing the “right thing”. DMSolutions’ commercial products certainly know that balance; let’s hope they do the same.

duckhunt.jpg

Taking it to “the man”

Paul has graciously asked people to send him some queries to raise before the OGC technical meeting in Ottawa. Somewhat ironically, it has been exactly 7 months to the day since my rant on client support was first posted.

Unfortunately my post still stands. To reiterate my point,

While i understand the importance of server compliance using tools such as CITE, if the subsequent clients consuming these services are poorly implemented, the end user surely has to question the point of it all.

It should be all about the clients, baby! Unfortunately, outside of the OWS-X and other demonstrator projects around the globe (where arguably the roles are clearly defined), vendor support is more or less a waste of time. What can be done in one application can’t be done in another. Seemingly simple items of the specs are broken, poorly implemented or simply forgotten. Vendors are all too quick to leap to the conclusion that their *insert proprietary acronym here* could solve the problem, even though it’s entirely feasible to use the standards if their product simply supported them better. Oh, and let’s not forget that the product leaflet clearly states that the protocol is supported … but by how much? Who knows!

I think the following image sums up my feelings nicely, we need one of these …

yardstick.gif

Whether the state of WMS/WFS/WCS/CSW (…) client support is caused by a lack of motivation, a lack of client demand or vendor negligence, I won’t go so far as to guess. Certainly if OGC put as much emphasis on broadening the consumption of its standards as it does on jumping through hoops to get certified, I would have a lot less grief at work!

“I’m sorry “Frustrated-Consumer-of-OGC-standards”, what you have requested is entirely possible with the standards and server; however, your client does not support that manner of request. Can I advise hand-coding a *insert language here* script to post a request, parse the response, convert the format and then drop it into your GIS so you can do what you have asked??”

“Can you just send me the file? That will be easier …”

I hope the horrible analogy of “build it and they will come” holds true. Otherwise, we’re in trouble …