Author Archive for Chris Tweedie

The tilecache goldrush

Does anyone else not see a problem with the trend over the past few years? “Tile-itis” is reaching critical mass and it is driving me bonkers. We’re taking away styling, reprojection, tile sizes and giving them … tiles. No wait, fast tiles? Really? Oh, so I can put them on Google Maps? Awesome. Can I have them in projection X? No, sorry, we don’t have another terabyte to reseed the cache. Can I have just the streets? No sorry, same problem.

Why do I seem like the only one asking “wtf” when I see something like this at OAM,

This means, as a rule of thumb, that the network must store ((4/3) + 1) * 3 = 7 MB of imagery plus tiles for every 1 MB of source imagery uploaded. If we load up all of the approximately 4 TB of LandSat-7 data at a 30m resolution, and generate a complete tile set, we will need 16-28 TB of storage in the network to hold it all. If stored on EC2, this would cost up to US$3,000 per month — and that’s just for one layer at a low resolution.

Or when a user asks a simple question

We want to serve the US NAIP Aerials in 1m resolution (which are a total of about 4.7 TB of MrSid/Jp2 data) on a interactive  web map as an optional map background. [sic] .. we determined early on is that MapServer is too slow to serve compressed imagery such as the native MrSid Jp2 imagery on the fly for our needs. [On using Mapserver to serve uncompressed tifs] … would also “blow up” the total data volume to something about 60 TB … Thus, we are in the process of researching options on how to serve the compressed data as fast as possible “on the fly” and without the need for caching them on disk

All replies, except one from (somewhat ironically :) ) Christopher Schmidt, ignore the initial constraint and instantly tell the user a cache is required.

The root of the problem is the assumption that for every organisation, every deployment, you absolutely, unequivocally must create a tile-geo-arcgis-spatial-osm-mapproxy-squid-cache. We’ve gotta do what Google does! I truly fear many organisations are being misled and are unnecessarily transitioned to tiling solutions when quite frankly they don’t need to be. More importantly though, GIS software representatives are using the community’s addiction(?) to tiling everything to mask what is, quite frankly, poorly performing software to begin with.

So let us all take a deeeep breath next time you’re scoping out an imagery solution. Why do you need a tile cache? That’s great that your cache can max out a 100mbit connection (it’s not hard), but you’ve not only increased your storage requirements by a factor of 4, 8 or 20, you’ve also taken away other functionality for your customers and limited yourself to one convention.

If you do need a cache, and by crikey they are needed in many situations, implement LRU or a hybrid cache solution, but most importantly, give your customers the original WMS service. For all its warts, at least it gives them some options.
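For anyone who has not built one, here is a minimal sketch of what an LRU tile cache in front of an on-demand service might look like. The class name, the `(layer, zoom, column, row)` key layout and the `fake_render` stand-in are all made up for illustration; a real deployment would call the WMS in place of `fake_render` and would most likely bound the cache by bytes on disk rather than by tile count.

```python
from collections import OrderedDict
from typing import Callable, Tuple

# Hypothetical key layout: (layer, zoom, column, row).
TileKey = Tuple[str, int, int, int]


class LRUTileCache:
    """Hold at most max_tiles rendered tiles, evicting the least recently used."""

    def __init__(self, render: Callable[[TileKey], bytes], max_tiles: int) -> None:
        if max_tiles < 1:
            raise ValueError("max_tiles must be at least 1")
        self._render = render          # called only on a cache miss
        self._max_tiles = max_tiles
        self._tiles: "OrderedDict[TileKey, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: TileKey) -> bytes:
        if key in self._tiles:
            # Hit: mark the tile as most recently used.
            self._tiles.move_to_end(key)
            self.hits += 1
            return self._tiles[key]
        # Miss: render on demand, store, then evict the oldest entry if over budget.
        self.misses += 1
        tile = self._render(key)
        self._tiles[key] = tile
        if len(self._tiles) > self._max_tiles:
            self._tiles.popitem(last=False)
        return tile

    def __contains__(self, key: TileKey) -> bool:
        return key in self._tiles

    def __len__(self) -> int:
        return len(self._tiles)


def fake_render(key: TileKey) -> bytes:
    """Stand-in for an on-demand WMS request; returns placeholder bytes."""
    layer, zoom, col, row = key
    return f"{layer}/{zoom}/{col}/{row}".encode()


if __name__ == "__main__":
    cache = LRUTileCache(fake_render, max_tiles=2)
    for key in [("landsat", 5, 1, 1), ("landsat", 5, 1, 2), ("landsat", 5, 1, 1)]:
        cache.get(key)
    print(f"hits={cache.hits} misses={cache.misses} stored={len(cache)}")
```

The point of the design is that storage is capped at whatever you choose, popular tiles stay hot, and anything else (another projection, another style, just the streets) falls through to the original service.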

So to answer both quotes above,

  1. Storing 4 TB of uncompressed Landsat 7, 30m data for the whole world as a single ECW compressed at 1:20 takes approx. 200 GB, is visually lossless and costs $30 per month to store on Amazon S3. As some examples, I have the following 3 band mosaics:
    1. Landsat742.ecw, 1,414,317 px x 534,778 px which totals 2,515,088 KB (yes, that’s ~2.5 GB). Did I mention this was created way back in 2003?
    2. Melbourne.ecw, 413,333 px x 346,667 px which totals 30,626,916 KB or ~30 GB from our friends at SKM Ausimage
    3. Metro_Central_2007_Mosaic.ecw,  224,100 px x 304,400 px which totals ~11.5 GB from Landgate
  2. ERDAS Apollo can serve all these mosaics as 256px tiles on demand and still max out the 100mbit network; no problems. To prove it, I ran our tiling test tool over a gigabit connection back to Apollo to see the throughput over a short 180 second test plan:
    1. Landsat.ecw
      1. Random: 31837 tiles, avg 181.79 tiles per second, RT 0.03 seconds, throughput 15.2 MB / sec
      2. Sequential: 60673 tiles, avg 314.41 tiles per second, RT 0.02 seconds, throughput 26.65 MB / sec
    2. Melbourne.ecw
      1. Random: 10286 tiles, avg 109.92 tiles per second, RT 0.05 seconds, throughput 13.43 MB / sec
      2. Sequential: 39980 tiles, avg 230.25 tiles per second, RT 0.02 seconds, throughput 34.89 MB / sec
    3. Metro_Central_2007_Mosaic.ecw
      1. Random: 35585 tiles, avg 203.18 tiles per second, RT 0.02 seconds, throughput 33.15 MB / sec
      2. Sequential: 47191 tiles, avg 271.19 tiles per second, RT 0.02 seconds, throughput 51.12 MB / sec
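To put those figures against the 100mbit yardstick, a quick conversion helps. The sketch below copies the tiles-per-second and MB-per-second values from the list above and assumes 1 MB means 10^6 bytes (the test tool's exact definition is not stated here, so treat the outputs as approximate). The function names are just illustrative.

```python
# (mosaic, access pattern, tiles per second, MB per second), copied from the results above.
RESULTS = [
    ("Landsat", "random", 181.79, 15.2),
    ("Landsat", "sequential", 314.41, 26.65),
    ("Melbourne", "random", 109.92, 13.43),
    ("Melbourne", "sequential", 230.25, 34.89),
    ("Metro_Central_2007_Mosaic", "random", 203.18, 33.15),
    ("Metro_Central_2007_Mosaic", "sequential", 271.19, 51.12),
]


def to_mbit_per_sec(mb_per_sec: float) -> float:
    """Convert MB/s to Mbit/s, assuming 1 MB = 10**6 bytes."""
    return mb_per_sec * 8.0


def avg_tile_kb(tiles_per_sec: float, mb_per_sec: float) -> float:
    """Average tile size in KB implied by the two throughput figures."""
    return mb_per_sec * 1000.0 / tiles_per_sec


if __name__ == "__main__":
    for mosaic, pattern, tps, mbps in RESULTS:
        print(
            f"{mosaic:28s} {pattern:10s} "
            f"{to_mbit_per_sec(mbps):7.1f} Mbit/s  "
            f"~{avg_tile_kb(tps, mbps):5.1f} KB per tile"
        )
```

Under that assumption every run, including the slowest random-access one, comes out above 100 Mbit/s.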

So instead of looking at pure throughput of the cache tile server (which has been proven to be a fizzer), if we also take into account the storage requirements and plot the two variables, I know which one I’d choose. That ERDAS Apollo license is looking pretty damn attractive right now, isn’t it … isn’t it *starts shaking*?
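The storage side of that comparison is simple enough to work through. This sketch uses only numbers already quoted in this post: the OAM rule of thumb of ((4/3) + 1) * 3 = 7, the ~4 TB of source imagery, the 1:20 ECW compression ratio and the two monthly cost figures. It assumes decimal units (1 TB = 1000 GB), and the 28 TB result is the upper end of the 16-28 TB range in the OAM quote. Variable names are mine.

```python
from fractions import Fraction

SOURCE_TB = Fraction(4)                      # approx. Landsat 7 source imagery, from the OAM quote
TILE_FACTOR = (Fraction(4, 3) + 1) * 3       # OAM rule of thumb: imagery plus tiles per unit of source
ECW_RATIO = Fraction(20)                     # 1:20 compression quoted above
GB_PER_TB = 1000                             # assumption: decimal units

full_tiles_tb = SOURCE_TB * TILE_FACTOR      # source imagery plus a complete tile set
ecw_gb = SOURCE_TB / ECW_RATIO * GB_PER_TB   # single compressed mosaic
storage_ratio = full_tiles_tb * GB_PER_TB / ecw_gb

# Monthly costs as quoted in this post, not computed from any price list.
TILES_USD_PER_MONTH = 3000                   # "up to" figure from the OAM quote
ECW_USD_PER_MONTH = 30
cost_ratio = Fraction(TILES_USD_PER_MONTH, ECW_USD_PER_MONTH)

if __name__ == "__main__":
    print(f"tile factor:        {float(TILE_FACTOR):.0f}x")
    print(f"full tile set:      {float(full_tiles_tb):.0f} TB")
    print(f"single ECW at 1:20: {float(ecw_gb):.0f} GB")
    print(f"storage ratio:      {float(storage_ratio):.0f}x")
    print(f"quoted cost ratio:  {float(cost_ratio):.0f}x")
```

`Fraction` is used only so the 4/3 term stays exact; plain floats would give the same answers to within rounding.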

What I also find interesting is there seems to be a slight resurgence back to on-demand solutions after, invariably, users realise the scalability or flexibility issues with full tile caches. JPEG2000 seems to be making a comeback for image serving, that’s for sure, but don’t forget Kakadu has the same licensing restriction as the ECWJP2 SDK; it ain’t free-as-in-beer either. OSM mod_tile is also a good example of a hybrid solution with on-demand rendering.

ps. Has anyone tested beyond 100mbit on any other tiling solution?

pps. ERDAS has its own tiling container format known as OTDF. Clearly this is for our most demanding customers where they need performance above and beyond the above

FUD, FUD, FUD some more

The Simon Hope vs Paul Ramsey posts have some classic asides.

I just had to re-quote the following comment from Atanas Entchev as it made me laugh. I am now personally tasked with seeking out and destroying this mysterious section of psychologists deep within ERDAS headquarters. I will also disassemble all subliminal messages embedded within our marketing and my blog *dons hat*. I have even heard the ESRI psychologist department is some 500 people strong!!

[sic]… the flawed assumption that decision-makers always make decisions based on reason. The “dealers”, on the other hand, know this to be false. So they employ (I speculate) psychologists to design sales tactics (such as FUD) that identify and target decision-makers’ *emotions*. They sell the sizzle, not the steak.

I like my sizzle as well as a good steak. If the steak tastes appalling I send it back. If I didn’t inquire as to what I was ordering and expected pork? Well …

And then from Ian Turton,

Does your software use open standards that allow me to switch to another program next year or am I hooked to a conveyor belt of increasing license charges year after year?

Yes, my software does use open standards and yes, if you’d like to switch to another program next year, be my guest. How many organisations using opensource switch from mapserver to geoserver to mapnik to deegree to mapguide and back again every year? Mapwindow to QGIS to GRASS to UDIG to JUMP? SQLite to Postgres to Mysql? FDO to OGR to Geotools …? Despite the FUD from opensource radicals (for lack of a better word) that proprietary solutions come with a perpetual ball and chain, this just isn’t true. Sure, some workflows are locked in, but certainly not to the degree some make out, and I’d be hard-pressed to think of many without alternatives.

Come on lads, the underlying expectation here is that the vendors are somehow responsible for corporate (or not) entities selecting the wrong tool for the job or paying through the nose when there are viable and cost effective alternatives. Due diligence is king. After all, you are the ones with the $$, the phone to the ear, the door you can close, the conference you didn’t have to attend, the support and maintenance you didn’t have to renew and the software you didn’t have to use. It’s my job to prove to you the value of ERDAS offerings, just as it’s Simon’s job to prove ESRI, Brett’s to prove Mapinfo or FME and Cameron’s to prove Mapserver or Geoserver. What’s the diff, really, between Cameron doing the pushing and the first three?

http://blog.cleverelephant.ca/2010/05/whos-your-dealer.html

ERDAS Apollo results updated

With Apollo 10.1 about to get out the door I have updated the WMS image serving benchmark results. I’m still yet to update the product-by-format graphs as I will be rolling out a more dynamic and easier to maintain page shortly. One main addition was extending the ECW test plan from 150 to 300 users. I received a lot of requests from people wondering what the peak throughput was, which turns out to be not much higher at around 120 maps per second (but still, crazy quick at ~2 sec avg response).

ERDAS has also just registered for the Benchmarking event in Barcelona, which brings the tally to 11 products, which is great to see *cue herding cats picture*. So everyone, please stop asking me :-)

ERDAS Apollo vs ESRI ArcGIS Server

Let’s face it, whatever benchmark results a vendor (*gasp*) publishes always draw a certain amount of suspicion. Luckily however, T-Mapy (Czech Republic) have just made available a detailed independent 20-page report on ERDAS Apollo vs ESRI ArcGIS Server.

T-Mapy have a long history with ESRI and now also ERDAS technology, so they offer great perspective and expertise on both products. Michal Šeliga has done a wonderful job analysing performance and other metrics for serving a very large (290 GB) 10cm aerial photo via WMS. Word of the day goes to “eyemetricaly worse” on page 13 :)

Image serving updates

I had hoped to post a lot more WMS image serving challenge results by now, but to date only Robert Parker at Lizardtech has taken me up on the offer with Express Server 6.1. Apologies to Rob for taking so long to publicize the results as he was very eager to send them through and I’ve been sitting on them for well over a month now. Gold star to Lizardtech.

ESRI? Autodesk? Deegree? Mapserver? Geoserver? Oracle? Manifold? Mapinfo? … Show me your muscles (in my best Arnie voice). Don’t forget that results from real world users, not just developers are just as valuable.

On the benchmark side of things, I have also updated the Mapserver results with the 5.6.1 build. Sheesh, talk about being very vocal :) ECW support was dropped and unfortunately I was unable to get the Kakadu JP2 driver working. I’ll update the individual graphs when I get some time but here is the formats-by-product result. The solid, bold line represents 5.6.1, the dotted stroke the original 5.4 results. Yes, something crazy happened on the TIFF External test but I reproduced the result over the typical 3-test run … will revisit that one later.