I grew up in New York City and spent enough time in Jersey back East to have witnessed a couple of interesting brawls in my life (I even had neighbors who dug holes for a living or worked in waste management), but it’s been a while since I’ve seen anything like the recent scuffle among industry analysts and vendors over the newly published ParAccel TPC-H benchmark. Maron!
It all started innocently enough two days ago when Merv Adrian, BI industry analyst emeritus, published the news on his blog in a post titled “ParAccel Rocks the TPC-H – Will See Added Momentum”.
Now, it’s not every day that a vendor publishes audited TPC-H benchmarks (“audited” being the key word, as the process runs around $100K from what I understand). Very few companies besides the Big Three have the deep pockets and technology prowess to accomplish that. Furthermore, ParAccel ran its benchmark at 30TB, which isn’t exactly a small chunk of data. And so Merv made the point that, at the very least, the news should certainly help put ParAccel on the map. To quote him: “This is a coup for ParAccel, whose timing turns out to be impeccable”.
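For readers who’ve never waded through the spec, a published TPC-H result boils down to two headline numbers: a composite queries-per-hour figure and a price/performance figure. As I read the spec (so take my paraphrase with a grain of salt), the composite is the geometric mean of a single-stream “power” run and a multi-stream “throughput” run, and the price number simply divides total system cost by it:

\[
\text{QphH@Size} = \sqrt{\text{Power@Size} \times \text{Throughput@Size}},
\qquad
\text{Price/Performance} = \frac{\text{total system cost (\$)}}{\text{QphH@Size}}
\]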
This was immediately picked up by none other than Curt Monash, BI analyst to the stars (and I say that quite seriously), who happens to despise the very concept of TPC benchmarks for reasons he clearly outlines in a recent post entitled “The TPC-H benchmark is a blight upon the industry”. To pull a couple of money quotes from the site:
“...the TPC-H is irrelevant to judging an analytic DBMS’ real world performance.”
“In my opinion, this independent yardstick [the TPC-H] is too warped to be worth the trouble of measuring with.”
“I was suggesting that buyers don’t pay the TPC-H much heed. (CAM)”
“TPC-Hs waste hours of my time every year. I generally am scathing whenever they come up.”
Now, notwithstanding the TPC-H issues, I think Curt will concede that he doesn’t particularly appreciate or trust ParAccel as a company either, as the following statements show:
“I would not advise anybody to consider ParAccel’s product, for any use, except after a proof-of-concept in which ParAccel was not given the time and opportunity to perform extensive off-site tuning. I tend to feel that way about all analytic DBMS, but it’s a particular concern in the case of ParAccel.”
“I’d categorically advise against including ParAccel on a short list unless the company confirms it is willing to do a POC at the prospect’s location.”
“The system built and run in that benchmark — as in almost all TPC-Hs — is ludicrous. Hence it should be of interest only to ludicrously spendthrift organizations.”
“Based on past experience, I’d be very skeptical of ParAccel’s competitive claims, even more than I would be of most other vendors’.”
The combination of a published TPC benchmark and the vendor that originated it seems to have created what Curt himself refers to as “the perfect storm”. To say he doesn’t like either would be a gross understatement :)
Both blogs immediately started getting “opinionated” comments from the public at large, including ParAccel’s VP of Marketing Kim Stanick and a gentleman named Richard Gostanian, who may or may not be connected to Sun Microsystems (depending on which Twits you read). Sun supplied the hardware for the ParAccel benchmark. To cite a couple of quotes from the comments, Richard Gostanian responds:
“Perusing your website, I detect a certain hostility towards ParAccel.” – (No kidding!)
“Indeed you do more to harm your own credibility than raise doubts about ParAccel.”
“…TPC-H is the only industry standard, objective, benchmark that attempts to measure the performance, and price-performance, of combined hardware and software solutions for data warehousing.”
“So Curt, pray tell, if ParAccel’s 30 TB result wasn’t “much of an accomplishment”, how is it that no other vendor has published anything even remotely close?”
Then Kim Stanick says: “It [TPC-H] is the most credible general benchmark to-date.”
And an anonymous reader chimes in:
“After reading Curt’s post about ParAccel and Kim this is obviously personal…I wonder why the little fella didn’t have a fit over Oracles 1TB TPC-H? Check his bio. He consults for Oracle.”
To which Curt replies (among other things):
“As for your question as to why other vendors don’t do TPC-Hs — perhaps they’re too busy doing POCs for real customers and prospects to bother.”
Ouch! This sudden nasty melee took me by surprise at a time when I was considering blogging about the whole TPC-H business for analytical engines anyway. I’ve wondered for quite a while whether publishing such metrics actually helps “new breed” startups like ourselves from a marketing and sales standpoint. Given the high cost and resource drain, what’s the return on this investment? What’s more, I have yet to meet a prospect or user who either cares or knows about TPC-H benchmarks. So far, the only people I’ve ever seen show any interest in the matter are venture capitalists and investors, which tells me right there that something is amiss (or maybe that’s why the small players take the plunge, I don’t know).
As some of you may know, XSPRADA is a recent member of TPC.org, alongside other industry startups like Kickfire, Vertica, ParAccel and Greenplum. Numerous other startups in the same category are not members, and they don’t seem to fare any worse. Furthermore, as best I can tell, even some existing members (namely Greenplum and Vertica) don’t publish audited benchmarks. Yet clearly these two vendors don’t seem negatively affected by the lack thereof.
Although we at XSPRADA have conducted TPC-H benchmarks internally (and continue to do so), we have never attempted to get them audited and published. If a prospect asked me about it, I would recommend we help him run those benchmarks in-house on his own hardware anyway! Even if we had $100K to blow on getting audited benchmarks, I’m not sure it would make sense to pursue them.
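For what it’s worth, running it yourself isn’t black magic: the TPC’s own dbgen tool spits out pipe-delimited flat files at whatever scale factor you ask for, and from there it’s a bulk load plus the 22 queries. A minimal sketch of the load step, using PostgreSQL-flavored COPY purely for illustration (every engine has its own bulk loader, and the file path is hypothetical):

```sql
-- Bulk-load dbgen's pipe-delimited output into the lineitem table.
-- PostgreSQL-flavored COPY shown for illustration only; your engine's
-- bulk loader will differ, and the file path here is hypothetical.
copy lineitem from '/data/tpch/lineitem.tbl' with (format text, delimiter '|');
-- Gotcha: dbgen ends every row with a trailing '|', which a naive loader
-- reads as an extra empty column; strip it or configure the loader around it.
```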
I’m usually a pretty opinionated black & white guy, but with respect to this TPC-H business, I tend to centerline. Strangely enough, I identify with both sides of the argument. On the one hand, I don’t believe the benchmarks to be totally useless. Having been involved in generating our internal results, I can vouch for the fact that it takes a lot of tedious work and kick-ass engineering just to complete the full run. This is not a small, inconsequential feat by any stretch of the imagination. Doing so on anything above 10TB is, in my opinion, nothing to sneeze at. If nothing else, being able to handle the SQL for all 22 queries is a decent achievement. And then of course, there’s the notion that even “trying” to do it is noble in itself. In that sense I tip my hat to the small guys who pulled it off.
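For non-practitioners, here’s a taste of what “all 22 queries” entails. This is Q1, the pricing summary report, roughly as it appears in the spec (I’m quoting from memory with the default substitution parameter, so treat it as illustrative rather than normative). And Q1 is the tame one; the later queries pile on multi-way joins, correlated subqueries and views:

```sql
-- TPC-H Q1 (pricing summary report), illustrative rendition.
-- Scans most of lineitem -- by far the largest table in the schema --
-- and aggregates pricing figures by return flag and line status.
select
    l_returnflag,
    l_linestatus,
    sum(l_quantity)                                       as sum_qty,
    sum(l_extendedprice)                                  as sum_base_price,
    sum(l_extendedprice * (1 - l_discount))               as sum_disc_price,
    sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
    avg(l_quantity)                                       as avg_qty,
    avg(l_extendedprice)                                  as avg_price,
    avg(l_discount)                                       as avg_disc,
    count(*)                                              as count_order
from lineitem
where l_shipdate <= date '1998-12-01' - interval '90' day
group by l_returnflag, l_linestatus
order by l_returnflag, l_linestatus;
```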
On the other hand, I don’t feel the benchmarks are all that useful for holistic evaluation purposes. If I were a prospect looking at several vendors, they might figure on my checklist, but not as significantly as other criteria I consider more important. Namely: how easy is the product to work with, what resources does it consume (human and metal), how does it play in the BI ecosystem as a whole (connectivity), and last but not least, what kind of support and viability will the vendor provide? I’m a little weird that way. I tend to evaluate companies based on their people over most everything else. But that’s just me.
At the end of the day (and everyone does seem to agree on that), what matters are onsite POCs. Nothing beats running your own data on your own metal. I want a vendor to hand me the keys and go “OK, have a good ride, call me if you need anything” and mean it. BMW sells cars this way. Enough said. This is what I drive toward when helping people evaluate our offering.
It remains to be seen how much of this brouhaha will benefit ParAccel in the long run. They say there’s no such thing as bad publicity. If they end up getting recognition and sales from it, then they have chosen wisely, and no one can take that away from them. Personally, I wish them the best. I believe the more numerous we are in this upstart game, the better it is for us and, more importantly, for our customers. So I say: leave the gun, take the cannoli.