Big Honking Databases: Leave the Gun, Take the Cannoli

I grew up in New York City and spent enough time in Jersey back East to have witnessed a couple of interesting brawls in my life (I even had neighbors who dug holes for a living or worked in waste management), but it’s been a while since I’ve seen anything like the recent scuffle among industry analysts and vendors over the newly published ParAccel TPC-H benchmark. Maron!

It all started innocently enough two days ago when Merv Adrian, BI industry analyst emeritus, published the news in his blog titled “ParAccel Rocks the TPC-H – Will See Added Momentum”.

Now, it’s not every day that a vendor publishes audited TPC-H benchmarks (“audited” being the key word, as the process runs around $100K from what I understand). Very few companies besides the Big Three have the deep pockets and technology prowess to accomplish that. Furthermore, ParAccel did its benchmark based on 30TB which isn’t exactly a small chunk of data. And so Merv made the point that, at the very least, the news should certainly help put ParAccel on the map. To quote him: “This is a coup for ParAccel, whose timing turns out to be impeccable”.

Immediately, this was picked up by none other than Curt Monash, BI analyst to the stars (and I say that quite seriously), who happens to despise the very concept of TPC benchmarks for reasons he clearly outlines in a recent post entitled “The TPC-H benchmark is a blight upon the industry”. To pull a couple of money-quotes from the site:

“...the TPC-H is irrelevant to judging an analytic DBMS’ real world performance.”

“In my opinion, this independent yardstick [the TPC-H] is too warped to be worth the trouble of measuring with.”

“I was suggesting that buyers don’t pay the TPC-H much heed. (CAM)”

“TPC-Hs waste hours of my time every year. I generally am scathing whenever they come up”

Now, notwithstanding the TPC-H issues, I think Curt will concede that he doesn’t particularly appreciate or trust ParAccel as a company either, as the following statements show:

“I would not advise anybody to consider ParAccel’s product, for any use, except after a proof-of-concept in which ParAccel was not given the time and opportunity to perform extensive off-site tuning. I tend to feel that way about all analytic DBMS, but it’s a particular concern in the case of ParAccel.”

“I’d categorically advise against including ParAccel on a short list unless the company confirms it is willing to do a POC at the prospect’s location.”

“The system built and run in that benchmark — as in almost all TPC-Hs — is ludicrous. Hence it should be of interest only to ludicrously spendthrift organizations.”

“Based on past experience, I’d be very skeptical of ParAccel’s competitive claims, even more than I would be of most other vendors’.”

The combination of published TPC benchmarks and the originator of the benchmark seems to have created what Curt himself refers to as “the perfect storm”. To say he doesn’t like either would be a gross understatement :)

Both blogs immediately started getting “opinionated” comments from the public at large, including ParAccel’s VP of Marketing Kim Stanick and a gentleman named Richard Gostanian who may or may not be connected to Sun Microsystems (depending on which Twits you read). Sun supplied the hardware for the ParAccel benchmark. To cite a couple of quotes from the comments, Richard Gostanian responds:

“Perusing your website, I detect a certain hostility towards ParAccel.” – (No kidding!)

“Indeed you do more to harm your own credibility than raise doubts about ParAccel.”

“…TPC-H is the only industry standard, objective, benchmark that attempts to measure the performance, and price-performance, of combined hardware and software solutions for data warehousing.”

“So Curt, pray tell, if ParAccel’s 30 TB result wasn’t “much of an accomplishment”, how is it that no other vendor has published anything even remotely close?”

Then Kim Stanick says: “It [TPC-H] is the most credible general benchmark to-date.”

And an anonymous reader chimes in:

“After reading Curt’s post about ParAccel and Kim this is obviously personal…I wonder why the little fella didn’t have a fit over Oracles 1TB TPC-H? Check his bio. He consults for Oracle.”

To which Curt replies (among other things):

“As for your question as to why other vendors don’t do TPC-Hs — perhaps they’re too busy doing POCs for real customers and prospects to bother.”

Ouch! This nasty, sudden melee took me by surprise at a time when I was considering blogging about the whole TPC-H system for analytical engines anyway. I’ve wondered for quite a while whether or not publishing such metrics actually helps “new breed” startups like ourselves from a marketing and sales standpoint. Given the high cost and resource drain, what’s the return on this investment? What’s more, I have yet to meet a prospect or user who either cares or knows about TPC-H benchmarks. So far, the only people I’ve ever seen show any interest in the matter are venture capitalists and investors, which tells me right there that something is amiss (or maybe that’s why the small players take the plunge, I don’t know).

As some of you may know, XSPRADA is a recent member of TPC.org alongside other industry startups like Kickfire, Vertica, ParAccel and Greenplum. Numerous other startups in the same category are not members. They don’t seem to fare any worse. Furthermore, as best I can tell, even some existing members (namely Greenplum and Vertica) don’t publish audited benchmarks. Yet clearly these two vendors don’t seem negatively affected by the lack thereof.

Although we at XSPRADA have conducted TPC-H benchmarks (and continue to do so) internally, we have never attempted to get them audited and published. If a prospect asked me about it, I would recommend we help him run those benchmarks in-house on his own hardware anyway! Even if we had $100K to blow on getting audited benchmarks, I’m not sure it would make sense to pursue.

I’m usually a pretty opinionated black & white guy, but with respect to this TPC-H business, I tend to centerline. Strangely enough, I identify with both sides of the argument. On the one hand, I don’t believe the benchmarks to be totally useless. Having been involved in generating our internal results, I can vouch for the fact that it takes a lot of tedious work and kick-ass engineering to even complete the list. This is not, by any stretch of the imagination, a small or inconsequential feat. Doing so on anything above 10TB is, in my opinion, nothing to sneeze at. If nothing else, being able to handle the SQL for all 22 queries is a decent achievement. And then of course, there’s the notion that even “trying” to do it is noble in itself. In that sense I tip my hat to the small guys who pulled it off.

On the other hand, I don’t feel the benchmarks are holistically useful for evaluation purposes. If I were a prospect looking at several vendors, the benchmarks might figure in my checklist, but no more prominently than the criteria I consider more important. Namely: how easy is the product to work with, what resources does it consume (human and metal), how does it play in the BI ecosystem as a whole (connectivity), and last but not least, what kind of support and viability will the vendor provide? I’m a little weird that way. I tend to evaluate companies based on their people over most everything else. But that’s just me.

At the end of the day (and everyone does seem to agree on that), what matters are onsite POCs. Nothing can beat running your own data on your own metal. I want a vendor to hand me the keys and go “ok, have a good ride, call me if you need anything” and mean it. BMW sells cars this way. Enough said. This is what I drive to when helping people evaluate our offering.

It remains to be seen how much of this brouhaha will benefit ParAccel in the long run. They say there’s no such thing as bad publicity. If they end up getting recognition and sales from it, then they have chosen wisely, and no one can take that away from them. Personally, I wish them the best. I believe the more numerous we are in this upstart game, the better it is for us, and more importantly, for our customers. So I say leave the guns, and take the cannoli.

9 comments:

Justin Swanhart, June 24, 2009 at 3:01 PM
I know this is off-topic, but every time I see the name 'Gostanian', I can't help but think of 'Gozar' the 'Gozarian' from Ghostbusters. Seriously, it just keeps cracking me up every time I see it.

--Justin
Unknown, June 24, 2009 at 3:04 PM
That's funny, had not occurred to me - I just saw an Armenian heritage in the name :) I could be wrong. I did not take the time to look up the gentleman, assuming he would likely come clean on CAM's blog.
merv adrian, June 24, 2009 at 10:18 PM
Nice review. As the guy who published the piece that started it all, I've been enjoying it - as a blogger, you always want people to engage. And I do believe that benchmarks have a history of providing value by pushing vendors to optimize certain things - not always the "best" or most useful things at any point in time, but it all helped.

I bet we'll all be talking about this for some time. And it's a worthy conversation. Thanks for pushing it along!
Unknown, June 24, 2009 at 10:21 PM
Thanks Merv. I do believe _some_ sort of standard/benchmark is a worthy goal. Whether it ends up being TPC-driven or not is another story. It is good that this conversation should emerge now -- if nothing else we have ParAccel to thank for it -- and you obviously! :)
Colin White, June 25, 2009 at 5:26 PM
A good summary and balanced perspective. Thanks. Colin.
Neil Raden, June 25, 2009 at 7:31 PM
I can't find one thing in here to disagree with. And I love the title, from "The Godfather," as I recall.

We make most decisions in life without benchmarks. My wife for example, and that was a biggie, but I guess you could say we had a POC. hahahaha

-Neil Raden
Unknown, June 25, 2009 at 8:18 PM
@Neil -- you make a good point. About the wife thing I mean :) I think most people buy with the heart more so than the head although in enterprise software, I tend to hope it's not the case.
Anonymous, June 25, 2009 at 9:08 PM
Nice review of the discussion/debate. I'm in the middle like you. TPCs are like any other tool, good if used right, bad if used wrong.

I think what's really happening is that Curt is beginning to get some backlash for his misinformed writing over the past year or so as he labeled himself the go-to appliance guy. He's angered a lot of vendors and a lot of analysts, and he's reaping what he sowed.

But as they say, the only bad press is no press...
Unknown, June 25, 2009 at 9:26 PM
@Anon, thanks for the kind words. I am not presumptuous enough to pass judgement on anyone's writings in the industry, having been in it myself for a short time. But I can certainly say, from personal experience, that CAM is a straight-shooter, has low tolerance for BS, has always been generous in sharing information with me, and is certainly not a prima donna. All qualities I respect. I don't know if he's angered people, but as we say in French, to make an omelette, sometimes you have to break some eggs :)

Wednesday, June 24, 2009
