Wednesday, August 26, 2009

Mole Whackers Need not Apply

Two completely different events caught my attention lately. One of them is a post by Curt Monash called Bottleneck Whack-A-Mole, and the other is the much-publicized alliance for “BI in the Cloud” comprising RightScale, Talend, Jaspersoft and Vertica.

In the post, Curt describes software development (or developing a good software product) as “a process of incremental improvement”. Fair enough. The analogy he draws is between constantly fixing and improving performance bottlenecks and the annoying (if entertaining) arcade game of Whack-A-Mole where you have to be fast enough to clobber enough of the critters as they randomly pop up from below. He then makes the point that “Improving performance in, for example, a database management system has a lot in common with Whack-A-Mole.” Having spent most of my life designing, developing, improving and testing commercial and enterprise software applications, I have to say I don’t totally agree with his analogy for several reasons.

First, call me old-fashioned, but I’m an ardent believer in the fact that software building is deterministic. Whack-A-Mole engineering is not. The age-old controversy about software being more of an art than a science may never be resolved, but at the end of the day, I feel software is (or should be) a scientific, engineering-driven, deterministic endeavor like any other engineering discipline. With Whack-A-Mole engineering, buildings and airplanes fall to the ground. That’s not good. In my experience, those who seek to “romanticize” software engineering are typically averse to proper planning, design and testing, dismissing them as too “dry” or unworthy an endeavor. That’s nonsense.

Second, there is a distinct difference between the way you develop “regular” software and the way you develop “system software,” and I’ve learned this from sitting in the front row the past several years at XSPRADA watching database software being built from the ground up. It’s a little bit like the difference between building a tree house and a major commercial skyscraper. And I believe that playing Whack-A-Mole games while trying to bring up a building is a scary proposition at best (especially for future tenants). And yet, the example Curt provides involves Oracle’s Exadata, of all products! He states: “When I spoke with Oracle’s development managers last fall, they didn’t really know how many development iterations would be needed to get the product truly unclogged” – This statement is mind-boggling to me.

For one thing, it suggests that Exadata is “clogged” (ouch), but worse, that their engineering people have no clue as to how they might eventually (if ever) snake the blockages out of it! So, basically it’s a trial-and-error approach to building a database. Notwithstanding their “professed optimism” that it wouldn’t take “many iterations at all” to finally figure things out, it certainly doesn’t give me (or any reasonable person) a warm feeling about a multimillion-dollar product claiming to be the world's ultimate analytical machine.

I think there’s a lot to be said for sound engineering practices, proper planning and testing, setting expectations and deterministic engineering management practices in the world of system software. That Oracle (or Netezza for that matter, also referenced in the post) might just be going along whacking moles instead is a scary proposition indeed. Even if this little game is limited to “performance engineering” as Curt suggests (as if there were a more important endeavor in an ADBMS), that’s a serious allegation in my book. I say leave the arcade games to the kids, and let the real engineers design and implement database and system software, please. There’s no room for amateurs in this game.

On to my next point of interest: the new Gang of Four in the Cloud (with apologies to design pattern aficionados) comprising RightScale, Talend, Vertica and Jaspersoft have recently promoted and demonstrated a “bundled” on-demand package for the cloud. I attended their webcast yesterday and was impressed, but with reservations.

Each of these vendors is impressive on its own, no doubt about it. But it seems to me the bundled proposition might be confusing at best to the unwary customer. This new offering is billed by the marketing folks as “Instant BI, just add water” which drives me nuts. Look, it might be simple in theory, and it might take a few minutes to set up the stack on your own (as Yves de Montcheuil from Talend claims) but it’s still a long way from actually accomplishing anything serious in a few simple clicks. Sorry, not going to happen anytime soon.

You still have to work your way through provisioning and instance management (RightScale), data integration and loading (Talend), feeding and configuring the database (Vertica), and setting up the reports/analytics you might need (Jaspersoft). All of which can be accomplished just as easily (or not) internally, by the way. It’s true you’d still have to purchase or license Vertica internally, which may or may not match the SaaS pricing (I’m not sure, and either way, Vertica has a SaaS offering as well), but the other components are open source, so I’m not sure I see the big advantage there.

An interesting thing I noticed as well is that some people didn’t seem to understand what RightScale’s role was in the whole offering. This tells me they don’t really grasp the intricacies of “the cloud” – because instance and infrastructure management for the enterprise in the cloud is not trivial and you do need something like RightScale to grease the wheels (it’s an abstraction layer really), but I think many people assume moving to the cloud is “magic” and makes all these issues disappear. If that were the case, you wouldn’t need RightScale in the mix. Beware of undermanaging expectations, I'd say.

Additionally, the pricing model (which is supposed to be so much simpler in the cloud) is confusing at best as each vendor has its own menu. The best answer to that I can remember was “starting at $1,700 per month” – I’m not sure what to make of that. So I think from an engineering/technical standpoint, this endeavor is noble, but from a “let’s make things simpler and more transparent for the user” perspective, there’s still a lot of work to be done. In other words, it's a nice play for the vendors holding hands, but I'm not sure how beneficial it might be to the average enterprise user.

As usual, caveat emptor – Beware promises of a holy grail in BI as there is no such thing. It’s all about work. Hard, detailed and careful work with proper planning and budgeting. In that respect, setting up successful BI solutions is a lot like running and implementing software projects. There are no shortcuts, and it’s not a job for mole whackers.

24 comments:

  1. The rising prices in the aviation industry and the unheard-of charges that are being levied on airfares have disturbed people all over the world. This change has shocked people all the more as they had become accustomed to comparatively cheap airfares. The IT industry is also following the same trend. Concepts like outsourced software development and offshore software development were born and took hold due to the combination of optimum-quality products at competitive rates. Sadly, the IT sector is also experiencing a major downward slope in the present times. http://www.infysolutions.com/resources/resources.html

  2. While I cannot comment for Oracle Corp directly, I can say with firsthand knowledge that in no way, shape or form is Oracle's Exadata "clogged". The context of the conversation Curt Monash had with whomever at Oracle was likely just a foresightful, conservative and responsible comment and I see no reason that it should be looked upon with any negativity. Exadata is able to scan and feed rows to the Oracle database grid significantly faster than any non-Exadata storage technology. By removing the I/O bottleneck from a system, naturally some other bottleneck(s) will float to the top and will be addressed as they are discovered. For the most part these new issues are not difficult to resolve; it's more a matter of sometimes you don't know until you get there. In my experience, performance engineering is more like a bubble sort than whack-a-mole, but nonetheless a natural part of the process. I think any performance engineer worth their title/salary will attest to that.

  3. @Prof Bud: Thank you for your comment, although I'm not sure how it applies to this specific post. Thanks for the information anyway :)

  4. @Greg: Thank you for reading the blog and commenting. I can't agree that whoever talked to CAM from Oracle was in any way "responsible" by suggesting they fly by the seat of their pants. It should be looked upon with some level of negativity because at the very least, their engineering team should have a roadmap for performance - I'm not suggesting one gets it right the first time, obviously, but at the very least you establish reasonable goals and have some kind of plan - at least that's what I've done and seen in my experience. I'm not arguing for or against specific performance numbers from Exadata -- but am simply suggesting that they might have gotten there "by chance" if I understand CAM's post correctly. It is after all what he (and not I) is suggesting. Also I don't see why removing some bottlenecks would necessarily generate others. I don't make that connection as naturally as you do. From what you (and CAM via Oracle) are suggesting, when you get good performance it's always by chance, and lucky trial & error. I simply cannot accept this premise, which offends me as a software engineer :)
    Good discussion -- thanks.

  5. Jerome,

    You are WAY misinterpreting my post.

    What's more, to my knowledge, you're the only person reading it that way.

    I continue to be amazed that you would claim your engineers are so brilliant that they always know exactly how their products will perform before being built, and never ever have any surprises.

    Or, if that's not what you're claiming, then your company is in the same group as all the others I wrote about, and your whole post makes no sense whatsoever.

    CAM

  6. Or to put it another way -- Greg is right and you are wrong. :)

    CAM

  7. Curt, I'm not sure how it's possible to misinterpret your post. I am only quoting from it and its semantics are clear! Furthermore, I make no such claim of perfection or brilliance. I am only claiming that software engineering should be deterministic and not based on trial & error. Clearly, no one can guarantee performance before the product exists, but having clear, achievable goals measurable via benchmarks is not an unreasonable expectation from a company like Oracle, is it? How about a roadmap to some sort of performance level as opposed to trying this out, then that, then maybe the other? I may have misread your post but clearly this is what you are implying about NZ and Oracle's MO. And best I know you have far better/tighter connections into those 2 companies than I do :)

  8. I might point out you might be right about my being the only person to not only read it "that way" but maybe the ONLY one to read it at all :) I was surprised at the lack of comments -- your posts always have comments -- and felt compelled to address them in a post because I thought it was significant and worthwhile public information. So either way, I think you did everyone a huge service by relaying this information.

  9. @Jerome

    I see nowhere that anyone from Oracle who talked to Curt in any way suggested that they are "fly[ing] by the seat of their pants" when it comes to performance. There is always a roadmap for performance; it would be ludicrous to think otherwise and I have not suggested anything to the contrary. Exadata's performance is certainly not "by chance" or "trial and error"; it is by design. However, as with any software/hardware system pushing the performance envelope, issues can appear when it is pushed to extreme or previously untested levels. This, I believe, is the point that Curt is speaking to and the point that it seems you are misreading.

    Let me throw this out there for you to ponder: Netezza's new TwinFin comes in sizes from 3 to 120 S-blades, the latter being a 10 rack system. How many developers at Netezza do you think have their own 10 rack system with 1+PB of data to run their tests on? Could it be possible that performance issues that are not visible on a 1, 2 or 4 rack system could show their nasty head on a 10 rack system?

    Now does my comment about sometimes you don't know until you get there make more sense?

  10. @Greg: "When I spoke with Oracle’s development managers last fall, they didn’t really know how many development iterations would be needed to get the product truly unclogged. Of course, they professed optimism– which seemed quite sincere – that it wouldn’t be many iterations at all. But they confessed, as well they should have, to not truly knowing."

    If there's a roadmap in there then you're right, I missed it.

    As for the NZ question, I am not privy to their internal resources. Nevertheless, I trust they are honest and straightforward with their users and would let them know if they are being used as beta testers. As in look, we have not tested this internally, can you help us out here? I don't see anything wrong with that. I may have missed it but I have not seen Oracle doing that out there -- seems to me they're pitching the product as fully baked. I'm not saying one can anticipate every problem out there or get everything right the first round (God knows!) but I do find it shocking that Oracle's dev managers would have no idea how many iterations it will take them to get "somewhere" (wherever that might be).

    Now mind you, it's better they admit to not knowing rather than throw an arbitrary guess out there. I'll grant you (and CAM) that about their commendable honesty.

    Nevertheless I'll reiterate the point: if I'm a customer looking at dropping $6M for Exadata (for the sake of argument) and I read this, it wouldn't give me a warm feeling.

    Would you fly in a plane that still had tweaks and "un-clogging" to be done ad-hoc? If Boeing or Airbus engineers played whack-a-mole with airliners would that inspire confidence? What's different about enterprise databases?

  11. The discussion is about intangibles.

    "When I spoke with Oracle’s development managers last fall..."

    My translation:
    "Well yeah, there could be some performance problems, and if there are, well then I'm sure we'll take care of them as quickly as possible. As with all such things, we can't identify every possible problem with every possible workload, but we'll fix the problems as we come to them and we're pretty sure we'll get it all worked out pretty quickly."


    When humans aren't 100% sure about something they
    A) make up bullshit
    B) make up timelines which are mostly bullshit
    C) make up excuses ahead of time to compensate for the bullshit
    and
    D) will not commit to anything remotely concrete about said bullshit

  12. @swanhart: Well, between the clogging, the un-clogging and the bullshit you'd think this was a plumber's convention :) - But I think you're probably right about that. As I posted on DBMS2 recently, I think the key thing to remember is, as CAM puts it, the "highly engineered, highly complex Oracle DBMS" -- it's hard to control, anticipate the behavior of, or manage such a huge beast; there are no ifs or buts about it.

  13. @swanhart: I could not agree more with your translation. Spot on.

    @Jerome: Curt's comment on DBMS2 says it all:
    "I continue to find it regrettable that you are (were) punishing them for their honesty in what looks like an attempt to score cheap marketing points at their expense. Hence my vigorous defense of them against your misrepresentation."

    I don't understand why you feel the need to keep pushing out FUD when you have zero firsthand information. Your comments about Exadata and the Oracle database are based purely on speculation, not fact.

  14. @Greg: the information I have I extracted from Curt's post. I didn't make the quotes up. You have a beef with my interpretation, so be it. It is silly to assign "cheap marketing points" to my motivation, which was purely from a software engineer's point of view. As if a peon such as myself (who is not in marketing, mind you) could possibly score anything marketing-wise against a behemoth like Oracle. Puh lease. Additionally, I don't see how CAM defended Oracle in any way. All he did was laud their honesty. In no way did he either contradict or feel compelled to defend what he judges to be a natural process.

  15. As someone who has a wee bit of Exadata experience I think I'd like to throw a word or two in here. I found this since I've been following XSPRADA and reading Monash of course...

    If the "unclogging" paraphrase/quote is indeed accurate, I don't think that is a bad thing at all. Allow me to explain...my words, not the words of Oracle Corporation....

    Think about it this way. Exadata is as good as it is but not perfect since there is no such thing. Eek, I'm watching out for that bolt of lightning...

    Exadata is a software/hardware combination. Oracle are continually finding ways to "unclog" it to get even more out of it. How stupid would Oracle be to say it is perfect in the current embodiment? It can only get (even) better.

    Yes, a product like Exadata is all about plumbing. That's data flow. Hardware can bottleneck software and vice versa. The database grid is responsible for eliminating what data to go after (in large swaths via partition elimination). Exadata is responsible for getting that data...and specifically *that* data by weeding out uninteresting rows and columns through predicate evaluation, projection and join filtering. That data "flows" from the storage grid to the database grid. That is plumbing. Producer/consumer data flow can always be tuned. Pipe work is an iterative process, but I won't go for the mole-whacking bit. Only a product lacking a PRD would get engineered that way (IMHO).


    DISCLAIMER: The opinions expressed in this comment are my words and mine alone and do not reflect the views of Oracle Corporation

  16. @Kevin: First, I want to thank you for following the blog and taking the time to post. It's nothing short of an honor. I've obviously been following your blog for a while now.

    Second, I thank you for saying "but I won't go for the mole-whacking bit. Only a product lacking a PRD would get engineered that way (IMHO)" - Which was _exactly_ the point I was trying to make!

  17. XSPRADA_MAN,

    I like good blogs...and I think you did make that point you refer to...

    Off Topic:
    Say, if you get a chance sometime, see if Mr. Piedmonte (Founder, XSPRADA) remembers our phone chat several years ago. A good friend of mine (at a VC investigating an investment in XSPRADA at the time) set up a due-diligence call between Christopher and myself. I recall an enjoyable, interesting conversation...ok, sorry for the off-topic bit...




    DISCLAIMER: The opinions expressed in this comment are my words and mine alone and do not reflect the views of Oracle Corporation.

  18. @Kevin - Be happy to put you and Chris back in touch. Email me at jerome.pineau@xsprada.com and I'll send you his contact info.

  19. Jerome,

    Ugh, I was just exchanging pleasantries...getting in direct contact with the CEO of a company in the DW/BI/Analytics database space is a great way for a guy like me to end up floating face down in a shallow pool somewhere...just say, "Hi." That will suffice.

  20. Kevin,

    LOL, yeah gotcha. Hate when that happens. Wouldn't wanna see ya sleeping with the fishes :) - I'll pass the greeting along.

  21. @Jerome wrote:
    "Second, I thank you for saying "but I won't go for the mole-whacking bit. Only a product lacking a PRD would get engineered that way (IMHO)" - Which was _exactly_ the point I was trying to make!"

    I don't believe this point is made or supported anywhere within your post, hence all the back-n-forth discussions. Comments like "that their [Oracle's] engineering people have no clue as to how they might eventually (if ever) snake the blockages out of it! So, basically it’s a trial and error approach to building a database." do not appear to suggest or support that point at all and seem to be simply reckless and irresponsible; you are simply throwing Oracle and its engineers under the bus. Would you be appreciative of such comments/suggestions about the XSPRADA product or its engineers based on the very limited third party comments you based your post on? I would think not.

    I hope that helps clarify my objections to your post.

  22. @Greg - Not to beat a dead horse but again: the only point I was making is that software engineering is deterministic and whack-a-mole processes are NOT. And that if I were a customer hearing about this it would not give me a warm feeling. I did not decide to "throw" anyone under the bus (as if I could!) but simply picked up on Oracle and NZ as they were mentioned in CAM's blog.
    Some folks seem to think I _did_ make the point, and others not. I suppose I'll have to be clearer in subsequent opinion pieces; you're right.

  23. @Greg - Sorry I neglected to address your question about "do unto others" - Namely how I would react to comments/suggestions about XSPRADA in a similar scenario if the tables were turned. First, I'd be jazzed that someone was talking about XSPRADA to begin with :) Second, I'd be wanting to have a conversation with whomever at XSPRADA gave the impression that we work in a whack-a-mole kind of way natively. Third, if the allegation turned out to be correct, I'd certainly bring it up to management because I think it's dangerous. There is a large difference between controlled chaos (which is often the nature of software engineering) and a throw-your-hands-up attitude to meeting the unknown (which is what whack-a-moling is to me). In other words, obviously you cannot plan for every problem/eventuality but you have to try and frame the effort somewhat, either with requirements/roadmaps or some kind of plan -- knowing full well it won't likely be followed to the letter. This is not unlike battlefield management. You can't go in there and say well, yeah I'm not sure where the enemy will be so, when they start shooting, we'll deal with it and kill them as best we can. You gotta have a plan -- any plan is better than no plan. I don't know of any other engineering discipline where this whack-a-mole business would be tolerated. I know it's a fact of life in software but I don't embrace it.

  24. @Jerome

    Like you, the purist in me strives for a deterministic-like approach; however, the realist in me also recognizes that pragmatism oftentimes wins out. I think it is important to recognize that in the real world a balanced mix seems to yield the most positive outcome. I also think it should be noted that even the best software engineers and architects don't (and couldn't) think of everything pre-production, especially when working with systems of large scale and complexity. When you've worked on such a project you will likely relate to my comments.

    I'd also like to note that Curt writes "Improving performance...has a lot in common with Whack-A-Mole". He never directly calls it Whack-A-Mole engineering nor makes any comments that slight any Oracle (or Netezza) product or engineering for their iterative process, but you have done both. That is where I believe you crossed the line. Curt's post speaks to an iterative process achieving better and better performance, which, if you have done any performance work, you would understand is exactly how it is done. Performance bottlenecks are found in the lab running benchmarks and tests, not in a text editor looking at code. Thus when you take a market-leading product like Oracle Exadata and Oracle Database, and you iterate over these performance issues, the result is even better and faster products.

    So when you suggest "[Oracle's] engineering people have no clue", you threw them under the bus.

    'Nuff said. Peace out.
