Friday, July 17, 2009

ADR and how I got kicked out by Kickfire

I had the opportunity to brief Merv Adrian (major BI industry analyst) recently about XSPRADA and in discussing competitive technological differentiation, I highlighted the three major points that make XSPRADA uniquely stand out in a sea of “new-breed” ADBMS vendors, namely our algebraic engine (aka ALGEBRAIX), adaptive data restructuring (ADR), and temporal invariance. Those of you who have honored me with your readership in the past will no doubt be familiar with those terms and concepts. (Side note: Merv is the one who coined this ADBMS term that I shamelessly borrow constantly now -- thank you Merv!)

I want to focus a little bit on ADR in this post because in the midst of our discussion, Merv asked me a really good question about it. He said “if you’re busy doing ADR, and more and more queries come in and more and more users get on board, what will the impact be on performance?” Excellent point! Quite honestly, no one had ever asked me this before, so after our call I did a little more research and came up with a few more questions of my own, all of which I’d like to discuss here.

First of all, to recap, ADR is the process of adaptively restructuring data both logically (the structures in RAM, for example) and on storage (disk) based on the nature of queries coming in. What does this mean? Well, for example, if a query is clearly pulling only certain columns from a table (as is typically the case in OLAP), ADR will pull these columns out and optimally lay them out on disk for more efficient access. ADR could also include indexing (as in bitmap, in low-cardinality cases) and any other means at its disposal to optimize queries pertaining to these columns. Sound familiar? It should, as this is basically the principle behind columnar systems. However, ADR doesn’t stop there.

It may, for example, decide that sharding row blocks is a more efficient strategy given a particular query pattern and set out to do just that as well. It may decide that duplicating certain pieces of information on given disks is more I/O efficient. In memory, it may decide to implement different indexing schemes depending on the nature of the queries. In short, ADR has absolute “carte blanche” to take every means at its disposal to optimize the system in real time.
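To make the idea concrete, here is a minimal sketch of query-driven restructuring. It is not XSPRADA's implementation, just a toy illustration: the class name AdaptiveTable, the hot_threshold parameter and the "count accesses, then build a per-column copy" policy are all assumptions made up for this example.

```python
from collections import Counter


class AdaptiveTable:
    """Toy row store that builds per-column copies for frequently queried columns.

    Hypothetical illustration only; names and policy are assumptions.
    """

    def __init__(self, rows, hot_threshold=3):
        self.rows = rows                    # list of dicts: the original row layout
        self.hot_threshold = hot_threshold  # accesses needed before restructuring
        self.access_counts = Counter()      # how often each column was requested
        self.columnar = {}                  # column name -> list of values

    def select(self, columns):
        """Return the requested columns as a list of tuples, one per row."""
        for col in columns:
            self.access_counts[col] += 1
            # Restructure once a column has proven popular enough.
            if (col not in self.columnar
                    and self.access_counts[col] >= self.hot_threshold):
                self.columnar[col] = [row[col] for row in self.rows]

        if all(col in self.columnar for col in columns):
            # Fast path: touch only the restructured columns.
            return list(zip(*(self.columnar[col] for col in columns)))
        # Slow path: scan the full rows.
        return [tuple(row[col] for col in columns) for row in self.rows]


table = AdaptiveTable(
    [{"a": 1, "b": 2, "c": 3}, {"a": 4, "b": 5, "c": 6}],
    hot_threshold=2,
)
table.select(["a"])        # first access: answered from the row layout
table.select(["a", "b"])   # "a" is now hot and gets its own column copy
table.select(["a"])        # answered from the column copy alone
```

The point of the sketch is only that the layout follows the workload: nobody declares up front which columns matter, the queries themselves decide.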

Unlike most other ADBMS out there, the XSPRADA engine is a living, breathing entity constantly striving for optimal performance. And it has more than a few tools at its disposal (in other words, not a one-trick pony, compliments of the underlying mathematics). So the question is indeed legitimate: how does this impact performance, if at all?

The answer, unsurprisingly, is: it depends. First, it’s important to realize the design principle behind ADR. It is called “crowdsourcing”. The philosophy is that the more people hit the database from all angles, the better chance there is of being able to optimize the database. From a technical perspective, overload is always a possibility, as with any other system. If too much is submitted at a given time, the system could theoretically run out of resources and impact performance. But when properly configured (balanced) with proper amounts of storage, the system should not degrade with additional users and queries coming in.

Note that concurrent queries are also beneficial to the system. This is because data streams can be shared between query processes in parallel, thereby reducing disk I/O. Additionally, ADR is not user- or query-specific. So there is no risk of one ADR action resulting from Query1 “clobbering” another ADR action from Query2. When sharing is possible, it is implemented and maintained in the mathematical model (algebraic integrity). But all queries from all users enter the “algebraic space” together in one big pool. This means everyone’s actions benefit everyone else. The XSPRADA engine is a very populist one :)
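The stream-sharing point can be sketched the same way. This is a generic "shared scan" and not a description of XSPRADA's internals: the function shared_scan and the (predicate, column) query shape are assumptions invented for the example.

```python
def shared_scan(rows, queries):
    """Evaluate several filter-and-sum queries in a single pass over rows.

    queries maps a query name to a (predicate, column) pair.
    Hypothetical illustration only.
    """
    totals = {name: 0 for name in queries}
    for row in rows:  # each row is read once, however many queries are waiting
        for name, (predicate, column) in queries.items():
            if predicate(row):
                totals[name] += row[column]
    return totals


rows = [
    {"region": "east", "sales": 10},
    {"region": "west", "sales": 5},
    {"region": "east", "sales": 7},
]
queries = {
    "east_sales": (lambda r: r["region"] == "east", "sales"),
    "all_sales": (lambda r: True, "sales"),
}
print(shared_scan(rows, queries))  # {'east_sales': 17, 'all_sales': 22}
```

Two queries, one read of the data. Add a third query and the I/O cost stays the same, which is why more concurrent users are not automatically bad news.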

On a completely different topic, I wanted to relate a funny incident that happened to me on the way to the Forum recently. Well okay, it wasn’t exactly a forum per se but rather the newly minted Kickfire on-demand trial process. They call it Cloud-based Trial.

I enthusiastically signed up for the trial and got an assigned time-slice for today. Within minutes, I get the following email from Kickfire’s Karl Van den Bergh, who is Kickfire’s Vice President of Marketing and Business Development (phew! Talk about long names to match long titles).

“Thanks for your interest in Kickfire. Our trialing system is reserved for prospects and partners. As our companies are somewhat competitive we are not able to give access to XSPRADA at this time.”

Wow. Cold man.

But next morning, I get this email from the company’s trial support team:

“Congratulations! This email confirms that your Kickfire On-Demand Trial will begin at 10:00am (PST) on 07/17/2009.”

Then thirty minutes later I get this:

“jerome, Your reservation has been deleted. Reservation #sc14a5eb4e82d0dc.”

By that time I’m thinking okay, somebody at Kickfire Trial Support finally got their behinds kicked (pun intended) for daring to confirm my trial session. Apologizing for this confusion is Karl again:

"Apologies for the registration mix up this morning - we have a number of system administrators who weren't in synch."

Yeah, I’d say. But wait, there’s more if you order now! Early this morning, after my session was supposed to end, I get a phone call from Kickfire asking me how my trial went! I couldn’t help but blurt out “you guys are really confused”. Then I explained that Karl had branded me persona non grata, at which point I was promptly dropped like a bad case of H1N1.

Now, don’t get me wrong. I love this Kickfire appliance concept. And I have a positive, informative relationship with technical folks there who are nothing if not extremely competent (and nice people to boot). And although I cannot play with the Kickfire appliance either locally or remotely, from what I’ve heard and read, these guys have a top-notch team, great technology, a seasoned CEO and attractive market positioning.

In thinking about this, my initial reaction was that I was stupidly naïve. Honestly, it never occurred to me that Kickfire would ban any “competitor” (or anyone else for that matter) from remotely playing with their software. The reason is that we don’t do this at XSPRADA. As a matter of fact, anyone can pull our bits for free from our website, including Kickfire (which already has). So to me this has a strange, secretive, “we have something to hide” feel which, given what I know about Kickfire, really took me by surprise.

But it's true I have this weird concept about openness that many other vendors share, and I’m happy to put our stuff in competitive hands (after all, the customers certainly will!) anytime and get any feedback from the experience, negative or not. In my experience, few competitors will go out and trash another vendor’s offering, much less try to reverse-engineer it (at least not in this industry) -- Maybe I’m foolish.

However, it’s true I don’t have a “marketing” bone in my body. As an engineer and evangelist, I’ve always been adamant about transparency, peer review and feedback. So maybe it’s a good thing I don’t handle XSPRADA Marketing, or Kickfire’s for that matter! :)

2 comments:

  1. Hi Jerome,

    Let me repeat to you again our apology for the confusion caused on the first day we launched our on-demand trial. This was due to a couple of system administrators acting independently of one another. This was an unfortunate one-off event that has not since been repeated.
    Regarding the follow-up call you received, it was not, as you believed, related to the trial but rather to a webinar of ours you had recently attended. As you can see, we are not completely secretive. We are happy for competitors to attend our events ;-)

    Best regards, Karl

  2. Karl, it's been 2+ months now; I've had plenty of time to get over it emotionally :)
    No worries.
    Best,
    J.
