The more I read, test, and learn about on-demand BI every week, the more surprised I am at how many players there are on the market. And this isn’t even just a small and medium business (SMB) market anymore, as I originally assumed. There’s a whole slew of on-demand BI companies out there targeting serious enterprises with gigabyte- and terabyte-scale warehouses.
It seems one of the main counters to the performance (latency) concerns around BaaS is that the penalty is imposed only once, at load time. In other words, yes, uploading warehouse data is time-consuming and fairly slow (sometimes taking weeks) given current network and pipeline bandwidth, but it only has to be done once; subsequent pushes are essentially incremental and consequently much quicker. I can buy that argument provided the “done once” endeavor is resilient enough to survive catastrophic errors. For example, if I’ve spent 9 of 10 days waiting for my upload to finish when my network connection drops or my server explodes, I had better be able to pick up and continue right where I left off.
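To make that resilience requirement concrete, here’s a minimal sketch of a chunked, checkpointed upload that can resume after a dropped connection. The endpoint, chunk size, and checkpoint format are all my own assumptions, not any vendor’s actual API:

```python
import json
import os
import requests  # any HTTP client would do; used here for brevity

CHUNK = 8 * 1024 * 1024        # 8 MB per chunk (arbitrary choice)
STATE = "upload.checkpoint"    # records the last acknowledged byte offset

def resume_offset():
    # If a previous run died mid-upload, start from where it left off.
    if os.path.exists(STATE):
        with open(STATE) as f:
            return json.load(f)["offset"]
    return 0

def upload(path, url):
    offset = resume_offset()
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        f.seek(offset)
        while offset < size:
            chunk = f.read(CHUNK)
            # Content-Range tells the (hypothetical) server where this chunk goes.
            headers = {"Content-Range": f"bytes {offset}-{offset + len(chunk) - 1}/{size}"}
            requests.put(url, data=chunk, headers=headers).raise_for_status()
            offset += len(chunk)
            with open(STATE, "w") as s:
                json.dump({"offset": offset}, s)  # checkpoint after each acknowledged chunk
    os.remove(STATE)  # done; the next full load starts clean
```

Nothing fancy, but it’s the difference between losing 9 days of uploading and losing 8 megabytes.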
But even the “big guys” are dipping their toes in the SaaS pool lately. As this article points out, SAS is investing in the cloud big time.
Vertica seems to have tacked from an “appliance” model to a hosted or cloud-based one (Vertica for the Cloud), as evidenced by their latest webinar, The Cloud and the Future of DBMSs, in which they pretty much repeat their usual marketing litany.
Kognitio, who claims to have put the “D” in DaaS, just announced a cloud deal with Kelkoo.com, “Europe’s largest e-commerce website after Amazon and eBay.” They also have a cloud-based DaaS implementation with British Telecom (BT), which, incidentally, is about to lay off 10,000 people. Kognitio has been one of the most cloud-aggressive companies out there.
And of course behemoth Microsoft is breathing down everyone’s neck (discreetly, at the moment) via things like www.sqlserverdatamining.com/cloud and the entire Madison and Azure platforms.
Pretty much all the big players have some sort of stake in the “cloud” one way or another; no one wants to be left out, just in case. But beyond these well-known players, there are also outfits like the ones below, some of which address specific analytical niches:
www.deciphertech.com – Sales analytics for Salesforce.com.
www.hostanalytics.com – Financial analytics (budgeting/revenue planning) niche. I initially thought these guys might be connected to www.i-lluminate.com, judging by the audio on their website.
www.adaptiveplanning.com – Budgeting, forecasting, and reporting analytics.
www.quantivo.com – Customer behavior analytics.
www.1010data.com – I believe they use Tenbase on the backend, are columnar in architecture, and offer both an ODBC connector and an Excel plug-in.
www.shsna.com – Nutricia North America (owned by Danone) makes baby food and now also runs Pentaho over MySQL in the EC2 cloud for its internal BI needs, as described here.
www.kpiweb.com – A French startup focusing on (I’m willing to bet) KPIs.
www.limtree.com – Another French startup. In fact, just a QlikView integrator.
Speaking of French sites: if you can read French, check out the BI blog at www.legrandbi.com. If you can’t, head over to translate.google.com and read it anyway, because it’s a precious resource full of interesting, in-your-face BI insight and informational tidbits I haven’t found elsewhere.
www.pivotlink.com, which I mentioned in a previous post, is geared exclusively at large enterprise warehouses in the cloud (small players need not apply) and is backed by Trident Capital, from what I gather.
Even data integration seems to have made some inroads into the cloud realm. I often read that ETL and integration consume 70-80% of a typical BI project. I don’t know whether the proportion is really that high, but I _do_ know from experience that integration tends to get grossly underestimated. That being said, it’s clearly a huge BI pain point, and I was initially surprised to see anyone trying to do it on a hosted/cloud basis, but the guys at www.boomi.com are pitching just that.
And speaking of Boomi, I have to hand it to them for trying this approach, which I believe has merit, but they have about the worst webinar I have _ever_ attended. I think they’ve managed to do absolutely everything a company should avoid doing in a webinar, namely:
- Advertise a webinar lasting more than 60 minutes. Right off the bat, that doesn’t give me a warm feeling. How could they possibly need that much time when everyone else manages to stay at or under an hour?
- Make it an expensive pain in the ass for people to connect. There’s a toll (not toll-free) number to call in the US. Fine. Then there’s also a 9-digit access code. Then there’s an audio PIN, then another 9-digit webinar code. What the heck?
- Display what seems like an interactive question-and-answer widget during the session, but in fact have no one managing it on the other end. I hate when that happens.
- Hire a consultant presenter who is obviously quite astute technically but sports a depressively monotone voice.
- Take 35 minutes to explain how to take an FTP input, transform some rows, and output the result to a folder, all on a local machine. Fascinating, but I think most people could have grokked this rocket science in, say, five minutes (see the sketch just after this list). No wonder they need 90 minutes to get through all this!
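For the record, here’s roughly what that five-minute version looks like. The host, credentials, file names, and the trivial transform are all invented for illustration; this is my sketch, not Boomi’s actual connector code:

```python
import csv
import io
from ftplib import FTP

# Grab the input file over FTP (host, credentials, and file are made up).
buf = io.BytesIO()
with FTP("ftp.example.com") as ftp:
    ftp.login("user", "password")
    ftp.retrbinary("RETR input.csv", buf.write)

# Transform some rows and write the result to a local folder.
rows = csv.reader(io.StringIO(buf.getvalue().decode("utf-8")))
with open("output/result.csv", "w", newline="") as out:
    writer = csv.writer(out)
    for row in rows:
        writer.writerow([col.strip().upper() for col in row])  # a trivial transform
```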
The point being, I couldn’t stand more than 35 minutes of this treatment and decided to bail out and try it later on my own. Basically, these guys have an agent-based architecture that lets you connect to their “AtomSphere” (get it? Like atmosphere. Yeah) and have agents manipulate your data via connectors and transformations. You can set up and “send” agents onto other boxes and platforms.

It did sound interesting technically, so I tried to pull their demo from the site, to no avail. I got a message saying they welcomed me. Great. Where’s the bits at? I emailed support and was assigned a “request for assistance” case number. Wow. Finally I got another message saying my account was active and I could log in from the main page (which contradicts the initial instructions claiming you’ll get an email with a link in it). Oh, and as for the suggestion that, since I was new to Boomi, I should register for one of their training webinars: thanks but no thanks. I will definitely try it out though; it’s too compelling to pass up over a few minor glitches from weak marketing or customer support departments.
Moving right along, I did want to mention www.lyzasoft.com. Even though their offering is not “on-demand” per se, it kind of is, in a “local” way. Basically, you download their Java thick-client application, plop a bunch of connectors onto a workbook, bring in some data, and start graphing or analyzing it within minutes. All point-and-click, drag-and-drop. Yes, I know this sounds like a “so what” scenario, but you don’t understand: I actually had _fun_ using this thing, yet it’s far from a toy! I wasn’t planning on spending more than 30 minutes with the product but ended up messing around with it for a couple of hours. You can pull in anything from ODBC to flat-file connections, then graphically describe relationships among tables (i.e., joins), then merge that output with other data sources into a graph or statistical “component,” where you drag measures and attributes into corresponding axis “boxes” (much like the Excel pivot-table designer). Amazingly enough (to me), this stuff just worked. It’s really cool how you can try things out and then back out or delete, then start from scratch or add and remove relationships and data at will. It’s very intuitive. There are some performance and UI quirks (not surprising for Java UI code on Windows), and I doubt you can bring in significant (read: terabytes) amounts of data at this stage. Lyzasoft claims a maximum of 175-200 data input sources, with its largest customer databases at a “few million rows” of around 250-300 columns; a few million rows times roughly 275 columns at a few dozen bytes per value works out to something like 30-50GB. But overall it’s an impressive beginning and, quite honestly, probably easy enough to adapt to a hosted model. Add to that excellent and efficient real-time customer support, and you have a winner worth looking at here, one that could, in my opinion, pose a serious challenge to someone like QlikTech, with the right analytical engine behind them.
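If you want a feel for what that drag-and-drop flow amounts to under the hood, here’s a rough pandas analogue. The files, columns, and aggregation are all invented, and this is my sketch of the workflow, not Lyza’s actual machinery:

```python
import pandas as pd

# Two sources; the files and columns are made up for illustration.
orders = pd.read_csv("orders.csv")        # a flat-file connection
customers = pd.read_csv("customers.csv")  # could just as well come over ODBC

# The graphical "relationship" step is, underneath, a join on a shared key.
merged = orders.merge(customers, on="customer_id")

# Dragging a measure and attributes into axis boxes == building a pivot table.
pivot = merged.pivot_table(values="amount", index="region",
                           columns="quarter", aggfunc="sum")
pivot.plot(kind="bar")  # the graph "component"
```

The whole appeal of Lyza, of course, is that you get all of this without writing a line of code.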