Thursday, March 12, 2009

There are only two types of people I cannot stand: people who are intolerant of other people's cultures, and the Dutch.

In my quest to mine the business intelligence market to the last drop, when I’m not busy doing other things like talking to people, setting up POCs, coding, looking at product features or evaluating what I call “ecosystem” products like BI and ETL platforms, I spend a heck of a lot of time online reading blogs and scanning websites.

This market is very dynamic and it’s important to keep up and spot trends and pain points early on. This isn’t unlike an NSA type of endeavor really. There are gazillions of “signals” out there and it’s up to me to pick up the pieces and make sense out of them as best I can.

When I do this successfully, I can not only enhance our competitive edge, but also discuss intelligently how we fit into the business intelligence “big picture”. In this business, someone who is not constantly educating himself is in a world of hurt.

To do this, I scan several dozen blogs and websites on an almost daily basis (see my partial list following this posting). This activity, combined with talking to folks in the BI world, product specialists and consultants, being involved in coding and testing a full-fledged database product, and attending webinars for the past year, has given me a certain outlook on the market that carries “no baggage”. Let me explain.

I don’t come at this with the perspective of someone who has been immersed exclusively in the BI world for decades, or someone who has the hands-on background designing and implementing data warehouses his entire career. I’m just a software engineer who happens to have worked for a multitude of different businesses and in numerous domains, not just BI. So I feel this gives me a more “detached” view of the DW/BI market than an industry expert with BI-specific experience might have. That being said, the following are nothing more than observation-based personal opinions about the warehousing and BI market developed over the past 12 months. I’d love to hear your thoughts as well – at the risk of being branded a mere neophyte.

And then there was light…
Acceptance of “non-standard” (read: not from Microsoft, IBM or Oracle) approaches to analytical databases (including non-relational approaches) is going mainstream in the warehousing realm. Four to six years ago, if you talked to anyone in the enterprise about trying a “non-relational” database, they’d flip the bozo bit on you instantly. Nowadays, although relational “bigots” still exist (and always will), BI people are more open-minded, mostly thanks to dozens of “new-breed” players in the market with well-documented successful implementations (not the least of which are Netezza, Teradata, Vertica and Sybase IQ). To use a phrase I detest, people are finally “thinking outside the Big Three box”.

Gimme a little OLTP with that OLAP would you?
The distinction between operational (transactional) and warehousing (analytical) business activity is blurring. Operational and analytical business efforts are often integrated. Warehousing and analytics are no longer “the crazy uncle in the attic” project. Another sign of this is the recent desire to enable far more frequent inserts and updates to warehouse stores than in the past. Not content with infrequent batch updates, people are now looking at efficiently pushing updates and inserts into their warehouse in real time. The updates often come from both an ODS and external data sources. Unfortunately, most analytical databases are designed around the assumption that warehousing is mostly read-only (InfoBright ICE doesn’t even support DML) and are optimized accordingly. I think they’re up against hard times unless they can “transactionalize” quickly.

Open source: cheaper than free?
Open source is having a large impact in analytics for both economic and practical reasons. Nowadays, businesses can set up data marts using engines like InfoBright (MySQL), for example, and use BI/Integration tools like Talend’s Open Studio or Pentaho’s Mondrian. Clearly a lot of this is driven by current economics, but open source deployment and licensing models are competing head-on with proprietary solutions on many levels.

Hey buddy, want some good BI?
Everyone’s talking about BI. Microsoft is running TV ad spots about it during prime time and pushing the concept in a very public manner. After buying up DATAllegro last year, they announced the Madison project, put out best-practice configurations for Dell and HP “appliances” hosting SQL Server, and have been discreetly adding warehouse-oriented features to SQL Server since 2007. What’s more, they own the BI desktop with Excel, and Office 14 promises a significant OLAP functional push. Vaporware? Maybe, but I think they’re gunning to eat a lot of BI folks’ lunches out there in the coming months. If anyone can commoditize DW and BI technology, it’s Microsoft.

Performance, shperformance
“Performance” is a big mystery. No one really knows how to define it in the DW/BI space. Is it load time? Is it query response time? Is it data presentation to result time? Does it include backup time? How about data recovery? Opinions differ. Service level expectations are often misguided or unrealistic. People don’t seem to care much about TPC-H or TPC-DS performance metrics.
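To make the ambiguity concrete, here is a minimal sketch that times two of those candidate definitions, load time and query response time, as separate numbers. Everything in it is an assumption for illustration: the in-memory SQLite database is only a stand-in for whatever engine is being evaluated, and the `sales` table, the synthetic rows and the helper names `time_phase` and `run_mini_poc` are made up.

```python
import sqlite3
import time


def time_phase(fn):
    """Run fn() and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def run_mini_poc(row_count=10_000):
    # In-memory SQLite is a placeholder engine (assumption), not a real warehouse.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region INTEGER, amount INTEGER)")
    # Synthetic data: ten regions, one row per integer amount.
    rows = [(i % 10, i) for i in range(row_count)]

    def load():
        # One transaction for the whole batch, committed on exit.
        with conn:
            conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

    def query():
        return conn.execute(
            "SELECT region, SUM(amount) FROM sales "
            "GROUP BY region ORDER BY region"
        ).fetchall()

    _, load_seconds = time_phase(load)
    result, query_seconds = time_phase(query)
    conn.close()
    return {
        "rows": row_count,
        "groups": len(result),
        "load_seconds": load_seconds,
        "query_seconds": query_seconds,
    }


if __name__ == "__main__":
    print(run_mini_poc())
```

Reporting the two timings separately, rather than as one blended “performance” figure, leaves it to the reader to decide which one matters for their workload. Backup and recovery time would need their own phases.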

The proof is in the concept
POC POC POC! People want to see numbers on their data on their hardware up close and personal. That’s the way it should be. There’s too much hype in the industry, and people tune out the bullshit. Who can blame them? These folks have been abused and lied to for decades now. As I’ve been on the other side of the fence many times, I can relate. That’s why I always keep it real simple when talking about our product. Everyone’s heard the “better, faster, simpler” spiel a million times, so why insult their intelligence? Give them a set of keys and let them test-drive the darn product! POCs should be run as described at and and There isn’t a single solution in this market that applies perfectly to every customer across the board, and any vendor claiming otherwise is either naïve or disingenuous.

Cloud Kool-Aid
“Cloud computing” madness has taken hold in the DW/BI industry as well. Vertica and Kognitio are big pushers of warehouse cloud hosting solutions. The buzzword now is DaaS (data as a service). I find it hard to get excited about this recent trend. I don’t mean to Andy Rooney the whole concept, but in enterprise warehousing and BI, given the amounts of data involved and the security, governance, availability, and SLA issues, I just don’t “grok” it. Maybe for small segments of “cold” data? I don’t know. I see the craze, I feel the buzz, I get the marketing upside, and I notice the traction but…I am not of the body. The power of clouds doesn’t compel me.

Here’s my list of DW/BI industry blogs:


  1. I recently came across your blog and have been reading along. I thought I would leave my first comment. I don’t know what to say except that I have enjoyed reading. Nice blog. I will keep visiting this blog very often.


  2. Thanks for chiming in Kelvin! :)