Something has been bugging me. It’s big data. Not the industry but the term itself. Every time I am asked about big data I need to use the term in order to be understood, but the term itself steers the uninitiated in the wrong direction. It leaves a bad taste in my mouth. It’s wrong.
It’s time to stop thinking about big data as big data, and start looking at these platforms as the next logical step in data management. What we call “big data” is really a building block approach to databases. Rather than the pre-packaged relational systems we have grown accustomed to over the last two decades, we now assemble different pieces (data management, data storage, orchestration, etc.) together in order to fit specific requirements. These platforms, in dozens of different flavors, have more than proven their worth and no longer need to escape the shadow of relational platforms. It’s time to simply think of big data as modular databases.
Big data has had something a chip on its shoulder, with proponents calling the movement ‘NoSQL’ to differentiate these platforms from relational databases. The term “big data” was used to describe this segment, but as it captures only one – and not even the most important – characteristic, the term now under-serves the entire movement. These databases may focus on speed, size, analytic capabilities, failsafe operation, or some other goal, and they allow computation on a massive scale for a very small amount of money. But just as importantly, they are fully customizable to meet different needs. And they work! This is not a fad. It is are not going away. It is not always easy to describe what these modular databases look like, as they are as variable as the applications that use them, but they have a set of common characteristics.
Hopefully this post will not trigger any “relational databases are dead” comments. Mainframe databases are still alive and thriving, and relational databases have a dominant market position that is not about to evaporate either. But when you start a new project, you are probably not looking at a relational database management system. Programmers simply need more flexibility in how they manage and use data, and relational platforms simply do not provide the flexibility to accommodate all the diverse needs out there. Big data is a database, and I bet within the next couple years when we say ‘database’ we won’t think relational – we will mean big data modular databases.
Comments