The DBA’s guide to the [new] galaxy
| Prologue
I’ve been wanting to write this article for quite a while. Consider
it a high-level summary of today’s data world, given from my
own personal perspective.
Specifically, it is an overview for all of my fellow SQL DBAs
(admins, developers, architects, warehouse/BI specialists) who see
new data platforms being built around them,
and then quietly wonder how to adjust to these changes and what
the right approach should be. (I’ll give you a quick tl;dr hint: denial is
not the right one!)
Are you ready? Let’s go!
| Part 1: A quick “12-steps” phase to acknowledge the data world is changing
Until not too long ago, a typical organization would hold
its entire back-end data stack in one or more relational databases.
To handle the data, that typical organization would hire
one or more DBAs responsible for tasks such as storage planning, administration,
development, tuning, DR and more.
A typical DBA would also specialize in a single
(usually relational) database.
While this is still relevant, reality has forced changes on the
traditional data world; here’s a brief look at what and why:
- Over the years, the volume of collected data has been growing exponentially!
- There’s an ever-growing need to collect and store more data, while still keeping historical data.
- New data is often richer in content and structure, or sometimes intentionally lacks any formal pre-defined structure.
- Storage prices are constantly dropping, especially for commodity storage, which is becoming a more popular choice than a very high-end server connected to high-end storage.
The need to meet these ever-growing requirements
at minimal cost has led to new platforms, services and
frameworks being built, whether cloud-based or on-premises.
So, the data world has changed, dramatically!
Let’s look at these changes from a different angle:
- New products, whether in new startups or in existing organizations, will not necessarily (gently put) choose a highly priced relational data store (not naming names, but you know which ones), unless the business model specifically requires one.
- There are a *lot* of new technologies, mostly open-source, that have already matured enough for many organizations to trust them as their production source of record.
- Often, regardless of pricing, a required solution does not even fit inside a relational model. As a result, unlike a decade ago, you will see fewer and fewer relational databases trying to imitate processes that were never intended to run inside a relational database. Need some examples? Key-value stores, document-based databases, unstructured data, true scale-out (shared-nothing) architectures, queues, graph data and more.
- In addition to cutting down licensing costs, scaling out the data (both storage and processing) reduces the hardware cost. A high-end server can easily cost as much as a new house!
Given the above, once an organization is assured that a certain *free technology can handle its data services better (or even just ‘good
enough’), it is very likely to choose that technology
over its costly rivals.
*This “free” has its costs,
but I’ll get to that later in the article.
Do you see this happening in your organization? If not, it’s
just a question of when, not if.
If any of the above is news to you, or if you knew
‘something was going on’ but were quietly ignoring it, you may feel a bit of
discomfort. (If you do, that is totally fine.)
First, it is important to be aware of what’s out there.
Second, it is also important to keep in mind that
relational databases are still very strong and dominant, and will stay that way
for many good reasons.
In fact, for any structured data with constraints and with relationships to other data objects, where the data needs to stay consistent, isolated, and transaction-safe, there is no better solution than the relational model.
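To make “constraints” and “transaction-safe” a bit more concrete, here is a minimal sketch using SQLite through Python’s built-in `sqlite3` module. SQLite is used only because it ships with Python; the `accounts` table, the `transfer` helper and the numbers are hypothetical, made up for illustration. The point is that a declared constraint rejects an invalid change, and the transaction undoes the half-finished work that preceded it.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical `accounts` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "  id INTEGER PRIMARY KEY,"
        "  balance INTEGER NOT NULL CHECK (balance >= 0)"  # the constraint
        ")"
    )
    conn.executemany(
        "INSERT INTO accounts (id, balance) VALUES (?, ?)",
        [(1, 100), (2, 50)],
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` atomically; return True on success."""
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            # Credit first, on purpose: if the debit below fails,
            # this credit has to be undone by the rollback.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint refused a negative balance.
        return False


def balances(conn):
    """Return {account id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


if __name__ == "__main__":
    db = make_demo_db()
    print(transfer(db, 1, 2, 30), balances(db))   # valid transfer
    print(transfer(db, 1, 2, 500), balances(db))  # overdraft: rejected, nothing changes
```

In a store without transactions and constraints, both the “no negative balance” rule and the undo logic would have to live in application code.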
Let’s have a look at some graphs, shall we?
(Note: these reflect the date of this article, so 2015-ish.)
So, breathe in, breathe out! Here are some “perspective”
graphs:
The first one, from the “DB-engines” website’s ‘popularity
and trends’ ranking, shows that the commercial relational databases are still well above everyone
else.
However, their scores are stable at best, if not
slowly decreasing, while other, newer “players” gradually increase.
The second graph comes from Google Trends:
The trends graph shows the same picture: while the “SQL Server”
search term used here is still above typical newer databases &
services, the trend is pretty clear.
We can reasonably assume that popularity = estimated usage.
“OK, so the data world has changed. Now what?”