| Part 3: Use the force, Luke!
Choosing the right architecture
--
Experience with a wide variety of data stores helps not only with choosing the right architecture, but also with being confident about vetoing bad designs.
This goes both ways, as the following (intentionally exaggerated) scenarios show -
Scenario 1:
Requirement:
A new service which keeps track of your company’s orders.
Data specs: about 100 orders per day are expected, and the information must be immediately consistent. Orders relate to products and customers.
Occasionally, some ad-hoc reports are needed, mostly to aggregate orders per customer, product or date range.
Selected solution:
The development team decided to save the orders to Redis, then use a queue to transform the data to an HDFS store,
out of which some map-reduce jobs will be used to return the relevant data.
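So yes, the dev team got their hands on some new trending technologies, and they can now claim experience with “NoSQL” and “BIG data”, but in fact this is a very inappropriate solution for the problem.
At about 100 orders per day, with relational data, a need for immediate consistency and occasional ad-hoc aggregations, a single relational database covers all of the requirements.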
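For contrast, here is a minimal sketch of how Scenario 1 fits into a single relational database. It uses Python's built-in sqlite3 module purely for illustration; the table names, column names and sample rows are assumptions rather than part of the scenario, and any relational database could play the same role.

```python
import sqlite3


def build_orders_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical orders schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the order relations
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE products  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            product_id  INTEGER NOT NULL REFERENCES products(id),
            quantity    INTEGER NOT NULL,
            ordered_on  TEXT    NOT NULL  -- ISO date, e.g. '2024-01-05'
        );
        """
    )
    return conn


def load_sample_data(conn: sqlite3.Connection) -> None:
    """Insert a few made-up rows inside one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO customers VALUES (?, ?)",
            [(1, "Acme"), (2, "Globex")],
        )
        conn.executemany(
            "INSERT INTO products VALUES (?, ?)",
            [(1, "Widget"), (2, "Gadget")],
        )
        conn.executemany(
            "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
            [
                (1, 1, 1, 3, "2024-01-05"),
                (2, 1, 2, 1, "2024-01-20"),
                (3, 2, 1, 5, "2024-02-02"),
            ],
        )


def orders_per_customer(conn: sqlite3.Connection, start: str, end: str):
    """Ad-hoc report: order count and item total per customer in a date range."""
    return conn.execute(
        """
        SELECT c.name, COUNT(*) AS order_count, SUM(o.quantity) AS total_items
        FROM orders AS o
        JOIN customers AS c ON c.id = o.customer_id
        WHERE o.ordered_on BETWEEN ? AND ?
        GROUP BY c.name
        ORDER BY c.name
        """,
        (start, end),
    ).fetchall()


if __name__ == "__main__":
    db = build_orders_db()
    load_sample_data(db)
    for row in orders_per_customer(db, "2024-01-01", "2024-12-31"):
        print(row)
```

The point of the sketch is that transactions give the required consistency and a plain JOIN with GROUP BY answers the ad-hoc reports, with no queue, HDFS store or map-reduce job involved.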
Here’s another example -
Scenario 2:
Requirement:
Track visitor interactions on the company app’s site in real time on a heat map. In addition, show geolocation trends over time.
Data specs: the app is extremely popular, generating ~2500 interactions per second.
Proposed solution:
Use a relational database on a very strong, high-end machine to be able to support the 2500/sec OLTP rate.
Create additional jobs running every few minutes, in order to aggregate the time frames and perform IP lookups.
Again, although this might work, it is not the preferred solution.
In this case a real scale-out architecture will perform better at a lower cost.
A typical ELK stack, for example, could handle this scenario more naturally.
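To make the scale concrete, the sketch below first works out the daily event volume and then builds the kind of aggregation request an Elasticsearch-based stack could answer directly. It only constructs the request body as a Python dictionary and does not call any client library. The field names `@timestamp` and `geo.location` are assumptions (the latter would need to be mapped as a geo point, with the IP lookup done at ingest time), and the `fixed_interval` parameter assumes a reasonably recent Elasticsearch version.

```python
import json

SECONDS_PER_DAY = 24 * 60 * 60


def events_per_day(rate_per_second: int) -> int:
    """Daily event volume for a steady per-second rate."""
    return rate_per_second * SECONDS_PER_DAY


def heatmap_query(window: str = "now-1h", bucket: str = "1m", precision: int = 5) -> dict:
    """Build a request body: events per time bucket, split into geohash cells.

    Field names are hypothetical and depend on how the events are indexed.
    """
    return {
        "size": 0,  # only aggregations are needed, not the raw events
        "query": {"range": {"@timestamp": {"gte": window}}},
        "aggs": {
            "per_interval": {
                "date_histogram": {"field": "@timestamp", "fixed_interval": bucket},
                "aggs": {
                    "cells": {
                        "geohash_grid": {"field": "geo.location", "precision": precision}
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    print(events_per_day(2500))  # 2500 * 86400 = 216000000
    print(json.dumps(heatmap_query(), indent=2))
```

At 2500 interactions per second the store receives 216 million events per day, compared with about 100 orders per day in Scenario 1, which is why the two scenarios call for very different platforms.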
Can you see where this is going? Knowledge is important for steering the architecture in the right direction.
Different specs require different platform implementations, and in many cases - a mix.
Inappropriate platform selection can come from many sources, including:
- Developers who want to experiment with different technologies
- PMs/POs lacking the detailed knowledge
- CTOs wishing to brag about using some new technology (preferably while playing golf with other CTOs)
Choosing the right platform for a given solution is
extremely important, even when the recommended technology is not the trending
one.
| Epilogue
The world of data is constantly changing. The variety of
solutions for storing & retrieving data in various shapes and sizes is
rapidly growing.
Relational databases are strong, and will stay strong - for storing and retrieving relational data, usually when no scale-out is required. For all the other types - the world has already changed.
If there’s one thing you should take away, it is this: looking forward, you should embrace these new technologies, get to know them and understand the architectural position of the dominant ones.