Understanding the Future: The Evolution of Databases
It is undoubtedly one of the most provocative aspects of the big data trend in the business world, and the one that is most catching my attention: the enormous difficulty of understanding it without going down to the systems that underpin it. The topic is clearly relevant: as long as we try to explain big data by "prescribing", as if they were a magic formula, the reports of analysts such as Forrester, McKinsey or Gartner, or by resorting to application cases, the average manager will never understand what really lies behind this world, let alone its possibilities.
What are we really talking about? For me, the greatest difficulty in understanding the shift that big data represents is grasping what it means to move from the database schema we all know, at one level or another, to the idea of non-relational or NoSQL databases: a world that is often defined in negative terms, by "what it is not", which adds even more conceptual difficulty.
Sounds intimidating, but wait, don't unplug yet :-) Let's try to approach the concept. SQL-based databases (Structured Query Language) are what the vast majority of users know. You can know them at very different levels: from those who operate with them, master the language itself, understand the normalization rules of a conventional database or are able to analyze its limitations, to those who simply picture them as a large electronic filing system, like the drawers and folders of a cabinet. A relational database based on SQL, typically managed with systems such as Oracle, MySQL, DB2, Informix, Microsoft SQL Server, Sybase or PostgreSQL, is something that feels, so to speak, "natural" to us: it follows the ACID rules (atomicity, consistency, isolation and durability), which allow a set of instructions to be treated as a single transaction, and it responds to a simple vision in which each piece of data is stored unequivocally and with defined relationships: the view of tables with rows and columns, in which a query always returns the same fields.
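To make that picture concrete, here is a minimal sketch of the relational model using Python's built-in sqlite3 module; the table, columns and sample rows are hypothetical and chosen only for illustration:

    import sqlite3

    # A fixed schema: every row has exactly the same columns.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")

    # The ACID idea at work: using the connection as a context manager wraps the
    # inserts in a transaction, so either both rows are committed or neither is.
    with conn:
        conn.execute("INSERT INTO customers VALUES (1, 'Ana', 'Madrid')")
        conn.execute("INSERT INTO customers VALUES (2, 'Luis', 'Lisboa')")

    # A query always returns the same fields, in the same structure.
    for row in conn.execute("SELECT id, name, city FROM customers"):
        print(row)

The point is not the syntax but the rigidity: the structure is declared once, and every subsequent operation has to conform to it.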
What happens if we extend the concept to accommodate other kinds of realities, which are more and more frequent in our everyday operations? Does all data fit neatly into these structures? Or are we simply leaving out of our analysis everything that our database operation is not able to capture? NoSQL databases ("not only SQL", which does not mean that SQL is dead or should not be used, but that for some problems there are better solutions) relax many of the limitations inherent in conventional databases and in the way we work with them: collections of documents with loosely defined fields, rather than tables with rows and columns, which allow much faster and more efficient analyses and, above all, are not limited to the conventional structure. The idea is to store data massively, which responds very well to the enormous wealth of data generated by today's world, and to analyze it without forcing it into standards that do not necessarily fit it. Where relational databases are costly and slow, the NoSQL alternative is a much more efficient and inexpensive way to manipulate data without having to adapt it to a rigid structure. Strictly speaking, a system of this type is not even a database as such, but a distributed storage system for managing data endowed with a certain structure, a structure that can also be enormously flexible.
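A minimal sketch of that document-oriented idea, again in plain Python (real systems such as MongoDB or CouchDB apply the same principle at a very different scale); the collection and field names here are hypothetical:

    # One collection, three documents, no fixed schema: each document carries
    # whichever fields make sense for it.
    customers = [
        {"id": 1, "name": "Ana", "city": "Madrid"},
        {"id": 2, "name": "Luis", "purchases": [13.50, 7.20], "newsletter": True},
        {"id": 3, "name": "Mia", "social": {"twitter": "@mia"}, "tags": ["vip"]},
    ]

    # A query simply works with whatever fields each document happens to have.
    for doc in customers:
        total = sum(doc.get("purchases", []))
        print(doc["name"], "- total purchases:", total)

Nothing has to be declared in advance: adding a new kind of information means adding a field to the documents that need it, not redesigning a schema.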
The problem? For most people, the difficulty of "thinking" in such a system. Our mental schemes are adapted to a rigid model, with clear rules and well-marked structures. Parallels with warehouses divided into shelves, cabinets and folders are something that works for us mentally. But how do you handle, with a system like that, searches over huge databases whose contents are completely heterogeneous, with relationships of all kinds between them that are not necessarily unique? In many cases we are talking about systems developed precisely by companies such as Google, Yahoo!, Facebook and the like to manage their own operations, almost always using open source, in order to obtain a structure that, at a reasonable cost and performance, allows them to handle enormous amounts of data with many, very complex relationships between them.
In a sense, understanding the subject requires "unlearning". The need to do so is evident, given how well these kinds of structures fit the problems of operating in the world we live in today. But it is not easy: for some time to come, many companies will keep torturing their relational database systems, trying to get from them answers they were never designed to give.