Is your (or your end client’s) business growing rapidly? We certainly hope so! But to gain an edge on your IT-savvy competition, you need to start making sense of the mountains of business data your enterprise generates. At this point, you might consider a NoSQL database.
But before choosing a solution, you should understand the NoSQL basics so that you do not overpay your Big Data consultant or make platform mistakes, which can be extremely costly to correct later on. Our quick 101 can save you a bunch of money and help you talk to Big Data techies with confidence!
What is NoSQL?
NoSQL systems emerged as an alternative to the tabular relations used in relational databases. As the name implies, NoSQL provides a non-relational mechanism for storing and rapidly retrieving data, so that you can run high-performance, scalable, and flexible databases in any environment: cloud, hybrid, or on-premises.
When to Use NoSQL?
Due to their dynamic nature, NoSQL databases are better suited for cases when:
- The software development lifecycle requires quick iterations and frequent code pushes.
- Structured, semi-structured, and unstructured data has to be stored in one database.
- The same database needs to serve both transactional and analytical workloads.
- You require a scale-out architecture to support increased traffic loads.
- SQL databases can no longer affordably serve the growing volumes of big data.
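To make "scale-out" a bit more tangible, here is a minimal sketch of how keys might be spread across several nodes. The `ShardedStore` class and the `shard_for` function are hypothetical names invented for this illustration, and plain Python dictionaries stand in for real servers. Production systems usually rely on more elaborate schemes such as consistent hashing, because with the simple modulo approach shown here most keys move whenever the number of shards changes.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (illustrative scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Hypothetical scale-out store: each dict stands in for one server."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(num_shards=4)
for i in range(1000):
    store.put(f"user:{i}", {"visits": i})

# Each "server" holds only part of the data set.
shard_sizes = [len(shard) for shard in store.shards]
print(shard_sizes)
```

Adding capacity then means adding shards rather than buying a bigger machine, which is the essence of horizontal scaling.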
SQL vs NoSQL Databases: Main Differences
- Scalability: SQL databases typically scale vertically (a more powerful server). NoSQL databases typically scale horizontally (more servers).
- Data type: SQL databases are better suited for multi-row transactions. NoSQL works well with structured, semi-structured, and unstructured data alike.
- Storage type: SQL databases rely on tables and rows. NoSQL databases use different storage types, such as key-value, graph, column, and document storage.
- Language: SQL databases have a predefined schema and use Structured Query Language (SQL). NoSQL databases leverage dynamic schemas.
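The predefined versus dynamic schema difference can be sketched in a few lines. This is only an illustration: SQLite (from the Python standard library) plays the role of the SQL database, and a plain list of dictionaries stands in for a schema-less NoSQL collection. The `customers` table and the `twitter` field are made-up examples.

```python
import sqlite3

# SQL side: the schema is declared up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (id, name) VALUES (?, ?)", (1, "Ada"))

try:
    # The "twitter" column was never declared, so this insert is rejected.
    conn.execute(
        "INSERT INTO customers (id, name, twitter) VALUES (?, ?, ?)",
        (2, "Bob", "@bob"),
    )
    sql_accepted_new_field = True
except sqlite3.OperationalError:
    sql_accepted_new_field = False

# NoSQL-style side: each record carries its own fields.
documents = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Bob", "twitter": "@bob"},  # new field, no migration
]
with_twitter = [doc["name"] for doc in documents if "twitter" in doc]

print(sql_accepted_new_field, with_twitter)
```

In the SQL case you would need a schema change (an `ALTER TABLE`) before storing the new field, whereas the schema-less collection simply accepts it.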
How to Choose the Optimal NoSQL Database
Let’s start with the obvious: all your business data needs to be collected and stored properly. Otherwise, it will be too difficult, or even impossible, to analyze your data. To make things even worse, without a proper system it will be very complicated to move your big data analytics elsewhere post-deployment.
Have you heard about the 3Vs of Big Data? Fret not, it’s simple:
- Volume: presumably your business generates LOTS of data — otherwise you wouldn’t be reading this.
- Velocity: is your data flowing more like a waterfall than a river? Do you need to collect it at all times and at high rates?
- Variety: probably the most decisive one. Data types can vary a lot: from those that are produced by machines (and are therefore easy to process by machine) to pieces of art that only human beings can create and really comprehend.
While you can conquer big volumes of data with the brute force of buying ever more data storage, it’s your unique combination of the latter two Vs that will determine the best platform for you.
The common types of NoSQL databases include:
- Key-value store databases
- Column-based databases
- Graph databases
- Document-oriented databases
Below we’ll briefly review these four families of Big Data systems, plus multi-model databases that combine them. See which one applies best to your business circumstances:
Key-Value Store Databases
Popular platforms: Azure Table Storage, DynamoDB, Redis
Advantages:
- Extremely fast at saving streams of data.
- Saves data as a key-value pair where the value can be of any type: BLOBs, CLOBs, encoded strings, etc. — and you can combine all of them.
Caveat: Does not support data relationships well (though this is not really a deciding factor as it is a common problem across most NoSQL engines).
Caveat #2: For later processing, you will need to transform the value into a form that supports SQL-like querying; for raw data, you need a key lookup.
Best for: video capture, encoded data, real-time logging.
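Here is a minimal sketch of the key-value idea and of the second caveat above. The `KeyValueStore` class, its method names, and the `log:` / `frame:` key prefixes are all hypothetical; an in-memory dictionary stands in for a real engine.

```python
import json


class KeyValueStore:
    """Hypothetical in-memory key-value store with opaque byte values."""

    def __init__(self) -> None:
        self._data = {}  # key (str) -> value (bytes)

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str):
        return self._data.get(key)

    def keys_with_prefix(self, prefix: str):
        return sorted(k for k in self._data if k.startswith(prefix))


store = KeyValueStore()
# Values can be anything: encoded JSON strings or raw binary.
store.put("log:0001", json.dumps({"level": "error", "msg": "disk full"}).encode("utf-8"))
store.put("log:0002", json.dumps({"level": "info", "msg": "started"}).encode("utf-8"))
store.put("frame:0001", bytes([0, 255, 16]))

# Raw data: a plain key lookup is all you need.
raw_frame = store.get("frame:0001")


def find_logs_by_level(kv: KeyValueStore, level: str):
    """Querying by a field means decoding every value first."""
    matches = []
    for key in kv.keys_with_prefix("log:"):
        record = json.loads(kv.get(key).decode("utf-8"))
        if record["level"] == level:
            matches.append(key)
    return matches


error_keys = find_logs_by_level(store, "error")
print(error_keys)
```

Writes and lookups by key are trivial, but anything resembling a `WHERE` clause requires you to decode and scan the values yourself (or to transform them into a query-friendly form later).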
Column Family Databases
Popular platforms: Hadoop/HBase, Cassandra.
Advantages:
- An extension of the raw key-value tuple model where the value is a set of columns, each defined as a name-value-timestamp tuple (triplet).
- Resembles the good old table structures of relational databases, though the similarity is not immediately apparent.
- The number of columns can vary from row to row, which produces a flexible data representation.
- Queries on column family structures are very fast because you can define the specific set of columns that you query most frequently, so not all of the information has to be read, as it typically is in a row-oriented relational DBMS.
Best for: variable data representation, data analysis (both science and business).
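The name-value-timestamp model can be sketched as follows. `ColumnFamilyTable` and the sensor data are hypothetical, and the "newest timestamp wins" rule is an assumption made for this illustration rather than the exact behavior of any particular engine.

```python
import time


class ColumnFamilyTable:
    """Hypothetical column-family table: row key -> {column: (value, timestamp)}."""

    def __init__(self) -> None:
        self._rows = {}

    def put(self, row_key, column, value, timestamp=None) -> None:
        ts = time.time() if timestamp is None else timestamp
        columns = self._rows.setdefault(row_key, {})
        current = columns.get(column)
        # Assumed conflict rule: keep the write with the newest timestamp.
        if current is None or ts >= current[1]:
            columns[column] = (value, ts)

    def read(self, row_key, columns):
        """Read only the requested columns of one row."""
        row = self._rows.get(row_key, {})
        return {name: row[name][0] for name in columns if name in row}


table = ColumnFamilyTable()
table.put("sensor-1", "temperature", 21.5, timestamp=1)
table.put("sensor-1", "humidity", 40, timestamp=1)
table.put("sensor-2", "temperature", 19.0, timestamp=1)
table.put("sensor-2", "firmware", "v2", timestamp=1)  # rows may differ in columns
table.put("sensor-1", "temperature", 22.0, timestamp=2)  # newer value wins

latest = table.read("sensor-1", ["temperature"])
partial = table.read("sensor-2", ["temperature", "humidity"])
print(latest, partial)
```

Note that the two rows do not share the same set of columns, and that a read touches only the columns it asks for.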
Document Store
Popular platforms: MongoDB, Azure DocumentDB, CouchDB.
Advantages:
- Represents data as a key-value pair, where the value is a “document”: a variable set of fields, each with a name and a value. Documents can be nested inside other documents for greater flexibility.
- Supports rich querying mechanisms and techniques for representing data relationships, which enables quick adoption for teams coming from an RDBMS.
- Data denormalization – all necessary information is in one place.
Best for: metadata storage, web-applications that read/write massive amounts of information, sales/products definition and online marketing.
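A small sketch of what "documents nested in documents" and "rich querying" can look like. Plain dictionaries stand in for a document store, and the `get_path` and `find` helpers, the dotted-path syntax, and the product data are all hypothetical.

```python
# Each value is a "document": a variable, possibly nested set of fields.
products = {
    "sku-1": {
        "name": "Laptop",
        "price": 1200,
        "specs": {"ram_gb": 16, "cpu": {"cores": 8}},
        "tags": ["office", "portable"],
    },
    "sku-2": {"name": "Phone", "price": 600, "specs": {"ram_gb": 8}},
}


def get_path(document, path):
    """Follow a dotted path such as 'specs.cpu.cores'; None if it is absent."""
    current = document
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def find(collection, path, predicate):
    """Return the keys of documents whose field at `path` satisfies `predicate`."""
    matches = []
    for key, document in collection.items():
        value = get_path(document, path)
        if value is not None and predicate(value):
            matches.append(key)
    return sorted(matches)


big_memory = find(products, "specs.ram_gb", lambda ram: ram >= 16)
has_cpu_info = find(products, "specs.cpu.cores", lambda cores: cores > 0)
print(big_memory, has_cpu_info)
```

Unlike the key-value case, the store understands the inside of the value, so you can query on nested fields even when some documents lack them.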
Graph Database
Popular platforms: FlockDB, Onyx, InfiniteGraph.
Advantages:
- If you really need relations in your NoSQL, these engines are for you.
- You can maintain simple and transparent hierarchies.
- Objects can be queried not only by attributes but also by relations, using built-in join capabilities.
Best for: building infrastructure models, social networks maintenance, business process modeling.
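Querying by relations as well as by attributes can be sketched like this. The `Graph` class, the `follows` relation, and the sample people are hypothetical, and a real graph engine would offer a far richer traversal language.

```python
from collections import defaultdict


class Graph:
    """Hypothetical in-memory graph with attributed nodes and typed edges."""

    def __init__(self) -> None:
        self.nodes = {}                # node id -> attributes
        self.edges = defaultdict(set)  # (source id, relation) -> target ids

    def add_node(self, node_id, **attributes) -> None:
        self.nodes[node_id] = attributes

    def add_edge(self, source, relation, target) -> None:
        self.edges[(source, relation)].add(target)

    def neighbors(self, node_id, relation):
        return set(self.edges.get((node_id, relation), set()))

    def two_hops(self, node_id, relation):
        """Nodes reachable in exactly two steps, excluding direct neighbors."""
        direct = self.neighbors(node_id, relation)
        reached = set()
        for neighbor in direct:
            reached |= self.neighbors(neighbor, relation)
        return reached - direct - {node_id}


graph = Graph()
graph.add_node("alice", city="Lviv")
graph.add_node("bob", city="Kyiv")
graph.add_node("carol", city="Lviv")
graph.add_node("dave", city="Kyiv")
graph.add_edge("alice", "follows", "bob")
graph.add_edge("alice", "follows", "dave")
graph.add_edge("bob", "follows", "carol")
graph.add_edge("bob", "follows", "alice")
graph.add_edge("dave", "follows", "carol")

# Query by attribute...
in_lviv = sorted(n for n, attrs in graph.nodes.items() if attrs.get("city") == "Lviv")
# ...and by relation: whom do the people Alice follows follow?
suggestions = graph.two_hops("alice", "follows")
print(in_lviv, suggestions)
```

The second query is the kind of "friends of friends" question that needs multiple self-joins in a relational database but is a natural traversal in a graph model.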
Multi-Model Database
Popular platforms: ArangoDB, AlchemyDB, CortexDB.
Advantages:
- Combines two or more approaches to Big Data to implement the polyglot persistence paradigm.
Best for: large-scale enterprises where multiple types of Big Data have to be maintained and served through one integrated solution.
To Conclude
There you have it: a brief comparison of the main features of the NoSQL database types described above.
Edvantis has worked extensively with all five groups of Big Data systems to date, and our engineers have hands-on experience in most of the technologies on the list (but we have no association with any particular vendor or platform). Want more free advice? We are always up for a chat. Contact Edvantis to schedule a call.