How do NoSQL databases work? Simply Explained!

May 30, 2021

NoSQL databases have become very popular. Large enterprises rely on them to store hundreds of petabytes of data and execute millions of queries per second. But what is a NoSQL database? How does it work and why does it scale so much better than traditional relational databases? Let's start by quickly explaining the problem with relational databases like MySQL, MariaDB, SQL Server and the like. They are designed to store relational data as efficiently as possible. You can have a table for customers, orders and products, linking them logically: customers place orders and orders contain products. This strict organization is great for managing your data, but it comes at a cost: relational databases struggle to scale.

They have to maintain these relationships, and that is an intensive process that requires a lot of memory and computing power. So, for a while, you can keep upgrading your database server, but at some point, it won't be able to handle the load. In technical terms, we say that relational databases can scale vertically, but not easily horizontally, while NoSQL databases can scale both vertically and horizontally. You can compare this to a building: scaling up (vertically) means adding more floors to an existing building, while scaling out (horizontally) means adding more buildings. You intuitively understand that vertical scaling is only possible to a certain extent, while horizontal scaling is much more powerful.

Why do NoSQL databases scale so well? Well, first of all, they do away with these expensive relationships. In NoSQL, each database item is independent. This simple change means that they are essentially key-value stores. Each database item only has two fields: a unique key and a value. For example, when you want to store product information, you can use the product barcode as the key and the product name as the value. This seems restrictive, but the value can be something like a JSON document containing more data, such as price and description. This simpler design is why NoSQL databases scale better.
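To make the key-value idea concrete, here is a minimal Python sketch. A plain dictionary stands in for the database, and the barcode and product details are made up for illustration; a real NoSQL database would of course persist and distribute the data rather than keep it in memory.

```python
import json

# A plain dict stands in for the database: one unique key, one value per item.
store = {}

def put(key, value):
    """Store a value under its key, serialized as a JSON document."""
    store[key] = json.dumps(value)

def get(key):
    """Fetch an item by its key, or return None if the key is unknown."""
    raw = store.get(key)
    return None if raw is None else json.loads(raw)

# Hypothetical barcode and product data, for illustration only.
put("5901234123457", {
    "name": "Coffee beans 1kg",
    "price": 12.99,
    "description": "Medium roast",
})

product = get("5901234123457")
print(product["name"])   # Coffee beans 1kg
print(get("0000000000000"))  # None
```

The only way in or out is the key: the database does not need to know anything about what is inside the value.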

If a single database server is not enough to store all your data or handle all queries, you can split the workload across two or more servers. Each server will then be responsible for only part of your database. To give an example: Apple runs a NoSQL database consisting of 75,000 servers. In NoSQL terms, these parts of your database are called partitions, and that raises a question: if your database is potentially divided into thousands of partitions, how do you know where an item is stored? That's where the primary key comes into play. Remember, NoSQL databases are key-value stores and the key determines which partition an item will be stored in.

Behind the scenes, NoSQL databases use a hash function to convert the primary key of each item to a number that falls within a fixed range, let's say between 0 and 100. This hash value and range are used to determine where to store an item. If your database is small enough or you don't receive many requests, you can put everything on a single server. That server will then be responsible for the entire range. If that server becomes overloaded, you can add a second server, which means the range will be split in half. Server 1 will be responsible for all items with a hash from 0 up to (but not including) 50, while Server 2 will store everything from 50 to 100.
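Here is a rough Python sketch of that hashing step. It assumes SHA-256 as the hash function and a range of 0 to 99, purely for illustration; real NoSQL databases pick their own hash functions and ranges. The names `hash_key` and `server_for` are made up for this example.

```python
import hashlib

KEYSPACE_SIZE = 100  # toy range from the explanation: hashes fall in 0..99

def hash_key(key: str) -> int:
    """Map a primary key to a number inside the fixed range."""
    # SHA-256 is an assumption made for this sketch. It is used because it
    # gives the same result on every run, unlike Python's built-in hash(),
    # which is randomized per process for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % KEYSPACE_SIZE

def server_for(key: str) -> str:
    """Two-server split: lower half of the range vs. upper half."""
    return "server-1" if hash_key(key) < KEYSPACE_SIZE // 2 else "server-2"

# Hypothetical barcodes used as primary keys.
for barcode in ["5901234123457", "4006381333931", "8712345678906"]:
    print(barcode, hash_key(barcode), server_for(barcode))
```

Because the hash only depends on the key, any machine can work out where an item lives without asking a central directory for that specific item.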

In theory, you have now doubled the capacity of your database, both in terms of storage and the number of queries you can execute. This range is also called the keyspace. It's a simple system that solves two problems: where to store new items and where to find existing ones. All you have to do is calculate the hash of an item's key and keep track of which server is responsible for which part of the keyspace. Now, in this example, the range from 0 to 100 is a little small. It would only allow you to split your database into 100 parts at most.
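The bookkeeping of "which server owns which part of the keyspace" can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the keyspace is split evenly, and the function names `build_ranges` and `locate` are invented for this example. The keyspace size is a parameter, so the toy range of 100 can be swapped for a much larger one.

```python
import bisect

def build_ranges(servers, keyspace_size):
    """Split the keyspace evenly; return each server's exclusive upper bound."""
    n = len(servers)
    return [keyspace_size * (i + 1) // n for i in range(n)]

def locate(hash_value, servers, upper_bounds):
    """Find the server whose part of the keyspace contains hash_value."""
    # bisect_right returns the index of the first bound greater than hash_value.
    index = bisect.bisect_right(upper_bounds, hash_value)
    return servers[index]

servers = ["server-1", "server-2"]
bounds = build_ranges(servers, 100)
print(bounds)                       # [50, 100]
print(locate(49, servers, bounds))  # server-1
print(locate(50, servers, bounds))  # server-2

# The same lookup works unchanged with more servers and a larger keyspace.
many = ["server-1", "server-2", "server-3", "server-4"]
big_bounds = build_ranges(many, 2**32)
print(locate(2**32 - 1, many, big_bounds))  # server-4
```

The same table answers both questions from the explanation: where a new item should be stored, and where an existing one can be found.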

Therefore, real NoSQL databases have much larger keyspaces, allowing them to scale with almost no restrictions. In addition to great scalability, NoSQL is schema-free, meaning database items do not need to have the same structure. Each one can be completely different. In a relational database, you must define your table structure and then each row must conform to it. Changing this structure is not easy and could even lead to data loss. Not having a schema can be a huge advantage if your application and data structure are constantly evolving. At this point, it is clear that NoSQL databases have certain advantages over relational databases.

But that does not mean that relational databases are obsolete, far from it. NoSQL is more limited in how you can retrieve your data and only allows you to retrieve items by their primary key. Finding orders by ID is not a problem, but finding all orders above a certain amount would be very inefficient, because the database would have to go through every item on every partition. Relational databases, on the other hand, have no problems with this. There are solutions to this problem, but only if you know in advance how you are going to access your data. And that may not always be the case. Another disadvantage is that NoSQL databases are eventually consistent.

When you write a new item to the database and try to read it immediately, it may not be returned. As I explained, NoSQL divides your database into partitions. But each partition is also mirrored on multiple servers. That way, a server can go down without much impact. When you write a new item to the database, one of these mirrors will store the new item and then copy it to the others in the background. This process may take a little time. So when you read that item, the NoSQL database might try to read it from a mirror that doesn't have it yet.
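A toy simulation in Python shows this effect. It is a simplified sketch, not how any particular database implements replication: the class `ReplicatedPartition` and its methods are invented for this example, the "background" copy is an explicit `replicate()` call, and the caller chooses which mirror to read from so the result is predictable.

```python
class ReplicatedPartition:
    """One partition mirrored on several servers (each mirror is a dict)."""

    def __init__(self, mirror_count=3):
        self.mirrors = [{} for _ in range(mirror_count)]
        self.pending = []  # writes not yet copied to the other mirrors

    def write(self, key, value):
        # One mirror stores the item right away...
        self.mirrors[0][key] = value
        # ...and the copy to the others is queued for later.
        self.pending.append((key, value))

    def replicate(self):
        """Stand-in for the background copy to the remaining mirrors."""
        for key, value in self.pending:
            for mirror in self.mirrors[1:]:
                mirror[key] = value
        self.pending.clear()

    def read(self, key, mirror_index):
        """Read from one specific mirror; returns None if it lacks the item."""
        return self.mirrors[mirror_index].get(key)

partition = ReplicatedPartition()
partition.write("order-42", {"total": 99.5})  # hypothetical order

print(partition.read("order-42", 1))  # None: this mirror has no copy yet
partition.replicate()
print(partition.read("order-42", 1))  # {'total': 99.5}
```

Between the write and the replication step, the answer depends on which mirror happens to serve the read. That window is what "eventually consistent" refers to.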

This is not a big problem in practice because the data is usually replicated in just a few milliseconds. And if you want strong consistency, most NoSQL databases have that option. So, in summary: both NoSQL and relational databases will be around for the foreseeable future, each with their own strengths and weaknesses. Now that you know how NoSQL works, let's look at some examples. Cloud providers heavily promote NoSQL because they can scale it more easily. AWS has DynamoDB, Google Cloud has Bigtable, and Azure has Cosmos DB. To give you another example of its scalability: during Amazon Prime Day in 2019, Amazon's NoSQL database peaked at 45 million requests per second.

That's amazing! But you can also run NoSQL databases yourself with software like Cassandra (originally developed at Facebook), Scylla, CouchDB, MongoDB, and more. Before we end this video, let's quickly talk about the name "NoSQL." It's a bit confusing since it can be interpreted in two ways. First, "NoSQL" can mean "not only SQL," pointing to the fact that some NoSQL databases partially understand the SQL query language in addition to their own query capabilities. And second, it is often read as "non-relational," because these databases cannot easily store relational data. That was all for this video. Subscribe if you learned something from it and I hope to see you in the next video.
