Cloud Firestore Data Modeling (Google I/O'19)

Jun 04, 2021

Hello, everyone. Hello! Oh, look at that, I got a reaction, that's great. I realized that you are the best audience I will have at this event. Also the only audience, but still the best. My name is Todd Kerpelman; thanks for coming to this introductory talk on Cloud Firestore data modeling. I'm from the Firebase team, as you may have guessed. Firebase, in case you haven't heard, is a set of tools and services to help you build more successful web and mobile apps, and we do it with everything from analytics to A/B testing, performance monitoring, and, yes, hosting your app's data in the cloud with Cloud Firestore, our massively scalable, cloud-hosted, real-time database.

In the interest of time, I'm skipping the product introduction, because I think if you're here in this session, you generally know that Cloud Firestore is amazing: it has reliability, a truly serverless application development environment, magical synchronization of your data across all your devices, great offline support, client libraries for iOS, Android, and the web, and much more. You know this. But I know that whenever I first approach any new technology that excites me, like Cloud Firestore, I'm a bit like our developer here, who is apparently working on a food delivery app.

I have this mix of excitement and fear. I'm really excited to use all these shiny new features, and I can't wait to see what I can do with them, but I'm also a little afraid of messing things up. How can I make sure that six months from now I will have made the right decisions, so that my database can perform the kinds of queries I want, I haven't done anything that wrecks performance as soon as my app starts scaling, and I haven't done something terrible that increases my database costs exponentially? How can I make sure I'm making the right decisions now, so that I don't regret them later? That is at least generally the feeling I have when approaching a new technology.


I'm guessing many of you do too. I certainly see this sentiment, this mix of excitement and fear, on Stack Overflow and on our own discussion lists. And to be fair, there's a lot to think about when it comes to how you structure your database, and yes, there's a little bit of scary stuff here. So let's see if we can shed light on some of these topics. I think I'll start with a quick review, in case you're new, of what a NoSQL database is, because for many web and mobile developers this is the first time they're trying to build a real production-scale application using a NoSQL database. Then we'll get into some details about how Cloud Firestore is different. I'll assume that most of you already know traditional SQL databases.

A typical traditional SQL database has tables, and each table represents some kind of strictly defined object, something like an author, a book, or a review. It has schemas with very strict rules about what type of data can appear in each column: the first column in the authors table probably needs to be a string representing the last name, the second needs to be an auto-incrementing integer, the third needs to be a timestamp, and so on. Later on, you might want to merge pieces of these different tables to get the data you're interested in, and you do this by writing something in a language called SQL. Writing SQL statements is nice because you can have the database do all the work of finding this different data, merging it together, and giving it to you. But it has some drawbacks. Performance, for example, can be very variable: a query could be very fast or it could be a little slow, and that depends a lot on how much data you have to go through, what kind of queries you're making, exactly how your data is structured, and so on. That's SQL in a nutshell, very simplified.

I only have 40 minutes. Now, the NoSQL world has some differences. To start, things tend to be a little looser in terms of how your data is defined, or, to use a more formal term, we like to say schema-less. That means that, by convention, many of your data records will probably look similar, but there are no hard and fast rules about it. Sure, every animal here has a name, but that's just by convention; the database doesn't actually enforce it. That's nice because it gives you some flexibility. For example, if I want to add a field, I can add a birthday field here for my dog and not worry about adding a birthday field for my koi fish.

I don't care. It's a fish; no one cares about a fish's birthday. Developers generally like this because it gives them the freedom to start adding data as needed. I don't have to worry about how I'm going to fill in this birthday field on all my other animals. It also lets you store similar but not exactly identical data: I can store a plumage field for my bird, a hair type field for my dog, and a fin count field for my fish, and not have to worry about putting those fields in other records where they might not make much sense. The flip side of all this is that you have to code defensively. You can never really guarantee what kind of data you'll get from the database, and your code should fail gracefully if things don't meet your expectations. But honestly, if you're building mobile apps, your user might be running a version of your app that's about a year old because they refuse to update.
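To make "code defensively" concrete, here is a minimal Python sketch. It uses plain dictionaries to stand in for documents rather than any Firestore client API, and the function and field names (`describe_pet`, `birthday`, `fin_count`) are hypothetical, chosen to match the pet example.

```python
from typing import Any, Mapping


def describe_pet(doc: Mapping[str, Any]) -> str:
    """Build a display string without assuming any field exists or has the expected type."""
    name = doc.get("name")
    if not isinstance(name, str) or not name:
        name = "(unnamed)"  # fall back instead of crashing on a missing or odd value

    birthday = doc.get("birthday")
    if isinstance(birthday, str) and birthday:
        return f"{name}, born {birthday}"
    return name


# Hypothetical records: the dog has a birthday, the fish does not.
dog = {"name": "Rex", "birthday": "2015-03-01", "hair_type": "short"}
fish = {"name": "Bubbles", "fin_count": 5}

print(describe_pet(dog))
print(describe_pet(fish))
```

The same pattern applies to any schema-less read: treat every field as optional and validate its type before using it.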

So this should be a general practice you're following anyway: never make assumptions about the data you're getting. I think the biggest difference with a NoSQL database is kind of a hidden message right there in the name: there's no SQL, so queries tend to be a lot simpler. Going back to our SQL example, imagine I have a books table and an authors table. If I wanted to get a list of books and the names of the people who wrote them, I would do it with some kind of JOIN statement. But in the NoSQL world you don't have access to these joins, so you can't depend strictly on a foreign key like this.

Following that reference isn't something you can do in a single database call, so a more likely scenario is to duplicate some of that data: take the author names from these little author records and also put them in the book records, so that data is grouped the way you'll want to retrieve it. This practice of duplicating data is known as denormalizing your data, which I know is a bad and scary thing for a lot of SQL developers, because ever since we were taught how to design our first database we've been told that denormalized data is really bad. You're only supposed to have your data in one location, so that it's easy to change later; you just change it in that one place. But in the NoSQL world, denormalized data is not only allowed, it's expected.
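As a rough illustration of that trade-off, the sketch below keeps a copy of the author's name on every book record. It is an in-memory model in plain Python, not Firestore code, and all names (`author_name`, `rename_author`, the sample titles) are assumptions for illustration. Listing books with authors needs no join, while a rename has to touch every copy.

```python
from typing import Dict, List, Tuple

# Hypothetical data: each book duplicates its author's name (denormalized).
authors: Dict[str, dict] = {
    "dickens": {"name": "Charles Dickens"},
    "austen": {"name": "Jane Austen"},
}
books: Dict[str, dict] = {
    "oliver_twist": {"title": "Oliver Twist", "author_id": "dickens",
                     "author_name": "Charles Dickens"},
    "great_expectations": {"title": "Great Expectations", "author_id": "dickens",
                           "author_name": "Charles Dickens"},
    "emma": {"title": "Emma", "author_id": "austen", "author_name": "Jane Austen"},
}


def list_books_with_authors(books: Dict[str, dict]) -> List[Tuple[str, str]]:
    """The common case: one pass over the book records, no join required."""
    return sorted((b["title"], b["author_name"]) for b in books.values())


def rename_author(authors: Dict[str, dict], books: Dict[str, dict],
                  author_id: str, new_name: str) -> int:
    """The rare case: update the author record and every duplicated copy."""
    authors[author_id]["name"] = new_name
    updated = 0
    for book in books.values():
        if book.get("author_id") == author_id:
            book["author_name"] = new_name
            updated += 1
    return updated  # number of book records that had to be rewritten


print(list_books_with_authors(books))
```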

Yeah, if Charles Dickens changed his name to Chuck D to stay relevant with the kids, I'd need to go back and change it everywhere in my database: not just in the author record, but in all the book records where that duplicate data lives. And yes, that's a bit annoying, but there are good reasons to do this. One reason is that reads are really easy now. If I want to get all the books along with the names of the authors who wrote them, I can do it; the data is right there, and it's very easy for me to grab it all. And think realistically about how often this data is read versus written.

Depending on how popular my book app is, this data might be read thousands or millions of times. How often does Charles Dickens change his name? Like, never, right? So the NoSQL philosophy is kind of, hey, let's optimize for the case that happens thousands or millions of times in the real world rather than the case that happens once. Another big reason NoSQL databases are set up this way, and yes, I'm oversimplifying, is that it makes it easier to scale horizontally. That means that as you add more and more data to your database, you can basically just throw more machines at it, and your data will automatically grow to span those machines, and it all works. Particularly in a managed server environment like, say, Google Cloud Platform, this makes it much easier for us to spin servers up and down as your database grows and shrinks, and we can accommodate your data without you ever knowing we're doing any of this work behind the scenes to make sure there's room to expand. By contrast, SQL databases, which often have these tightly interrelated joins, tend to scale vertically.

Scaling vertically means that as your database grows, you generally have to move it to bigger, beefier machines, and at some point you're going to run out of massive supercomputers to install it on. Also, generally speaking, any time you migrate your database to another machine, there will be downtime, and we don't like downtime. NoSQL databases come in many different types. You have big key-value stores, you have big old JSON objects like the old Realtime Database, but Cloud Firestore is about documents and collections, so I'm going to spend a little time looking at those.

Let's look at a document first. You can think of a document as a dictionary or a hash. It has a set of key-value pairs that we like to refer to as fields, and the values of these fields can be a number of different things, from strings to numbers to very small binary objects to these JSON-y looking things that we officially call maps. Documents are stored in collections, which are, as you might suspect, collections of documents. Documents can't directly contain other documents, but they can point to subcollections that contain other documents, which can then point to other subcollections, and so on.
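One way to picture this hierarchy is as a flat map from slash-separated paths to documents, where paths alternate collection and document names. The sketch below is an in-memory model in plain Python, not the Firestore client API; the paths, restaurant names, and helper functions are hypothetical. Fetching a document by path returns only its own fields, and listing a collection returns only its direct child documents.

```python
from typing import Dict, Optional

# Hypothetical store: path -> document fields.
store: Dict[str, dict] = {
    "restaurants/kierans": {"name": "Kieran's", "city": "Dallas"},
    "restaurants/rajas": {"name": "Raja's", "city": "Dallas"},
    "restaurants/kierans/menu_items/ctm": {"name": "Chicken tikka masala"},
    "restaurants/kierans/menu_items/korma": {"name": "Lamb korma"},
}


def get_document(store: Dict[str, dict], path: str) -> Optional[dict]:
    """Return the fields of one document; nothing from its subcollections."""
    return store.get(path)


def list_collection(store: Dict[str, dict], collection_path: str) -> Dict[str, dict]:
    """Return only the documents that sit directly inside the given collection."""
    prefix = collection_path + "/"
    child_depth = collection_path.count("/") + 2  # segments in a direct child's path
    return {
        path: doc
        for path, doc in store.items()
        if path.startswith(prefix) and len(path.split("/")) == child_depth
    }


print(get_document(store, "restaurants/kierans"))
print(sorted(list_collection(store, "restaurants")))
print(sorted(list_collection(store, "restaurants/kierans/menu_items")))
```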

Now, one important thing to keep in mind in Cloud Firestore is that queries are shallow, meaning you can grab this document at the top and not worry about grabbing all the data below it in all those subcollections. Developers generally like this, because it means you can structure your data hierarchically in a way that makes intuitive sense without having to worry about fetching a ton of unnecessary data if you just want that document at the top. The next thing we should probably cover is queries and how they work. Queries in Cloud Firestore are interesting because, as a rule, they are pretty fast, and as a rule they scale proportionally to the size of the result set, not the size of the underlying data set. What do I mean by that?

I mean that if I run a query that requests, say, the top 10 pizzerias in San Francisco, that query will take the same amount of time whether I have a thousand records to go through in my database or a hundred million. No matter how big the underlying data set is, that query takes the same amount of time. So how does Cloud Firestore do this? By indexing every field in every document in every collection. Thinking about our fake restaurant delivery app a little more, imagine we start with some restaurants represented as documents and put them in some kind of restaurants collection.

You'll notice that each of my restaurants has a field for name, cuisine, city, and rating. Given that, Cloud Firestore will go ahead and create an index for every name, every cuisine, every city, and every rating in that collection. Cloud Firestore maintains all these indexes automatically for me every time I add, change, or delete a document. And now I can search the documents in this collection as long as I can follow this two-step rule. Step one: find a place in the index where some condition is true. Step two: grab a bunch of adjacent documents until that condition is no longer true.
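The two-step rule can be sketched with a sorted list and a binary search. This is only a toy model of a single-field index in plain Python, not how Cloud Firestore is actually implemented; the restaurant data and function names are made up for illustration.

```python
import bisect
from typing import Any, Dict, List, Tuple

# Hypothetical restaurant documents keyed by document ID.
restaurants: Dict[str, dict] = {
    "r1": {"name": "Smoke House", "city": "Dallas", "rating": 4.7},
    "r2": {"name": "Taqueria Uno", "city": "San Francisco", "rating": 4.2},
    "r3": {"name": "Lone Star Grill", "city": "Dallas", "rating": 3.9},
    "r4": {"name": "Deep Dish Co", "city": "Chicago", "rating": 4.5},
}


def build_index(docs: Dict[str, dict], field: str) -> List[Tuple[Any, str]]:
    """A single-field index: (value, document ID) pairs kept in sorted order."""
    return sorted((doc[field], doc_id) for doc_id, doc in docs.items() if field in doc)


def query_equal(index: List[Tuple[Any, str]], value: Any) -> List[str]:
    # Step 1: jump to the first index entry where the condition becomes true.
    start = bisect.bisect_left(index, (value,))
    # Step 2: take adjacent entries until the condition stops being true.
    results = []
    for key, doc_id in index[start:]:
        if key != value:
            break
        results.append(doc_id)
    return results


def query_at_least(index: List[Tuple[Any, str]], minimum: Any) -> List[str]:
    # Step 1: find the first entry at or above the minimum.
    start = bisect.bisect_left(index, (minimum,))
    # Step 2: everything after it also satisfies the condition, so read to the end.
    return [doc_id for _, doc_id in index[start:]]


city_index = build_index(restaurants, "city")
rating_index = build_index(restaurants, "rating")
print(query_equal(city_index, "Dallas"))
print(query_at_least(rating_index, 4.5))
```

The work done is one binary search plus a walk over the matching entries, which is why the cost tracks the size of the result set rather than the size of the collection.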

So let's go to a concrete example. Let's imagine I want to find all the restaurants in Dallas. That would be easy: I would go to the index for city and, using this two-step procedure, find where the city equals Dallas and then grab all the adjacent documents until the city is no longer equal to Dallas. Similarly, to find all the restaurants with a rating of 4.5 or higher, I find that place in the index and take all the adjacent documents until, I guess in this case, I run out of documents. I should probably also note that, by the way, with map fields, those JSON-y looking things, you can query their fields the same way as any other field. If my address is set up like this, Cloud Firestore essentially sees it as having a field for address.street, address.city, and address.zip, and it will go ahead and index those map fields the same way it would any other field. There are also other features that I'm not going to get into in detail, just for the sake of time.
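The way map fields turn into dotted field paths such as address.street can be sketched with a small recursive helper. This is an illustrative model in plain Python, with a hypothetical `flatten_fields` function and made-up address values, not Firestore's internal representation.

```python
from typing import Any, Dict


def flatten_fields(doc: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Expand nested maps into dotted field paths, one entry per leaf value."""
    flat: Dict[str, Any] = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_fields(value, path))  # recurse into the map
        else:
            flat[path] = value
    return flat


# Hypothetical restaurant document with an address map.
restaurant = {
    "name": "Smoke House",
    "address": {"street": "123 Main St", "city": "Dallas", "zip": "75201"},
}

for field_path, value in sorted(flatten_fields(restaurant).items()):
    print(field_path, "=", value)
```

Each dotted path is then a candidate for its own index, which is why a field inside a map can be queried like a top-level field.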

You can query on multiple fields, so you could say, "Find me all the Mexican restaurants in San Francisco with a rating of four or higher." You can also query for documents that have arrays containing certain values: if I have a flags field that holds an array of attributes about that restaurant, I could do a search that says, "Find me all the restaurants that serve alcohol," or "that take reservations." But again, I think the most important thing to remember is that every field is indexed and every query must follow this two-step procedure. That partly explains why things work so quickly, but it also explains why some things that seem possible actually aren't.

I can't say, "Find me every restaurant in Chicago or San Francisco." That doesn't follow this two-step procedure. Similarly, I couldn't say, "Find me all the restaurants where the city is not equal to Dallas." You can't get the same performance guarantees from those types of queries as you can with this two-step process, and because we're getting you your results in real time, we want all of these queries to be fast. So with all that in mind, let's start thinking a little more about our food delivery app and how we might want to organize some of its data. We've already been talking about restaurants. I can imagine we'll have customers, who will want to place orders, and then I also want to talk about the items that each of these restaurants serves on its menu. I'm going to start with restaurants, and actually that last one too, because they go together. I think a good start is to imagine our restaurants the way we've been thinking about them so far: they'll be documents in a collection like this, with the kind of data we've been discussing. This actually seems like a good start.

Obviously, this is a little more simplified than what we would see in a real production application, but I think we get the general idea. The one thing we haven't talked about yet is what we should do with the actual items on the menu. It seems like it would be pretty easy to convert a menu to something JSON-y and turn it into a map field that we place inside the restaurant document. That seems reasonable. But you could also turn it into a subcollection: if you think about it, each of these individual menu items could easily be its own document, and you could make them a subcollection of the restaurant.

That seems reasonable too. So I have two solutions that both seem reasonable. Which one is correct here? Which direction should I go? I'm going to spend quite a bit of time on this, because it brings up a few more rules I want to get into. So, hooray for more rules! Just when you were thinking a talk about database structures couldn't be more exciting, I'm going to add more rules. The first one I want to talk about is that documents have limits. There are limits in Cloud Firestore that prevent you from having documents that are too large.

Specifically, these are the three you should worry about: one megabyte of total data in a single document, 40,000 indexed fields, and 1 QPS of sustained document writes, which means you can have small bursts of writes to a document, but on average you should only have one write per second to the same document in Cloud Firestore. These are limits we put in place to make sure your documents don't get too large. But in practice, if we take our menu and make it part of our restaurant document, does that push us into too-big territory? Let's think about it. One megabyte doesn't seem like a lot of space; we're all taking pictures of these slides, and each of those is like 4 megs or something. But remember that here we're dealing mainly with text, or numbers, or JSON-y looking things, and those don't really take up much space. All of Pride and Prejudice could fit in one megabyte, so unless we have George R. R. Martin writing the descriptions of our menu items, I think we'll probably be fine. And then he'd kill off all our favorite dishes. Forty thousand indexed fields, well, this could be a problem. Remember that each of the fields in my map will be indexed, so Cloud Firestore is creating an index for the ribs item's name, and the ribs item's price, and the ribs item's description, and, oh wow, that sounds like it could add up. But at the same time, forty thousand is a pretty big number. Even if I had a couple hundred items on my menu, I would need something like 200 fields on each of those items to really worry about this limit, so again we're probably fine. That leaves the limit of 1 QPS of sustained document writes.

I don't think the write limit is really a problem. I can't imagine us updating the price of our menu items more than once a second; if we were, we'd have other problems. Maybe if you kept a real-time inventory of how many of these dishes the kitchen had available to serve, and updated it in real time, then I would be concerned about this limit. But again, in this example I think we're probably fine. So you do have to be careful about documents that are too large, because you'll run into these limits, although in practice I'm not sure that's a problem for us right now.
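A quick back-of-the-envelope check of the size and indexed-field limits can be sketched as follows. This is only a rough estimate in plain Python: it counts leaf fields and measures the JSON-encoded size, which is not Firestore's exact accounting for either limit. The menu shapes (100 items with 4 fields, and 200 items with 200 fields) are assumptions chosen to show a typical case and the extreme case from the talk's arithmetic.

```python
import json
from typing import Any

MAX_DOCUMENT_BYTES = 1_048_576  # 1 MiB; the talk rounds this to "one megabyte"
MAX_INDEXED_FIELDS = 40_000     # figure quoted in the talk


def count_leaf_fields(value: Any) -> int:
    """Count non-map values, descending into nested maps."""
    if isinstance(value, dict):
        return sum(count_leaf_fields(v) for v in value.values())
    return 1


def rough_size_bytes(doc: dict) -> int:
    """JSON-encoded size as a crude stand-in for stored document size."""
    return len(json.dumps(doc).encode("utf-8"))


def make_menu(items: int, fields_per_item: int) -> dict:
    """Build a hypothetical menu map with placeholder values."""
    return {
        f"item{i}": {f"field{j}": "x" for j in range(fields_per_item)}
        for i in range(items)
    }


typical = {"name": "Example Grill", "menu": make_menu(100, 4)}
extreme = {"menu": make_menu(200, 200)}

for label, doc in (("typical", typical), ("extreme", extreme)):
    fields = count_leaf_fields(doc)
    print(label, "fields:", fields, "at or over field limit:", fields >= MAX_INDEXED_FIELDS)
    print(label, "within size limit:", rough_size_bytes(doc) <= MAX_DOCUMENT_BYTES)
```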

So let's look at some other rules and see if they can help us make our decision. Rule number two is that you can only retrieve full documents. You know how I told you that queries are shallow, so when you grab that top document you're not grabbing any of the subcollections below it? That's part of it. The flip side is that you can't get a partial document back: you either get the full document or you get nothing. So if we put our entire menu into one big document, it starts to become a big thing. If one of our users says, hey, give me the 30 best sushi restaurants in Boston, our database returns the information they're probably interested in at that moment, like the name, the delivery fee, the rating, and the address, along with everything that each of these restaurants has on its entire menu. That's probably more information than our user really wants at that moment. And yes, I know, a few slides ago I told you that text is not a big deal compared to things like photos, but you will still have users who are quite sensitive to the amount of data your app uses, particularly in parts of the world where data is expensive. Also, the more data your app uses, the more battery it uses, and the slower it will seem, because we have to download all that data before we can display your results. And by the way, if you had a real-time listener set up on these results, then when one of those values changes, we send you the entire document again. So you can see how this could start to be a bad experience for the user. You really don't want to send more data than your user is actually interested in at that moment.
On the other hand, suppose we break this out into subcollections and I say, hey, show me the 30 best Japanese restaurants in Boston.

Then I receive only the restaurant information I'm interested in at that moment: the name, the address, the rating, and so on, but none of the menu items. Later on, when I say, oh, this izakaya looks great, let's see what they have on their menu, only then do we fetch those other documents from the database. That's much better from a data usage point of view: we only send our users the information they're interested in at that moment. So that makes our subcollection solution the clear winner, right? Well, wait, because we have more rules.

Rule number three is that billing is based primarily on the number of documents you touch. Cloud Firestore pricing involves a few different factors, but it mostly depends on the number of documents you interact with. Specifically, you're charged between three and six cents for every hundred thousand reads you perform, and similarly for writes and deletes, so you want to think about the number of different documents you're going to be interacting with. If I ask for the 30 best sushi restaurants in Boston and each restaurant is one giant document like this, I'll be billed for 30 document reads. If I then say, okay, let's see what's on the menu of one of these, we already have that data loaded locally, and, assuming we've kept it around and haven't thrown it away, we can show the menu for that restaurant without incurring any additional reads.

That seems nice. On the other hand, with subcollections, if I load the 30 best restaurants in Boston and then say, "Okay, let's see what's on the menu at one of these," now I get billed for the initial batch of 30 reads and then a second batch of 25 reads, or whatever, to get what's on the menu. So is this bad? Well, the answer is that it depends. Specifically, in most food delivery apps you usually look first at a list of restaurants, then maybe, after you find one you're interested in, click through for the full menu, and then probably order from there. That means each of these sets of reads is a manual action, so realistically your user will do this maybe a few times per session, but probably not hundreds or thousands of times. On the other hand, if we really wanted to load the full menu every time we did a restaurant search, or we thought, hey, we'll be smart and start preloading all of our menu items every time our user performs a search, then with a single user action we're fetching not just all the restaurant documents but all the menu items in all the subcollections of all those restaurant documents, and that's going to be bad. That's the situation you probably want to avoid.

So you should stop and ask yourself what your app is actually doing and make the right decision from there. Sometimes when I give this advice, people get angry and ask me, why don't you just tell me what the right answer is? But the point is that there isn't always one correct answer for every situation. You need to understand what the trade-offs are and make the right decision based on how your application actually behaves. In our case, even though we're going to do more document reads with the subcollection approach, I'm still okay with it because, like I said, each of those sets of reads is a manually driven action. Be careful about over-optimizing for price.
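The read counts behind this trade-off are simple arithmetic, sketched below in Python. The figures are assumptions for illustration: 30 restaurants per search and 25 items per menu come from the example, and the price uses the upper end of the "three to six cents per hundred thousand reads" range quoted in the talk. Real pricing varies by region and changes over time.

```python
PRICE_PER_100K_READS_USD = 0.06  # assumed: upper end of the range quoted in the talk


def reads_embedded(restaurants: int) -> int:
    """Menus live inside restaurant documents: opening a menu costs no extra reads."""
    return restaurants


def reads_subcollections(restaurants: int, menus_opened: int, items_per_menu: int) -> int:
    """Menus are subcollections: each opened menu reads every item document in it."""
    return restaurants + menus_opened * items_per_menu


def reads_preload_everything(restaurants: int, items_per_menu: int) -> int:
    """The pattern to avoid: fetching every menu for every search result."""
    return restaurants + restaurants * items_per_menu


def read_cost_usd(reads: int) -> float:
    return reads / 100_000 * PRICE_PER_100K_READS_USD


for label, reads in (
    ("embedded menus", reads_embedded(30)),
    ("subcollections, one menu opened", reads_subcollections(30, 1, 25)),
    ("subcollections, preload all menus", reads_preload_everything(30, 25)),
):
    print(f"{label}: {reads} reads, about ${read_cost_usd(reads):.6f}")
```

Under these assumptions, the extra reads from opening one menu are small next to the cost of preloading every menu on every search.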

I've seen some really strange solutions where people are too focused on pricing and end up creating a bad user experience or a lot more work for themselves. If you really want a right answer, a rule of thumb, I would say that in general you should have one collection per view controller (or table view), activity, or page. In our application, we have a restaurant search page, which is a view controller driven by the restaurants collection. Later, when the user says, okay, show me the details of what's on the menu, that screen is driven by another subcollection. So if you really want a correct answer: one collection per view controller or activity. But wait, because we're not done yet. There is one more rule to talk about, and that's that queries search indexed fields in a single collection. We talked about queries earlier. If I said, "Find me the top 30 restaurants in Dallas," I could do it with either of these setups, whether our restaurants are larger documents or smaller documents with subcollections. Either way, this query works. But what do I do if I say, hey, I'm in the mood for chicken tikka masala?

Can I find the restaurants that serve it with either of these setups? Well, let's start by taking a look at larger documents. As you'll remember, each field in a document, even the ones inside a map, is indexed. So when you look at our menu here, stored as a giant JSON-y looking object, you can see that I'm going to have an index for menu.lambkorma.name, and likewise one for menu.ctm.name. So you could search for restaurants that serve chicken tikka masala by saying, okay, let's look for restaurants where this menu.ctm.name field exists.

Honestly, I don't really care about the value; as long as that field exists, they probably serve chicken tikka masala. The problem here is that I'm essentially relying on every restaurant's menu using the same key for that dish. The fact that I don't actually care what the dish's name value is, is kind of a red flag. If Raja's restaurant used a different key in the JSON object for its chicken tikka masala, then that dish is not going to show up in my search.

Basically, I'm relying on this strange hidden secret knowledge that each dish has to have the same key name in my JSON object, and that's a bit strange and bug-prone, so I'm not a big fan of this solution. This is where subcollections seem like a simpler and more natural fit. If I look at each of these documents that represents an item on my menu, I can see that each one has its own set of values, with traditional indexes on each of them, and searching by name now seems much more natural.

I might say, okay, find me the items in this collection where the name is equal to chicken tikka masala. The problem is that this only searches one collection, right? I can do it for Kieran's restaurant, but I can't do it across all collections. Oh, big gasp! Okay, so this is where collection group queries can help. This is a feature we've been talking about for a while, and I'm happy to say that everyone can play with it as of this week. Basically, the way it works is that you go to the Firebase console and tell Firestore about queries you might want to run across multiple collections.

In this case, you can see I'm basically saying that, across all collections called menu items, I want Firestore to find this field called name and index it at the collection group scope. What that means is: index the name field in all the menu items collections, anywhere they exist in my database, as if they were just one giant collection. And that's what Cloud Firestore is going to do. It looks at every collection with that same name and indexes the name field as if it's one giant collection, which means I have a name index that spans all of these documents in all of these different subcollections. That means I can query the collection group to find all the restaurants that serve chicken tikka masala, even though the menu items are divided into different subcollections. So let's go back to our original dilemma of whether we want larger documents or subcollections. Given the advantages we get from subcollections, specifically that we don't have to worry about hitting those theoretical document limits, we're much more respectful of our users' data, and I can now search for menu items by name with a collection group query, subcollections are going to be my winner.
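The idea of treating every same-named subcollection as one giant collection can be sketched with the path-keyed model below. This is an in-memory illustration in plain Python, not the Firestore client API, and it scans linearly where Firestore would use a collection-group-scoped index. The collection ID `menu_items`, the paths, and the dish data are hypothetical.

```python
from typing import Dict, List

# Hypothetical store: path -> document fields.
store: Dict[str, dict] = {
    "restaurants/kierans": {"name": "Kieran's"},
    "restaurants/kierans/menu_items/a1": {"name": "Chicken tikka masala"},
    "restaurants/rajas": {"name": "Raja's"},
    "restaurants/rajas/menu_items/x7": {"name": "Chicken tikka masala"},
    "restaurants/rajas/menu_items/x8": {"name": "Lamb korma"},
    "restaurants/sushi_place": {"name": "Sushi Place"},
    "restaurants/sushi_place/menu_items/z1": {"name": "Tuna roll"},
}


def collection_group(store: Dict[str, dict], collection_id: str) -> Dict[str, dict]:
    """All documents whose immediate parent collection has the given ID, anywhere in the tree."""
    group = {}
    for path, doc in store.items():
        segments = path.split("/")
        if len(segments) >= 2 and segments[-2] == collection_id:
            group[path] = doc
    return group


def restaurants_serving(store: Dict[str, dict], dish_name: str) -> List[str]:
    """Query the group by item name, then map each hit back to its parent restaurant path."""
    parents = set()
    for path, doc in collection_group(store, "menu_items").items():
        if doc.get("name") == dish_name:
            parents.add("/".join(path.split("/")[:-2]))
    return sorted(parents)


print(restaurants_serving(store, "Chicken tikka masala"))
```

Because the match is on the item's name value rather than on a map key, it does not matter what document ID each restaurant used for the dish.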

Yes, larger documents would still give me fewer reads, but like I said before, be careful about over-optimizing for price. You want to make sure you're not doing anything catastrophically bad, but if you're trying to squeeze every penny out of your database usage, that effort might be better spent on other things. All right, let's switch gears a little and think about how we might want to store our users, our customers who place orders. This seems like a pretty obvious candidate for putting everything in a top-level collection.

That seems pretty simple, right? We can store things like their name, their delivery address, their profile picture, and maybe some of their favorite foods. This all seems very reasonable. I like this. And then one day our product manager comes up with a fantastic idea: "Hey, you know what we should do? We should make this social. Let's have our users find friends in their local area who like the same type of food, and then they can meet up and order takeout together." And now we've become a food delivery slash dating app, right?

Sounds good to me. And this is a query we could build pretty easily with our setup: we could say, hey, find everyone in San Francisco whose favorites array contains Korean food, and just like that we've found people in our city who like Korean food. So what's the problem here? Clearly I'm leading us into some kind of trouble; otherwise this wouldn't be a very interesting presentation. It's this rule here: remember that you can only retrieve full documents, not partial documents. So when some random user gets a list of people in their city who like Korean food, we're also sending down everything else in those documents, like their name and also where they live. Oh, shoot. Well, that's bad. Now we know where all these people live and how to get into their houses. Sure, you're probably smart enough not to display this data in the client, but that doesn't matter: the fact that this data is being sent at all means a sufficiently motivated hacker could get at it, and suddenly you've leaked all of your users' addresses and how to get in their front doors. So there are definitely other options we should consider here. One option is to put addresses in a subcollection and keep that information there. That's also good because now we can add multiple addresses per user, and we can make sure that only our delivery people have access to them when they need it.

If we had payment information or anything else we wanted to keep private, we could store that in a private-info subcollection. On the other hand, if we really think, hey, we could be storing a lot of information about our users in these documents, we could also turn this around and have a public profile subcollection for each of our users. I like this because basically we can say that everything in our user document is private by default until we explicitly take a copy of that data and put it in the public profile, and that prevents a lot more accidental data leaks. I know I haven't spent much time talking about security rules yet, but when it comes to preventing unauthorized access, both this approach and the one on the previous slide are good, because in general it's easier to have different levels of access in different collections. So being able to say, "Hey, in this users collection, users can only read the user document that belongs to them, but these public profile subcollections are open for any logged-in user to read," that type of configuration is usually easy to do using security rules. Having different security configurations for different collections, that's how security rules work.

So let's move on to the last data object we're going to address, and that's the orders. This is kind of interesting because it combines data from a lot of different places. We're going to have some elements unique to the order itself, like the time the order was placed, the delivery fee, and probably a few more things. Then we will have information about our user: their name, where to send the food. We will have information about the restaurant itself, like where to place the order and where our courier should go to pick up the dishes. And then, yes, we'll have the menu items that our user ordered: what it was, how much it cost, whether they ordered it extra spicy or held the mayo. And again, if we were looking at this through more of a SQL lens, we'd probably think about something like this: we could have a little bit of order-specific information and then foreign keys to represent all these other bits of information, and then we would do some kind of join before we send this information to a restaurant to process the order. But we don't live in a SQL world. We are a NoSQL database that has super fast reads and horizontal scalability and all that cool stuff, but there are no fancy joins, so this is probably not the default way we want to think about storing this order.

Instead, we will simply create a document with the data we need at that moment. When our user places an order, we will create a document that stores the order-specific data, a copy of the relevant user information that we need from our user document, and a copy of the relevant restaurant information that we need, and then we will also add the food that they're ordering. This is actually a case where I would recommend adding all the items directly to the order in a large array like this, or a map, instead of putting them in a subcollection. If you think about it, anyone who's reviewing these orders, whether it's a user looking at previous orders or a restaurant looking at open orders that they need to process, is probably going to want to see this menu information along with the order. So again, stop and think about what your app is actually doing and make the call from there. And yes, I know it's still a little strange to see this data duplicated in our records, and it may still baffle you a little.
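As a rough sketch of that "copy what you need into the order" idea, the plain-Python function below builds one self-contained order dictionary. Nothing here talks to Firestore; the function name `build_order`, every field name, and all sample values are hypothetical and chosen only to mirror the description above. The items are stored as an array on the order itself rather than in a subcollection.

```python
def build_order(user, restaurant, items, placed_at, delivery_fee_cents):
    """Assemble a self-contained order document from copies of other documents."""
    return {
        # Order-specific data.
        "placed_at": placed_at,
        "delivery_fee_cents": delivery_fee_cents,
        # Copied from the user document.
        "user_id": user["id"],
        "user_name": user["name"],
        "deliver_to": user["address"],
        # Copied from the restaurant document.
        "restaurant_id": restaurant["id"],
        "restaurant_name": restaurant["name"],
        "restaurant_address": restaurant["address"],
        # Menu items copied into an array on the order itself.
        "items": [dict(item) for item in items],
    }


# Illustrative data only.
user = {"id": "diana", "name": "Diana", "address": "123 Example St"}
restaurant = {"id": "tofu_hut", "name": "Tofu Hut", "address": "456 Sample Ave"}
bibimbap = {"name": "Bibimbap", "price_cents": 1200, "notes": "extra spicy"}

order = build_order(user, restaurant, [bibimbap], "2019-05-08T12:30:00Z", 300)

# Because the order holds its own copy, editing the menu item afterwards
# does not change what is stored on the order.
bibimbap["price_cents"] = 1500
```

Anyone reading this order gets everything in a single document read, which is the trade being made for the duplication.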

The engineers had a really nice analogy, which is: instead of thinking about this as denormalized data, think of the data in your database as your real-time API for your application. If you were to make an API for your application, like a get-order call, you would basically be generating a JSON object that would look an awful lot like this. So the whole idea, perhaps with many NoSQL databases and with Cloud Firestore, is: well, maybe just look at that data as your real-time API to retrieve orders, and that's really the data that you're going to store in your database. So if that helps you understand NoSQL databases a little better, great. If, on the other hand, that confused you more, then forget I said anything.

So where this orders collection actually goes in our database, it depends, and honestly, I don't care too much. You can make this a subcollection of a restaurant, or make it a subcollection of our users, or even make it another separate top-level collection. Any of these places are okay by me. Now that collection group queries are available, you could basically make a query for any of these orders whether you know the restaurant, the courier, or the user, and they would all work well. So I'd say take your pick.
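To illustrate why the placement matters less once collection group queries exist, here is a small in-memory model in plain Python. It is only a conceptual sketch, not the Firestore API: documents are keyed by slash-separated paths, and `collection_group` and `orders_where` are hypothetical helpers that pick out every document whose immediate parent collection has a given name, wherever it sits in the hierarchy. All IDs and field names are made up.

```python
def collection_group(docs_by_path, collection_id):
    """Return (path, doc) pairs for documents directly inside any collection
    named collection_id, regardless of what that collection is nested under."""
    results = []
    for path, doc in docs_by_path.items():
        segments = path.split("/")
        # Document paths alternate collection/document, so they have an even
        # number of segments and the parent collection is second to last.
        if len(segments) % 2 == 0 and segments[-2] == collection_id:
            results.append((path, doc))
    return results


def orders_where(docs_by_path, field, value):
    """Filter every 'orders' document, wherever it lives, by one field."""
    return [
        doc for _, doc in collection_group(docs_by_path, "orders")
        if doc.get(field) == value
    ]


# Illustrative data: orders stored as subcollections of restaurants.
db = {
    "restaurants/tofu_hut": {"name": "Tofu Hut"},
    "restaurants/tofu_hut/orders/o1": {
        "user_id": "diana", "restaurant_id": "tofu_hut", "courier_id": "c1"},
    "restaurants/noodle_bar/orders/o2": {
        "user_id": "diana", "restaurant_id": "noodle_bar", "courier_id": "c2"},
    "restaurants/tofu_hut/orders/o3": {
        "user_id": "becca", "restaurant_id": "tofu_hut", "courier_id": "c1"},
}

dianas_orders = orders_where(db, "user_id", "diana")
courier_c1_orders = orders_where(db, "courier_id", "c1")
```

Even though the orders sit under restaurants in this sketch, they can still be found by user or by courier, which is the point being made about taking your pick.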

Go with whichever of these architectures first popped into your mind intuitively, because it's probably the right one. I actually don't want to spend a lot of time on this decision because I want to go back to the duplicate data, because we do need to ask ourselves: what do we do if one of these values changes later? How can we make sure it is updated in our order? Well, in some cases, maybe the answer is that we don't want to do anything, right? Imagine that a few days after Diana places her order, Troy decides to increase the price of his bibimbap.

Well, in this case it is probably accurate and correct not to change that value in her order, since we want her order to reflect the price of the item at the time she ordered it. So this is actually a situation where I think having this denormalized data works in our favor. But there are plenty of cases where an update might make sense. Imagine we've done a UX research study and we realized that when restaurants change their name, we want to make sure that the name change is reflected in the user's order, because it makes it easier for the user to remember it or something. So when Tofu Hut changes its name to Tofu Booth, well, maybe we want to change that in our order too. So how do we do it?

One option is to simply have the client make that change everywhere. We probably have some kind of client application set up for our restaurant owners, and we can say, okay, when the restaurant owner decides to change the name and changes that in the restaurant document, we'll also go ahead and do a search for all the orders that belong to this restaurant and make the change there. That can work, but it's a little strange to have our clients make this big transaction that changes all of these orders.

To begin with, that is a lot of work to ask of our client. And depending on the situation, it could also open up some kind of weird security rules setup. Now we have to make sure that a restaurant can go ahead and change the restaurant name field on the order document, but we probably don't want them to modify the price of orders, or add more food to other orders, or do anything nefarious like that, so this can get a little weird.

So I think in practice this is something we can do better with a Cloud Function. If you don't know what Cloud Functions are, they are a way of executing server-side code by having functions that run in response to actions that happen in your application, like, for example, someone changing a value in a restaurant document. And because they run on the server side, they are generally not subject to the same security rules that you would have to have for your clients, since they are executing in an environment that you trust, as opposed to someone's phone somewhere. That means you can generally lock down your security rules a lot more, because Cloud Functions can bypass those security rules. That's why I like them: simple and locked down.
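The awkward client-side restriction mentioned above, where a restaurant may change the restaurant name on an order but nothing else, boils down to checking which fields differ between the old and new document. The plain-Python sketch below models only that check; it is not security rules syntax, and the names `changed_keys`, `restaurant_may_update`, and `restaurant_name` are assumptions for illustration.

```python
_MISSING = object()  # Sentinel so a missing key differs from a key set to None.


def changed_keys(before, after):
    """Return the set of top-level keys that were added, removed, or modified."""
    return {
        key for key in set(before) | set(after)
        if before.get(key, _MISSING) != after.get(key, _MISSING)
    }


def restaurant_may_update(before, after, allowed=frozenset({"restaurant_name"})):
    """Allow the write only if every changed key is in the allowed set."""
    return changed_keys(before, after) <= allowed


# Illustrative data only.
order_before = {
    "restaurant_name": "Tofu Hut",
    "items": [{"name": "Bibimbap", "price_cents": 1200}],
}

# Renaming alone is fine.
rename_only = dict(order_before, restaurant_name="Tofu Booth")

# Renaming while also rewriting the items (and their prices) is not.
rename_and_tamper = dict(
    order_before,
    restaurant_name="Tofu Booth",
    items=[{"name": "Bibimbap", "price_cents": 9900}],
)

rename_ok = restaurant_may_update(order_before, rename_only)
tamper_ok = restaurant_may_update(order_before, rename_and_tamper)
```

Having to express and maintain this kind of field-level check for client writes is part of what makes the server-side approach attractive.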

When it comes to security, that just means fewer things to worry about. So we can create a Cloud Function that triggers when a document in our restaurants collection has changed, and now our restaurant client only has one job, and that is to go ahead and change its name in the restaurant document. Then we can rely on the Cloud Function to notice that change and make the corresponding change across all of those orders. And this could really be used anywhere we have denormalized, duplicate data that we want to keep in sync. Remember how we had our users and their public profiles? Well, if Becca changes her name to Rebecca and we decide we always want to automatically update that value on her public profile too, that's something we could rely on a Cloud Function to do for us. So hey, wow, that's everything we were worried about at the beginning.
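The fan-out that such a function would perform can be sketched in plain Python. This is only a stand-in for the body of the function: the trigger wiring and the actual Firestore reads and writes are left out, and `on_restaurant_updated` along with all field names and sample data are hypothetical. It copies a changed restaurant name onto that restaurant's orders and leaves everything else, including item prices, untouched.

```python
def on_restaurant_updated(restaurant_id, before, after, orders):
    """Propagate a restaurant rename to its orders; return how many were updated.

    `before` and `after` are the restaurant document's old and new contents.
    `orders` is a list of order dicts, modified in place.
    """
    if before.get("name") == after.get("name"):
        return 0  # Some other field changed; orders do not need touching.

    updated = 0
    for order in orders:
        if order.get("restaurant_id") == restaurant_id:
            order["restaurant_name"] = after["name"]
            updated += 1
    return updated


# Illustrative data only.
orders = [
    {"restaurant_id": "tofu_hut", "restaurant_name": "Tofu Hut",
     "items": [{"name": "Bibimbap", "price_cents": 1200}]},
    {"restaurant_id": "noodle_bar", "restaurant_name": "Noodle Bar",
     "items": [{"name": "Ramen", "price_cents": 1100}]},
    {"restaurant_id": "tofu_hut", "restaurant_name": "Tofu Hut",
     "items": [{"name": "Bibimbap", "price_cents": 1200}]},
]

count = on_restaurant_updated(
    "tofu_hut",
    before={"name": "Tofu Hut"},
    after={"name": "Tofu Booth"},
    orders=orders,
)
```

The same shape would apply to keeping a public profile in sync with a user document: compare before and after, then copy only the fields that are meant to be mirrored.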

I guess it's not so scary after all, which is good. I know that was a lot of information to give you, but if you want to learn even more, because it turns out there is a lot more to cover, I have a series on the Firebase YouTube channel called Get to Know Cloud Firestore. Yes, take pictures of that, that's the most important thing. That's where I cover all of this in even more excruciating detail, but I also have cute cartoon characters, because when you think about lectures that involve databases, you think cartoon characters, right? They just go together. Don't forget to rate the session if you liked it. If you didn't like it, my name was Ray Domeier and I was talking about Android Studio. And with that I'll say, if you have any more questions, I'll be at the Firebase dome for another hour.

I hope everyone learned something. Thank you very much, and go out and have a great rest of the conference.
