Modeling Excel Tables, the Right Way!

Mar 19, 2024
Welcome to a new video from the expert group, today on the topic of modeling Excel tables the right way. In business practice, but also in research and teaching, Microsoft Excel remains a very popular tool for modeling and analyzing tables. With very large amounts of data or very complex questions, it can quickly happen that the models used and the structure chosen become confusing and very slow to use. That was the motivation for making a video on this topic. I would like to start today's video with a short quote as food for thought: the goal when structuring and modeling data is always to gain insights that you may never have had before. How this can be implemented is what I will show in the next few minutes using a practical example. I downloaded a sample data set for this video from the Kaggle website. It is based on the AdventureWorks database, which Microsoft provides for testing purposes, and there are corresponding CSV files that you can download and then load into Excel. I will also link the source below the video.
I have deliberately chosen a rather simple and clear example, but the method presented can of course be applied to any number of different situations. The AdventureWorks data I use here is made up of several tables: a table of calendar days with month and year, a table of product categories, a table of product subcategories, a base table listing all products with their prices, costs and names, and last but not least a table showing the sales, or in this case the quantities, recorded for a given calendar day.
I have deliberately not applied any formatting here, because in practice I see things like this over and over again. You encounter data graveyards: an extract of data is simply copied from various systems, and this data often comes with artifacts. I also have a separate video on the channel on the topic of data formatting, where I explain two examples. Usually, though, you have a data graveyard consisting of several tables, and there is some target value that you want to report or simply calculate. That is the situation you are facing. The approach I take comes from questions I encounter every day in my own work.
So how can I proceed in a structured way to reach the desired solution? I have shown the tables briefly. The first step I always take is to determine the data structure. Here I have drawn it very simply in Microsoft Excel using a shape and an arrow, and in fact I almost always work this way when I get a data set that I have never seen before and whose structure I may not know. The principle is quite simple: I look at the tables I have and try to put them into a schematic, and I will show that now.
We have a table for calendar months or days, which I will simply call calendar dates. Then we have a table for product categories and one for product subcategories, and a table for the products themselves, as we have seen, which I simply call products. The last table covers the quantities per product, which I simply call returns. This is just one possible way to lay it out. Personally, I find it very easy to work with shapes because I can insert them quickly and adapt them however I want. The real point is to define the structure correctly.
To do that, I have to figure out how these tables relate to each other. In IT there is always the concept of the primary key and the secondary key. I will make a separate video that briefly covers the database topic, but these two terms can be explained quite easily: they describe how the tables relate to each other. You can see it very easily using the calendar date. Here we have a table of date values, and the returns table also has a column containing the respective calendar date.
The difference between the two tables is that each calendar date appears only once in the calendar table. There is no other column involved; in principle there is exactly one entry per calendar day, with no duplicates. In the returns table, by contrast, a calendar day can appear any number of times, because several products can of course be sold on one day. So it is important to understand how often a given value occurs in one table and whether the same value exists in another table, where it can then be used as the connecting element.
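The check described here, how often a value occurs in each table and whether it exists on the other side, can be sketched outside Excel as well. The following is only an illustration in Python: the date values and the helper name `key_side` are made up for the example and are not taken from the AdventureWorks files.

```python
from collections import Counter

def key_side(values):
    """Return 'one' if every key value occurs exactly once, otherwise 'many'."""
    counts = Counter(values)
    return "one" if all(n == 1 for n in counts.values()) else "many"

# Hypothetical sample values standing in for the two date columns
calendar_dates = ["2017-01-01", "2017-01-02", "2017-01-03"]
return_dates = ["2017-01-01", "2017-01-01", "2017-01-03"]

calendar_side = key_side(calendar_dates)  # each day listed once
returns_side = key_side(return_dates)     # a day may repeat

# Dates used in the returns column that have no partner in the calendar table
orphans = sorted(set(return_dates) - set(calendar_dates))

print(calendar_side, returns_side, orphans)
```

A column that comes out as the "one" side is a candidate primary key; the matching column on the "many" side plays the role of the secondary key.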
When I now model the data, this works like the classic lookup in Excel. You always have a lookup value that you search for, a target column where you write the result, and a lookup array from which the data is pulled. Put simply, we have these five tables, and I start with the returns table because that is the table containing the information we actually want to use. In this case it is about quantities, because from quantities you can, for example, extrapolate what costs or what sales you have.
The returns table has the corresponding connection points to the other tables: we have the products, then the product categories and the product subcategories. Now let's see how they interlock. We have the product key as the connecting key, and the products table also contains the product key, which means there is a connection between returns and products. I like to draw this in and connect the arrows accordingly. We see that this is a one-to-n relationship, not a one-to-one relationship: in the returns table a product can appear multiple times, while in the products table each product is kept only once. That means we can clearly assign which master data belongs to each product and attach it to the returns table. Then there are the tables for product categories and product subcategories, which in turn depend on the products. I first drew the arrow the wrong way round, my mistake: the product subcategory attaches to the products table, so the arrow belongs exactly the other way. So we have the product categories here and the product subcategories here. Last but not least we have the calendar dates: the returns table has a return date column, and that is linked to the calendar dates, again a one-to-n connection. With that we have defined the data structure. This is a first step for you to understand how the data model is set up as a whole.
This can of course be expanded as desired. In technical terms such representations are often called star schemas, and they can be very branched with many different tables, but the basic principle is always the same. That was the first step.
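To make the structure a bit more tangible, here is a minimal sketch of the same schema in Python, with the returns table in the middle and the master data reached by following the keys. All keys, names and quantities are invented sample values, and the column names (`product_key`, `subcategory_key`, `category_key`) are assumptions for the illustration.

```python
# Master data tables, each addressed by its primary key (hypothetical rows)
tbl_product_categories = {1: "Category A", 2: "Category B"}
tbl_product_subcategories = {
    10: {"name": "Subcategory A1", "category_key": 1},
    20: {"name": "Subcategory B1", "category_key": 2},
}
tbl_products = {
    100: {"name": "Product X", "subcategory_key": 10},
    200: {"name": "Product Y", "subcategory_key": 20},
}

# The returns table: a product key may appear many times here
tbl_returns = [
    {"return_date": "2017-01-01", "product_key": 100, "quantity": 1},
    {"return_date": "2017-01-01", "product_key": 200, "quantity": 2},
    {"return_date": "2017-01-02", "product_key": 100, "quantity": 1},
]

def enrich(row):
    """Follow the keys from a returns row to product, subcategory and category."""
    product = tbl_products[row["product_key"]]
    subcategory = tbl_product_subcategories[product["subcategory_key"]]
    category = tbl_product_categories[subcategory["category_key"]]
    return {
        **row,
        "product": product["name"],
        "subcategory": subcategory["name"],
        "category": category,
    }

enriched = [enrich(row) for row in tbl_returns]
for row in enriched:
    print(row)
```

Each lookup only works because the key is unique on the master data side, which is exactly what the arrows in the diagram express.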
Once you have defined the data structure, the next step is to think about the concrete modeling. In my experience, modelers and data scientists often handle this differently. Personally, I am a strong proponent of the table object in Microsoft Excel. You know the usual approach from VLOOKUP: you set up a VLOOKUP with the respective lookup value and then point it at the raw data, in this case the return dates, and link something there. Here you would actually have to do it the other way round, because the primary keys are those of the calendar table.
This is very often done with an absolute range: a range is defined once and the data is then pulled together somehow. That is possible, no doubt. The problem is what happens when a row is added or removed, and how maintainable the whole thing is. Table objects are much, much better here. For example, I take the dates table, copy it into the modeling sheet and put a table on it. You can see that the table has a name. I usually prefix it with tbl, for example tbl calendar dates. tbl stands for table; in IT an abbreviation is often chosen for the respective object type.
For a table that is tbl, and for a range, for example, rng. I have also set up a custom table format here, which can be adapted as you like, but you can always use the standard Microsoft templates too; they are not that bad. We now do the same for every table, since we have several different sets of data here. For the product categories we also create a table, which I call tbl product categories, and set the color logic again. The same applies to the product subcategories. Then we have the products, same game as above, another table that I call tbl products. Last but not least the quantities: we insert a table and call it tbl returns. I have done this a little quickly; you can replay the video to follow it again.
In principle, the data is now no longer a fixed cell range. Instead, each data set is an individual table object that can be addressed by name. If I have a formula that defines how values should be calculated, I can link it to these tables. This has the great advantage that, first, I have dynamic objects that are variable in size, and second, I have a very clear structure throughout my workbook, because I can quickly get an overview.
For example, under Formulas in the Name Manager I can see which tables I have in my Excel workbook. If the data were just sitting somewhere in the workbook, possibly not even visible at first glance but still relevant, I would first have to search for it, and with 20 worksheets that can take a while if I don't know the file. So this has a certain charm. One drawback of table objects: if I link from another workbook to such an object, the link uses the reference name, and as far as I know this only works when both workbooks are open.
That is, I can't update the other workbook without having the base file open, which is a disadvantage, because unfortunately Microsoft Excel is not that smart here. So it is probably a good idea, and in many cases may be necessary, to take the other route and set an absolute reference, i.e. reference the range absolutely across the cells. Then I also have a link. That has its own downside, since it is rigid: it works only if I structure the tables so that I know the cell range always fits. So the table object is difficult across workbooks, but within a workbook it is a really great way to structure data for an analysis.
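The difference between an absolute range and a table object can be imitated in a few lines of Python. This is only an analogy, not how Excel works internally: the list `tbl_returns` stands in for the table object, the constant `FIXED_ROWS` for an absolute range that was defined once, and the numbers are invented.

```python
# A "table object" modelled as a named list of rows (hypothetical data)
tbl_returns = [
    {"product_key": 100, "quantity": 2},
    {"product_key": 200, "quantity": 1},
    {"product_key": 100, "quantity": 3},
]

# Analogous to an absolute range that covers exactly three data rows
FIXED_ROWS = 3

def total_fixed(table):
    """Sum only the rows inside the range that was defined once."""
    return sum(row["quantity"] for row in table[:FIXED_ROWS])

def total_table(table):
    """Sum whatever the table currently contains."""
    return sum(row["quantity"] for row in table)

before = (total_fixed(tbl_returns), total_table(tbl_returns))

# A new row arrives: the fixed range does not see it, the table does
tbl_returns.append({"product_key": 200, "quantity": 4})
after = (total_fixed(tbl_returns), total_table(tbl_returns))

print(before, after)
```

Both totals agree until the data grows; afterwards only the calculation that refers to the table as a whole stays correct without anyone adjusting a range.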
Finally, I will briefly show an analysis option, i.e. how we get to a finished key figure or fact. First, on formatting the data: as I said, you can discuss that for a long time. I will simply change the date format so that we see the date rather than the date value. This AdventureWorks data set is generally very interesting because some of its columns are interesting in terms of data type. For example, the products have sizes, and these sizes apparently exist both as letters, i.e. text values, and as numeric values, which is often a problem
if you can then only choose text and not a number as the data type. Maybe I will explain in another video what can happen with data types and what errors can occur, especially when you join tables and set up relationships and the data types do not match. Now I'll go into the analysis and show you two examples of how you can deal with such a constellation of data and analyze it quickly. Something I like to use is data validation. Some of you may know it from filtering, or from wanting to allow only certain values in a field. You find it under Data, Data Validation.
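Why mismatched data types break a join is easy to demonstrate. The sketch below is in Python and uses invented keys; it assumes that one table stores the key as a number while the other table sometimes stores it as text, possibly with stray spaces. Normalizing both sides to trimmed text is just one possible fix.

```python
def normalize(value):
    """Bring a key to a common form: text without surrounding spaces."""
    return str(value).strip()

# Master data with numeric keys (hypothetical rows)
tbl_products = {214: "Product A", 215: "Product B"}

# Transaction rows where the key arrives as text, number, or padded text
tbl_returns = [
    {"product_key": "214", "quantity": 1},
    {"product_key": 215, "quantity": 2},
    {"product_key": " 215 ", "quantity": 1},
]

# Naive lookup: text "214" is not the same value as the number 214
naive = [tbl_products.get(row["product_key"]) for row in tbl_returns]

# Lookup after normalizing the keys on both sides
products_by_text_key = {normalize(k): v for k, v in tbl_products.items()}
clean = [products_by_text_key.get(normalize(row["product_key"])) for row in tbl_returns]

print(naive)
print(clean)
```

Text is used as the common form here because a column such as the product sizes can contain letters as well as numbers, so converting everything to a number would not always be possible.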
On the one hand you can allow any value, but you can also choose a list, and that is very popular when I want to offer certain values for selection. It can be any category, a fiscal year for example. Most people then point directly to the respective source. Let me do an example with the product categories: I drag the range there, and the first problem is that this is a fixed cell range. If a category is added later, it is not included.
Of course you could choose a very large cell range, but then you have the disadvantage of many empty rows, because Excel does not recognize that these are empty fields. I think there are now some efforts to ignore empty fields, but that can also be a problem if you deliberately keep an empty entry that should be selectable. With a table object you can take a very elegant approach instead. In the Name Manager you can define not only names for cells but also names for ranges. So I say, for example, that I want my product categories, or rather the subcategories, and I give the range a name. Then I go to my modeling sheet and select the subcategory column. What is nice, as you can see when I mark it: the name is now bound to the table object. It is no longer a cell range but a variable reference with respect to the rows, covering exactly this column. The lovely thing is that I can now theoretically move the table somewhere else.
You could, for example, move the entire table to a different worksheet and the reference would still be valid, because it points to the table, which has a unique name starting with tbl. You would have no problem if you made such a change. With an absolute reference you would have exactly this problem, because it is tied to the respective worksheet and the respective range. You can easily see that in terms of maintainability and efficiency, picking up new data this way is much, much faster and much, much better.
The disadvantage, again, appears when I want to use this range from another workbook: it is the same problem as with table objects, it only works if both workbooks are open. Write me in the comments if there are other options; as far as I know there is no way to do it without opening the workbook. Inside the workbook, however, this is a great way to link data. If you now go into data validation and want a dropdown menu, there is a trick: click into the source field, press the F3 key, and you can select this range name. And lo and behold, there is now a dropdown list with all the product categories, or in this case the subcategories, and I can select any of them. If I now add a new entry to my data, say number 40, and call it XYZ so that we recognize it,
then Excel picks that up too: the new category is in the list and everything is completely dynamic. It does not matter whether I insert 100 rows or just one, it always works the same. That is one example of an analysis once I have defined a clean data structure. A second example at the end, as a little hint: I can also create a pivot table, which is always welcome for analysis. I point it at one of my table objects, so I have the whole structure of that table in it. A big advantage is that it is again dynamically linked to the object.
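The dynamic dropdown can be pictured as a list of allowed values that is rebuilt from the table column every time it is needed. The following Python sketch only mirrors that idea; the subcategory names are invented, and the new row with key 40 and name XYZ follows the example from the video.

```python
# Stand-in for the subcategories table (hypothetical rows)
tbl_product_subcategories = [
    {"key": 1, "name": "Subcategory A"},
    {"key": 2, "name": "Subcategory B"},
]

def allowed_values(table, column):
    """Rebuild the dropdown list from the current contents of one column."""
    values = []
    for row in table:
        if row[column] not in values:
            values.append(row[column])
    return values

def is_valid(entry):
    """Data validation: accept only entries that occur in the name column."""
    return entry in allowed_values(tbl_product_subcategories, "name")

valid_before = is_valid("XYZ")

# Add the new subcategory to the table; nothing else has to be adjusted
tbl_product_subcategories.append({"key": 40, "name": "XYZ"})
valid_after = is_valid("XYZ")

print(valid_before, valid_after)
print(allowed_values(tbl_product_subcategories, "name"))
```

Because the list is derived from the table rather than from a fixed range, it makes no difference whether one row or a hundred rows are added.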
This means that when the rows change, i.e. when the table grows, the data is there immediately and I don't need to adjust anything. If I use a fixed cell range and one more row is added, the pivot table may no longer show me the correct result. That is the beauty of a table like this. There are many more aspects, especially when you move into programming; there will be some videos on that on the channel in the future for the IT enthusiasts among you. But this is a simple application that can save a lot of time. If you need the data regularly and want to analyze it, but the number of data rows changes, it is a really good solution, and you don't have to define many special cases to cover every constellation.
The table has one disadvantage: unfortunately Microsoft is not that smart here, or at least it is currently not possible to write a formula in the header. In many use cases that is not necessary, but there are cases where it is a problem. If, for example, you build a dashboard and want a variable header in the base table because you need the dashboard in two or three languages, that is unfortunately not possible in the table object; the header always has to be a fixed value. That is the big disadvantage. Sometimes you can help yourself a little with references, but the pivot table has a problem in this situation, because if a name changes,
Microsoft removes this column from the evaluation. I am not very deep into the development here, but I suspect the name is hardcoded in the background, and then I have to rebuild and restructure the pivot table. So the pivot table is not optimal either, but for quick analysis and in terms of variability it is of course great. As I said, the pivot table is just one element I can use. Then there is another lovely piece of functionality in Microsoft Excel: you can define the conditions here accordingly.
Let me see where exactly it is. Not conditions but relationships, I misspoke. You can define relationships here, and that is very interesting when I want to map a data model where I no longer have a single table as the source but, as in our example, multiple tables. I define the corresponding tables here. For example I take tbl products with the product key, and then again tbl products with the product key. No, I did something stupid and chose the wrong one, sorry. Of course we want the returns table, because that is the table we started from, with the product key as the connecting element, and tbl products on the other side.
Microsoft's wording is similar to what I said before about the primary and the secondary key, which in IT is also called the foreign key, and that is exactly what expresses the relationship. Now let's see what it does: it stores everything accordingly. If I now insert a pivot table, I can use the workbook's data model in addition to a single table, and we see all the tables in this list. The two tables that already have a relationship are highlighted. Let's test what happens: I select the product key from the returns table for the rows and add the return quantity. Because I also need the product name from the master data in tbl products, I simply drag the product name in as well, and you see that the assignment is done automatically. We get results immediately, and I no longer need a VLOOKUP, because the mapping happens automatically.
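What the relationship plus the pivot table achieve together can be sketched as a lookup followed by a grouped sum. The Python below is only a rough equivalent of that idea, not a description of Excel's data model; table contents and the function name `pivot_quantity_by_product` are assumptions for the example.

```python
from collections import defaultdict

# Master data: product key -> product name (hypothetical rows)
tbl_products = {100: "Product A", 200: "Product B"}

# Returns rows carry only the product key, not the name
tbl_returns = [
    {"product_key": 100, "quantity": 2},
    {"product_key": 200, "quantity": 1},
    {"product_key": 100, "quantity": 3},
]

def pivot_quantity_by_product(returns, products):
    """Resolve each key via the relationship, then sum quantities per name."""
    totals = defaultdict(int)
    for row in returns:
        name = products[row["product_key"]]  # the relationship replaces the manual lookup
        totals[name] += row["quantity"]
    return dict(totals)

pivot = pivot_quantity_by_product(tbl_returns, tbl_products)
print(pivot)
```

The product name never has to be copied into the returns table; it is fetched through the key at the moment the totals are built.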
That is a really nice approach once I know what my data structure looks like. Of course, I also need clean data. The model does not work well if, for example, products appear multiple times in tbl products; then this mapping would not work and you might need an additional criterion. By default Microsoft only lets me select one column for the connection, which means I would have to build a key myself to be able to use the table. And if there are cases where the match depends on a condition, this solution may not work at all. But for use cases like this one it is a great way to combine data, and I can get the master data at the push of a button. You can take this even further.
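Building a key yourself when a single column is not unique could look like the following sketch in Python. The scenario is invented for illustration: it assumes a products table in which the same product key occurs once per size, so that only the combination of key and size identifies a row. The separator `|` and the helper `duplicate_keys` are arbitrary choices.

```python
from collections import Counter

# Hypothetical products table where product_key alone is not unique
tbl_products = [
    {"product_key": 100, "size": "M", "name": "Product A (M)"},
    {"product_key": 100, "size": "L", "name": "Product A (L)"},
    {"product_key": 200, "size": "M", "name": "Product B (M)"},
]

def duplicate_keys(rows, key_func):
    """Return all key values that occur more than once."""
    counts = Counter(key_func(row) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)

# Single column: key 100 appears twice, so it cannot serve as the "one" side
single_column_duplicates = duplicate_keys(tbl_products, lambda r: r["product_key"])

# Self-built key from two columns, comparable to a helper column in the table
def composite_key(row):
    return f'{row["product_key"]}|{row["size"]}'

composite_duplicates = duplicate_keys(tbl_products, composite_key)

print(single_column_duplicates, composite_duplicates)
```

The same helper key would have to be built in the returns table as well, so that both sides of the relationship use identical values.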
Many of you will already know that there are now great features in Excel for loading data, for example Power Query. Maybe I'll make a video about that too. With it you can integrate external tables and external CSV files, for example. What gets loaded there are also table objects, and I can even run operations on them. So I don't have to go down the path I chose here; I could instead point to an existing folder or an existing CSV. Going through the relationships and modeling them is of course a little complex, and I have to know something about my data, but then I get great options for analyzing it.
From there I can really build any evaluation I want, whether I add the subcategories, the product categories or the products themselves, or look at certain months. I am incredibly flexible, and that is the charm of it. That's basically it. I hope the video was fun and maybe gave you some ideas that will help you in your daily life, whether in teaching or at work. Let me know if you have any questions or would like to comment on certain areas. I would love for you to subscribe; the think tank is of course somewhat dependent on there being corresponding demand, and it would be great if more videos could be created in the future.
If you're interested, I'll briefly summarize what was part of today's video. Based on the AdventureWorks database from Microsoft with some selected tables, I briefly showed the structure of these tables. The illustration was kept simple and clear using shapes. Of course there are other options, but I find this very easy and quick. Then I showed the modeling using table objects and several topics around it. There is the possibility of creating fields with data validation that I can use as dropdowns. I can also expand it to a full data model with relationships. Then there is the topic of pivot tables, and also some other aspects, such as Excel's Power tools and the ability to generate reports.
Maybe I'll make another video about it, but thanks for watching and until next time, ciao.
