
CSV Files in Python || Python Tutorial || Learn Python Programming

Jun 03, 2021
C... S... V... This stands for “comma-separated values” and is a popular format for storing data. Most of the time people use databases for large amounts of data and spreadsheets for small amounts. But CSVs still have their place. They are simple and convenient. No special drivers or APIs are needed to use them. And Python makes them even simpler with the csv module. Get ready to see values that have been separated by commas... A CSV is a text file that contains data. Often the first row of the file is a header that lets you know what the values represent. The remaining lines contain the data.
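As a rough illustration of that layout, here is a tiny CSV held in a Python string. The column names and values are made up for the example and are not taken from the real stock file.

```python
# Made-up sample text (illustrative values only, not real stock data).
sample = (
    "Date,Open,High,Low\n"
    "8/19/2014,10.5,11.0,10.1\n"
    "8/18/2014,10.7,,10.2\n"  # two commas in a row: the High value is missing
)

lines = sample.splitlines()
header = lines[0].split(",")                       # first row names the columns
records = [line.split(",") for line in lines[1:]]  # remaining rows hold the data

print(header)
print(records[0])
print(records[1])  # the missing value shows up as an empty string
```

Every value in `header` and `records` is a plain string, including the numbers.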
Think of each row as a record in a database. In each row, the data is separated by commas. Note: because this is a text file, there are no data types. While you can mentally interpret the data as strings, dates, and numbers, everything is stored as a string. When reading a CSV, it is your responsibility to convert the data to the appropriate type. Also, whenever you see two commas in a row, it simply means that a value is missing. Today we will learn how to read and write CSV files. We will show you how to use the Python csv module by reading a file containing the first 10 years of Google stock data. Next, we will analyze the data. We will then write the results of our analysis to a new CSV. By the way, in the description below there is a link to a spreadsheet containing this data.
You can export this spreadsheet as a CSV so you can practice the techniques you'll learn today. Let's get started... Here is the path to the Google stock data CSV file on my system. You can use the "open" function to display the contents of this file. There is a lot of data there. However, for it to be useful, we need to store the data. We will show you two ways to do this: with and without the csv module. Let's first read the data without using the csv module. We can quickly read the contents of the file using a list comprehension. Let's look at the first two lines to see what we have. Note that each line is treated as a single string. Additionally, a newline character is included at the end. We can use the “strip” method to remove any leading or trailing whitespace. And we can use the “split” method to divide the string into smaller pieces.
Simply pass the string or character to split on. This is a much better way to read each line. So, let's read the data once again, but this time we will strip the unnecessary whitespace and split each string into separate values. If you now look at the first two lines, you can see that each row is now a list of values. Unfortunately, all of the values are strings. There is also another problem. What if your CSV contained data about books or movies? Many titles contain commas, so splitting on commas would break those titles apart.
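The manual approach described above could be sketched roughly as follows. The file is a small temporary stand-in with made-up values, since the real path and data are not shown here, and the book-title line is a hypothetical example of the comma problem.

```python
import os
import tempfile

# Made-up stand-in for the stock file (illustrative values only).
sample = "Date,Open,Close\n8/19/2014,10.5,10.9\n8/18/2014,10.7,10.4\n"
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w") as f:
    f.write(sample)

# First attempt: each line is one string with a trailing newline.
with open(path) as f:
    lines = [line for line in f]
print(lines[:2])

# Second attempt: strip the whitespace, then split on commas.
with open(path) as f:
    rows = [line.strip().split(",") for line in f]
print(rows[:2])

# The weakness: a quoted title that contains a comma is cut in two.
title_line = '"The Lion, the Witch and the Wardrobe",1950'
broken = title_line.split(",")
print(broken)  # three pieces instead of the intended two fields
```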
Handling issues like this is one of the many reasons why you should use the csv module. First, import the csv module. When using a module for the first time, it is always a good idea to call "dir" on it to see what functions and classes the module contains. Run… You can see that the module contains some names related to Excel. But today we will focus on the “reader” and “writer” functions. Let's now read the Google stock data once again, but this time we will use the csv module. First, open the file.
Note that we specify a "newline" keyword argument and pass it an empty string. This is because, depending on your system, lines may end with a newline, a carriage return, or both. This technique ensures that the csv module works correctly on all platforms. Now we will create a "reader" object to parse the CSV data from the file. Since the first line is a header and does not contain any data, use the "next" function to extract it. And now we can read the data. To confirm this works, let's print both the header and the first row of data. While the code may not be shorter than our previous approach, it is much more robust. The csv module will parse the data and handle most of the edge cases that may arise. But there is still a problem.
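A minimal sketch of those steps, again using a temporary file with made-up values in place of the real stock data. The last lines show, on a hypothetical book-title row, that the reader keeps a quoted field with a comma in one piece.

```python
import csv
import os
import tempfile

# Made-up stand-in for the stock file (illustrative values only).
sample = "Date,Open,Close\n8/19/2014,10.5,10.9\n8/18/2014,10.7,10.4\n"
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w", newline="") as f:
    f.write(sample)

# newline="" lets the csv module handle line endings itself.
with open(path, newline="") as f:
    reader = csv.reader(f)
    header = next(reader)            # pull off the header row
    data = [row for row in reader]   # the remaining rows are the data

print(header)
print(data[0])

# csv.reader accepts any iterable of lines, so a list works for a quick check.
quoted = next(csv.reader(['"The Lion, the Witch and the Wardrobe",1950']))
print(quoted)  # the title stays in one field
```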
The data is still treated as strings. We need to improve our code by converting the data to the appropriate types. This CSV contains 7 values per row: date, open, high, low, close, volume and adjusted close. The date column is a date; open, high, low, close and adjusted close are all floats. And volume is an integer. Let's reload the data once again, but this time convert each string to the appropriate data type. To parse the date, we need to import the datetime class from the datetime module. Personally I would have chosen a different name for the class, but I'm just a bachelor. Let's create a list called "data" that will contain all the successfully parsed rows. Next, loop over the rows. To improve readability, let's include a comment that describes what data is expected in each row. To parse the date, we will call a method called "strptime" (S-T-R-P-time).
This is short for "string parse time". The first argument is the string and the second argument is the expected format. Working with dates and times in Python is a tricky business, so we'll cover this in more detail in a separate video. Next, convert the open... high... low... and close... to floats. We use "open_price" as the variable name since "open" is the name of a Python built-in function. These numbers represent the stock's price when the market opened, its highest and lowest prices during the day, and its price when the market closed. Volume is the number of shares traded that day. And the adjusted close is an alternative closing price that takes into account that prices can jump because of stock splits and dividend payments. Finally, append a list containing these values to our "data" list. Now if you print the first row, you can see that all the data is of the appropriate type. The values are no longer strings, but dates, floats and integers.

Now let's calculate the daily stock returns and write them to a separate CSV. In finance, the return on a stock is simply the percentage change in price. A daily return is the percentage change from one day to the next. People also look at weekly, monthly, quarterly and annual returns. Today we will only consider daily returns. We will store the data we calculate in a file called “google_returns.csv”. First, open this file in write mode. Let's create a csv "writer" object to store our calculated results. To write a row, call the “writerow” method with a list of values.
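The type-conversion loop described earlier might look something like this. The rows are made-up values read from an in-memory string rather than the real file, and the date format "%m/%d/%Y" is an assumption about how the dates are written.

```python
import csv
import io
from datetime import datetime

# Made-up rows in the seven-column layout (illustrative values only).
sample = (
    "Date,Open,High,Low,Close,Volume,Adj Close\n"
    "8/19/2014,585.0,587.3,584.4,586.9,978600,586.9\n"
    "8/18/2014,576.1,584.5,576.0,582.2,1280600,582.2\n"
)

reader = csv.reader(io.StringIO(sample, newline=""))
header = next(reader)

data = []
for row in reader:
    # row = [Date, Open, High, Low, Close, Volume, Adj Close]
    date = datetime.strptime(row[0], "%m/%d/%Y")  # assumed date format
    open_price = float(row[1])  # "open" would shadow the built-in function
    high = float(row[2])
    low = float(row[3])
    close = float(row[4])
    volume = int(row[5])
    adj_close = float(row[6])
    data.append([date, open_price, high, low, close, volume, adj_close])

print(data[0])
```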
Let's write the header first. Fortunately, our data is already sorted by date, so we can just loop through it. To calculate the daily price change as a percentage, we need the adjusted price for two consecutive days. So instead of looping through the list of data, which would give us only one row at a time, let's create a "for" loop with an index. Note that we stop at the second-to-last row. This is because, for the earliest day, which is the last row, there is no price from the previous day to compare with. Then use the index "i" to get today's data. The date is the first item in the “today's data” list. The adjusted price is the last item in the list, which you can get with the index -1. Because the dates are in descending order, we can get yesterday's data using the index “i+1”. Similarly, we can get yesterday's price from that row using the index -1. Now we can calculate the daily return. The only data that we will write to our “Google Returns” CSV are the date and the daily return. Run... If you look at the "Google Returns" file, you will see that the data is there. However, the date does not have a user-friendly format. Let's fix this. We will call the “strftime” (S-T-R-F-time) method on today's date and pass the desired format. The name of this method is short for "string format time". We will use the same format used in the original CSV file. Now change the call to "writerow" to use the formatted date and run... If you look at the "Google Returns" CSV file, you will now see a friendlier set of data.
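Putting the writing steps together, a sketch under a few assumptions: the rows are made up and shortened to just a date and an adjusted close, the header names "Date" and "Return" are hypothetical, the output goes to a temporary directory, and the return is stored as a fraction (0.25 meaning 25%).

```python
import csv
import os
import tempfile
from datetime import datetime

# Made-up rows, newest first; each is [date, adjusted close] (simplified).
data = [
    [datetime(2014, 8, 21), 110.0],
    [datetime(2014, 8, 20), 100.0],
    [datetime(2014, 8, 19), 80.0],
]

path = os.path.join(tempfile.mkdtemp(), "google_returns.csv")
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Date", "Return"])  # hypothetical header names

    # Stop one row early: the earliest day has no previous day to compare with.
    for i in range(len(data) - 1):
        todays_row = data[i]
        todays_date = todays_row[0]
        todays_price = todays_row[-1]
        yesterdays_row = data[i + 1]  # dates descend, so i + 1 is the prior day
        yesterdays_price = yesterdays_row[-1]

        daily_return = (todays_price - yesterdays_price) / yesterdays_price
        formatted_date = todays_date.strftime("%m/%d/%Y")  # assumed format
        writer.writerow([formatted_date, daily_return])

# Read the file back to check what was written.
with open(path, newline="") as f:
    written = list(csv.reader(f))
print(written)
```

Note that the values read back are strings again, so any further analysis would need the same type conversion as before.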
With Python, data and its commas are soon separated. The CSV module gives you everything you need to read and write data responsibly. And don't forget: with big data comes great responsibility. So analyze wisely, my friends. Now go and subscribe to Socratica, because you never know when there's something you should know, but don't know, but you could know if you'd only subscribed to Socratica...
