Difference between revisions of "Python-3.4.3/C2/Parsing-data/English"
Line 122: | Line 122:
+ We can have any number of '''whitespaces''' between '''to''' and '''Python tutorials'''.
+ But all the '''spaces''' are treated as one space.
Line 135: | Line 137:
+ ''As we can see, we get a '''list''' of '''strings.'''''
Line 564: | Line 566:
Summary slide
# Remove '''whitespaces''' using the '''strip()''' function.
# Convert '''datatypes''' of numbers from one type to another.
# '''Parse''' input '''data''' and perform computations on it.
Line 574: | Line 577:
Evaluation
Line 581: | Line 583:
# How do you split the string "Guido;Rossum;Python" to get the words?
Line 588: | Line 589:
Evaluation
+ 2. What does int inside parentheses inside double quotes 20.0 produce?
Line 600: | Line 601:
And the answers,
+ # line.split inside parentheses inside single quotes semicolon
+ # int inside parentheses inside double quotes 20.0 will give an error, because a string that represents a decimal number cannot be converted directly into an integer.
Revision as of 13:07, 4 May 2018
|
|
Show Slide
|
Welcome to the spoken tutorial on Parsing-data. |
Show Slide
Objectives
|
In this tutorial, we will learn to-
|
Show slide
System Specifications |
To record this tutorial, I am using
|
Show Slide
Prerequisite slide |
To practice this tutorial, you should know how to use lists.
|
Show Slide
Parsing Data
|
First, let us understand what is meant by parsing data.
|
Show Slide
split() function
|
Next, we will learn about the split() function.
|
Show Slide
split() function |
The split function parses a string and returns a list of tokens.
|
Press Ctrl+Alt+T keys | Let us first open the terminal by pressing Ctrl+Alt+T keys simultaneously. |
Type ipython3 | Type, ipython3 and press Enter. |
%pylab and press Enter. | Let us initialize the pylab package.
|
str1 = "Welcome to Python tutorials"
|
From here onwards, please remember to press the Enter key after typing every command on the terminal.
Let us define a variable str1 of string data type.
We can have any number of whitespaces between to and Python tutorials.
But all the spaces are treated as one space. |
str1.split()
|
Now, we are going to split this string on whitespace.
|
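A minimal sketch of splitting on whitespace. The exact spacing in str1 is not recoverable from this script, so the three spaces between to and Python below are an assumption made only to show that a run of spaces is treated as one separator.

```python
# Assumption: three spaces between "to" and "Python" to illustrate the point.
str1 = "Welcome to   Python tutorials"

# With no argument, split() breaks the string on any run of whitespace.
words = str1.split()

print(words)       # ['Welcome', 'to', 'Python', 'tutorials']
print(len(words))  # 4
```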
Type
x = "08-26-2009;08-27-2009;08-29-2009"
|
Let us take another example of the split() function, this time with an argument.
|
Type x.split(';') | Type, x dot split inside parentheses inside single quotes semicolon. |
Point to the output | We get a list of strings separated by comma. |
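The same step can be tried as a plain Python sketch. The second split, on the hyphen, is not part of the tutorial; it is added here as an assumed follow-up to show that each token can itself be parsed further.

```python
x = "08-26-2009;08-27-2009;08-29-2009"

# Split on the semicolon delimiter to get one string per date.
dates = x.split(';')
print(dates)  # ['08-26-2009', '08-27-2009', '08-29-2009']

# Assumed extra step: split the first date on '-' to get its parts.
month, day, year = dates[0].split('-')
print(month, day, year)  # 08 26 2009
```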
Show Slide
|
Pause the video.
|
Switch to the terminal | Switch to the terminal for the solution. |
Type, b = x.split() | Type, b is equal to x dot split open and close parentheses.
|
Type, c = x.split(' ') | Type, c is equal to x dot split inside parentheses and inside single quotes space. |
Type, b | Type, b |
Type, c | Type, c |
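Highlight the output | We can see that, for this string, splitting without an argument gives the same result as giving a space as the argument, because the string contains no spaces. |
Show slide: | Splitting a string without an argument splits it on whitespace, treating any number of consecutive spaces as a single separator.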
|
Type str1 | Let us recall the variable str1. |
Type b= str1.split() | Now, we will split this string without argument.
|
Type c=str1.split(' ') | Type, c is equal to str1 dot split inside parentheses and inside single quotes space. |
Type b | Type, b |
Type c | Type, c |
Highlight the output | As you can see, here b is not equal to c: c contains empty strings as entries wherever there are consecutive spaces, whereas b has only the words.
|
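A short sketch comparing the two calls. It assumes str1 contains three spaces between to and Python, since the original spacing is not preserved in this script.

```python
# Assumption: three consecutive spaces between "to" and "Python".
str1 = "Welcome to   Python tutorials"

b = str1.split()     # any run of whitespace is one separator
c = str1.split(' ')  # every single space is a separator

print(b)       # ['Welcome', 'to', 'Python', 'tutorials']
print(c)       # ['Welcome', 'to', '', '', 'Python', 'tutorials']
print(b == c)  # False

# For a string without spaces, both calls return the whole string in a list.
x = "08-26-2009;08-27-2009;08-29-2009"
print(x.split() == x.split(' '))  # True
```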
Show Slide
strip() function |
Next, we will learn about the strip() method.
|
Type unstripped = " Hello world " | Let us define a string by typing
unstripped is equal to inside double quotes space Hello world space |
Type unstripped.strip() | Now to remove the whitespace,
|
Highlight output | We can see that strip removes all the whitespaces at the beginning and at the end of the string.
After splitting and stripping we get a list of strings with leading and trailing spaces stripped off. <<PAUSE>> |
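A minimal sketch of stripping, followed by splitting and stripping together. The string raw and the name items are hypothetical and are used only for illustration.

```python
unstripped = " Hello world "

# strip() removes leading and trailing whitespace only;
# the space between the two words is kept.
stripped = unstripped.strip()
print(repr(stripped))  # 'Hello world'

# Hypothetical input: fields separated by ';' with stray spaces around them.
raw = " apple ; banana;  cherry "

# Split on the delimiter, then strip each piece.
items = [part.strip() for part in raw.split(';')]
print(items)  # ['apple', 'banana', 'cherry']
```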
Type mark_str = "1.25" | Now we shall look at converting strings into floats and integers.
Type, mark underscore str is equal to inside double quotes 1.25
|
Type mark = float(mark_str)
|
Type, mark is equal to float inside parentheses mark underscore str
|
Type type(mark_str) | Type, type inside parentheses mark underscore str
|
Type type(mark) | Type, type inside parentheses mark
This shows mark is a float datatype. |
Highlight the output | We can see that the string is converted to a float.
|
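The conversion above can be checked outside IPython with a short sketch. Note that a plain Python script prints the type as a class name, which looks slightly different from the IPython output.

```python
mark_str = "1.25"

# float() parses the text of the string into a floating point number.
mark = float(mark_str)

print(type(mark_str))  # <class 'str'>
print(type(mark))      # <class 'float'>
print(mark + 1)        # 2.25
```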
Show Slide
Exercise 2
|
Pause the video. Try this exercise and then resume the video.
|
Switch to terminal | Switch to the terminal for the solution. |
Type int("1.25")
|
Type, int inside parentheses inside double quotes 1.25
|
Type dcml_str = "1.25" | Let us see the correct solution for this.
|
Type flt = float(dcml_str) | Type, flt is equal to float inside parentheses dcml underscore str.
|
Type flt | Type, flt |
Type number = int(flt) | Type, number is equal to int inside parentheses flt
|
Type number
|
Type, number
We got the output as an integer. This is how we should convert strings into floats and integers. <<PAUSE>> |
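A sketch of the exercise as a script. The try/except block is an addition so that the failed conversion does not stop the program; in the terminal session the error is simply displayed.

```python
dcml_str = "1.25"

# int() cannot parse a string that contains a decimal point.
try:
    int(dcml_str)
except ValueError as err:
    print("int() failed:", err)

# Convert to float first, then to int; int() drops the fractional part.
flt = float(dcml_str)
number = int(flt)

print(flt)     # 1.25
print(number)  # 1
```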
Open the file text editor.
|
Next, we will use a data file to parse the data.
|
Show text: student_record.txt is available in the Code files link.
|
A file student underscore record.txt is available in the Code files link of this tutorial.
|
Scroll down and show the records
|
We will first read the file line by line and parse each record in this file.
It contains records of students and their marks in the State Secondary Board Examination.
|
Highlight A;015163;JOSEPH RAJ S;083;042;47;00;72;244
|
Each line in the file is a set of fields separated by semicolons.
|
Open text editor | Open a new text editor. |
Copy paste the code from text editor | Type the code as shown. |
Highlight
for line in open("student_record.txt"): fields = line.split(";") |
Let me explain this program.
|
Highlight
math_mark = float(math_mark_str)
|
The math marks are then converted to float. |
Highlight the code for this narration.
|
Then it is appended and stored as a list in a variable math underscore marks underscore A for region code A. |
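The steps narrated above can be sketched as follows. This is not the exact marks.py from the Code files link. The sketch reads from an in-memory list instead of student_record.txt so that it runs on its own, only the first record comes from the tutorial, the other two are made up, and the position of the math mark (index 5) is an assumption.

```python
# First record is from the tutorial; the other two are hypothetical.
records = [
    "A;015163;JOSEPH RAJ S;083;042;47;00;72;244\n",
    "B;000001;HYPOTHETICAL STUDENT;070;055;61;40;58;284\n",
    "A;000002;ANOTHER STUDENT;064;071;89;35;66;325\n",
]

MATH_FIELD = 5  # assumption: the math mark is the sixth field

math_marks_A = []
for line in records:  # with the real file: for line in open("student_record.txt")
    fields = line.split(";")

    # Strip each field in case it carries spaces or a trailing newline.
    region_code = fields[0].strip()
    math_mark_str = fields[MATH_FIELD].strip()

    # Convert the mark from string to float before storing it.
    math_mark = float(math_mark_str)

    if region_code == "A":
        math_marks_A.append(math_mark)

print(math_marks_A)  # [47.0, 89.0]
```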
Save python file as marks.py | Save the file as marks.py in the home directory. |
Switch to terminal | Switch to the terminal. |
Type, %run marks.py | Execute the file with percentage sign run space marks.py. |
Switch to editor
|
Switch back to the editor.
|
Add in the marks.py file
math_marks_mean = sum(math_marks_A) / len(math_marks_A)
Highlight len(math_marks_A) |
Add the line shown to calculate the mean of the math marks for region A.
|
Press ctrl + s | Let us save the file. |
Switch to terminal | Switch to the terminal. |
Type, %run marks.py | Execute the file again with percentage sign run space marks.py. |
Highlight output | Hence we got our final output.
|
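A small sketch of the mean calculation on its own. The marks in the list are hypothetical, and the check for an empty list is an addition that avoids a division by zero when no record matches the region.

```python
# Hypothetical marks for region A.
math_marks_A = [47.0, 89.0, 62.0]

if math_marks_A:
    # Mean = total of the marks divided by the number of marks.
    math_marks_mean = sum(math_marks_A) / len(math_marks_A)
    print(math_marks_mean)  # 66.0
else:
    print("No marks found for region A")
```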
Show Slide
Summary slide
|
This brings us to the end of this tutorial.
|
Show Slide
Summary slide |
|
Show Slide
Evaluation
|
Here are some self-assessment questions for you to solve.
|
Show Slide
Evaluation |
2. What does int inside parentheses inside double quotes 20.0 produce? |
Show Slide
|
And the answers,
|
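Both questions can be checked with a short sketch. The variable names line, words and error_raised are assumptions chosen for illustration.

```python
# Question 1: split the string on the semicolon delimiter.
line = "Guido;Rossum;Python"
words = line.split(';')
print(words)  # ['Guido', 'Rossum', 'Python']

# Question 2: int() rejects a string that looks like a decimal number.
error_raised = False
try:
    int("20.0")
except ValueError:
    error_raised = True
print(error_raised)  # True

# Going through float first works.
print(int(float("20.0")))  # 20
```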
Show Slide
Forum |
Please post your timed queries in this forum. |
Show Slide
Fossee Forum |
Please post your general queries on Python in this forum. |
Show Slide
Textbook Companion |
FOSSEE team coordinates the TBC project. |
Show Slide
Acknowledgment |
Spoken Tutorial Project is funded by NMEICT, MHRD, Govt. of India.
|
Show Slide
Thank You |
This is Priya from IIT Bombay signing off.
Thanks for watching. |