Difference between revisions of "Python-3.4.3/C2/Statistics/English"

 
Latest revision as of 21:32, 6 May 2018

Visual Cue
Narration
Show Slide

Hello Friends. Welcome to the tutorial on "Statistics" using Python.
Show Slide

Objectives


At the end of this tutorial, you will be able to -
  • Do statistical operations in Python
  • Sum a set of numbers
  • Find their mean, median and standard deviation
Show Slide

System Specifications

To record this tutorial, I am using
  • Ubuntu Linux 16.04 operating system
  • Python 3.4.3 and
  • IPython 5.1.0
Show Slide:

Pre-requisites

  • Load data from files
  • Use Lists
  • Access parts of Arrays


To practise this tutorial, you should know how to -
  • load data from files
  • use Lists and
  • access parts of Arrays


If not, see the pre-requisite Python tutorials on this website.

[File Browser]

open and Show the file student_record.txt


1:08 - text box

For this tutorial, we will use the data file student_record.txt which we used in the earlier tutorial.


You can also find this file in the Code Files link of this tutorial.


Please download it to the Home directory and use it.

[File Browser]

Show the file student_record.txt

We will use mathematical and logical operations on this array structured file.


For this, we need to install NumPy.

NumPy (Numerical Python)

slide:


NumPy stands for Numerical Python.


It is a library consisting of pre-compiled functions for mathematical and numerical routines.


NumPy has to be installed separately.

Open terminal by pressing Ctrl+Alt+T keys simultaneously

Let us first open the Terminal by pressing the Ctrl+Alt+T keys simultaneously.

[Terminal] Install latest pip

type sudo apt-get install python3-pip

Let us install the latest pip.


The pip command is used to install Python libraries.


Type, sudo apt-get install python3 hyphen pip and press Enter.


You need root access for the installation, as it asks for the admin password.

Install numpy

type

sudo pip3 install numpy==1.13.3

Next, we need to install the numpy library, as we will be using it throughout the tutorial.


Type, sudo pip3 install numpy is equal to is equal to 1.13.3 and press Enter.

Highlight prompt after installation

The installation is completed successfully.


We can see the terminal prompt without any error.

Slide: loadtxt()


Next, we will learn about the loadtxt() function.


To get the data as an array, we use the loadtxt() function.

To use the loadtxt() function, we need to import the numpy library first.

[Terminal] type ipython3

Switch back to the terminal.

Now, type ipython3 and press Enter.

[IPython Terminal]

Type

import numpy as np

Type import numpy as np and press Enter.

Here, np is an alias for numpy; it can be any name.

Type

L=np.loadtxt('student_record.txt', usecols=(3,4,5,6,7), delimiter=';')


Type L and press Enter

Let us load the data from the file student_record.txt as an array.


Type, L is equal to np dot loadtxt inside parentheses inside quotes student_record.txt comma usecols is equal to inside parentheses 3 comma 4 comma 5 comma 6 comma 7 comma delimiter is equal to inside quotes semicolon. Press Enter.


Type L and press Enter.

Highlight the output

We get the output in the form of an array.

Highlight command one by one

loadtxt loads data from an external file.


delimiter specifies the character that separates the fields of data.


usecols specifies the columns to be used.


loadtxt is a function; delimiter and usecols are its keyword arguments.

Highlight command one by one

So columns 3, 4, 5, 6 and 7 from student_record.txt are loaded here. Note that usecols counts columns from 0.


The commas between the column numbers are needed because usecols takes a sequence.
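
The loadtxt call described above can be tried without the real data file. The following is a minimal sketch, assuming a few made-up records in a semicolon-separated layout similar to student_record.txt; the names and marks are invented for illustration and are not taken from the actual file.

```python
import io
import numpy as np

# Hypothetical sample records (not the real student_record.txt):
# semicolon-separated, with the marks in columns 3 to 7, counting from 0.
sample = io.StringIO(
    "A;015162;STUDENT ONE;53;36;28;16;44\n"
    "A;015163;STUDENT TWO;58;37;42;35;40\n"
    "B;015164;STUDENT THREE;72;56;71;27;60\n"
)

# usecols picks the columns to keep; delimiter names the separator character.
L = np.loadtxt(sample, usecols=(3, 4, 5, 6, 7), delimiter=';')
print(L)
```

Only the selected columns are converted to numbers, so the text columns at the start of each record do not cause an error.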

[IPython Terminal]

Type L.shape

As we can see L is an array.


We can get the shape of this array using shape.

Type L.shape

Type, L dot shape and press Enter.
[IPython Terminal]

4:45


Highlight (185667, 5)

We get a tuple giving the number of rows and columns, respectively.


In this example, the array L has one lakh eighty five thousand six hundred and sixty seven rows and 5 columns.
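
To see how the shape tuple can be read, here is a small sketch. It assumes a tiny made-up array of three rows and five columns in place of the full data set.

```python
import numpy as np

# A small made-up array standing in for the full data set.
L = np.array([[53., 36., 28., 16., 44.],
              [58., 37., 42., 35., 40.],
              [72., 56., 71., 27., 60.]])

# shape is a tuple: (number of rows, number of columns).
rows, cols = L.shape
print(L.shape)
print(rows, cols)
```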

Let us switch back to the student_record.txt file.
Highlight record

Let us start applying statistical operations on these records.


How do you find the sum of marks of all subjects for the first student?

[IPython Terminal]

Type

L[0]

Switch back to the terminal.


To access the first row in an array, we will type L inside square brackets 0 and press Enter.

[IPython Terminal]

Type

totalmarks=sum(L[0])

Now to sum this, type,

totalmarks is equal to sum inside parentheses L inside square brackets 0


Press Enter.

Type totalmarks

Highlight 177.0

Type totalmarks and press Enter.


We got the sum of marks of all subjects for the first student.

[IPython Terminal]

Type

totalmarks/len(L[0])

Highlight 35.399999999999999

Now, to get the mean, we can divide totalmarks by the length of the array.


Type, totalmarks divided by len inside parentheses L inside square brackets 0 and press Enter.

[IPython Terminal]

Type

np.mean(L[0])

Or simply use the function mean.

Type np dot mean inside parentheses L inside square brackets 0 and press Enter.
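
The two ways of getting the mean can be compared side by side. This is a minimal sketch, assuming a made-up first row whose marks were chosen to add up to 177; the individual marks are not from the real file.

```python
import numpy as np

# Made-up marks for one student, chosen so that they add up to 177.
first_row = np.array([53., 36., 28., 16., 44.])

# Mean by hand: total divided by the number of subjects.
totalmarks = sum(first_row)
manual_mean = totalmarks / len(first_row)

# Mean using the numpy function.
numpy_mean = np.mean(first_row)

print(totalmarks, manual_mean, numpy_mean)
```

Both approaches give the same value, up to floating point rounding.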

[IPython Terminal]

Type

np.mean?

But we have such a large data set.


And calculating the mean for each student one by one is time consuming.


Is there a way to reduce the work?


For this, we will look into the documentation of mean.


Type, np dot mean question mark and press Enter.

Read the text for more information.

Type q and press Enter

Type q to exit the documentation.

Show Slide

Two-Dimensional array

In the above example, L is a two-dimensional array, like a matrix.


We can calculate the mean along each axis of the array.


The axis of rows is referred to by 0 and the axis of columns by 1.


To calculate the mean across all columns, that is, one mean per row, we have to pass the extra parameter 1 for the axis.
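
The effect of the axis parameter is easiest to see on a very small array. The following sketch uses made-up numbers, not the student data.

```python
import numpy as np

# A made-up array with 2 rows and 3 columns.
L = np.array([[1., 2., 3.],
              [4., 5., 6.]])

# Axis 1: average across the columns, giving one mean per row.
per_row = np.mean(L, 1)

# Axis 0: average across the rows, giving one mean per column.
per_column = np.mean(L, 0)

print(per_row)      # [2. 5.]
print(per_column)   # [2.5 3.5 4.5]
```

For the student data, axis 1 would give the mean for each student and axis 0 the mean for each subject.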

[IPython Terminal]

Type

np.mean(L,0)

Switch back to the terminal.


Let us calculate the mean of the marks scored by all the students for each subject.


Type np dot mean inside parentheses L comma 0 and press Enter.

[IPython Terminal]

Type

L[:,0]

Highlight output array([ 53., 58., 72., ..., 49., 33., 17.])

Next, we will calculate the median of English marks for all the students.


Type L inside square brackets colon comma 0 and press Enter.


Note that colon comma zero selects the first column in the array, that is, the English marks.
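
The difference between selecting a row and selecting a column can be sketched as follows, assuming a small made-up array in which the first column holds the English marks.

```python
import numpy as np

# Made-up marks: one row per student, first column is English.
L = np.array([[53., 36.],
              [58., 37.],
              [72., 56.]])

# A single index selects a row: all the marks of the first student.
first_student = L[0]

# Colon comma zero selects every row of column 0: all the English marks.
english = L[:, 0]

print(first_student)   # [53. 36.]
print(english)         # [53. 58. 72.]
```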

[IPython Terminal]

Type

np.median(L[:,0])

To get the median we will simply use the function median.

Type np dot median inside parentheses L inside square brackets colon comma 0


Press Enter.

[IPython Terminal]

Type

np.median(L,0)

For all the subjects, we can calculate the median across all rows using the median function, as shown here.


Type np dot median inside parentheses L comma 0


Press Enter.
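
Both forms of the median call can be checked by hand on a small example. This is a sketch with made-up marks for four students in two subjects.

```python
import numpy as np

# Made-up marks: four students, two subjects.
L = np.array([[53., 36.],
              [58., 37.],
              [72., 56.],
              [49., 40.]])

# Median of the first column only.
english_median = np.median(L[:, 0])

# Median of every column at once, taken across the rows (axis 0).
all_medians = np.median(L, 0)

print(english_median)   # 55.5
print(all_medians)      # [55.5 38.5]
```

With an even number of values, the median is the average of the two middle values after sorting.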

[IPython Terminal]

Type

np.std(L[:,0])

Similarly, to calculate the standard deviation, we will use the function std.


The standard deviation for the English subject can be found by typing np dot std inside parentheses L inside square brackets colon comma 0


Press Enter.

[IPython Terminal]

Type

np.std(L,0)

And for all the subjects, across all rows, we type np dot std inside parentheses L comma 0 and press Enter.
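
To see what std computes, the formula can be written out and compared with the function. This is a minimal sketch on made-up values; note that np.std divides by the number of values by default (ddof=0), which gives the population standard deviation.

```python
import numpy as np

# Made-up marks, chosen so that the arithmetic is easy to check by hand.
marks = np.array([2., 4., 4., 4., 5., 5., 7., 9.])

# Standard deviation written out: square root of the mean squared deviation.
deviations = marks - np.mean(marks)
manual_std = np.sqrt(np.mean(deviations ** 2))

# np.std uses the same population formula by default (ddof=0).
numpy_std = np.std(marks)

print(manual_std, numpy_std)   # 2.0 2.0
```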
Pause the video here, try out the following exercise and resume the video.
Show Slide

Exercise 1

Refer to the file football.txt, that is available in the Code Files link of this tutorial.


Download and save the file in the present working directory.


Currently the present working directory is the Home directory.

Highlight

In football.txt,
  • the first column is player name,
  • second is goals at home and
  • third column is goals away.
Show Slide

Exercise 1

  1. Find the total goals for each player
  2. Mean of goals at home and goals away
  3. Standard deviation of goals at home and goals away
[IPython Terminal]

Type

L=np.loadtxt('football.txt',usecols=(1,2), delimiter=',')


np.sum(L,1)


Switch to the terminal.


The solution is, first, type,

L is equal to np dot loadtxt inside parentheses inside quotes football.txt comma usecols is equal to inside parentheses 1 comma 2 comma delimiter is equal to inside quotes comma.


Press Enter.


Then type np dot sum inside parentheses L comma 1 and press Enter.

[IPython Terminal]

Type np.mean(L,0)

For the second, type np dot mean inside parentheses L comma 0 and press Enter.
[IPython Terminal]

Type np.std(L,0)

For the third, type np dot std inside parentheses L comma 0 and press Enter.
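
The three answers can be tried together without the real file. The following sketch assumes a few made-up lines in the layout described for football.txt (player name, goals at home, goals away, separated by commas); the player names and goal counts are invented.

```python
import io
import numpy as np

# Hypothetical records in the football.txt layout (not the real file):
# player name, goals at home, goals away.
sample = io.StringIO(
    "Player A,3,2\n"
    "Player B,1,4\n"
    "Player C,2,0\n"
)

# Load only the two numeric columns.
L = np.loadtxt(sample, usecols=(1, 2), delimiter=',')

totals = np.sum(L, 1)    # 1. total goals for each player (one value per row)
means = np.mean(L, 0)    # 2. mean of goals at home and of goals away
stds = np.std(L, 0)      # 3. standard deviation of each column

print(totals)   # [5. 5. 2.]
print(means)    # [2. 2.]
print(stds)
```

Note that np.sum is used here rather than the built-in sum, because the built-in function treats its second argument as a starting value and not as an axis.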
Show Slide

Summary


This brings us to the end of the tutorial.


In this tutorial, we have learnt to do the standard statistical operations like:

sum

mean

median and

standard deviation in Python.

Show Slide

Assignment


Here are some self assessment questions for you to solve.
  1. Given a two dimensional list as shown, how do you calculate the mean of each row?
  2. Calculate the median of the given list.
Show Slide

Assignment

  3. There is a file with 6 columns, but we want to load data only from columns 2, 3, 4 and 5.

How do we specify that?

Show Slide


Solution

And the answers,

1. To get the mean of each row, we just pass 1 as the second parameter to the function mean:

np.mean inside parentheses two_dimensional_list comma 1

2. We use the function median to calculate the median of the list:

np.median inside parentheses student_marks

3. To specify particular columns of a file, we use the parameter usecols is equal to inside parentheses 2, 3, 4, 5.
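
The three answers can be sketched in code as follows. The lists shown on the assignment slide are not reproduced in this script, so the values of two_dimensional_list and student_marks below, as well as the six-column sample, are made up for illustration.

```python
import io
import numpy as np

# 1. Mean of each row: pass 1 as the second parameter.
two_dimensional_list = [[1, 2, 3], [4, 5, 6]]      # made-up values
row_means = np.mean(two_dimensional_list, 1)

# 2. Median of a list.
student_marks = [74, 87, 64, 90, 65]               # made-up values
median_mark = np.median(student_marks)

# 3. Load only columns 2, 3, 4 and 5 from data with 6 columns.
six_columns = io.StringIO("10 11 12 13 14 15\n20 21 22 23 24 25\n")
selected = np.loadtxt(six_columns, usecols=(2, 3, 4, 5))

print(row_means)     # [2. 5.]
print(median_mark)   # 74.0
print(selected)
```

When no delimiter is given, loadtxt splits the fields on whitespace.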

Show Slide

Forum

Please post your timed queries in this forum.
Show Slide

FOSSEE Forum

Please post your general queries on Python in this forum.
Show Slide

Textbook Companion

The FOSSEE team coordinates the TBC project.
Show Slide

Acknowledgment

http://spoken-tutorial.org

Spoken Tutorial Project is funded by NMEICT, MHRD, Govt. of India.

For more details, visit this website.

Previous slide

That's it for the tutorial.


This is Trupti Kini from IIT Bombay signing off. Thank you.

Contributors and Content Editors

Nancyvarkey, Nirmala Venkat, Priyacst