Difference between revisions of "Python-3.4.3/C2/Statistics/English"

From Script | Spoken-Tutorial
(Created page with " {| style="border-spacing:0;" | style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| <ce...")
 
Line 31: Line 31:
  
 
* '''Ubuntu Linux 16.04''' operating system
 
* '''Ubuntu Linux 16.04''' operating system
* '''Python 3.4.3'''
+
* '''Python 3.4.3 '''and
 
* '''IPython 5.1.0'''
 
* '''IPython 5.1.0'''
  
Line 40: Line 40:
  
 
Pre-requisites
 
Pre-requisites
 
  
 
* Load data from files
 
* Load data from files
Line 52: Line 51:
 
* use Lists and
 
* use Lists and
 
* access parts of Arrays
 
* access parts of Arrays
 +
*
  
 
If not, see the pre-requisite '''Python''' tutorials on this website.
 
If not, see the pre-requisite '''Python''' tutorials on this website.
Line 77: Line 77:
  
  
For this, we need to install Numpy.
+
For this, we need to install '''Numpy'''.
  
 
|-
 
|-
Line 86: Line 86:
  
  
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| NumPy, stands for Numerical Python
+
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| '''NumPy''', stands for '''Numerical Python.'''
  
  
It is a library consisting of precompiled functions for mathematical and numerical routines
+
It is a library consisting of precompiled functions for mathematical and numerical routines.
  
  
NumPy has to be installed separately.
+
'''NumPy''' has to be installed separately.
  
 
|-
 
|-
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| <nowiki>Open terminal by pressing Ctrl+Alt+T keys simultaneously [Terminal]</nowiki>
+
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Open terminal by pressing Ctrl+Alt+T keys simultaneously
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Let us first open the '''Terminal '''by pressing '''Ctrl+Alt+T '''keys simultaneously.
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Let us first open the '''Terminal '''by pressing '''Ctrl+Alt+T '''keys simultaneously.
  
Line 111: Line 111:
  
  
You need to have root access for installation as it asks for admin password.
+
You need to have '''root''' access for installation as it asks for '''admin''' '''password'''.
  
 
|-
 
|-
Line 119: Line 119:
  
 
'''sudo pip3 install numpy==1.13.3'''
 
'''sudo pip3 install numpy==1.13.3'''
| style="background-color:#ffffff;border:1pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.014cm;padding-right:0.191cm;"| Next, we need to install numpy library as we will be using numpy library throughout the tutorial.
+
| style="background-color:#ffffff;border:1pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.014cm;padding-right:0.191cm;"| Next, we need to install '''numpy''' '''library''' as we will be using '''numpy''' '''library''' throughout the tutorial.
  
  
Type, '''sudo pip3 install numpy equal to equalto 1.13.3'''
+
Type, '''sudo pip3 install numpy '''is equal to is equal to''' 1.13.3 '''and press''' Enter.'''
  
 
|-
 
|-
Line 141: Line 141:
 
To get the data as an array, we use the '''loadtxt()''' function.
 
To get the data as an array, we use the '''loadtxt()''' function.
  
For '''loadtxt() '''function''', '''we need to import '''numpy''' library first.
+
For '''loadtxt() '''function''', '''we need to '''import''' '''numpy''' library first.
  
 
|-
 
|-
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| '''<nowiki>[Terminal] type ipython3</nowiki>'''
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| '''<nowiki>[Terminal] type ipython3</nowiki>'''
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Now, type '''ipython3''' and press '''Enter'''.  
+
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Switch back to the terminal.
 +
 
 +
Now, type '''ipython3''' and press '''Enter'''.  
  
 
|-
 
|-
Line 153: Line 155:
  
 
'''import numpy as np'''
 
'''import numpy as np'''
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Type, '''import numpy as np''' and press enter.
+
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Type, '''import numpy as np''' and press '''Enter'''.
  
 
Where '''np''' is alias to numpy and it can be any name.
 
Where '''np''' is alias to numpy and it can be any name.
Line 167: Line 169:
  
  
Type,
+
Type, '''L''' ''is equal to'' '''np '''''dot '''''loadtxt''' ''inside parentheses inside quotes'' '''student_record.txt''' ''comma'' '''usecols is equal to inside parentheses''' 3 comma 4 comma 5 comma 6 comma 7 comma '''delimiter''' ''is equal to inside quotes'' semicolon. Press '''Enter'''.
  
'''L''' ''is equal to'' '''np '''''dot '''''loadtxt''' ''inside parentheses inside quotes'' '''student_record.txt''' ''comma'' usecols is equal to inside parentheses 3 comma 4 comma 5 comma 6 comma 7 comma delimiter is equal to inside quotes semicolon.
 
 
Press Enter.
 
  
 
Type''' L '''and press''' enter'''
 
Type''' L '''and press''' enter'''
Line 190: Line 189:
  
  
loadtxt, delimiter and usecols are keywords.
+
'''loadtxt, delimiter''' and '''usecols''' are keywords.
  
 
|-
 
|-
Line 205: Line 204:
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| As we can see '''L''' is an '''array'''.  
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| As we can see '''L''' is an '''array'''.  
  
We can get the shape of this '''array''' using '''shape'''
+
 
 +
We can get the shape of this '''array''' using '''shape.'''
  
 
|-
 
|-
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Type
+
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Type '''L.shape'''
 
+
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Type, '''L '''''dot''''' shape '''and press '''Enter'''.
'''L.shape'''
+
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Type,
+
 
+
'''L '''''dot''''' shape '''and press enter.
+
  
 
|-
 
|-
Line 247: Line 243:
  
  
To access the first row in an array, we will type '''L '''''inside square brackets '''''0 '''and press enter.
+
To access the first row in an array, we will type '''L '''''inside square brackets '''''0 '''and press '''Enter'''.
  
 
|-
 
|-
Line 259: Line 255:
 
'''totalmarks '''''is equal to '''''sum '''''inside parentheses '''''L '''''inside square brackets 0 ''
 
'''totalmarks '''''is equal to '''''sum '''''inside parentheses '''''L '''''inside square brackets 0 ''
  
Press Enter
+
 
 +
Press '''Enter.'''
  
 
|-
 
|-
Line 265: Line 262:
  
 
Highlight 177.0
 
Highlight 177.0
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Type, '''totalmarks'''
+
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Type, '''totalmarks '''and press '''Enter.'''
 
+
Press Enter
+
  
  
Line 282: Line 277:
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Now to get the '''mean''' we can divide the '''totalmarks''' by the length of the '''array.'''
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Now to get the '''mean''' we can divide the '''totalmarks''' by the length of the '''array.'''
  
Type,
 
  
'''totalmarks '''divided by '''len''' ''inside parentheses '''''L'' '''inside square brackets '''''0.'''
+
Type, '''totalmarks '''divided by '''len''' ''inside parentheses '''''L'' '''inside square brackets '''''0 '''and press '''Enter.'''
  
 
|-
 
|-
Line 292: Line 286:
  
 
'''np.<nowiki>mean(L[0])</nowiki>'''
 
'''np.<nowiki>mean(L[0])</nowiki>'''
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Or simply use the '''function mean'''.Type '''np '''''dot '''''mean '''''inside parentheses '''''L'' '''inside square brackets '''''0.'''
+
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Or simply use the '''function mean'''.
 +
 
 +
Type '''np '''''dot '''''mean '''''inside parentheses '''''L'' '''inside square brackets '''''0 '''and press''' Enter.'''
  
 
|-
 
|-
Line 308: Line 304:
 
Is there a way to reduce the work?
 
Is there a way to reduce the work?
  
<nowiki><pause></nowiki>
 
  
 
For this we will look into the '''documentation''' of '''mean.'''
 
For this we will look into the '''documentation''' of '''mean.'''
  
Type, '''np '''''dot '''''mean '''''questionmark''
+
 
 +
Type, '''np '''''dot '''''mean '''''questionmark ''and press Enter''.''
  
 
Read the text for more information.
 
Read the text for more information.
Line 318: Line 314:
 
|-
 
|-
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Type '''q '''and press '''enter'''
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Type '''q '''and press '''enter'''
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Type '''q '''and press '''enter '''to exit the documentation.
+
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Type '''q '''to exit the documentation.
  
 
|-
 
|-
Line 346: Line 342:
 
Let us calculate, '''mean''' of the marks scored by all the students for each subject.  
 
Let us calculate, '''mean''' of the marks scored by all the students for each subject.  
  
Type '''np''' ''dot''' ''mean '''''inside parentheses '''''L '''''comma '''0''' ''
+
 
 +
Type '''np''' ''dot''' ''mean '''''inside parentheses '''''L '''''comma '''0''' ''and press Enter''.''
  
 
|-
 
|-
Line 359: Line 356:
  
  
Type '''L '''''inside square brackets '''''colon''''' comma '''0'''''
+
Type '''L '''''inside square brackets '''''colon''''' comma '''0 '''''and press Enter'''''.'''''
  
  
Note :, displays first '''column''' in the '''array''' i.e (that is) English Mark.
+
Note colon comma displays first '''column''' in the '''array''' i.e (that is) English Mark.
  
  
Line 375: Line 372:
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| To get the '''median''' we will simply use the '''function median'''.
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| To get the '''median''' we will simply use the '''function median'''.
  
Type '''np '''''dot '''''median '''''inside parentheses '''''L '''''inside square brackets '''''colon''''' comma '''0'''''
+
Type '''np '''''dot '''''median '''''inside parentheses '''''L '''''inside square brackets '''''colon''''' comma '''0 '''''
 +
 
 +
 
 +
Press Enter'''''.'''''
  
 
|-
 
|-
Line 387: Line 387:
  
 
Type '''np '''''dot '''''median '''''inside parentheses '''''L '''''comma '''0'''''
 
Type '''np '''''dot '''''median '''''inside parentheses '''''L '''''comma '''0'''''
 +
 +
 +
Press Enter'''''.'''''
  
 
|-
 
|-
Line 397: Line 400:
  
  
Standard deviation for english subject can be found by typing
+
Standard deviation for English subject can be found by typing '''np '''''dot '''s''td '''''inside parentheses '''''L '''''inside square brackets '''''colon''''' comma '''0'''''
  
'''np '''''dot '''s''td '''''inside parentheses '''''L '''''inside square brackets '''''colon''''' comma '''0'''''
+
 
 +
Press Enter.
  
 
|-
 
|-
Line 405: Line 409:
  
 
'''np.std(L,0)'''
 
'''np.std(L,0)'''
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| And for all rows, we do,'''np '''''dot '''s''td '''''inside parentheses '''''L '''''comma '''0.'''''
+
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| And for all rows, we do, '''np '''''dot '''s''td '''''inside parentheses '''''L '''''comma '''0 '''''and press Enter.
  
 
|-
 
|-
Line 419: Line 423:
  
 
Download and save the file in the present working directory.
 
Download and save the file in the present working directory.
 +
 +
 +
Currently the present working directory is the '''Home''' directory.
  
 
|-
 
|-
Line 452: Line 459:
  
  
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Switch to the terminal,
+
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Switch to the terminal.
  
The solution is,
 
  
First,  
+
The solution is, first, type,
  
Type,
+
'''L''' ''is equal to'' '''np '''''dot '''''loadtxt''' ''inside parentheses inside quotes'' '''football.txt''' ''comma'' '''usecols''' ''is equal to inside parentheses'' 1 comma 2 comma '''delimiter''' is equal to inside quotes comma.
  
'''L''' ''is equal to'' '''np '''''dot '''''loadtxt''' ''inside parentheses inside quotes'' '''football.txt''' ''comma'' usecols is equal to inside parentheses 1 comma 2 comma delimiter is equal to inside quotes comma.
 
  
 
Press enter.
 
Press enter.
  
'''np '''''dot''''' sum '''''inside parentheses '''''L '''''comma '''''1.'''
 
  
Press enter.
+
'''np '''''dot''''' sum '''''inside parentheses '''''L '''''comma '''''1 '''and press enter.
  
 
|-
 
|-
Line 472: Line 476:
  
 
Type''' np.mean(L,0)'''
 
Type''' np.mean(L,0)'''
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Second,
+
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Answer for the second, '''np '''''dot '''''mean '''''inside parentheses '''''L '''''comma '''''0 '''and press enter.
 
+
'''np '''''dot '''''mean '''''inside parentheses '''''L '''''comma '''''0.'''
+
 
+
Press enter.
+
  
 
|-
 
|-
Line 482: Line 482:
  
 
Type''' np.std(L,0)'''
 
Type''' np.std(L,0)'''
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Third,
+
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Third, '''np '''''dot '''''std '''''inside parentheses '''''L '''''comma '''''0 '''and press enter.
 
+
'''np '''''dot '''''std '''''inside parentheses '''''L '''''comma '''''0.'''
+
  
 
|-
 
|-
Line 496: Line 494:
  
  
In this tutorial, we have learnt to,
+
In this tutorial, we have learnt to, do the standard '''statistical''' '''operations''' like:
 
+
do the standard '''statistical''' '''operations''' like:
+
  
 
'''sum'''
 
'''sum'''
Line 517: Line 513:
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Here are some self assessment questions for you to solve
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Here are some self assessment questions for you to solve
  
# Given a '''two''' '''dimensional''' '''list '''as, '''<nowiki>two_dimensional_list=[[3,5,8,2,1],[4,3,6,2,1]]</nowiki>''' how do you calculate the mean of each row?
+
# Given a '''two''' '''dimensional''' '''list '''as shown.how do you calculate the mean of each row
# Calculate the '''median''' of the given '''list'''? '''<nowiki>student_marks=[74,78,56,87,91,82]</nowiki>'''
+
# Calculate the '''median''' of the given '''list'''?  
  
  
Line 537: Line 533:
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| And the answers,
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| And the answers,
  
1. To get the mean of each row, we just pass 1 as the second parameter to the function''' mean'''. '''np.mean(two_dimensional_list, 1)'''
+
1. To get the mean of each row, we just pass 1 as the second parameter to the function''' mean'''. '''np.mean '''''inside parentheses''''' two_dimensional_list '''''comma''''' 1'''
  
 
2. We use the '''function median''' to calculate the '''median''' of the '''list'''
 
2. We use the '''function median''' to calculate the '''median''' of the '''list'''
  
'''np.median(student_marks)'''
+
'''np.median '''''inside parentheses '''''student_marks'''
  
3. To specify the particular columns of a file, we use the parameter '''usecols=(2, 3, 4, 5)'''
+
3. To specify the particular columns of a file, we use the parameter '''usecols '''''is equal to inside parentheses '''''2, 3, 4, 5'''
  
 
|-
 
|-
Line 553: Line 549:
  
 
Fossee Forum
 
Fossee Forum
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Please post your general queries on Python in this forum.
+
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Please post your general queries on '''Python''' in this forum.
  
 
|-
 
|-
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Show Slide
+
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Show Slide Textbook Companion
 
+
Textbook Companion
+
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| FOSSEE team coordinates the TBC project.
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| FOSSEE team coordinates the TBC project.
  
Line 567: Line 561:
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Spoken Tutorial Project is funded by NMEICT, MHRD, Govt. of India.
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Spoken Tutorial Project is funded by NMEICT, MHRD, Govt. of India.
  
For more details, visit this website.
+
For more details, visit this website.  
  
 
|-
 
|-
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Previous slide
 
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Previous slide
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| This is Trupti Kin from IIT Bombay signing off. Thank you.
+
| style="background-color:#ffffff;border:0.5pt solid #000001;padding-top:0cm;padding-bottom:0cm;padding-left:0.088cm;padding-right:0.191cm;"| Thats it for the tutorial.
 +
 
 +
 
 +
This is Trupti Kini from IIT Bombay signing off. Thank you.
  
 
|}
 
|}

Revision as of 16:02, 4 May 2018

Visual Cue
Narration
Show Slide Hello Friends. Welcome to the tutorial on "Statistics" using Python.
Show Slide

Objectives


At the end of this tutorial, you will be able to -


  • Do statistical operations in Python
  • Sum a set of numbers
  • Find their mean, median and standard deviation


Show Slide

System Specifications

To record this tutorial, I am using
  • Ubuntu Linux 16.04 operating system
  • Python 3.4.3 and
  • IPython 5.1.0


Show Slide:

Pre-requisites

  • Load data from files
  • Use Lists
  • Access parts of Arrays


To practise this tutorial, you should know how to -
  • load data from files
  • use Lists and
  • access parts of Arrays

If not, see the pre-requisite Python tutorials on this website.

[File Browser]

open and Show the file student_record.txt


1:08 - text box

For this tutorial, we will use the data file student_record.txt which we used in the earlier tutorial.


You can also find this file in the code files link of this tutorial.


Please download it to the Home directory and use it.

[File Browser]

Show the file student_record.txt

We will use mathematical and logical operations on this array structured file.


For this, we need to install Numpy.

NumPy (Numerical Python)

slide:


NumPy stands for Numerical Python.


It is a library consisting of precompiled functions for mathematical and numerical routines.


NumPy has to be installed separately.

Open terminal by pressing Ctrl+Alt+T keys simultaneously Let us first open the Terminal by pressing Ctrl+Alt+T keys simultaneously.
[Terminal] Install latest pip

type sudo apt-get install python3-pip

Let us install the latest pip.


The pip command is used to install Python libraries.


Type, sudo apt-get install python3 hyphen pip


You need root access for the installation, as it asks for the admin password.

Install numpy

type

sudo pip3 install numpy==1.13.3

Next, we need to install the numpy library, as we will be using it throughout the tutorial.


Type, sudo pip3 install numpy is equal to is equal to 1.13.3 and press Enter.

Highlight prompt after installation The installation is completed successfully.


We can see the terminal prompt without any error.

Slide:loadtxt()


Next, we will learn about the loadtxt() function.


To get the data as an array, we use the loadtxt() function.

To use the loadtxt() function, we need to import the numpy library first.

[Terminal] type ipython3 Switch back to the terminal.

Now, type ipython3 and press Enter.

[IPython Terminal]

Type

import numpy as np

Type, import numpy as np and press Enter.

Here, np is an alias for numpy, and it can be any name.

Type

L=np.loadtxt('student_record.txt', usecols=(3,4,5,6,7), delimiter=';')


Type L and press enter

Let us load the data from the file student_record.txt as an array.


Type, L is equal to np dot loadtxt inside parentheses inside quotes student_record.txt comma usecols is equal to inside parentheses 3 comma 4 comma 5 comma 6 comma 7 comma delimiter is equal to inside quotes semicolon. Press Enter.


Type L and press enter

Highlight the output We get the output in the form of an array.
Highlight command one by one loadtxt loads data from an external file.


delimiter specifies the character that separates the fields of data.


usecols specifies the columns to be used.


loadtxt is a function; delimiter and usecols are its keyword arguments.

Highlight command one by one So columns 3,4,5,6,7 from student_record.txt are loaded here.


The commas between the column numbers are needed because usecols takes a sequence.
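The loadtxt call above can be tried without the real data file. The following is a minimal sketch: the two records are made up for illustration and only assume the layout described here (fields separated by semicolons, marks in columns 3 to 7).

```python
import io
import numpy as np

# Hypothetical records, not taken from student_record.txt.
# Columns 0-2 hold text fields; columns 3-7 hold the five marks.
sample = io.StringIO(
    "A;1001;STUDENT ONE;53;36;28;16;44\n"
    "B;1002;STUDENT TWO;58;72;65;80;91\n"
)

# usecols selects the columns to load; delimiter names the separator.
marks = np.loadtxt(sample, usecols=(3, 4, 5, 6, 7), delimiter=";")
print(marks)
```

Because only the numeric columns are selected, the text fields in the first three columns do not need to be converted.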

[IPython Terminal]

Type L.shape

As we can see L is an array.


We can get the shape of this array using shape.

Type L.shape Type, L dot shape and press Enter.
[IPython Terminal]

4:45


Highlight (185667, 5)

We get a tuple giving the number of rows and columns respectively.


In this example, the array L has 185667 rows and 5 columns.
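As a small illustration of shape, here is a sketch on a hand-made array. The values are hypothetical; only the idea that shape returns a (rows, columns) tuple comes from the tutorial.

```python
import numpy as np

# Hypothetical array: 2 students (rows), 5 subjects (columns).
small = np.array([[53., 36., 28., 16., 44.],
                  [58., 72., 65., 80., 91.]])

# shape is a tuple, so it can be unpacked directly.
rows, cols = small.shape
print(rows, cols)
```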

Let us switch back to the student_record.txt file.
Highlight record Let us start applying statistical operations on these.


How do you find the sum of marks of all subjects for the first student?

[IPython Terminal]

Type

L[0]

Switch back to the terminal.


To access the first row in an array, we will type L inside square brackets 0 and press Enter.

[IPython Terminal]

Type

totalmarks=sum(L[0])

Now to sum this, type,

totalmarks is equal to sum inside parentheses L inside square brackets 0


Press Enter.

Type totalmarks

Highlight 177.0

Type, totalmarks and press Enter.


We got sum of marks of all subjects of the first student.

[IPython Terminal]

Type

totalmarks/len(L[0])

Highlight 35.399999999999999

Now to get the mean we can divide the totalmarks by the length of the array.


Type, totalmarks divided by len inside parentheses L inside square brackets 0 and press Enter.

[IPython Terminal]

Type

np.mean(L[0])

Or simply use the function mean.

Type np dot mean inside parentheses L inside square brackets 0 and press Enter.
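The two ways of getting the mean can be compared side by side. This sketch uses a made-up row of five marks in place of L[0].

```python
import numpy as np

# Hypothetical marks of one student in five subjects.
first_row = np.array([53., 36., 28., 16., 44.])

# Manual route: add up the marks, then divide by how many there are.
totalmarks = sum(first_row)
manual_mean = totalmarks / len(first_row)

# Library route: np.mean does both steps in one call.
numpy_mean = np.mean(first_row)

print(totalmarks, manual_mean, numpy_mean)
```

Both routes should agree up to floating-point rounding.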

[IPython Terminal]

Type

np.mean?

But we have such a large data set.


And calculating the mean for each student one by one is time consuming.


Is there a way to reduce the work?


For this we will look into the documentation of mean.


Type, np dot mean questionmark and press Enter.

Read the text for more information.

Type q and press enter Type q to exit the documentation.
show slide

Two-Dimensional array

In the above example, L is a two-dimensional array, like a matrix.


We can calculate the mean along each axis of the array.


The axis of rows is referred to by 0 and the axis of columns by 1.


To calculate the mean across all columns, that is, one mean per row, we have to pass an extra parameter, 1, for the axis.

[IPython Terminal]

Type

np.mean(L,0)

Switch back to the terminal.


Let us calculate the mean of the marks scored by all the students for each subject.


Type np dot mean inside parentheses L comma 0 and press Enter.
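To see what the axis parameter does, here is a sketch on a small made-up array with 2 students and 3 subjects. Passing 0 gives one mean per column (per subject); passing 1 gives one mean per row (per student).

```python
import numpy as np

# Hypothetical marks: each row is a student, each column a subject.
marks = np.array([[53., 36., 28.],
                  [58., 72., 65.]])

per_subject = np.mean(marks, 0)  # collapses the rows: [55.5, 54.0, 46.5]
per_student = np.mean(marks, 1)  # collapses the columns: [39.0, 65.0]

print(per_subject)
print(per_student)
```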

[IPython Terminal]

Type

L[:,0]

Highlight output array([ 53., 58., 72., ..., 49., 33., 17.])

Next, we will calculate the median of English marks for all the students.


Type L inside square brackets colon comma 0 and press Enter.


Note that colon comma 0 selects the first column of the array, that is, the English marks.



[IPython Terminal]

Type

np.median(L[:,0])

To get the median we will simply use the function median.

Type np dot median inside parentheses L inside square brackets colon comma 0


Press Enter.

[IPython Terminal]

Type

np.median(L,0)

For all the subjects, we can calculate the median across all rows using the median function, as shown here.


Type np dot median inside parentheses L comma 0


Press Enter.
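The median of one column and of every column can be sketched on a small made-up array. Here column 0 plays the role of the English marks.

```python
import numpy as np

# Hypothetical marks: 3 students (rows), 2 subjects (columns).
marks = np.array([[53., 36.],
                  [58., 72.],
                  [72., 10.]])

english = marks[:, 0]                 # every row, first column
english_median = np.median(english)   # middle value of 53, 58, 72
all_medians = np.median(marks, 0)     # one median per subject

print(english_median, all_medians)
```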

[IPython Terminal]

Type

np.std(L[:,0])

Similarly, to calculate the standard deviation, we will use the function std.


Standard deviation for English subject can be found by typing np dot std inside parentheses L inside square brackets colon comma 0


Press Enter.

[IPython Terminal]Type

np.std(L,0)

And for all rows, we do, np dot std inside parentheses L comma 0 and press Enter.
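A short sketch of std on made-up numbers follows. One point worth knowing: by default np.std divides by the number of values (the population standard deviation); passing ddof=1 gives the sample standard deviation instead.

```python
import numpy as np

# Hypothetical marks: 8 students (rows), 2 subjects (columns).
marks = np.array([[2., 4.], [4., 4.], [4., 4.], [4., 4.],
                  [5., 4.], [5., 4.], [7., 4.], [9., 4.]])

first_subject_std = np.std(marks[:, 0])      # population std of column 0
all_std = np.std(marks, 0)                   # one value per subject
sample_std = np.std(marks[:, 0], ddof=1)     # divides by n - 1 instead of n

print(first_subject_std, all_std, sample_std)
```

The second column is constant, so its standard deviation is zero.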
Pause the video here, try out the following exercise and resume the video.
Show Slide

Exercise 1

Refer to the file football.txt, that is available in the code files link of this tutorial.


Download and save the file in the present working directory.


Currently the present working directory is the Home directory.

highlight In football.txt,
  • the first column is player name,
  • second is goals at home and
  • third column is goals away.


Show Slide

Exercise 1

  1. Find the total goals for each player
  2. Mean of home goals and away goals
  3. Standard deviation of home goals and away goals


[IPython Terminal]

Type

L=np.loadtxt('football.txt',usecols=(1,2), delimiter=',')


np.sum(L,1)


Switch to the terminal.


The solution is, first, type,

L is equal to np dot loadtxt inside parentheses inside quotes football.txt comma usecols is equal to inside parentheses 1 comma 2 comma delimiter is equal to inside quotes comma.


Press enter.


np dot sum inside parentheses L comma 1 and press enter.

[IPython Terminal]

Type np.mean(L,0)

For the second, type np dot mean inside parentheses L comma 0 and press enter.
[IPython Terminal]

Type np.std(L,0)

Third, np dot std inside parentheses L comma 0 and press enter.
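The whole exercise can be sketched end to end. The three records below are invented for illustration and only assume the layout described for football.txt (player name, goals at home, goals away, separated by commas).

```python
import io
import numpy as np

# Hypothetical records, not the contents of football.txt.
sample = io.StringIO(
    "Player A,10,6\n"
    "Player B,4,8\n"
    "Player C,7,7\n"
)

L = np.loadtxt(sample, usecols=(1, 2), delimiter=",")

totals = np.sum(L, 1)   # total goals per player (one value per row)
means = np.mean(L, 0)   # mean of home goals and of away goals
stds = np.std(L, 0)     # standard deviation of home and of away goals

# Caution: Python's built-in sum(L, 1) is not the same thing.
# Its second argument is a start value, so it adds the rows together
# and then adds 1 to each column total.
builtin_result = sum(L, 1)

print(totals, means, stds, builtin_result)
```

This is why np.sum, with the axis passed as the second argument, is the one to use for per-player totals.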
Show Slide

Summary


This brings us to the end of the tutorial.


In this tutorial, we have learnt to, do the standard statistical operations like:

sum

mean

median and

standard deviation in Python.

Show Slide

Assignment


Here are some self assessment questions for you to solve
  1. Given a two-dimensional list as shown, how do you calculate the mean of each row?
  2. Calculate the median of the given list.


Show Slide

Assignment

3. There is a file with 6 columns, but we want to load text only from columns 2, 3, 4 and 5.

How do we specify that?

Show Slide


Solution

And the answers,

1. To get the mean of each row, we just pass 1 as the second parameter to the function mean. np.mean inside parentheses two_dimensional_list comma 1

2. We use the function median to calculate the median of the list

np.median inside parentheses student_marks

3. To specify the particular columns of a file, we use the parameter usecols is equal to inside parentheses 2, 3, 4, 5
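The three answers can be checked with a short sketch. The two lists are the ones given in the earlier revision of this script; the six-column text used for the third answer is made up for illustration.

```python
import io
import numpy as np

two_dimensional_list = [[3, 5, 8, 2, 1], [4, 3, 6, 2, 1]]
student_marks = [74, 78, 56, 87, 91, 82]

# 1. Mean of each row: pass 1 as the second parameter.
row_means = np.mean(two_dimensional_list, 1)

# 2. Median of the list (mean of the two middle values, 78 and 82).
median_mark = np.median(student_marks)

# 3. Load only columns 2, 3, 4, 5 from six-column data.
#    Hypothetical whitespace-separated text stands in for the file.
six_columns = io.StringIO("10 20 30 40 50 60\n11 21 31 41 51 61\n")
selected = np.loadtxt(six_columns, usecols=(2, 3, 4, 5))

print(row_means, median_mark, selected)
```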

Show Slide Forum Please post your timed queries in this forum.
Show Slide

Fossee Forum

Please post your general queries on Python in this forum.
Show Slide Textbook Companion FOSSEE team coordinates the TBC project.
Show Slide

Acknowledgment http://spoken-tutorial.org

Spoken Tutorial Project is funded by NMEICT, MHRD, Govt. of India.

For more details, visit this website.

Previous slide That's it for the tutorial.


This is Trupti Kini from IIT Bombay signing off. Thank you.

Contributors and Content Editors

Nancyvarkey, Nirmala Venkat, Priyacst