Difference between revisions of "Gnuplot/C2/Statistics-and-box-plot/English"
(Created page with "{| border=1 || '''Visual Cue''' || '''Narration''' |- || '''Slide Number 1''' '''Title Slide ''' || Welcome to the tutorial on''' Statistics and Box plot'''. |- || '''Slide...") |
Snehalathak (Talk | contribs) |
||
(10 intermediate revisions by 2 users not shown) | |||
Line 6: | Line 6: | ||
|| '''Slide Number 1''' | || '''Slide Number 1''' | ||
'''Title Slide ''' | '''Title Slide ''' | ||
− | || Welcome to the tutorial on''' Statistics and Box | + | || Welcome to the tutorial on '''Statistics and Box Plot'''. |
|- | |- | ||
Line 13: | Line 13: | ||
|| In this tutorial, we will | || In this tutorial, we will | ||
− | * Plot string data on x-axis | + | * Plot string data on '''x'''-axis |
* Calculate Statistical summary for the input file | * Calculate Statistical summary for the input file | ||
− | * Draw candlestick plot | + | * Draw '''candlestick''' plot |
− | * Draw boxplot with | + | * Draw '''boxplot''' with and without '''outliers''' |
|- | |- | ||
|| '''Slide Number 3''' | || '''Slide Number 3''' | ||
'''Learning Objectives''' | '''Learning Objectives''' | ||
− | || | + | || and |
− | * Specify the position of boxplot on x-axis | + | * Specify the position of '''boxplot''' on '''x'''-axis |
|- | |- | ||
|| '''Slide Number 4''' | || '''Slide Number 4''' | ||
− | '''System | + | '''System Requirements''' |
|| To record this tutorial, I am using | || To record this tutorial, I am using | ||
− | * '''Ubuntu Linux''' | + | * '''Ubuntu Linux''' version 16.04 OS |
− | * '''gnuplot''' | + | * '''gnuplot''' version 5.2.6 |
− | * '''Gedit''' | + | * '''Gedit''' version 3.18 |
|- | |- | ||
Line 43: | Line 43: | ||
|- | |- | ||
|| '''Slide Number 6''' | || '''Slide Number 6''' | ||
− | '''Code | + | '''Code Files''' |
|| The files used in this tutorial are provided in the '''Code files''' link. | || The files used in this tutorial are provided in the '''Code files''' link. | ||
+ | |||
Please download and extract them. | Please download and extract them. | ||
Line 58: | Line 59: | ||
|- | |- | ||
|| Hover mouse over first column. | || Hover mouse over first column. | ||
+ | |||
House mouse over second column. | House mouse over second column. | ||
+ | |||
Hover mouse over third column. | Hover mouse over third column. | ||
|| The first column is row number. | || The first column is row number. | ||
− | The second column is '''y''' | + | |
+ | The second column is '''y''' data. | ||
+ | |||
The third column is the x data, which has the corresponding string or names. | The third column is the x data, which has the corresponding string or names. | ||
Line 79: | Line 84: | ||
|| Press '''Ctrl+L''' . | || Press '''Ctrl+L''' . | ||
|| I will clear the screen. | || I will clear the screen. | ||
− | Let's draw an '''xy''' plot labeling the x axis with strings. | + | |
+ | Let's draw an '''xy''' plot labeling the '''x''' axis with strings. | ||
|- | |- | ||
|| Enter the command, '''set xtics rotate '''. | || Enter the command, '''set xtics rotate '''. | ||
|| Enter the commands as seen on the screen. | || Enter the commands as seen on the screen. | ||
− | I will rotate the '''x tics | + | |
+ | I will rotate the '''x tics''' labels by 90 degrees. | ||
|- | |- | ||
Line 92: | Line 99: | ||
|- | |- | ||
|| Enter the command, '''plot "statistics.txt" using 2:xticlabels(3)''' . | || Enter the command, '''plot "statistics.txt" using 2:xticlabels(3)''' . | ||
− | || The next command plots the string x data against y to make a 2D plot. | + | || The next command plots the string '''x''' data against '''y''' to make a 2D plot. |
|- | |- | ||
− | || Hover mouse on x tics strings and graph. | + | || Hover mouse on '''x''' tics strings and graph. |
|| Notice the graphical plot of '''x string''' data against the numeric '''y axis''' data. | || Notice the graphical plot of '''x string''' data against the numeric '''y axis''' data. | ||
+ | |||
An example of string data is teacher plotting marks against student names. | An example of string data is teacher plotting marks against student names. | ||
Line 112: | Line 120: | ||
|- | |- | ||
− | || Type '''stats "staticstics.txt" using 2 '''and press '''Enter'''. | + | || Type '''stats "staticstics.txt" using 2''' and press '''Enter'''. |
− | || The command '''stats''', filename | + | || The command '''stats''', filename using column number, calculates statistics. |
|- | |- | ||
Line 124: | Line 132: | ||
|- | |- | ||
− | || Hover mouse over | + | || Hover mouse over 61. |
− | || This shows, the file has | + | || This shows, the file has 61 data points. |
|- | |- | ||
Line 132: | Line 140: | ||
Hover mouse over max and min values. | Hover mouse over max and min values. | ||
|| The output shows the statistical summary for the input file. | || The output shows the statistical summary for the input file. | ||
− | '''Mean, standard deviation | + | '''Mean, standard deviation''' & '''sum of squares''' are seen. |
− | '''Minimum values & maximum values are also seen | + | |
+ | '''Minimum''' values & '''maximum''' values are also seen. | ||
+ | |||
'''Median''' and '''quartile range''' is generated on the screen. | '''Median''' and '''quartile range''' is generated on the screen. | ||
|- | |- | ||
|| Hover mouse next to '''quartile range''' again. | || Hover mouse next to '''quartile range''' again. | ||
− | || We can plot the statistical analysis using candlestick plot or box plot. | + | || We can plot the statistical analysis using '''candlestick plot''' or '''box plot'''. |
+ | |||
This is useful for descriptive or informative analysis. | This is useful for descriptive or informative analysis. | ||
+ | |||
The height of the box can correspond to either standard deviation or quartile range. | The height of the box can correspond to either standard deviation or quartile range. | ||
Line 145: | Line 157: | ||
|| Open '''gedit'''. | || Open '''gedit'''. | ||
|| We will create a datafile to make '''candlestick plot''' with this data. | || We will create a datafile to make '''candlestick plot''' with this data. | ||
+ | |||
Open a '''gedit''' window and enter the values as seen here. | Open a '''gedit''' window and enter the values as seen here. | ||
Line 150: | Line 163: | ||
|| Type '''#candlestick plot style''' and start a new line. | || Type '''#candlestick plot style''' and start a new line. | ||
|| I will make a comment on the first row. | || I will make a comment on the first row. | ||
− | This Indicates the data is for the candlestick plot style. | + | |
+ | This Indicates the data is for the '''candlestick plot''' style. | ||
|- | |- | ||
Line 157: | Line 171: | ||
Press '''Enter'''. | Press '''Enter'''. | ||
|| I will also include the data format for this plot, with tab separation. | || I will also include the data format for this plot, with tab separation. | ||
− | For further information on candlestick plot, use the '''gnuplot''' help section. | + | |
+ | For further information on '''candlestick plot''', use the '''gnuplot''' help section. | ||
|- | |- | ||
− | || Type '''1''', Press '''tab''', type | + | || Type '''1''', Press '''tab''', type 300 . |
|| In the next line, enter the values for plotting. | || In the next line, enter the values for plotting. | ||
− | + | ||
− | The candlestick data will be plotted on this x position. | + | The first column is an arbitrary x value. |
− | + | ||
+ | The '''candlestick''' data will be plotted on this x position. | ||
+ | |||
+ | 300 is the value of mean minus the standard deviation. | ||
+ | |||
This defines the lower limit of the box. | This defines the lower limit of the box. | ||
|- | |- | ||
− | || Press '''tab''', type | + | || Press '''tab''', type 119, press '''tab''', type 2965. |
− | || '''Y '''minimum in the data is | + | || '''Y '''minimum in the data is 119. |
− | The fourth input is the '''y''' max value and it is | + | The fourth input is the '''y''' max value and it is 2965. |
|- | |- | ||
− | || Press '''tab''', type | + | || Press '''tab''', type 1582. |
− | || Enter the value of mean plus standard deviation as | + | || Enter the value of mean plus standard deviation as 1582. |
|- | |- | ||
− | || | + | || Save file in '''Desktop''' directory. |
− | ''' | + | Give filename '''candlestick.dat'''. |
− | || Save the file on '''Desktop''' with the file name '''candlestick.dat''' . | + | || Save the file on '''Desktop''' with the file name '''candlestick.dat'''. |
|- | |- | ||
− | || Click on ''' | + | || Click on '''Save'''. |
− | || Click on the ''' | + | || Click on the '''Save button''' to save the '''script'''. |
|- | |- | ||
|| Close '''Gedit'''. | || Close '''Gedit'''. | ||
− | || I will close ''' | + | || I will close '''gedit'''. |
|- | |- | ||
|| Go to '''gnuplot'''. | || Go to '''gnuplot'''. | ||
− | || Go back to the '''terminal '''. | + | || Go back to the '''terminal'''. |
|- | |- | ||
− | || Press '''Ctrl+L''' . | + | || Press '''Ctrl+L'''. |
|| I will also clear the screen. | || I will also clear the screen. | ||
Line 201: | Line 220: | ||
'''set xrange [0.97:1.03]''' | '''set xrange [0.97:1.03]''' | ||
and press '''Enter'''. | and press '''Enter'''. | ||
− | || Let’s also set x axis limits with '''set xrange''' command as seen. | + | || Let’s also set '''x''' axis limits with '''set xrange''' command as seen. |
|- | |- | ||
|| Enter the command, '''plot 'candlestick.dat' using 1:2:3:4:5 with candlesticks''' . | || Enter the command, '''plot 'candlestick.dat' using 1:2:3:4:5 with candlesticks''' . | ||
− | || | + | || Make a plot with the command as seen on the screen. |
|- | |- | ||
|| Cursor on the graphics window. | || Cursor on the graphics window. | ||
|| The '''candlestick plot''' appears on the graphics window. | || The '''candlestick plot''' appears on the graphics window. | ||
+ | |||
Often, we want to plot, '''outliers''' and '''quartile''' range for box height in the graph. | Often, we want to plot, '''outliers''' and '''quartile''' range for box height in the graph. | ||
Line 215: | Line 235: | ||
|| Close the graphics window. | || Close the graphics window. | ||
|| Let us see how to do this. | || Let us see how to do this. | ||
+ | |||
Close the graphics window. | Close the graphics window. | ||
|- | |- | ||
|| Enter the command, '''set autoscale''' . | || Enter the command, '''set autoscale''' . | ||
− | || In '''gnuplot''' prompt, | + | || In '''gnuplot''' prompt, set '''autoscale''' for axis range. |
|- | |- | ||
|| Enter the command, '''set style data boxplot'''. | || Enter the command, '''set style data boxplot'''. | ||
|| Set the box plot style for graph as seen on the screen. | || Set the box plot style for graph as seen on the screen. | ||
− | This command, sets the plot style to boxplot. | + | |
+ | This command, sets the plot style to '''boxplot'''. | ||
|- | |- | ||
− | || Enter the command, '''set style boxplot outliers | + | || Enter the command, '''set style boxplot outliers'''. |
− | || The next command, plots the boxplot with outliers. | + | || The next command, plots the '''boxplot''' with '''outliers'''. |
|- | |- | ||
Line 237: | Line 259: | ||
|| Enter the command, '''plot 'statistics.txt' using (0):2 ls 1 notitle''' . | || Enter the command, '''plot 'statistics.txt' using (0):2 ls 1 notitle''' . | ||
|| Type the plot command as seen on the screen. | || Type the plot command as seen on the screen. | ||
+ | |||
The plot will be set in '''x''' axis position zero. | The plot will be set in '''x''' axis position zero. | ||
Line 242: | Line 265: | ||
|| Graphics screen shows. | || Graphics screen shows. | ||
|| In the graph, notice the outliers are also plotted. | || In the graph, notice the outliers are also plotted. | ||
− | Outliers are data points, beyond the '''quartile range''' of the data set. | + | '''Outliers''' are data points, beyond the '''quartile range''' of the data set. |
|- | |- | ||
Line 248: | Line 271: | ||
'''set style boxplot nooutliers''' | '''set style boxplot nooutliers''' | ||
and press '''Enter'''. | and press '''Enter'''. | ||
− | || If the outliers are not to be plotted, do the following. | + | || If the '''outliers''' are not to be plotted, do the following. |
− | Go to the gnuplot terminal and enter the command as seen on the screen. | + | |
+ | Go to the '''gnuplot''' terminal and enter the command as seen on the screen. | ||
|- | |- | ||
Line 258: | Line 282: | ||
|- | |- | ||
|| Close graph. | || Close graph. | ||
− | Enter '''q''' to quit '''gnuplot''' | + | Enter '''q''' to quit '''gnuplot'''. |
|| Close the graphics window and quit '''gnuplot'''. | || Close the graphics window and quit '''gnuplot'''. | ||
Line 265: | Line 289: | ||
'''Summary''' | '''Summary''' | ||
|| Now let's summarize. | || Now let's summarize. | ||
+ | |||
In this tutorial, we | In this tutorial, we | ||
* Plotted string data on x-axis | * Plotted string data on x-axis | ||
* Calculated statistical summary with '''stats''' command | * Calculated statistical summary with '''stats''' command | ||
− | * Generated candlestick plot | + | * Generated '''candlestick''' plot |
− | * Generated box plot with and without outliers and | + | * Generated box plot with and without '''outliers''' and |
|- | |- | ||
|| '''Slide Number 8''' | || '''Slide Number 8''' | ||
'''Summary''' | '''Summary''' | ||
− | || | + | || * Specified the x-axis position for the box plot |
− | + | ||
|- | |- | ||
Line 282: | Line 306: | ||
[http://gnuplot.sourceforge.net/demo_5.2/ http://gnuplot.sourceforge.net/demo_5.2/] | [http://gnuplot.sourceforge.net/demo_5.2/ http://gnuplot.sourceforge.net/demo_5.2/] | ||
|| For the assignment activity, please do the following. | || For the assignment activity, please do the following. | ||
− | Practice box plot with and without outliers for the file '''boxplot.txt'''. | + | Practice box plot with and without '''outliers''' for the file '''boxplot.txt'''. |
− | Practice and understand example boxplot styles from '''gnuplot''' website. | + | Practice and understand example '''boxplot''' styles from '''gnuplot''' website. |
|- | |- | ||
Line 291: | Line 315: | ||
Draw a time-activity bar chart | Draw a time-activity bar chart | ||
|| We will do one more assignment. | || We will do one more assignment. | ||
− | * Draw a '''Time-activity graph '''for your daily activities | + | * Draw a '''Time-activity graph''' for your daily activities |
* For this, time your activities in a day and make a time table. | * For this, time your activities in a day and make a time table. | ||
− | * Plot time in hours on y-axis and activities on x-axis. | + | * Plot time in hours on '''y'''-axis and activities on '''x'''-axis. |
|- | |- |
Latest revision as of 20:43, 7 February 2020
Visual Cue | Narration |
Slide Number 1
Title Slide |
Welcome to the tutorial on Statistics and Box Plot. |
Slide Number 2
Learning Objectives |
In this tutorial, we will
|
Slide Number 3
Learning Objectives |
and
|
Slide Number 4
System Requirements |
To record this tutorial, I am using
|
Slide Number 5
Pre-requisites |
To follow this tutorial,
|
Slide Number 6
Code Files |
The files used in this tutorial are provided in the Code files link.
Please download and extract them. |
Go to Desktop.
Show file icon statistics.txt on Desktop. |
I have saved the input file on Desktop. |
Screenshot of file opened in gedit. | The input file statistics.txt consists of 3 columns. |
Hover mouse over first column.
Hover mouse over second column. Hover mouse over third column. |
The first column is row number.
The second column is y data. The third column is the x data, which has the corresponding strings or names. |
Press Ctrl+Alt+T. | Open a terminal. |
Enter the command, cd Desktop . | Change the directory to Desktop. |
Enter the command gnuplot . | Let’s open gnuplot. |
Press Ctrl+L . | I will clear the screen.
Let's draw an xy plot labeling the x axis with strings. |
Enter the command, set xtics rotate . | Enter the commands as seen on the screen.
I will rotate the x tic labels by 90 degrees. |
Enter the command, set autoscale . | Use the command set autoscale to set the axis range to autoscale. |
Enter the command, plot "statistics.txt" using 2:xticlabels(3) . | The next command plots the string x data against y to make a 2D plot. |
Hover mouse on x tics strings and graph. | Notice the graphical plot of x string data against the numeric y axis data.
An example of string data is a teacher plotting marks against student names. |
Cursor on the graphics window. | Analysis of such data often involves calculation of statistical parameters. |
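The commands narrated above can be collected into one short gnuplot sketch. It assumes that statistics.txt is in the current directory and has the three columns described earlier (row number, y value, string label). The notitle keyword is an addition, not part of the recorded commands.

```gnuplot
# Sketch: plot the numeric column 2 and label the x tics with the strings in column 3.
# Assumes statistics.txt (row number, y value, label) is in the current directory.
set xtics rotate      # rotate the x tic labels by 90 degrees so long names do not overlap
set autoscale         # let gnuplot choose the axis ranges
plot "statistics.txt" using 2:xticlabels(3) notitle
```

With a single data column in the using specification, gnuplot uses the row index as the x position and takes only the tic labels from column 3.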
Close the graphics window. | Close the graphics window. |
Go to the terminal. | Go to the terminal and type a command as seen on the screen. |
Type stats "statistics.txt" using 2 and press Enter. | The stats command, followed by the file name and using with a column number, calculates statistics for that column. |
Output is seen on the screen. | A statistical summary is generated on the screen. |
Scroll up the page. | Let's scroll up. |
Hover mouse over 61. | This shows that the file has 61 data points. |
Hover mouse over std dev.
Hover mouse over Sum. Hover mouse over max and min values. |
The output shows the statistical summary for the input file.
Mean, standard deviation & sum of squares are seen. Minimum values & maximum values are also seen. Median and quartile range are generated on the screen. |
Hover mouse next to quartile range again. | We can plot the statistical analysis using candlestick plot or box plot.
This is useful for descriptive or informative analysis. The height of the box can correspond to either standard deviation or quartile range. |
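The same summary values can also be read back inside gnuplot. The sketch below assumes gnuplot 5 and the same statistics.txt file. It relies on the STATS_ variables that the stats command defines for a single column. The nooutput keyword only suppresses the printed table.

```gnuplot
# Sketch: run stats quietly, then print selected values from the STATS_ variables.
# Assumes statistics.txt is in the current directory and column 2 holds the y data.
stats "statistics.txt" using 2 nooutput
print "records        = ", STATS_records
print "mean           = ", STATS_mean
print "std deviation  = ", STATS_stddev
print "minimum        = ", STATS_min
print "maximum        = ", STATS_max
print "lower quartile = ", STATS_lo_quartile
print "median         = ", STATS_median
print "upper quartile = ", STATS_up_quartile
```

The mean and standard deviation give the box limits used for the candlestick plot. The quartiles are what the box plot style uses.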
Open gedit. | We will create a datafile to make candlestick plot with this data.
Open a gedit window and enter the values as seen here. |
Type #candlestick plot style and start a new line. | I will make a comment on the first row.
This indicates the data is for the candlestick plot style. |
Type,
#x-position tab (mean-stddev) tab y-min tab y-max tab (mean+stddev) tab with candlesticks . Press Enter. |
I will also include the data format for this plot, with tab separation.
For further information on candlestick plot, use the gnuplot help section. |
Type 1, Press tab, type 300 . | In the next line, enter the values for plotting.
The first column is an arbitrary x value. The candlestick data will be plotted on this x position. 300 is the value of mean minus the standard deviation. This defines the lower limit of the box. |
Press tab, type 119, press tab, type 2965. | The third input is the y minimum in the data and it is 119.
The fourth input is the y max value and it is 2965. |
Press tab, type 1582. | Enter the value of mean plus standard deviation as 1582. |
Save file in Desktop directory.
Give filename candlestick.dat. |
Save the file on Desktop with the file name candlestick.dat. |
Click on Save. | Click on the Save button to save the data file. |
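As an alternative to typing the values by hand, the data file could be written from gnuplot itself. This is only a sketch. The output name candlestick_auto.dat is a hypothetical name chosen so that the hand-typed candlestick.dat is not overwritten. The values written are unrounded, so they will not exactly match the rounded 300 and 1582 entered above.

```gnuplot
# Sketch: write a candlestick data file from the stats results.
# Column order: x-position, mean-stddev, y-min, y-max, mean+stddev (tab separated).
# candlestick_auto.dat is a hypothetical file name used only for this example.
stats "statistics.txt" using 2 nooutput
set print "candlestick_auto.dat"       # redirect print output to the file
print "#candlestick plot style"
print "#x-position\t(mean-stddev)\ty-min\ty-max\t(mean+stddev)"
print sprintf("1\t%g\t%g\t%g\t%g", \
      STATS_mean - STATS_stddev, STATS_min, STATS_max, STATS_mean + STATS_stddev)
set print                              # close the file and restore normal print output
```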
Close Gedit. | I will close gedit. |
Go to gnuplot. | Go back to the terminal. |
Press Ctrl+L. | I will also clear the screen. |
Type,
set xrange [0.97:1.03] and press Enter. |
Let’s also set x axis limits with set xrange command as seen. |
Enter the command, plot 'candlestick.dat' using 1:2:3:4:5 with candlesticks . | Make a plot with the command as seen on the screen. |
Cursor on the graphics window. | The candlestick plot appears on the graphics window.
Often, we want to plot outliers and use the quartile range for the box height in the graph. |
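The candlestick steps can also be tried without a separate file, using an inline datablock. This sketch assumes gnuplot 5 and uses the values entered in this tutorial. The set boxwidth value and the whiskerbars keyword are additions for illustration and are not part of the recorded commands.

```gnuplot
# Sketch: candlestick plot from an inline datablock holding the tutorial's values.
# Columns: x-position, box low (mean-stddev), whisker low (y-min),
#          whisker high (y-max), box high (mean+stddev).
$Candle << EOD
1 300 119 2965 1582
EOD
set xrange [0.97:1.03]
set boxwidth 0.02     # assumed width, chosen to fit the narrow x range
plot $Candle using 1:2:3:4:5 with candlesticks whiskerbars notitle
```

The box spans mean minus and plus one standard deviation, and the whiskers reach the minimum and maximum of the data.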
Close the graphics window. | Let us see how to do this.
Close the graphics window. |
Enter the command, set autoscale . | In gnuplot prompt, set autoscale for axis range. |
Enter the command, set style data boxplot. | Set the box plot style for graph as seen on the screen.
This command sets the plot style to boxplot. |
Enter the command, set style boxplot outliers. | The next command plots the boxplot with outliers. |
Enter the command, set style fill solid 0.5 border -1 . | Set a solid style color fill for the box. |
Enter the command, plot 'statistics.txt' using (0):2 ls 1 notitle . | Type the plot command as seen on the screen.
The plot will be placed at x-axis position zero. |
Graphics screen shows. | In the graph, notice the outliers are also plotted.
Outliers are data points that lie beyond the whiskers of the box plot. By default, the whiskers extend up to 1.5 times the interquartile range from the box. |
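The box plot commands above, gathered into one sketch. It assumes statistics.txt is in the current directory. The range 1.5 line is an addition that only restates gnuplot's default whisker reach, so that the definition of an outlier is visible in the script.

```gnuplot
# Sketch: box plot of column 2 with outliers drawn as individual points.
set autoscale                        # undo the narrow xrange used for the candlestick plot
set style data boxplot
set style boxplot outliers
set style boxplot range 1.5          # whiskers reach 1.5 x interquartile range (the default)
set style fill solid 0.5 border -1
plot 'statistics.txt' using (0):2 ls 1 notitle   # (0) places the box at x = 0
```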
Type,
set style boxplot nooutliers and press Enter. |
If the outliers are not to be plotted, do the following.
Go to the gnuplot terminal and enter the command as seen on the screen. |
Type
replot and press Enter. |
Replot to see the results. |
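A short sketch of the variation without outliers, also showing how the x position can be changed. The position 1, the width 0.3 and the tic label "statistics" are illustrative choices that do not appear in the recorded commands.

```gnuplot
# Sketch: box plot without outliers, placed at x = 1 with an explicit box width.
set style data boxplot
set style boxplot nooutliers
set style fill solid 0.5 border -1
set xtics ("statistics" 1)           # label the box position (illustrative label)
plot 'statistics.txt' using (1):2:(0.3) ls 1 notitle   # x position 1, box width 0.3
```

The first entry in the using specification sets the x position of the box, and the optional third entry sets its width.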
Close graph.
Enter q to quit gnuplot. |
Close the graphics window and quit gnuplot. |
Slide Number 7
Summary |
Now let's summarize.
In this tutorial, we
|
Slide Number 8
Summary |
* Specified the x-axis position for the box plot |
Slide Number 9
Assignment 1 http://gnuplot.sourceforge.net/demo_5.2/ |
For the assignment activity, please do the following.
Practice box plot with and without outliers for the file boxplot.txt. Practice and understand example boxplot styles from gnuplot website. |
Slide Number 10
Assignment 2 Draw a time-activity bar chart |
We will do one more assignment.
|
Glimpse of assignment. | Your assignment may look similar to this. |
Slide Number 11
Spoken Tutorial Project |
This video summarises the Spoken Tutorial Project.
Please download and watch it. |
Slide Number 12
Spoken Tutorial workshops |
The Spoken Tutorial Team
For more details, please write to us. |
Slide Number 13
Forum for specific questions: |
Please post your timed queries in the forum. |
Slide Number 14
Acknowledgement |
Spoken Tutorial Project is funded by MHRD, Government of India. |
This is Rani from IIT Bombay. Thank you for joining. |