Difference between revisions of "Gnuplot/C2/Statistics-and-box-plot/English"
Line 21: | Line 21: | ||
|| '''Slide Number 3''' | || '''Slide Number 3''' | ||
'''Learning Objectives''' | '''Learning Objectives''' | ||
− | || | + | || and |
* Specify the position of boxplot on x-axis | * Specify the position of boxplot on x-axis | ||
Line 29: | Line 29: | ||
|| To record this tutorial, I am using | || To record this tutorial, I am using | ||
− | * '''Ubuntu Linux''' | + | * '''Ubuntu Linux''' version 16.04 OS |
− | * '''gnuplot''' | + | * '''gnuplot''' version 5.2.6 |
− | * '''Gedit''' | + | * '''Gedit''' version 3.18 |
|- | |- | ||
Line 45: | Line 45: | ||
'''Code Files''' | '''Code Files''' | ||
|| The files used in this tutorial are provided in the '''Code files''' link. | || The files used in this tutorial are provided in the '''Code files''' link. | ||
+ | |||
Please download and extract them. | Please download and extract them. | ||
Line 58: | Line 59: | ||
|- | |- | ||
|| Hover mouse over first column. | || Hover mouse over first column. | ||
+ | |||
Hover mouse over second column. | Hover mouse over second column. | ||
+ | |||
Hover mouse over third column. | Hover mouse over third column. | ||
|| The first column is row number. | || The first column is row number. | ||
− | The second column is '''y''' | + | |
+ | The second column is '''y''' data in frequencies. | ||
The third column is the x data, which has the corresponding string or names. | The third column is the x data, which has the corresponding string or names. | ||
Line 80: | Line 84: | ||
|| Press '''Ctrl+L''' . | || Press '''Ctrl+L''' . | ||
|| I will clear the screen. | || I will clear the screen. | ||
+ | |||
Let's draw an '''xy''' plot labeling the x axis with strings. | Let's draw an '''xy''' plot labeling the x axis with strings. | ||
Line 85: | Line 90: | ||
|| Enter the command, '''set xtics rotate '''. | || Enter the command, '''set xtics rotate '''. | ||
|| Enter the commands as seen on the screen. | || Enter the commands as seen on the screen. | ||
+ | |||
I will rotate the '''x tics labels''' by 90 degrees. | I will rotate the '''x tics labels''' by 90 degrees. | ||
Line 98: | Line 104: | ||
|| Hover mouse on x tics strings and graph. | || Hover mouse on x tics strings and graph. | ||
|| Notice the graphical plot of '''x string''' data against the numeric '''y axis''' data. | || Notice the graphical plot of '''x string''' data against the numeric '''y axis''' data. | ||
+ | |||
An example of string data is teacher plotting marks against student names. | An example of string data is teacher plotting marks against student names. | ||
Revision as of 19:03, 6 February 2020
Visual Cue | Narration |
Slide Number 1
Title Slide |
Welcome to the tutorial on Statistics and Box plot. |
Slide Number 2
Learning Objectives |
In this tutorial, we will
|
Slide Number 3
Learning Objectives |
and
* Specify the position of boxplot on x-axis |
Slide Number 4
System and Software Requirement |
To record this tutorial, I am using
* Ubuntu Linux version 16.04 OS
* gnuplot version 5.2.6
* Gedit version 3.18 |
Slide Number 5
Pre-requisites |
To follow this tutorial,
|
Slide Number 6
Code Files |
The files used in this tutorial are provided in the Code files link.
Please download and extract them. |
Go to Desktop.
Show file icon statistics.txt on Desktop. |
I have saved the input file on Desktop. |
Screenshot of file opened in gedit. | The input file statistics.txt consists of 3 columns. |
Hover mouse over first column.
Hover mouse over second column. Hover mouse over third column. |
The first column is the row number.
The second column is the y data in frequencies. The third column is the x data, which holds the corresponding strings or names. |
Press Ctrl+Alt+T. | Open a terminal. |
Enter the command, cd Desktop . | Change the directory to Desktop. |
Enter the command gnuplot . | Let’s open gnuplot. |
Press Ctrl+L . | I will clear the screen.
Let's draw an xy plot labeling the x axis with strings. |
Enter the command, set xtics rotate . | Enter the commands as seen on the screen.
I will rotate the x tic labels by 90 degrees. |
Enter the command, set autoscale . | Use the command set autoscale to let gnuplot choose the axis ranges automatically. |
Enter the command, plot "statistics.txt" using 2:xticlabels(3) . | The next command plots the string x data against y to make a 2D plot. |
Hover mouse on x tics strings and graph. | Notice the graphical plot of x string data against the numeric y axis data.
An example of string data is a teacher plotting marks against student names. |
Cursor on the graphics window. | Analysis of such data often involves the calculation of statistical parameters. |
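The commands in this step can be collected into one short script. This is only a sketch: the inline datablock `$Sample` and its three rows are made up for illustration and merely imitate the layout of statistics.txt (row number, y value, string label). The three commands themselves are the ones entered above.

```gnuplot
# Hypothetical sample in the same 3-column layout as statistics.txt:
# row number, y value, string label. The values are made up.
$Sample << EOD
1 120 alpha
2 340 beta
3 210 gamma
EOD

set xtics rotate    # rotate the x tic labels by 90 degrees
set autoscale       # let gnuplot choose both axis ranges
# Column 2 supplies y; column 3 supplies the text shown at each x tic.
plot $Sample using 2:xticlabels(3)
```

To plot the tutorial's own data, replace `$Sample` with "statistics.txt" in the plot command.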
Close the graphics window. | Close the graphics window. |
Go to the terminal. | Go to the terminal and type a command as seen on the screen. |
Type stats "statistics.txt" using 2 and press Enter. | The stats command, followed by the filename and using with a column number, calculates statistics for that column. |
Output is seen on the screen. | A statistical summary is generated on the screen. |
Scroll up the page. | Let's scroll up. |
Hover mouse over 61. | This shows that the file has 61 data points. |
Hover mouse over std dev.
Hover mouse over Sum. Hover mouse over max and min values. |
The output shows the statistical summary for the input file.
Mean, standard deviation and sum of squares are seen. Minimum and maximum values are also seen. The median and quartile range are shown as well. |
Hover mouse next to quartile range again. | We can plot the statistical analysis using a candlestick plot or a box plot.
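Besides printing the summary, gnuplot stores the results of stats in variables that begin with STATS_, so the values can be reused instead of being copied by hand. A minimal sketch, assuming statistics.txt is in the current directory as in this tutorial:

```gnuplot
# Compute statistics for column 2 without printing the full summary.
stats "statistics.txt" using 2 nooutput

# Print selected results; STATS_ is gnuplot's default variable prefix.
print "records:   ", STATS_records
print "mean:      ", STATS_mean
print "std dev:   ", STATS_stddev
print "min, max:  ", STATS_min, STATS_max
print "median:    ", STATS_median
print "quartiles: ", STATS_lo_quartile, STATS_up_quartile
```

The expressions STATS_mean - STATS_stddev and STATS_mean + STATS_stddev give the box limits of a candlestick plot whose box height corresponds to the standard deviation.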
This is useful for descriptive or informative analysis. The height of the box can correspond to either standard deviation or quartile range. |
Open gedit. | We will create a data file to make a candlestick plot with this data.
Open a gedit window and enter the values as seen here. |
Type #candlestick plot style and start a new line. | I will make a comment on the first row.
This indicates that the data is for the candlestick plot style. |
Type,
#x-position tab (mean-stddev) tab y-min tab y-max tab (mean+stddev) tab with candlesticks . Press Enter. |
I will also include the data format for this plot, with tab separation.
For further information on candlestick plots, use the gnuplot help section. |
Type 1, Press tab, type 300 . | In the next line, enter the values for plotting.
The first column is an arbitrary x value. The candlestick data will be plotted at this x position. 300 is the value of the mean minus the standard deviation. This defines the lower limit of the box. |
Press tab, type 119, press tab, type 2965. | The y minimum in the data is 119.
The fourth input is the y maximum value, which is 2965. |
Press tab, type 1582. | Enter the value of the mean plus the standard deviation as 1582. This defines the upper limit of the box. |
Save file in Desktop directory.
Give filename candlestick.dat |
Save the file on Desktop with the file name candlestick.dat . |
Click on save. | Click on the save button to save the file. |
Close Gedit. | I will close Gedit. |
Go to gnuplot. | Go back to the terminal . |
Press Ctrl+L . | I will also clear the screen. |
Type,
set xrange [0.97:1.03] and press Enter. |
Let’s also set the x axis limits with the set xrange command, as seen. |
Enter the command, plot 'candlestick.dat' using 1:2:3:4:5 with candlesticks . | Plot the file with the command as seen on the screen. |
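The data file and the two commands above can also be tried together in one script. In this sketch the inline datablock `$Candle` (a name chosen here, not part of the tutorial) stands in for candlestick.dat and holds the same row of values. The columns are separated by spaces instead of tabs, which gnuplot also accepts.

```gnuplot
# Column order expected by "with candlesticks":
# x-position  (mean-stddev)  y-min  y-max  (mean+stddev)
$Candle << EOD
1 300 119 2965 1582
EOD

set xrange [0.97:1.03]    # narrow x range around the box at x = 1
# Columns 2 and 5 give the box limits, columns 3 and 4 the whisker ends.
plot $Candle using 1:2:3:4:5 with candlesticks
```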
Cursor on the graphics window. | The candlestick plot appears on the graphics window.
Often, we want the graph to show outliers and to use the quartile range for the box height. |
Close the graphics window. | Let us see how to do this.
Close the graphics window. |
Enter the command, set autoscale . | At the gnuplot prompt, set autoscale for the axis ranges. |
Enter the command, set style data boxplot. | Set the box plot style for the graph as seen on the screen.
This command sets the default plot style to boxplot. |
Enter the command, set style boxplot outliers pointtype 7 . | The next command sets the boxplot style to show outliers, drawn with point type 7. |
Enter the command, set style fill solid 0.5 border -1 . | Set a solid style color fill for the box. |
Enter the command, plot 'statistics.txt' using (0):2 ls 1 notitle . | Type the plot command as seen on the screen.
The box plot will be placed at x axis position zero. |
Graphics screen shows. | In the graph, notice that the outliers are also plotted.
Outliers are data points that lie far outside the quartile range of the data set. |
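The box plot commands above can be tried on a small data set as one script. This is a sketch under stated assumptions: the datablock `$Sample` and its values are made up, with one value placed far from the rest so that it is drawn as an outlier. By default gnuplot extends the whiskers to the most distant point within 1.5 times the interquartile range and treats points beyond that as outliers.

```gnuplot
# Hypothetical one-column sample; 95 is deliberately far from the rest.
$Sample << EOD
12
15
14
13
16
15
14
95
EOD

set autoscale
set style data boxplot
set style boxplot outliers pointtype 7    # show outliers with point type 7
set style fill solid 0.5 border -1        # half-density solid fill, black border
# (0) places the box at x = 0; column 1 holds the values.
plot $Sample using (0):1 ls 1 notitle
```

For the tutorial's file, the plot command is the one shown above, with 'statistics.txt' and column 2 in place of `$Sample` and column 1.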
Type,
set style boxplot nooutliers and press Enter. |
If the outliers are not to be plotted, do the following.
Go to the gnuplot terminal and enter the command as seen on the screen. |
Type
replot and press Enter. |
Replot to see the results. |
Close graph.
Enter q to quit gnuplot |
Close the graphics window and quit gnuplot. |
Slide Number 7
Summary |
Now let's summarize.
In this tutorial, we
|
Slide Number 8
Summary |
* Specified the x-axis position for the box plot |
Slide Number 9
Assignment 1 http://gnuplot.sourceforge.net/demo_5.2/ |
For the assignment activity, please do the following.
Practice the box plot with and without outliers for the file boxplot.txt. Practice and understand the example boxplot styles from the gnuplot website. |
Slide Number 10
Assignment 2 Draw a time-activity bar chart |
We will do one more assignment.
|
Glimpse of assignment. | Your assignment may look similar to this. |
Slide Number 11
Spoken Tutorial Project |
This video summarises the Spoken Tutorial Project.
Please download and watch it. |
Slide Number 12
Spoken Tutorial workshops |
The Spoken Tutorial Team
For more details, please write to us. |
Slide Number 13
Forum for specific questions: |
Please post your timed queries in the forum. |
Slide Number 14
Acknowledgement |
Spoken Tutorial Project is funded by MHRD, Government of India. |
This is Rani from IIT Bombay. Thank you for joining. |