Applications-of-GeoGebra/C3/Statistics-using-GeoGebra/English
Visual Cue | Narration |
Slide Number 1
Title Slide |
Welcome to this tutorial on Statistics using GeoGebra. |
Slide Number 2
Learning Objectives |
In this tutorial, we will learn how to use GeoGebra to perform:
One Variable Analysis to calculate different statistical parameters. Two Variable Regression Analysis to estimate best fit line. Multiple Variable Analysis to calculate different statistical parameters. |
Slide Number 3
System Requirement |
Here I am using:
Ubuntu Linux Operating System version 16.04. GeoGebra 5.0.481.0-d. |
Slide Number 4
Pre-requisites |
To follow this tutorial, you should be familiar with:
GeoGebra interface. Statistics |
Slide Number 5
Statistics: Data analysis and interpretation. Measures of central tendency. Measures of Dispersion. Comparing variability of data series. Additional material. |
Statistics deals with
Data analysis and interpretation. Measures of central tendency. Measures of Dispersion. Comparing variability of data series. Please refer to additional material provided along with this tutorial. |
Slide Number 6
Fish Feed: A fishery is testing four feed formulations on its fish: A, B, C and D. Length (mm). Weight (lbs). Girth (mm). |
Fish Feed
Let us look at an example. A fishery is testing four types of feed formulations on its fish: A, B, C and D. Data to be collected after feeding the fish for 6 months are: Length in millimeters. Weight in pounds. Girth in millimeters. Let us look at some of these data. |
Slide Number 7
Fish Feed Data |
Fish Feed Data
We will use these data for our analyses. Please download the code file, Fishery-data, provided along with this tutorial. |
Show the GeoGebra window. | I have opened the GeoGebra interface. |
Click on View tool >> select Spreadsheet. | Click on View tool and select Spreadsheet. |
Click on X at top right corner of Graphics, Algebra views. | Click on X at top right corner of Graphics and Algebra views.
This will close these views. |
In the code file, drag mouse to highlight length and weight data from columns H and I.
Show data in columns H and I. Hold Ctrl key down and press C. |
In the code file, drag mouse to highlight length and weight data from columns H and I.
These are data for fish that have been fed formulation C. Hold Control key down and press C. |
Click on Spreadsheet view in GeoGebra. | Click at the top of the Spreadsheet in GeoGebra. |
Hold Ctrl key down and press V. | Hold Control key down and press V.
This will copy and paste the highlighted data from the code file into GeoGebra. |
Place the cursor on the first column header in Spreadsheet view.
Drag and adjust column A's width. |
Place the cursor on the first column header in Spreadsheet view.
Drag and adjust column A's width. |
Right-click on column A heading of Length (mm).
Select Object Properties. Point to dialog box. |
Right-click on column A heading of Length millimetres.
Select Object Properties. A dialog box opens. |
Click on Text tab and change the name to Length (mm)-C.
Close the dialog box. Similarly, add -C to Weight (lbs). |
Click on Text tab and change the name to Length millimetres hyphen C.
Close the dialog box. Similarly, add hyphen C to Weight pounds. |
Adjust column B width. | Adjust column B width. |
Use mouse to drag and highlight first column A’s length data and label in GeoGebra. | Click on column A heading of Length millimetres C.
Drag to highlight length data in Spreadsheet view. |
Below the menubar, click on One Variable Analysis tool.
Point to Data Source popup window. Click on Analyze button. |
Below the menubar, click on One Variable Analysis tool.
A Data Source popup window appears. Click on Analyze button. |
Point to Data Analysis window and histogram. | A Data Analysis window appears.
By default, a histogram is plotted. |
Drag the boundary to see the graph properly. | Drag the boundary to see the graph properly. |
Point to length on the x-axis and frequency on the y-axis. | The length is plotted on the x-axis.
The number of fish that are of a particular length, the frequency, is plotted on the y-axis. |
Point to the display box above the graph containing the word Histogram. | Note the display box above the graph containing the word Histogram. |
In the display box, click on the dropdown menu button. | In the display box, click on the dropdown menu button to display the list of plots. |
Select Histogram.
Point to slider to the right of the display. |
We will stay with the histogram option.
To the right of the dropdown menu is a slider. |
Drag the slider from left to right to go to 20. | Drag the slider from left to right to go to 20. |
Point to rectangles between minimum and maximum values of data. | The slider changes the number of rectangles between the minimum and maximum values of data. |
Click on Options button to the right of the slider. | Click on Options button to the right of the slider. |
Under Classes, check Set Classes Manually. | Under Classes, check Set Classes Manually check box.
This displays Start and Width text-boxes to the left of the Options button. |
Type 800 in the Start text-box and press Enter.
Show the value of 5 in the Width text-box. |
As all the fish are over 800 mm long, type 800 in the Start text-box and press Enter.
We will stay with the default value of 5 for rectangle width. |
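The Start and Width settings above can be sketched in Python as follows. This is only an illustration: the lengths are hypothetical values, not the Fishery-data file, the helper name `class_counts` is made up for this example, and the classes are assumed to be closed on the left, which may differ from GeoGebra's own boundary rule.

```python
# Hypothetical lengths in mm (illustrative only, not the Fishery-data values).
lengths = [812, 813, 818, 823, 824, 824, 827, 831, 836, 842]

def class_counts(data, start, width):
    """Count values per class [start + k*width, start + (k+1)*width)."""
    counts = {}
    for value in data:
        k = int((value - start) // width)   # index of the class holding value
        lower = start + k * width           # lower boundary of that class
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))

# Same settings as in the tutorial: Start = 800, Width = 5.
counts = class_counts(lengths, start=800, width=5)
for lower, n in counts.items():
    print(f"{lower} to {lower + 5} mm: {n} fish")
```

Each printed row corresponds to one rectangle of the histogram. A larger width gives fewer, taller rectangles.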
Uncheck Set Classes Manually. | Uncheck Set Classes Manually check box. |
Under Show, uncheck Histogram check box. | Under Show, uncheck Histogram check box to make it disappear. |
Scroll down and check Frequency Polygon to show it. | Scroll down and check Frequency Polygon to show it. |
Check Cumulative option as the Frequency Type. | Under Frequency Type, check Cumulative option. |
Point to default Count selection.
Point to the cumulative frequency count. |
The default Count selection shows the cumulative frequency count for the data. |
Drag slider, bring it back to 20. | Drag the slider and note the effects on smoothness of the cumulative frequency count curve.
We will drag the slider back to 20. |
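A cumulative frequency count is a running total of the class counts. A minimal sketch, using hypothetical class counts rather than the tutorial's data:

```python
from itertools import accumulate

# Hypothetical class lower boundaries (mm) and counts, illustrative only.
class_width = 5
class_lower_bounds = [810, 815, 820, 825, 830]
counts = [2, 1, 3, 1, 1]

# Running total: each entry is the number of fish up to the end of that class.
cumulative = list(accumulate(counts))

for lower, total in zip(class_lower_bounds, cumulative):
    print(f"below {lower + class_width} mm: {total} fish")
```

The last cumulative value always equals the total number of observations, which is why the cumulative curve never decreases.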
Under Frequency Type, uncheck Cumulative and under Show, uncheck Frequency Polygon. | Under Frequency Type, uncheck Cumulative and under Show, uncheck Frequency Polygon. |
Under Show, check Histogram >> click on Options button again to hide the window. | Under Show, check Histogram.
Then click on Options button again to hide the window. |
Click on Show Data tool >> point to data highlighted in the Spreadsheet. | Above the Histogram text-box, click on the third Show Data tool button.
This displays all the data highlighted in the Spreadsheet. |
Drag the boundary to see the data properly. | Drag the boundary to see the data properly. |
Click on Show Data tool again to hide the list. | Click on the Show Data tool again to hide the list. |
Click on Show 2nd Plot tool. | Above the Histogram text-box, click on the last Show 2nd Plot tool button. |
Select histogram for top plot and box plot for bottom plot. | The same data are graphed in two vertically placed plots.
You can select plot types from the dropdown menu button above each plot. |
Click on Show Statistics tool.
Point to Statistics for both plots. |
Above the Histogram text-box, click on the second Show Statistics tool button.
Statistics for the plots appear in a panel in the middle. |
Drag the boundary to see it properly. | Drag the boundary to see it properly. |
Slide Number 8
Box Plot |
Box Plot
A box plot is a standardized way of showing data, based on the five-number summary. |
Click and point to Median, Min, Max, Q_{1} and Q_{3} values in the box plot. | Let us compare histogram and box plot.
In the box plot, locate the Median, Min, Max, Q_{1} and Q_{3} values. |
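The five-number summary behind a box plot can be sketched with Python's standard `statistics` module. The lengths are hypothetical, and the quartile rule used here (the module's default "exclusive" method) is an assumption that may not match the rule GeoGebra applies, so Q1 and Q3 can differ slightly between tools.

```python
import statistics

# Hypothetical lengths in mm (illustrative only, not the Fishery-data values).
lengths = [812, 813, 818, 823, 824, 824, 827, 831, 836, 842]

# quantiles(..., n=4) returns the three cut points Q1, median and Q3.
q1, median, q3 = statistics.quantiles(lengths, n=4)

five_number_summary = {
    "Min": min(lengths),
    "Q1": q1,
    "Median": median,
    "Q3": q3,
    "Max": max(lengths),
}

for name, value in five_number_summary.items():
    print(f"{name}: {value}")
```

In the box plot, the box spans Q1 to Q3, the line inside the box marks the Median, and the whiskers extend towards Min and Max.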
Click on the button next to Options button above the plot. | Above each plot, in the upper right corner, click on the button next to Options.
A dropdown menu appears with which you can copy each plot to Clipboard or export it as an image. |
Click on Show Statistics tool button to hide the statistics. | Click on Show Statistics tool button to hide the statistics. |
Close the Data Analysis window. | Close the Data Analysis window. |
Slide Number 9
Least Squares Linear Regression (LSLR): Changing an independent variable x changes the dependent variable y. LSLR predicts y based on x value. LSRL (best fit line): y = b_{0} + b_{1}x. Coefficient of determination R^{2}. |
Least Squares Linear Regression (LSLR)
Changing an independent variable x changes the dependent variable y. LSLR predicts y based on the x value. The Least Squares Regression Line (LSRL) is also called the best fit line. It is given by y = b_{0} + b_{1}x, where b_{0} is the y-intercept and b_{1}, the slope, is the regression coefficient. The coefficient of determination is R squared. R squared ranges from 0 to 1. The closer R squared is to 1, the better the line accounts for the variance in y from x. |
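For reference, the standard textbook formulas for the quantities named on this slide are given below. The symbols n (number of data points), \bar{x} and \bar{y} (the means of x and y) and \hat{y}_i (the predicted value at x_i) are notation introduced here and do not appear in the tutorial.

```latex
\begin{aligned}
b_1 &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \\
b_0 &= \bar{y} - b_1 \bar{x}, \\
R^2 &= 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad \hat{y}_i = b_0 + b_1 x_i .
\end{aligned}
```

For a straight-line fit with one independent variable, R squared equals the square of the correlation coefficient r.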
Show length and weight data in the Spreadsheet in the GeoGebra. | Let us go back to the length and weight data in the Spreadsheet view in GeoGebra. |
Drag mouse to highlight all labels and data in the two columns. | Drag and select all the data in both columns. |
Under One Variable Analysis, click on Two Variable Regression Analysis tool. | Under One Variable Analysis, click on Two Variable Regression Analysis tool. |
Click Analyze button in the Data Source window that pops up. | In the Data Source window that pops up, click Analyze button. |
Data Analysis window appears. | A Data Analysis window appears with two plots. |
Show both plots. | By default, the upper plot is a Scatterplot and the lower a Residual plot. |
Click on Show Statistics tool to see Statistics. | Click on Show Statistics tool to see the Statistics. |
Drag the boundary to see them properly. | Drag the boundary to see them properly. |
Below Statistics window, click on the Regression Model menu button >> select Linear. | Below the Statistics window, click on the Regression Model menu button and select Linear. |
Point to the red line that is drawn through some points. | Note the red line in the Scatterplot. |
Point to the equation given in red, y = 0.08x - 48.39. | This is the best fit line: it minimizes the sum of the squared vertical distances between the points and the line.
Its equation is given in red at the bottom. |
Point to R^{2} value of 0.7722. | This R squared value indicates a good fit between the model and the actual data: about 77% of the variance in weight is accounted for by length. |
Select other regression models to see effects on R^{2}. | Select other regression models to see effects on the R squared value. |
Point to the lower Residual Plot. | The lower plot is the Residual Plot.
A residual is the difference between the observed and the predicted y value at a point. |
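A rough sketch of how residuals are obtained from a least squares line. The data pairs are hypothetical and `fit_line` is a helper written for this example, not a GeoGebra function.

```python
# Hypothetical (length in mm, weight in lbs) pairs, illustrative only.
lengths = [810.0, 820.0, 830.0, 840.0, 850.0]
weights = [16.0, 17.5, 17.0, 19.0, 19.5]

def fit_line(xs, ys):
    """Return (b0, b1) of the least squares line y = b0 + b1*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b1 = sxy / sxx
    return y_mean - b1 * x_mean, b1

b0, b1 = fit_line(lengths, weights)

# Residual = observed y minus the y predicted by the line at the same x.
residuals = [y - (b0 + b1 * x) for x, y in zip(lengths, weights)]

for x, res in zip(lengths, residuals):
    print(f"length {x:.0f} mm: residual {res:+.2f} lbs")
```

For a least squares line with an intercept, the residuals add up to zero. A residual plot with no visible pattern suggests that a straight line is a reasonable model.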
Click on Switch Axes button. | Above the Statistics window, click on the last Switch Axes button. |
Point to length now plotted on y-axis and weight on x-axis. | For the scatterplot, length is now plotted along y-axis and weight along x-axis. |
Point to the best fit line and statistics.
Point to equation y = 9.91x + 684.3. |
Observe that the best fit line and many statistics change.
Its equation is now y = 9.91x + 684.3. |
Point to r, R^{2} and rho (ρ). | The only statistics that remain the same are r, R squared and rho (ρ).
Note that r and rho are greater than 0.8, indicating a strong positive correlation. Weight increases as length increases for fish given feed C. The relationship is well described by the best fit lines. |
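Why r and R squared survive the axis switch can be checked with a small sketch. The data are hypothetical and `slope_and_r` is a helper written for this example. The correlation coefficient treats x and y symmetrically, while the slope does not, and the product of the two slopes equals r squared.

```python
from math import sqrt

# Hypothetical (length in mm, weight in lbs) pairs, illustrative only.
lengths = [810.0, 820.0, 830.0, 840.0, 850.0]
weights = [16.0, 17.5, 17.0, 19.0, 19.5]

def slope_and_r(xs, ys):
    """Return the least squares slope of y on x and the correlation r."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / sxx, sxy / sqrt(sxx * syy)

slope_weight_on_length, r_before = slope_and_r(lengths, weights)
slope_length_on_weight, r_after = slope_and_r(weights, lengths)  # axes switched

print(f"r before: {r_before:.4f}, r after: {r_after:.4f}")
print(f"product of slopes: {slope_weight_on_length * slope_length_on_weight:.4f}")
print(f"r squared:         {r_before ** 2:.4f}")
```

With the rounded values shown in the tutorial, 0.08 × 9.91 is about 0.79, close to the reported R squared of 0.7722. The small gap is consistent with 0.08 being a rounded slope.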
Click on Switch Axes button. | Again, click on Switch Axes button. |
Point to Symbolic Evaluation at the bottom. | At the bottom, in Symbolic Evaluation, you can enter a value for x to get a prediction for y. |
Point at the line in the Scatterplot. | To get logical predictions, we will enter x values above the x-intercept. |
In Symbolic Evaluation, type in a value for x >> press Enter. | In Symbolic Evaluation, in the text-box for x, type 800 and press Enter. |
Point to y value appearing next to the display box. | Note that a y value appears next to the display box.
The x value was substituted in the best fit line equation to get the y value. |
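The substitution can be reproduced by hand. This sketch uses the rounded coefficients displayed in the tutorial (y = 0.08x - 48.39), so the value GeoGebra shows, which is based on unrounded coefficients, may differ somewhat. The function name `predict_weight` is made up for this example.

```python
def predict_weight(length_mm, b0=-48.39, b1=0.08):
    """Substitute x into y = b0 + b1*x using the rounded tutorial coefficients."""
    return b0 + b1 * length_mm

predicted = predict_weight(800)
print(f"predicted weight at 800 mm: {predicted:.2f} lbs")

# x-intercept: the length at which the line predicts zero weight.
# Below this length the prediction is negative and therefore not meaningful.
x_intercept = 48.39 / 0.08
print(f"x-intercept: {x_intercept:.1f} mm")
```

This is why the narration recommends entering x values above the x-intercept.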
Click on Show Statistics tool button. | Again, click on Show Statistics tool button. |
Close the Data Analysis window. | Close the Data Analysis window. |
Point to length and weight data in the Spreadsheet. | Let’s go back to the length and weight data in the Spreadsheet. |
Drag mouse to highlight all labels and data in the two columns. | In the Spreadsheet, select all the data in both columns. |
Under One Variable Analysis, click on Multiple Variable Analysis tool. | Under One Variable Analysis, click on Multiple Variable Analysis tool. |
Click Analyze button in the Data Source window that pops up. | In the Data Source window that pops up, click Analyze button. |
Point to Box Plots in the window and to the cell numbers in each row. | Box Plots appear in the window.
They are for length and weight data. |
Click on Show Statistics tool.
Point to Statistics for both plots. |
Above the plot, click on the second Show Statistics tool.
Statistics for both plots appear below. |
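A per-column summary of this kind can be sketched with Python's standard `statistics` module. The values are hypothetical, and the set of statistics below is only a sample of what a statistics panel might list, not a reproduction of GeoGebra's output.

```python
import statistics

# Hypothetical columns (illustrative only, not the Fishery-data values).
columns = {
    "Length (mm)-C": [810.0, 820.0, 830.0, 840.0, 850.0],
    "Weight (lbs)-C": [16.0, 17.5, 17.0, 19.0, 19.5],
}

# Compute the same set of statistics for every selected column.
summary = {
    name: {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sample sd": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }
    for name, values in columns.items()
}

for name, stats in summary.items():
    print(name)
    for label, value in stats.items():
        print(f"  {label}: {value:.2f}")
```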
Place the cursor on the boundary between the plot and statistics.
When the arrow appears, drag the boundary to resize the windows. |
Place the cursor on the boundary between the plot and statistics.
When the arrow appears, drag the boundary to resize the windows. |
Let us summarize. | |
Slide Number 10
Summary |
In this tutorial, we have learnt how to use GeoGebra to perform:
One Variable Analysis to calculate different statistical parameters. Two Variable Regression Analysis to estimate best fit line. Multiple Variable Analysis to calculate different statistical parameters. |
Slide Number 11
Assignment Perform statistical analyses for weight and girth data. Is any of the oils absorbed more than the others? |
Assignment
Perform statistical analyses for weight and girth data given in this tutorial. Four oils were used to deep fry chips. Amount of absorbed fat was measured for 6 chips fried in 4 oils. Is any of the oils absorbed more than the others? |
Slide Number 12
About Spoken Tutorial project |
The video at the following link summarizes the Spoken Tutorial project.
Please download and watch it. |
Slide Number 13
Spoken Tutorial workshops |
The Spoken Tutorial Project team conducts workshops and gives certificates.
For more details, please write to us. |
Slide Number 14
Forum for specific questions: Do you have questions in THIS Spoken Tutorial? Please visit this site. Choose the minute and second where you have the question. Explain your question briefly. Someone from our team will answer them. |
Please post your timed queries on this forum. |
Slide Number 15
Acknowledgement |
Spoken Tutorial Project is funded by NMEICT, MHRD, Government of India.
More information on this mission is available at this link. |
This is Vidhya Iyer from IIT Bombay, signing off.
Thank you for joining. |