Applications-of-GeoGebra/C3/Statistics-using-GeoGebra/English-timed
| Time | Narration |
| 00:01 | Welcome to this tutorial on Statistics using GeoGebra. |
| 00:06 | In this tutorial, we will learn how to use GeoGebra to perform: |
| 00:12 | One Variable Analysis to calculate different statistical parameters. |
| 00:18 | Two Variable Regression Analysis to estimate best fit line. |
| 00:23 | Multiple Variable Analysis to calculate different statistical parameters. |
| 00:29 | Here I am using:
Ubuntu Linux Operating System version 16.04. |
| 00:36 | GeoGebra 5.0.481.0 hyphen d. |
| 00:43 | To follow this tutorial, you should be familiar with:
GeoGebra interface, Statistics |
| 00:51 | Statistics deals with
Data analysis and interpretation. |
| 00:56 | Measures of central tendency. |
| 00:59 | Measures of Dispersion. |
| 01:02 | Comparing variability of data series. |
| 01:06 | Please refer to additional material provided along with this tutorial. |
| 01:12 | Fish Feed |
| 01:14 | Let us look at an example. |
| 01:17 | A fishery is testing four types of feed formulations on its fish: A, B, C and D. |
| 01:26 | Data to be collected after feeding the fish for 6 months are:
Length in millimeters, Weight in pounds, Girth in millimeters. |
| 01:39 | Let us look at some of these data. |
| 01:42 | Fish Feed Data |
| 01:44 | We will use these data for our analyses. |
| 01:49 | Please download the code file, Fishery-data, provided along with this tutorial. |
| 01:57 | I have opened the GeoGebra interface. |
| 02:01 | Click on View tool and select Spreadsheet. |
| 02:07 | Click on X at top right corner of Graphics and Algebra views.
This will close these views. |
| 02:17 | In the code file, drag mouse to highlight length and weight data from columns H and I. |
| 02:26 | These are data for fish that have been fed formulation C. |
| 02:32 | Hold Control key down and press C. |
| 02:36 | Click in the top of the Spreadsheet in GeoGebra. |
| 02:41 | Hold Control key down and press V. |
| 02:45 | This will copy and paste the highlighted data from the code file into GeoGebra. |
| 02:52 | Place the cursor on the first column header in Spreadsheet view. |
| 02:58 | Drag and adjust column A's width. |
| 03:02 | Right-click on column A heading of Length millimetres. |
| 03:08 | Select Object Properties. |
| 03:11 | A dialog box opens. |
| 03:14 | Click on Text tab and change the name to Length millimetres hyphen C. |
| 03:23 | Close the dialog box. |
| 03:26 | Similarly, add hyphen C to Weight pounds. |
| 03:35 | Adjust column B width. |
| 03:38 | Click on column A heading of Length millimetres C. |
| 03:44 | Drag to highlight length data in Spreadsheet view. |
| 03:49 | Below the menubar, click on One Variable Analysis tool. |
| 03:55 | A Data Source popup window appears.
Click on Analyze button. |
| 04:02 | A Data Analysis window appears. |
| 04:06 | By default, a histogram is plotted. |
| 04:10 | Drag the boundary to see the graph properly. |
| 04:14 | The length is plotted on the x-axis. |
| 04:18 | The number of fish that are of a particular length, the frequency, is plotted on the y-axis. |
| 04:26 | Note the display box above the graph containing the word Histogram. |
| 04:32 | In the display box, click on the dropdown menu button to display the list of plots. |
| 04:39 | We will stay with the histogram option. |
| 04:43 | To the right of the dropdown menu is a slider. |
| 04:48 | Drag the slider from left to right to go to 20. |
| 04:53 | The slider changes the number of rectangles between the minimum and maximum values of data. |
| 05:01 | Click on Options button to the right of the slider. |
| 05:06 | Under Classes, check Set Classes Manually check box. |
| 05:12 | This displays Start and Width text-boxes to the left of the Options button. |
| 05:19 | As all the fish are over 800 millimeters long, type 800 in the Start text-box and press Enter. |
| 05:29 | We will stay with the default value of 5 for rectangle width. |
| 05:35 | Uncheck Set Classes Manually check box. |
| 05:39 | Under Show, uncheck Histogram check box to make it disappear. |
| 05:45 | Scroll down and check Frequency Polygon to show it. |
| 05:51 | Under Frequency Type, check Cumulative option. |
| 05:56 | The default Count selection shows the cumulative frequency count for the data. |
| 06:03 | Drag the slider and note the effects on smoothness of the cumulative frequency count curve. |
| 06:11 | We will drag the slider back to 20. |
| 06:15 | Under Frequency Type, uncheck Cumulative and under Show, uncheck Frequency Polygon. |
| 06:24 | Under Show, check Histogram.
And click on Options button again to hide the window. |
| 06:33 | Above the Histogram text-box, click on the third Show Data tool button. |
| 06:40 | This displays all the data highlighted in the Spreadsheet. |
| 06:45 | Drag the boundary to see the data properly. |
| 06:49 | Click on the Show Data tool again to hide the list. |
| 06:55 | Above the Histogram text-box, click on the last Show 2nd Plot tool button. |
| 07:02 | The same data are graphed in two vertically placed plots. |
| 07:07 | You can select plot types from the dropdown menu button above each plot. |
| 07:14 | Above the Histogram text box, click on the second Show Statistics tool button. |
| 07:22 | Statistics for the plot appear as a panel in the middle. |
| 07:27 | Drag the boundary to see it properly. |
| 07:31 | Box Plot
Box plot is a standardized way of showing data, based on the five number summary. |
| 07:41 | Let us compare histogram and box plot. |
| 07:45 | In the box plot, locate the Median, Min, Max, Q1 and Q3 values. |
| 07:57 | Above each plot, in the upper right corner, click on the button next to Options. |
| 08:06 | A dropdown menu appears with which you can copy each plot to Clipboard or export it as an image. |
| 08:15 | Click on Show Statistics tool button to hide the statistics. |
| 08:21 | Close the Data Analysis window. |
| 08:25 | Least Squares Linear Regression (LSLR) |
| 08:30 | Changing an independent variable x changes the dependent variable y. |
| 08:36 | LSLR predicts y based on x value. |
| 08:41 | Least Squares Regression Line (LSRL) is also called the best fit line. |
| 08:49 | It is given by y = b0 + b1x. |
| 08:55 | b1, the slope, is the regression coefficient. |
| 09:00 | Coefficient of determination R squared |
| 09:04 | R squared ranges from 0 to 1. |
| 09:08 | The closer R squared is to 1, the more of the variance in y is explained by x. |
| 09:16 | Let us go back to the length and weight data in the Spreadsheet view in GeoGebra. |
| 09:23 | Drag and select all the data in both columns. |
| 09:29 | Under One Variable Analysis, click on Two Variable Regression Analysis tool. |
| 09:36 | In the Data Source window that pops up, click Analyze button. |
| 09:41 | A Data Analysis window appears with two plots. |
| 09:46 | By default, the upper plot is a Scatterplot and the lower a Residual plot. |
| 09:54 | Click on Show Statistics tool to see the Statistics. |
| 10:00 | Drag the boundary to see them properly. |
| 10:04 | Below the Statistics window, click on the Regression Model menu button and select Linear. |
| 10:13 | Note the red line in the Scatterplot. |
| 10:17 | This is the best fit line, which minimizes the sum of the squared vertical distances from the points to the line. |
| 10:23 | Its equation is given in red at the bottom. |
| 10:28 | This R squared value indicates good fit between the model and the actual data. |
| 10:36 | Select other regression models to see effects on the R squared value. |
| 10:43 | The lower plot is the Residual Plot. |
| 10:47 | Residuals are the differences between the observed and predicted y values, one for each point. |
| 10:54 | Above the Statistics window, click on the last Switch Axes button. |
| 11:00 | For the scatterplot, length is now plotted along the y-axis and weight along the x-axis. |
| 11:08 | Observe that the best fit line and many statistics change. |
| 11:13 | Its equation is now y = 9.91x + 684.3. |
| 11:22 | The only statistics that remain the same are r, R squared and rho (ρ). |
| 11:31 | Note that r and rho are greater than 0.8, indicating strong positive correlation. |
| 11:39 | Weight increases as length increases for fish given feed C. |
| 11:46 | The relationship is strong and well predicted by the best fit lines. |
| 11:52 | Again, click on Switch Axes button. |
| 11:56 | At the bottom, in Symbolic Evaluation, you can enter a value for x to get a prediction for y. |
| 12:04 | To get meaningful predictions, we will enter x values above the x-intercept. |
| 12:11 | In Symbolic Evaluation, in the text-box for x, type 800 and press Enter. |
| 12:19 | Note that a y value appears next to the display box. |
| 12:25 | The x value was substituted in the best fit line equation to get the y value. |
| 12:32 | Again, click on Show Statistics tool button. |
| 12:37 | Close the Data Analysis window. |
| 12:40 | Let’s go back to the length and weight data in the Spreadsheet. |
| 12:45 | In the Spreadsheet, select all the data in both columns. |
| 12:51 | Under One Variable Analysis, click on Multiple Variable Analysis tool. |
| 12:59 | In the Data Source window that pops up, click Analyze button. |
| 13:04 | Box Plots appear in the window. |
| 13:07 | They are for length and weight data. |
| 13:11 | Above the plot, click on the second Show Statistics tool. |
| 13:17 | Statistics for both plots appear below. |
| 13:21 | Place the cursor on the boundary between the plot and statistics. |
| 13:27 | When the arrow appears, drag the boundary to resize the windows. |
| 13:34 | Let us summarize. |
| 13:36 | In this tutorial, we have learnt how to use GeoGebra to perform: |
| 13:41 | One Variable Analysis to calculate different statistical parameters. |
| 13:47 | Two Variable Regression Analysis to estimate best fit line. |
| 13:52 | Multiple Variable Analysis to calculate different statistical parameters. |
| 13:58 | Assignment
Perform statistical analyses for weight and girth data given in this tutorial. |
| 14:07 | Four oils were used to deep fry chips. |
| 14:11 | Amount of absorbed fat was measured for 6 chips fried in 4 oils. |
| 14:19 | Is any of the oils absorbed more than the others? |
| 14:24 | The video at the following link summarizes the Spoken Tutorial project.
Please download and watch it. |
| 14:32 | The Spoken Tutorial Project team conducts workshops and gives certificates.
For more details, please write to us. |
| 14:42 | Please post your timed queries on this forum. |
| 14:46 | Spoken Tutorial Project is funded by NMEICT, MHRD, Government of India.
More information on this mission is available at this link. |
| 14:59 | This is Vidhya Iyer from IIT Bombay, signing off.
Thank you for joining. |
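
For readers who want to check the regression ideas from the narration outside GeoGebra, the following is a minimal Python sketch, not GeoGebra's own implementation. It fits the line y = b0 + b1x by least squares, computes R squared, and shows that R squared stays the same when the axes are switched. The function names `least_squares_line` and `r_squared` are hypothetical helpers, and the length and weight values are made-up illustrative numbers, not values from the Fishery-data file.

```python
from statistics import mean


def least_squares_line(xs, ys):
    """Return (b0, b1) for the least squares line y = b0 + b1*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Slope b1 = sum of cross-deviations / sum of squared x deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    # The least squares line always passes through (x_bar, y_bar).
    b0 = y_bar - b1 * x_bar
    return b0, b1


def r_squared(xs, ys):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    b0, b1 = least_squares_line(xs, ys)
    y_bar = mean(ys)
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical illustrative values, NOT taken from the Fishery-data file.
lengths = [810, 820, 835, 850, 860, 875]      # millimeters
weights = [12.5, 13.9, 15.1, 16.8, 17.4, 19.6]  # pounds

b0, b1 = least_squares_line(lengths, weights)

# Residuals: observed y minus predicted y, one per point.
residuals = [y - (b0 + b1 * x) for x, y in zip(lengths, weights)]

# Prediction for a chosen x, as in the Symbolic Evaluation box.
predicted_weight = b0 + b1 * 840

# R squared with the original axes and with the axes switched.
r2_xy = r_squared(lengths, weights)
r2_yx = r_squared(weights, lengths)

print(f"best fit line: y = {b0:.4f} + {b1:.4f}x")
print(f"predicted y at x = 840: {predicted_weight:.4f}")
print(f"R squared (y on x): {r2_xy:.6f}")
print(f"R squared (x on y): {r2_yx:.6f}")
```

Switching the axes changes the slope and intercept but leaves R squared unchanged, which matches what the Switch Axes button shows. The residuals of a least squares line with an intercept sum to zero, up to rounding.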