Applications-of-GeoGebra/C3/Probability-and-Distributions/English

From Script | Spoken-Tutorial
Revision as of 12:15, 31 October 2018 by Vidhya (Talk | contribs)

Visual Cue Narration
Slide Number 1

Title Slide

Welcome to this tutorial on Probability and Distributions in GeoGebra.
Slide Number 2

Learning Objectives

In this tutorial, we will:

Learn how to use Probability Calculator in GeoGebra

Look at different distributions and parameters.

Slide Number 3

System Requirement

Here I am using:

Ubuntu Linux OS version 16.04

GeoGebra 5.0.481.0-d

Slide Number 4

Pre-requisites

To follow this tutorial, you should be familiar with

GeoGebra interface

Statistics

Slide Number 5

Fish Feed

A fishery is testing four types of feed formulation on its fish: A, B, C and D

Length (mm) Weight (lbs) Girth (mm)

Fish Feed

Let us look at an example.

A fishery is testing four types of feed formulation on its fish: A, B, C and D. Data to be collected after feeding the fish for 6 months are:

Length in millimeters

Weight in pounds

Girth in millimeters

Let’s look at some of these data.

Slide Number 6

Fish Feed Data

Fish Feed Data

We will use these data for our analyses.

Please download the code file, Fishery-data, provided along with this tutorial.

Slide Number 7

Probability

Probability of an event P(A), from 0 to 1

P(A) is ratio of frequency of event A to number of trials

Sampling distribution (normal, t etc)

Probabilities compare 2 independent sample proportions or means

Probability

Probability of an event P(A) is the likelihood that event A will occur. P(A) lies between 0 and 1.

P(A) is the ratio of frequency of event A to the number of trials.

Statistics are calculated for each sample. The probability distribution of these statistics is called a sampling distribution. Examples are the normal and t distributions.

Probabilities are used to compare two independent sample proportions or means.
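
As a small illustration of the ratio definition of P(A), the following Python sketch estimates the probability of an event from a list of trial outcomes. The lengths and the 750 mm threshold are hypothetical values chosen only for illustration; they are not taken from the Fishery-data file.

```python
# Hypothetical lengths (mm) of 10 fish; NOT the tutorial's data.
lengths = [712, 731, 745, 748, 752, 760, 766, 771, 780, 790]


def relative_frequency(outcomes, event):
    """Estimate P(A) as (number of trials in which A occurs) / (number of trials)."""
    hits = sum(1 for x in outcomes if event(x))
    return hits / len(outcomes)


# Event A: a fish is longer than 750 mm (the threshold is an assumption).
p_a = relative_frequency(lengths, lambda mm: mm > 750)
print(p_a)
```

Six of the ten hypothetical fish are longer than 750 mm, so the estimate lies between 0 and 1, as any probability must.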

Slide Number 8

Hypothesis testing

Statistical hypothesis

Hypothesis testing: H0, Ha

z-, t-, F-tests etc

Hypothesis testing

A statistical hypothesis is an assumption about a population parameter

Hypothesis testing asks: should a statistical hypothesis be accepted or rejected?

H zero, the null hypothesis, says observations arise from pure chance.

Ha, the alternative hypothesis, says observations arise due to non-random causes.

z-, t- and F-tests test population parameters in different situations. 
Please refer to additional material provided along with this tutorial.
Show the GeoGebra window. I have opened the GeoGebra interface.
Click on View tool and select Spreadsheet. Click on View tool and select Spreadsheet.
Click on X at top right corner of Graphics and Algebra views. Click on X at top right corner of Graphics and Algebra views.

This will close these views.

In the code file, use the mouse to highlight length data in column B. In the code file, use the mouse to highlight length data in column B.
Hold Ctrl key down and press C to copy the data. Hold Control key down and press C to copy the data.
Click on Spreadsheet view in GeoGebra. Click in the top of the Spreadsheet in GeoGebra.
Hold Ctrl key down and press V. Hold Control key down and press V.
Point to the data pasted into GeoGebra. This will copy and paste the highlighted data from the code file into GeoGebra.
Drag and adjust the column width. Drag and adjust the column width.
Click on Text and change name to Length (mm)-A.

Close the dialog box.

As shown earlier in the series, right-click on the heading.

Change the name to Length (mm) hyphen A.

Close the dialog box.

Adjust the column width. Adjust the column width.
Highlight data in columns E, H and K.

Copy and paste data from code file into GeoGebra.

Change names of data from columns E, H and K to:

Length (mm)-B

Length (mm)-C

Length (mm)-D

Repeat this with data in columns E, H and K.
Drag mouse to highlight all labels and data in the four columns. Select all data in the four columns by dragging.
Under the menubar, under One Variable Analysis, click on Multiple Variable Analysis. Under the menubar, under One Variable Analysis, click on Multiple Variable Analysis.
Show Data Source popup window.

Click on Analyze button.

A Data Source popup window appears.

Click on Analyze button.

Show the Data Analysis window. A Data Analysis window appears.
Drag the boundary to see it properly. Drag the boundary to see it properly.
Point to Stacked box plots appearing in all four columns. Stacked box plots appear for data for all four columns.
Click anywhere in the GeoGebra window and then click on Show Statistics tool.

Point to Statistics displayed below the box plots.

Click anywhere in the GeoGebra window and then click on Show Statistics tool.

Statistics are displayed below the box plots.

Above the statistics, click on menu button next to the display.

Select ANOVA.

Above the statistics, click on menu button next to the word Statistics.

Select ANOVA.

Drag boundaries to increase size of statistics tables. Drag the boundaries and resize the window to increase size of statistics tables.
Place the cursor on the boundary below the plots.

And drag to increase the size of the tables.

Place the cursor on the boundary below the plots.

And drag to increase the size of the tables.

Point to the results. The between groups mean square (MS) is much greater than within groups MS.

F value is the ratio of between groups MS to within groups MS. Hence, F value is quite large (36.5892).

The P value is displayed as 0. This means it is smaller than the displayed precision, probably less than 0.001. The difference among the group means is statistically significant.

The feed does make a statistically significant difference to the length of the fish.

Hence, the null hypothesis can be rejected in this case. The null hypothesis here is that none of the length means are different. That is, none of the feeds make any difference to the length of the fish.
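
The F value described above can be sketched outside GeoGebra in a few lines of Python. This is a minimal one-way ANOVA under stated assumptions: the function name `one_way_anova_f` and the three small groups are hypothetical, and the numbers are not the fishery data, so the result will not match the 36.5892 reported in the tutorial.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (MS between, MS within, F) for a list of groups of observations."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)            # number of groups
    n_total = len(all_values)  # total number of observations

    # Between-groups sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: spread of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between, ms_within, ms_between / ms_within


# Hypothetical lengths (mm) for three feeds; NOT the tutorial's data.
groups = [[741, 742, 743], [742, 743, 744], [745, 746, 747]]
ms_b, ms_w, f_value = one_way_anova_f(groups)
print(ms_b, ms_w, f_value)
```

When the between groups MS is much larger than the within groups MS, F is large, which is the pattern the tutorial points out in the ANOVA table.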

Click on the menu button next to the ANOVA display. Next to the ANOVA display, click on the menu button.
Point to two options appearing for T Test and T Estimate. Two options appear for T Test: Difference of Means and Paired Differences.
Point to T Test in the menu. The same two options appear for T Estimate.

Difference of Means is for unpaired T Test.

Paired Differences is for paired T Test.

The T Test compares two groups at a time.
Select T Test: Difference of Means. Select T Test: Difference of Means.
Point to Sample 1 and Sample 2. Column A data are denoted by default as Sample 1. Column B data are denoted by default as Sample 2
Click on menu buttons next to the displays to reverse the order. Click on the menu buttons next to the displays to reverse the order.

As mean of column B is greater than mean of column A, T values and limits will now be positive.

Point to t and P values in T tests. T Tests give t and P values.

Comparing A and B gives P less than 0.001 and T value greater than 4.

Thus, the mean lengths of fish given feeds A and B differ significantly.

Click on the menu button and choose T estimates, Difference of Means. Click on the menu button and choose T estimates, Difference of Means.
Show the statistics tables.

Point to Confidence Level 0.95.

T Estimates give lower and upper limits for the mean difference.

The confidence level is 95%.

We can be 95% sure that the mean difference is between the lower and upper limits.

You can change column pairs for comparison and look at the T Test results.

Close the Data Analysis window. Close the Data Analysis window.
Now let us look at the Probability Calculator.
Point to Spreadsheet view.

Use mouse to drag and highlight length data for feed A.

We are in the Spreadsheet view.

Use the mouse to drag and highlight length data for feed A.

Click on One Variable Analysis tool. Click on One Variable Analysis tool.

In the Data Source popup window that appears, click on Analyze button.

At the top of the Data Analysis window, click on the 2nd Show Statistics button.

Note down mean mu (µ) and standard deviation sigma (σ). (745.5, 29.0215)

At the top of the Data Analysis window, click on the 2nd Show Statistics button.

Note down mean mu (µ) and standard deviation sigma (σ).

Close the Data Analysis window and follow the same steps for feed B. (801.5, 21.2191) Close the Data Analysis window and follow the same steps for feed B.
Drag and highlight feed A length data. Again, drag and highlight feed A length data.
Click on View and then click on Probability Calculator. Click on View and then click on Probability Calculator.
Point to the Probability Calculator window that pops up. The Probability Calculator window pops up.
Drag the boundary to see it properly. Drag the boundary to see it properly.
Point to the plot and the Distribution tab above the plot.

Below the plot, point to the Normal display box.

We are looking at a normal distribution in the Distribution window.
Place your cursor on the horizontal boundary below the distribution curve.

Drag the arrow upwards to see the data entry window below the curve properly.

Place your cursor on the horizontal boundary below the distribution curve.

Drag the arrow upwards to see the data entry window below the curve properly.

Let us look at a normal distribution for fish given feed A.
Type 745.5 in the box next to mu >> press Enter. In the box next to mu (μ), type 745.5 and press Enter.
Type 29.0215 in the box next to sigma >> press Enter. In the box next to sigma (σ), type 29.0215 and press Enter.
Point to normal distribution plot. A normal distribution plot appears with mean 745.5 and sigma 29.0215.
Click on the 1st of three buttons below the mean and σ boxes. Click on the 1st of three buttons below the mean and σ boxes.
Point to the right side bracket, which indicates the upper limit. The right side bracket indicates this is the upper limit.
Type 770 in the box next to P (X ≤ and press Enter. In the box next to P of X less than or equal to, type 770 and press Enter.
Point to the probability P appearing in the box to the right, 0.8007. Note that the probability P appears in the box to the right, 0.8007.

Thus, 80.07% of the fish fed feed A are 770 mm long or shorter.
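
The same left-sided probability can be cross-checked with Python's standard library. This sketch assumes only the mean and standard deviation noted for feed A (745.5 and 29.0215); the variable names are illustrative.

```python
from statistics import NormalDist

# Normal distribution with the feed A mean and standard deviation from the tutorial.
feed_a = NormalDist(mu=745.5, sigma=29.0215)

# P(X <= 770): the cumulative distribution function evaluated at 770 mm.
p_at_most_770 = feed_a.cdf(770)
print(round(p_at_most_770, 4))
```

The printed value should agree closely with the 0.8007 shown in the Probability Calculator.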

Type 0.09 in the P box to the right and press Enter. Let us do the reverse.

In the P box to the right of the equal to sign, type 0.09.

Press Enter.

We want to know the length below which the shortest 9% of the fish fall.
Point to X 706.5893 appearing in the box. When you press Enter, X less than or equal to 706.5893 appears in the box.
Point to the value. Thus, 9% of the fish are shorter than this length.
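
The reverse calculation, going from a probability back to a length, corresponds to the inverse of the cumulative distribution function. A minimal Python sketch, again assuming the feed A mean and standard deviation from the tutorial:

```python
from statistics import NormalDist

feed_a = NormalDist(mu=745.5, sigma=29.0215)

# Length x such that P(X <= x) = 0.09, i.e. the 9th percentile of the distribution.
x_at_9_percent = feed_a.inv_cdf(0.09)
print(round(x_at_9_percent, 4))
```

The result should be close to the 706.5893 mm displayed by GeoGebra.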
Click on the curve symbol next to the Normal display box. Next to the Normal display box, click on the curve symbol.
Point to the cumulative distribution function curve that appears. The cumulative distribution function curve appears.
Point to Probability on y-axis and length of fish on x-axis. Probability is plotted on the y-axis and the length of fish in the feed A group is plotted on the x-axis.
Click on curve symbol to return to the normal distribution bell curve. Click on curve symbol to return to the normal distribution bell curve.
Click on 2nd of three buttons below mu and sigma displays. Below mu and sigma displays, click on 2nd of the three buttons.
Point to the two brackets, which indicate that lower and upper limits can be specified. The two brackets indicate that lower and upper limits can be specified.
Type 705 in the first box and 758 in the second box and press Enter. In the first box, type 705 and in the second box, 758, and press Enter.
Point to P (705 ≤ X ≤ 758) = 0.5852. P equal to 0.5852 appears.

This means 58.52% of fish fed feed A are 705 to 758 mm long.

Click on the 3rd of the three buttons showing a left bracket. Finally, click on the 3rd button showing a left bracket.
Type 760 in the box and press Enter. In the box, type 760 and press Enter.
Point to 0.3087. 30.87% of fish fed feed A are longer than 760 mm.
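
The interval and right-sided probabilities follow from the same cumulative distribution function: an interval is a difference of two CDF values, and a right tail is one minus a CDF value. A short Python sketch, assuming the feed A parameters used above:

```python
from statistics import NormalDist

feed_a = NormalDist(mu=745.5, sigma=29.0215)

# P(705 <= X <= 758): difference of the CDF at the upper and lower limits.
p_between = feed_a.cdf(758) - feed_a.cdf(705)

# P(X >= 760): complement of the CDF at the lower limit.
p_at_least_760 = 1 - feed_a.cdf(760)

print(round(p_between, 4), round(p_at_least_760, 4))
```

These should agree closely with the 0.5852 and 0.3087 reported by the Probability Calculator.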
Click on Statistics tab next to Distribution tab. Next to Distribution tab, click on Statistics tab.
Close the Probability Calculator window. Close the Probability Calculator window.
Point to Spreadsheet view. Let us look at the Spreadsheet in GeoGebra.
Use mouse to drag and highlight length data in columns A and B. Use mouse to drag and highlight length data in columns A and B.
Under One Variable Analysis, select Probability Calculator. Under One Variable Analysis, select Probability Calculator.
Point to Statistics window. We are looking, as before, at the Statistics window.
From the dropdown menu at the top, select T Test, Difference of Means. From the dropdown menu at the top, select T Test, Difference of Means.
Type means, standard deviation σ and N=10 in respective boxes.

Press Enter after entering all values.

You can type mean, standard deviation σ and total number of samples N in the boxes.

We will type 10 for N as each feed group has 10 fish.

Press Enter after entering all values.

Feed A mean is lower than feed B mean.
Choose feed B group as Sample 1 and feed A as Sample 2. So we will choose feed B group as Sample 1 and feed A as Sample 2.

This will result in positive values for different statistical parameters.

Note t, standard error SE, degrees of freedom df and P values.

Compare them to results from Multiple Variable Analysis.

Select different tests for different pairs of columns in the Spreadsheet.

Interpret the results and compare with your calculations.
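
To compare with your own calculations, the t statistic, standard error and degrees of freedom can be sketched from the summary values alone. The Python function below, with the hypothetical name `welch_t`, uses the unpooled (Welch) form of the two-sample t test and treats the noted standard deviations as sample standard deviations. Both are assumptions: GeoGebra's result may differ slightly if it pools the variances or uses a different standard deviation. The P value is not computed here, since it needs the t distribution, which is not in Python's standard library.

```python
from math import sqrt


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, SE, df) for an unpooled two-sample t test from summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error of sample 1 mean
    v2 = sd2 ** 2 / n2  # squared standard error of sample 2 mean
    se = sqrt(v1 + v2)  # standard error of the difference of means
    t = (mean1 - mean2) / se
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, se, df


# Sample 1 = feed B, Sample 2 = feed A, as chosen in the tutorial; N = 10 each.
t, se, df = welch_t(801.5, 21.2191, 10, 745.5, 29.0215, 10)
print(round(t, 4), round(se, 4), round(df, 4))
```

With feed B as Sample 1, the t value is positive and greater than 4, consistent with the result noted earlier for the comparison of columns A and B.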

Let us summarize.
Slide Number 9

Summary

In this tutorial, we have learnt how to use Probability Calculator in GeoGebra.

We looked at different distributions and parameters.

Slide Number 10

Assignment

Assignment

Perform statistical analyses for weight and girth data given in this tutorial.

Four oils were used to deep fry chips. Six chips were chosen from each batch fried in a given oil. Amount of absorbed fat was measured for these chips. Is any of the oils absorbed more than the others?

Slide Number 11

About Spoken Tutorial project

The video at the following link summarizes the Spoken Tutorial project.

Please download and watch it.

Slide Number 12

Spoken Tutorial workshops

The Spoken Tutorial Project team conducts workshops and gives certificates.

For more details, please write to us.

Slide Number 13

Forum for specific questions:

Do you have questions in THIS Spoken Tutorial?

Please visit this site

Choose the minute and second where you have the question

Explain your question briefly

Someone from our team will answer them

Please post your timed queries on this forum.
Slide Number 14

Acknowledgement

Spoken Tutorial Project is funded by NMEICT, MHRD, Government of India.

More information on this mission is available at this link.

This is Vidhya Iyer from IIT Bombay, signing off.

Thank you for joining.

Contributors and Content Editors

Madhurig, Snehalathak, Vidhya