Saturday, October 16, 2010

Box Plots and Shape

A while ago someone retweeted a request for applet ideas using GeoGebra.  I had just been scouring the web for an applet where students could play with a distribution and see how it relates to a box plot, and I could not find anything that fit exactly what I was looking for.  So I jumped at the opportunity.

I visited the tweeted link and put in my request.  Hardly a day had passed and this applet was the result.  @mrhodotnet had nailed it!  Exactly what I was looking for.

With permission, I embedded the applet into our class's Moodle.  I made only a few minor cosmetic changes:

  • I hid the spreadsheet and toolbar menu.  My students and I had not used GeoGebra before, and I didn't want the spreadsheet and toolbar to distract from our goal.

  • I moved the list of values to the top of the page.

  • I moved a couple of other labels around to give more space.


Other than that, the applet is as is.  Students use the sliders to add values to or remove values from the dot plot, and a box plot is built at the same time.  I will get to the second dot plot later.
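For readers who want to see the idea outside of GeoGebra, here is a rough Python sketch of what the sliders do.  It is not the applet's code.  The function names are made up for illustration, and the quartiles use the "median of each half" rule taught in many intro stats classes, which is an assumption on my part about the convention.

```python
from statistics import median

def data_from_sliders(counts):
    """Turn slider heights into a data set.

    counts[0] is how many 1's to include, counts[1] how many 2's, and so on.
    (Hypothetical helper, not part of the applet.)
    """
    return [value for value, n in enumerate(counts, start=1) for _ in range(n)]

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    xs = sorted(data)
    if len(xs) < 2:
        raise ValueError("need at least two values")
    half = len(xs) // 2
    # When the number of values is odd, the middle value is left out of both halves.
    lower, upper = xs[:half], xs[-half:]
    return (xs[0], median(lower), median(xs), median(upper), xs[-1])

# Made-up slider settings: one 1, two 2's, three 3's, four 4's, three 5's, ...
sliders = [1, 2, 3, 4, 3, 2, 1, 0, 0, 0]
data = data_from_sliders(sliders)
summary = five_number_summary(data)
print(data)
print(summary)
```

The five numbers returned are exactly what the box plot draws: the ends of the whiskers, the edges of the box, and the line inside the box.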

I had two goals for the lesson:

  • For students to discover that box plots can often, but not always, give clues to the shape of a distribution.

  • To get rid of a common misconception about box plots.  Students often believe that when one whisker is longer than the other, the longer whisker contains more values.  They believe the same when one side of the box is longer than the other.
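To make the second goal concrete, here is a small Python sketch (my own illustration, not something the students saw).  It splits a made-up set of 20 values at the quartiles and reports the length of each whisker and each half of the box next to the number of values in it.  The function name and the data are hypothetical, the quartiles use the median-of-each-half rule, and the values are chosen so that none of them lands exactly on a quartile.

```python
from statistics import median

def quarter_report(data):
    """Return (section name, section length, number of values) for each quarter."""
    xs = sorted(data)
    half = len(xs) // 2
    q1, med, q3 = median(xs[:half]), median(xs), median(xs[-half:])
    cuts = [xs[0], q1, med, q3, xs[-1]]
    names = ["left whisker", "left box", "right box", "right whisker"]
    report = []
    for name, lo, hi in zip(names, cuts, cuts[1:]):
        # Safe to count with <= on both ends here only because no value
        # in this example sits exactly on Q1, the median, or Q3.
        count = sum(1 for x in xs if lo <= x <= hi)
        report.append((name, hi - lo, count))
    return report

# Made-up data: 15 tightly packed values and 5 spread far out to the right.
data = list(range(1, 16)) + [20, 30, 40, 50, 60]
for name, length, count in quarter_report(data):
    print(f"{name:14} length = {length:5}  values = {count}")
```

The right whisker comes out far longer than the left one, yet all four sections hold the same number of values.  A long whisker or box means the values in that quarter are spread out, not that there are more of them.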


The activity on Moodle looked like this:
-----------------------------------------------------------------
In this lab you will investigate features of a box plot and how to get a rough idea of shape from a box plot.

To complete the lab:

  • Open the applet

  • Open the answer sheet to record your answers to the questions below.

  • When finished, hand in the answer sheet to turnitin.com


Tips for using the applet:

  • There are 10 "sliders" across the top. Left click on a slider and drag it up and down. Example: If you click on the second slider, as you move up or down you will add or remove 2's from the set of data.

  • At the very top as you add values they will be listed out.

  • As you add values the dot plot and blue box plot will be created.

  • Ignore the check boxes on the right side.

  • The green target box plot is automatically created every time you press F9. You can try to add values to the set of data to make the blue box plot match the green box plot. If they match look for a special message.


A. Add values to the dot plot to create a distribution that is symmetric (remember it doesn't have to be perfect, that's why we sometimes call it approximately symmetric).
1. Now that you made your dot plot symmetric, describe what your box plot looks like.
2. Look at another student's box plot. What features do your box plots have in common? (If you do not have another student's box plot to look at, rearrange your values to create another dot plot that is symmetric.)
3. In general, what do you notice about a box plot that may indicate it is a symmetric distribution?

B. Add and/or remove values to create a distribution that is skewed right.
1. Describe what your box plot looks like.
2. Look at another student's box plot. What features do your box plots have in common?
3. In general, what do you notice about a box plot that may indicate it is a skewed right distribution?

C. Add and/or remove values to create a distribution that is skewed left.
1. Describe what your box plot looks like.
2. Look at another student's box plot. What features do your box plots have in common?
3. In general, what do you notice about a box plot that may indicate it is a skewed left distribution?

D. Create a box plot using 20 values.
1. How many values are above the median? below?
2. How many values are above quartile 3? below?

E. Change your values so that the right whisker is much longer than the left whisker.
1. How did you change your values to make the right whisker long?
2. Which whisker contains more values?

F. Change your values so that the box from the median to Q3 is much shorter than the box from Q1 to the median.
1. How did you change your values to make the box from the median to Q3 short?
2. Which contains more values, the box from Q1 to median or the box from median to Q3?

G. What is a box plot showing you about the data when a whisker or box is long vs short?

H. Change your values so that you have NO whisker on the left.
1. How did you change your values to make it happen?

Additional practice:

  • Press F9 to make the green target box plot change

  • Add and/or remove values until you get your blue box plot to match the green one

  • Repeat as often as you want.


-------------------------------------------------

One fault I have found in myself when writing these guided discoveries around an applet is that I tend to stray from my intended goals and go a little overboard with the questions.  Being succinct is not my strong suit.

Overall the students were on track with the first objective.  @mrhodotnet gave some ideas about questions where a box plot fails to show us the shape.  In our debrief session I made a couple of distributions and had only the box plot visible for the students.  I set the trap by asking for the shape and then showed the matching dot plot.  They were quick to see that box plots do not always show shape.  This is when we took the opportunity to reinforce the idea that the main objective behind using a box plot is to quickly compare data sets and get a visual of their spreads.
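Here is a rough Python sketch of that kind of trap.  These are not the distributions I used in class.  Both data sets are made up, the helper names are hypothetical, and the quartiles use the median-of-each-half rule.  The two sets have the same five-number summary, so their box plots are identical, but the dot plots look nothing alike.

```python
from collections import Counter
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    xs = sorted(data)
    half = len(xs) // 2
    return (xs[0], median(xs[:half]), median(xs), median(xs[-half:]), xs[-1])

def text_dot_plot(data, low=1, high=10):
    """Build a sideways dot plot as text, one row per possible value."""
    counts = Counter(data)
    return "\n".join(f"{v:>2} | " + "*" * counts[v] for v in range(low, high + 1))

# Made-up data: values spread fairly evenly through the middle.
spread_out = [1, 3, 3, 4, 5, 5, 6, 6, 7, 8, 8, 10]
# Made-up data: two clusters with nothing at 5 or 6.
two_clusters = [1, 3, 3, 4, 4, 4, 7, 7, 7, 8, 8, 10]

for label, data in [("spread out", spread_out), ("two clusters", two_clusters)]:
    print(label, five_number_summary(data))
    print(text_dot_plot(data))
    print()
```

Both summaries match, so a student looking only at the box plot has no way to tell that the second set has a gap in the middle.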

The second objective did not go as well.  More students than I cared for still had the misconception that a whisker or part of a box is long because it contains more values.  Questions D-H above are where I tried to have students play with the idea.

As always, your thoughts and comments are appreciated.  Thanks again @mrhodotnet for creating the applet!

 

1 comment:

  1. Some notes on the applet:
    -The boxplot in GeoGebra doesn't seem to have a feature for using upper and lower fences to determine outliers and show them in the graph (maybe I just don't know how). It's one reason I limited the data set to numbers from 1 to 10. In the spreadsheet you can also see what numbers are possible for median and quartiles. I think despite its limitations it does do several things well.

    -I think I may have built too many things into this applet. I have a tendency to do that while I'm learning to work the features and explore what's possible. For example, removing "easy mode" allows fractional median, Q1, and Q3s (well, .5s). Not sure if anyone uses that. The spreadsheet was another feature where I tried to do too much. I also had it showing to see how the numbers in the data set were connected to the distribution and dotplot. Originally I built in the feature to allow data manipulation using either the spreadsheet column or sliders. I think different applets for different concepts would have been more useful.

    As I've said on twitter, I really like your questions and activity.

    I've actually used the applet myself in class. I don't think it would have been possible were it not for your suggestion. I think it's exactly like you said in an earlier post. Some teachers are taking it upon themselves to do their own PD by collaborating online and sharing their resources, hoping for the kind of constructive feedback that we can use to become better teachers. I'm thankful that you and other teachers showed up to give suggestions and feedback. Thanks for also sharing your lessons involving the applet.
