Graphs and graphical presentation. Peter Shaw. [Figure: pH (6 to 7) against pond number (1 to 6).]
Your most important concept for the day: A graph is the best way to communicate numerical information to people. Bar nothing. Always graph data if you want to understand them or explain them to others.
Rules for any graph: 1: Clearly labelled axes, with units where appropriate. 2: A title. 3: Explanations of symbols. [Figure: "The distribution of pH values in ponds on Wimbledon Common"; pH (6 to 7) against pond number (1 to 6); symbol explanation: "Shows maxima and minima".]
The most common fault: You use the PC stats package to plot the graph for you. Eh? Come on, are you seriously expecting me to draw them by hand when the PC does it for me?! Well, actually, the number of times that an SPSS- or Excel-generated graph is acceptable as thesis-quality first time around is close to zero. Common errors are stupid axis ranges (weight or height starting at negative values), default variable names (VAR001 tells me nothing!), and glorious technicolor (that becomes illegible in the photocopied version). Re-edit them to give big bold black symbols and sensible ranges. In several cases I don’t bother re-editing, but paste the graph into PowerPoint and redraw it by hand there. All the diagrams in my book were redrawn in PowerPoint this way after I despaired of ever getting a useful graph out of SPSS!
And the graph I really wanted… [Figure: litter depth, mm (0 to 100) against distance, m (0, 2, 4, 8, 16, 32).]
Another graph hand-drawn in PowerPoint. This is an ordination diagram; more on these later in the course. [Figure: ordination diagram with the 1st DCA axis (eigenvalue = 0.375) and the 2nd DCA axis (eigenvalue = 0.111); species are plotted as abbreviated codes; key to sampling periods: 1997-2001, 1992-1996, 1988-1990, 1986-1987.]
Types of graph: There are many types, and no laws stopping you from inventing a new format. My aim for today is to show you the theory and practice of the commoner types of graph. Then I will get you used to plotting them in your head to model the behaviour of different patterns within your data (rest assured that this is very quick and easy). Then we head for the PCs to do them ourselves.
Bar charts: These are useful for showing how properties differ between sites/classes, but work best when you have only one number (a total, average or other) per class. [Figure: bar chart of number of individuals caught (0 to 100) for ponds 1 and 2.]
Boxplots: These are underrated but extremely helpful tools for examining the distribution of data. They have the big advantage over bar charts that they show the range of values in the data. [Figure: a boxplot on a 0 to 100 scale, labelled from top to bottom: highest value, 75th centile, median, 25th centile, lowest value.]
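The five values a boxplot draws can be worked out before any plotting is done. Here is a minimal sketch in Python (my choice of language, not the lecture's; the course itself uses SPSS, Excel and PowerPoint). The function name `five_number_summary` and the counts are invented for illustration, and the "inclusive" centile method is an assumption: stats packages differ slightly in how they interpolate centiles.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return the five values a boxplot shows for one sample."""
    data = sorted(values)
    # n=4 splits the data into quarters: 25th centile, median, 75th centile.
    # method="inclusive" treats the sample as covering the whole range (an assumption).
    q25, med, q75 = quantiles(data, n=4, method="inclusive")
    return {
        "lowest": data[0],
        "q25": q25,
        "median": med,
        "q75": q75,
        "highest": data[-1],
    }


# Hypothetical counts, invented purely for illustration.
counts = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
summary = five_number_summary(counts)
print(summary)
```

A bar chart of these data would show a single number (say the mean); the summary above also carries the spread, which is the point of the boxplot.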
Here we have an example of boxplots in action, describing soil insects in 3 areas of a wood in Devon (ancient oak, modern conifer, and newly cleared).
Scatterplots: These are very commonly used and powerful tools. The Y axis (going up) is always assumed to depend on the X variable. Think hard before putting any marks on here! Generally you should fit a single best-fit line if the correlation is significant (p < 0.05); otherwise leave the points alone. [Figure: scatterplot with axes pH and depth.]
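The "fit a line only if p < 0.05" decision can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method a stats package uses: packages normally report a p-value from a t-test on the correlation coefficient, whereas this sketch swaps in an exact permutation test because it needs no statistical tables. The data values and all function names are invented.

```python
from itertools import permutations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def best_fit_line(xs, ys):
    """Least-squares slope and intercept of Y on X."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def permutation_p_value(xs, ys):
    """Two-sided exact permutation p-value: the share of all orderings of ys
    whose |r| is at least as large as the observed |r|.
    Only practical for small samples (6 points means 720 orderings)."""
    observed = abs(pearson_r(xs, ys))
    total = extreme = 0
    for shuffled in permutations(ys):
        total += 1
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical measurements, invented for illustration.
depth = [1, 2, 3, 4, 5, 6]
ph = [6.1, 6.3, 6.2, 6.6, 6.7, 6.9]

r = pearson_r(depth, ph)
p = permutation_p_value(depth, ph)
if p < 0.05:
    slope, intercept = best_fit_line(depth, ph)
    print(f"r = {r:.3f}, p = {p:.4f}: draw the line Y = {intercept:.2f} + {slope:.2f} X")
else:
    print(f"r = {r:.3f}, p = {p:.4f}: leave the points alone")
```

The line is only computed inside the `if`, which mirrors the rule on the slide: no significant correlation, no line.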
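NEVER join the dots (dot-to-dot)!! Unless you are absolutely sure that interpolation is valid. [Figure, two panels: lichen cover on a tombstone against year, with the points joined: this is WRONG. Height of one child against year, with the points joined: this is OK.]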
A hybrid scatter-graph with error bars. You may want to consider the validity of joining the points up, but it can be justified.
Scatterplots, contd: Beware the false axis! Why is this graph meaningless? [Figure: weight of leaf against bag number (1, 5, 10).]
Pie charts: These are good for showing the proportional composition of communities, but not so good for comparing samples of different sizes.
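Why pie charts cope badly with samples of different sizes is easy to see once the slices are worked out by hand. A minimal Python sketch, with invented species counts and hypothetical function names:

```python
def proportions(counts):
    """Convert raw counts per class into proportions of the sample total."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def slice_angles(counts):
    """Angle in degrees of each pie slice (the whole pie is 360 degrees)."""
    return {name: 360 * p for name, p in proportions(counts).items()}


# Two hypothetical samples, invented for illustration:
# same composition, but one is 100 times larger than the other.
small = {"Species A": 5, "Species B": 3, "Species C": 2}
large = {"Species A": 500, "Species B": 300, "Species C": 200}

print(slice_angles(small))
print(slice_angles(large))
print("Identical pies:", proportions(small) == proportions(large))
```

The two pies come out identical, so a sample of 10 individuals looks exactly as convincing as a sample of 1000. If you use pie charts to compare samples, state the sample size alongside each one.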
P-P graphs: These are used to decide about normality of data. If the plotted points lie on the green line (the line of Y=X) the data distribution appears to be that of the Normal or Gaussian curve. Here we see the same data before and after a logarithmic transformation.
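The points on a P-P graph can be sketched as follows in Python: for each sorted value, pair its observed cumulative proportion with the cumulative probability expected from a Normal curve fitted to the same data. The plotting position `(i + 0.5) / n` is an assumption (stats packages offer several conventions), and the data and function names are invented for illustration.

```python
from math import log10
from statistics import NormalDist, mean, stdev


def pp_points(values):
    """Return (observed, expected) cumulative probability pairs for a P-P graph."""
    data = sorted(values)
    n = len(data)
    # Normal curve with the same mean and standard deviation as the sample.
    fitted = NormalDist(mean(data), stdev(data))
    # (i + 0.5) / n is one common plotting position; others exist (assumption).
    return [((i + 0.5) / n, fitted.cdf(x)) for i, x in enumerate(data)]


def max_departure(points):
    """Largest vertical distance of any point from the line Y = X."""
    return max(abs(expected - observed) for observed, expected in points)


# Hypothetical right-skewed data, invented for illustration.
raw = [1, 2, 2, 3, 4, 6, 9, 15, 30, 80]
logged = [log10(v) for v in raw]

print("Before log transformation:", round(max_departure(pp_points(raw)), 2))
print("After log transformation: ", round(max_departure(pp_points(logged)), 2))
```

For these invented data the points sit much closer to the line Y = X after the logarithmic transformation, which is the same before-and-after pattern described on the slide. The departure figure is only a rough aid to reading the graph, not a formal test of normality.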
Kite diagrams: These are mainly used to show how communities of 3-10 entities vary along an axis (time, or a spatial gradient such as downstream from a pollution source). They are good for ecological studies, less so for physical data. [Figure: kite diagram of total counts for Species A, Species B and Species C against age, years (0 to 10).]
I want to give you the secret to good results: The secret to a successful exercise in data collection is to plan (i.e. visualise) the final presentation BEFORE you start to collect the data! This does NOT mean you plot the graph then make up the data!! It means that you consider what patterns might arise in your data and how best to portray these on a graph. That in turn lets you plan what data you will need to collect, and drives the whole project along.
A student wants to measure the pH values of ponds on Wimbledon Common, already planning the talk that they will give a week later. They want to show a graph like this: [Figure: "The distribution of pH values in ponds on Wimbledon Common"; pH (6 to 7) against pond number (1 to 6); symbol explanation: "Shows maxima and minima".] They know that there are 6 accessible ponds to visit and want to be able to talk about all of them. They work out how long they have for each pond, and collect 4 measurements from each.
A quick boxplot exercise: Imagine that you are to undertake research on the Common, measuring properties of two ponds. Produce a boxplot comparing the two sites under TWO scenarios: 1: There is significant variation between the sites - at least one is different. 2: There is a little variation between sites, but only due to random noise.
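Once you have sketched your own answer, you can check your mental picture against simulated data. This is a rough Python sketch under invented assumptions: the "true" pond values (6.0, 7.0, 6.5), the noise level (0.1) and the sample size (10) are all made up, and the overlap check is a crude visual stand-in, not a significance test.

```python
import random
from statistics import median


def simulate_pond(rng, true_value, noise_sd, n=10):
    """Draw n noisy measurements around a pond's (hypothetical) true value."""
    return [rng.gauss(true_value, noise_sd) for _ in range(n)]


def ranges_overlap(a, b):
    """True when the lowest-to-highest ranges of two samples overlap."""
    return min(a) <= max(b) and min(b) <= max(a)


rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Scenario 1: the two ponds genuinely differ.
s1_a = simulate_pond(rng, 6.0, 0.1)
s1_b = simulate_pond(rng, 7.0, 0.1)

# Scenario 2: same true value, so any difference is random noise.
s2_a = simulate_pond(rng, 6.5, 0.1)
s2_b = simulate_pond(rng, 6.5, 0.1)

for label, a, b in [("Scenario 1", s1_a, s1_b), ("Scenario 2", s2_a, s2_b)]:
    print(
        f"{label}: medians {median(a):.2f} and {median(b):.2f}, "
        f"ranges overlap: {ranges_overlap(a, b)}"
    )
```

In scenario 1 the two boxes should sit clear of each other; in scenario 2 they should overlap heavily, with the medians differing only slightly.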
Now produce a scatter graph showing how two variables are related. Let’s plot yield of vegetation against dose of fertiliser added. Again plot 2 scenarios: 1: How you imagine the data would work out if the two variables are significantly related (correlated, in the jargon). 2: What you might find if the fertiliser turned out to be a waste of money.