Data Domains and Introduction to Statistics Chemistry 243
Instrumental methods and what they measure Electromagnetic methods Electrical methods Photons are modulated by sample
Instruments are translators Convert physical or chemical properties that we cannot directly observe into information that we can interpret.
Sometimes multiple translations are needed. Thermometer: bimetallic coil converts temperature to physical displacement; scale converts angle of the pointer to an observable value of meaning. Thermostat: displacement used to activate switch. Adapted from C.G. Enke, The Art and Science of Chemical Analysis, 2001. http://upload.wikimedia.org/wikipedia/commons/d/d2/Bimetaal.jpg http://upload.wikimedia.org/wikipedia/commons/2/26/Bimetal_coil_reacts_to_lighter.gif http://static.howstuffworks.com/gif/home-thermostat-thermometer.jpg
Data domains Information is encoded and transferred between domains Non-electrical domains Beginning and end of a measurement Electrical domains Intermediate data collection and processing
Data domains: Quantity to be measured (emission intensity) → Initial conversion device (PMT) → Intermediate quantity 1 (current) → Intermediate conversion device (resistor) → Intermediate quantity 2 (voltage, V = iR) → Readout conversion device (digital voltmeter) → Number, often viewed on a GUI (graphical user interface)
Electrical domains Analog signals Magnitude of voltage, current, charge, or power Continuous in both amplitude and time Time-domain signals Time relationship of signal fluctuations (not amplitudes) Frequency, pulse width, phase Digital information Data encoded in only two discrete levels A simplification for transmission and storage of information which can be re-combined with great accuracy and precision The heart of modern electronics
Digital and analog signals Analog signals Magnitude of voltage, current, charge, or power Continuous in both amplitude and time Digital information Data encoded in only discrete levels
Analog-to-digital conversion Limited by bit resolution of ADC 4-bit card has $2^{4}$ = 16 discrete binary levels 8-bit card has $2^{8}$ = 256 discrete binary levels 32-bit card has $2^{32}$ = 4,294,967,296 discrete binary levels Common today Maximum resolution comes from full use of ADC voltage range. Trade-offs: More bits is usually slower and more expensive. K.A. Rubinson, J.F. Rubinson, Contemporary Instrumental Analysis, 2000.
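The bit-resolution limit can be sketched in a few lines of Python. This is only an illustration: the function names and the 10 V full-scale range are assumptions made for the example, not values from the slides.

```python
def adc_levels(bits: int) -> int:
    """Number of discrete binary levels available to an ADC with `bits` bits."""
    return 2 ** bits


def adc_step(bits: int, full_scale_volts: float) -> float:
    """Smallest voltage step resolved when the signal spans the full ADC range."""
    return full_scale_volts / adc_levels(bits)


def quantize(voltage: float, bits: int, full_scale_volts: float) -> int:
    """Map an analog voltage in [0, full_scale) to its integer ADC code."""
    code = int(voltage / adc_step(bits, full_scale_volts))
    # Clamp to the valid code range so out-of-range inputs saturate.
    return max(0, min(adc_levels(bits) - 1, code))


if __name__ == "__main__":
    FULL_SCALE = 10.0  # assumed full-scale range in volts (hypothetical)
    for bits in (4, 8, 16):
        print(bits, adc_levels(bits), adc_step(bits, FULL_SCALE))
```

A signal that spans only part of the ADC range uses only part of the available codes, which is why maximum resolution comes from full use of the voltage range.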
Byte prefixes Kilo: about 1000 Mega: about a million Giga: about a billion
Serial and parallel binary encoding Serial: slow, not digital, outdated Binary serial ("serial-coded binary" data): fast, between instruments Binary parallel ("parallel digital" data): very fast, within an instrument
Introductory statistics Statistical handling of data is incredibly important because it gives the data significance. The ability or inability to state definitively that two values are statistically different has profound ramifications for data interpretation. Measurements are not absolute, so robust methods for establishing run-to-run reproducibility and instrument-to-instrument variability are essential.
Introductory statistics: Mean, median, and mode Population mean ($\mu$): average value of replicate data Median ($\mu_{1/2}$): ½ of the observations are greater; ½ are less Mode ($\mu_{md}$): most probable value For a symmetrical distribution: $\mu = \mu_{1/2} = \mu_{md}$ Real distributions are rarely perfectly symmetrical
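A minimal sketch of the three measures using Python's standard `statistics` module. The readings are hypothetical values invented for illustration, and with a finite data set these are sample estimates of the population quantities.

```python
import statistics

# Hypothetical replicate readings (arbitrary units), for illustration only.
readings = [0.48, 0.50, 0.50, 0.51, 0.52, 0.50, 0.49]

mean = statistics.mean(readings)      # average value of the replicates
median = statistics.median(readings)  # half of the values lie above, half below
mode = statistics.mode(readings)      # most frequently observed value

print(mean, median, mode)
```

For this nearly symmetrical set the three values essentially coincide; a skewed data set would pull them apart.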
Statistical distribution Often follows a Gaussian functional form
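Introductory statistics: Standard deviation and variance Standard deviation ($\sigma$): $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$ Variance ($\sigma^2$): the square of the standard deviation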
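The sketch below contrasts the population standard deviation (divide by N) with the sample standard deviation (divide by N − 1), using Python's standard `statistics` module. The data set is hypothetical and chosen only so the arithmetic is easy to follow.

```python
import statistics

# Hypothetical data set, chosen so that the population mean is exactly 5.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

pop_sd = statistics.pstdev(data)      # population form: divides by N
pop_var = statistics.pvariance(data)  # population variance: pop_sd squared
sample_sd = statistics.stdev(data)    # sample form: divides by N - 1

print(pop_sd, pop_var, sample_sd)
```

The sample form is always slightly larger than the population form, and the difference shrinks as N grows.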
Gaussian distribution Common distribution with well-defined stats 68.3% of data is within 1$\sigma$ of mean 95.5% at 2$\sigma$ 99.7% at 3$\sigma$
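These coverage figures can be checked numerically. For a Gaussian, the fraction of the population within k standard deviations of the mean is erf(k/√2); the helper name `fraction_within` below is hypothetical and used only for this sketch.

```python
import math


def fraction_within(k: float) -> float:
    """Fraction of a Gaussian population within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, fraction_within(k))
```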
Statistical distribution 50 Abs measurements of an identical sample Let’s go to Excel Table a1-1, Skoog
Standard deviation and variance, continued $\sigma$ is a measure of precision (magnitude of indeterminate error) Other useful definitions: Standard error of the mean: $\sigma_m = \sigma / \sqrt{N}$
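A short sketch of the standard error of the mean, taken here as the standard deviation divided by √N (the usual definition). The function name and the numbers are hypothetical.

```python
import math


def standard_error(std_dev: float, n: int) -> float:
    """Standard error of the mean for n replicate measurements."""
    return std_dev / math.sqrt(n)


if __name__ == "__main__":
    # Quadrupling the number of replicates halves the standard error.
    print(standard_error(2.0, 4), standard_error(2.0, 16))
```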
Confidence intervals In most situations $\mu$ cannot be determined; it would require an infinite number of measurements. Statistically we can establish a confidence interval around $\bar{x}$ in which $\mu$ is expected to lie with a certain level of probability.
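Calculating confidence intervals We cannot absolutely determine $\sigma$, so when s is not a good estimate of $\sigma$ (small number of samples) use: CI for $\mu = \bar{x} \pm \frac{t s}{\sqrt{N}}$ Note that t approaches z as N increases. Use 2-sided t values for N − 1 degrees of freedom.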
Example of confidence interval determination for a smaller number of samples Given the following values for serum carcinoembryonic antigen (CEA) measurements, determine the 95% confidence interval: 16.9 ng/mL, 12.7 ng/mL, 15.3 ng/mL, 17.2 ng/mL. Sample mean $\bar{x}$ = 15.525 ng/mL, s = 2.059733 ng/mL. With N = 4 there are 3 degrees of freedom, so the 2-sided 95% value is t = 3.182 and $t s / \sqrt{N}$ = 3.182 × 2.059733 / 2 = 3.277. Answer: 15.525 ± 3.277 ng/mL, but when you consider sig figs you get: 16 ± 3 ng/mL
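The calculation can be reproduced with a short Python sketch. It assumes the standard tabulated 2-sided 95% t value for N − 1 = 3 degrees of freedom (3.182), hard-coded here rather than computed; the function name is hypothetical.

```python
import math
import statistics

# CEA measurements from the example, in ng/mL.
cea_ng_per_ml = [16.9, 12.7, 15.3, 17.2]

# Tabulated 2-sided 95% t value for N - 1 = 3 degrees of freedom (assumed, from a t table).
T_95_DOF_3 = 3.182


def confidence_interval(values, t_value):
    """Return (mean, half_width) of the interval mean +/- t*s/sqrt(N)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (N - 1 in the denominator)
    return mean, t_value * s / math.sqrt(n)


if __name__ == "__main__":
    mean, half_width = confidence_interval(cea_ng_per_ml, T_95_DOF_3)
    print(f"{mean:.3f} +/- {half_width:.3f} ng/mL")
```

The t value must match the degrees of freedom (N − 1), not the number of measurements; reading the table at the wrong row gives an interval that is too narrow.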
Propagation of errors How do errors at each step contribute to the final result?