Published by Thomasina Fletcher. Modified over 10 years ago.
1
Application of rule induction techniques for detecting the possible impact of endocrine disruptors on the North Sea ecosystems
Tim Verslycke 1, Peter Goethals 1,2, Gert Vandenbergh 1, Karen Callebaut 3 & Colin Janssen 1
1 Laboratory of Environmental Toxicology and Aquatic Ecology, Ghent University
2 Institute for Forestry and Game Management
3 Ecolas n.v.
2
Outline
- Introduction to endocrine disruptors
- ED-North project
- Database set-up
- Data mining and rule induction
- Practical application on the ED-North database
- Conclusions
3
Endocrine disruptors?
- Also called pseudo-hormones, endocrine modulators, xeno-hormones, …
- Compounds that interfere with the endocrine system, resulting in (negative) effects on the health and/or reproduction of organisms
- Since the 1990s: one of the fastest-growing research domains in environmental toxicology
- Dozens of lists, hundreds of compounds
- Worldwide implications: industry, government, academia
4
Endocrine disruption in marine environments?
- The sea is the final sink for many chemicals
- The North Sea and its estuaries are under a heavy pollution load
- There are indications of potential endocrine disruption in these ecosystems
- A better overview of potential endocrine disruption in the North Sea and the Scheldt estuary is needed: the ED-North project
5
ED-North project ~ Goals
- Critical evaluation of the literature on endocrine disruptors
- Build a reference list and database of chemicals with (potential) endocrine disruptive activity
- Evaluation of the described and suspected effects of endocrine disruptors on marine organisms
- Prioritize the selected chemicals
- If enough information: preliminary risk assessment
- Formulation of the research needs and policy actions (overview of the Belgian expertise)
6
ED-North project ~ Methods
Literature study
- electronic databases: Poltox, Medline, Current Contents, CAB Abstracts, Agris, Agricola, Web of Science, …
- world wide web: USEPA, OECD, WWF, CEFIC, IEH, …
- grey literature
Database
- MS Access (relational database)
7
ED-North project ~ Results
- General overview of endocrine disruption in humans and other mammals, birds, reptiles, fish and invertebrates
- Situation in Belgium and The Netherlands
- Expertise in Belgium
- Emission of synthetic and natural hormones in Belgium
- Sources, effects and occurrence of endocrine disruptors in the North Sea + prioritization
- Database of (potential) endocrine disruptors for the North Sea ecosystem
8
Relational database: anthropogenic (potential) endocrine disruptors
- CHEMICALS (765): Chemical ID, Chemical Name Nl, Chemical Name E, CAS, UN, Chemical Formula, Molecular Weight, Boiling Point, Melting Point, Density, Pressure, Solubility, Log Kow, Phase, Notes
- ENDOCRINE: Endocrine ID, Chemical ID, Reference ID, Group Name, Organism, Tissue, Age, In vivo, Lab, Flow, Duration, Route, Temperature, Concentration, Notes
- EFFECT (3516): Effect ID, Hormone Name, Endocrine ID, Effect Code, Effect description
- REFERENCES (423): Reference ID, Authors, Year, Title, Source
- GROUP: Group Name
- HORMONE: Hormone Name
- EFFECT CODE: Effect Code
9
Relational database
Table: Endocrine
- Endocrine ID: 2598
- Chem ID: 240
- Ref ID: 26
- Group: mammalian
- Organism: Human
- Tissue: MCF-7 cells
- Age: (blank)
- In vivo: In vitro
- Lab: Laboratory
- Duration: 6 days
- Concentration: 10 µM
- Notes: Technical grade; E-screen
Table: Chemicals
- Chem ID: 240
- Chem Name Nl: DDT
- CAS: 50-29-3
- Chem Form: C14H9Cl5
- Mol weight: 354,49
- BP: 260°C
- MP: 108°C
- Pressure: 1,9E-7 mm Hg at 20°C
- Solubility: 3,1-3,4 µg/l
- Log Kow: 6,19
- Phase: Solid
Table: References
- Ref ID: 26
- Authors: Soto, A.M., Chung, K.L., Sonnenschein, C.
- Year: 1994
- Source: Environ. Health Perspect., 102:380-383
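The way the three tables link through Chem ID and Ref ID can be sketched with an in-memory SQLite database. This is only an illustration: the original database was built in MS Access, and the table names, column names and reduced column set below are simplified assumptions rather than the actual definitions.

```python
import sqlite3

# Hypothetical, simplified schema adapted from the slide; not the real
# MS Access table definitions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE chemicals (chem_id INTEGER PRIMARY KEY, name TEXT, cas TEXT, log_kow REAL);
CREATE TABLE refs (ref_id INTEGER PRIMARY KEY, authors TEXT, year INTEGER, source TEXT);
CREATE TABLE endocrine (
    endocrine_id INTEGER PRIMARY KEY,
    chem_id INTEGER REFERENCES chemicals(chem_id),
    ref_id INTEGER REFERENCES refs(ref_id),
    organism TEXT, tissue TEXT, duration TEXT, concentration TEXT
);
""")

# The example record shown on the slide (DDT, E-screen on MCF-7 cells).
conn.execute("INSERT INTO chemicals VALUES (240, 'DDT', '50-29-3', 6.19)")
conn.execute(
    "INSERT INTO refs VALUES (26, 'Soto, A.M., Chung, K.L., Sonnenschein, C.', "
    "1994, 'Environ. Health Perspect., 102:380-383')"
)
conn.execute(
    "INSERT INTO endocrine VALUES (2598, 240, 26, 'Human', 'MCF-7 cells', '6 days', '10 µM')"
)

# Follow the foreign keys from the test record to its chemical and reference.
row = conn.execute("""
SELECT c.name, e.organism, e.tissue, r.year
FROM endocrine AS e
JOIN chemicals AS c ON c.chem_id = e.chem_id
JOIN refs AS r ON r.ref_id = e.ref_id
WHERE e.endocrine_id = 2598
""").fetchone()
print(row)
```

Keeping chemicals and references in their own tables means that each test record only stores two keys, so one compound or paper can be shared by many test records.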
10
Rule induction techniques
Data mining (analysis) techniques:
1) Clustering methods (which data are related or ‘similar’), e.g. cluster analysis
2) Classification methods (how variables are related, using only classes (numerical or not) = rules among variables), e.g. decision trees
3) Regression methods (quantitative description of the relation between two variables), e.g. multivariate regression
11
Rule induction techniques
Classification and decision trees: induction of rules from datasets
- Which variables are related, e.g. which variables are mainly related to endocrine disruptive effects in animals
- How variables are related (quantitative rules using threshold values or classes), e.g. when the hormone concentration is higher than value A, estrogenic effects of type X will occur
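How a threshold such as "value A" might be induced from data can be sketched as follows. This is a deliberately simplified illustration: it picks the threshold with the most correctly classified samples, whereas C4.5-style tree learners choose splits with an information-based criterion. The function name and the sample values are hypothetical.

```python
def induce_threshold(samples):
    """Find the threshold for the rule 'value > threshold -> effect'.

    samples is a list of (value, has_effect) pairs. Returns a tuple
    (threshold, n_correct), or None if fewer than two distinct values exist.
    """
    values = sorted({v for v, _ in samples})
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for t in candidates:
        # Count samples for which the rule's prediction matches the label.
        correct = sum((v > t) == label for v, label in samples)
        if best is None or correct > best[1]:
            best = (t, correct)
    return best


# Made-up concentrations and outcomes, for illustration only.
samples = [(0.5, False), (1.0, False), (2.0, False),
           (3.0, True), (4.0, True), (5.0, True)]
threshold, correct = induce_threshold(samples)
print(threshold, correct)
```

On these made-up samples the rule "concentration > 2.5" classifies all six cases correctly; real data will rarely separate this cleanly.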
12
Rule induction techniques
WEKA data mining software: runs from a DOS command window, but also offers a visual Java interface
13
- Induced rule set
- Rule set performance indicators
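One such performance indicator is the percentage of correctly classified instances (CCI), which the application slides report as "Reliability". A minimal sketch of the calculation, with a hypothetical function name and made-up labels:

```python
def cci_percent(actual, predicted):
    """Percentage of instances whose predicted class equals the actual class."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return 100.0 * correct / len(actual)


# Made-up labels: three of the four predictions are correct.
actual = ["effect", "effect", "no effect", "no effect"]
predicted = ["effect", "no effect", "no effect", "no effect"]
score = cci_percent(actual, predicted)
print(score)
```

A CCI is only meaningful relative to a baseline: always predicting the most frequent class already gives a CCI equal to that class's share of the data.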
14
Applications on ED-North database
Example on crustacean data:
1) Prediction of endocrine disruptive effects based on physical/chemical properties of chemicals
2) Prediction of the estrogenic effect of chemicals on the crustaceans in the database
3) Which factors (flow, concentration, duration, ...) affect this estrogenicity
15
1) Which molecular characteristics are related to estrogenic effects
Estrogenic effects in crustaceans (89 cases)
Tested variables: effects, molecular weight, boiling point, temperature, Log Kow, solubility
Induced rule set:
LogKow <= 3.74: Estrogenic effect
LogKow > 3.74
|   Solubility <= 0.00033: No Estrogenic effect
|   Solubility > 0.00033: Estrogenic effect
Reliability (CCI, correctly classified instances): 63 %
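Read as a decision procedure, this rule set can be sketched as a small function. The sketch assumes that the comparison operators missing from the slide text were "<=" (the complement of the ">" branches that did survive). The solubility unit is not stated on the slide, so the inputs below are purely hypothetical.

```python
def predict_estrogenic(log_kow, solubility):
    """Apply the induced rule set; thresholds are the ones shown on the slide.

    Assumption: the branches without a visible operator used '<='.
    The unit of solubility is not given in the source.
    """
    if log_kow <= 3.74:
        return "Estrogenic effect"
    if solubility <= 0.00033:
        return "No Estrogenic effect"
    return "Estrogenic effect"


# Hypothetical inputs, chosen only to exercise each branch.
print(predict_estrogenic(3.0, 1.0))
print(predict_estrogenic(6.0, 0.0001))
print(predict_estrogenic(6.0, 0.001))
```

With a reported reliability of 63 %, such a function would misclassify roughly one in three cases, so it is a starting point for discussion rather than a predictive tool.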
16
2) Which estrogenic effects are related to particular compounds in the environment
Estrogenic effects in crustaceans
Tested variables: effects, compounds
Induced rule set (23 rules, one for each compound):
CHEMID = 4-nonylphenol (p-nonylphenol): Estrogenic effect
CHEMID =......
CHEMID = 20-hydroxyecdysone: No Estrogenic effect
Reliability (CCI): 60 %
17
2) Which estrogenic effects are related to particular compounds in the environment
Estrogenic effects in crustaceans
Tested variables: effects, organisms, compounds
Induced rule set (13 rules, one for each organism):
Organism = Balanus amphitrite: No estrogenic effect
Organism = Daphnia magna: Estrogenic effect
...
Reliability (CCI): 74 %
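A rule set with one rule per compound or per organism behaves like a lookup from attribute value to class. A minimal sketch of how such rules could be derived is to predict the most frequent class for each value. This is not necessarily how the learner used in the study built them, and the records below are invented, not taken from the ED-North database.

```python
from collections import Counter, defaultdict


def one_rule_per_value(records, attribute, target):
    """Map each value of `attribute` to its most frequent `target` class."""
    counts = defaultdict(Counter)
    for rec in records:
        counts[rec[attribute]][rec[target]] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}


# Invented records, for illustration only.
records = [
    {"organism": "Daphnia magna", "effect": "Estrogenic effect"},
    {"organism": "Daphnia magna", "effect": "Estrogenic effect"},
    {"organism": "Daphnia magna", "effect": "No estrogenic effect"},
    {"organism": "Balanus amphitrite", "effect": "No estrogenic effect"},
    {"organism": "Balanus amphitrite", "effect": "No estrogenic effect"},
]
rules = one_rule_per_value(records, "organism", "effect")
for organism, effect in rules.items():
    print(f"Organism = {organism}: {effect}")
```

Rules of this kind mostly summarize which organisms or compounds were tested and what was reported for them, which is one reason expert review of the induced rules matters.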
18
3) Which factors affect the estrogenic effects
Estrogenic effects in crustaceans
Tested variables: effects, organisms, compounds, age, flow, in vitro/in vivo, duration
Induced rule set (16 rules, one for each age class and, for larval, also one for each organism type):
Age = Juvenile: No estrogenic effect
Age = Larval
|   Organism = Balanus amphitrite: Estrogenic effect
|   Organism =...
Age = Adult: Estrogenic effect
Age = Egg: Estrogenic effect
Reliability (CCI): 78 %
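A nested rule set like this one is a small decision tree, which can be stored as data and walked generically. The sketch below uses a hypothetical nested-dictionary layout; the larval branches that the slide elides are left out rather than guessed, so those cases return None.

```python
# Tree as nested dicts: an inner node names the attribute to test and maps
# each attribute value to either a class label or another inner node.
TREE = {
    "attribute": "age",
    "branches": {
        "Juvenile": "No estrogenic effect",
        "Larval": {
            "attribute": "organism",
            # Only the branch shown on the slide; the others are elided there.
            "branches": {"Balanus amphitrite": "Estrogenic effect"},
        },
        "Adult": "Estrogenic effect",
        "Egg": "Estrogenic effect",
    },
}


def classify(tree, record):
    """Walk the tree for one record; return None if no branch matches."""
    node = tree
    while isinstance(node, dict):
        value = record.get(node["attribute"])
        if value not in node["branches"]:
            return None
        node = node["branches"][value]
    return node


print(classify(TREE, {"age": "Juvenile"}))
print(classify(TREE, {"age": "Larval", "organism": "Balanus amphitrite"}))
print(classify(TREE, {"age": "Larval", "organism": "Daphnia magna"}))
```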
19
General discussion
This exercise on the ED-North database illustrated that data mining can help to find relations between:
- Type of organisms
- Test and environmental conditions
- Estrogenic effects
- Compounds and their structure
20
General discussion
- Data mining helps to find errors and outliers in the data set, and creates insights that improve further data collection and the development of databases
- Interaction between data miners and domain experts (ecologists, ecotoxicologists) is very important:
1) it is easy to obtain ‘reliable nonsense’ rules by excluding important variables during the analysis (need for the expertise of the ecotoxicologist)
2) the parameter settings, and the insight needed to tune them, have a very important impact on the richness of the outcome of the data mining exercise (need for data mining expertise)
21
General discussion
The collected data set itself influences the outcome of the analysis to an important extent:
1) It is important to collect data that cover the whole range (variables and their values/classes), and stratification of the instances is necessary
2) The selection of variable classes can affect the results to a high extent (e.g. larval-adult problem, number of effect classes, ...)
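One reading of "stratification of the instances" is that every class should be represented proportionally when the data are split for training and testing. A minimal sketch under that assumption follows. The function name, the 25 % test fraction and the records are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(records, key, test_fraction=0.25, seed=0):
    """Split records so each class of `key` appears in the test set in proportion."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[rec[key]].append(rec)
    train, test = [], []
    for group in strata.values():
        group = group[:]          # copy, so the caller's lists are untouched
        rng.shuffle(group)
        # Classes with a single instance stay entirely in the training set.
        n_test = max(1, round(len(group) * test_fraction)) if len(group) > 1 else 0
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Invented, imbalanced data: 8 larval and 4 adult test records.
records = [{"age": "Larval", "id": i} for i in range(8)]
records += [{"age": "Adult", "id": i} for i in range(8, 12)]
train, test = stratified_split(records, "age")
print(len(train), len(test))
```

Without stratification, a random split of a small, imbalanced data set can leave a rare class out of the test set entirely, which makes the reported reliability hard to interpret.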
22
Conclusions
- Data mining reveals which gaps exist in the database and delivers information for sustainable data collection and management
- Data mining delivers insight into the dataset: generation of knowledge from data
- Highly unpredictable parts of the dataset are useful targets for further research
- General, reliable rules are promising for decision support in environmental management
- It is important to be aware that the analysis explores correlations, not causal relations! Control by experts or further research (validation) is always necessary
- Data mining adds more colour to our data
23
Acknowledgements
- Federal Office for Scientific, Technical and Cultural Affairs (OSTC)
- Thesis students
- Ward Vanden Berghe (VLIZ)
- The Flemish Institute for the Promotion of Scientific and Technological Research in Industry (IWT)