Linear and Conjunctional Methods for Recognition of Environmental Influence on the Human Organism
T.K. Breus and V.A. Ozheredov, Space Research Institute RAS
IKI-MSR Research Workshop, 11-12 June 2009
Cooperation
We work with ten institutes from two departments of the Academy of Sciences (Physics and Biology): IMBP, ITEB, IZMIRAN, IOPHAN RAS, the Pediatric Institute RAMS, Sechenov's Medical Academy, and various clinics and city hospitals.
International cooperation: USA (Halberg's Center of Chronobiology), Japan (medical schools), China (three clinics), Germany, Bulgaria (STCL), Slovakia and Norway; about 150 scientists from the international BIOCOS (Biology and Cosmos) association, which was founded three years ago by Franz Halberg (USA).
Official status
We work within the framework of two programs of the Russian Academy of Sciences: N21 "Fundamental Sciences - to Medicine" and N16 "Solar-Terrestrial Connections".
There is a multidisciplinary group in the Space Weather Laboratory at IKI.
The Academy of Sciences Council has a Section on "Heliobiophysics", which holds a monthly interdisciplinary seminar.
State of the Art
Until now, investigations of the influence of terrestrial weather (W) and geomagnetic activity (GMA) on the human organism have mainly relied on cross-correlation coefficients between W and GMA parameters and various health parameters.
In modern society the effect of W and GMA is weak in comparison with social and technological factors. As a result, researchers in many countries have found positive but statistically weak correlation coefficients in most cases.
Objectives
On the other hand, W and GMA influences can lead to such acute health effects as myocardial infarctions (MI), brain strokes (BS), hypertension crises (HC) and even sudden death (SD) [Gurfinkel et al., 2008, 2009; Breus et al., 1995, 2009; Stoupel et al., 1991; Neil Cherry, 2002]. Thus the investigation of W and GMA influence on humans remains relevant but requires new approaches.
Within traditional approaches, well-founded results can be established only if the correlation coefficient r is close to 1. Otherwise (r < 0.5) the results are not reliable, because in this case less than r^2 = 0.5^2 = 0.25 (25%) of the total least-squares amplitude (variance) can be explained.
Our approach differs from previous ones: it is based on the theory of pattern recognition.
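The 25% figure follows from squaring the correlation coefficient. A minimal sketch of that arithmetic, assuming the usual reading of r^2 as the fraction of variance explained by a linear fit (the function name is illustrative, not from the talk):

```python
def explained_fraction(r):
    """Fraction of variance explained by a linear relation with correlation r."""
    return r * r

# Print the explained fraction for a few illustrative correlation values.
for r in (0.3, 0.5, 0.9):
    print(f"r = {r:.1f} -> r^2 = {explained_fraction(r):.2f}")
```

With r = 0.5 this gives 0.25, so three quarters of the variability is left unexplained by the linear relation.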
Theoretical approach
We have a set of tags: atmospheric temperature T, pressure P, Kp-index.
We also have responses of the human organism, e.g. the number of myocardial infarctions (MI), brain strokes (BS), the level of blood pressure (BP) and so on, which we suppose to be connected with the action of the aforementioned tags.
A priori we postulate the existence of two classes of situations:
1st class: the number of disease cases is larger than average, or the BP level is higher than normal;
2nd class: all remaining cases.
It is necessary to find the decision rule that selects the objects belonging to each of these classes. This decision rule is the main subject of the theory of pattern recognition.
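The two-class labeling can be sketched as follows. This is only an illustration, assuming the "larger than average" criterion is applied to the sample mean of the response; the daily counts are hypothetical and the function name is not from the talk.

```python
from statistics import mean

def label_by_average(responses):
    """Return 1 (1st class) where the response exceeds the sample average, else 2."""
    avg = mean(responses)
    return [1 if y > avg else 2 for y in responses]

# Hypothetical daily numbers of myocardial infarctions (illustrative only).
daily_mi = [3, 7, 4, 9, 2, 5]
labels = label_by_average(daily_mi)
print(labels)
```

Each labeled day, together with its tag vector (T, P, Kp), then forms one precedent for training.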
Theory of pattern recognition
1) Division of the tag space by a linear separatrix, if this method is adequate in the particular case.
2) Division of the tag space by a nonlinear closed separatrix using a conjunctional method, when the convex hulls of the precedents (objects involved in the training of the algorithm) overlap.
In both cases the separatrix defines a "critical area" in the tag space that allows us to decide to which of the two classes a given object belongs. The decision is based on the position of the tag vector of a new object: if this vector is inside the critical area, we decide that the object belongs to the 1st class; if it is outside, the object belongs to the 2nd class.
The final purpose of the theory is to learn how to assign objects to classes using their sets of tags.
Figure: tag space division in the case of separation (a) and overlapping (b) of the convex hulls of precedents.
Linear Pattern Recognition
We search for a linear separatrix with optimal characteristics. The normal vector w is the main parameter of the linear hypersurface (separatrix); its projections on the tag axes show the relative contributions of particular tags to the biological response.
The separatrix position is defined by optimization of the error balance. An error of the 1st type appears if an object of the 1st class falls outside the critical area; an error of the 2nd type appears if an object of the 2nd class falls inside the critical area. These errors are functions of w, and varying the direction of w during optimization changes the error balance.
We search for the balance with the help of an Objective Function (OF), which is a convex combination of the 1st and 2nd type errors. The OF is depicted in the lower-left part of the figure. We minimize the OF in order to find a valid separatrix, using the simplex method for the minimum search.
We introduce a so-called Barrier Function (right side of the lower figure) in order to limit the loss of OF resolution with distance from zero (see left side of the figure).
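The error balance can be illustrated with a small two-dimensional sketch. It is not the authors' implementation: the critical area is assumed to be the half-plane w.x > b, the weight alpha of the convex combination is a free parameter, the Barrier Function is omitted, and a brute-force search over directions and offsets is swapped in for the simplex search. All names and data are hypothetical.

```python
import math

def error_rates(w, b, class1, class2):
    """Type 1 and type 2 error rates for the critical area {x : w.x > b}."""
    def inside(x):
        return w[0] * x[0] + w[1] * x[1] > b
    e1 = sum(not inside(x) for x in class1) / len(class1)  # 1st class left outside
    e2 = sum(inside(x) for x in class2) / len(class2)      # 2nd class caught inside
    return e1, e2

def objective(w, b, class1, class2, alpha=0.5):
    """Convex combination of the two error rates (assumed form of the OF)."""
    e1, e2 = error_rates(w, b, class1, class2)
    return alpha * e1 + (1.0 - alpha) * e2

def fit_separatrix(class1, class2, alpha=0.5, n_angles=180):
    """Brute-force search over unit normals w and offsets b (not the simplex method)."""
    points = class1 + class2
    best = None
    for k in range(n_angles):
        theta = 2.0 * math.pi * k / n_angles
        w = (math.cos(theta), math.sin(theta))
        proj = sorted(w[0] * x[0] + w[1] * x[1] for x in points)
        # Candidate offsets: midpoints between consecutive projections.
        for p, q in zip(proj, proj[1:]):
            b = (p + q) / 2.0
            of = objective(w, b, class1, class2, alpha)
            if best is None or of < best[0]:
                best = (of, w, b)
    return best

# Hypothetical precedents in a 2-dimensional tag space (illustrative only).
class1 = [(5, 3), (6, 4), (7, 3), (6, 5)]
class2 = [(1, 1), (2, 0), (1, 2), (2, 2)]
of, w, b = fit_separatrix(class1, class2)
print(of, w, b)
```

The error rates are piecewise constant in w and b, which is why a derivative-free search (simplex or, here, a grid) is a natural choice. The components of the returned w play the role of the relative tag contributions.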
Result of linear division of the 2-dimensional tag space (Kp-index and atmospheric pressure AP) for revealing weather influence on arterial blood pressure (arterial hypertension, AH)
Materials: 2503 daily arterial blood pressure (ABP) measurements (taken once each morning) from people suffering from AH and monitored at the A.L. Miasnikov Cardiology Center (Moscow) in 2001-2002. The corresponding space (Kp-index) and terrestrial (AP) weather parameters were taken from Internet sites.
Results: both factors act simultaneously, with relative contributions to the influence, according to the figure, of 6 Kp / 4.8 AP.
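The quoted 6 Kp / 4.8 AP can be turned into percentage shares. The sketch below assumes these two numbers are the components of the separatrix normal vector along the (suitably scaled) tag axes and that shares are taken in proportion to their magnitudes; both are assumptions made for illustration, not statements from the talk.

```python
def relative_shares(weights):
    """Normalize the magnitudes of the weights so that they sum to 1."""
    total = sum(abs(v) for v in weights.values())
    return {name: abs(v) / total for name, v in weights.items()}

# Values quoted on the slide; their reading as normal-vector components is assumed.
shares = relative_shares({"Kp": 6.0, "AP": 4.8})
for name, share in shares.items():
    print(f"{name}: {100 * share:.0f}%")
```

Under this reading the Kp-index accounts for roughly 56% and atmospheric pressure for roughly 44% of the combined influence.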
Necessity of using nonlinear approaches for pattern recognition
Advantages: 1) transparency; 2) flexibility; 3) controllability.
Disadvantages: the confidence of the local probability density estimate decreases sharply as the dimension of the tag space increases; lack of dividing efficacy in the case of strong overlapping of the convex hulls of precedents (objects involved in the training of the algorithm).
Nonlinear conjunctional method advantages
1) Transparency: the behavior of each component of the system can be interpreted, which allows conclusions about the correctness of the operation of the system as a whole.
2) Flexibility: the system responds properly when the operating conditions change from those initially assumed.
3) Controllability: scenarios for optimizing the parameters of separate system components can be created manually.
Conjunctional tag space division and its characteristics
We use the so-called Neyman-Pearson model of precedent generation: 1) precedents are generated independently of each other; 2) the probability density associated with a given class is stationary. These two conditions define a REPRESENTATIVE SAMPLE of precedents.
The Neyman-Pearson lemma gives an optimal criterion for the critical area: in the critical area the ratio of the probability densities of the two classes must be larger than a certain threshold c (defined from additional conditions).
Since we do not know the probability densities and cannot use the Neyman-Pearson decision rule directly, we have to make interval estimates of the probability densities. Around each point we construct a cell (see figure), which we call a conjunction, and perform an interval estimation of the probability density at that point within the cell.
The larger the conjunction, the more accurate the estimates; in turn, the smaller the conjunction, the higher the resolution.
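The idea of a local likelihood estimated in a conjunction can be sketched as follows. This is a simplified illustration, not the authors' algorithm: cells are assumed to be axis-aligned boxes, and simple point estimates (relative counts) are swapped in for the interval estimates mentioned above. All names and data are hypothetical.

```python
import math

def count_in_cell(center, half_size, points):
    """Number of points inside the axis-aligned cell (conjunction) around center."""
    return sum(
        all(abs(p[i] - center[i]) <= half_size[i] for i in range(len(center)))
        for p in points
    )

def local_likelihood(center, half_size, class1, class2):
    """Ratio of the estimated 1st-class density to the 2nd-class density in the cell.

    The cell volume cancels, so only relative counts are needed.
    Returns None when the cell is empty and the ratio is undefined.
    """
    n1 = count_in_cell(center, half_size, class1)
    n2 = count_in_cell(center, half_size, class2)
    if n2 == 0:
        return math.inf if n1 > 0 else None
    return (n1 / len(class1)) / (n2 / len(class2))

def in_critical_area(center, half_size, class1, class2, c):
    """Neyman-Pearson style rule: the local likelihood must exceed the threshold c."""
    ratio = local_likelihood(center, half_size, class1, class2)
    return ratio is not None and ratio > c

# Hypothetical precedents as (temperature, pressure) pairs (illustrative only).
class1 = [(0, 1000), (1, 1002), (-1, 998), (20, 1010)]
class2 = [(0, 1001), (15, 1012), (18, 1009), (22, 1011)]
print(local_likelihood((0, 1000), (2, 3), class1, class2))
```

Enlarging `half_size` puts more precedents into each cell and stabilizes the estimate, at the cost of resolution, which is the trade-off described above.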
Search for the optimal threshold c
G is the critical area, which is sought as the set of points where the local likelihood exceeds the threshold c. The local likelihood is the ratio of the estimated probability density associated with the 1st class to the estimated probability density associated with the 2nd class.
We perform training in two steps: 1) we divide the entire database into two parts, A (training part) and B (test part); 2) we generate the critical area G using only A and estimate the characteristics of G using only B (cross-validation method).
We search for the threshold c using the GOOD and BAD measures. BAD counts the cases in which an object from the 2nd class appears in the critical area (1st class). GOOD counts the cases in which an object from the 1st class appears in the critical area (1st class).
The green and red curves in the lower figure show the dependence of GOOD and BAD on the threshold c. The straight green line shows the lower boundary for the number of true classifications (GOOD), and the red line shows the upper boundary for the number of false classifications (BAD). The three curves for GOOD and for BAD correspond to different cycles of cross-validation.
Thus the permitted region for the threshold c can be defined as the region where both conditions (for BAD and GOOD) are fulfilled. Its width is the width of the permitted likelihood range.
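The search for the permitted region of c can be sketched on a single A/B split. The sketch assumes that the local likelihood has already been evaluated (from part A) at the tag vectors of the part-B objects, and that GOOD and BAD are expressed as fractions of the respective class; the bounds and all numbers are hypothetical.

```python
def good_bad(c, lik_class1, lik_class2):
    """Fractions of 1st-class (GOOD) and 2nd-class (BAD) test objects inside G."""
    good = sum(v > c for v in lik_class1) / len(lik_class1)
    bad = sum(v > c for v in lik_class2) / len(lik_class2)
    return good, bad

def permitted_thresholds(candidates, lik_class1, lik_class2, good_min, bad_max):
    """Thresholds c for which GOOD stays above its lower bound and BAD below its upper bound."""
    permitted = []
    for c in candidates:
        good, bad = good_bad(c, lik_class1, lik_class2)
        if good >= good_min and bad <= bad_max:
            permitted.append(c)
    return permitted

# Hypothetical local likelihoods of the part-B objects (illustrative only).
lik_class1 = [0.8, 1.5, 2.0, 2.5, 3.0]
lik_class2 = [0.3, 0.6, 0.9, 1.2, 2.2]
region = permitted_thresholds([0.5, 1.0, 1.5, 2.0, 2.5],
                              lik_class1, lik_class2,
                              good_min=0.6, bad_max=0.4)
print(region)
```

Repeating the split several times, as in the cross-validation cycles on the slide, would give a family of GOOD and BAD curves; the permitted region is where all of them satisfy both bounds.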
Influence of the terrestrial weather on human hypertension
The systolic blood pressure (SBP) database contains 680 precedents: atmospheric temperature T, pressure P, and SBP values.
The most confident results were obtained for the following division of precedents into classes: SBP 153 – hypertension
The critical area is not a single region; it is located in three segments: 1) temperature near zero and low atmospheric pressure; 2) temperature between -3 and -10 degrees and high atmospheric pressure; 3) temperature from +15 to +30 degrees and normal pressure.
The GOOD/BAD ratio is about 1.7 (i.e., 63% true classifications).
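The step from the GOOD/BAD ratio to the percentage of true classifications is simple arithmetic. A minimal sketch, assuming the percentage is GOOD / (GOOD + BAD) (the function name is illustrative):

```python
def true_fraction(good_bad_ratio):
    """Share of true classifications, GOOD / (GOOD + BAD), from the ratio GOOD / BAD."""
    return good_bad_ratio / (1.0 + good_bad_ratio)

# Ratio quoted on the slide.
print(f"{100 * true_fraction(1.7):.0f}%")
```

For a ratio of 1.7 this gives 1.7 / 2.7, which rounds to 63%.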
Conclusions
Weather and GMA effects were revealed, and the corresponding limits of atmospheric pressure and Kp-index that are hazardous for the development of hypertension were obtained.
This approach can be used for automatic control of space and terrestrial weather influence on humans in generalized forecasting systems that use satellites.
It can be used for multifunctional medical diagnostics and for selecting the most effective medical treatment.
For multifactor environmental effects, this approach allows an objective selection of the most hazardous factors.
Further applications: classification of objects taking into account their stochastic behavior; forecasting in a wide range of processes; development of game strategies.
The construction of systems that divide objects into classes automatically and also search for optimal sizes of conjunctions requires extremely high computing performance and is an objective of our future work.