1
A new family of regular semivalues and applications
Roberto Lucchetti, Politecnico di Milano, Italy

2
Main goal: to rank genes from DNA data provided by microarray analysis.
Tools: cooperative game theory, in particular power indices.
Power indices rank players according to their strength in the game.
In the EU Council, the strongest states (GE, FR, IT, UK) have some 10 times the power of the weakest state (MT).
In the UN, the veto players have some 100 (10) times the power of the non-permanent players, according to the Shapley (Banzhaf) index.

3
A (TU) game is a pair $(N, v)$ with $v : 2^N \to \mathbb{R}$ and $v(\emptyset) = 0$, where $N = \{1, \dots, n\}$ is the set of players and $v$ is the characteristic function of the game.
A subset $A \subseteq N$ is called a coalition; $v(A)$ is the utility (or cost) for the coalition $A$.
$G_N$ represents the set of all games having $N$ as set of players.
Remark: $G_N \cong \mathbb{R}^{2^n - 1}$.

4
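A base for $G_N$: the unanimity games $u_S$, $\emptyset \neq S \subseteq N$, with $u_S(T) = 1$ if $S \subseteq T$ and $u_S(T) = 0$ otherwise.
Subclass of games: simple games. Among them, the weighted majority games $[q; w_1, \dots, w_n]$, in which a coalition $A$ wins ($v(A) = 1$) if $\sum_{i \in A} w_i \ge q$ and loses ($v(A) = 0$) otherwise.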
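A minimal Python sketch of unanimity games, weighted majority games, and the coordinates of a game in the unanimity base. The function names (`coalitions`, `unanimity`, `weighted_majority`, `unanimity_coordinates`) are illustrative and do not come from the talk. The coordinates are computed with the standard Moebius inversion $c_S = \sum_{T \subseteq S} (-1)^{|S|-|T|} v(T)$, assuming $v(\emptyset) = 0$.

```python
from itertools import combinations

def coalitions(players):
    """All subsets of `players` as frozensets, including the empty set."""
    players = list(players)
    return [frozenset(c) for r in range(len(players) + 1)
            for c in combinations(players, r)]

def unanimity(S):
    """Unanimity game u_S: worth 1 exactly when the coalition contains S."""
    S = frozenset(S)
    return lambda T: 1 if S <= frozenset(T) else 0

def weighted_majority(q, weights):
    """Game [q; w_1, ..., w_n] on players 1..n: a coalition wins iff its weight is >= q."""
    return lambda T: 1 if sum(weights[i - 1] for i in T) >= q else 0

def unanimity_coordinates(v, players):
    """Coordinates c_S of v in the unanimity base, for every non-empty S."""
    coords = {}
    for S in coalitions(players):
        if S:
            coords[S] = sum((-1) ** (len(S) - len(T)) * v(T)
                            for T in coalitions(S))
    return coords

# Example: simple majority among three players, [2; 1, 1, 1].
players = [1, 2, 3]
v = weighted_majority(2, [1, 1, 1])
c = unanimity_coordinates(v, players)
```

The dictionary `c` has $2^3 - 1 = 7$ entries, one per non-empty coalition, in line with the remark that $G_N$ has dimension $2^n - 1$.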

5
Introduction: how a microarray works
A chip can contain millions of DNA probes.

6
Introduction: how a microarray works
Hybridization
When a single DNA strand meets a single mRNA strand, if they are complementary they will stick to each other. Hybridization helps researchers identify which RNA sequences are present in a sample; this tells them which genes are being expressed by the organism and how strongly.

7
Introduction: how a microarray works
GeneChip microarrays use the natural chemical attraction between the RNA target (from the sample preparation) and the DNA on the array to determine the expression level of a given gene.
Base pairing (DNA/RNA): Adenine (A) pairs with Thymine (T)/Uracil (U); Guanine (G) pairs with Cytosine (C).

8
Introduction: how a microarray works
The RNA extracted from a sample is copied into cRNA (through a process known as PCR). Copying the RNA allows it to be more easily detected on the array. While the RNA is copied, a fluorescent chemical molecule called biotin is attached to the strand. This molecule will show where the sample RNA has stuck to the DNA probe on the array.

9
Introduction: how a microarray works
If the gene is highly expressed, many RNA molecules will stick to the probe, and the probe location will shine brightly when the laser hits it. If the sample RNA doesn't match, it will be rejected by the probe on the array, and when the laser hits the probe nothing glows.

10
Introduction: how a microarray works
The whole point of microarray gene expression analysis is to compare expression levels among different samples. Let's simplify the situation with an example in which we have four genes and two samples.
Gene 1: 2RUDE
Gene 2: 2LOUD
Gene 3: GETOUT
Gene 4: FATMET
Gene 4 is not glowing.

11
          array 1   array 2   array 3   array 4   …
gene 1    0.67      0.45      1.32      1.34      …
gene 2    1.01      1.13      1.54      2.13      …
gene 3    1.38      1.21      1.23      0.12      …
gene 4    0.65      0.98      0.54      …         …
gene 5    0.17      1.32      2.43      …         …
…         …         …         …         …         …

Example: 0.98 is the expression level of gene 4 in array 2.

12
The Microarray Game
An $m \times n$ Boolean matrix $M$ such that …
Given the column …, supp …
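The definition on this slide is truncated, so the following Python sketch rests on an assumption: it follows the microarray game as commonly defined in the literature on microarray games (Moretti, Patrone and Bonassi, 2007), where the worth of a coalition $T$ of genes is the fraction of columns whose non-empty support is contained in $T$. Under that assumption the game is an average of unanimity games, so its Shapley value splits each column's share equally among the genes in its support. The function names and the example matrix are illustrative.

```python
from fractions import Fraction

def supports(M):
    """Support of each column of the Boolean matrix M (rows = genes, columns = samples)."""
    n_genes, n_samples = len(M), len(M[0])
    return [frozenset(i for i in range(n_genes) if M[i][j] == 1)
            for j in range(n_samples)]

def microarray_game(M):
    """Assumed definition: v(T) = fraction of columns with non-empty support inside T."""
    sp = supports(M)
    def v(T):
        T = frozenset(T)
        return Fraction(sum(1 for s in sp if s and s <= T), len(sp))
    return v

def shapley_microarray(M):
    """Shapley value of the assumed game: each column gives 1/|support| to its genes."""
    sp = supports(M)
    value = [Fraction(0)] * len(M)
    for s in sp:
        for i in s:
            value[i] += Fraction(1, len(s) * len(sp))
    return value

# Illustrative 3-gene, 4-sample Boolean matrix (genes are indexed 0, 1, 2).
M = [[0, 0, 1, 0],
     [1, 1, 0, 0],
     [1, 0, 1, 1]]
v = microarray_game(M)
ranking_scores = shapley_microarray(M)
```

In this example the third gene (index 2) receives the largest score, since it is abnormally expressed in three of the four samples.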

13
          Sample 1  Sample 2  Sample 3
gene 1    0.5       0.2       1
gene 2    0.4       1         0.3
gene 3    0.8       0.4       0.2

          Sample 1  Sample 2  Sample 3  Sample 4
gene 1    0.7       0.3       1.8       0.8
gene 2    0.1       0.2       0.5       0.9
gene 3    1         0.6       1.7       0.1

          Sample 1  Sample 2  Sample 3  Sample 4
gene 1    0         0         1         0
gene 2    1         1         0         0
gene 3    1         0         1         1
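The slide gives no caption, but the third table is consistent with the following rule, which this sketch assumes: the first table is a reference set, and a cell of the second table is marked 1 when its value falls outside the [min, max] range of that gene in the reference set, 0 otherwise. The function name `discretize` is illustrative.

```python
def discretize(reference, samples):
    """Mark 1 where samples[i][j] lies outside [min, max] of reference[i], else 0."""
    boolean = []
    for ref_row, row in zip(reference, samples):
        lo, hi = min(ref_row), max(ref_row)
        boolean.append([0 if lo <= x <= hi else 1 for x in row])
    return boolean

# Values copied from the first two tables of the slide.
reference = [[0.5, 0.2, 1],
             [0.4, 1, 0.3],
             [0.8, 0.4, 0.2]]
samples = [[0.7, 0.3, 1.8, 0.8],
           [0.1, 0.2, 0.5, 0.9],
           [1, 0.6, 1.7, 0.1]]
boolean = discretize(reference, samples)
```

With these inputs, `boolean` reproduces the third table of the slide row by row.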

14
A power index for the game $(N, v)$ is a vector $(x_1, \dots, x_n)$ such that $x_i$ represents the power of player $i$ in the game $v$.
Weighted voting does not work…
The most famous: the Shapley and Banzhaf indices.

15
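$v(S \cup \{i\}) - v(S)$: the marginal contribution of $i$ to $S \cup \{i\}$.
Shapley ($\sigma$) and Banzhaf ($\beta$), with $s = |S|$:
$$\sigma_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{s!\,(n-s-1)!}{n!}\,[v(S \cup \{i\}) - v(S)], \qquad \beta_i(v) = \frac{1}{2^{n-1}} \sum_{S \subseteq N \setminus \{i\}} [v(S \cup \{i\}) - v(S)].$$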
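A brute-force Python sketch of the two indices as weighted sums of marginal contributions, using the standard Shapley and Banzhaf weights. It enumerates all coalitions, so it is only meant for small games; the function names and the example game $[3; 2, 1, 1]$ are illustrative choices, not taken from the talk.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def semivalue(v, players, weight):
    """Sum over coalitions S not containing i of weight(|S|) * (v(S + i) - v(S))."""
    result = {}
    for i in players:
        others = [p for p in players if p != i]
        total = Fraction(0)
        for s in range(len(others) + 1):
            for S in combinations(others, s):
                S = frozenset(S)
                total += weight(s) * (v(S | {i}) - v(S))
        result[i] = total
    return result

def shapley(v, players):
    n = len(players)
    return semivalue(v, players,
                     lambda s: Fraction(factorial(s) * factorial(n - s - 1), factorial(n)))

def banzhaf(v, players):
    n = len(players)
    return semivalue(v, players, lambda s: Fraction(1, 2 ** (n - 1)))

# Example: weighted majority game [3; 2, 1, 1].
weights = {1: 2, 2: 1, 3: 1}
def v(S):
    return 1 if sum(weights[i] for i in S) >= 3 else 0

players = [1, 2, 3]
sh = shapley(v, players)
bz = banzhaf(v, players)
```

In this example the weights are in proportion 2 : 1 : 1, while the Shapley index gives 4 : 1 : 1 and the Banzhaf index 3 : 1 : 1, which illustrates why weights alone do not measure power.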

16
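A value $\pi$ is a probabilistic value if, for each player $i$, there is a probability $p_i$ on $2^{N \setminus \{i\}}$ such that
$$\pi_i(v) = \sum_{S \subseteq N \setminus \{i\}} p_i(S)\,[v(S \cup \{i\}) - v(S)].$$
Shapley: $p_i(S) = \frac{s!\,(n-s-1)!}{n!}$. Banzhaf: $p_i(S) = \frac{1}{2^{n-1}}$. Here $s = |S|$.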

17
If $p_i(S) = p(|S|) > 0$, the probabilistic value is called a regular semivalue.
Examples: Banzhaf, Shapley, p-binomial.
Regular semivalues are points in the simplex: writing $p_s = p(s)$, $\sum_{s=0}^{n-1} \binom{n-1}{s} p_s = 1$ with $p_s > 0$ for all $s$.
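A small Python check of the simplex condition for the three examples. The weights used are the standard ones: $p_s = s!(n-s-1)!/n!$ for Shapley, $p_s = 1/2^{n-1}$ for Banzhaf, and $p_s = q^s (1-q)^{n-s-1}$ for the binomial semivalue. The binomial parameter is called `q` here only to avoid a clash with the weights $p_s$; all function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial

def shapley_weights(n):
    return [Fraction(factorial(s) * factorial(n - s - 1), factorial(n)) for s in range(n)]

def banzhaf_weights(n):
    return [Fraction(1, 2 ** (n - 1))] * n

def binomial_weights(n, q):
    """Binomial semivalue with parameter 0 < q < 1."""
    return [q ** s * (1 - q) ** (n - s - 1) for s in range(n)]

def on_open_simplex(p):
    """True if all p_s > 0 and sum_s C(n-1, s) * p_s == 1, with n = len(p)."""
    n = len(p)
    return all(x > 0 for x in p) and sum(comb(n - 1, s) * p[s] for s in range(n)) == 1

n = 5
checks = [on_open_simplex(shapley_weights(n)),
          on_open_simplex(banzhaf_weights(n)),
          on_open_simplex(binomial_weights(n, Fraction(1, 3)))]
```

Exact fractions are used so that the normalization can be tested with equality; with `q = 1/2` the binomial weights coincide with the Banzhaf weights.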

18
Properties for power indices
Let $\phi : G_N \to \mathbb{R}^n$ be a solution.
The solution $\phi$ has the dummy player (DP) property if, for each player $i$ such that $v(A \cup \{i\}) = v(A) + v(\{i\})$ for all coalitions $A$ not containing $i$, $\phi_i(v) = v(\{i\})$.

19
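Let $\theta : N \to N$ be a permutation. Given the game $v$, denote by $\theta^* v$ the game $(\theta^* v)(A) = v(\theta(A))$ and by $\theta^* x$ the vector $(x_{\theta(1)}, \dots, x_{\theta(n)})$.
A solution $\phi$ has the symmetry (S) property if, for each permutation $\theta$ as above, $\phi(\theta^* v) = \theta^*(\phi(v))$.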

20
The new family of power indices
For a parameter $a > 0$, the value is defined on each unanimity game $u_S$ and then extended by linearity to a generic game in $G_N$.

24
Theorem 1. There exists one and only one value fulfilling the symmetry, linearity and dummy player properties, and assigning $a_s$ to all non-null players in the unanimity game $u_S$ (with $s = |S|$), where $a_1 = 1$ and $a_s > 0$ for $s = 2, \dots, n$. The value fulfills the formula: …
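One way to make the theorem concrete, offered as a sketch and not necessarily the formula shown on the slide. Assume, as is standard for linear and symmetric values with the dummy player property, that the value can be written as $\sum_{S \not\ni i} p_s\,[v(S \cup \{i\}) - v(S)]$. Evaluating on $u_S$ gives $a_s = \sum_{t=s-1}^{n-1} \binom{n-s}{t-s+1} p_t$, and binomial inversion gives $p_t = \sum_{k=0}^{n-1-t} (-1)^k \binom{n-1-t}{k} a_{t+1+k}$. The function name below is hypothetical.

```python
from fractions import Fraction
from math import comb

def weights_from_unanimity_values(a):
    """Given a[s-1] = a_s for s = 1..n, return the weights p[t] for t = 0..n-1.

    Uses p_t = sum_k (-1)^k * C(n-1-t, k) * a_{t+1+k}, k = 0..n-1-t.
    """
    n = len(a)
    return [sum((-1) ** k * comb(n - 1 - t, k) * a[t + k] for k in range(n - t))
            for t in range(n)]

n = 4
# a_s = 1/s is what the Shapley value assigns in u_S; a_s = 1/2^(s-1) is Banzhaf.
p_shapley = weights_from_unanimity_values([Fraction(1, s) for s in range(1, n + 1)])
p_banzhaf = weights_from_unanimity_values([Fraction(1, 2 ** (s - 1)) for s in range(1, n + 1)])
```

The condition $a_1 = 1$ corresponds to $\sum_t \binom{n-1}{t} p_t = 1$. Positivity of the resulting weights is not automatic for an arbitrary choice of the $a_s$, which is why regularity has to be proved separately for a given family.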

25
Theorem 2. The value of the family with parameter $a$ is a regular semivalue for all $a > 0$. For $a = 2$, it fulfills the formula: …
Corollary. The family of the weighting coefficients of these values, $a > 0$, is an open curve in the simplex of the regular semivalues, containing the Shapley value. The addition of the Banzhaf value to the curve provides a one-point compactification of the curve.

26
Theorem 3
Key tool: study of the term: …
Let …, let … Then …
Moreover, for all natural $l$ and positive real $a$, $x$: …
Finally, for each natural $m$, the following formula holds: …

27
Calculating the indices in weighted majority games
For a player $i$, count in how many ways the sum of the weights of $j$ players different from $i$ can give $k$. Then the following proposition holds.
Let the value be the one defined in the theorem above. Let $q > 0$ be a positive integer, and let $w_1, \dots, w_n$ be non-negative integers. Let $v = [q; w_1, \dots, w_n]$ be the associated weighted majority game. Then the following formula holds: …
An efficient algorithm based on generating functions and formal series allows for a fast calculation of the coefficients.
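A Python sketch of the counting idea, under stated assumptions. The counts are the coefficients of $x^k y^j$ in the generating function $\prod_{l \neq i} (1 + x^{w_l} y)$, built here by dynamic programming. Player $i$ is decisive for a coalition of weight $k$ exactly when $q - w_i \le k \le q - 1$, so any semivalue with weights $p_j$ is $\sum_j p_j$ times the number of such coalitions of size $j$. This is a generic construction, not necessarily the algorithm of the talk; the function names are illustrative. The example is the usual weighted representation of the UN Security Council, $[39; 7, 7, 7, 7, 7, 1, \dots, 1]$ with ten weights equal to 1, which is textbook material and not from the slides.

```python
from fractions import Fraction
from math import factorial

def weight_counts(q, weights, i):
    """c[k][j] = number of coalitions of j players other than i with total weight k < q."""
    n = len(weights)
    c = [[0] * n for _ in range(q)]
    c[0][0] = 1
    for l, w in enumerate(weights):
        if l == i:
            continue
        # Multiply by (1 + x^w * y); descending loops use each player at most once.
        for k in range(q - 1, w - 1, -1):
            for j in range(n - 1, 0, -1):
                c[k][j] += c[k - w][j - 1]
    return c

def semivalue_wmg(q, weights, p):
    """Semivalue with size weights p[0..n-1] in the game [q; weights]."""
    n = len(weights)
    values = []
    for i, w in enumerate(weights):
        c = weight_counts(q, weights, i)
        swings = [sum(c[k][j] for k in range(max(q - w, 0), q)) for j in range(n)]
        values.append(sum(p[j] * swings[j] for j in range(n)))
    return values

weights = [7] * 5 + [1] * 10
n = len(weights)
p_shapley = [Fraction(factorial(j) * factorial(n - 1 - j), factorial(n)) for j in range(n)]
p_banzhaf = [Fraction(1, 2 ** (n - 1))] * n
sh = semivalue_wmg(39, weights, p_shapley)
bz = semivalue_wmg(39, weights, p_banzhaf)
shapley_ratio = sh[0] / sh[5]   # veto player vs non-permanent player
banzhaf_ratio = bz[0] / bz[5]
```

In this representation the ratio between a veto player and a non-permanent player is 421/4 (about 105) for Shapley and 212/21 (about 10) for Banzhaf, in line with the "some 100 (10) times" quoted at the start of the talk.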

28
Applications
The EU

30
The power indices restricted to the 56 genes that are common to all indices among the first 100 genes of each ranking.
Data: 40 tumor samples vs. 22 normal samples, 2000 genes.

31
Data from a colorectal cancer study: 10 healthy and 12 tumoral tissues.
An extended microarray game also considers how much the genes are abnormally expressed with respect to a normality interval. Given the normality interval $[m_i, M_i]$ of gene $i$ and its standard deviation $s_i$, let $N_i^k = [m_i - k s_i, M_i + k s_i]$; assign $k$ to cell $ij$ of the matrix if the value of gene $i$ in patient $j$ falls in $N_i^k \setminus N_i^{k-1}$.
A weighted Shapley value is used to rank genes. This allows the genes to be differentiated better.
Taking the first 100 genes in the ranking, the game is formed as an average of weighted majority games. Then we calculate the Shapley and Banzhaf indices and the index of the new family with $a = 2$.
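A minimal Python sketch of the graded discretization described above. It assumes that the intervals are $N_i^k = [m_i - k s_i, M_i + k s_i]$ (so that $N_i^0$ is the normality interval itself), that values inside the normality interval get 0, and that $s_i > 0$. How $m_i$, $M_i$ and $s_i$ are estimated is not specified on the slide, so they are plain inputs here; the function names are illustrative.

```python
import math

def abnormality_level(x, m, M, s):
    """Smallest k >= 0 such that x lies in [m - k*s, M + k*s]; requires s > 0."""
    if m <= x <= M:
        return 0
    distance = (m - x) if x < m else (x - M)
    return math.ceil(distance / s)

def extended_matrix(values, m, M, s):
    """Apply abnormality_level cell by cell; row i uses m[i], M[i], s[i]."""
    return [[abnormality_level(x, m[i], M[i], s[i]) for x in row]
            for i, row in enumerate(values)]

# Toy example with one gene: normality interval [1.0, 2.0], standard deviation 0.5.
levels = extended_matrix([[1.5, 2.25, 0.0, 3.5]], m=[1.0], M=[2.0], s=[0.5])
```

The Boolean microarray matrix is recovered by replacing every positive level with 1.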

32
Gene expression analysis was performed using Human Genome U133A-Plus 2.0 GeneChip arrays (Affymetrix, Inc., Calif.).
The following 7 genes are quoted in the medical literature as having great importance in the onset of the disease: CYR61, UCHL1, FOS, FOSB, EGR1, VIP, KRT24. One of them was ranked around the 100th position by the weighted Shapley value. All the others are among the first 50 and played the subsequent game.

Gene     S     B     a = 2
FOSB     2     1     1
CYR61    1     2     2
FOS      3     3     3
VIP      5     5     6
EGR1     10    9     9
KRT24    4535

33
References
R. Lucchetti, P. Radrizzani, E. Munarini, A new family of regular semivalues and applications, Int. J. of Game Theory, DOI 10.1007/s00182-010-0263-5.
R. Lucchetti, S. Moretti, F. Patrone, P. Radrizzani, The Shapley and Banzhaf indices in microarray games, Computers and Operations Research 37 (2010), p. 1406-1412.
R. Lucchetti, P. Radrizzani, Microarray Data Analysis Via Weighted Indices and Weighted Majority Games, Computational Intelligence Methods for Bioinformatics and Biostatistics II, Masulli, Peterson, Tagliaferri (Eds.), Lecture Notes in Computer Science, Springer (2010), p. 179-190.
S. Moretti, F. Patrone, S. Bonassi, The class of microarray games and the relevance index for genes, TOP 15 (2007), p. 256-280.
D. Albino, P. Scaruffi, S. Moretti, S. Coco, C. Di Cristofano, A. Cavazzana, M. Truini, S. Stigliani, S. Bonassi, G.P. Tonini, Stroma poor and stroma rich gene signatures show a low intratumoral gene expression heterogeneity in Neuroblastic tumors, Cancer 113 (2008), p. 1412-1422.

