DNA and circular splicing
Paola Bonizzoni, Clelia De Felice, Giancarlo Mauri, Rosalba Zizza
Dipartimento di Informatica Sistemistica e Comunicazioni, Univ. di Milano - Bicocca, ITALY
Dipartimento di Informatica e Applicazioni, Univ. di Salerno, ITALY
Outline: Circular splicing, definitions; State of the art; Our contributions; Works in progress

Before Adleman's experiment (1994)...
Tom Head, 1987 (Bull. of Math. Biology): "Formal Language Theory and DNA: an analysis of the generative capacity of specific recombinant behaviors"
SPLICING: unconventional models of computation

SPLICING: LINEAR and CIRCULAR

CIRCULAR SPLICING
[Figure: restriction enzyme 1, restriction enzyme 2, ligase enzymes]

Circular languages: definitions and examples
Conjugacy relation on A*: for w, w' ∈ A*, w ~ w' ⇔ w = xy, w' = yx
Example: abaa, baaa, aaab, aaba are conjugate
A~ = A*/~ = set of all circular words ~w = [w]~, w ∈ A*
Circular language: C ⊆ A~, a set of equivalence classes
For L ⊆ A*: Cir(L) = { ~w | w ∈ L } (circularization of L)
For C ⊆ A~: Lin(C) = { w ∈ A* | ~w ∈ C } (full linearization of C)
A linearization of C is any L ⊆ A* with Cir(L) = C
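The definitions above can be tried out on finite languages. The following is a minimal sketch, assuming a circular word is stored as its lexicographically least rotation; the function names (conjugates, canon, cir, lin) are illustrative and not taken from the talk.

```python
def conjugates(w):
    """All words yx with w = xy, i.e. the conjugacy class [w]~."""
    # The empty word has exactly one conjugate: itself.
    return {w[i:] + w[:i] for i in range(len(w))} or {""}

def canon(w):
    """Representative chosen for the circular word ~w (least rotation)."""
    return min(conjugates(w))

def cir(L):
    """Circularization Cir(L) of a finite linear language L."""
    return {canon(w) for w in L}

def lin(C):
    """Full linearization Lin(C) of a finite circular language C."""
    out = set()
    for c in C:
        out |= conjugates(c)
    return out

print(sorted(conjugates("abaa")))
print(sorted(lin(cir({"ab"}))))
```

Any subset L of lin(C) that still hits every class satisfies cir(L) == C, which is what the slide calls a linearization of C.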

Definition: FA~ = { C ⊆ A~ | ∃ L ⊆ A*, Cir(L) = C, L ∈ FA }, for FA a family in the Chomsky hierarchy
Theorem [Head, Paun, Pixton]: C ∈ Reg~ ⇔ Lin(C) ∈ Reg

Paun's definition
Circular splicing system SC_PA = (A, I, R), with A a finite alphabet, I ⊆ A~ the initial language, R ⊆ A*|A*$A*|A* the rules
For r = u1|u2$u3|u4 ∈ R and ~hu1u2, ~ku3u4 ∈ A~: (~hu1u2, ~ku3u4) ⊢ ~u2hu1u4ku3
Definition: a circular splicing language C(SC_PA) (i.e. a circular language generated by a splicing system SC_PA) is the smallest circular language containing I and closed under the application of the rules in R
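A single application of a rule in Paun's definition can be sketched as follows. This is an illustrative reading of the slide, assuming circular words are represented by their least rotation and a rule u1|u2$u3|u4 is passed as a 4-tuple of strings; paun_splice is a hypothetical name.

```python
def rotations(w):
    """All linear representatives of the circular word ~w."""
    return {w[i:] + w[:i] for i in range(len(w))} or {""}

def canon(w):
    """Least rotation, used as the stored form of ~w."""
    return min(rotations(w))

def paun_splice(w1, w2, rule):
    """Circular words obtained from ~w1 and ~w2 by one rule u1|u2$u3|u4."""
    u1, u2, u3, u4 = rule
    out = set()
    for r1 in rotations(w1):
        if not r1.endswith(u1 + u2):      # need a representative h u1 u2
            continue
        h = r1[:len(r1) - len(u1 + u2)]
        for r2 in rotations(w2):
            if not r2.endswith(u3 + u4):  # need a representative k u3 u4
                continue
            k = r2[:len(r2) - len(u3 + u4)]
            out.add(canon(u2 + h + u1 + u4 + k + u3))
    return out

print(paun_splice("ab", "ab", ("a", "b", "b", "a")))
```

With the rule a|b$b|a, splicing ~ab with itself takes h = k = 1 (the empty word) and yields ~baab, stored here as aabb.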

Other definitions of splicing systems (A = finite alphabet, I ⊆ A~ initial language)
Head's definition: SC_H = (A, I, T), T ⊆ A* × A* × A* triples
For (p, x, q), (u, x, v) ∈ T and ~hpxq, ~kuxv ∈ A~: (~hpxq, ~kuxv) ⊢ ~hpxvkuxq
Pixton's definition: SC_PI = (A, I, R), R ⊆ A* × A* × A* rules
For (α, α'; β), (α', α; β') ∈ R and ~hα, ~h'α' ∈ A~: (~hα, ~h'α') ⊢ ~hβh'β'

Problem: characterize FA~ ∩ C(Fin, Fin)
C(Fin, Fin) = class of circular languages C = C(SC_PA) generated by SC_PA with I and R both finite sets
Theorem [Paun96]: for F ∈ {Reg~, CF~, RE~}, R + add. hyp. (symmetry, reflexivity, self-splicing): C(F, Fin) ⊆ F
Theorem [Pixton95-96]: for R Fin + add. hyp. (symmetry, reflexivity): C(Reg~, Fin) ⊆ Reg~; for F ∈ {Reg~, CF~, RE~}: C(F, Reg) ⊆ F

Circular finite splicing languages and Chomsky hierarchy
[Diagram: nested classes Reg~, CF~, CS~, with the following examples]
~((aa)*b)
~(aa)*: I = ~aa ∪ ~1, R = {aa | 1 $ 1 | aa}
~(a^n b^n): I = ~ab ∪ ~1, R = {a | b $ b | a}
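The two finite systems on this slide can be checked experimentally up to a length bound. The sketch below assumes Paun's splicing step as stated earlier in the talk and stores each circular word as its least rotation; generate and max_len are hypothetical names. Because a spliced word is as long as its two inputs together, discarding words longer than max_len loses no shorter word.

```python
def rotations(w):
    return {w[i:] + w[:i] for i in range(len(w))} or {""}

def canon(w):
    return min(rotations(w))

def paun_splice(w1, w2, rule):
    """One application of u1|u2$u3|u4 to the circular words ~w1, ~w2."""
    u1, u2, u3, u4 = rule
    out = set()
    for r1 in rotations(w1):
        if r1.endswith(u1 + u2):
            h = r1[:len(r1) - len(u1 + u2)]
            for r2 in rotations(w2):
                if r2.endswith(u3 + u4):
                    k = r2[:len(r2) - len(u3 + u4)]
                    out.add(canon(u2 + h + u1 + u4 + k + u3))
    return out

def generate(I, R, max_len):
    """Words of length <= max_len in the closure of I under the rules R."""
    lang = {canon(w) for w in I}
    changed = True
    while changed:
        changed = False
        for w1 in list(lang):
            for w2 in list(lang):
                for rule in R:
                    for w in paun_splice(w1, w2, rule):
                        if len(w) <= max_len and w not in lang:
                            lang.add(w)
                            changed = True
    return lang

print(sorted(generate({"aa", ""}, [("aa", "", "", "aa")], 8), key=len))
print(sorted(generate({"ab", ""}, [("a", "b", "b", "a")], 8), key=len))
```

A bounded run like this only supports the claim for short words; it is not a proof that the systems generate ~(aa)* and ~(a^n b^n).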

Our contributions
[Diagram: Reg~ and, inside it, Reg~ ∩ C(Fin, Fin)]
Fingerprint closed star languages: X*, X regular group code; Cir(X*), X finite
Cyclic languages, weak cyclic languages, other examples
~(a*ba*)*

Our contributions (continued)
Comparing the three definitions of splicing systems: C(SC_H), C(SC_PA), C(SC_PI) = ... ?
Examples: ~(a*ba*)*, ~((aa)*b)

Star languages
Definition: L ⊆ A* is a star language if L is regular, closed under the conjugacy relation, and L = X* with X regular
Examples: (b*(ab*a)*)* = X* with X = b ∪ ab*a; (a*ba*)* = X* with X = a*ba*
Proposition: for SC_PA = (A, I, R), I ⊆ Cir(X*) ⇒ C(SC_PA) ⊆ Cir(X*)
Consistency easily follows!
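Closure under conjugacy, the key condition in the definition of a star language, can be tested on all short words. This is a rough sketch using Python's re module over the alphabet {a, b}; the helper names and the length bound are assumptions, and passing the test up to a bound does not prove closure in general.

```python
import re
from itertools import product

def language(pattern, alphabet="ab", max_len=8):
    """All words of length <= max_len over alphabet fully matching pattern."""
    rx = re.compile(pattern)
    return {
        "".join(t)
        for n in range(max_len + 1)
        for t in product(alphabet, repeat=n)
        if rx.fullmatch("".join(t))
    }

def closed_under_conjugacy(pattern, alphabet="ab", max_len=8):
    """True if every rotation of every short matching word also matches."""
    words = language(pattern, alphabet, max_len)
    return all(w[i:] + w[:i] in words for w in words for i in range(len(w)))

print(closed_under_conjugacy(r"(b|ab*a)*"))
print(closed_under_conjugacy(r"(a*ba*)*"))
print(closed_under_conjugacy(r"(aa)*b"))
```

The last pattern fails the test (aab matches but its rotation aba does not), which is why (aa)*b is only a linearization of ~((aa)*b) and not its full linearization.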

Fingerprint closed languages
Definition: L is fingerprint closed if, for any cycle c, L contains the fingerprints of c
Fingerprint of a cycle c: a power of the cycle, where the internal cycles are crossed a finite number of times
Example: c = (x(y(zz)^j y)^i x)^{n_c}
[Figure: automaton with a cycle c on state q0 and nested cycles labelled x, y, z]

Theorem: fingerprint closed star languages are in C(Fin, Fin)
Sketch: take SC_PA = (A, I, R) with
I = Cir({successful paths containing fingerprints of cycles})
R = { 1 | 1 $ 1 | f : f a fingerprint of cycle c, for any cycle c }
Star languages that are fingerprint closed: X*, X regular group code; Cir(X*), X finite (for example X = b ∪ ab*a, for example X = A^d)
Star languages that are not fingerprint closed: (a*ba*)*, but not generated!

Cyclic languages: languages in C(Fin, Fin) that are not star languages (new!)
Definition: Cyclic(z) = ⋃ ~(z*p), the union taken over p ∈ Pref(Lin(~z))
Example: z = abc ∈ A*
Lin(~z) = Lin(~abc) = {abc, bca, cab}
Pref(Lin(~z)) = Pref(Lin(~abc)) = Pref({abc, bca, cab}) = {a, ab, b, bc, c, ca}
Cyclic(abc) = ~(abc)*a ∪ ~(abc)*ab ∪ ~(abc)*b ∪ ~(abc)*bc ∪ ~(abc)*c ∪ ~(abc)*ca
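The sets in this example can be recomputed mechanically. In the sketch below, Pref is taken to mean the nonempty proper prefixes, an assumption read off the slide's own example; cyclic enumerates only the powers of z up to a bound max_power, and all function names are illustrative.

```python
def rotations(w):
    """Lin(~w): all linear representatives of the circular word ~w."""
    return {w[i:] + w[:i] for i in range(len(w))} or {""}

def canon(w):
    """Least rotation, used as the stored form of ~w."""
    return min(rotations(w))

def proper_prefixes(words):
    """Nonempty proper prefixes of the given words (assumed meaning of Pref)."""
    return {w[:i] for w in words for i in range(1, len(w))}

def cyclic(z, max_power):
    """Finite part of Cyclic(z): the words ~(z^i p) for 0 <= i <= max_power."""
    prefs = proper_prefixes(rotations(z))
    return {canon(z * i + p) for i in range(max_power + 1) for p in prefs}

print(sorted(rotations("abc")))
print(sorted(proper_prefixes(rotations("abc"))))
print(sorted(cyclic("abc", 1), key=len))
```

Note that ~ca is stored as ac, since the representative is the least rotation.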

Theorem: for any z with |z| > 2, z an unbordered word (i.e. z ∉ uA* ∩ A*u for every u with 0 < |u| < |z|), Cyclic(z) ∈ C(Fin, Fin)
The proof is quite technical...
Example (continued): Cyclic(abc) is generated by SC_PA = (A, I, R), where, with z = abc, I and R are defined as follows
I = { ~((abc)^i p) | 0 ≤ i ≤ 3, p ∈ Pref(Lin(~abc)) }
R = { z ab | z $ z | ca z,  z ab | z $ z b | c z,  z ca | z $ z | bc z,  z a | z $ z | b z,  z b | z $ z | c z,  z c | z $ z | a z }
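The hypothesis of the theorem is easy to test for a given word. A minimal sketch, assuming the usual notion of a border (a nonempty proper prefix of z that is also a suffix of z); the function names are illustrative.

```python
def borders(z):
    """Nonempty proper prefixes of z that are also suffixes of z."""
    return [z[:i] for i in range(1, len(z)) if z.endswith(z[:i])]

def is_unbordered(z):
    """True if z has no border, i.e. z is not in uA* ∩ A*u for such u."""
    return not borders(z)

print(is_unbordered("abc"))
print(borders("abca"))
```

The word abc satisfies the hypothesis, while abca has the border a, so the theorem as stated does not cover Cyclic(abca).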

Other circular regular splicing languages
Cyclic(abc) = ~(abc)*a ∪ ~(abc)*ab ∪ ~(abc)*b ∪ ~(abc)*bc ∪ ~(abc)*c ∪ ~(abc)*ca
Weak cyclic languages: Cyclic(abc) ∪ ~(abc)*ac
Cyclic(abca)... bordered word...

Works in progress
Characterize Reg~ ∩ C(Fin, Fin)
Characterize FA~ ∩ C(Fin, Fin)
C(SC_PI) = Star languages

Additional hypotheses, for r = u1|u2$u3|u4 in R
Reflexive: r' = u1|u2$u1|u2 is in R
Symmetric: r'' = u3|u4$u1|u2 is in R
Self-splicing: from ~xu1u2yu3u4, with r, r'' as above, generates ~u4xu1, ~u2yu3
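The self-splicing operation can be sketched in the same style as the other steps. This is an illustrative reading of the slide, assuming a circular word is stored as its least rotation and a rule u1|u2$u3|u4 is a 4-tuple of strings; self_splice is a hypothetical name, and the symmetric rule is not applied separately here.

```python
def rotations(w):
    return {w[i:] + w[:i] for i in range(len(w))} or {""}

def canon(w):
    return min(rotations(w))

def self_splice(w, rule):
    """Pairs (~u4 x u1, ~u2 y u3) obtained from ~w = ~(x u1 u2 y u3 u4)."""
    u1, u2, u3, u4 = rule
    out = set()
    for r in rotations(w):
        if not r.endswith(u3 + u4):
            continue
        body = r[:len(r) - len(u3 + u4)]     # body = x u1 u2 y
        start = body.find(u1 + u2)
        while start != -1:                   # every occurrence of u1 u2
            x = body[:start]
            y = body[start + len(u1 + u2):]
            out.add((canon(u4 + x + u1), canon(u2 + y + u3)))
            start = body.find(u1 + u2, start + 1)
    return out

print(self_splice("aabb", ("a", "b", "b", "a")))
```

With the rule a|b$b|a, the circular word ~aabb is read as ~abba with x = y = 1 and splits into ~aa and ~bb; the two output lengths always add up to the input length.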

DNA6 auditorium Thanks!

