Data Protection and String Search in SDDS-2005
Riad Mokadem
http://ceria.dauphine.fr/Riad/PagePersoRiad.html

SDDS-2005
- Evolution of SDDS-2004
- Existing functions
- Extension of algebraic signatures
- Introduction of pre-computed algebraic signatures

Cumulative Algebraic Signature
- Data are encoded in the servers
- Protection against incidental viewing in the servers
- Fast string manipulation
- Operations work directly on the encoded form in the servers
- The fastest technique for character manipulation
- Prefix search
- Corruption protection (future)
- Data compression (future)

Applications
- SDDS servers
- Data Grid
- P2P systems
- XML data
- Search engines (MSN?)

Prototype SDDS-2005
[Figure: architecture diagram. Labels: Network, Server, Buckets in RAM, Client, SDDS-CP, Applications, "... of index names", RP*S]

Problem
- Data are viewable in clear by an unauthorized administrator
- Storage dump…
- No data protection
Encoding/decoding of data by the client
- Data are encoded in the servers
- Different search possibilities

Data structure
- Encoding/decoding concerns non-key data.
- Encoding/decoding is transparent for the servers.
- Data are currently limited to 256 B.

String search
- Algorithms: Boyer-Moore, Karp-Rabin…
- SDDS-2005: the data to search is not sent; its signature is sent instead
- Better confidentiality
- Encoding: pre-computed algebraic signatures

Cumulative signature
- Galois field $GF(2^f)$, $f \gg 1$; symbols of size f = 8, 16…
- $\alpha$: primitive element
- Algebraic signature: $\mathrm{sig}_\alpha(p_0, p_1, \dots, p_{n-1}) = \sum_{i=0}^{n-1} p_i \alpha^i$
- Pre-computed signature: $p'_i = \mathrm{antilog}(\log p_i + i)$, that is, $p'_i = p_i \alpha^i$
- Encoding of $p_i$: $p''_i = p''_{i-1} \oplus p'_i$

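The definitions on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: f = 8, the generator polynomial 0x11D with alpha = 2 as primitive element, zero-based positions, and a zero symbol mapped to zero because its logarithm is undefined. The names `mul_alpha_pow`, `encode`, `decode` and `signature` are hypothetical and are not taken from SDDS-2005.

```python
F = 8
ORDER = (1 << F) - 1          # 255 non-zero elements in GF(2^8)
POLY = 0x11D                  # assumption: x^8 + x^4 + x^3 + x^2 + 1, alpha = 2

# Build the log / antilog tables once (the "pre-computation").
ANTILOG = [0] * ORDER
LOG = [0] * (ORDER + 1)       # LOG[0] is unused: log(0) is undefined
x = 1
for e in range(ORDER):
    ANTILOG[e] = x
    LOG[x] = e
    x <<= 1                   # multiply by alpha
    if x > ORDER:
        x ^= POLY             # reduce modulo the generator polynomial


def mul_alpha_pow(p, i):
    """Return p * alpha**i, computed as antilog(log p + i)."""
    if p == 0:
        return 0
    return ANTILOG[(LOG[p] + i) % ORDER]


def encode(symbols):
    """Cumulative signatures: p''_i = p''_(i-1) XOR p'_i, with p'_i = p_i * alpha**i."""
    out, acc = [], 0
    for i, p in enumerate(symbols):
        acc ^= mul_alpha_pow(p, i)
        out.append(acc)
    return out


def decode(cumulative):
    """Invert encode(): recover p'_i by XOR, then divide by alpha**i."""
    out, prev = [], 0
    for i, c in enumerate(cumulative):
        out.append(mul_alpha_pow(c ^ prev, -i))
        prev = c
    return out


def signature(symbols):
    """Algebraic signature: XOR-sum of p_i * alpha**i."""
    acc = 0
    for i, p in enumerate(symbols):
        acc ^= mul_alpha_pow(p, i)
    return acc
```

With this layout the last cumulative symbol of a record equals the algebraic signature of the whole record, and decoding needs only one XOR and one table lookup per symbol.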
Cumulative signature
- Encoding + sending of the request
- Comparison of signatures + verification of collisions
- Result of search
- Client: decoding
- Pre-computation of signatures: gain in search time

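The search flow can be sketched as follows. The sketch assumes f = 8, generator polynomial 0x11D with alpha = 2, and zero-based positions; it also assumes that the server returns candidate offsets and that collisions are verified on the client after decoding, which the slide does not specify. All function names are hypothetical.

```python
ORDER = 255                   # non-zero elements of GF(2^8)
POLY = 0x11D                  # assumption: generator polynomial, alpha = 2

ANTILOG = [0] * ORDER
LOG = [0] * (ORDER + 1)
x = 1
for e in range(ORDER):
    ANTILOG[e] = x
    LOG[x] = e
    x <<= 1
    if x > ORDER:
        x ^= POLY


def mul_alpha_pow(p, i):
    """p * alpha**i via antilog(log p + i); zero stays zero."""
    return 0 if p == 0 else ANTILOG[(LOG[p] + i) % ORDER]


def encode(symbols):
    """Cumulative signatures stored by the server."""
    out, acc = [], 0
    for i, p in enumerate(symbols):
        acc ^= mul_alpha_pow(p, i)
        out.append(acc)
    return out


def decode(cumulative):
    out, prev = [], 0
    for i, c in enumerate(cumulative):
        out.append(mul_alpha_pow(c ^ prev, -i))
        prev = c
    return out


def signature(symbols):
    return encode(symbols)[-1] if symbols else 0


def server_candidates(cumulative, pattern_sig, m):
    """Server side: works on encoded data and on the pattern signature only.

    cum[k+m-1] XOR cum[k-1] equals alpha**k times the signature of the
    window starting at k, so it is compared with alpha**k * pattern_sig.
    """
    hits = []
    for k in range(len(cumulative) - m + 1):
        before = cumulative[k - 1] if k else 0
        if cumulative[k + m - 1] ^ before == mul_alpha_pow(pattern_sig, k):
            hits.append(k)
    return hits


def client_search(cumulative, pattern):
    """Client side: send signature + size, then verify candidates after decoding."""
    hits = server_candidates(cumulative, signature(pattern), len(pattern))
    clear = bytes(decode(cumulative))
    return [k for k in hits if clear[k:k + len(pattern)] == pattern]
```

A true occurrence always appears among the candidates; with one-byte symbols a false candidate is possible, which is why the verification step matters.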
Encoding / Decoding Performance

Size of record   Encoding time   Decoding time   Insertion time
100 B            0.045 ms        0.042 ms        0.3 ms

- Fast encoding and decoding times.
- Pre-computed signatures reduce the search time.

SDDS-2005 string matching functions
- Prefix search, string search: the signature and the size of the data are sent
- Longest prefix match, longest common string: the data are sent for comparison

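For prefix search, sending the signature and the size is enough because the cumulative symbol at position size - 1 is the signature of the prefix of that size. A minimal sketch, assuming f = 8, generator polynomial 0x11D with alpha = 2, zero-based positions and a bucket modelled as a dictionary from key to encoded non-key data (all names hypothetical):

```python
ORDER = 255                   # non-zero elements of GF(2^8)
POLY = 0x11D                  # assumption: generator polynomial, alpha = 2

ANTILOG = [0] * ORDER
LOG = [0] * (ORDER + 1)
x = 1
for e in range(ORDER):
    ANTILOG[e] = x
    LOG[x] = e
    x <<= 1
    if x > ORDER:
        x ^= POLY


def mul_alpha_pow(p, i):
    """p * alpha**i via antilog(log p + i); zero stays zero."""
    return 0 if p == 0 else ANTILOG[(LOG[p] + i) % ORDER]


def encode(symbols):
    """Cumulative signatures of a record's non-key data."""
    out, acc = [], 0
    for i, p in enumerate(symbols):
        acc ^= mul_alpha_pow(p, i)
        out.append(acc)
    return out


def signature(symbols):
    return encode(symbols)[-1] if symbols else 0


def prefix_search(bucket, prefix_sig, size):
    """Return keys whose encoded data match the prefix signature.

    Assumes size >= 1. Matches are candidates only: a collision check
    on the decoded data is still needed.
    """
    return [key for key, cum in bucket.items()
            if len(cum) >= size and cum[size - 1] == prefix_sig]
```

For example, the client would call `prefix_search(bucket, signature(b"alg"), 3)` without the server ever seeing the clear text `alg`.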
Cumulative signature: search times

Parameters: size of inserted data 250, size of last record 25, size of data to search 10, offset of data in last record 5.

Bucket size   Algebraic signature   Karp-Rabin   Cumulative signature
100           205                   151          147
200           368                   275          268
500           1123                  725          702
1000          2254                  1580         1526

Notes:
1) The solution of sending these messages by UDP, made by ..
2) Prepare a buffer. Unify the deletion and insertion messages of the same record …

Performance (cumulative signature)

String match search
Columns: record position, size of inserted data, size of data to search, offset of string, time of search (ms)
Values: 1, 100, 13, 80, 0.65, 250, 5, 460

Prefix search
Columns: record position, size of inserted data, size of prefix, time of search (ms)
Values: 1, 100, 20, 0.369, 250, 37

Longest prefix match
Columns: record position, size of inserted data, size of last record, size of prefix to search, offset of prefix in record, time of search (ms)
Values: 1, 50, 25, 0.48, 99/100, 250, 453
Search time in 2 servers + comparison = search time in only one server

Longest common string
Columns: record position, size of inserted data, size of last record, size of string to search, offset of string, time of search (ms)
Values: 1, 22, 10, 0.658, 100, 200, 35, 20, 4, 672, 15, 625, 250, 45, 8, 407

Cumulative signature
- New search functions
- Reduction of search time compared with existing algorithms: 30% versus algebraic signatures, 5% versus Karp-Rabin, for Size(Data) > 32 B…
- Non-encoded data

SDDS-2005: cumulative signatures
Operational functions (2005):
- Prefix search
- String matching
- Longest prefix match
- Longest common string

Cumulative signatures
Theoretical work (2005):
- Comparison with the Karp-Rabin algorithm.
Remaining work:
- Data > 256 B
- Performance measurements
- Improvement of signature calculation time (Horner scheme, Broder table…)
- Data compression (prefix, suffix, full)

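As an illustration of the Horner scheme mentioned under remaining work, the signature sum can be evaluated from the last symbol to the first, so that each step is one multiplication by alpha (a shift and a conditional XOR) instead of a log/antilog lookup. The sketch assumes f = 8, generator polynomial 0x11D with alpha = 2 and zero-based positions; it is not the SDDS-2005 implementation, and the function names are hypothetical.

```python
POLY = 0x11D                  # assumption: x^8 + x^4 + x^3 + x^2 + 1, alpha = 2


def mul_alpha(x):
    """Multiply a GF(2^8) element by alpha: shift left, then reduce."""
    x <<= 1
    if x & 0x100:
        x ^= POLY
    return x


def signature_horner(symbols):
    """sig = p_0 + alpha * (p_1 + alpha * (p_2 + ...)), addition being XOR."""
    acc = 0
    for p in reversed(symbols):
        acc = mul_alpha(acc) ^ p
    return acc


def signature_naive(symbols):
    """Reference version: XOR-sum of p_i * alpha**i, term by term."""
    acc = 0
    for i, p in enumerate(symbols):
        term = p
        for _ in range(i):
            term = mul_alpha(term)
        acc ^= term
    return acc
```

Both functions return the same value; the Horner form does a constant amount of work per symbol.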
The end
Thank you
Riad Mokadem