The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) gLite AMGA Riccardo Bruno


1 www.epikh.eu
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How)
gLite AMGA
Riccardo Bruno (riccardo.bruno@ct.infn.it), INFN Dept. Catania
Joint EPIKH/EUMEDGRID Support event in Cairo, Africa 4 – App Porting Tutorial. Cairo, 25.10.2010-03.11.2010

2 Metadata
Metadata is data about other data.
AMGA (ARDA Metadata Grid Application) is the 'official' Grid metadata service in gLite (gLite v3.1).
Since 'data' in gLite means files, AMGA was originally designed to manage metadata on Grid files, but it is not limited to that.
Example
– Movie trailer files are stored on the Grid
– Each movie file has its own associated metadata:
    – Title
    – Duration
    – Genre (Action, Animation, Comic, Drama etc.)
    – Cast (list of actors)
– Users can query the metadata in order to get back the movie files:
    – Get trailer movie files having Duration greater than 10 minutes
    – Get trailer movie files having 'Nicole Kidman' in the Cast

3 Simplest metadata scenario
[Diagram labels: Some SEs and an LFC on the Grid; AMGA Server; QUERY: All trailers having 'Animation' as Genre; List of LFNs; Selected Movie Files]

4 LFC and AMGA
By design there exists a close relationship between LFC and AMGA servers, used to associate metadata with files.
Metadata can then be organized hierarchically, like a file system.
The same tree is shown on the LFC side and on the AMGA side:
…/trailers/
    moviefile_1.avi
    moviefile_m.avi
    italian/
        ita_movie_1.avi
        …
    spanish/
        es_movie_1.avi
        …
List of attributes:
    Name: Madagascar
    Genre: Animation
    Duration: 90
    Cast: ….

5 Other metadata scenario
[Diagram labels: User; Jobs; WMS; CE; WNs; AMGA; QUERY: Job Ids related to my 'Done' jobs; List of JobIds]

6 AMGA – Metadata Terminology
Entries – list of entities having metadata associated
Attribute – a (key name, key type) pair
Schema – set of attributes
Collection – a set of entries associated with a schema
Metadata – list of attributes (including their values) associated with entries

Entries  | Attribute 1           | Attribute 2           | … | Attribute n
Entry 01 | E01's Attrib. 1 value | E01's Attrib. 2 value | … | E01's Attrib. n value
Entry 02 | E02's Attrib. 2 value | … | E02's Attrib. n value
…        | …                     | …                     | … | …

Attribute types: Integer, Char, Date, …

FS analogy:
collection_1/
    entry_1
    entry_2
    …
collection_2/
    entry_1
    entry_2
    …
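The terminology above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the Collection class and its methods are hypothetical names, not the AMGA API, and the entry values are the trailer examples used elsewhere in this presentation.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    """Hypothetical in-memory stand-in for an AMGA collection (not the AMGA API)."""
    path: str
    schema: dict = field(default_factory=dict)   # attribute name -> Python type
    entries: dict = field(default_factory=dict)  # entry name -> {attribute: value}

    def add_attribute(self, name, py_type):
        # A schema is just the set of (name, type) pairs of the collection.
        self.schema[name] = py_type

    def add_entry(self, entry, **values):
        # Metadata values must belong to the schema and match its types.
        for name, value in values.items():
            if name not in self.schema:
                raise KeyError(f"{name} is not in the schema of {self.path}")
            if not isinstance(value, self.schema[name]):
                raise TypeError(f"{name} expects {self.schema[name].__name__}")
        self.entries[entry] = dict(values)

    def find(self, predicate):
        # Return the names of the entries whose metadata satisfy the predicate.
        return [entry for entry, metadata in self.entries.items() if predicate(metadata)]


trailers = Collection("/gilda/demo/trailers")
for name, py_type in [("Title", str), ("Duration", int), ("Genre", str)]:
    trailers.add_attribute(name, py_type)
trailers.add_entry("madagascar.avi", Title="madagascar", Duration=15, Genre="animation")
trailers.add_entry("moulinrouge.avi", Title="moulin rouge!", Duration=12,
                   Genre="Drama;Musical;Romance")

animation = trailers.find(lambda metadata: metadata["Genre"] == "animation")
```

The point of the sketch is the separation of roles: the schema belongs to the collection, while the values belong to each entry.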

7 Metadata Example
AMGA collection: /gilda/demo/trailers/
Collection attributes:
>> Title
>> varchar
>> Duration
>> int
>> Genre
>> varchar
>> Cast
>> varchar
Collection entries:
/gilda/demo/trailers/
    madagascar.avi
    moulinrouge.avi
    …
Attribute values:
>> madagascar.avi
>> madagascar
>> 15
>> animation
>> Ben Stiller;Chris Rock;David Schwimmer;Jada Pinkett
…
RDBMS View
! Schemas/Attributes may be changed ANYTIME
! It is possible to define: SEQUENCES, INDEXES and CONSTRAINTS

8 Sub-Collections
AMGA collections may contain sub-collections (directory analogy in a file system).
AMGA sub-collections may or may not inherit the parent attributes.
/gilda/demo/trailers/
    madagascar.avi
    moulinrouge.avi
/gilda/demo/trailers/italian
    madagascar_ita.avi
    moulinrouge_ita.avi
    …
/gilda/demo/trailers/user_remarks
    remark_0001
    remark_0002
    …
AMGA trailers' sub-collections:
italian:
>> Title
>> varchar
>> Duration
>> int
>> Genre
>> varchar
>> Cast
>> varchar
>> DubbedCast
>> varchar
user_remarks:
>> Title
>> varchar
>> User
>> varchar
>> Remark
>> varchar

9 AMGA as DB solution
Although AMGA has been designed to serve as a Grid file metadata service, it can be used as a DB:
– Collection → DB Table
– Schema → Table Schema
– Attribute → Schema Column
– Entry → Table row/record
Tables may be organized in a single directory (RDBM) or hierarchically (OODBM).

Collection/Table:
Entry Name / RowId | Attr_1/Col_1 | Attr_2/Col_2 | … | Attr_n/Col_n
GUID_1             | RecVal(1,1)  | RecVal(1,2)  | … | RecVal(1,n)
GUID_2             | RecVal(2,1)  | RecVal(2,2)  | … | RecVal(2,n)
…                  | …            | …            | … | …
GUID_m             | RecVal(m,1)  | RecVal(m,2)  | … | RecVal(m,n)

10 Attribute Data Types
Each AMGA type is followed by its equivalents; column order: PostgreSQL, MySQL, Oracle, SQLite, Python.
int: integer, int, number(38), int
float: double precision, float
varchar(n): character varying(n), varchar2(n), varchar(n), string
timestamp: timestamp w/o TZ, datetime, timestamp(6), unsupported, time(unsupported)
text: long, text, string
numeric(p,s): numeric(p,s), float

Using the above datatypes ensures that your metadata can be easily moved to all supported AMGA back-ends (DB migration).
If you do not care about DB portability, you can use, in principle, any datatype supported by the back-end, even the more specific ones (e.g. the PostgreSQL network address or geometric types).
Oracle, MySQL and PostgreSQL binary types (BLOBs) are excluded.
The tested solution is to use uuencode / uudecode (sharutils) to convert binaries into Base64 text format.
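As a rough illustration of the binary-to-text workaround, the sketch below uses Python's standard base64 module rather than uuencode / uudecode. The helper names and the sample bytes are hypothetical. Note that the result is the bare Base64 payload, whereas uuencode output also carries header and trailer lines, so the two are not interchangeable as-is.

```python
import base64


def binary_to_text(data: bytes) -> str:
    """Encode binary data as ASCII text that fits a text/varchar attribute."""
    return base64.b64encode(data).decode("ascii")


def text_to_binary(text: str) -> bytes:
    """Decode the stored text back into the original bytes."""
    return base64.b64decode(text.encode("ascii"))


# Hypothetical binary payload (e.g. a few bytes of a thumbnail image).
thumbnail = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])
encoded = binary_to_text(thumbnail)
restored = text_to_binary(encoded)
```

Base64 grows the data by roughly one third, which matters when choosing between varchar(n) and text for the attribute.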

11 Interacting with AMGA
Users may interact with AMGA through two different front ends:
– Streaming front end (TCP) / amgad
    – CLI interactive session: mdclient, mdjavaclient
    – CLI single command: mdcli
    – APIs (C++, Java, Python, Perl, PHP)
– SOAP front end (WSDL) / mdsoapserver

12 mdcli/mdclient
A configuration template file is available at
– /opt/glite/etc/mdclient.config
The template can be copied into
– $PWD/mdclient.config
– $HOME/.mdclient.config
mdclient starts an interactive session
– Query>
mdcli executes a single AMGA command
– It saves a session file storing the current session status in /tmp (e.g. md_18968_amga.eela.ufrj.br_8822_0)

[brunor@genius ~]$ mdcli 'whoami'
prod.vo.eu-eela.eu
[brunor@genius ~]$ mdclient
Connecting to amga.eela.ufrj.br:8822...
ARDA Metadata Server 1.9.0
Query>

13 mdcli/mdclient help
It is possible to get help on mdcli/mdclient commands by typing
– help [topic]
Possible topics
– help metadata metadata-optional directory replication constraints entry group acl index schema sequence user view site replicas ticket capabilities admin commands

[brunor@glite-tutor ~]$ mdclient
Connecting to amga.ct.infn.it:8822...
ARDA Metadata Server
Query> help
>> help [topic]
>> Displays help on a command or a topic.
>> Valid topics are: help metadata metadata-optional directory replication constraints entry group acl index schema sequence user view site replicas ticket capabilities admin commands
Query> help metadata
>> setattr entry attribute value [attribute value]...
>> Sets given attributes to specified values for all entries matching entry.
>> addattr dir attribute type
>> Adds a new attribute to a directory
…
Query>

14 Simple metadata commands
Create a collection
– createdir / [inherits]
Associate a schema to the collection
– addattr / [ ] …
List attributes
– listattr /
Remove attributes
– removeattr /
Rename attributes
– renameattr /
Add entries and attribute values
– addentry / [ ] …
Set an attribute value
– setattr / [ ] …
List entries
– listentries /
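A client program often needs to assemble such commands as strings before sending them. The sketch below builds only the two commands whose syntax is spelled out by the 'help metadata' output shown earlier (addattr dir attribute type, and setattr entry attribute value [attribute value]...). The function names are hypothetical, and the quoting rule (numbers bare, strings in single quotes) is an assumption that should be checked against the AMGA manual.

```python
def quote(value):
    """Render one attribute value for a command line.

    Assumption: numbers are sent bare and strings in single quotes.
    Embedded single quotes are rejected instead of guessing an escape rule.
    """
    if isinstance(value, (int, float)):
        return str(value)
    text = str(value)
    if "'" in text:
        raise ValueError("single quotes in values are not handled by this sketch")
    return f"'{text}'"


def addattr(directory, attribute, amga_type):
    """Build 'addattr dir attribute type'."""
    return f"addattr {directory} {attribute} {amga_type}"


def setattr_cmd(entry, **values):
    """Build 'setattr entry attribute value [attribute value]...'."""
    if not values:
        raise ValueError("setattr needs at least one attribute/value pair")
    pairs = " ".join(f"{name} {quote(value)}" for name, value in values.items())
    return f"setattr {entry} {pairs}"


schema_cmd = addattr("/gilda/demo/trailers", "Duration", "int")
value_cmd = setattr_cmd("/gilda/demo/trailers/madagascar.avi",
                        Title="madagascar", Duration=15)
```

Each resulting string could then be passed to mdcli as a single argument, as the pilot script later in this presentation does for its queries.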

15 Getting metadata
Three commands: getattr, find and selectattr
– getattr pattern attribute1 attribute2 …
– find pattern 'query'
It is possible to make complex queries through the use of boolean operators, or join queries among different collections.

Query> getattr *.avi Title Duration Genre
>> madagascar.avi
>> madagascar
>> 15
>> animation
>> moulinrouge.avi
>> moulin rouge!
>> 12
>> Drama;Musical;Romance
Query> find *.avi 'Duration > 10'
>> madagascar.avi
>> moulinrouge.avi
Query> find *.avi 'Title=italian:Title'
>> madagascar.avi
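The getattr output above is a flat stream: one line for the entry name, then one line per requested attribute. A minimal sketch of how a script might regroup it into records follows. It assumes that every output line starts with '>>' exactly as in the session shown here, and parse_getattr is a hypothetical helper, not part of any AMGA client.

```python
def parse_getattr(output, attributes):
    """Group flat getattr output into {entry: {attribute: value}}."""
    values = [line[2:].strip() for line in output.splitlines()
              if line.startswith(">>")]
    step = len(attributes) + 1  # entry name plus one line per attribute
    if len(values) % step:
        raise ValueError("output does not divide into whole records")
    records = {}
    for start in range(0, len(values), step):
        entry, *fields = values[start:start + step]
        records[entry] = dict(zip(attributes, fields))
    return records


session_output = """\
>> madagascar.avi
>> madagascar
>> 15
>> animation
>> moulinrouge.avi
>> moulin rouge!
>> 12
>> Drama;Musical;Romance
"""

records = parse_getattr(session_output, ["Title", "Duration", "Genre"])
# Values arrive as text, so numeric attributes need an explicit conversion.
long_trailers = [entry for entry, metadata in records.items()
                 if int(metadata["Duration"]) > 10]
```

Filtering on the client like this only makes sense for small result sets; the find and selectattr commands push the condition to the server instead.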

16 Getting metadata
selectattr returns the attribute values that match a given query
– selectattr … 'query'

Query> selectattr trailers:Title trailers/italian:DubbedCast 'trailers:Title=trailers/italian:Title'
>> madagascar
>> Alessandro Besentini;Francesco Villa;Fabio De Luigi: Melman la giraffa;Michelle Hunziker;Chiara Colizzi;Oreste Baldini;Roberto Draghetti;Massimiliano Alto;Luigi Ferraro;Massimo Bitossi;Elena Magoia;Franco Mannella;Gerolamo Alchieri;Pasquale Anselmo;Roberto Pedicini;Marco Mete;Stefano De Sando;Emanuela Rossi
Query> selectattr trailers:Title trailers:Duration 'like(trailers:Cast,"%Kidman%")'
>> moulin rouge!
>> 12

17 SQL Support
It is possible to issue SQL queries in AMGA.
Recognized SQL statements
– SELECT, INSERT, UPDATE, DELETE (uppercase)
The INSERT statement automatically generates a unique ID as entry name.

Query> SELECT Title FROM trailers WHERE trailers.Duration > 10
>> trailers.Title
>> madagascar
>> moulin rouge!
>> madagascar
Query> SELECT trailers:Title FROM trailers, trailers/italian WHERE trailers:Title=trailers/italian.Title;
>> trailers.Title
>> madagascar
Query>

18 Users and Groups
AMGA maps users to configured AMGA users and groups according to
– LOGIN name
– X509/GridProxy DN
– VOMS Groups and Roles
The main user is: root
Users and groups are shown and managed in a POSIX-like way
– d rwx rwx (user, group), user ownership

Query> ls -l
>> drwxr-x gilda /gilda/demo/trailers
Query> ls -l trailers
>> drwxr-x gilda /gilda/trailers/italian
>> drwxr-x gilda /gilda/demo/trailers/remark
>> -rwxr-x gilda madagascar.avi
>> -rwxr-x gilda moulinrouge.avi
>> -rwxr-x gilda madagascar.avi

19 ACLs
AMGA allows users to define ACLs for
– Collections
– Entries (MySQL5 and PostgreSQL collections created with -acl )
Use acl_show or stat
Since AMGA v2.0, the sudo command allows the root user to become any user

Query> acl_show trailers
>> gilda rwx
>> gilda:users rwx
>> system:anyuser rx
Query> stat madagascar.avi
>> /gilda/demo/trailers/madagascar.avi
>> entry
>> rwx
>> r-x
>> gilda
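To make the acl_show output concrete, here is a small sketch that parses it and answers "may this user do that?". The evaluation rule is an assumption made for illustration only: a right is granted if the user name, any of the user's groups, or the system:anyuser principal lists it. How AMGA actually combines ACLs with owner and group permissions should be checked in the manual. parse_acl and allowed are hypothetical helpers.

```python
def parse_acl(output):
    """Turn acl_show output lines ('>> principal rights') into a dict."""
    acl = {}
    for line in output.splitlines():
        line = line.lstrip(">").strip()
        if not line:
            continue
        principal, rights = line.split()
        acl[principal] = rights
    return acl


def allowed(acl, user, groups, right):
    """Assumed rule: any matching principal that lists the right grants it."""
    principals = [user, *groups, "system:anyuser"]
    return any(right in acl.get(principal, "") for principal in principals)


trailers_acl = parse_acl("""\
>> gilda rwx
>> gilda:users rwx
>> system:anyuser rx
""")

guest_can_read = allowed(trailers_acl, "guest", [], "r")
guest_can_write = allowed(trailers_acl, "guest", [], "w")
member_can_write = allowed(trailers_acl, "guest", ["gilda:users"], "w")
```

With the ACL shown on this slide, an anonymous user can read and traverse the collection, while writing requires being gilda or a member of gilda:users.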

20 AMGA Replication
AMGA provides replication/federation mechanisms.
Motivation
– Scalability: support hundreds/thousands of concurrent users
– Geographical distribution: hide network latency
– Reliability: no single point of failure
– DB-independent replication: heterogeneous DB systems
– Disconnected computing: off-line access (laptops)
Architecture
– Asynchronous replication
– Master-Slave: writes are only allowed on the master
– Application-level replication: replicate metadata with AMGA commands ( dump )
– Partial replication: supports replication of only sub-trees of the metadata hierarchy

21 AMGA Replication types
[Diagram panels: Full Replication; Partial Replication; Federation; Proxy (commands are redirected)]

22 AMGA DB Import
Each AMGA server relies on a dedicated DB backend
– Oracle, MySQL, PostgreSQL, mSQL, other (UnixODBC)
Database import: two possibilities
– Import tables from the DB into an AMGA DB backend
– Import the AMGA DB backend into the DB hosting the tables
Use the import command as root to "mount" your table into the AMGA collection hierarchy

Query> whoami
>> root
Query> createdir world
Query> cd world
Query> import world.City world/City
Query> import world.Country world/Country
Query> import world.CountryLanguage world/CountryLanguage
Query> acl_add /world/ gilda:users rx
Query> acl_show /world
>> root rwx
>> gilda:users rx
>> system:anyuser rx

23 DB Access and Replication
Federation and DB Import: with the Federation and DB Import features it is possible to create huge federated metadata structures.
[Diagram labels: four AMGA masters, each in front of one backend (MySQL DB: Movie Metadata; PostgreSQL DB: User Comments; Oracle DB: Actors; PostgreSQL DB: Storage); one AMGA slave exposing the federated hierarchy]
/
    /movie
        /movie/info
        /movie/title
        /movie/aka_title
    /storage
        /storage/LFN
        /storage/SEs
    /actors
        /actors/name
        /actors/info
    /comments
        /comments/info
        /comments/users

24 Jobs with AMGA
Since AMGA supports Grid proxies, jobs may access any AMGA server ( mdclient.config ).
Normally the job pilot script uses the mdcli client application to get/set metadata.

25 Jobs with AMGA
EXAMPLE
– A grid job that selects movies according to a given actor
– A pilot script queries the AMGA server, taking the actor name as a parameter, and identifies the LFN
– The file pointed to by the LFN is copied to the WN
– In the JDL, a mdclient.config file has to be specified in the InputSandbox

Pilot script:
#!/bin/bash
# amgajobdemo.sh: pilot script; $1 is the actor name to search for
echo "Looking for Actor: '$1'"
# Title of the trailer whose Cast attribute contains the actor name
MOVIE=$(mdcli "selectattr /gilda/demo/trailers:Title 'like(/gilda/demo/trailers:Cast,\"%${1}%\")'")
echo "Selected Movie Title: '$MOVIE'"
# Entry (file name) whose Title matches the selected title
MOVIEFILE=$(mdcli "find /gilda/demo/trailers/*.avi 'Title = \"${MOVIE}\"'")
echo "Selected Trailer avi file: '$MOVIEFILE'"
# Current AMGA directory, used as the prefix of the LFN
MOVIESCD=$(mdcli "pwd")
echo "Uploading LFN file '$MOVIESCD$MOVIEFILE'"
lcg-cp "lfn:$MOVIESCD$MOVIEFILE" "file:$PWD/movie.avi"
...

mdclient.config:
# mdclient.config
Host = amga.ct.infn.it
Port = 8822
Login = NULL
PermissionMask = rwx
GroupMask = r-x
Home = /home/gilda
UseSSL = require
AuthenticateWithCertificate = 1
UseGridProxy = 1
VerifyServerCert = 0
TrustedCertDir = /etc/grid-security/certificates
RequireDataEncryption = 1

JDL file:
# amgajobdemo.jdl
Type = "Job";
JobType = "Normal";
Executable = "amgajobdemo.sh";
StdOutput = "amgajobdemo.out";
StdError = "amgajobdemo.err";
InputSandbox = {"mdclient.config", "amgajobdemo.sh"};
OutputSandbox = {"amgajobdemo.out","amgajobdemo.err"};
Arguments = "Kidman";
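The lookup logic of the pilot script can also be sketched in Python, which makes it easy to test without a Grid. This is an illustration under assumptions, not a replacement for the script: it calls the mdcli executable through subprocess exactly as the shell version does, assumes (like the shell version) that mdcli prints the bare value, and lets a fake runner be injected for testing. The names run_mdcli and select_trailer are hypothetical.

```python
import subprocess

TRAILERS = "/gilda/demo/trailers"


def run_mdcli(command):
    """Run one AMGA command through the mdcli client and return its output."""
    result = subprocess.run(["mdcli", command], capture_output=True,
                            text=True, check=True)
    return result.stdout.strip()


def select_trailer(actor, run=run_mdcli):
    """Return the LFN of the trailer whose Cast contains the given actor."""
    # Same three queries as amgajobdemo.sh: title by actor, file by title, pwd.
    title = run(f"selectattr {TRAILERS}:Title 'like({TRAILERS}:Cast,\"%{actor}%\")'")
    movie_file = run(f"find {TRAILERS}/*.avi 'Title = \"{title}\"'")
    collection = run("pwd")
    return f"lfn:{collection}{movie_file}"


def fake_mdcli(command):
    """Stand-in for mdcli with canned answers, used only to exercise the logic."""
    if command.startswith("selectattr"):
        return "moulin rouge!"
    if command.startswith("find"):
        return "moulinrouge.avi"
    return "/gilda/demo/trailers/"


lfn = select_trailer("Kidman", run=fake_mdcli)
```

On a worker node the default runner would be used, and the returned LFN would then be handed to lcg-cp as in the shell script. The trailing slash in the canned pwd answer is an assumption made so that the concatenation yields a valid path.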

26 Simple usage scenario
Grid Movie On Demand

27 gMOD: grid Movie On Demand
Algiers, Joint EPIKH/EUMEDGRID-Support Site Admin Tutorial, 27.06.2010
gMOD provides a Video-On-Demand service.
The user chooses from a list of videos, and the chosen one is streamed in real time to the video client on the user's workstation.
For each movie many details are stored, and users can search for a particular movie by querying on one or more attributes ( Title, Runtime, Country, Release Date, Genre, Director, Cast, Plot Outline ).
Two kinds of users can interact with gMOD: TrailersManagers, who can administer the DB of movies, and GILDA VO users (guests), who can browse, search and choose a movie to be streamed.

28 gMOD under the hood
Built on top of gLite services:
– Storage Elements, located in different places, physically contain the movie files
– LFC, the File Catalogue, keeps track of the Storage Element in which a particular movie is located
– AMGA is the repository of the detailed information for each movie, and makes queries on it possible
– The Virtual Organization Membership Service (VOMS) is used to assign the right role to the different users
– The Workload Management System (WMS) is responsible for retrieving the chosen movie from the right Storage Element and streaming it over the network down to the user's desktop or laptop
– GENIUS allows users to interact with the above Grid services

29 gMOD interactions
[Diagram labels: User; GENIUS Portal; VOMS (get Role); AMGA; LFC; SEs; WMS; CE; WNs; Job Request]

30 gMOD screenshot

31 Usage scenarios summary
– Grid File metadata (LFC)
– Gridified DB solution (Platform Independent DB)
– Job/Infrastructure Monitoring System (GANGA/MonAMI)
– Handle complex job workflows
– Producer/Consumer Job models
– Trivial parallelization management
– Partial/Full Output retrieval (Watchdog)
– I/O Sharing of data among different Users and Jobs
– Share data among Grid users securely (sensitive data)
– Easy backend to develop Digital Libraries (gLibrary)

32 Conclusion
AMGA – Metadata Service of gLite
– Part of gLite 3.1
    – Can be used with other middleware platforms
    – Useful to realize simple relational schemas or to add metadata information to Grid files
– Fully integrated with the Grid environment (security)
Features:
– Replication/Federation (root)
– Importing existing databases (root)
– SQL support
– Security (SSH, X509, Grid Proxies, VOMS, users/groups, ACLs)
– APIs / client applications
– SOAP
Tests have shown good performance/scalability.

33 References
AMGA Web Site: http://cern.ch/amga
AMGA Manual v2.0: http://amga.web.cern.ch/amga/downloads/2.0/amga-manual_2_0_0.pdf
AMGA API Javadoc: http://amga.web.cern.ch/amga/javadoc/index.html
AMGA Basic Tutorial: https://grid.ct.infn.it/twiki/bin/view/GILDA/AMGAHandsOn
More information on existing DB access:
– http://amga.web.cern.ch/amga/importing.html
– https://grid.ct.infn.it/twiki/bin/view/GILDA/AMGADBaccess

34 Questions …

