AMGAHandsOn

The mdclient AMGA metadata client application and its usage (hands-on)

The AMGA metadata catalog can be accessed using either the mdclient client application or a set of API calls. In this wiki you will learn how to configure and use the mdclient AMGA metadata client and how to create simple metadata structures inside an AMGA server. The numbered sections below cover each topic in turn.

1 Accessing and configuring the AMGA client (mdclient)

The AMGA client can be run from any gLite user interface (UI) on which it is installed. To check whether your UI can run the AMGA client, look at the output of the following command:

1.1) Check for the AMGA client and get its version

rpm -qa | grep -i amga-cli
(You should get a package name like: glite-amga-cli-X.Y.Z-w)
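
The same check can be scripted. The following is a minimal sketch, assuming a POSIX-compatible shell such as bash; the function name has_amga_client is hypothetical, and it does nothing more than the grep shown above, applied to a package listing read from standard input.

```shell
# Hypothetical helper: succeeds if the package list on stdin contains
# an AMGA client package (case-insensitive match on "amga-cli").
has_amga_client() {
    grep -qi 'amga-cli'
}

# Only query the RPM database when the rpm command is available.
if command -v rpm >/dev/null 2>&1; then
    if rpm -qa | has_amga_client; then
        echo "AMGA client package found"
    else
        echo "AMGA client package not found"
    fi
fi
```

Reading the listing from standard input keeps the helper usable on a saved package list as well as on live rpm output.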

Before starting the AMGA client application it is necessary to copy a configuration file into the home directory. There is a template file provided by the AMGA client installation that can be customised and used to access your own AMGA server.

1.2) Copy template configuration file to the HOME directory

cp $GLITE_LOCATION/etc/mdclient.config $HOME/.mdclient.config

The file can be hidden or not, depending on whether its name starts with the '.' character. The hidden copy in the $HOME directory has a global scope: the mdclient application reads it even when called from another directory. To override .mdclient.config, create a new mdclient.config file (without the '.'); this file applies only in the current directory.
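
The lookup order described here can be illustrated with a small shell function. This is only a sketch of the rule as stated in this page (a local mdclient.config wins over the hidden file in the home directory); the function name resolve_mdclient_config is hypothetical, and the real client may consult other locations as well.

```shell
# Hypothetical helper mirroring the lookup order described in the text:
# mdclient.config in the working directory overrides $HOME/.mdclient.config.
# Arguments: $1 = working directory, $2 = home directory.
resolve_mdclient_config() {
    if [ -f "$1/mdclient.config" ]; then
        printf '%s\n' "$1/mdclient.config"
    elif [ -f "$2/.mdclient.config" ]; then
        printf '%s\n' "$2/.mdclient.config"
    else
        return 1
    fi
}

# Report which file a client started from the current directory would use.
resolve_mdclient_config "$PWD" "$HOME" || echo "no mdclient configuration file found"
```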

Once you have copied your configuration file (.mdclient.config), open it with a text editor and change the following fields accordingly:

1.3) Setup the .mdclient.config configuration file

Login=NULL (no username will be used by the AMGA client)
UseSSL = require (The client will use a secure connection)
Port=8822 (AMGA server listening port)
Host=amga.ct.infn.it (AMGA server hostname)
AuthenticateWithCertificate = 1 (The AMGA client will get your username from the certificate)
UseGridProxy = 1 (The client will authenticate looking for a local grid proxy)
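
For repeated setups, the same settings can be written from a script. The sketch below assumes a POSIX-compatible shell; the function name write_mdclient_config and the scratch file name are hypothetical, and the key spelling follows the sample mdclient.config shown in the section on jobs. It writes only the six fields listed above, so in practice these values would be merged into the template copied in step 1.2 rather than replace it.

```shell
# Hypothetical helper: writes the settings listed above to the file given as $1.
# Host ($2) and port ($3) default to the tutorial server used on this page.
write_mdclient_config() {
    cat > "$1" <<EOF
Host = ${2:-amga.ct.infn.it}
Port = ${3:-8822}
Login = NULL
UseSSL = require
AuthenticateWithCertificate = 1
UseGridProxy = 1
EOF
}

# Write to a scratch file so that an existing configuration is not overwritten.
write_mdclient_config "${TMPDIR:-/tmp}/mdclient.config.example"
```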

Now the AMGA client application can be started:

1.4) Start the mdclient application

$ mdclient

Do not forget to have a valid grid proxy certificate including valid VOMS extensions. Use the command voms-proxy-init --voms gilda to initialize your proxy certificate (see Authentication and Authorization). Once you are successfully logged in, you will see the command prompt:

Connected to amga.ct.infn.it:8822
ARDA Metadata Server 1.2.0
Query> 

2 Getting help on mdclient usage

Help is available at any time in the client through the 'help' command.

2.1) Try the help command

Query> help
>> >help [topic]<
>> >Displays help on a command or a topic.<
>> >Valid topics are: help metadata metadata-optional directory entry group acl index schema sequence user view ticket commands<

Commands are grouped by topic. To get the list of valid commands for a topic, type: help [topic]. The list of valid topics is:

  • help
  • metadata
  • metadata-optional
  • directory
  • replication
  • entry
  • group
  • acl
  • index
  • schema
  • sequence
  • user
  • view
  • ticket
  • commands

2.2) Try the help command with any topic

Query> help entry

3 mdclient general commands

A brief description of the general-purpose commands follows:
>> createdir <path> [options]
Create a new directory. It can inherit the schema associated with its parent directory
>> rm pattern
Remove the items matching the given pattern
>> link <file>
Make a link to another file or to an external URL
>> dir <directory>
List the content of a directory
>> listentries <directory>
List the items (not the collections) of a directory
>> stat <filepattern>
Show statistical information about a directory
>> chown <file> <owner>
Change the ownership of a file or a directory
>> chmod <file> <rights>
Change the access rights to a file or a directory
>> rmdir <directory>
Remove a directory
>> dump <directory>
Make a recursive dump starting from a given directory (the default is '/')

4 mdclient Directory related commands

Following a filesystem model, AMGA uses directories, each of which has an associated 'schema' and contains 'entries'. This section describes a set of commands used to manipulate directories.

4.1) Browse the content of a directory

Query> dir <path>

Following the filesystem metaphor, the user can browse metadata in AMGA exactly as in a filesystem. The dir command lists all the entries and sub-directories of the directory given in the path, which allows users to define complex metadata hierarchies. When mdclient starts, the default directory is '/' (root). To override this setting, change the DefaultDir parameter in the .mdclient.config file to a different default directory.

Practice: try dir /

4.2) Print the current working directory

Query> pwd

The pwd command shows the current working directory.

Practice: try pwd

4.3) Change the current working directory

Query> cd <path>

The cd (change directory) command allows users to change their working directory and then browse the metadata hierarchy.

Practice: try cd /gilda/tutorials

4.4) Directory creation

Query> createdir <dirname>

The createdir command creates a directory named '<dirname>' in the current working directory. The user may specify an absolute path name, but all parent directories must exist.

Practice: under the /gilda/tutorials directory, create two directories named <date>_<your accountname>_X, where X is 1 or 2 and <date> has the format YYYYMMDD
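
The directory names for this practice can be generated rather than typed by hand. The following sketch assumes a POSIX-compatible shell; the function name tutorial_dir and the account name alice are placeholders introduced only for illustration. It prints the createdir commands to type at the Query> prompt and does not contact the server.

```shell
# Hypothetical helper: builds a practice directory path.
# $1 = account name, $2 = index (1 or 2),
# $3 = optional date in YYYYMMDD format (defaults to today).
tutorial_dir() {
    day=${3:-$(date +%Y%m%d)}
    printf '/gilda/tutorials/%s_%s_%s\n' "$day" "$1" "$2"
}

# Print the two createdir commands; "alice" is a placeholder account name.
for i in 1 2; do
    echo "createdir $(tutorial_dir alice "$i")"
done
```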

4.5) Directory removal

Query> rmdir <dirname>

The rmdir command removes the specified directory. Like the createdir command, it accepts an absolute path.

Practice: under the /gilda/tutorials directory, remove the directory named <date>_<your accountname>_2

5 mdclient Handling attributes

Once a directory has been created, a schema defining several attributes can be associated with it. By analogy with databases, directories can be thought of as tables and their attributes as columns with a given data type. Each attribute is defined by the pair (attribute name, attribute datatype).

5.1) Schema population

Query> addattr [path]<dirname> <attribute_name> <type_name>

Adds a new attribute to the schema associated with the given directory. Adding attributes to a directory defines a collection for it. The type is the name of an SQL datatype, which is translated (if necessary) into a data type understood by the DB back-end. Valid datatypes are summarized in the following table, which also shows the corresponding datatype for each AMGA DB back-end.

AMGA         | PostgreSQL           | MySQL                | Oracle       | SQLite       | Python
int          | integer              | int                  | number(38)   | int          | int
float        | double precision     | double precision     | float        | float        | float
varchar(n)   | character varying(n) | character varying(n) | varchar2(n)  | varchar(n)   | string
timestamp    | timestamp w/o TZ     | datetime             | timestamp(6) | unsupported  | time(unsupported)
text         | text                 | text                 | long         | text         | string
numeric(p,s) | numeric(p,s)         | numeric(p,s)         | numeric(p,s) | numeric(p,s) | float

Using the above datatypes, the user can be sure that the metadata can be easily moved between all supported DB back-ends. If portability across database back-ends is not a concern, native types of a particular back-end can also be specified (PostgreSQL, PostGIS, MySQL5, multipolygon, etc).


Examples
addattr /gilda/merida/tcaland MovieTitle varchar(100)
addattr /gilda/merida/tcaland Runtime int
addattr /gilda/merida/tcaland PlotOutline text

Practice: add the following attributes to the /gilda/tutorials/<date>_<your accountname>_1 directory:

id integer
name varchar(30)
remark varchar(100)
toberemoved int
(add any other attribute/type you wish) ...
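
Since each attribute needs its own addattr command, a list of name/type pairs can be turned into commands with a short loop. This is a sketch assuming a POSIX-compatible shell; the function name addattr_commands is hypothetical, and the commands are only printed so they can be pasted at the Query> prompt.

```shell
# Hypothetical helper: reads "name type" pairs from stdin and prints one
# addattr command per pair for the directory given as $1.
addattr_commands() {
    while read -r name type; do
        [ -n "$name" ] || continue    # skip blank lines
        printf 'addattr %s %s %s\n' "$1" "$name" "$type"
    done
}

# <date> and <your accountname> are left as placeholders on purpose.
addattr_commands '/gilda/tutorials/<date>_<your accountname>_1' <<'EOF'
id integer
name varchar(30)
remark varchar(100)
toberemoved int
EOF
```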

5.2) Attribute listing

Query> listattr <path>

The listattr command shows the full list of attributes associated with the directory given by <path>.

Practice: try listattr /gilda/tutorials/<date>_<your accountname>_1; you will get all the attributes just inserted

5.3) Attribute Removal

Query> removeattr dir attribute

Removes an attribute from a directory if it is not used by any entry inside the directory.

Practice:
Use listattr to show your attributes
try: removeattr /gilda/tutorials/<date>_<your accountname>_1 toberemoved
Run listattr once again and notice that the attribute is gone

6 Managing Entries

Once the schema has been defined, entries can be inserted.

6.1) Entry creation

Query> addentry entry (attribute value)+ (+ means that more than one attribute/value pair can be specified)

Add a new entry specifying one or more attribute values.
Example: addentry /gilda/rio/tcaland/madagascar.mov MovieTitle Madagascar

Practice: Add the following entries:

addentry /gilda/tutorials/<date>_<your accountname>_1/001_entry id 1 name 'entry 1' remark 'rem of entry 1' 
addentry /gilda/tutorials/<date>_<your accountname>_1/002_entry id 2 name 'entry 2' remark 'rem of entry 2'
addentry /gilda/tutorials/<date>_<your accountname>_1/003_entry id 3 name 'entry 3' remark 'rem of entry 3' 
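
The three commands above follow a regular pattern, so they can also be generated. The sketch below assumes a POSIX-compatible shell; the function name addentry_commands is hypothetical, and the output is meant to be pasted at the Query> prompt.

```shell
# Hypothetical helper: prints addentry commands for entries 1..$2
# under the directory $1, following the naming used in the practice above.
addentry_commands() {
    i=1
    while [ "$i" -le "$2" ]; do
        printf "addentry %s/%03d_entry id %d name 'entry %d' remark 'rem of entry %d'\n" \
            "$1" "$i" "$i" "$i" "$i"
        i=$((i + 1))
    done
}

# <date> and <your accountname> are left as placeholders on purpose.
addentry_commands '/gilda/tutorials/<date>_<your accountname>_1' 3
```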

6.2) Setting attribute values

Query> setattr entry (attribute value)+ (+ means that more than one attribute/value pair can be specified)

Attribute values of a given entry can be changed at any time using the setattr command.

Example: setattr /gilda/…/madagascar.mov Runtime 86

Practice:
We can change the id of the first entry by typing: setattr /gilda/tutorials/<date>_<your accountname>_1/001_entry id 100

6.3) Getting attribute values

Query> getattr pattern (attribute)+ (+ means that more than one parameter can be specified)

Returns the values of the given attributes for every entry matching the given pattern
Example: getattr /gilda/…/tcaland/*.mov MovieTitle

Practice:
View all inserted ids: getattr /gilda/tutorials/<date>_<your accountname>_1/* id
(try more examples changing the pattern, the fields etc)

6.4) Entry deletion

Query> rm pattern

Removes all entries matching pattern
Example: rm /gilda/…/m*.mov

Practice:
Remove the first entry: rm /gilda/tutorials/<date>_<your accountname>_1/001_entry
Notice that the entry is gone by executing: getattr /gilda/tutorials/<date>_<your accountname>_1/* id

7 Metadata Queries

One of the most important features of metadata is the ability to find entries by querying for a particular attribute value.

7.1) find Command

Query> find pattern 'query_condition'

Returns all entries matching the pattern where the query_condition is true
Example: find /gilda/…/tcaland/*.avi 'Runtime > 80'

Practice:
Get the third entry with:

find /gilda/tutorials/<date>_<your accountname>_1/* 'id > 2' 

(try more complex queries, e.g.: 'like(remark,"%2%")')
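
The nested quoting in like() conditions is easy to get wrong, so it can help to build the condition with a small function. This is a sketch assuming a POSIX-compatible shell; the function name like_condition is hypothetical, and the resulting find command is only printed.

```shell
# Hypothetical helper: builds a like() condition matching values of
# attribute $1 that contain the substring $2.
like_condition() {
    printf 'like(%s,"%%%s%%")\n' "$1" "$2"
}

# Print a complete find command for the practice directory.
echo "find /gilda/tutorials/<date>_<your accountname>_1/* '$(like_condition remark 2)'"
```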

7.2) View attribute values

Query> selectattr attr... condition

Returns the values of the given attributes for all entries matching the given condition:
Example: selectattr .:MovieTitle .:Runtime 'Runtime > 80'

Practice:

cd /gilda/tutorials/<date>_<your accountname>_1 
selectattr .:id .:remark 'like(remark,"%2%")'

8 Exercise

The following exercise summarizes all the above steps. There is just one difference regarding the entry name. This time the entry names will be used as a reference to GUIDs registered into a file catalog (please refer to the DMS hands-on).

  • Log into the Metadata Catalog
  • Create a directory named after your surname under /grid/gilda/<directory>
  • Add some attributes (Description-varchar(100), Value-int, Comment-text) to your directory
  • Add some entries using as entry name the GUIDs you uploaded and registered into the File Catalog during the DMS hands-on session
  • Fill the attribute fields for the inserted entries
  • Look for the entries having 'Value' > 50
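
One possible shape for the command sequence of this exercise is sketched below, assuming a POSIX-compatible shell and the non-interactive mdcli client. The wrapper name md, the MD_DRY_RUN variable, the placeholder <guid> and the sample attribute values are all hypothetical. By default the commands are only printed, so nothing is sent to the server.

```shell
# Dry-run by default: commands are printed, not sent to the server.
# Set MD_DRY_RUN=0 to execute them through mdcli (requires a valid proxy).
MD_DRY_RUN=${MD_DRY_RUN:-1}

md() {
    if [ "$MD_DRY_RUN" = 1 ]; then
        printf '%s\n' "$1"
    else
        mdcli "$1"
    fi
}

# Placeholder values: replace with your own directory and a registered GUID.
DIR='/grid/gilda/<directory>/<surname>'
GUID='<guid>'

md "createdir $DIR"
md "addattr $DIR Description varchar(100)"
md "addattr $DIR Value int"
md "addattr $DIR Comment text"
md "addentry $DIR/$GUID Description 'first file' Value 60 Comment 'demo entry'"
md "find $DIR/* 'Value > 50'"
```

Keeping the dry-run mode as the default makes it possible to review the sequence before any metadata is created.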

9 Jobs using AMGA

Since AMGA can be accessed using Grid proxies, users can write Grid jobs that query or create metadata. Grid jobs normally use the non-interactive AMGA client application named mdcli. A small example follows:

The Grid job needs an mdclient.config file, sent via the JDL InputSandbox field. An example mdclient.config follows:

# mdclient.config
Host = amga.ct.infn.it
Port = 8822
Login=NULL
PermissionMask = rwx
GroupMask = r-x
Home = /home/gilda
UseSSL = require
AuthenticateWithCertificate = 1 
UseGridProxy = 1
VerifyServerCert = 0
TrustedCertDir = /etc/grid-security/certificates 
RequireDataEncryption = 1 

The Grid job main script:

#!/bin/bash
# amgajobdemo.sh
# Usage: amgajobdemo.sh <actor name>
echo "Looking for Actor: '$1'"
# Title of the movie whose Cast attribute contains the actor name
MOVIE=$(mdcli "selectattr /gilda/demo/trailers:Title 'like(/gilda/demo/trailers:Cast,\"%${1}%\")'")
echo "Selected Movie Title: '$MOVIE'"
# Entry (trailer file) whose Title matches the selected movie
MOVIEFILE=$(mdcli "find /gilda/demo/trailers/*.avi 'Title = \"${MOVIE}\"'")
echo "Selected Trailer avi file: '$MOVIEFILE'"
# Current AMGA working directory, used here as the LFN prefix
MOVIESCD=$(mdcli "pwd")
echo "Uploading LFN file '$MOVIESCD$MOVIEFILE'"
echo "Now I could get the Grid file with: lcg-cp lfn:$MOVIESCD$MOVIEFILE file:$PWD/movie.avi"

The JDL file:

# amgajobdemo.jdl
Type = "Job";
JobType = "Normal";
Executable = "amgajobdemo.sh";
StdOutput = "amgajobdemo.out";
StdError = "amgajobdemo.err";
InputSandbox = {"mdclient.config", "amgajobdemo.sh"};
OutputSandbox = {"amgajobdemo.out","amgajobdemo.err"};
Arguments = "Kidman";

-- Main.RiccardoBruno - 24 Jul 2006 Hands-on on AMGA Metadata catalog
