Data transfer from Cascina with Grid tools

Objectives

Our objective is to ensure reliable data transfer from Cascina to the various computer centers (Bologna, Lyon, Budapest?), with the following requirements:

  1. Uniform interface to all the CCs.
  2. As lightweight as possible, since it is going to be installed on the head of the disk pool.
  3. VOMS-aware.
  4. Bandwidth is not really an issue; we need something like 40 MByte/sec.
  5. In principle, a one-directional data flow with a limited number of users performing the transfer.
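
As a rough sanity check of requirement 4, the sustained rate can be converted into a daily volume. This is only a minimal sketch, assuming decimal units (1 GByte = 1000 MByte) and a transfer that runs around the clock; the function name is illustrative.

```shell
#!/bin/sh
# Convert a sustained transfer rate in MByte/s into a daily volume in GByte.
# Assumptions: decimal units, 24 hours of continuous transfer.
daily_volume_gb() {
    rate_mb_per_s=$1
    # 86400 seconds per day, 1000 MByte per GByte
    echo $(( rate_mb_per_s * 86400 / 1000 ))
}

daily_volume_gb 40   # prints 3456
```

So 40 MByte/sec corresponds to about 3.5 TByte per day, or 320 Mbit/sec on the wire before any protocol overhead.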

Grid storage elements

Currently a wide variety of storage elements are used on the Grid: log-file servers, the Classic SE, the Disk Pool Manager (DPM), dCache, STORM, Castor, and so on. They are suitable for different purposes. STORM, Castor and dCache are relatively demanding to maintain; they are meant for big tape and disk installations serving hundreds or thousands of users. DPM is easy to install and maintain, yet robust, and has no a priori limit on its size, but it lacks a tape backend. The Classic SE is no longer supported; unlike all the others, it keeps no file database in memory but uses the unix filesystem directly. A great advantage of this is that, through a mount, it can provide POSIX-like byte-level access to its data, so data-mining activities could profit from it. None of them has an internal bandwidth limitation; it is just a question of hardware and wire.

Taking our purposes into account, we are left with three possible choices: Classic SE, log-file server, DPM.

  • Since Classic SE is not supported any more, I wouldn't choose it.
  • DPM would be a very good choice, but
    • The files on the disks have to be imported into the DPM database, which requires extra attention.
    • It comes with a relatively large number of software packages to be installed. (If these two points are not a problem, then we can go for it.)
  • Log-file servers are very easy to install and configure, and only 4-5 RPMs are necessary, so I'm going to describe how to install one in the next section.

Simplest possible implementation

A log-file server is nothing more than a VOMS-aware gridftp server.

  1. Install the following packages:
        vdt_globus_data_server
        vdt_globus_essentials
        glite-initscript-globus-gridftp
        edg-mkgridmap
        se_gridftp
        glite-yaim-core
       
    They are available in the gLite repositories, except se_gridftp, which you can download from here. It is essentially a set of YAIM functions cooked up by Maarten Litmaath.
  2. Configure it by running YAIM. All you need is a users.conf, a groups.conf and a site-info.def file with the correct settings. Here you can find examples of users.conf, groups.conf and site-info.def.
        cd /root
        mkdir yaimconfig
        cd yaimconfig
        wget http://grid.kfki.hu/afs/gdebrecz/web/virgo/users.conf
        wget http://grid.kfki.hu/afs/gdebrecz/web/virgo/groups.conf
        wget http://grid.kfki.hu/afs/gdebrecz/web/virgo/site-info.def
       
    Edit the values according to your needs, then run
       /opt/glite/yaim/bin/yaim -c -s ./site-info.def -n SE_gridftpd -d 5
       
    The YAIM run will configure and start the gridftp server and create the unix groups and users defined in the config files.
  3. Make your storage directory readable by the corresponding unix group.
  4. Then, from another machine, you can test the installation and use, for example, lcg-cp or lcg-cr to move and/or register your data and perform third-party copies.
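
Step 3 can be done with standard unix tools. The following is only a sketch, not part of the tested procedure: the function name, the example path /data/virgo and the group name virgo are assumptions, and the real group is whichever one YAIM created from your groups.conf and users.conf.

```shell
#!/bin/sh
# Give a unix group read access to a storage directory tree (step 3).
# Usage: make_group_readable <directory> <group>
make_group_readable() {
    dir=$1
    group=$2
    # Assign the whole tree to the group used by the VO's pool accounts.
    chgrp -R "$group" "$dir" || return 1
    # g+r lets the group read files; g+X lets it traverse directories
    # (and keeps files that are already executable executable).
    chmod -R g+rX "$dir"
}

# Example with a hypothetical path and group name:
# make_group_readable /data/virgo virgo
```

Only read and traverse permissions are added, which matches the one-directional data flow out of the storage directory; write access for the group is left unchanged.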

I've tested this procedure and found it to work fine. In case of any problem, contact me.

Notes

  • The installed gridftp server makes it possible to calculate the checksum of the files.
  • Transfers can be controlled from any computer; there is no need to install a live UI.
  • Make sure to create a sufficient number of unix pool accounts for the VO!
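
To check the last point before running YAIM, the pool accounts in users.conf can be counted per VO. This is a minimal sketch that assumes the usual YAIM line format UID:LOGIN:GID[,GID...]:GROUP[,GROUP...]:VO:FLAG: and treats lines with an empty FLAG field as ordinary pool accounts; the function name and the VO name virgo are illustrative.

```shell
#!/bin/sh
# Count the ordinary pool accounts defined for one VO in a YAIM users.conf.
# Assumed line format: UID:LOGIN:GID[,GID...]:GROUP[,GROUP...]:VO:FLAG:
# Usage: count_pool_accounts <users.conf> <vo>
count_pool_accounts() {
    awk -F: -v vo="$2" '
        /^[[:space:]]*#/ { next }          # skip comment lines
        $5 == vo && $6 == "" { n++ }       # VO matches, no special flag
        END { print n + 0 }                # print 0 when nothing matched
    ' "$1"
}

# Example:
# count_pool_accounts /root/yaimconfig/users.conf virgo
```

How many accounts count as sufficient depends on how many distinct users will perform transfers; with the limited number of users expected here, a small pool should do.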

-- Gergely Debreczeni - Feb 23, 2009

Topic revision: r2 - 2009-02-24 - GergelyDebreczeni
 