Subject: Integration of PVM into SGE 5.3
To: None <current-users@netbsd.org, tech-cluster@netbsd.org>
From: Co Thai Ngo <cngo@nmsu.edu>
List: current-users
Date: 06/16/2005 15:35:20
Hi,   
   
I'm trying to integrate PVM into SGE 5.3, but for some reason PVM does not
start. Here is what I did:
- Replaced SGE_ROOT/pvm with the version supplied on the SGE howto website
(http://gridengine.sunsource.net/howto/pvm-integration/pvm-integration.html)   
- Since NetBSD is not supported by PVM, I added the following line
(case nbsd-i386:) to SGE_ROOT/pvm/src/aimk
********   
case nbsd-i386:     
case glinux:     
case linux:     
set CC = gcc     
set CFLAGS = "-O -Wall -Werror -Wstrict-prototypes -DLINUX $DEBUG_FLAG $CFLAGS"
set LFLAGS = "$DEBUG_FLAG $LFLAGS"   
...     
*******     
- Defined PE:  
********    
arbutus# qconf -mp pvm   
pe_name           pvm  
queue_list        all  
slots             32  
user_lists        NONE  
xuser_lists       NONE  
start_proc_args   /usr/pkg/sge/pvm/startpvm.sh $pe_hostfile $host /usr/pkg/pvm3
stop_proc_args    /usr/pkg/sge/pvm/stoppvm.sh $pe_hostfile $host  
allocation_rule   1  
control_slaves    FALSE  
job_is_first_task TRUE  
  
Then I got the following errors when I ran the hello program:
*********  
acacia: {36} more tester_loose.sh.pe481  
[pvmd pid25581] 06/16 14:21:08 readhostfile() iflist failed  
startpvm: Couldn't get all of the 2 requested hosts  
rm: /tmp/481.1.yucca.q/hostfile: No such file or directory  
libpvm [pid27666] /tmp/pvmd.1024: No such file or directory  
libpvm [pid27666] /tmp/pvmd.1024: No such file or directory  
libpvm [pid27666]: pvm_halt(): Can't contact local daemon  
********  
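
The libpvm messages above point at /tmp/pvmd.1024, which appears to be the per-user file that pvmd leaves behind once it is running (the numeric suffix is assumed here to be the user's uid). A minimal sketch for checking whether that file exists on an execution host; the function name check_pvmd_file is made up for illustration and is not part of PVM or SGE:

```shell
#!/bin/sh
# Hypothetical helper: report whether the per-user pvmd file exists.
# Assumption: pvmd creates <dir>/pvmd.<uid>, as the /tmp/pvmd.1024
# path in the log suggests.
check_pvmd_file() {
    dir=${1:-/tmp}            # directory to look in, /tmp by default
    uid=${2:-$(id -u)}        # uid suffix, current user by default
    file="$dir/pvmd.$uid"
    if [ -e "$file" ]; then
        echo "found $file"
        return 0
    else
        echo "missing $file"
        return 1
    fi
}
```

Calling check_pvmd_file /tmp 1024 on yucca right after a failed job would show whether pvmd ever got far enough to create the file, or whether it exited earlier (for example at the readhostfile() step).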
  
******  
acacia: {37} more tester_loose.sh.po481  
/usr/pkg/sge/default/spool/yucca/active_jobs/481.1/pe_hostfile  
yucca.nmsu.edu /usr/pkg/pvm3  
/var/tmp/tmp.0.00025581aa  
startpvm.sh: startup failed - invoking cleanup script  
/usr/pkg/sge/default/spool/yucca/active_jobs/481.1/pe_hostfile yucca.nmsu.edu  
/usr/pkg/sge/default/spool/yucca/active_jobs/481.1/pe_hostfile yucca.nmsu.edu  
/usr/pkg/sge/default/spool/oenothera/active_jobs/481.1/pe_hostfile  
oenothera.nmsu.edu /usr/pkg/pvm3  
/var/tmp/tmp.0.00026663aa  
********
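
The output above shows startpvm.sh being handed a pe_hostfile path. As a way to inspect what that file contains, here is a minimal sketch that prints the host names from a pe_hostfile, assuming the usual SGE layout of one host per line with the host name in the first column. The function name pe_hosts is hypothetical, and this is not the code startpvm.sh itself uses:

```shell
#!/bin/sh
# Hypothetical helper: print the host names listed in an SGE pe_hostfile.
# Assumption: one host per line, host name in the first
# whitespace-separated column; blank lines are skipped.
pe_hosts() {
    awk 'NF > 0 { print $1 }' "$1"
}
```

Running pe_hosts on a saved copy of the pe_hostfile and comparing the result with the "2 requested hosts" in the startpvm message may help tell a bad host list apart from a pvmd startup failure.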
  
Does anyone know why PVM doesn't start? I've checked with the SGE people, and
they think the generated line "/var/tmp/tmp.0.00025581aa" shouldn't be there and
that maybe PVM is compiled in a special way on NetBSD, but they don't seem to
know how to fix it. I would highly appreciate it if anyone could help me fix
this problem.
Thank you very much,   
    
--     
Co Thai Ngo     
Dept. of Biology      
New Mexico State University