RE: [Beowulf] MPICH problem
I've run into similar problems on a Linux Xeon cluster with Ethernet and mpich2-0.96p2. I ended up just using mpich-1.2.5.2. Both were compiled from source with gcc (for C MPI programs). With version 2-0.96p2 I could not get any sample program to run on more than a single node (a single node, incidentally, worked), not even programs that just initialize MPI and don't do any real message passing.
Which version are you using?
Isaac Dooley
I'm having some problems running MPI programs on a Beowulf cluster.
The cluster is composed of 12 Linux machines, and the compilation of the
mpich libraries went well. I've also configured the machines.LINUX file
so that it lists all machines available in the cluster. When I try to
run a program I get the following error:
$ mpirun -np 3 cpi
rm_924: p4_error: rm_start: net_conn_to_listener failed: 33064
p0_22381: p4_error: Child process exited while making connection to remote process on a01: 0
/opt/mpich/bin/mpirun: line 1: 22381 Broken pipe /nfshome/ex/cpi -p4pg /nfshome/ex/PI22264 -p4wd /nfshome/ex
/nfshome is an NFS-shared directory, and a01 is accessible by rsh.
Can someone help me with this error?
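One thing that may be worth ruling out (this is a guess, not something the thread confirms) is that some hostname listed in machines.LINUX does not resolve correctly on every node, since the remote process has to connect back to a listener on the launching host. Below is a minimal Python sketch of such a check. The function names parse_machines and unresolved are made up for illustration, and the file format is assumed to be one "hostname" or "hostname:ncpus" entry per line with "#" comments.

```python
import socket


def parse_machines(text):
    """Return the hostnames found in machines-file text.

    Assumed format: one "hostname" or "hostname:ncpus" per line,
    with blank lines and "#" comments ignored.
    """
    hosts = []
    for line in text.splitlines():
        # Drop any trailing comment and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Keep only the hostname part of "hostname:ncpus".
        hosts.append(line.split(":", 1)[0])
    return hosts


def unresolved(hosts, resolve=socket.gethostbyname):
    """Return the hosts whose names fail to resolve on this machine."""
    bad = []
    for host in hosts:
        try:
            resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            bad.append(host)
    return bad
```

To use it, read the machines file on each node and call unresolved(parse_machines(contents)); an empty list means every listed name resolved on that node. It only checks name resolution, so a firewall or rsh problem would need a separate test.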