
Re: [Testbed-admins] Can not bring up the first node for a new Emulab site



So you have a couple of things going on.  One, it appears there must be
a DHCP server running on whatever network de0 is attached to.  The whole
DHCP-to-find-the-control-net technique is fraught with peril and only
works reliably if there is a single attached network with a DHCP server on
it.  You need to look into that, but it is not the problem that is stopping
you here, because the determination of who is "boss" is independent of
the control net.

By default, boss is determined by who the node's DNS server is.  This is
apparently set correctly because the node was able to query boss and get
the "IPOD" info (I assume that 192.168.56.3 is your boss node).

The "Unable to get address" error is caused by boss not returning the
correct "loadinfo".  The first thing to try is to force a reset of the
node's reload-related DB state by running:

  nfree emulab-ops reloading pcXXX

(where pcXXX is your node name).  This will force the node back through
the reloading process.

If that doesn't change anything, look at /usr/testbed/log/tmcd.log on boss
and see what it says.  After an attempted boot of this node you should see
something like:

  Aug  6 08:49:02 boss tmcd[71339]: pc236: vers:30 TCP loadinfo

and then if it was working:

  Aug  6 08:49:02 boss tmcd[71339]: pc236: loadinfo wrote 105 bytes

If it claims to be returning the info, then you should be able to log in
to the node on the console as root (you cannot ssh in) and run:

  /etc/testbed/tmcc loadinfo

and see what it returns.
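
If you want to check the boss side quickly, the tmcd.log test described
above could be scripted roughly as follows.  This is only a sketch: the
function name check_loadinfo is made up for illustration, and the patterns
are inferred from the two sample log lines shown above, so other tmcd
versions may log a different format.

```shell
#!/bin/sh
# check_loadinfo NODE [LOGFILE]   (hypothetical helper, not part of Emulab)
# Counts how often NODE asked tmcd for "loadinfo" and how often tmcd
# reported writing a reply.  Patterns are assumptions based on lines like:
#   ... tmcd[71339]: pc236: vers:30 TCP loadinfo
#   ... tmcd[71339]: pc236: loadinfo wrote 105 bytes
check_loadinfo() {
    node="$1"
    log="${2:-/usr/testbed/log/tmcd.log}"

    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi

    # grep -c prints 0 but exits non-zero when nothing matches; "|| true"
    # keeps that from aborting scripts that run under "set -e".
    requests=$(grep -c " ${node}: vers:[0-9]* [A-Z]* loadinfo" "$log" || true)
    replies=$(grep -c " ${node}: loadinfo wrote [0-9][0-9]* bytes" "$log" || true)

    echo "requests=${requests} replies=${replies}"
}
```

Run it on boss after an attempted boot, for example "check_loadinfo pc236".
Requests with no replies would suggest boss is not returning the loadinfo;
no requests at all would suggest the node never reached tmcd.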

On Thu, Aug 06, 2009 at 10:31:49AM -0400, chunhui Zhang(Evan) wrote:
> On Wed, Aug 5, 2009 at 8:03 PM, Mike Hibler <mike@flux.utah.edu> wrote:
> 
> > Any chance you are getting a DHCP reply over de0?
> >
> 
> This time I disconnected the de0, it showed,
> ---------------------------------------------
> Setting hostname: .
> Emulab looking for control net among: de0 xl0 ...
> xl0: link state changed to UP
> Terminated
> *Emulab control net is xl0*
> ...
> ...
> ...
> Starting local daemons: Playing Frisbee ...
> *de0: autosense failed: cable problem?*
> Authenticated IPOD enabled from 192.168.56.3/255.255.255.255
> Unable to get address for loading image
> Failed to load disk, dropping to login prompt at Wed Aug  6 08:11:43 MDT
> 2009
> --------------------------------------------------------
> 
> >
> > There should be some logfiles from the DHCP process in /var/tmp.
> > What is in /var/tmp/netif-emulab.log?
> 
> 
> I did 'cat /var/tmp/netif-emulab.log' and it showed,
> --------------------------------------------------------
> *Using dhclient port...*
> --------------------------------------------------------
> 
> Thanks for the help,
> Evan Zhang
> 
> >
> >
> > On Wed, Aug 05, 2009 at 07:52:21PM -0400, chunhui Zhang(Evan) wrote:
> > > I got to the point that,
> > > ----------------------------------------
> > > Attempting boot of: /tftpboot/frisbee
> > > Loading /boot/defaults/loader.conf
> > > /boot/kernel text= ............................................etc
> > > /boot/acpi.ko text= ..........................................etc
> > >
> > > Hit [Enter] to boot immediately, or any other key for command prompt.
> > > Booting [/boot/kernel]...
> > > ...
> > > ...
> > > ...
> > > Setting hostname: .
> > > Emulab looking for control net among: de0 xl0 ...
> > > xl0: link state changed to UP
> > > Emulab control net is de0
> > > ...
> > > ...
> > > ...
> > > Starting local daemons: Playing Frisbee ...
> > > Authenticated IPOD enabled from 192.168.56.3/255.255.255.255
> > > Unable to get address for loading image
> > > Failed to load disk, dropping to login prompt at Wed Aug  5 17:36:44 MDT
> > > 2009
> > > ...
> > > ...
> > > ...
> > > FreeBSD/i386 (pc1) (console)
> > >
> > > login:
> > >
> > -------------------------------------------------------------------------------------
> > >
> > > Above message tells me the control net is de0 which is not true. xl0 is
> > the
> > > one I used as control NIC and the DB also have the knowledge that. My
> > node
> > > was booted from xl0 and de0 do not have pxe boot ability at all.  Then I
> > > logged in through the dropped prompt and did 'cat
> > /etc/testbed/controlif',
> > > it showed 'xl0'. I traced through the source code and find out 'tmcc
> > > loadinfo' gave me nothing. I am not sure how to debug this issue further.
> > So
> > > I am wondering do you see the similar issue before? And any suggestions?
> > >
> > > Thanks a lot,
> > > Evan Zhang
> > >
> > >
> > > On Tue, Aug 4, 2009 at 10:42 PM, chunhui Zhang(Evan) <chyz198@gmail.com
> > >wrote:
> > >
> > > >
> > > >
> > > > On Tue, Aug 4, 2009 at 10:26 PM, Mike Hibler <mike@flux.utah.edu>
> > wrote:
> > > >
> > > >> On Tue, Aug 04, 2009 at 09:40:45PM -0400, chunhui Zhang(Evan) wrote:
> > > >> > On Mon, Aug 3, 2009 at 6:05 PM, Mike Hibler <mike@flux.utah.edu>
> > wrote:
> > > >> >
> > > >> > > I will jump in the middle here, please excuse me if you have
> > already
> > > >> > > answered
> > > >> > > any of my questions!
> > > >> > >
> > > >> > > In one of your earlier messages you had some text that "came out",
> > was
> > > >> >
> > > >> >
> > > >> > > that on the VGA or serial line?  It doesn't matter, as long as we
> > have
> > > >> a
> > > >> > > console that works.
> > > >> > >
> > > >> >
> > > >> > >
> > > >> > > Do I understand correctly that the 47 MFS boots on the PIII node
> > but
> > > >> the
> > > >> > > 62 version does not?  Strange...
> > > >> > >
> > > >> > > No matter, let's go back to the 62 MFS, the 47 is a loser long
> > term.
> > > >> > >  Verify
> > > >> > > that you have the right console selected in
> > > >> > > /tftpboot/freebsd.newnode/boot/loader.conf.orig
> > > >> >
> > > >> >
> > > >> > By changing the option 'console="comconsole"' to
> > 'console="vidconsole"'
> > > >> in
> > > >> > loader.conf.orig solved my problem!
> > > >> > Thank you so much!  BTW, where can I find the document talking about
> > > >> this
> > > >> > configuration?
> > > >> >
> > > >> >
> > > >> > Evan Zhang
> > > >> >
> > > >>
> > > >> How much of your problem did it solve?  Will the MFS boot on the newer
> > > >> node?
> > > >>
> > > >
> > > >  Yes, all three MFSs can boot now.
> > > >
> > > >>
> > > >> Re: documentation, I thought that console config was in the README for
> > the
> > > >> MFS
> > > >> but it isn't!  We need to fix that.
> > > >
> > > >
> > > > Evan
> > > >
> > > >
> > > >
> >
> > > _______________________________________________
> > > Testbed-admins mailing list
> > > Testbed-admins@flux.utah.edu
> > > http://www.flux.utah.edu/mailman/listinfo/testbed-admins
> >
> >