
Re: [Testbed-admins] Emulab Rebuild Problems



The control interface for pc20 wound up in the interfaces table as an "expt" interface. I can see from the archive that this problem has come up before. Does this mean I haven't adequately isolated the control switch from the experimental switch?

I tried various vlan and trunkport settings on the switches, but it seems that I either block ALL traffic from boss to the experimental switch, or the switchmac script reports that it finds the control interface of pc20 on the trunk port (port 48) of the experimental switch. (That is, every configuration in which I can ping the experimental switch from boss seems to wind up confusing switchmac.)

Note that we have not divided up our control switch into vlans as discussed in the network design portion of the install instructions. The instructions make this sound like an option -- perhaps it's not? Or have I just not hit on the right vlan and trunkport settings on the experimental switch yet?

As for power controllers -- we aren't using any at present.

On 5/28/2010 6:17 PM, Mike Hibler wrote:
Did the control net information for that machine wind up in your DB?
You can do "mysql tbdb" and then:
     select * from interfaces where node_id='pc20';
and see if there is an entry for the control network.

What do you have for power controllers?

On Fri, May 28, 2010 at 05:50:56PM -0500, Barry Trent wrote:
I'm working on re-building our small emulab with new hardware here at
Architecture Technology in Minnesota and I'm having some trouble.

The first big problem I think I have solved: We are using Cisco 2948
switches for the experimental network. I figured the type field in the
node_types and nodes tables should be 'cisco2948'. Wrong. The 2948 is
actually in the 4000 class of Catalyst switches! The difference appears
to be that it uses "community string indexing" for some of its MIBs.
(Described here:
ftp://ftp-sj.cisco.com/pub/mibs/supportlists/wsc4000/wsc4000-communityIndexing.html).


So -- note for posterity: For the Cisco 2948 switch, set its type value
to 'cisco4000' in the type fields of the node_types and nodes tables.
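
A rough sketch of that change, for anyone who entered 'cisco2948' first as I did. The helper only prints the two UPDATE statements so they can be reviewed before being piped into "mysql tbdb". The helper name switch_type_sql and the switch name cisco1 are made up for illustration, and the node_id column in the nodes table is an assumption (it mirrors the interfaces table). If node_types already has a 'cisco4000' row, the first statement is probably unnecessary and may fail.

```shell
#!/bin/sh
# Hypothetical helper: emit SQL that retypes a Cisco 2948 as 'cisco4000'.
# Assumes the switch was first entered with type 'cisco2948' and that the
# nodes table identifies rows by node_id (an assumption, not verified here).
switch_type_sql() {
    switch_id="$1"   # node_id of the switch, e.g. the made-up name cisco1
    printf "UPDATE node_types SET type='cisco4000' WHERE type='cisco2948';\n"
    printf "UPDATE nodes SET type='cisco4000' WHERE node_id='%s';\n" "$switch_id"
}

# Review the output first, then pipe it to the DB on boss:
#   switch_type_sql cisco1 | mysql tbdb
```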

Now the new problem I'm up against:

We PXE boot our testbed machines and they load the freebsd.newnode and
appear in the "New Testbed Nodes" page of the web interface. We "Search
switch ports for selected nodes". Our 5 experimental interfaces are
properly discovered but the control interface isn't. I figure this
shouldn't be a big problem(?) -- we enter this manually.

When we try to actually "Create" the node the operation appears to succeed:

-----
/usr/testbed/www
pc20 succesfully added!
Re-generating dhcpd.conf
Restarting dhcpd: /usr/local/bin/sudo -S /usr/local/etc/rc.d/2.dhcpd.sh stop
Restarting dhcpd: /usr/local/bin/sudo -S /usr/local/etc/rc.d/2.dhcpd.sh start
   dhcpd wrapperSetting up nameserver
Running exports_setup
Rebooting nodes...
Rebooting 192.168.10.109


Finished - when you are satisifed that the nodes are working
correctly, use nfree on boss to free them from the emulab-ops/hwdown
experiment.
-----

BUT, the node never reboots. It just sits there at the login prompt of
the freebsd.newnode boot image. It never "reports in" to the hwdown
experiment (although we can see that there is now a machine in the
hwdown experiment, the idle time just keeps going up). No entry for the
machine actually gets placed into the re-generated dhcpd.conf file
either -- I presume that one should. If we reboot manually we go right
back to the newnode image.
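
For reference, this is roughly how the missing dhcpd.conf entry can be checked after each attempt. It is only a sketch: the helper name dhcpd_has_node is made up, and the config path in the usage comment is an assumption (the usual FreeBSD ports location), so substitute wherever your install writes the file.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given dhcpd.conf mentions the node
# as a whole word (so "pc20" does not match "pc200").
dhcpd_has_node() {
    conf="$1"   # path to the re-generated dhcpd.conf
    node="$2"   # node name to look for, e.g. pc20
    grep -q -w -- "$node" "$conf"
}

# Example (path is an assumption, adjust for your install):
#   dhcpd_has_node /usr/local/etc/dhcpd.conf pc20 || echo "pc20 missing"
```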

Any ideas/suggestions on how to troubleshoot this problem?
_______________________________________________
Testbed-admins mailing list
Testbed-admins@flux.utah.edu
http://www.flux.utah.edu/mailman/listinfo/testbed-admins