
Re: [Testbed-admins] Emulab Rebuild Problems



You will have to refresh me on your HW configuration.  You do have separate
control and experimental switches, correct?  Just one of each?

Since you aren't sub-segmenting your control network, the config should be
that boss has two interfaces connected to your control net switch, one in
the "node control network" VLAN (VLAN 3 for us) and one in the "hardware
control network" VLAN (VLAN 10 for us).  There should be a wire between
the control and experimental switches, but it doesn't need to be a trunk
link; both ports just need to be in VLAN 10.

This should prevent the switchmac process from seeing the control interfaces
via the experiment net.  But even when it is a trunk, switchmac should ignore
it, if the ends are marked as a trunk in the DB "wires" table, a la:

mysql> select * from wires where type='Trunk' and (node_id1='cisco2' or node_id2='cisco2');
+-------+-----+-------+----------+-------+-------+----------+-------+-------+
| cable | len | type  | node_id1 | card1 | port1 | node_id2 | card2 | port2 |
+-------+-----+-------+----------+-------+-------+----------+-------+-------+
| 2646  | 90  | Trunk | cisco14  | 1     | 1     | cisco2   | 5     | 12    |
+-------+-----+-------+----------+-------+-------+----------+-------+-------+
1 row in set (0.00 sec)

But you probably don't want that wire to be a trunk, unless there is more
going on in your config than I remember.
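
For illustration only, here is a rough sketch of the idea behind that "wires" lookup: find the ports on a switch that are marked as trunk ends, then drop any MAC sightings that came from those ports. This is not the actual switchmac code. It uses an in-memory SQLite table as a stand-in for tbdb, the helper names (make_demo_db, trunk_ports, drop_trunk_sightings) are made up, and the second wires row and the MAC addresses are hypothetical sample data.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory stand-in for the tbdb 'wires' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wires (cable INTEGER, len INTEGER, type TEXT, "
        "node_id1 TEXT, card1 INTEGER, port1 INTEGER, "
        "node_id2 TEXT, card2 INTEGER, port2 INTEGER)"
    )
    conn.executemany(
        "INSERT INTO wires VALUES (?,?,?,?,?,?,?,?,?)",
        [
            # The trunk row shown in the mysql output above.
            (2646, 90, "Trunk", "cisco14", 1, 1, "cisco2", 5, 12),
            # Hypothetical non-trunk wire, for contrast only.
            (1, 6, "Node", "pc20", 0, 1, "cisco2", 1, 1),
        ],
    )
    return conn


def trunk_ports(conn, switch):
    """Return the set of (card, port) pairs on 'switch' that are trunk ends."""
    rows = conn.execute(
        "SELECT node_id1, card1, port1, node_id2, card2, port2 FROM wires "
        "WHERE type='Trunk' AND (node_id1=? OR node_id2=?)",
        (switch, switch),
    )
    ports = set()
    for n1, c1, p1, n2, c2, p2 in rows:
        # A trunk wire names the switch on one side or the other.
        if n1 == switch:
            ports.add((c1, p1))
        if n2 == switch:
            ports.add((c2, p2))
    return ports


def drop_trunk_sightings(sightings, trunks):
    """Keep only (mac, card, port) sightings that are not on a trunk port."""
    return [s for s in sightings if (s[1], s[2]) not in trunks]


conn = make_demo_db()
trunks = trunk_ports(conn, "cisco2")
# Hypothetical MAC sightings on cisco2: one on the trunk port, one on a node port.
sightings = [("00:11:22:33:44:55", 5, 12), ("00:11:22:33:44:66", 1, 1)]
kept = drop_trunk_sightings(sightings, trunks)
print(trunks)
print(kept)
```

If the link between your switches really is a trunk, both ends need to show up in a lookup like trunk_ports() for the sightings on it to be ignored; with a plain VLAN 10 access link there is nothing to mark.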

On Sat, May 29, 2010 at 11:11:11AM -0500, Barry Trent wrote:
> The control interface for pc20 wound up in the interfaces table as an 
> "expt" interface. I can see from the archive that this problem has come 
> up before. It appears that I haven't managed to isolate the control 
> switch and the experimental switch adequately?
> 
> I played around with various vlan and trunkport settings on the 
> switches, but it seems that I either block ALL traffic from boss to the 
> experimental switch or the switchmac script reports that it finds the 
> control interface of pc20 on the trunk port (port 48) of the 
> experimental switch. (That is, all the configurations in which I can 
> ping from boss to the experimental switch seem to wind up confusing 
> switchmac.)
> 
> Note that we have not divided up our control switch into vlans as 
> discussed in the network design portion of the install instructions. The 
> instructions make this sound like an option -- perhaps it's not? Or have 
> I just not hit on the right vlan and trunkport settings on the 
> experimental switch yet?
> 
> As for power controllers -- we aren't using any at present.
> 
> On 5/28/2010 6:17 PM, Mike Hibler wrote:
> >Did the control net information for that machine wind up in your DB?
> >You can do "mysql tbdb" and then:
> >     select * from interfaces where node_id='pc20';
> >and see if there is an entry for the control network.
> >
> >What do you have for power controllers?
> >
> >On Fri, May 28, 2010 at 05:50:56PM -0500, Barry Trent wrote:
> >>I'm working on re-building our small emulab with new hardware here at
> >>Architecture Technology in Minnesota and I'm having some trouble.
> >>
> >>The first big problem I think I have solved: We are using Cisco 2948
> >>switches for the experimental network. I figured the type field in the
> >>node_types and nodes tables should be 'cisco2948'. Wrong. The 2948 is
> >>actually in the 4000 class of Catalyst switches! The difference appears
> >>to be that it uses "community string indexing" for some of its MIBs.
> >>(Described here:
> >>ftp://ftp-sj.cisco.com/pub/mibs/supportlists/wsc4000/wsc4000-communityIndexing.html).
> >>
> >>
> >>So -- note for posterity: For the Cisco 2948 switch, set its type value
> >>to 'cisco4000' in the type fields of the node_types and nodes tables.
> >>
> >>Now the new problem I'm up against:
> >>
> >>We PXE boot our testbed machines and they load the freebsd.newnode and
> >>appear in the "New Testbed Nodes" page of the web interface. We "Search
> >>switch ports for selected nodes". Our 5 experimental interfaces are
> >>properly discovered but the control interface isn't. I figure this
> >>shouldn't be a big problem(?) -- we enter this manually.
> >>
> >>When we try to actually "Create" the node the operation appears to 
> >>succeed:
> >>
> >>-----
> >>/usr/testbed/www
> >>pc20 succesfully added!
> >>Re-generating dhcpd.conf
> >>Restarting dhcpd: /usr/local/bin/sudo -S /usr/local/etc/rc.d/2.dhcpd.sh stop
> >>Restarting dhcpd: /usr/local/bin/sudo -S /usr/local/etc/rc.d/2.dhcpd.sh start
> >>   dhcpd wrapperSetting up nameserver
> >>Running exports_setup
> >>Rebooting nodes...
> >>Rebooting 192.168.10.109
> >>
> >>
> >>Finished - when you are satisifed that the nodes are working
> >>correctly, use nfree on boss to free them from the emulab-ops/hwdown
> >>experiment.
> >>-----
> >>
> >>BUT, the node never reboots. It just sits there at the login prompt of
> >>the freebsd.newnode boot image. It never "reports in" to the hwdown
> >>experiment (although we can see that there is now a machine in the
> >>hwdown experiment, the idle time just keeps going up). No entry for the
> >>machine actually gets placed into the re-generated dhcpd.conf file
> >>either -- I presume that one should. If we reboot manually we go right
> >>back to the newnode image.
> >>
> >>Any ideas/suggestions on how to troubleshoot this problem?
> >>_______________________________________________
> >>Testbed-admins mailing list
> >>Testbed-admins@flux.utah.edu
> >>http://www.flux.utah.edu/mailman/listinfo/testbed-admins
> >