
Re: [Testbed-admins] problem with bge driver?



Mike Hibler wrote:
What version of FreeBSD is your FBSD-STD?  Is it 6.x?  I assume so,
since FreeBSD 4.10 would probably just laugh in your face if asked to
boot on anything that wasn't at least 5 years old :-)

6.2. I don't even want to think about what 4.7 would do on these machines. I'm pretty sure it would walk over from the machine room and kick me in the teeth.


When you say "bounce the link" what do you mean?  Obviously, you are taking
it down and then bringing it back up.  But when you bring it up, are you
explicitly setting the speed/duplex or letting it auto-negotiate?

On the Cisco, using the "shutdown" and "no shutdown" commands on the interface. In the "no carrier" case mentioned below, I forced the switchport to 1000 using IOS commands.
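
For the record, the sequence on the 6509 was roughly the following (the interface name here is just an example, and the forced speed/duplex lines were only used in the "no carrier" test described below):

	switch# configure terminal
	switch(config)# interface GigabitEthernet2/1
	switch(config-if)# speed 1000
	switch(config-if)# duplex full
	switch(config-if)# shutdown
	switch(config-if)# no shutdown
	switch(config-if)# end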


I would certainly believe there are issues with auto-negotiation.  Come
to think of it, I would certainly believe that there would be bugs in
the Broadcom driver, period!  Our R710s won't even work with FBSD6 because
it doesn't recognize the Broadcom chipset.

That's a handy data point to have, just in case someone has some budget on a research project and wants r710s.


But I digress.

helpful, even in digression


We do always auto-negotiate the control net interface.  So it is quite
possible that Gb isn't working on your broadcom interface and it is
falling back to 100Mb there.  You should be able to verify that just by
doing "ifconfig" on the control interface.

Yes. That's what I thought was so strange about this. The control interface (bge) was running 1000/full. The experimental interface (bge) was running 100/full.
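
Concretely, that's just the "media" line from ifconfig; the output below is paraphrased from memory, and the unit numbers are examples:

	# ifconfig bge0 | grep media
	        media: Ethernet autoselect (1000baseT <full-duplex>)
	# ifconfig bge1 | grep media
	        media: Ethernet autoselect (100baseTX <full-duplex>)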



If it is the case that your current BSD cannot do Gb on these interfaces,
then you will need to upgrade to a newer BSD (it sounds so simple when I
say it like that).

My inner child just started crying; at least I hope it was just my inner child.


I have FBSD 7.2 kernels you can drop down into your
MFSes:

	http://www.emulab.net/downloads/tftpboot-kernels-7.2.tar.gz

(this is the stopgap measure until we roll out the Linux-based MFSes).

Thanks.  I wgot the tarball.
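
My plan is just to unpack it over the existing MFS kernels on boss, something like the following (the -C /tftpboot destination is a guess based on the tarball name, so I'll list the contents first to see where it actually wants to land):

	wget http://www.emulab.net/downloads/tftpboot-kernels-7.2.tar.gz
	tar -tzf tftpboot-kernels-7.2.tar.gz          # check which paths it will write
	sudo tar -xzf tftpboot-kernels-7.2.tar.gz -C /tftpboot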



We also have 32- and 64-bit FBSD 7.2 images, though not yet GENERIC
ones.  I can prepare one of those.

On Wed, Sep 02, 2009 at 10:19:23AM -0500, David A. Kotz wrote:
I've recently added some brand-new Dell PowerEdge r200 machines with NetFPGA cards (to be dealt with later) to our testbed. Users reported that their experiments failed to swap in when they set the LAN speed to 1000Mbps, so I did some testing:

Hardware:
control switch: Cisco 3750 stack
experimental switch: Cisco 6509 w/ 48-port Gb blades
pcr200: Dell PowerEdge r200, onboard Broadcom Gb NICs (bge driver), NetFPGA (no drivers yet)
pc3001: Dell PowerEdge 2850, Intel NICs (em driver)


r200 <---> 1000Mbps LAN <---> r200:  fails
r200 <---> 100Mbps LAN <---> r200:  works
r200 <--- 1000Mbps ---> r200 (no LAN defined, link set to 1000):  fails

pc3001 <---> 1000Mbps LAN <---> pc3001:  works


Looking at the database definitions for the NICs in both pc3001 and pcr200, I see no reason for the 100Mbps limitation. Both have max speed set to 1000. These are the first testbed machines we've used with the bge driver.

I allocated an r200 with default settings and got FC8 with a 100/full experimental link. I forced the switchport to 1000, bounced the interface, and ifconfig in FC8 showed that I was now at 1000/full.
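
For anyone reproducing this on the Linux side: ethtool reports the negotiated speed/duplex directly. The device name here is just an example and the output is paraphrased:

	# ethtool eth1 | egrep 'Speed|Duplex'
	        Speed: 1000Mb/s
	        Duplex: Full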

I allocated an r200 specifying FBSD-STD and got a 100/full experimental link. I forced the switchport to 1000, bounced the link, and ifconfig from BSD showed "no carrier". This would seem to indicate that the bge driver under BSD doesn't work at 1000, except that bge0, the control interface, is connected to the 3750 at 1000/full.
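
A follow-up test that might separate a negotiation problem from a driver problem would be to pin the media on the BSD side to match the forced switchport, something like this (interface name is an example):

	ifconfig bge1 media 1000baseT mediaopt full-duplex
	ifconfig bge1                          # does the media line still show "no carrier"?
	ifconfig bge1 media autoselect         # back to autonegotiation

(Caveat: copper gigabit normally needs autonegotiation for master/slave resolution, so hard-forcing 1000 on either end can itself produce "no carrier"; I'd treat this as a data point rather than a fix.)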

The hunch I was trying to test was that the control interface might be running at 100, and that a hardware or software problem was preventing one port of the NIC (recall that the r200 has a single dual-port onboard NIC) from running at 1000 while the other ran at 100. But that's actually the *working* configuration in BSD: control at 1000, experimental at 100.

What seems to be preventing the use of gigabit links is that Emulab is BSD-centric: when Emulab creates the node it boots BSD, and BSD is autonegotiating the experimental link to a maximum of 100Mbps.

I'd greatly appreciate any insight anyone can provide on this issue.

- dave

_______________________________________________
Testbed-admins mailing list
Testbed-admins@flux.utah.edu
http://www.flux.utah.edu/mailman/listinfo/testbed-admins