
Re: [Testbed-admins] Can't bring up the first node



Yes, I put the kernel and the acpi.ko module in /tftpboot..... and reran prepare. It pretty much blew up on loading.

I tried the 6.4 kernel with the change to loader.conf.orig. In verbose mode I did not see any messages like:

  SMAP type=01 base=0000000000000000 len=000000000008d000

It went through the initial loading, then printed a bunch of "MADT found ....CPU id ...." lines, then the CPU info, and then:

Real memory = 655360 (0 MB)
Physical memory chunks (s):
0x0000000000001000 - 0x000000000005bfff, 372736 bytes (91 pages)
Avail memory = 114699 (0 MB)
bios32 : Found BIOS32 Service Directory header at 0xc00ffe80
bios32: Entry = 0xffe90 (c00ffe90) rev = 0 Len = 1
Pcibios: PCI BIOS entry at 0xf0000+0xb02e
pnpbios: Found PnP BIOS data at 0xc00fe2d0
pnpbios: Entry = f0000:e2f4 Rev = 1.0
Other Bios signatures found:
APIC: CPU 0 has ACPI ID 1
Panic : hashinit: bad elements
KDB: enter : panic
[thread pid 0 tid 0 ]
Stopped at    kdb_enter+0x2b: nop
Db>
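
As a sanity check on the numbers in that output, here is a minimal sketch of the arithmetic (not part of the boot log; the 4096-byte page size is an assumption for an i386 kernel). It shows that 655360 bytes is exactly 640 KB, which is why it prints as 0 MB, and that the single reported chunk agrees with its own byte and page counts:

```python
# Assumed i386 page size; not stated in the boot output.
PAGE_SIZE = 4096

# Values copied from the console output above.
real_memory = 655360
chunk_start = 0x0000000000001000
chunk_end = 0x000000000005BFFF

# Integer division mirrors the truncated "(0 MB)" shown by the kernel.
real_kb = real_memory // 1024
real_mb = real_memory // (1024 * 1024)

# The chunk range is inclusive, hence the +1.
chunk_bytes = chunk_end - chunk_start + 1
chunk_pages = chunk_bytes // PAGE_SIZE

print(real_kb, real_mb, chunk_bytes, chunk_pages)
```

So the kernel appears to be seeing only the conventional memory below 1 MB and nothing of the extended memory, which fits with the SMAP lines never showing up.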


On 8/13/09 3:48 PM, "Mike Hibler" <mike@flux.utah.edu> wrote:

I'll take that as a step in the wrong direction. :-)

So you put the kernel and the acpi.ko module out in the /tftpboot/blahblah
directory and reran "prepare"?  Did it blow up first thing?

We'll get together a Linux-based boot environment that you can install...

But in the meantime, try one (really two) last thing for me.  Go back to
the 6.4 kernel.  In the loader.conf.orig file put:

  hw.hasbrokenint12=1

at the end. (This sets some hack for newer machines that cannot use some
older BIOS call to get memory info.)  Rerun the prepare script and reboot
the node.  At the same time, we are going to turn on a verbose boot.
When the node reaches the point:

  Type a key for interactive mode (quick, quick!)
  Attempting boot of: 155.98.32.70:/tftpboot/freebsd7-sio-acpi
  Loading /boot/defaults/loader.conf

as soon as it starts that "Loading" phase, type a space.  It will finish
loading the modules and then should prompt you with:

  Type '?' for a list of commands, 'help' for more detailed help.
  OK

type "boot -v" and hit return.  The first thing you should see are a bunch
of messages like:

  SMAP type=01 base=0000000000000000 len=000000000008d000
  SMAP type=02 base=00000000000f0000 len=0000000000010000
  SMAP type=01 base=0000000000100000 len=000000007f4ffc00
  ...

and then later:

  real memory  = 2136993792 (2037 MB)
  Physical memory chunk(s):
  0x0000000000001000 - 0x000000000008bfff, 569344 bytes (139 pages)
  0x0000000000100000 - 0x00000000003fffff, 3145728 bytes (768 pages)
  0x0000000002c25000 - 0x000000007d1d6fff, 2052792320 bytes (501170 pages)
  avail memory = 2052218880 (1957 MB)
  ...

But I suspect it is not reading the BIOS memory map correctly, so you may
not see any of this.
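
If the SMAP lines do appear, a quick way to check them is to add up the usable regions. The following is a minimal sketch, assuming the verbose output has been captured as text and that the fields are hexadecimal; the names `parse_smap` and `usable_bytes` are hypothetical helpers, not part of any Emulab or FreeBSD tool. In the BIOS memory map, type 1 marks usable RAM and type 2 marks reserved ranges.

```python
import re

# Matches lines such as:
#   SMAP type=01 base=0000000000000000 len=000000000008d000
SMAP_RE = re.compile(
    r"SMAP type=([0-9a-fA-F]+) base=([0-9a-fA-F]+) len=([0-9a-fA-F]+)"
)


def parse_smap(text):
    """Return a list of (type, base, length) tuples parsed from boot text."""
    entries = []
    for match in SMAP_RE.finditer(text):
        smap_type, base, length = (int(group, 16) for group in match.groups())
        entries.append((smap_type, base, length))
    return entries


def usable_bytes(entries):
    """Sum the lengths of type 1 (usable RAM) entries only."""
    return sum(length for smap_type, _base, length in entries if smap_type == 1)


# The three example lines quoted above; a real boot would list more entries.
sample = """
  SMAP type=01 base=0000000000000000 len=000000000008d000
  SMAP type=02 base=00000000000f0000 len=0000000000010000
  SMAP type=01 base=0000000000100000 len=000000007f4ffc00
"""

entries = parse_smap(sample)
total = usable_bytes(entries)
print(len(entries), total, total // (1024 * 1024))
```

For the three example lines this gives a total in the neighborhood of 2 GB. A total of only a few hundred KB, or no SMAP lines at all, would point to the memory map not being read.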

On Thu, Aug 13, 2009 at 02:34:45PM -0400, Espinola, Derek wrote:
> Using the 7.2 kernel gives a different result.
>
> Fatal trap 12: page fault while in kernel mode
> Cpuid = 0; apic id = 00
> Fault virtual address = 0xc354efd4
> Fault code                   = supervisor write, page not present
> Instruction pointer    = 0x20 :0xc0aee1a3
> Stack pointer              = 0x28 :0xc3020d48
> Frame pointer            = 0x28 :0xc3020d5c
> Code segment            = base 0x0, limit 0xfffff, type 0x1b
>                                      = DPL 0, pres 1, def32 1, gran 1
> Processor eflags        = interrupt enabled, resume, IOPL = 0
> Current process         = 0 ()
> Trap number              = 12
> Panic : page fault
> Cpuid = 0
>
>
> On 8/13/09 1:22 PM, "Mike Hibler" <mike@flux.utah.edu> wrote:
>
> I did put together a 7.2 kernel and made sure it worked with the MFSes.
> Give that a try:
>
>   http://www.emulab.net/downloads/tftpboot-kernels-7.2.tar.gz
>
> On Thu, Aug 13, 2009 at 12:10:33PM -0400, Espinola, Derek wrote:
> > Mike,
> >
> > I did try 8GB, and even went down to 4GB of memory, with the same end result. I also tried the 6.4 kernel with the correct acpi.ko using 16GB/8GB, and it now crashes at this point:
> >
> > Real memory = 655360 ( 0MB )
> > Avail memory = 1154688 ( 0MB )
> > ACPI APIC Table: <DELL PE_SC3 >
> > Panic : hashinit : bad elements
> > Uptime: 1s
> >
> > -Derek
> >
> > On 8/12/09 7:20 PM, "Mike Hibler" <mike@flux.utah.edu> wrote:
> >
> > On Mon, Aug 10, 2009 at 06:01:31PM -0600, Mike Hibler wrote:
> > > So I was unable to reproduce this.  But our pe1950 only has 8GB of RAM in it.
> > > If you are feeling daring, you could try to remove half the memory from your
> > > machine, and we can see if the problem is related.  I doubt it though.
> > >
> > > Tomorrow, I will see if I can temporarily snag another 8GB out of the 2950
> > > we have.
> > >
> >
> > All the memory slots are full, so I could not add more memory.
> > If you cannot try with 8GB, then we will move on.  We have to fix the
> > problem one way or the other since I doubt you would want to remove 8GB
> > from all your machines just so that our old kernel would boot!
> >
> > When you tried the new 6.4 kernel, you were using the older ACPI module.
> > Try extracting both the kernel and acpi.ko from here:
> >
> > /usr/testbed/www/downloads/tftpboot-kernels-6.4.tar.gz
> >
> > into your /tftpboot/freebsd.newnode/boot directory.  Make sure the ACPI
> > module gets loaded when the kernel does.  In the meantime, I am going to
> > see if a 7.2 kernel will boot with the current MFS.
> >
> > Ultimately, we will be moving to a Linux-based MFS environment that we are
> > testing now, so you might get to be the first external test case!
> >
>