[Leaplist] Lose server after a few hours....

Whit Hansell skipper44 at comcast.net
Mon Nov 3 10:27:16 EST 2008


To both Ed Guldemond and Bryan,
Thanks for your replies.  Good news: I'm back up and running OK.
Sorry that it has taken me so long to get back to you.  There were a
variety of reasons, including additional computer problems, trying to
find work, and a printer that would not print anything, resumes
included, except at a super slow rate.  I've got everything fixed now
and it is all up and running fine.  I believe the server problem was a
bug in the previous upgrade files, because I did my dist-upgrade again
and the server has not stopped running since.  So I think they have
fixed the problem, even though I have no idea what it was.

But after my original message and your replies, I had to teach myself 
how to SSH into the server again in order to grab the log files.  That 
took some time cuz I had to find my notes on the command to get in.  ;-)

I was also trying to get work and the printer caused a real problem for
some reason.  It would not print even test ASCII files at any
reasonable speed.  It was like the buffer was full and it had to wait
until one thing was done before it could do another.  Or I don't know.
It was just taking forever to do anything.

But it could also have been a problem I already had w. the client.  I
did not mention it because I didn't think it had anything to do w. the
server problem at all.  The client did not finish its last
dist-upgrade because the boot partition was full.  It told me I had to
do a "dpkg --configure -a" to fix it, but first I had to figure out how
to increase the size of the partition.  I had never used GParted before
so I had to figure out how to do that too.  Got it done and ran my
"configure" command and it's all OK, but again, it took time to do.

So, I did my new dist-upgrade on the server and it's OK.  Resized the
client's partition w. GParted and configured it as explained above, and
it's working fine.
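
The client-side fix above (make room on the boot partition, then let
dpkg finish configuring) can be sketched roughly as below.  This is
only an illustration: the function names and the 20000 KB threshold are
made up for the example, and the dpkg step has to be run as root.

```shell
#!/bin/sh
# Sketch only: function names and the threshold are illustrative assumptions.

# Print the free space, in KB, on the filesystem holding the given path.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Finish any half-configured packages, but only if /boot has room.
# min_kb is an arbitrary safety margin, not a value dpkg requires.
finish_upgrade() {
    min_kb=${1:-20000}
    if [ "$(free_kb /boot)" -lt "$min_kb" ]; then
        echo "/boot is low on space; enlarge or clean it first" >&2
        return 1
    fi
    dpkg --configure -a
}
```

Calling "finish_upgrade" with no argument uses the default margin.  If
/boot is not a separate partition, df simply reports the filesystem
that contains it.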

Then I got the latest dist-upgrade for the client done and it did just 
fine w. no problems.

Removed the old printer setup and re-installed the printer after having 
done the new dist-upgrade and it's working fine too.

I don't know what the problem was w. the server but the new upgrade 
seemed to fix it.

The printer problem could have been anything from a messup from the 
previous upgrade to a problem w. the last not having been finalized.  
Could have been a messed up configuration file or I don't know.  I just 
did everything I had to do one by one and it's all OK.

BTW I'm running etch on the server and lenny on the client.  It's
not unheard of for a bug to occur even on stable, so I figure it had to
be that since it has now cleared up all by itself.  I don't think it had
anything to do w. the client having its problem, since the server would
work just fine and then not work at all.

Bryan, I've done the dist-upgrade since soon after I started the LUGS
because I had seen it recommended over the straight upgrade.  I also use
aptitude instead of apt, as Debian.org recommends.  When I do my
upgrades, it's "update", then "dist-upgrade", then "autoclean" and
finally "updatedb" to finish it all off.  On "testing" I will get a
buggy upgrade every once in a while but that's what testing is for.
Normally the bug is fixed by the next week.  I think the normal schedule
is to do them weekly on Tuesdays.  That's when I normally do it.
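
Written out as a script, that routine would look something like the
sketch below.  The "run" helper and the DRY_RUN switch are additions
for the example (they are not aptitude features); with DRY_RUN=1 the
commands are only printed, which is handy for checking the order before
running it for real as root.

```shell
#!/bin/sh
# Sketch of the weekly routine described above. Run as root.
# DRY_RUN and run() are conveniences invented for this example.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"      # only show what would be executed
    else
        "$@"
    fi
}

weekly_upgrade() {
    # refresh the package lists
    run aptitude update || return 1
    # upgrade everything, allowing new installs and removals
    run aptitude dist-upgrade || return 1
    # remove cached package files that can no longer be downloaded
    run aptitude autoclean || return 1
    # rebuild the database used by locate
    run updatedb
}
```

Each step stops the routine if it fails, so a broken dist-upgrade is not
followed by a cache cleanup.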

You thought it odd that the kernel image would be upgraded w/o a major
situation.  I can't explain the why to you but this is what shows up in
the aptitude log:

'***
Aptitude 0.4.4: log report
Sun, Oct 26 2008 22:02:32 -0400

IMPORTANT: this log only lists intended actions; actions which fail due to
dpkg problems may not be completed.

Will install 6 packages, and remove 0 packages.
20.5kB of disk space will be freed
===============================================================================
[UPGRADE] dbus 1.0.2-1+etch1 -> 1.0.2-1+etch2
[UPGRADE] dbus-1-utils 1.0.2-1+etch1 -> 1.0.2-1+etch2
[UPGRADE] libdbus-1-3 1.0.2-1+etch1 -> 1.0.2-1+etch2
[UPGRADE] libmyspell3c2 1:3.1-18 -> 1:3.1-18etch1
[UPGRADE] linux-image-2.6.18-6-686 2.6.18.dfsg.1-22etch3 -> 2.6.18.dfsg.1-23
[UPGRADE] tzdata 2007k-1etch1 -> 2008e-1etch3
===============================================================================

Log complete.
'****
You can see the linux image is changed just a little.  From a 
"1-22etch3"   to a "1-23".

Of course this is from the aptitude log on the server.
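
If anyone wants to pull the same lines out of their own log, a small
sketch follows.  It assumes the log is in the usual place,
/var/log/aptitude, and the two function names are just made up for the
example.

```shell
#!/bin/sh
# Sketch: list the upgrade lines from an aptitude log.
# /var/log/aptitude is assumed as the default; pass another path if yours differs.

list_upgrades() {
    grep '^\[UPGRADE\]' "${1:-/var/log/aptitude}"
}

# Show only the kernel image changes among those upgrades.
kernel_upgrades() {
    list_upgrades "$1" | grep ' linux-image-'
}
```

Running "kernel_upgrades" with no argument reads the default log, while
"kernel_upgrades /path/to/copy" works on a copy pulled over from the
server.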

There was a change on both of the machines when I did the previous
upgrades.  But the one that showed a bug was the one on stable/etch, or
so I believe.  Again, my problem on the client w. testing was that my
partition was too small and I needed to enlarge it to finish the upgrade.

Anyway, I wanted to thank you both for your replies and again, I 
apologize for not getting back to you sooner but it has been a 
conglomeration of problems, computer and otherwise/personal/trying to 
find work.

Thanks for being here.   I and many who never post really appreciate it 
and I hope I haven't messed up too bad w. usergroup etiquette.  God bless.
Whit

BTW, I was finally able to SSH into the machine and pull the log files
over (aptitude, daemon.log.0, dmesg), but couldn't find the time to get
into it all, as I explained above.
Thanks again and I really do appreciate your help.

Bryan J. Smith wrote:
> Whit Hansell <skipper44 at comcast.net> wrote:
>   
>> Hey guys,
>> Have run into an interesting problem in the last few days.
>> Am running Debian ETCH as a server on an old box, PII 400,
>> ssh/rsync/nfs as a file server only.   I did a dist-upgrade
>> the other day and it installed a supposed "new" kernel image
>> which after it did so stated that I was already running that
>> kernel so needed to reboot immediately after the upgrade was
>> complete to rebuild the module list. 
>>     
>
> That doesn't make sense at all.  It sounds like something else.
>
> I mean, you did a "dist-upgrade" so I assume you were upgrading
> from an old Debian release with a kernel 2.4 version, correct?
> What was far more likely is that the core components installed
> were designed for a kernel 2.6 version, and wouldn't work with
> the currently executing version, correct?
>
> Or was this box already at kernel 2.6 prior?
>
>   
>> That was no problem as it has happened before and I've done
>> it a few times on that box and on the client too. 
>>     
>
> But you're not giving us a "context" here.
>
> It's one thing to "dist-upgrade" from a Debian release running
> kernel 2.4.  It's another to "dist-upgrade" from a Debian
> release running kernel 2.6.  Now you could upgrade kernels on
> some older Debian releases from 2.4 to 2.6 without a full "core"
> upgrade (with exceptions).
>
> I'm just trying to figure out what you had "before."
>
>   
>> But I had not used the network drive for a while and when
>> I tried to access the drive on the server, my client box 
>> locked up and I went to the server and the monitor would
>> not power up.  It had the yellow light on but would not go
>> full power until I turned it off and then back on and it
>> would show the green light for a moment and then turn to
>> the yellow again.  And I had no keyboard or mouse either or 
>> if I do I can't see it because I don't have a monitor.  But
>> on the keyboard I can't change numlock on or off.  It just
>> ignores my input so figure it's off too.
>>     
>
> What do your /var/log/* files have?  dmesg?
>  
>   
>> The kernel is 2.6.18-6-686 on the PII file server.
>>     
>
> More helpful would be what you came from.  ;)
>
> And have you tried to install any other kernel from Etch?
>
> Also, have you tried "apt-get upgrade" or even "apt-get
> dist-upgrade" again?  If so, what components were "still
> missing," if any?
>
> I've had cases where some dependencies weren't resolved
> until the "major core" changes happened and the system
> was rebooted.  It's rare, but I've seen it with APT-DPKG
> (as well as APT-RPM, SMART for either, etc...).
>
>   
>> I can get it back by rebooting using the power button on
>> the box but no other way.   It comes up just fine and I
>> can use it again for a number of hours until it locks up
>> again.  Everything, monitor, keyboard, mouse 
>> and since I can't even access it thru the cable (not
>> wireless, cabled thru a router/switch) it seems the hard
>> drive too since it locks up the workstation/client box
>> when the server drive is accessed.
>>     
>
> Actually, that's just the stateful nature of NFS (long
> story).  If you mount NFS, and the mount is not "soft"
> or the programs don't "interrupt" operations (if "intr"
> is set), then it can "hang" processes on the workstation
> that are accessing those remote files over NFS.
>
> This "logic" has both its drawbacks and benefits.
>
>   
>> I'd say it was power management but I can't find
>> anywhere it shows that it is set up.  I looked at
>> the control panel in KDE and it shows that 
>> power management is NOT on.
>>     
>
> Again, dmesg and select /var/log/* would be extremely
> helpful.  If they aren't, then we can talk about a remote
> kdump (but that's more involved).
>
>   
>> It is so frustrating.  As I said, it can be rebooted but
>> that is not the way to have to work.  And one of these
>> times it will not be able to bring it up clean again.
>>     
>
> Again, have you tried installing any other kernel update
> available?
>
>   
>> Any help will be greatly appreciated.  I've looked on
>> the net but can't find anyone who has the same problem
>> as this with a desktop.  Can find plenty of power
>> management problems w. laptops but this is not a laptop 
>> and has no battery setup.
>>     
>
> That's because so few run file servers, or at least in
> production environments (don't get me started, I've had
> too many flamewars over "my experience" versus "oh, but
> I've read" ;).
>
>   
>> And it might NOT be a specific power management problem
>> but do suspect that it is.
>>     
>
> Again, dmesg and select /var/log/* are great places to
> start with regards to this.
>
>
>   
