Brian Vagnoni on 15 Oct 2007 04:42:43 -0000



Re: [PLUG] network/server troubleshoot


What I know of your situation:

First, Cavtel, in my humble experience, doesn't seem to care who does what on their network. I actually left them after they continued to allow other users to flood their subnets with very intense broadcast traffic; they admitted a user was doing this and then did absolutely nothing to stop it. It was so bad it actually pulled down my bandwidth, and I live literally right next to the CO, with a DSL loop of under 1,000 feet.

I don't think that is your problem, though. I have had NICs crap out slowly over time on me. Since a lot of general-purpose NICs cost under $20.00, wouldn't it be easier just to replace it and see? You could easily pick up an Intel 10/100 PCI NIC for that, and they are very well supported too. In fact, if you wait until Monday, I would gladly give you whatever I have lying around my apartment at the meeting: your choice of PCI or ISA from Adaptec, Realtek, 3Com, SMC, Intel, or others.
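Before swapping anything, it may be worth seeing whether the card is counting errors. What follows is only a rough sketch, assuming your kernel exposes the usual per-interface counters in /proc/net/dev; the function name nic_errors is made up for illustration and is not part of any standard tool.

```shell
#!/bin/bash
# nic_errors IFACE [FILE]
# Hypothetical helper: print the error-related counters for IFACE from a
# /proc/net/dev-style file (defaults to /proc/net/dev itself).
nic_errors() {
    local iface="$1" file="${2:-/proc/net/dev}"
    # Turn the colon after the interface name into a space so the columns
    # split the same way even when no space follows the colon.
    sed 's/:/ /' "$file" | awk -v ifc="$iface" '
        $1 == ifc {
            printf "rx_errs=%s rx_drop=%s rx_frame=%s tx_errs=%s tx_colls=%s tx_carrier=%s\n",
                   $4, $5, $7, $12, $15, $16
        }'
}
```

Running nic_errors eth0 once while things are working and again right after an outage would show whether any of the counters climbed in between. Counters that keep rising would point toward the card or the cable; flat counters would not rule the card out.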

I have a lot of crap lying around my apartment. I'm a pack rat. It gets so bad that I have to do what I call a purge every so often, just because I can't stand it anymore. I filled 2 dumpsters last time; unfortunately, this was before I started coming to Philly Linux, or else I would have told people to come over and had a purge party.

Brian Vagnoni




From: Eric [mailto:eric@lucii.org]
To: Philadelphia Linux User's Group Discussion List [mailto:plug@lists.phillylinux.org]
Sent: Sun, 14 Oct 2007 15:25:38 -0400
Subject: [PLUG] network/server troubleshoot

I've been having an intermittent problem with my firewall server and/or Internet
connection. Unfortunately, I don't have the time to spare to "tinker" with it
and I'm not a network expert either. I'm hoping someone here has some insight
because my current favorite solution involves blasting caps and some mixtures
better left unmentioned :-) [that's a joke to express my frustration BTW]

Background: Firewall is a SME server/CentOS based system with 2 nics. eth0 is
the Internet and eth1 is the LAN.

The system is running djbdns tools (dnscache and tinydns) but they appear
blameless AFAIK. I did set it up to use opendns.com rather than my ISP
(Cavalier DSL) but this changed nothing - the problem persisted.

Frequently the Internet connection just ceases to work properly. It may fix
itself after some indeterminate time. Here is what I observe:

( for all of the following I am logged in as root on the firewall )

1. When it does not work (no traffic appears to go in or out) and I type
ping www.google.com I get the message: ping: unknown host www.google.com

2. Fetchmail complains like this:

fetchmail: awakened at Sun Oct 14 09:32:39 2007
fetchmail: Query status=2 (SOCKET)
fetchmail: timeout after 300 seconds waiting to connect
to server pop.gmail.com.
fetchmail: socket error while fetching from pop.gmail.com

3. I can "fix" this situation by entering the following commands (which I have
combined into a script called "toggle"):

#!/bin/bash
/sbin/ifdown eth0
sleep 3
/sbin/ifup eth0

4. To log the problem and temporarily "deal" with it I created a script
called doody and put it in the root cron to run every minute.
(You can guess the reason for the name)

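#!/bin/bash
# Probe once a minute (from cron). On success print a dot; on failure
# log a timestamp and bounce eth0 with the toggle script.
if /bin/ping -W 10 -c 1 www.google.com >/dev/null
then
    echo -n '.'
else
    echo ''
    echo -n 'trouble: '
    date
    /root/bin/toggle
fi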

Okay, it's stupid but it works temporarily and the outages don't last
more than a minute this way :-P

DESPERATION, not necessity, is the mother of invention.

5. There are no relevant messages in /var/log/messages when it fails.

6. When I "toggle" the eth0 interface I sometimes see this in
/var/log/messages:

Oct 14 13:41:18 polaris kernel: eth0: Setting full-duplex
based on MII#1 link partner capability of 01e1.

Less frequently, the above line is preceded by:

Oct 14 15:03:12 polaris kernel:
0000:01:01.0: tulip_stop_rxtx() failed

A Google search for "tulip_stop_rxtx" and "failed" yields a bunch of
useless comments from the kernel list. Bad news IMHO but I don't
know what to do about it other than swap out the tulip-based nics.
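One way to narrow down item 1: "unknown host" only says that name resolution failed, and that could happen either because DNS is stuck or because nothing is getting through eth0 at all. Below is a minimal sketch that pings a numeric address first and a host name second, so the two cases can be told apart. The function name classify_outage and the PING override are made up for illustration; the address and name you pass in are your choice.

```shell
#!/bin/bash
# Hypothetical helper: tell "link down" apart from "DNS down".
# PING can be overridden (for example, for testing); it defaults to the
# same /bin/ping used by the doody script.
PING="${PING:-/bin/ping}"

# classify_outage IP NAME
# Prints "link" if even the numeric address is unreachable, "dns" if the
# address answers but the name does not, and "ok" otherwise.
classify_outage() {
    local ip="$1" name="$2"
    if ! "$PING" -W 10 -c 1 "$ip" >/dev/null 2>&1; then
        echo "link"
    elif ! "$PING" -W 10 -c 1 "$name" >/dev/null 2>&1; then
        echo "dns"
    else
        echo "ok"
    fi
}
```

Called from the cron script with an address that normally answers pings (the ISP's gateway, say) plus www.google.com, the result could be written next to each 'trouble:' line. A run of "dns" results would put dnscache back under suspicion, while "link" would point at eth0, the modem, or the ISP.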

Here, for example, are a few hours of doody.log, the output of the
doody script (every period represents a minute without a problem).
You can see the frequency of the interruptions:

trouble: Sun Oct 14 09:59:11 EDT 2007
.......................................................
trouble: Sun Oct 14 10:55:11 EDT 2007
...................
trouble: Sun Oct 14 11:15:11 EDT 2007
..............
trouble: Sun Oct 14 11:30:11 EDT 2007
.........
trouble: Sun Oct 14 11:40:11 EDT 2007
............
trouble: Sun Oct 14 11:53:11 EDT 2007
.........................
trouble: Sun Oct 14 12:19:11 EDT 2007
.......................................................
trouble: Sun Oct 14 13:15:11 EDT 2007
.........................
trouble: Sun Oct 14 13:41:11 EDT 2007
...
trouble: Sun Oct 14 13:45:11 EDT 2007
....................
trouble: Sun Oct 14 14:06:11 EDT 2007
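
Going by the timestamps above, the gaps between outages range from 4 to 56 minutes. To see whether a longer log shows any pattern, the runs of dots can be counted mechanically. This is a small sketch, assuming the log keeps exactly the layout shown (a 'trouble:' line followed by a line of dots); outage_gaps and gap_summary are made-up names.

```shell
#!/bin/bash
# outage_gaps FILE
# For each line consisting only of dots, print its length, i.e. the
# number of one-minute checks that passed before the next failure.
outage_gaps() {
    awk '/^\.+$/ { print length($0) }' "$1"
}

# gap_summary FILE
# Print how many gaps were seen plus the shortest, longest and mean gap.
# Prints nothing if the file contains no dot lines.
gap_summary() {
    outage_gaps "$1" | awk '
        NR == 1 { min = $1; max = $1 }
        { sum += $1; if ($1 < min) min = $1; if ($1 > max) max = $1 }
        END { if (NR) printf "gaps=%d min=%d max=%d mean=%.1f\n", NR, min, max, sum / NR }'
}
```

Something like gap_summary /root/doody.log (the path is a guess) would give a quick feel for whether the failures cluster around a fixed interval, which might hint at a lease or timeout expiring, or are spread out at random.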


My biggest problem is that I don't know how or where to get more information for
troubleshooting this. It's almost worth the trouble to just replace all the
nics and reconfigure the system. If I knew that would fix it I would do that
ASAP.

Advice appreciated!

Eric
--
# Eric Lucas
#
# "Oh, I have slipped the surly bond of earth
# And danced the skies on laughter-silvered wings...
# -- John Gillespie Magee Jr
___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug