Exchange Server Failing to start

Redemption
10-17-2003, 08:44 PM
Really starting to hate Windows.

Our Exchange server decided to freak out this afternoon and after a reboot, the event viewer shows this.

EVENT ID 7001. The Microsoft Exchange MTA Stacks service depends on the Microsoft Exchange System Attendant service, which failed to start because of the following error: System Attendant Hung on start.

Anyone run into this before?
I've found similar 7001s but none with this part of the error.

Data
10-17-2003, 08:51 PM
:huh:

Redemption
10-17-2003, 08:55 PM
:huh:
I was on my way out the DAMN DOOR when this thing decided to start acting up.
The other admins don't seem to have any idea what's wrong (not that I do either at this point).
Things like this are why I HATE Windows.
So we'll probably get to spend three hours on the phone with MS for them to tell us to restore from our backups.

FUBAR|Ascain
10-17-2003, 10:42 PM
Can you manually start the system attendant service? If not, what error code is returned?

Have you run ESEUTIL and ISINTEG against the store to correct any errors and try restarting all the Exchange services?

Redemption
10-18-2003, 01:27 AM
No, it fails to start. I don't remember the code offhand.
The other admins decided we needed to do a series of reboots. After the third it didn't come back up, so they're at the colo. Apparently it's hung at the BIOS boot with a memory error. Looking through the exported event viewer log, there were various memory errors, so I'm wondering if that may have been related. Guess I'll find out in the morning if someone says "tag, you're it".

CandyMan
10-18-2003, 03:41 AM
Also check how big your information store is. Standard Exchange only goes up to 16 GB on the info store and then it refuses to start; you'll be looking at an upgrade to Enterprise Edition at that point.
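
If you'd rather script the size check than eyeball it in Explorer, here is a rough Python sketch. The 16 GB figure is just the limit quoted above, and the file paths in the usage comment are hypothetical; point it at wherever your store files actually live.

```python
import os

# 16 GB limit as quoted in the post above; treat it as an assumption for your version.
STANDARD_LIMIT_BYTES = 16 * 1024**3

def store_size_bytes(paths):
    """Sum the on-disk size of the given store files."""
    return sum(os.path.getsize(p) for p in paths)

def over_limit(paths, limit=STANDARD_LIMIT_BYTES):
    """Return True if the combined size meets or exceeds the limit."""
    return store_size_bytes(paths) >= limit

# Usage (hypothetical paths, adjust to your install):
# over_limit([r"D:\exchsrvr\mdbdata\priv1.edb", r"D:\exchsrvr\mdbdata\priv1.stm"])
```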

Redemption
10-18-2003, 06:47 PM
Enterprise edition.
After a few hours with MS support, found out that it's due to the server being unable to access the global catalog.
Apparently none of our servers can.
So we're now working on what's wrong with that server, but it's better than having done a rebuild of our mail server.

Redemption
10-18-2003, 08:04 PM
ROAR!

So now the issue with the DC at our colo appears to be related to the fact that the server can't sync with our primary DC.
So we're working on servers and our connection to the colo from HQ is getting flakier and flakier.
It's now at the point where we can't access it at all via our private T1.
We check our 2600 at the office and discover that the configuration has been wiped. :eek:
Reload the config, and we're still having problems.
Check the router at the colo physically and discover its eth 0/0 now has the same IP as our HQ router, along with the gateway being entirely wrong.
We've fixed it now and can access our servers. Waiting to see if this clears up the sync issue with the DC and mail server (which probably was demanding to use the DC that was acting funky because it couldn't find any other ones).

Data
10-18-2003, 08:49 PM
So were you hit with the Cisco router exploit... or something?

:huh:

Redemption
10-18-2003, 09:07 PM
Not sure.
We've got our routers back in line. We can telnet into the RPC port on our PDC from our colo DC and the Exchange server.
We can access other RPC functions such as event viewer from the colo to HQ, but we can't sync the global catalog for the colo DC. Consequently the Exchange server, which has some hard-set configurations to use the colo DC, won't start.
We're bringing the Exchange server and colo DC back to HQ, going to re-IP them and bring them up on the local net. If they work, then we've established that something is uber wrong with the private T1 or the routers themselves.
A Cisco router exploit would be much cooler, though I'm doubting it because they're on a private T1, so it would have to come in from one of our T1s at HQ through a firewall, which would put them on our local net. From there they'd have to go through the firewall to access the server LAN where the routers exist.
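
For anyone who wants to script the telnet check instead of doing it by hand, here's a rough Python sketch. It assumes the RPC port in question is TCP 135, the usual RPC endpoint mapper port; the host name in the usage comment is made up.

```python
import socket

RPC_ENDPOINT_MAPPER_PORT = 135  # usual port for the Windows RPC endpoint mapper

def can_connect(host, port=RPC_ENDPOINT_MAPPER_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False

# Usage (hypothetical host name):
# can_connect("pdc.example.test")
```

Keep in mind this only proves the port answers, same as telnet. The endpoint mapper hands callers off to other ports, so a successful connect here doesn't guarantee the sync traffic itself gets through.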

Data
10-18-2003, 10:10 PM
I see.

The exploit makes less sense, given your network topology.

Redemption
10-19-2003, 02:33 AM
So I'm going home now.
The other admins who'd been at it for 24 hours+ went home about 6 hours ago.

We brought the Exchange server and the domain controller it was bound to back to our HQ.
About 80% of Exchange's stuff was working (better than none), however both systems had problems making outgoing connections via RPC to other systems. Both also had errors saying a duplicate name exists :huh:

So we toss the Primary DC and the Secondary DC (one from the colo) on a single cheap MS Hub (they can't blame our hardware now ;)).
PDC can make RPC connections to the SDC, but not vice versa.
Finally dig and dig and dig, and find out that the SDC is using the PDC as its WINS server. Okay, that should be fine, EXCEPT that the PDC had some old static WINS entries for the SDC with its old IP. This was causing the duplicate name error. Did some Marine Corps style seek and destroy on the static entries, purged all NetBIOS, told the SDC not to use the PDC entries, and gave them both a reboot (PDC first). SDC comes up with no errors about duplicate names. It can now do RPC style functions to the PDC. We set the mail server to not use the WINS either, rebooted, and it can now start all of its services and use the manager utility.
Our Outlook clients can now all log into the system without crashing, though I can't send mail to myself. BUT, we're at least at a place where we can work with the mail server appropriately. We're back in at 8am to make sure it's all working appropriately and to take the SDC and mail server to the colo to sync it back up with OWA and the relay system.
Preciate all your responses on this and will let you know the end result.
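
In case it helps anyone hunting the same thing, the check we did by hand boils down to comparing the static entries against what each box is really using now. A rough Python sketch, assuming you've already exported the static entries into (name, IP) pairs somehow; the names and addresses in the usage comment are made up.

```python
def find_stale_static_entries(static_entries, current_ips):
    """Return static (name, ip) entries whose IP differs from the machine's current IP.

    static_entries: list of (netbios_name, ip) tuples exported from the name server.
    current_ips: dict mapping netbios_name -> the IP the machine uses now.
    Names are compared case-insensitively.
    """
    current = {name.upper(): ip for name, ip in current_ips.items()}
    stale = []
    for name, ip in static_entries:
        expected = current.get(name.upper())
        # Only flag names we know about; unknown names are left for a human to review.
        if expected is not None and ip != expected:
            stale.append((name, ip))
    return stale

# Usage (hypothetical data):
# find_stale_static_entries([("SDC01", "10.1.0.20")], {"sdc01": "192.168.5.20"})
```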

Data
10-19-2003, 03:05 AM
I wish I could say I had actually helped. Exchange Server Administration is way over my head.

I'm glad you're finally nailing it down. Some weekend, eh? :ugh:

Panda Star
10-23-2003, 12:38 PM
The mailing-to-yourself part could be because they can't resolve the host name appropriately.

Make sure DNS is handing out names correctly.

I've seen bad static entries in DNS wreak havoc on a mail server, especially if WINS is screwed.
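
A quick way to sanity check that is to resolve each name and compare it with what you expect. Rough Python sketch below; the host names and addresses in the usage comment are made up, and it goes through whatever resolver the machine is configured to use rather than a specific DNS server.

```python
import socket

def check_names(expected, resolve=socket.gethostbyname):
    """Compare resolved addresses with expected ones.

    expected: dict hostname -> IP we believe is correct.
    Returns dict hostname -> (expected_ip, actual_ip_or_None) for every mismatch.
    """
    problems = {}
    for host, want in expected.items():
        try:
            got = resolve(host)
        except OSError:
            # Lookup failures (socket.gaierror is an OSError) are reported as None.
            got = None
        if got != want:
            problems[host] = (want, got)
    return problems

# Usage (hypothetical names):
# check_names({"mail.example.test": "192.168.5.30", "sdc01.example.test": "192.168.5.20"})
```

An empty result means every name came back with the address you expected.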

Redemption
10-27-2003, 11:15 AM
This ended up being a huge ugly mess because one of the admins tightened down our colo firewalls too much.
It *shouldn't* have been an issue because our private T1 is supposed to avoid them, but apparently we had some messed up routing tables in our colo, which made packets coming from the colo run through the firewall.
That broke the syncing of our global catalogs. After we brought our mail server back to our main office and got rid of all kinds of wrong static WINS and DNS entries (I'm lobbying to drop WINS support on our network entirely), the mail server started working again.
We finally traced back through our change logs and found out what set a lot of this off. Undid the firewall setting and the test DC we put up at the colo started working again.
This coming weekend we get to move it all back and hope it doesn't freak out.
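
For the curious, here's a rough Python sketch of how a bad routing table can push traffic through a firewall it was never meant to touch. The prefixes and hop names are made up, not our real tables, and missing specific routes are only one way tables can be wrong. Routers pick the most specific matching route, so if the specific route over the private link is missing, traffic quietly falls through to the default.

```python
import ipaddress

def next_hop(routes, destination):
    """Return the next hop of the most specific route matching destination, or None."""
    dest = ipaddress.ip_address(destination)
    best = None
    for prefix, hop in routes:
        net = ipaddress.ip_network(prefix)
        # Longest prefix wins, as in a normal routing table lookup.
        if dest in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, hop)
    return best[1] if best else None

# Usage (hypothetical tables):
# good = [("0.0.0.0/0", "firewall"), ("10.1.0.0/16", "private-t1")]
# bad = [("0.0.0.0/0", "firewall")]
# next_hop(good, "10.1.2.3") gives "private-t1"; next_hop(bad, "10.1.2.3") gives "firewall"
```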

Data
10-27-2003, 03:34 PM
Glad to see you finally found the culprit. :D

Redemption
10-27-2003, 03:55 PM
Numbskull sysadmin?
isn't that the default answer?

Data
10-27-2003, 05:01 PM
Nah, the default is "User Error".

AKA P.E.B.K.A.C.
Problem
Exists
Between
Keyboard
And
Chair

Darkness
10-28-2003, 08:02 PM
This one time... I like thought I knew stuff about networking, then I saw this thread.