Windows 8 and 8.1 Forums

BSOD 0x0000009c on servers

  1. #1


    Posts : 3
    Windows Server 2012 R2



    Hello

    For a project we have 50 servers, all equipped with (generally) the same hardware. The issue we have is very serious and happens on all machines. Despite a lot of effort and contacting the manufacturers and the software developers, everyone points to each other and nobody will even give me a clue about what is going on.

    First let me describe the setup. This is 'server-grade' hardware. For my first experience with it, server-grade hardware has been the largest disappointment of my life.

    - SuperMicro X10SDV-8C+-LN2F
    - Intel Xeon D-1540 (embedded on the motherboard)
    - Custom designed 1U case or SuperMicro original case
    - 480 watt server PSU or 200 watt SuperMicro original PSU
    - Samsung Evo 850 500 GB SSD
    - 32 GB DDR4-2133 ECC or NON-ECC (but not mixed in the same server)
    - Asus GT730 4GB DDR3 GPU
    - GPU is mounted with a PCIe riser card (not ribbon), nameless from China or SuperMicro original

    Running on the system
    - Windows Server 2012 R2 Enterprise
    - VMware Workstation 12
    - VMs run GPU-intensive tasks
    - This system is stock; there is no over- or underclocking at all

    Symptoms
    - Random BSOD 0x9C (MACHINE_CHECK_EXCEPTION)
    - Random as in: sometimes the system runs for a week with no problems, sometimes it crashes after just 10 minutes, but mostly it runs for a few hours.


    Already tried/checked
    - BIOS updated to the latest version (I think this improved how long the system stays stable, but that could have been random).
    - Windows updated to the latest version
    - VMware updated to the latest version
    - Swapped all components and tried every different option, even tried a desktop ATX PSU and an M.2 SSD.
    - Installed all systems from scratch with Ubuntu. I'm not familiar with Linux and have never seen a Linux BSOD, and I still haven't, since the servers are headless and I tried this in the DC. RESULT: the system would hang, and after a reboot Linux reported an Xorg crash (GPU related).
    - Changed GPU setting in BIOS to 'Above 4G', the rest of the BIOS is factory default.


    Also informative:
    - Systems are located in a datacenter. Temperature, air, power and network are optimal.
    - Temperatures are well below the factory maximum
    - We have the exact same *software* setup running on desktop computers (with desktop hardware). Those systems run fine, with about 1 out of 100 PCs crashing per month.
    - I have contacted VMware; they say this is a hardware issue.
    - I have contacted SuperMicro; they suggested nothing beyond things I had already tried, and added that this could still be a software issue.

    We are desperate here. Luckily the application we run is more or less redundant. If a server and the VMs on it drop, it's not such an issue, since other servers take over the load within 5 minutes, but at this rate I have to be online all day to restart servers.

    I have a lot of hardware knowledge, but this goes beyond it. I have searched on this all day for over a month, trying all sorts of different things.
    The fact that these motherboards are used by hosting providers on a large scale makes me suspect that the board itself is OK. This is definitely not a single-unit hardware fault that an RMA would fix, as all 50 boards show the same symptoms. The only thing different in our setup is the GPU. This, combined with the Linux experiment, makes me suspect that the problem is on the PCIe side. The GPU itself is stable on desktop motherboards. Despite its large memory capacity, this is a small GPU that does not draw much power. I would suspect the Chinese riser cards, but we also use SuperMicro certified risers and they show no improvement at all.

    I am very desperate to find a solution here. That will start with determining the exact cause.
    We are willing to pay a nice bounty to an expert who can analyse some dumps and give us more details (or even better yet, a solution).

    Kind regards,

    Simon


  2. #2


    Posts : 660
    windows 8.1


    The latest dumpfile blames intelppm.sys.

    *******************************************************************************
    *                                                                             *
    *                        Bugcheck Analysis                                    *
    *                                                                             *
    *******************************************************************************
    Use !analyze -v to get detailed debugging information.
    BugCheck 9C, {0, ffffd00020b610e0, 0, 0}
    Probably caused by : intelppm.sys ( intelppm!MWaitIdle+18 )

    As far as I know, intelppm.sys is the Intel processor (power management) driver, and I believe it is related to the chipset drivers.
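    If you want to compare the summary line across many dumps from the 50 servers, it can be split into the bug check code and its four arguments. This is only a rough sketch: the function name `parse_bugcheck` and the one-entry name table are made up for illustration, and it does not interpret what each argument means.

```python
import re

# Name lookup for the one bug check code seen in this thread (assumption:
# any other code is simply reported as "unknown").
BUGCHECK_NAMES = {0x9C: "MACHINE_CHECK_EXCEPTION"}


def parse_bugcheck(line):
    """Split a debugger 'BugCheck' summary line into (code, [arg1..arg4])."""
    m = re.search(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}", line)
    if m is None:
        raise ValueError("not a BugCheck summary line")
    code = int(m.group(1), 16)
    # The arguments are printed as bare hex numbers separated by commas.
    args = [int(a.strip(), 16) for a in m.group(2).split(",")]
    return code, args


line = "BugCheck 9C, {0, ffffd00020b610e0, 0, 0}"
code, args = parse_bugcheck(line)
print(f"0x{code:08X} {BUGCHECK_NAMES.get(code, 'unknown')}")
for i, a in enumerate(args, 1):
    print(f"  arg{i} = 0x{a:016X}")
```

    Running the same function over the summary line of every minidump would show quickly whether all servers report the same code and argument pattern.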


    Also, in the msinfo32 report I found this under Problem Devices:
    PCI Simple Communications Controller PCI\VEN_8086&DEV_8C3B&SUBSYS_086D15D9&REV_04\3&11583659&0&B1 The drivers for this device are not installed.
    This driver is installed by the Intel Management Engine package.
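    For anyone matching this device against a driver package, the hardware ID can be broken into its fields. A minimal sketch, assuming the usual `VEN_/DEV_/SUBSYS_/REV_` layout for PCI hardware IDs; the helper name `parse_pci_hwid` and the two-entry vendor table are filled in by hand for this thread and are not a complete PCI ID database.

```python
import re

# Hand-filled lookup for the two vendor IDs that appear in this thread.
VENDORS = {0x8086: "Intel", 0x15D9: "Super Micro"}

HWID_RE = re.compile(
    r"VEN_([0-9A-F]{4})&DEV_([0-9A-F]{4})"
    r"&SUBSYS_([0-9A-F]{4})([0-9A-F]{4})&REV_([0-9A-F]{2})",
    re.IGNORECASE,
)


def parse_pci_hwid(hwid):
    """Return the numeric fields of a PCI hardware ID string as a dict."""
    m = HWID_RE.search(hwid)
    if m is None:
        raise ValueError("not a PCI hardware ID")
    vendor, device, subsys_id, subsys_vendor, rev = (
        int(g, 16) for g in m.groups()
    )
    return {
        "vendor": vendor,
        "device": device,
        "subsys_id": subsys_id,          # first four hex digits after SUBSYS_
        "subsys_vendor": subsys_vendor,  # last four hex digits after SUBSYS_
        "rev": rev,
    }


hwid = r"PCI\VEN_8086&DEV_8C3B&SUBSYS_086D15D9&REV_04\3&11583659&0&B1"
info = parse_pci_hwid(hwid)
print(f"vendor        0x{info['vendor']:04X} {VENDORS.get(info['vendor'], '?')}")
print(f"device        0x{info['device']:04X}")
print(f"subsys vendor 0x{info['subsys_vendor']:04X} "
      f"{VENDORS.get(info['subsys_vendor'], '?')}")
print(f"subsys id     0x{info['subsys_id']:04X}")
print(f"revision      0x{info['rev']:02X}")
```

    The vendor and device fields are the part a driver's INF file has to list for the package to install on that device.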

    When I searched the support page for your Supermicro motherboard (Super Micro Computer, Inc. | Support | Resources), they had 2 drivers available:

    - Intel C600 Series Chipset INF Utility, version 10.1.2.8
    - Intel Management Engine, version 3.00.05.402

    I do not know if this is related to your specific problem, but maybe you can look into it.

  3. #3


    Posts : 3
    Windows Server 2012 R2


    In reply to lifetec (post #2):

    Hello

    Many thanks for your reply. I have posted this in many places and this is by far the most useful reply I have received.
    I am quite disappointed in SuperMicro: the drivers on the website either don't work on my system or are outdated by almost a year. In the end, to find the correct driver I had to fall back on 'Driver Navigator'.
    Now there is another issue. I managed to do it correctly on one server, but on all servers after that only the '83CA' driver gets updated and not '83CB'. I can find the driver file, but when I try to install it, it says I have the wrong system.
    Is there any chance I can copy the driver files from the first server to the others and install them manually? Unfortunately these PCI devices do not show up in Device Manager.

    Any advice on this?

    Kind regards,

    Simon

  4. #4


    Posts : 3
    Windows Server 2012 R2


    UPDATE: I just had another crash on the first machine, which was fully updated. I will let it run a while longer so I can post some more minidumps.
