Dedicated Server Hardware Troubleshooting
Diagnose disk, memory, and network hardware on Nexelya dedicated servers using IPMI sensors, logs, and datacenter escalation.
Published March 5, 2025
Recognizing hardware vs software issues
Intermittent crashes, SMART errors, uncorrectable ECC memory events, and NIC link flaps point to hardware. Application failures under load that leave no kernel log entries are usually software, but cross-check them against IPMI sensor thresholds before ruling hardware out.
Nexelya displays service status; use IPMI SEL (System Event Log) for authoritative hardware events.
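As a minimal sketch of triaging SEL entries, the Python below parses the pipe-delimited text that `ipmitool sel elist` prints and keeps the events that look hardware-critical. The sample lines, the keyword list, and the function names are illustrative assumptions, not Nexelya tooling; adjust the keywords to the sensors your chassis reports.

```python
# Hypothetical helper: filter `ipmitool sel elist` output for serious events.
# SAMPLE_SEL is made-up illustrative data, not output from a real server.

CRITICAL_KEYWORDS = ("uncorrectable", "critical", "failure", "fault")

def parse_sel(text):
    """Split pipe-delimited SEL lines into dicts; skip malformed lines."""
    events = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) < 5:
            continue  # blank line or unexpected format
        events.append({
            "id": parts[0],
            "date": parts[1],
            "time": parts[2],
            "sensor": parts[3],
            "event": parts[4],
            "state": parts[5] if len(parts) > 5 else "",
        })
    return events

def critical_events(events, keywords=CRITICAL_KEYWORDS):
    """Keep events whose description contains any of the keywords."""
    return [e for e in events
            if any(k in e["event"].lower() for k in keywords)]

SAMPLE_SEL = """\
   1 | 03/01/2025 | 02:14:07 | Memory #0x53 | Correctable ECC | Asserted
   2 | 03/01/2025 | 02:20:41 | Memory #0x53 | Uncorrectable ECC | Asserted
   3 | 03/02/2025 | 11:05:12 | Fan #0x41 | Lower Critical going low | Asserted
"""

if __name__ == "__main__":
    for event in critical_events(parse_sel(SAMPLE_SEL)):
        print(event["date"], event["time"], event["sensor"], event["event"])
```

In practice you would feed the function the saved output of `ipmitool sel elist` rather than the sample string, and attach the filtered lines with their timestamps to any support ticket.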
Storage diagnostics
Run smartctl -a on each disk and watch Reallocated_Sector_Ct and Current_Pending_Sector; rising raw values suggest the drive is degrading. A degraded RAID array requires immediate drive replacement. A hot spare may start rebuilding automatically, but still open a ticket to schedule the drive swap.
Clicking drives, or hung I/O accompanied by kernel block-layer errors, warrant a filesystem check; take a backup first.
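- Capture the output of storcli show all (for example, storcli /c0 show all for the first controller) or the MegaCLI equivalent for RAID health.
- When hot-swapping, put each replacement drive in the same slot as the drive it replaces.
- Verify rebuild progress before ending the maintenance window.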
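To make the SMART check repeatable across many disks, a short script can pull the two attributes named above out of the attribute table that smartctl prints. This is a sketch under assumptions: the sample table is illustrative, the function names are hypothetical, and flagging any nonzero raw value is a conservative choice made here, not a vendor threshold.

```python
# Hypothetical helper: extract watched SMART attributes from the text of
# `smartctl -A` / `smartctl -a`. SAMPLE_SMART is illustrative data only.

WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector")

def smart_raw_values(text, watched=WATCHED):
    """Return {attribute_name: raw_value} for watched ATA attributes."""
    values = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows have 10+ columns: ID, name, flag, value, worst,
        # thresh, type, updated, when_failed, raw_value.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in watched:
            raw = fields[9]
            if raw.isdigit():
                values[fields[1]] = int(raw)
    return values

def failing_attributes(values):
    """Assumption: any nonzero raw count is worth investigating."""
    return sorted(name for name, raw in values.items() if raw > 0)

SAMPLE_SMART = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       8
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       21480
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""

if __name__ == "__main__":
    values = smart_raw_values(SAMPLE_SMART)
    print(values)
    print("investigate:", failing_attributes(values))
```

Recording these raw values on a schedule makes a rising trend visible, which is usually more telling than a single reading.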
Memory and CPU
Corrected ECC errors may be logged without any immediate failure; uncorrected errors call for DIMM replacement. If you suspect bad RAM, boot memtest86+ through IPMI. Thermal throttling shows up in the IPMI temperature sensors; when it does, check fan curves and datacenter ambient temperature.
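On Linux hosts with the EDAC driver loaded, the kernel exposes per-memory-controller corrected and uncorrected error counters under /sys/devices/system/edac/mc. The sketch below reads those counters and applies the rule stated above (any uncorrected error means the DIMMs behind that controller need attention). It assumes EDAC is available on your kernel and hardware; the function names are hypothetical, and the root path is a parameter so the logic can be exercised against a test directory.

```python
# Hypothetical helper: read ECC counters from the Linux EDAC sysfs tree.
# Assumes the EDAC driver for your platform is loaded; returns {} otherwise.
from pathlib import Path

def read_ecc_counts(edac_root="/sys/devices/system/edac/mc"):
    """Return {controller: {"corrected": n, "uncorrected": n}}."""
    counts = {}
    for mc in sorted(Path(edac_root).glob("mc[0-9]*")):
        try:
            corrected = int((mc / "ce_count").read_text().strip())
            uncorrected = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # counter missing or unreadable; skip this controller
        counts[mc.name] = {"corrected": corrected, "uncorrected": uncorrected}
    return counts

def needs_dimm_replacement(counts):
    """Controllers that have logged at least one uncorrected error."""
    return sorted(name for name, c in counts.items() if c["uncorrected"] > 0)

if __name__ == "__main__":
    counts = read_ecc_counts()
    for name, c in counts.items():
        print(name, c)
    print("uncorrected errors on:", needs_dimm_replacement(counts))
```

A steadily growing corrected count on one controller is also worth noting in a ticket, even though it does not by itself require replacement.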
Network interface issues
Use ethtool to check for link speed or duplex mismatches, and check for cable faults or bad SFP modules on 10G+ links. A flapping interface may point to either the switch port or the NIC; try an alternate port or known-good optics to isolate which.
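The speed and duplex check can be scripted by parsing the "Key: value" lines that ethtool prints for an interface. The following is a sketch, assuming a 10G link is expected; the sample output, the interface name eth0, the expected speed, and the function names are all illustrative assumptions.

```python
# Hypothetical helper: flag link problems from the text output of
# `ethtool <interface>`. SAMPLE_ETHTOOL is illustrative data only.

def parse_ethtool(text):
    """Collect single-line 'Key: value' pairs; ignore headers and wraps."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields

def link_problems(fields, expected_speed="10000Mb/s"):
    """Return human-readable problems; an empty list means the link looks OK."""
    problems = []
    if fields.get("Link detected") != "yes":
        problems.append("no link detected")
    if fields.get("Speed") != expected_speed:
        speed = fields.get("Speed", "unknown")
        problems.append(f"speed is {speed}, expected {expected_speed}")
    if fields.get("Duplex") != "Full":
        problems.append(f"duplex is {fields.get('Duplex', 'unknown')}")
    return problems

SAMPLE_ETHTOOL = """\
Settings for eth0:
    Speed: 1000Mb/s
    Duplex: Full
    Port: FIBRE
    Link detected: yes
"""

if __name__ == "__main__":
    for problem in link_problems(parse_ethtool(SAMPLE_ETHTOOL)):
        print("eth0:", problem)
```

A 10G port that negotiates at a lower speed, as in the sample, is a typical sign of a marginal cable or optic and is worth testing with known-good parts before suspecting the NIC.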
Escalation and RMA
Open a Nexelya support ticket and attach SEL exports, SMART logs, and timestamps, and grant permission for remote hands to test spares. Nexelya coordinates parts replacement and chassis work with the datacenter. Because hardware replacement has lead times, maintain spare capacity or failover servers.
Frequently asked questions
Intermittent errors are the hardest to diagnose. Enable IPMI SEL logging alerts and retain logs across reboots when possible.
RAID rebuilds reduce performance; schedule drive replacements during low-traffic periods.
Keep spare servers or VPS failover for hardware lead times that exceed your RTO.