YorkHost - Infrastructure Virtualisation Upgrade – Maintenance details

All systems operational

Infrastructure Virtualisation Upgrade

Completed
Scheduled for November 06, 2025 at 7:30 AM – 9:30 AM

Affects

Virtualization Infrastructure

Under maintenance from 7:30 AM to 9:30 AM

PVE 01

Under maintenance from 7:30 AM to 9:30 AM

Web

Under maintenance from 7:30 AM to 9:30 AM

Plesk Web01

Under maintenance from 7:30 AM to 9:30 AM

Updates
  • Completed
    November 06, 2025 at 9:30 AM

    Status: ⚙️ Completed (Partial Success)
    Date: November 6, 2025
    Start Time: 08:30 AM
    Completion: ~10:00 PM

    As part of YorkHost’s infrastructure enhancement program, a multi-layer maintenance operation was conducted across several environments, including PVE-01, PVE-05, and the management hypervisor (MGMT). The objective was to reinforce virtualization performance, storage reliability, and scalability.


    🔧 Objectives

    1. Install a new SSD on PVE-01 to improve I/O performance and replace the old drive.

    2. Deploy PVE-05, a new hypervisor expanding the cluster’s compute capacity.

    3. Upgrade the management PVE node to strengthen orchestration and monitoring.


    🧩 Technical Summary

    • The old SSD on PVE-01 was successfully removed and replaced with a new Samsung QVO.

    • The new drive was detected but marked “foreign” by the RAID controller (Dell PERC H710).

    • This prevented its integration into Proxmox. A reset of the RAID configuration was attempted, but the issue persisted.

    • A follow-up maintenance window (~30 minutes) will be scheduled to reinitialize the controller and finalize the integration.

    In parallel, setup of PVE-05 began. Base installation and network configuration are complete, and final cluster integration is expected within 24 hours.
    The MGMT hypervisor was upgraded, improving internal tools and monitoring, but a few misconfigurations created temporary bottlenecks, delaying completion.


    🖥️ Concurrent Operations

    This maintenance coincided with other internal tasks, including interventions on:

    • Several dedicated servers belonging to client infrastructure.

    • Two game nodes undergoing updates and hardware adjustments.

    While these parallel operations were all part of ongoing optimization efforts, scheduling too many maintenance actions within such a short timeframe led to resource overlap and reduced operational efficiency.
    Future procedures will be better spaced and independently scheduled to maintain focus and execution quality.
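
The resource overlap described above can be caught before a maintenance day by checking planned windows against each other. The following is a minimal sketch in Python; the window labels and times are hypothetical placeholders, not the actual schedule, and `find_overlaps` is an illustrative helper rather than an existing YorkHost tool.

```python
from datetime import datetime
from itertools import combinations

# Hypothetical maintenance windows: (label, start, end).
# These times are illustrative placeholders, not the real schedule.
windows = [
    ("PVE-01 SSD replacement", datetime(2025, 11, 6, 7, 30), datetime(2025, 11, 6, 8, 0)),
    ("PVE-05 deployment", datetime(2025, 11, 6, 7, 45), datetime(2025, 11, 6, 9, 0)),
    ("MGMT hypervisor upgrade", datetime(2025, 11, 6, 9, 0), datetime(2025, 11, 6, 9, 30)),
]


def find_overlaps(windows):
    """Return pairs of labels whose time windows overlap."""
    overlaps = []
    for (name_a, start_a, end_a), (name_b, start_b, end_b) in combinations(windows, 2):
        # Half-open intervals [start, end) overlap when each one
        # starts before the other ends.
        if start_a < end_b and start_b < end_a:
            overlaps.append((name_a, name_b))
    return overlaps


for a, b in find_overlaps(windows):
    print(f"Overlap: {a} <-> {b}")
```

Windows that merely touch (one ends exactly when the next begins) are not reported, so back-to-back scheduling passes the check while true concurrency is flagged.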


    ⚠️ Operational Observations

    The planned 30-minute window was extended due to:

    • RAID configuration anomalies.

    • Additional diagnostics and reinitialization steps.

    • Cumulative delays from concurrent tasks.

    The late timing also increased operational strain and fatigue, further impacting response times.


    📅 Improvements for Future Maintenance

    • Plan interventions individually and earlier in the day to reduce operational pressure.

    • Avoid simultaneous maintenance across multiple systems unless strictly necessary.

    • Implement structured scheduling and task dependencies to ensure workflow separation.

    • Prepare spare hardware and validated RAID configurations before intervention.
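
One way to express task dependencies and workflow separation is a small dependency graph that yields a strictly sequential execution order. The sketch below uses Python's standard `graphlib` module (Python 3.9+); the task names and their dependencies are assumptions made for illustration and do not describe an actual YorkHost runbook.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each task maps to the set of tasks that
# must be finished before it may start. Names are illustrative only.
dependencies = {
    "validate_raid_config": set(),
    "stage_spare_ssd": set(),
    "replace_ssd_pve01": {"validate_raid_config", "stage_spare_ssd"},
    "verify_pve01_services": {"replace_ssd_pve01"},
    "deploy_pve05": {"verify_pve01_services"},
}


def maintenance_order(deps):
    """Return one valid execution order.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(deps).static_order())


order = maintenance_order(dependencies)
for step, task in enumerate(order, start=1):
    print(f"{step}. {task}")
```

Because the preparation tasks (validated RAID configuration, staged spare hardware) are prerequisites of the hardware swap, they always appear before it in the resulting order, and the next system is only touched after the previous one has been verified.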


    ✅ Next Steps

    1. Schedule a short maintenance window to finalize the SSD setup on PVE-01.

    2. Complete PVE-05 activation and synchronization.

    3. Validate inter-node monitoring and backup consistency.

    Despite these complications, all systems are stable, and YorkHost engineering teams remain committed to improving reliability and service continuity across the infrastructure.
    End of Report.

  • Planned
    November 06, 2025 at 7:30 AM

    As part of our ongoing virtualization infrastructure improvements, a maintenance operation is scheduled on PVE-01 to install an additional SSD aimed at enhancing storage performance and system reliability.

    ⏱️ Estimated duration: up to 30 minutes
    ⚠️ Expected impact: temporary interruption of hosted services, including Plesk

    This upgrade is part of our continuous effort to ensure optimal performance and service stability across all environments.
    Our engineering teams will closely monitor the operation to minimize downtime and ensure a swift restoration of all services.