Experiencing disruptions

Multiple internal machines unreachable

Internal networking

A switch that carries part of our internal network traffic is failing. We are investigating the cause of the failure and looking for a way to bring the affected machines back online.


Altamira supercomputer: Altamira supercomputer related systems.
Batch System: Slurm batch system for Altamira. Status: Operational
Login nodes: Altamira login nodes (login1, login2). Status: Operational
Grid and HTC: General purpose batch system and high throughput compute system.
Web and miscellaneous services: Web services, wiki pages and other services.
AAI: Authentication, Authorization and Identity systems.
Networking: Internal and external networking. Status: Disrupted
Storage systems: Distributed storage systems.

Incident history


August 28, 2025 at 7:45 PM UTC

Multiple internal machines unreachable

▲ This issue is not resolved yet
July 31, 2025 at 10:00 AM UTC

[Cloud] Disruptions in compute and networking services

Resolved in under a minute
July 29, 2025 at 7:00 AM UTC

[HPC and Grid] Slurm upgrade

Resolved after 53h 0m of downtime
June 24, 2025 at 6:10 AM UTC

Cloud network redundancy

Resolved after 9h 50m of downtime
April 28, 2025 at 10:30 AM UTC

Electrical blackout

Resolved after 21h 31m of downtime
April 14, 2025 at 7:31 AM UTC

Cloud upgrade

Resolved in under a minute
April 4, 2025 at 10:05 AM UTC

System authentication failing

Resolved in under a minute
April 2, 2025 at 8:06 AM UTC

Ceph Upgrade

The Ceph storage system, which provides external storage volumes for the cloud system, is about to be updated. Some timeouts may occur during the update.
April 1, 2025 at 9:07 AM UTC

Login2 - No login

Resolved after 54h 53m of downtime
February 23, 2025 at 11:28 AM UTC

Datacenter power upgrade

Resolved after 120h 0m of downtime
