IFCA Computing services

Experiencing disruptions

Investigating - We are investigating a potential issue that might affect the uptime of one of our services. We are sorry for any inconvenience this may cause you. This incident post will be updated once we have more information.

HPC - Broken Omnipath switch

Batch System

We have two Omnipath switches, and one of them has failed on the HPC nodes. These switches carry the message exchange for parallelized jobs in the Slurm compute queue. We are waiting for a replacement to be sent to us.

Altamira supercomputer

Batch System Maintenance

Cloud Infrastructure

OpenStack Cloud Public APIs Operational

OpenStack Compute (nova) Operational

OpenStack Identity (Keystone) Operational

OpenStack Object Storage (Swift) Operational

OpenStack Block Storage (cinder) Operational

OpenStack Image catalog (Glance) Operational

OpenStack Networking (Neutron) Operational

OpenStack Dashboard (Horizon) Operational

OpenStack Compute Nodes Operational

Grid and HTC

User Interfaces Operational

Computing Elements Disrupted

Web and miscellaneous services

Indico Agenda pages Operational

Wordpress pages Operational

Wiki (Confluence) Operational

Videoconference system Operational

Chat rooms Operational

GitLab Operational

IFCA repository mirror Operational

Data Science Hub Operational

Helpdesk Operational

Monitoring Operational

Nextcloud Operational

AAI

Authentication Single Sign-on Operational

Authentication main system Operational

Authentication replica Operational

Networking

External network connection Operational

Internal networking Operational

DNS Operational

Storage systems

Backup system Operational

Ceph block storage Operational

IBM Spectrum Scale Operational

Tape system Operational

Incident history

Urgent electrical incident

Building transformer maintenance 2024

Cinder update to Wallaby

[CLOUD] Upgrade Compute and Network OpenStack services

Database down

Cloud OpenStack cluster disrupting failures

Upgrade Slurm urgently because of multiple CRITICAL risk vulnerabilities

Building transformer maintenance

Nextcloud Upgrade

Slurm upgrade