Discovered during hospital filter development, 2026-03-10. Each HCODE links to the hospital page.
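The Google Sheet has two columns, both named “CIC”, in different sections:

| Column | Contents | Status |
|---|---|---|
| CIC boolean (col 87) | YES/NO staff data | ❌ Lost during processing; DictReader overwrites this |
| CIC year (col 113) | Year data | ✅ Survives; second column wins in DictReader |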
csv.DictReader keys rows by column header. When two columns share a name, the last one silently overwrites the first. The processed hospitals.csv has two identical CIC columns, both containing year data. The original YES/NO staff data is completely lost.
The cleaning script (clean_data.py) has no CIC-specific logic — the corruption is purely from the DictReader duplicate-key behavior. Both source files (hospitals.csv and hospitals-2026-03-03.csv) have identical CIC data, so this isn’t a recent change.
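The duplicate-key behavior described above can be reproduced in a few lines. The sketch below is illustrative only and is not taken from clean_data.py: the sample CSV, the `dedupe_headers` helper, and the `CIC_2` suffix scheme are all assumptions. It shows the overwrite first, then one possible way to keep both columns by reading the header row manually and giving repeated names a numeric suffix.

```python
import csv
import io


def dedupe_headers(headers):
    """Suffix repeated names (CIC, CIC_2, ...) so every column keeps its own key.

    Hypothetical helper; assumes no existing header already uses the suffixed name.
    """
    seen = {}
    unique = []
    for name in headers:
        seen[name] = seen.get(name, 0) + 1
        unique.append(name if seen[name] == 1 else f"{name}_{seen[name]}")
    return unique


# Made-up sample with two "CIC" columns, mirroring the sheet layout.
sample = "HCODE,CIC,Name,CIC\nCUL,YES,Culiacán,2018\n"

# Default behavior: the later "CIC" column overwrites the earlier one.
naive = next(csv.DictReader(io.StringIO(sample)))

# Possible fix: read the header with csv.reader, then build rows from unique names.
reader = csv.reader(io.StringIO(sample))
fieldnames = dedupe_headers(next(reader))
rows = [dict(zip(fieldnames, row)) for row in reader]

print(naive["CIC"])                       # year value only; the YES/NO value is gone
print(rows[0]["CIC"], rows[0]["CIC_2"])   # both values preserved
```

Renaming at read time keeps the rest of a DictReader-style pipeline unchanged, since downstream code still receives plain dictionaries. Renaming the columns in the Google Sheet itself would avoid the problem at the source.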
| CIC boolean (col 87) | CIC year (col 113) | Count | Assessment |
|---|---|---|---|
| YES | year | 17 | Consistent — both columns agree |
| YES | ND | 15 | Missing year data — needs backfill |
| NO / empty | year | 6 | Contradictory — needs reconciliation |
| NO | ND | 8 | Consistent — explicitly no CIC |
| Subroga | ND | 1 | SML only |
| empty | ND | 103 | Consistent — no involvement |
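The assessment categories in the table can be expressed as a small classification function. This is a minimal sketch, assuming both columns have already been recovered as separate strings. The function name `classify_cic` and the category labels are hypothetical, and any non-numeric year value (such as ND) is treated as "no year".

```python
def classify_cic(boolean_value, year_value):
    """Classify one hospital row by comparing the CIC boolean and CIC year values."""
    flag = (boolean_value or "").strip().upper()
    year = (year_value or "").strip()
    has_year = year.isdigit()  # "ND" and empty strings count as no year

    if flag == "YES":
        return "consistent" if has_year else "missing_year"
    if flag in ("", "NO"):
        return "contradictory" if has_year else "consistent"
    return "other"  # any other value, e.g. "Subroga"


print(classify_cic("YES", "2018"))   # consistent
print(classify_cic("YES", "ND"))     # missing_year
print(classify_cic("NO", "2023"))    # contradictory
```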
These hospitals have a collaboration year recorded but the boolean column says NO or is blank:
| HCODE | Hospital | CIC boolean (col 87) | CIC year (col 113) | Issue |
|---|---|---|---|---|
| COP | Centro Oncológico Pediátrico de Baja California | empty | 2024 | Empty boolean, has year |
| INP | Instituto Nacional de Pediatría | empty | 2023 | Empty boolean, has year |
| MTY | Hospital Universitario UANL | empty | 2023 | Empty boolean, has year |
| CVJ | Hospital del Niño Morelense | NO | 2023 | Explicitly NO but has year |
| MLM | Hospital Infantil de Morelia | NO | 2023 | Explicitly NO but has year |
| TAB | Hospital del Niño Dr. Rodolfo Nieto Padrón | NO | 2023 | Explicitly NO but has year |
These hospitals say YES in the boolean but have no year recorded. The year column needs backfilling:
| HCODE | Hospital | CIC boolean | CIC year |
|---|---|---|---|
| ABC | ABC Medical Center | YES | ND |
| CPE | Campeche | YES | ND |
| GDL | Civil nuevo | YES | ND |
| HMO | Sonora | YES | ND |
| ICM | ICM | YES | ND |
| IQR | IQR | YES | ND |
| IVA | IVA | YES | ND |
| MOC | Moctezuma | YES | ND |
| NOV | 20 de Noviembre | YES | ND |
| OAX | Oaxaca | YES | ND |
| OCC | CMNO | YES | ND |
| PUE | Puebla | YES | ND |
| SGF | SGF | YES | ND |
| XAL | Xalapa | YES | ND |
| ZCL | Zacatecas | YES | ND |
These hospitals have YES in the boolean AND a valid year — no issues:
| HCODE | Hospital | CIC year |
|---|---|---|
| CUL | Culiacán | 2018 |
| ITO | HITO | 2018 |
| LAP | La Paz | 2019 |
| MID | Mérida | 2019 |
| TIJ | Tijuana | 2019 |
| CVM | Tamaulipas | 2021 |
| NIM | IMIEM | 2021 |
| HRA | HRA | 2022 |
| TPQ | Nayarit | 2022 |
| ACP | Acapulco | 2024 |
| CUU | Chihuahua | 2024 |
| CYW | Celaya | 2024 |
| TGZ | Tuxtla | 2024 |
| BJX | Bajío | 2025 |
| LEO | León | 2025 |
| PCA | Pachuca | 2025 |
| MIS | ISSEMYM | 2026 |
clean_data.py
Current dashboard workaround: the filter system treats any non-falsy CIC value (including years) as “Sí” for boolean filtering.
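For illustration, the workaround could look roughly like the sketch below. It does not reproduce the dashboard's actual code: the function name `cic_filter_label`, the `FALSY_VALUES` set, and the choice to treat ND as falsy are assumptions.

```python
# Assumption: these are the values the filter treats as "no CIC".
FALSY_VALUES = {"", "NO", "ND"}


def cic_filter_label(value):
    """Map a raw CIC cell (YES/NO, a year, ND, or empty) to a boolean filter label."""
    normalized = (value or "").strip().upper()
    return "No" if normalized in FALSY_VALUES else "Sí"


print(cic_filter_label("2024"))  # Sí
print(cic_filter_label("ND"))    # No
```

Under these assumptions, a hospital whose only surviving CIC value is ND is filtered as "No" even if its lost boolean value was YES, which matters for the 15 hospitals listed as missing year data.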