Prague's municipal digitisation programme has a problem hiding in plain sight. Thousands of duplicate images — scanned photographs, architectural drawings, and urban planning records — are sitting in the city's public repositories, inflating storage costs, confusing researchers, and slowing down access to genuine archival material. Now, officials and technical experts are publicly acknowledging the scale of the issue and debating what to do about it.
The timing matters. Prague City Hall is midway through a five-year digital infrastructure upgrade, running until 2028 and budgeted at roughly 1.4 billion Czech crowns. Duplicate image data is not a trivial footnote in that project: it directly affects how efficiently the city can migrate legacy records into the new unified system. The longer the problem is ignored, the more expensive the eventual cleanup becomes.
What the Experts Are Saying
Staff at the Prague City Archive on Archivní Street in Chodov have been grappling with the problem for at least two years. The archive holds records going back to the 13th century, and its digitisation push accelerated significantly after 2020. According to technical documentation published by the archive, batch scanning operations, particularly of cadastral maps and building permits from the 1950s through the 1980s, produced high duplication rates when multiple departments submitted overlapping source materials without prior deduplication checks.
The Institute of Planning and Development of the Capital City of Prague, known by its Czech acronym IPR Praha and headquartered on Vyšehradská Street in Nové Město, has flagged the issue in internal working papers shared with the City Council's committee on digitalisation. IPR Praha manages extensive photographic and cartographic databases tied to ongoing urban planning projects across districts including Žižkov, Holešovice, and Smíchov. Technical staff at the institute have described the duplication problem as a product of decentralised data entry, where individual project teams uploaded images independently without referencing a shared master catalogue.
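To illustrate what referencing a shared master catalogue could look like in practice, here is a minimal sketch in Python. It is not drawn from any system used by IPR Praha or the city; the class name, method name, and record identifiers are hypothetical. It catches only byte-identical files, by comparing SHA-256 digests of file content before an upload is accepted.

```python
import hashlib


class MasterCatalogue:
    """Hypothetical shared catalogue keyed by the SHA-256 digest of file content."""

    def __init__(self):
        # Maps content digest -> identifier of the first record registered with it.
        self._by_digest = {}

    def register(self, record_id, content):
        """Register an upload.

        Returns None if the content is new, or the identifier of the
        existing record if identical bytes were already catalogued.
        """
        digest = hashlib.sha256(content).hexdigest()
        existing = self._by_digest.get(digest)
        if existing is not None:
            return existing  # exact duplicate: nothing new is stored
        self._by_digest[digest] = record_id
        return None


catalogue = MasterCatalogue()
first = catalogue.register("team-a/plan-001", b"scanned image bytes")
second = catalogue.register("team-b/plan-417", b"scanned image bytes")
```

In this sketch the second team's upload is matched to the first team's record instead of being stored again. A check of this kind does not catch rescans of the same sheet, because a new scan produces different bytes.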
Miroslav Baštař, a data governance lecturer at Czech Technical University's Faculty of Civil Engineering in Dejvice, has written about the issue in professional journals. He argues that Prague's situation is not unique, since cities including Warsaw and Vienna faced similar challenges during their own digitisation transitions, but that Prague's relatively late adoption of automated deduplication tools has made the backlog larger than necessary. In published academic work, he has estimated that between 15 and 22 percent of images in mid-sized European municipal archives are duplicates at the point of first audit.
The Path Forward
Prague's Department of Information Technology issued a public procurement notice in May 2026 seeking vendors capable of supplying perceptual hashing and machine-learning-assisted deduplication software for a pilot programme covering approximately 400,000 image files. The pilot is scheduled to run through the fourth quarter of 2026, with results to be reviewed by the City Council's digitalisation committee in January 2027.
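For readers unfamiliar with perceptual hashing, the following Python sketch shows one simple variant, an average hash, applied to small grayscale grids. It is an illustration only and says nothing about which algorithm the procurement will select. The function names, the 4x4 sample grids, and the distance threshold are assumptions, and real images would first be downscaled and converted to grayscale.

```python
def average_hash(pixels):
    """Build a bit string from a grid of grayscale values (0-255).

    Each bit is 1 if the pixel is brighter than the grid's mean, else 0.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count the bit positions in which two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def is_probable_duplicate(pixels_a, pixels_b, max_distance=2):
    """Treat two grids as likely duplicates if their hashes are close.

    max_distance is a hypothetical threshold that would need tuning.
    """
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance


original = [[10, 20, 200, 210], [15, 25, 205, 215],
            [10, 20, 200, 210], [15, 25, 205, 215]]
# The same sheet scanned again, slightly brighter overall.
rescan = [[value + 5 for value in row] for row in original]
# A different image: the bright and dark halves are swapped.
mirrored = [list(reversed(row)) for row in original]
```

Here the rescan hashes identically to the original even though every pixel value differs, while the mirrored grid differs in every bit. That tolerance for small changes is what distinguishes perceptual hashing from an exact checksum.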
Heritage preservation groups have their own concerns. Klub Za starou Prahu, the civic association dedicated to preserving Prague's historic built environment and founded in 1900, has publicly urged the city to ensure that deduplication processes include human review for images of listed buildings and protected zones, particularly in Malá Strana and the Jewish Quarter. Automated deletion, the group has argued in correspondence with the city, risks removing what appear to be duplicates but are actually distinct images showing different states of deterioration or renovation.
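One way a human-review safeguard of this kind could be expressed is sketched below in Python. It is a hypothetical policy, not the city's or the association's proposal: the `protected` flag, the outcome labels, and the threshold are all assumptions. It takes a similarity distance that some earlier step has already computed and never deletes anything itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    record_id: str
    protected: bool  # hypothetical flag: listed building or protected zone


def route_candidate(image_a, image_b, distance, auto_threshold=2):
    """Decide what happens to a pair of possibly duplicate images.

    distance is a precomputed similarity distance (lower means more alike).
    Returns one of three hypothetical outcome labels.
    """
    if distance > auto_threshold:
        return "keep_both"  # not similar enough to count as duplicates
    if image_a.protected or image_b.protected:
        return "human_review"  # an archivist decides, never the software
    return "auto_flag"  # marked as a duplicate for later consolidation


facade_1998 = ImageRecord("facade-1998", protected=True)
facade_2011 = ImageRecord("facade-2011", protected=True)
permit_scan = ImageRecord("permit-scan-a", protected=False)
permit_copy = ImageRecord("permit-scan-b", protected=False)
```

Under this rule, two near-identical photographs of a protected facade go to a reviewer, who can tell whether they record different states of the building, while two copies of an ordinary permit scan are flagged automatically.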
The practical stakes are real for anyone using Prague's public records. Architects filing permit applications in districts like Vinohrady frequently cross-reference historical imagery from city databases, and redundant files increase search times. Journalists and researchers accessing the National Technical Museum's digitised collections on Kostelní Street in Holešovice encounter similar friction.
The deduplication pilot's results, expected in early 2027, will determine whether the city rolls out the technology across all municipal repositories or opts for a more labour-intensive manual review process. Either way, officials appear to agree that the status quo, in which hundreds of thousands of redundant files accumulate storage costs estimated at tens of thousands of crowns per month, is no longer sustainable.