The Biblioteca de Catalunya adds 14,256 new versions of .CAT websites to the PADICAT repository

26-03-2010

The number of harvested sites of the Catalan Internet has doubled, reaching 22,931 digital resources

The Biblioteca de Catalunya (BC) has published figures on its repository, PADICAT (Digital Heritage of Catalonia), created to guarantee the preservation of Catalan digital resources published on the Internet. As of March 2010, it preserves 22,931 harvests from 17,063 websites, occupying approximately 7.7 terabytes of storage across a total of 186 million files.

These 22,931 harvests represent an increase of 107% over the last year, from 11,056 harvested versions of 8,800 digital resources to 22,931 harvested versions of 17,063 websites. This significant growth is due to the expansion of the computer nodes managed by the Centre de Supercomputació de Catalunya (CESCA), the BC’s technological partner in the preservation of Catalan digital content on the Internet.

Harvesting of websites in the .CAT domain was prioritized in 2009. Until March 2009, these resources represented 18% of PADICAT’s collection, with a total of 4,256 harvested versions from 4,000 websites. One year later, the .CAT domain represents 62% of PADICAT’s resources, with a total of 14,256 harvested versions from 14,004 websites. The incorporation of the .CAT domain into PADICAT’s collection is the result of the cooperation agreement between the BC and the Fundació puntCAT, signed on November 10, 2006.

The Biblioteca de Catalunya systematically harvests several versions of Catalan web pages on the Internet, drawing on four sources: the 437 cooperation agreements signed with entities and associations; 292 digital resources proposed by visitors and users of PADICAT’s website; 5,472 websites harvested for monographic collections (museums of Catalonia, electoral campaigns in successive elections, folk-rock music); and the .CAT domain.

PADICAT is one of the 36 web archives in the world grouped in the International Internet Preservation Consortium, which is formed by national libraries and archives from different countries and aims to contribute to the preservation of web pages published on the Internet.

More information available at Crawled web sites