Vast 21st Century Information Archives Are Remarkable Resources

History of Libraries and Archives

The ancient Egyptians are credited with many inventions, including some of the earliest libraries and archives. Collections of textual resources similar to modern archives were kept in ancient Egypt from the Old Kingdom onwards. These archives included documents regarding cults, sacred texts, magical texts, and administrative records. Around 300-200 BCE, the Ptolemaic dynasty spent a great deal of time and money building the famous library at Alexandria. Physical libraries and archives remained the norm until the dawn of the digital age. Before the computer age, serious research involved shoe leather and traveling to various libraries and document repositories. Digital computing and storage transformed library storage from shelves of books to shelves of data servers. Concurrently, the CPU processing power and memory of personal computers and hand-held devices expanded. Individuals now have ever-growing processing power and data storage available at their fingertips. The World Wide Web, running over the Internet, allows personal devices to connect to any mega-repository in the world. Any consumer with interest and a rudimentary digital device can access virtually the entire corpus of human knowledge.

What will consumers do with this processing power and access to information archives? Very few consumers take advantage of this accumulated knowledge, choosing instead to view cat videos and post pictures of food. The availability of comprehensive human knowledge combined with human disinterest is one of the fundamental ironies of the 21st century.

Size comparison between a 1 GB Jaz drive cartridge circa 1998 and a microSD card. MicroSD cards are now in the terabyte range. The Jaz drive was useful for general-purpose backup. The Jaz drive and the Iomega company are both long gone.

Digital Processing Power and Digital Storage Capacity Explosion

Data size names

Kilobyte (1,024 Bytes)
Megabyte (1,024 Kilobytes)
Gigabyte (1,024 Megabytes)
Terabyte (1,024 Gigabytes)
Petabyte (1,024 Terabytes)
Exabyte (1,024 Petabytes)
Zettabyte (1,024 Exabytes)
Yottabyte (1,024 Zettabytes)
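
The ladder of unit names above can be sketched in a few lines of Python. This is only an illustration that assumes the 1,024-based convention used in this list (some standards define these names in steps of 1,000 instead). The names UNITS, unit_size, and humanize are hypothetical helpers invented for this example.

```python
# Unit names in ascending order; each step is 1,024 times the previous one.
# Assumption: the 1,024-based convention used in this article's list.
UNITS = ["Byte", "Kilobyte", "Megabyte", "Gigabyte", "Terabyte",
         "Petabyte", "Exabyte", "Zettabyte", "Yottabyte"]


def unit_size(name: str) -> int:
    """Return the number of bytes in one unit, e.g. 1024**2 for a Megabyte."""
    return 1024 ** UNITS.index(name)


def humanize(num_bytes: int) -> str:
    """Express a byte count in the largest unit that keeps the value >= 1."""
    value = float(num_bytes)
    index = 0
    while value >= 1024 and index < len(UNITS) - 1:
        value /= 1024
        index += 1
    return f"{value:,.1f} {UNITS[index]}s"


if __name__ == "__main__":
    # Bytes in one Gigabyte under this convention.
    print(unit_size("Gigabyte"))
    # A byte count converted back into a readable unit.
    print(humanize(33 * unit_size("Zettabyte")))
    # How many Terabytes fit in 33 Zettabytes.
    print(33 * unit_size("Zettabyte") // unit_size("Terabyte"))
```

The last line shows why the larger names matter: written in Terabytes, 33 Zettabytes is already a number in the tens of billions.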

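Moore's law is the observation, made by Gordon Moore in 1965 and revised in 1975, that the number of transistors on an integrated circuit doubles about every two years; processing power and storage capacity have followed a broadly similar curve. Moore's law is an observation and projection of a historical trend, not a physical law. The prediction has remained roughly accurate as of 2021, although the mechanism of doubling has deviated slightly from Moore's exact prediction.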
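
To get a feel for what doubling every two years means, here is a minimal sketch of the arithmetic. It assumes an idealized, perfectly steady doubling period, which real hardware has never followed exactly. The function name doubling_factor and its parameters are hypothetical, chosen for this illustration.

```python
def doubling_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Growth multiple after `years` if capacity doubles every
    `doubling_period_years` (an idealized assumption, not measured data)."""
    return 2.0 ** (years / doubling_period_years)


if __name__ == "__main__":
    # 56 years is the span from 1965 to 2021.
    for span in (2, 10, 20, 56):
        print(f"{span:>2} years -> x{doubling_factor(span):,.0f}")
```

Under this assumption, ten years gives a 32-fold increase, and the 56 years from 1965 to 2021 give 28 doublings, a factor of roughly 268 million.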
Early general-purpose computers evolved into large monolithic mainframes, an approach called vertical scaling. However, as smaller computers (servers) evolved, it became much more economical to link many small servers together than to run one gigantic mainframe. The data center strategy of connecting many smaller servers together is called horizontal scaling. There are around 600 hyper-scale data centers in the world. A hyper-scale data center has over 5,000 servers.
Global data article in The Conversation.
World Data article at Bernard Marr
In 2018, the total amount of data created, captured, copied and consumed in the world was 33 zettabytes (ZB).

Portion of a typical server rack. Image via Wikimedia Commons.

Examples of Primary Source Information Activities

Google.com is the famous master index of most of the internet. Many of the following sites are indexed through Google in addition to maintaining their own local indexes.
Ancestry is an aggregation of genealogy records from many sources. Over 20 billion records.
HathiTrust at the University of Michigan is a partnership of academic and research institutions, offering a collection of millions of titles digitized from libraries around the world, including Google Books.

17,490,052 total volumes
8,428,047 book titles
469,947 serial titles
6,121,518,200 pages

Europeana is a digital aggregator of 3,000 European cultural institutions. Approximately 50 million volumes.
Wikipedia is a crowd-sourced encyclopedia that covers many human interest topics not documented elsewhere. 6,374,111 content pages.
Idagio is a well-curated broad and deep audio streaming archive of classical music.
Project Gutenberg is an online library of free eBooks started in 1971. 60,000 titles. Project Gutenberg was the first provider of free electronic books, or eBooks. Michael Hart, founder of Project Gutenberg, invented eBooks in 1971, and his memory continues to inspire the creation of eBooks and related content today.
Simansrris is an index of high-quality collections of digitized art and archival finds, including:

Digital Images Collections Guide is a bibliographic list of 950 digital collections across 85 subjects. This is a nested list of other lists, each containing links to digital archives.

Aerospace Engineering
African American Studies
African Studies
Agriculture
Agriculture, Biological, and Environmental Sciences
American History
Amphibians
Anatomy
Animal Science
Anthropology
Archeology
Architecture
Art & Art History
Arts
Astronomy
Australia and Oceania History
Biology
Biophysics
Cardiology
Chicano Studies
Chinese Area Studies
Civil Engineering
Classical Studies
Climate & Soil Science
Clothing
Communication Studies
Costume Design
Cultural Studies
Dance
Dentistry
Dermatology
East Asian Studies
Ecology
Embryology
Entomology
European History
Fisheries & Aquatic Zoology
Forestry
Gender & Women's Studies
Geography
Geology
Greek Classical Studies
Gynaecology
Health (BioMedical) Sciences
History of Medicine
History of Science & Technology
Horticulture
Humanities
Japanese Area Studies
Jewish Studies
Journalism
Kinesiology
Landscape Architecture
Latin American Studies
Local & Regional
Marketing
Medicine (general)
Middle Eastern Studies
Native American Studies
Ornithology (Birds)
Pathology
Performing Arts
Photography
Physical Science & Engineering
Physics
Plant Pathology
Plant Science
Political Science
Primatology
Professional Programs
Religious Studies
Reptiles
Russian & Slavic Studies
Sculpture
Social Sciences
South Asian Studies
Surgical Science
Urology
Veterinary Science
World History

Notes

Jeffrey Sward, September 2021. Statistics were collected in September 2021.