“Access to [microdata], and their optimum exploitation, requires appropriately designed technological infrastructure, broad international agreement on interoperability, and effective data quality controls. (...)
[The] long-term sustainability of the infrastructure required for data access is particularly important. Research institutions and government organizations should take formal responsibility for ensuring that (...) data are effectively preserved, managed and made accessible in order that they can be put to efficient and appropriate use over the long term. (...)
Specific attention should (also) be devoted to supporting the use of techniques and instruments to guarantee the integrity and security of research data. With regard to guaranteeing the integrity of a data set, every effort should be made to ensure the completeness of data and absence of errors. With regard to security, the data, along with relevant meta-data and descriptions, should be protected against intentional or unintentional loss, destruction, modification and unauthorized access in conformity with explicit security protocols.”
A proper technological infrastructure must be put in place for the various components of microdata archiving, i.e., data documentation, cataloging, dissemination, anonymization, and preservation.
Microdata documentation
International metadata standards have been developed for documenting microdata and related resources. The Data Documentation Initiative (DDI) and Dublin Core standards provide a practical solution: DDI for the microdata themselves, and Dublin Core for related resources. Specialized metadata editors, such as the IHSN Microdata Management Toolkit (Nesstar Publisher), make it easy to document datasets in compliance with these standards.
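To give a sense of what standards-compliant metadata looks like, the following minimal sketch builds a simple Dublin Core record with the Python standard library. The element names (title, creator, date, type, rights) belong to the Dublin Core element set; the `metadata` wrapper element, the `dublin_core_record` helper, and all field values are assumptions made for illustration. They do not reflect the output format of the IHSN toolkit.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15-element Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"


def dublin_core_record(fields):
    """Build a simple XML record from a dict of Dublin Core element -> value.

    The "metadata" wrapper element is an arbitrary choice for this sketch.
    """
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical values describing a resource related to a survey dataset.
record = dublin_core_record({
    "title": "Household Budget Survey 2010 (hypothetical)",
    "creator": "National Statistics Office (hypothetical)",
    "date": "2010",
    "type": "Dataset",
    "rights": "Licensed use only",
})
print(record)
```

A full DDI description is considerably richer (study-level, file-level, and variable-level metadata), which is why a dedicated metadata editor is normally used rather than hand-written code.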
Microdata cataloging and dissemination
Potential users need to know which datasets exist and what they contain, yet many will have little or no information about what is available. Good metadata must therefore be made available, preferably in the form of a searchable online catalog. The objective of a microdata catalog is to provide easy access to data and documentation in a format convenient for users. Open-source software provided by the IHSN makes it easy to catalog datasets in compliance with international metadata standards.
Microdata anonymization
Microdata anonymization involves two steps: detecting potential instances of disclosure risk, and applying some form of data reduction or perturbation to reduce that risk. The second step requires input from someone with subject-matter knowledge, who can recommend the data reductions that will be least damaging to the researchers using the files. Anonymization also requires staff who are knowledgeable about statistics and about software packages such as Stata or SPSS. Some specialized software and guidelines are available to measure or reduce disclosure risk, but none of these applications provides an integrated and satisfactory solution for complex hierarchical data files. In practice, anonymization remains very much an ad hoc process.
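The two steps can be illustrated with a small sketch. It uses a frequency count over quasi-identifiers (a k-anonymity style check) for detection, and recoding of exact ages into bands as the data reduction. Both techniques are chosen here for illustration only and are not presented as the method prescribed by the IHSN. The function names, the variable names (`region`, `sex`, `age`), and the sample records are all hypothetical.

```python
from collections import Counter


def risky_records(records, quasi_identifiers, k=3):
    """Step 1, detection: return indices of records whose combination of
    quasi-identifier values occurs fewer than k times in the file."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    return [i for i, key in enumerate(keys) if counts[key] < k]


def recode_age(records, width=10):
    """Step 2, data reduction: replace exact age with a band of `width` years."""
    recoded = []
    for r in records:
        low = (r["age"] // width) * width
        recoded.append({**r, "age": f"{low}-{low + width - 1}"})
    return recoded


# Hypothetical sample file with three quasi-identifiers.
records = [
    {"region": "North", "sex": "F", "age": 31},
    {"region": "North", "sex": "F", "age": 34},
    {"region": "North", "sex": "F", "age": 38},
    {"region": "South", "sex": "M", "age": 52},
    {"region": "South", "sex": "M", "age": 57},
    {"region": "South", "sex": "M", "age": 83},
]
qi = ["region", "sex", "age"]

before = risky_records(records, qi, k=2)             # every record is unique
after = risky_records(recode_age(records), qi, k=2)  # only the oldest remains
print(before, after)
```

In this toy file, banding the ages removes most of the uniqueness, but the oldest respondent is still identifiable. Deciding what to do next (top-coding age, merging regions, or suppressing the value) is the kind of judgment that calls for subject-matter input.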
Microdata (and metadata) preservation
Digital data and metadata are vulnerable to software obsolescence, hardware and media obsolescence, physical threats, and human errors. Long-term preservation of data and metadata therefore requires proper procedures and infrastructure. See the IHSN guidelines.
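One common building block of a preservation procedure is a fixity check: checksums are recorded when files are archived and recomputed periodically to detect corruption or unintended modification. The sketch below is a minimal illustration using the Python standard library. The function names and the manifest layout are assumptions for this example and are not taken from the IHSN guidelines.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to `folder`) to its SHA-256 digest."""
    folder = Path(folder)
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def verify_manifest(folder, manifest):
    """Return the names of files that are missing or whose content changed
    since the manifest was built."""
    current = build_manifest(folder)
    return sorted(
        name for name, digest in manifest.items()
        if current.get(name) != digest
    )
```

In use, `build_manifest` would be run once when a dataset and its metadata are deposited, with the result stored separately from the files, and `verify_manifest` would be run on a schedule and after every media migration. A fixity check only detects damage. Recovery still depends on keeping independent backup copies.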