4.2.4 Convention that generates persistent, unique identifiers for AIPs

The repository shall have and use a convention that generates persistent, unique identifiers for all AIPs.

This is necessary in order to ensure that each AIP can be unambiguously found in the future. This is also necessary to ensure that each AIP can be distinguished from all other AIPs in the repository.

Documentation describing naming convention and physical evidence of its application (e.g., logs).

A repository needs to ensure that there is in place an accepted, standard naming convention that identifies its materials uniquely and persistently for use both in and outside the repository. The ‘visibility’ requirement here means ‘visible’ to repository managers and auditors. It does not imply that these unique identifiers need to be visible to end users or that they serve as the primary means of access to digital objects. Ideally, the unique ID lives as long as the AIP; if it does not, there must be traceability. Subsection 4.2.1 requires that the components of an AIP be suitably bound and identified for long-term management, but places no restrictions on how AIPs are identified with files. Thus, in the general case, an AIP may be distributed over many files, or a single file may contain more than one AIP. Therefore identifiers and filenames may not necessarily correspond to each other. Documentation must represent these relationships.

[Figure: Picture2]

The persistent unique identifier for an object is formed from the domain name of the owning institution, followed by a forward slash, followed by the object (bag) name assigned by the depositor. For example, virginia.edu/jefferson_collection.

Each file has two identifiers: a UUID generated by APTrust, and an AIP identifier composed of the domain name, the bag name, and the path of the file within the SIP, separated by forward slashes. For example, virginia.edu/jefferson_collection/image.jpg (see Figure 1).

This is fully described in Bagging specifications, under "How Bag Names Become Object Names".
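The naming convention above can be illustrated with a minimal sketch. This is not APTrust's implementation. The function names are hypothetical, and the use of a random (version 4) UUID is an assumption, since the text does not state which UUID version is generated.

```python
import uuid


def object_identifier(institution_domain: str, bag_name: str) -> str:
    """Join the owning institution's domain name and the depositor-assigned bag name."""
    return f"{institution_domain}/{bag_name}"


def file_identifier(institution_domain: str, bag_name: str, path_in_sip: str) -> str:
    """Append the file's path within the SIP to the object identifier."""
    # Strip any leading slash so the parts are joined by exactly one "/".
    relative_path = path_in_sip.lstrip("/")
    return f"{object_identifier(institution_domain, bag_name)}/{relative_path}"


def new_file_uuid() -> str:
    """Return a new UUID string for a file.

    Assumption: a random (version 4) UUID; the source only says "a UUID".
    """
    return str(uuid.uuid4())


if __name__ == "__main__":
    print(object_identifier("virginia.edu", "jefferson_collection"))
    print(file_identifier("virginia.edu", "jefferson_collection", "image.jpg"))
```

Run as a script, this prints virginia.edu/jefferson_collection and virginia.edu/jefferson_collection/image.jpg, matching the examples given above. Because the identifier is derived from the institution's domain and the bag name, it is unique only as long as each depositor does not reuse a bag name within its own domain.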