To help content creators create, convert, and deposit documents that meet the level of quality necessary for full information capture and the highest degree of preservability over time, the CSU-Pueblo Digital Repository (hereafter "CSUPDR") is developing a set of specification and format best-practice guidelines for common content types. NOTE: Refer to the Registered Formats and Support Levels Table when deciding which format to use. For example, the table shows that a PDF/A document receives a higher level of preservation support than a Word document.
The CSU-Pueblo Library is committed to providing long-term access to the digital works the repository contains by adhering to digital preservation best practices. However, the level of preservation support provided for a contribution is determined by the file format in which it is submitted. Software, hardware, and file format obsolescence is a complex issue with outcomes that are difficult to predict. This includes the future ability of the Library to convert obsolete file formats to accessible file formats without any loss to an original document’s look and feel. The CSUPDR will initially provide three levels of preservation support for specific file formats, as follows:
Level 1: The CSUPDR will provide its highest level of preservation support, making its best effort to maintain the content, structure and functionality in the future. Level 1 service is currently provided only for formats that are both publicly documented and widely used. This provides a high degree of confidence in our preservation commitment because it is more likely that tools will exist or be developed to undertake preservation actions, and that those actions will result in an understood and controlled transformation or migration.
Note: The content may be normalized (transformed to another stable format) to provide additional assurance that the information content is preserved or to facilitate discovery and viewing.
Level 2: The CSUPDR will make limited efforts to maintain the usability of the file. The format will be monitored and may be transformed when significant risk to access is imminent, but the consequences of any transformation or migration for content, structure or functionality are likely to be difficult to predict or control.
Note: The file may be transformed to a more preservable format to ensure that the information content is not lost, even if some structure and functionality are sacrificed.
Level 3: The CSUPDR provides basic preservation of the file (bitstream) and associated metadata as-is, with no active effort made to monitor the format and associated risks or to normalize, transform or migrate the file to a more preservable format. Files may be opened and/or read by future applications, but there is no guarantee that the content, structure, or functionality will be preserved.
Note: Any format not yet reviewed and evaluated by the CSUPDR team will receive Level 3 service on deposit. A higher level may be assigned after format review takes place.
| Preservation Level | Level 1 | Level 2 | Level 3 |
|---|---|---|---|
| Persistent identifier that will always point to the object and/or its metadata | • | • | • |
| Provenance records and other preservation metadata to support accessibility and management over time | • | • | • |
| Secure storage and backup | • | • | • |
| Periodic refreshment to new storage media | • | • | • |
| Fixity checks using proven checksum methods | • | • | • |
| Storage in a trusted preservable format (making a normalized version, if necessary) | • | for some formats | |
| Strategic monitoring of format | • | • | |
| Migration to succeeding format upon obsolescence | • | | |
The three levels of preservation commitment are made at the individual file level. Complex content items composed of multiple files in various formats will need additional evaluation to determine whether the operational relationships between the files can be maintained. If the original relationships are documented externally in metadata, that information will be preserved in any case. In addition, executables and some files that rely on a specific hardware/software environment will require additional evaluation, because both the format and the access environment must be considered in making a preservation determination.
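As an illustration of the "fixity checks using proven checksum methods" listed in the table above, the following is a minimal sketch of how a depositor might record and later re-verify a checksum for a file. The source does not say which algorithm the CSUPDR uses; SHA-256 is an assumption here, and the function names `sha256_of` and `fixity_ok` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks.

    SHA-256 is an assumed algorithm; the repository's actual choice
    is not stated in this document.
    """
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path: Path, recorded_digest: str) -> bool:
    """Compare a file's current digest with the digest recorded earlier."""
    return sha256_of(path) == recorded_digest.lower()
```

A digest recorded at deposit time can be stored alongside the file's metadata; a later mismatch indicates that the bitstream has changed since the digest was taken.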
The following list of formats and support levels will be reviewed and updated based on our growing experience with digital preservation and the emergence of new formats and standards. If you have a format that isn't listed below, please contact Karen Pardue by email at karen.pardue@colostate-pueblo.edu or by phone at (719) 549-2326 for more information.
Text, Page Description, and Microsoft Office File Formats

| Format | File Extension | Mime Type | Support Level | Qualifying Factors/Notes |
|---|---|---|---|---|
| PDF/A* | .pdf | application/pdf | Level 1 | Files not created per the “Best Practices” receive Level 2 support, and may be migrated to PDF/A |
| Plain Text | .txt | text/plain | Level 1 | |
| PostScript | .ps | application/postscript | Level 2 | |
| Rich Text | .rtf | application/rtf | Level 1 | |
| XML | .xml | text/xml | Level 2 | Deposit of the appropriate DTD/schema with the XML file is strongly encouraged and may impact preservation. At a minimum, XML should be well-formed; explicit namespaces are strongly preferred. |
| HTML | .html, .htm | text/html | Level 3 | Requires validated HTML 4.0 or 4.01 markup; any referenced CSS file(s) must be deposited with the document. |
*PDF/A is preferred over PDF whenever possible, as it is becoming recognized as the archival standard. See http://en.wikipedia.org/wiki/PDF/a for more information.
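The XML entry above asks that deposited XML be, at a minimum, well-formed. As a rough sketch of how a depositor could check this before submission, the snippet below uses Python's standard-library parser. The helper name `is_well_formed` is hypothetical, and the check covers well-formedness only; it does not validate against a DTD or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML.

    This does not validate against a DTD or schema; it only confirms
    that the parser accepts the document.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed("<record><title>Thesis</title></record>"))
    print(is_well_formed("<record><title>Thesis</record>"))
```

Checking a document against its DTD or schema requires a validating parser, which the standard-library module used here does not provide.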
| Format | File Extension | Mime Type | Support Level | Qualifying Factors/Notes |
|---|---|---|---|---|
| LaTeX | .latex | application/x-latex | Level 2 | We recommend that content be converted to PDF/A by the depositor. |
| TeX | .tex | application/x-tex | Level 2 | We recommend that content be converted to PDF/A by the depositor. |
| Microsoft Word | .doc | application/msword | Level 2 | We recommend that content be converted to PDF/A by the depositor. |
| Microsoft PowerPoint | .ppt | application/vnd.ms-powerpoint | Level 2 | We recommend that content be converted to PDF/A by the depositor. |
| Microsoft Excel | .xls | application/vnd.ms-excel | Level 2 | We recommend that content be converted to PDF/A by the depositor. |
Image File Formats

| Format | File Extension | Mime Type | Support Level | Qualifying Factors/Notes |
|---|---|---|---|---|
| TIFF | .tiff | image/tiff | Level 1 | Files in this format are often slow to load unless converted to a format with lossless compression, such as JPEG 2000 |
| JPEG 2000 | .jp2 | image/jp2 | Level 1 | Preferred over JPEG |
| JPEG | .jpg | image/jpeg | Level 1 | JPEG 2000 preferred |
| PNG | .png | image/png | Level 2 | |
| BMP | .bmp | image/x-ms-bmp | Level 3 | |
| GIF | .gif | image/gif | Level 3 | |
Audio File Formats

| Format | File Extension | Mime Type | Support Level | Qualifying Factors/Notes |
|---|---|---|---|---|
| Wave | .wav | audio/x-wav or audio/wav | Level 2 | |
| MPEG audio | .mp3 | audio/mpeg, audio/mp3 | Level 2 | |
| AAC_M4A | .m4a, .mp4 | audio/m4a, audio/mp4 | Level 3 | |
| AIFF | .aif, .aiff | audio/aiff, + | Level 3 | .wav or .mp3 preferred |
| Audio/Basic | .au, .snd | audio/basic | Level 3 | .wav or .mp3 preferred |
| Windows Media Audio | .wma | audio/x-ms-wma | Level 3 | .wav or .mp3 preferred |
Video File Formats

| Format | File Extension | Mime Type | Support Level | Qualifying Factors/Notes |
|---|---|---|---|---|
| MPEG-4 | .mp4 | video/mp4 | Level 2 | Many variants possible; preservation level not yet established |
| AVI | .avi | video/avi, video/msvideo, video/x-msvideo + | Level 3 | .mp4 preferred |
| Quicktime | .mov | video/quicktime, video/x-quicktime | Level 3 | .mp4 preferred |
| MPEG-1 | .mp1 | video/mpeg | Level 3 | Many variants possible; preservation level not yet established |
| Windows Media Video | .wmv | video/x-ms-wmv | Level 3 | .mp4 preferred |
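For depositors who want to check many files at once, the tables above can be transcribed into a small lookup. The sketch below is illustrative only: the dictionary is a hand transcription of the extensions listed on this page, not an official machine-readable registry, and the names `SUPPORT_LEVELS`, `NEEDS_REVIEW`, and `support_level` are hypothetical. Two extensions cannot be resolved from the file name alone (.pdf, where the level depends on whether the file is PDF/A created per the best practices, and .mp4, which appears in both the audio and video tables), so the sketch returns `None` for them. Unlisted extensions fall back to Level 3, following the note that formats not yet reviewed receive Level 3 service on deposit.

```python
from pathlib import PurePath
from typing import Optional

# Hand-transcribed from the tables on this page (assumption: the
# extension alone identifies the format, which is not always true).
SUPPORT_LEVELS = {
    ".txt": 1, ".rtf": 1, ".tiff": 1, ".jp2": 1, ".jpg": 1,
    ".ps": 2, ".xml": 2, ".latex": 2, ".tex": 2, ".doc": 2,
    ".ppt": 2, ".xls": 2, ".png": 2, ".wav": 2, ".mp3": 2,
    ".html": 3, ".htm": 3, ".bmp": 3, ".gif": 3, ".m4a": 3,
    ".aif": 3, ".aiff": 3, ".au": 3, ".snd": 3, ".wma": 3,
    ".avi": 3, ".mov": 3, ".mp1": 3, ".wmv": 3,
}

# Extensions whose level depends on more than the file name.
NEEDS_REVIEW = {".pdf", ".mp4"}


def support_level(filename: str) -> Optional[int]:
    """Return the listed support level, or None if it cannot be decided
    from the extension alone. Unlisted extensions default to Level 3."""
    suffix = PurePath(filename).suffix.lower()
    if suffix in NEEDS_REVIEW:
        return None
    return SUPPORT_LEVELS.get(suffix, 3)
```

A result of `None` means the file itself has to be inspected, for example to confirm PDF/A conformance or to tell audio from video content.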
Please contact Karen Pardue by email at karen.pardue@colostate-pueblo.edu or by phone at (719) 549-2326 if you have questions about a particular format.
Used with permission from CSU Libraries Digital Repository page