DArcMail Demonstration: Digital Archive eMail System

1 DArcMail Demonstration: Digital Archive eMail System
Riccardo Ferrante (@raferrante), Smithsonian Institution Archives (@SmithsonianArch)
Email Archiving Stewardship Tools Workshop, Harvard University

2 Email data points
- Earliest email dated in the late 1980s
- First email preserved digitally in 2005
- Largest account preserved during CERP: 80K emails
- Favorite example of a large account: 250,000+ emails
- Largest account to date: 30 GB, ???,??? emails
- Most recent account, acquired last week: 20 GB
- Primary processing and preservation tool: DArcMail
Some of the Smithsonian’s email platforms over the past 35 years.

3 Introducing a successor to the CERP Parser: DArcMail

4 CERP Parser
- Works on one message or a whole account
- Does preservation: MBOX to XML
- Generates metadata files, an attachments directory, etc. (i.e., the “package”)
- All components are open source, but:
  - Squeak is not a popular platform
  - Raw XML is ugly
  - GUI is the order of the day
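
The MBOX-to-XML step on this slide can be illustrated with a minimal sketch using only the Python standard library. This is not the CERP Parser's code or schema: the function name `mbox_to_xml` and the element names (`Account`, `Message`, `Attachment`) are hypothetical, chosen only to show the general shape of the conversion.

```python
import mailbox
import xml.etree.ElementTree as ET


def mbox_to_xml(mbox_path):
    """Return an XML element holding basic headers for each message in an MBOX file."""
    # Element names are hypothetical; they do not follow the CERP XML schema.
    root = ET.Element("Account")
    box = mailbox.mbox(mbox_path)
    try:
        for index, message in enumerate(box, start=1):
            node = ET.SubElement(root, "Message", id=str(index))
            for header in ("From", "To", "Date", "Subject"):
                ET.SubElement(node, header).text = str(message.get(header, ""))
            # Record attachment file names only; a full package would also
            # write the attachment bytes into a separate directory.
            for part in message.walk():
                filename = part.get_filename()
                if filename:
                    ET.SubElement(node, "Attachment").text = filename
    finally:
        box.close()
    return root
```

Calling `ET.tostring(mbox_to_xml("account.mbox"), encoding="unicode")` would give the raw XML for an account, where `account.mbox` is a placeholder path.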

5 DArcMail in the SI Archives Context
- Appraisal is a precondition to acquisition.
- Documentation of accessions, their accessions, etc. happens in SIA’s collection management system (CMS).
- Digital preservation is as preemptive as possible; it begins as soon as an accession is finalized.
- Storage packages are manually transferred to a separate server and LTOs.

6 DArcMail
- CERP Parser functions plus searching, exporting
- Simple GUI
- 4x faster processing
- Runs on Python and MySQL
- Puts understanding the account first, preservation second
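
To give a feel for what database-backed searching of an account might look like, here is a rough sketch. It is an assumption-laden illustration, not DArcMail's implementation: it uses Python's built-in `sqlite3` as a stand-in for MySQL so it runs without a server, and the table name `message`, its columns, and the functions `build_index` and `search_subject` are all hypothetical.

```python
import sqlite3


def build_index(messages):
    """Load (sender, sent, subject) tuples into an in-memory table."""
    # sqlite3 stands in for MySQL here; the schema is hypothetical.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE message ("
        "id INTEGER PRIMARY KEY, sender TEXT, sent TEXT, subject TEXT)"
    )
    conn.executemany(
        "INSERT INTO message (sender, sent, subject) VALUES (?, ?, ?)",
        messages,
    )
    conn.commit()
    return conn


def search_subject(conn, keyword):
    """Return (id, sender, subject) rows whose subject contains the keyword."""
    cursor = conn.execute(
        "SELECT id, sender, subject FROM message "
        "WHERE subject LIKE ? ORDER BY id",
        (f"%{keyword}%",),  # parameterized to avoid SQL injection
    )
    return cursor.fetchall()
```

Loading headers into a table first and querying afterwards mirrors the slide's priority of understanding the account before preserving it: an archivist can explore senders, dates, and subjects before any export takes place.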

26 DArcMail: Lifecycle stages outside DArcMail’s scope
- Appraisal, capture, and preliminary normalization if needed: MS Outlook for PSTs; MBOX client for other formats; Aid4Mail, MessageSave for preliminary normalization
- Sensitive data processing: MS Outlook for PSTs; MBOX client for other formats
- Repository: transfer to spinning disk, tape
- Access: online discovery
