I / O: Care & Feeding of Your EMu. Larry Gall, Computer Systems Office, Peabody Museum of Natural History, Yale University.

Presentation on theme: "I / O: Care & Feeding of Your EMu Larry Gall Computer Systems Office Peabody Museum of Natural History Yale University."— Presentation transcript:

1

2 I / O: Care & Feeding of Your EMu. Larry Gall, Computer Systems Office, Peabody Museum of Natural History, Yale University

3 I / O: Care & Feeding of Your EMu

7 I / O: Care & Feeding of Your EMu predictive text?

10 I / O: Care & Feeding of Your EMu an I/O bottleneck

15

16

17

22 Brief Peabody I/O
   EMus Expand Exponentially
   Slashing EMu Before It’s Too Late
   Boost Your Performance/Nightlife
   I I O

23 ~14 million specimens

24 Anthropology
   Botany
   Entomology
   Invertebrate Paleontology
   Invertebrate Zoology
   Mineralogy & Meteoritics
   Paleobotany
   Scientific Instruments
   Vertebrate Paleontology
   Vertebrate Zoology

27 Peabody Collections Current Digital Snapshot

   Collection                   Units     Unit type
   Anthropology                 325,000   Lot
   Botany                       400,000   Individual
   Entomology                   450,000   Lot / Individual
   Invertebrate Paleontology    350,000   Lot
   Invertebrate Zoology         350,000   Lot
   Mineralogy & Meteoritics      35,000   Individual
   Paleobotany                  150,000   Individual
   Scientific Instruments         5,000   Individual
   Vertebrate Paleontology      125,000   Individual
   Vertebrate Zoology           185,000   Lot / Individual

   > 80%   > 50%   < 50%
   Items with an electronic record available (25 years effort): 64 %
   ~14 million items => ~2.7 million databaseable units
   295,921 digital assets, mostly JPG & TIF, variety of other MIME types

28 Brief Peabody I/O

29 EMus Expand Exponentially

30

31

32

33

34

35

36 EMus Expand Exponentially
   New records are entered
   Existing records expand:
      more fields filled in
      more links established
   Records acquire new features:
      Darwin Core fields
      GUIDs (unique identifiers)
   New capabilities add records:
      Audit module
      Statistics module
   “Pork” may hide in plain sight:
      SummaryData/ExtendedData
      AdmOriginalData
      remote SummaryData copies

41 EMus Expand Exponentially SLASH

42 Brief Peabody I/O EMus Expand Exponentially

43 Brief Peabody I/O EMus Expand Exponentially Slashing EMu Before It’s Too Late

44 porky EMus can be surly, and they will bite you Slashing EMu before It’s Too Late

51 slashing : Halloween

52

53 Jason

54 Leatherface

55 Chucky

56 Freddy Krueger

57

58

59 Freddy EMuger

60 Slash that EMu beast !

61 Slashing EMu before It’s Too Late: eparties
   New records are entered
   Existing records expand:
      more fields filled in
      more links established
   Records acquire new features:
      Darwin Core fields
      GUIDs (unique identifiers)
   New capabilities add records:
      Audit module
      Statistics module
   EMu beast hiding in plain sight:
      AdmOriginalData
      SummaryData/ExtendedData
      remote SummaryData copies

62 Slashing EMu before It’s Too Late: eparties, AdmOriginalData

63 Slashing EMu before It’s Too Late: eparties, AdmOriginalData: null data rows

65 Slashing EMu before It’s Too Late: eparties, AdmOriginalData: Slashed by 31%

66 Slashing EMu before It’s Too Late: eparties, AdmOriginalData: Slashed by 31%. Freddy says why stop there ?

67 Slashing EMu before It’s Too Late: ecollectionevents, sites – round 2: constant data

68 Slashing EMu before It’s Too Late: ecollectionevents, sites – round 2: lengthy labels

69 Slashing EMu before It’s Too Late: ecollectionevents, sites – round 2: prefixes for temporary use during migration

70 Slashing EMu before It’s Too Late: ecollectionevents, sites – round 2

71 Slashing EMu before It’s Too Late: ecatalogue (data, rec, seg)

72 Crunch 2 (data, rec, seg):
   delete nulls from AdmOriginalData

73 Crunch 3 (data, rec, seg):
   delete nulls from AdmOriginalData
   shorten labels on AdmOriginalData

74 Crunch 4 (data, rec, seg):
   delete nulls from AdmOriginalData
   shorten labels on AdmOriginalData
   delete prefixes on AdmOriginalData

75 Crunch 4: Slashed by 55%
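
The three crunch steps named on the slides (delete nulls, shorten labels, delete prefixes) can be sketched in a few lines. This is only an illustration on made-up rows, not the server-side texexport/texload scripting actually used. The row layout (label, value pairs), the "mig:" prefix, the label map and the function names are all hypothetical, and the sketch assumes the temporary prefixes sit on the labels.

```python
def crunch(rows, label_map, prefixes):
    """Apply the three crunch steps to a list of (label, value) rows.

    All names and the row layout are hypothetical; this is not EMu code.
    """
    crunched = []
    for label, value in rows:
        # Crunch 2: delete rows whose value is null or empty.
        if value is None or value.strip() == "":
            continue
        # Crunch 4: delete a temporary migration prefix (assumed to be on the label).
        for prefix in prefixes:
            if label.startswith(prefix):
                label = label[len(prefix):]
                break
        # Crunch 3: shorten lengthy labels where a shorter form is defined.
        label = label_map.get(label, label)
        crunched.append((label, value))
    return crunched


def size(rows):
    """Rough size of the rows: total characters in labels and values."""
    return sum(len(label) + len(value or "") for label, value in rows)


rows = [
    ("mig:Original Collector Name", "J. Smith"),
    ("mig:Original Locality Description", ""),
    ("mig:Original Date Collected", None),
    ("Original Catalog Number", "12345"),
]
label_map = {
    "Original Collector Name": "Coll",
    "Original Catalog Number": "CatNo",
}
slimmed = crunch(rows, label_map, prefixes=("mig:",))
saved = 100.0 * (size(rows) - size(slimmed)) / size(rows)
print(slimmed)
print(f"characters saved: {saved:.0f}%")
```

The percentage printed here reflects only the toy rows; the 55% figure on the slide comes from the real ecatalogue data.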

76 Slashing EMu before It’s Too Late: the slashing allowed adding in Darwin Core data, with a net disk space reduction

78 Slashing EMu before It’s Too Late: methodologies used during the first pass slashings
   Boring, repetitive, nothing very fancy:
      Iterative server-side scripting (texexport, texload)
      Several million record updates were involved
      Manually tweaked nightly cron jobs to accommodate
      Conducted during evenings over a six-month period
      Watched closely to avoid taxing server performance

79 Slashing EMu before It’s Too Late: methodologies used during the first pass slashings
   Now we could do the following every night:
      Compact maintenance gets run on all modules (3.5 hours)
      Cron-ed plain text data dumps for all modules (3.5 hours):
         generate small, portable gzipped backups of all EMu data
         fully reinstantiate SQL database feeding local search portal
         fully reinstantiate SQL database feeding DiGIR/IPT services

80 rather brutish gladiator-style slashing, needs operator intervention

81 Slashing EMu before It’s Too Late how about more subtle slashing ?

82 Slashing EMu before It’s Too Late something a little bit more insidious, and automated

84 Slashing EMu before It’s Too Late Nurse Ratched shots and pills

88 Nurse Ratched Nurse RatchEMu

90 Slashing EMu before It’s Too Late: catalogue – round 2 (data, rec, seg), BEFORE

91 BEFORE: SummaryData

93 AFTER: SummaryData

94 AFTER: SummaryData, Slashed by 29%

95 AFTER: SummaryData, ExtendedData, Slashed by 29%

96 AFTER

97 Slashing EMu before It’s Too Late: texadmin – insert the slasher pills into validation segments

99 emureindex: a Perl script in your ~emu/bin directory
   system("texdesign -R $dbname /dev/null 2>&1");

100 slasher pills are reversible !
   slasher pills work great on “visible” fields (anything you see on screen and feel like slashing)
   slasher pills work great on “invisible” fields (remote SummaryData strings copied from linked records)

101 Slashing EMu before It’s Too Late: texadmin – insert the slasher pills into validation segments

   ecatalogue                               change
   Records:       986,361     1,557,        %
   Disk use:      10.4 gB     6.3 gB        -39.4%
   Record size:   11.1 kB     4.3 kB        -61.8%
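
As a rough cross-check of the ecatalogue figures on this slide, the sketch below recomputes the change in disk use and the earlier average record size. It assumes that "gB" and "kB" are binary units (1 gB = 1024 x 1024 kB) and that record size is simply disk use divided by record count; neither assumption is stated on the slide, and the function names are made up for illustration.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def avg_record_kb(disk_gb, records):
    """Average record size in kB, assuming binary units (an assumption)."""
    return disk_gb * 1024 * 1024 / records


disk_change = percent_change(10.4, 6.3)        # disk use, gB
size_before = avg_record_kb(10.4, 986_361)     # earlier record count from the slide

print(f"disk use change: {disk_change:.1f}%")       # -39.4%
print(f"average record size: {size_before:.1f} kB") # 11.1 kB
```

Both results agree with the slide to one decimal place, which suggests the table is internally consistent under these assumptions.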

102 Brief Peabody I/O EMus Expand Exponentially Slashing EMu Before It’s Too Late

103 Brief Peabody I/O EMus Expand Exponentially Slashing EMu Before It’s Too Late Boost Your Performance/Nightlife

104 Now we could do the following every night:
   Compact maintenance gets run on all modules (3.5 hours)
   Cron-ed plain text data dumps for all modules (3.5 hours):
      generate small, portable gzipped backups of all EMu data
      fully reinstantiate SQL database feeding local search portal
      fully reinstantiate SQL database feeding DiGIR/IPT services

106 2014 every night:
   Compact maintenance gets run on all modules (1.4 hours)
   Cron-ed plain text data dumps for all modules (2.3 hours):
      generate small, portable gzipped backups of all EMu data
      fully reinstantiate SQL database feeding local search portal
      fully reinstantiate SQL database feeding DiGIR/IPT services
   1. Pushing newly created multimedia files to Yale DAM
   2. Pushing metadata updates to extant multimedia files to Yale DAM
   3. OAI-PMH record harvesting by Yale Cross Collections search
   4. Updating archives fonds (EAD) in Yale Finding Aid Database

107

108 n=18

109

110 1. output of the command "texlist -s"
    2. time to run compact maintenance on all modules
    3. time to run compact maintenance on just catalogue

111 diff emureindex emureindex.ypm
288c288
< echo " Compacting database..."
---
> echo " Compacting database... `/bin/date`"
301c301
< echo " Reconfiguring database..."
---
> echo " Reconfiguring database... `/bin/date`"
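
The diff stamps the output of /bin/date onto two progress messages. A minimal sketch of turning two such stamped lines into an elapsed time follows. It assumes the default English `date` layout (for example "Tue Oct 14 01:00:00 EDT 2014") and that both stamps share one time zone; the sample lines and function names are invented for illustration and are not taken from a real log.

```python
from datetime import datetime


def parse_date_stamp(stamp):
    """Parse default `date` output, e.g. 'Tue Oct 14 01:00:00 EDT 2014'.

    The time-zone token is dropped, so both stamps must share a zone.
    """
    parts = stamp.split()
    if len(parts) != 6:
        raise ValueError(f"unexpected date stamp: {stamp!r}")
    _weekday, month, day, clock, _zone, year = parts
    return datetime.strptime(f"{month} {day} {clock} {year}", "%b %d %H:%M:%S %Y")


def stamp_from_line(line):
    """Return the text that follows the '...' in an echoed progress line."""
    _, _, stamp = line.partition("...")
    return stamp.strip()


def elapsed_hours(start_line, end_line):
    """Hours between the stamps on two progress lines."""
    start = parse_date_stamp(stamp_from_line(start_line))
    end = parse_date_stamp(stamp_from_line(end_line))
    return (end - start).total_seconds() / 3600.0


# Hypothetical log lines in the shape produced by the modified echo commands.
start_line = " Compacting database... Tue Oct 14 01:00:00 EDT 2014"
end_line = " Reconfiguring database... Tue Oct 14 02:24:00 EDT 2014"
print(f"{elapsed_hours(start_line, end_line):.1f} hours")
```

Dropping the zone token keeps the parsing simple, at the cost of a wrong answer if a daylight-saving change falls between the two stamps.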

112 Time to complete compact maintenance on all modules (hours)

113

114 Number of records (x) and disk occupancy of records (y) among 18 KE clients

115 156 million ~1 TB

117 553 million records, 2.7 TB! *
    * 434 million are eaudit and estatistics, “only” 119 million for all other modules combined
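
The footnote figures can be checked with a line of arithmetic: 434 million plus 119 million gives the 553 million total, and the eaudit and estatistics share follows directly. The sketch below uses only the numbers on the slide; the function name is made up for illustration.

```python
def share_percent(part, whole):
    """Share of part in whole, in percent."""
    return part / whole * 100.0


total_millions = 553        # all records at the largest client
audit_stats_millions = 434  # eaudit and estatistics
other_millions = 119        # all other modules combined

# The two parts add up to the stated total.
assert audit_stats_millions + other_millions == total_millions

share = share_percent(audit_stats_millions, total_millions)
print(f"eaudit + estatistics: {share:.1f}% of records")  # about 78.5%
```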

118 29 million ecatalogue, 320 gB Number of records (x) and disk occupancy of records (y) among 18 KE clients

119 Percent of records that are eaudit and estatistics among 18 KE clients

120 Number of records (x) and disk occupancy (y) for emultimedia among 18 KE clients: somewhat greater range of variability

121

122

123

124

125 Slashed by 82%; all EXIF / XMP metadata remains in image headers

126 Yale DAM infrastructure

132 ALT-TUD it Yale DAM infrastructure

134 Yale DAM infrastructure

   emultimedia
   EMu records:    32,252    142,350    295,921
   EMu disk use:   22 gB     83 gB      52 gB
   DAM disk use:   n.a.      125 gB     14,336 gB

135 Know thyself, and thine own EMu

136 Slash early, slash often

137 as has become traditional…

138 We saw this slide already, you say
    It’s a trio of hackers holding sway
    Out of Melbourne came a fightin’ (2)
    A text database known as Titan
    Which would morph into EMu one day

140 That brand EMu is used for many things
    Just Google it and see what that brings
    An assortment of oils and gels
    Practically anything that sells
    To calm dry skin, bad rashes, and stings

141 Peabody’s EMu morphs often on screen
    Through the years how many have you seen?
    Is Photoshopping like this a sign
    Of some malady unfortunately mine
    Has my daughter Jen inherited this gene?

142 In fact, I’d gotten it directly from Jim
    My late grandfather, who would spout it on a whim
    At family occasions when we did gather
    Or in longhand letters when he’d rather
    Write his brother-in-law from Omaha named Slim

143 In horror movies they slash, scream, and maul
    Everything in their paths, big and small
    Yet Freddy and Chucky don’t seem so gritty
    When adorned on a pooch or a kitty
    Maybe that’s worse – I can’t say, your call

144 That Swedish connection was definitely clear
    When John Doolan was bending our ear
    KE staff and Abba merged together (2)
    In white satin, boots and leather
    Just like these EMus of pop fame and endear

146

147 I/O, I/O, it’s off to Axiell we go
    To a new computing frontier
    And there’s nothing to fear
    So they say, hope it’s so,
    Hope it’s so, I dunno

148 In yonder eras Liza was a catch
    To her entourage young men would attach
    Oh, here’s another famous actor (2)
    A comedian, and no detractor
    Were I to say that these four are a match

150 Now here is a fanciful sight
    KE staff dressed in royal delight
    Evening will be beckoning soon (2)
    And will bring laughs, drinks, and a tune
    Let's party at the reception tonight

152 We've finally come to the end
    Of the doggerel, my fine feathered friend
    It was all I could do not to faint
    When revealed in body paint (2)
    Are Aussie EMus so gaudily penned

154

