Research data spring: Enabling Complex Analysis of Large Scale Digital Collections (27/2/2015)
Lots of money has been spent digitising heritage collections. Digitised heritage collections are data. But non-computationally trained scholars don't know what to ask of large quantities of data, and often they do not have access to high performance computing facilities. We aim to address this fundamental problem by extending research data management processes in order to enable novel research and a deeper understanding of emerging research needs.
Team (18/02/2015, Enabling Complex Analysis of Large Scale Digital Collections)
● James Baker, Curator, Digital Research
● Melissa Terras, Professor of Digital Humanities
● David Beavan, Senior Research Associate
● Martin Zaltz Austwick, Lecturer in Data Visualisation
Scope and Gap
The gap:
● Non-computationally trained scholars don't know what to ask of large quantities of digitised data.
● Large scale digitised collections are delivered in ad hoc forms.
● Exemplar workflows for analysis of large scale digitised collections are hard to find.
The scope:
● Deploy and index large scale British Library (BL) digitised collections at UCL Research IT Services (UCL RITS).
● Work with researchers to turn their research questions into computational analysis.
● Create and release derived data, queries, and visualisations (that demonstrate potential use) as citeable, CC-BY workflow packages.
Example research request: “I want to know all the sentences that mention European cities circa 1850 to 1900 in BL digitised texts and take away those results as a data set.”
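To make the example request concrete, here is a minimal Python sketch of how such a question might be turned into a computational query. It is an illustration only, not the project's workflow: the `Book` record, the short `CITIES` list, and the naive sentence splitter are all assumptions made for the example, and a real analysis would need a proper gazetteer, OCR-aware tokenisation, and the BL's actual metadata format.

```python
import csv
import io
import re
from dataclasses import dataclass


@dataclass
class Book:
    """Hypothetical record for one digitised book (not the BL's real schema)."""
    title: str
    year: int
    text: str


# Assumed, deliberately tiny gazetteer; a real one would be far larger.
CITIES = ("Paris", "Vienna", "Berlin")

# Naive sentence boundary: whitespace that follows ., ! or ?
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_mentioning_cities(books, cities=CITIES, start=1850, end=1900):
    """Return one row per sentence that names a city, for books in the date range."""
    pattern = re.compile(r"\b(" + "|".join(re.escape(c) for c in cities) + r")\b")
    rows = []
    for book in books:
        if not start <= book.year <= end:
            continue  # outside the requested period
        for sentence in SENTENCE_END.split(book.text):
            found = sorted(set(pattern.findall(sentence)))
            if found:
                rows.append({
                    "title": book.title,
                    "year": book.year,
                    "cities": found,
                    "sentence": sentence.strip(),
                })
    return rows


def to_csv(rows):
    """Serialise the rows as CSV text so the results can be taken away as a data set."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["title", "year", "cities", "sentence"])
    writer.writeheader()
    for row in rows:
        writer.writerow({**row, "cities": ";".join(row["cities"])})
    return buffer.getvalue()
```

Calling `to_csv(sentences_mentioning_cities(books))` on a list of `Book` records would yield a small derived data set of the kind the request describes. Simple string matching like this will miss historical spellings and OCR errors, which is part of why such questions benefit from working alongside research computing staff.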
Impact and Benefits
● Outputs from phase one of the project would be used as case studies and exemplars to engage a wider community and reduce research inefficiency.
● The project will generate engagement with new scholarly communities around rich data resources.
● Narratives and workflows would be used in interdisciplinary teaching at host institutions (Melissa: MA/MSc Digital Humanities; Martin: BASc Arts and Science, MRes Advanced Spatial Analysis and Visualisation; James: BL Doctoral Training, MA History, University of Kent).
Sustainability
● Derived data, queries, documentation, and visualisations released as citeable, CC-BY workflow packages with DOIs (DataCite or Figshare).
● Workflow packages embedded in teaching and research training.
● Research computing communities beyond UCL deepen understanding of complex, poorly structured, and heterogeneous humanities data to enable process improvement.
● Through BL Labs, university teaching, and BAU outreach activities, narratives and lessons learned will have substantial life beyond the project.
Outputs, milestones and indicators of success
To month 3:
● Deploy 68k digitised books (circa 4bn words!) at UCL
● Identify 3+ early career researchers (2 in hand)
● Run multi-day pilot workshop in partnership with all parties, to work iteratively on data, workflow and research questions
● Output: workflow packages, derived data, visualisations to enable research insights
Social & technical barriers to analysis of large scale digitised collections are reduced
To month 7:
● Lead workshops and hackdays for the wider research community
● Deploy new BL datasets (based on researcher needs)
● Consolidate workflow packages and recipes
● Gather requirements for future infrastructure development (beyond scope of the project)
To month 13:
● Recruit data champions to drive wider adoption of methods
● Support community led workshops focussed on specific domain needs and challenges
● Create cookbook from recipes
Funding
To month 3:
● UCL RITS Development: £5,500
● Materials Development, Management and Administration: £10,025
● Delivery of pilot workshops: £4,100
Total, full economic cost: £19,625