GRAHAM GREENLEAF & PHILIP CHUNG UNSW FACULTY OF LAW & CO-DIRECTORS, AUSTLII JURISIN WORKSHOP, MIYAZAKI, JAPAN NOVEMBER 2012 1 Policies and technologies.

Slides:



Advertisements
Similar presentations
Full free access to law: Global policy aspects Professor Graham Greenleaf Graham GreenleafGraham Greenleaf Co-Director, AustLII & WorldLII 6th Law via.
Advertisements

Open Access Niamh Brennan Trinity College Dublin DRIVER Summit, Goettingen, January 17th 2008 Local Integration, National Federation TCD-RSS, TARA, IReL-Open,
History Study Center Primary and secondary sources documenting global history 2010.
28 March 2003e-MapScholar: content management system The e-MapScholar Content Management System (CMS) David Medyckyj-Scott Project Director.
1 Answer to the Questions and Comments on the Services of the National Diet Library NCC 2007 Open Meeting Friday, March 23, 2007 Nobuya AIHARA Reader Service.
DSpace: the MIT Libraries Institutional Repository MacKenzie Smith, MIT EDUCAUSE 2003, November 5 th Copyright MacKenzie Smith, This work is the.
1. The Digital Library Challenge The Hybrid Library Today’s information resources collections are “hybrid” Combinations of - paper and digital format.
Sourcebook for Hydrogen Applications Expansion & Dissemination Jeffrey Serfass General Manager Partnership for Advancing the Transition to Hydrogen APEC.
Sharing Enterprise Data Data administration Data administration Data downloading Data downloading Data warehousing Data warehousing.
Features and Uses of a Multilingual Full-Text Electronic Theses and Dissertations (ETDs) System Yin Zhang Kent State University Kyiho Lee, Bumjong You.
IAEA International Atomic Energy Agency ICSTI 2013 Annual Members’ Meeting March 2013.
Engineering Village ™ ® Basic Searching On Compendex ®
Serving up Statistics to an International Community IASSIST Conference Brian Buffett May 2003.
Depositing and Disseminating Digital Resources Alan Morrison Collections Manager AHDS Subject Centre for Literature, Linguistics and Languages.
Introduction to Implementing an Institutional Repository Delivered to Technical Services Staff Dr. John Archer Library University of Regina September 21,
Institutional Repositories Tools for scholarship Mary Westell University of Calgary AMTEC Conference May 26, 2005.
1 CS 502: Computing Methods for Digital Libraries Lecture 27 Preservation.
Advanced Legal Research - Refugee Law Legal Research Skills Adviser August 2013 Office of Teaching & Learning Melbourne Law School.
Lecture-8/ T. Nouf Almujally
New Delhi, 19/12/07 Legal Information Institutes: What do they offer India and the SAARC region? Graham Greenleaf, Professor of Law, University of New.
WHAT DOES THE ACADEMIC LIBRARY COMMUNITY WANT FROM CROSSREF? James G. Neal CrossRef Annual Member Meeting 25 September 2002.
HTML Comprehensive Concepts and Techniques Intro Project Introduction to HTML.
Management, marketing and population of repositories Morag Greig, University of Glasgow.
Chinese-European Workshop on Digital Preservation, Beijing July 14 – Network of Expertise in Digital Preservation 1 Trusted Digital Repositories,
Educational Research Theses : Online Communities and Partnerships Sue Clarke Manager, Cunningham Library, ACER ETD2005: Evolution through discovery 28.
South African Education Portal
Legal Information Institutes: What do they offer the SAARC region and regional trade? Graham Greenleaf, Professor of Law, University of New South Wales,
AsianLII and other Legal Information Institutes in the Asia-Pacific: Assisting Courts and open justice Graham Greenleaf, Professor of Law, University of.
 To explain the importance of software configuration management (CM)  To describe key CM activities namely CM planning, change management, version management.
Presentation Outline What is a wiki? How does wiki work? Choosing a Wiki plan The educational benefits of a Wiki Wikis in higHeR eDucation Plans and Pricing.
WorldLII’s International Courts & Tribunals Project Responding to the fragmentation of international law Presentation by Philip Chung & Andrew Mowbray.
Other WorldLIIs - A modestly decentralised proposal for global free access Graham Greenleaf Co-Director AustLII / WorldLII / HKLII.
OpenURL Link Resolvers 101
THE ROAD TO OPEN ACCESS A guide to the implementation of the Berlin Declaration Frederick J. Friend OSI Open Access Advocate JISC Consultant Honorary Director.
Max Planck Encyclopedia of Public International Law Edited by Rudiger Wolfrum, Director of the Max Planck Institute for Comparative Public Law and International.
A.ABDULLAEV, Director of the Public Fund for Support and Development of Print Media and Information Agencies of Uzbekistan.
WorldLII: A new home for Commonwealth law online Graham Greenleaf, Philip Chung, and Andrew Mowbray 13th Commonwealth Law Conference Melbourne, April 2003.
Guide: DR. R. BALASUBRAMANI Assistant Professor Department of LIS Bharathidasan University Tiruchirappalli.
Click on the tab to find journals by Subjects. From the drop down menu, we will select Parasitology and Parasitic Diseases.
Legal Information Institutes: Creating law's global commons Unlocking IP Session 3 Graham Greenleaf University of New South Wales Co-Director, AustLII.
CBSOR,Indian Statistical Institute 30th March 07, ISI,Kokata 1 Digital Repository support for Consortium Dr. Devika P. Madalli Documentation Research &
European Commission on Preservation and Access Preservation of digital heritage Yola de Lusenet Lisbon, November
The International Privacy Law Library Graham Greenleaf Professor of Law & Information Systems, UNSW Co-Director, AustLII GPEN open Seminar, International.
1 ARRO: Anglia Ruskin Research Online Making submissions: Benefits and Process.
New approaches to expanding public rights in copyright Graham Greenleaf Professor of Law, UNSW Co-Director, Baker & McKenzie Cyberspace Law and Policy.
Free Access to Asian law via the Asian Legal Information Institute (AsianLII) Graham Greenleaf UNSW Faculty of Law Co-Director, AustLII & AsianLII
Getting Started with SharePoint 2010 Gareth Johns IT Skills Development Advisor.
OWL Representing Information Using the Web Ontology Language.
| 1 Open Access Advancing Text and Data Mining Libraries & Publishers working together to support Researchers What is Text Mining?
Web Information Retrieval Prof. Alessandro Agostini 1 Context in Web Search Steve Lawrence Speaker: Antonella Delmestri IEEE Data Engineering Bulletin.
National Library of the Czech Republic Integration of digital materials into EDL Adolf Knoll National Library of the Czech Republic Helsinki CENL Workshop.
1/16/2016I. Revels Digital Imaging Workshop 1 Selection Considerations For Digital Imaging Projects.
The library is open Digital Assets Management & Institutional Repository Russian-IUG November 2015 Tomsk, Russia Nabil Saadallah Manager Business.
From Access to Archive Transforming Scholars Portal into an E-Journal Archive.
To find journals by language of publication, click on the Languages bar in the horizontal frame. The Languages drop down menu appear and we will choose.
The Future of Scholarly Communication & the Role of Libraries Roy Tennant eScholarship, The California Digital Library.
Modernizing Official Statistics 1. PURPOSE OF OFFICIAL STATISTICS The purpose of official statistics is to produce and disseminate authoritative results.
Using Content Presented by Karen Andrews Physical Sciences & Engineering Librarian, U.C. Davis Tuesday, September 13, :30-9:30 ASIDIC Fall 2005 Meeting.
Prof. Emily Ryan PA 101.  Primary sources are actual statements of the law.  Enormous amounts of primary source materials available are issued chronologically.
The overall classification of this briefing is UNCLASSIFIED US Africa Command Africa Partner Country Network Overview Mr. Jordan Pritchard AFRICOM KM.
Building Preservation Environments with Data Grid Technology Reagan W. Moore Presenter: Praveen Namburi.
Developing a Dark Archive for OJS Journals Yu-Hung Lin, Metadata Librarian for Continuing Resources, Scholarship and Data Rutgers University 1 10/7/2015.
Online Information and Education Conference 2004, Bangkok Dr. Britta Woldering, German National Library Metadata development in The European Library.
Launch of AsianLII’s Myanmar/Burma legal databases Launch hosted by AustLII and by the Australia/Myanmar Constitutional Development Project.
Chapter 1 Introduction to HTML
Accessibility of Judicial Decisions on the Internet –
The International Privacy Law Library
Introduction of KNS55 Platform
Unit# 5: Internet and Worldwide Web
Intro Project Introduction to HTML.
Presentation transcript:

GRAHAM GREENLEAF & PHILIP CHUNG UNSW FACULTY OF LAW & CO-DIRECTORS, AUSTLII JURISIN WORKSHOP, MIYAZAKI, JAPAN NOVEMBER Policies and technologies in development of free access to legal information AustLII’s experience,

20 years ago 2 General environment  Privatization/commercialisation of government assets/services  Mosaic graphical browser (93): large-scale web uptake Legal informatics  IAAIL had started in 1987 (‘expert systems’ boom)  1 st generation online commercial legal retrieval systems  In Australia, up to $720/hour and almost no users  Brief success of law on CD-ROMs  No free access to legal information anywhere  NZ sold the only digital copy of its statutes

What would you have predicted about the web and access to law? 3 There was no guarantee that the web would deliver legal information for free access There were many interests (government and commercial) opposed to anything being free A few developments anticipated free content  Open source software: Apache, Perl etc  Mosaic, Netscape were free but not open source

AustLII’s origins 4 10 years experience in legal expert systems with own software (Mowbray) and experience in large-scale hypertexts and databases Frustration in not being able to obtain Australian cases and legislation for teaching or research Two host Law schools (UNSW and UTS) valued as academic success (i) community service (ii) ‘social justice’ and (iii) novel forms of publication Both government and commercial legal publishers were unprepared for the web  ‘first mover advantage’ 1995: We obtained a $100,000 academic grant

AustLII does research to build ‘research infrastructure’ 5

AustLII in 2012: Snapshot 6 Completely free and anonymous access to all 530 Australian & NZ databases  New one every 2 weeks since 1995 Over 600,000 page accesses per day  Dominant online provider in Australasia Million searchable & interlinked documents Small staff of about 10 people (equivalent full-time) Go8/ATN (15 U) study of ‘research impact’ over last 10 years found AustLII had greatest ‘social impact’ AustLII does research to build ‘research infrastructure’

7

AustLII international projects 8 Since 2000, AustLII has been asked to help built new LIIs internationally, including: NameScopeFrom BAILIIUK & Ireland2000 PacLIIPacific Islands (17 countries)2001 HKLIIHong Kong2002 NZLIINew Zealand2005 SAFLIIS & E Africa (10 countries)2006 LII of IndiaIndia (36 jurisdictions)2010 LiberLIILiberia2011

AustLII operates 3 international portals in cooperation with 12 other LIIs 9 Asian Legal Information Institute (AsianLII) 300 Databases from 28 Asian countries Commonwealth Legal Information Institute (CommonLII) 905 Databases from 56 common law countries World Legal Information Institute (WorldLII) 1255 Databases from all 15 collaborating LIIs

[A] 10 Key policy elements in the development of AustLII Insistence on the right to republish (above all else) 2. Differentiation of ‘value adding’ was rejected 3. Automation for sustainability 4. Collaborations with data sources 5. An expansive view of content 6. Serve all audiences 7. Independence 8. The platform comes first 9. Help, and collaborate with, other LIIs 10. A publisher, not a repository

1Insistence on the right to republish 11 Core policy of free access : the right to republish any ‘public legal information’  Stressed by AustLII from 1995 onward (over) In Australia, this was our political demand  Argued with governments for 5 years to free the law  ‘Right to republish’ was by 2000 accepted by all official legal sources

AustLII’s ‘obligations of official sources’ (1995) AustLII (1995) advocated 6 obligations of official legal data sources, as necessary for ‘full free access’: 1. Provision in a completed form, including additional information best provided at source (eg consolidation) 2. Provision in an authoritative form, including citations 3. Provision in the form best facilitating dissemination 4. Provision to any 3rd-P republisher on a marginal-cost-basis 5. Provision with no re-use restrictions or licence fees 6. Preservation of a copy by the public authority Main point: Source self-publication is only useful (more choice), not essential. Right of republication is essential. ‘Free access’ to law is closer to ‘free speech’ than ‘free beer’ 12

The right to republish, internationally 13 Overseas, we observe and utilise local law. Most countries exempt legal sources from copyright Example: Japanese Copyright Law A13 Exempts constitution, laws and regulations; notifications etc; translations and compilations thereof by state, local or independent administrative organs. Result is that we can include official translations of Japanese law in the databases on AsianLII

2No ‘value adding’ differentiation 14 We rejected any distinction between ‘basic’ and ‘value-added’ versions of legal sources 1. Today’s value adding is tomorrow’s commonplace  It is a meaningless distinction 2. It would be an excuse for public bodies to withhold the best versions of public legal information  Attempts to only give AustLII PDFs (not RTFs) or ‘old’ data have been resisted and defeated 3. It could create a conflict of interests within AustLII between what would be free and ‘AustLII+’ You can’t be a little bit free or a little bit pregnant

3. Automation for sustainability 15 No significant editorial intervention in texts is possible  ‘Free access’ imposes strict financial disciplines All AustLII database creation is very largely automated  Example: automatic conversion of case-law streams into marked up databases with hypertext links  Philip will give details Any projects/systems we develop have to be capable of immediate large scale deployment  We don’t have the capacity to do experimental ‘pure’ research for its own sake

4Collaborations with data sources 16 We must accept all data available and deal with it AustLII sought and developed (often over 15 years or more) close policy and technical collaboration with all its data sources  Over 200 Courts & Tribunals decisions to AustLII in (relatively) consistent formats  10 legislative offices consult on structured data formats, but only moderate consistency  BUT insisting on the right to republish comes first This cooperation makes automation possible

5An expansive view of content 17 From the outset, AustLII tried to include all forms of ‘public legal information’ ‘Five pillars of free access content’ emerged  legislation, case law, treaties, (non-commercial) legal scholarship and law reform reports. Interconnecting these different legal sources in every way possible was always a main goal  All documents (except Acts) have consistent citations imposed by AustLII – this has made interconnections possible AustLII’s aim is now comprehensive coverage of Australasian law, both horizontally (all sources) and vertically (historically)

‘Vertical’ comprehensiveness: ‘Colonial Legal History Library’ 18

2.275 Million searchable documents by 4 main content types 19

145M page accesses in 2012 by 4 main content types 20

Demonstration search over AustLII 21 Search for materials on imprisonment for debt  debt* w/10 (prison or imprisonment) debt* w/10 (prison or imprisonment) Some features of the search results  Finds all types of content except treaties  Conventional relevance ranking (inverse document frequency etc)  Finds materials from all 10 jurisdictions  Finds materials ranging from 26/11/12 to 1796  Hypertext links between all types of materials  Citator shows history of citation of cases

6Serve all audiences 22 Aim to serve all audiences for legal information  Eg decisions of many small tribunals  Approx. 10% (est.)of access is from the general community  Of identified users, 45% commercial sector, 27% education, 27% government, and 1% community. But most are not identifiable. Commercial appeal (including to the legal profession) is important but secondary Some audiences come to AustLII via linkages from commercial publishers’ systems  These publishers then become funding sources Specialist ‘Libraries’ for special audiences  Eg Aviation; Taxation; Health Practice; Indigenous law

Commercial sector = 45% of identifiable users 23

7Independence 24 Independence requires:  Host institution support against external pressure  Obligations on data sources to provide data  Financial independence Reliance on only one source of funding risks sacrificing independence AustLII now has over 300 regular contributors (‘multi-stakeholder’), plus grant support  In 2012 $2M: $1,080K contributions, $976K grants  AustLII has achieved independent sustainability (only)

A$1,080,000 contributions to AustLII in 2012, by sector 25 Education sector also contributed A$400,000 to competitive grant income

$976,000 competitive grants to AustLII (2012) 26

8The platform comes first 27 Individual research projects are not primarily important for their own sake  Different from most academic researchers They are important for the contribution they make to the long-term development of AustLII  or another LII such as WorldLII, AsianLII etc AustLII is an applied research Centre  AustLII’s research is almost all applied research  Its purpose is to improve the systems we operate  ‘research to improve research infrastructure’

9Help, and collaborate with, other LIIs 28 Global free access to law involves international reciprocity  it is vital to give technical and other assistance to other LIIs.

20 LIIs at the ‘Law via Internet’ annual meeting and Conference, Hong Kong The Free Access to Law Movement now has 45 members from all continents

AustLII collaborates with twelve other LIIs to provide the portals AsianLII, CommonLII and WorldLII 30

10A publisher, not a repository 31 AustLII protects its assets, so as to protect the sustainability of free access We limit search engines spidering & searching  Google etc cannot search any AustLII case law  This protects privacy (Australian policy of all Courts)  This also prevents loss of our valuable asset – aggregation We do not allow anyone to republish the data we have aggregated – AustLII is not a repository  They must go and collect it themselves from its sources ‘Free access’ is not the same as ‘open content’

[B] AustLII’s technology policies 1. Large scale automated hypertext 2. A search engine that scales up 3. Reliance on file systems, static addresses and open source software 4. An automated, international, citator 5. Replication & backup of other collaborating LIIs 6. Simultaneous searching and ranking of texts in multiple languages 7. Point-in-time legislation 8. Subject-specific libraries 32

1. Large scale automated hypertext automated insertion of over 40 million hypertext links using contextual sparse natural language parsing  context: type of document, jurisdiction, part within document links to legislation – whole Acts as well as sections of Acts (both internal and external), links to definitions links to cases links to journal articles links to treaties 33

Example: [2011] QCA

2. A search engine that scales up Open source, free-text search engine Speed, flexibility, portability and reliability  build performance: 20GB per hour on commodity hardware  search performance: Single word searches return in under seconds ‘Size is no object’ – trade-off between disk space and speed of retrieval  concordance ratio: 55% approx – relatively large but concordance is easy to read and minimises unnecessary file input/output Used by many LIIs from around the world: BAILII, PacLII, SAFLII, LIIofIndia, HKLII, NZLII, LiberLII, CyLaw, SamLII 35

Autosearch Using heuristics to detect search types automatically Recognises searches for any words Recognises connectors: and, or, near, w/n Searching for Acts and sections  section 14 Privacy Act 1988 Searching for cases (case name or citation):  Mabo v Others  [1995] HCA 23 Searching for references to standards Customisable to other search types 36

Search for Acts and sections 37

‘Noteup’ integration hypertext and information retrieval an automatic search for cases, legislation, and secondary materials that refer to a particular Act, section of an Act, or case basically, performs a ‘reverse hypertext lookup’ every section in an Act and case has a ‘Noteup’ link 38

Noteup example: s13 Privacy Act 39

Noteup example (2) 40

3. Reliance on file systems etc for speed of access, pre-generate files use the file system as the ‘database’ static and predictable web addresses  [2012] HCA 18 -> /au/cases/cth/HCA/2012/18.html use of open source where possible: Apache, Perl, gcc etc 41

4. An automated, international citator use of “data mining” techniques: Automatic recognition of case names and citations use WorldLII and other data sources major problem: mined citations may be incorrect or ambiguous (particularly in an international context) conflict identification and correction facilities were developed to deal with this issue these facilities use information such as the country, jurisdiction, court and type of citation and rely on heuristic (statistical) techniques 42

Unmining process 43

Handling ambiguity 44

XML citation database example 45

LawCite citator LawCite contains 4.2M case / journal article citation entries LawCite Free-access citator with global coverage Automatically generated and maintained Fully integrated with AustLII – ‘By Citation Frequency’ Markup tool 46

LawCite example 47

5. Replication & backup 48

6. Simultaneous searching in multiple languages SINO was developed initially for English and has been extended to other western languages extending SINO to handle UTF-8 SINO’s u16a representation  Any non-ASCII UTF-8 character (eg Chinese, Korean) can be converted into an alpha-numeric (flat) representation  Hexadecimal form – 0 to 9 and A to F  Resulting form may be confused with numeric words in western languages  ‘ 春 ’ is ‘6625’ in hexadecimal form 49

SINO’s u16a representation (2) The characters ‘u16a’ are added to any such representation to create a unique string  ‘u16a’ is rare to non-existent in natural language These u16a ‘shadow files’ are then used for SINO to search (as a proxy for the original)  text in the original language is presented to the user Example: bankrupt* or insolven* or การล้มละลาย or kepailitan or pailit or 破產 or 破产 or Phá s ả n 50

Example: English, Thai, Chinese, Indonesian 51

7. Point-in-time legislation ability to consider what a piece of legislation stated at some point in the past and to navigate legislation historically multi-jurisdictional in nature automated approach variety of data formats – some are XML documents; others are just Word documents quality of data varies between jurisdictions challenge: uniform and consistent interface 52

Point-in-time legislation - Versions 53

Point-in-time legislation – compare versions 54

Example 2: Lower quality data 55

Compare versions 56

8. Subject-specific libraries Issue: How can LIIs create subject-oriented resources economically? AustLII’s method of creating subject-oriented ‘Libraries’ by semi-automated means 1. Include all relevant specialist databases 2. Editor develops search of remaining DBs for subject, and decides what percentage of results are sufficiently relevant 3. Concordance created of that % of results => virtual database for inclusion in Library. 4. Search re-run, with same % of results used for concordance, for automatic updates of virtual DB 57

Indigenous Law Library 58

Likely key future technologies Using legal dictionaries to automate cross-language searching Searching without users anticipating text occurrence  using citation patterns to find similar documents  using search logs to create ‘best’ search Crowd-sourcing, suitable for an authoritative law site 59

Thank you – Questions? 60