My Website Was Lost, But Now It’s Found Frank McCown CS 110 – Intro to Computer Science April 23, 2007.


1 My Website Was Lost, But Now It’s Found Frank McCown CS 110 – Intro to Computer Science April 23, 2007

2 A Little About Me
  1989-92 – Smoky Hill High School
  1992-96 – B.S. in CS from Harding University
  1996-97 – Software Engineer at Lockheed Martin
  1997-2004 – Instructor of CS at Harding Univ.
  2002 – M.S. in CS from Univ. of Arkansas at Little Rock
  2004-present – CS Ph.D. student working with Michael Nelson
  Fall 2007 – Assoc. Professor of CS at Harding Univ.

3

4 Frank McCown
  Education
    Ph.D. in Computer Science – Old Dominion Univ. (2007 expected)
    M.S. in Computer Science – Univ. of Arkansas at Little Rock (2002)
    B.S. in Computer Science – Harding University (1996)
  Work Experience
    1997-2004 – Instructor of CS at Harding University (Searcy, AR)
    1996-1997 – Software Engineer for Lockheed Martin (Denver, CO)
    1995 – Software Engineer Intern for Auto-trol (Denver, CO)
  Honors
    2007 – Outstanding Graduate Research Assistant
    2006 – College of Sciences Dissertation Fellowship
    2005 – Outstanding Graduate Assistant
    2004 – Dominion Scholar

5

6 Industry vs. Academia
  A 2000 survey by The Scientist magazine asked its readers: "Overall, which environment do you prefer?"
  [Chart; categories: Academia, Industry, No preference]
  73% of survey respondents had held research positions in industry and academia.
  http://www.the-scientist.com/2001/4/16/28/2/

7 Industry vs. Academia
  Movement
    Academia → Industry is common
    Industry → Academia is very uncommon
  Flexibility
  Schedule
  Focus
  Compensation

8 Research Interests
  Digital preservation
    Will we be able to see our websites 20 years from now?
  Web crawling
    How can search engines and web archives duplicate/download our websites more efficiently and effectively?
  Search engines
    How much content, and what content, do commercial search engines index and cache?
    How synchronized are search engine APIs with what the general user sees?

9
  Black hat: http://img.webpronews.com/securitypronews/110705blackhat.jpg
  Virus image: http://polarboing.com/images/topics/misc/story.computer.virus_1137794805.jpg
  Hard drive: http://www.datarecoveryspecialist.com/images/head-crash-2.jpg

10 How much of the Web is indexed? Estimates from “The Indexable Web is More than 11.5 billion pages” by Gulli and Signorini (WWW’05)

11 Web Infrastructure

12

13

14 Cached Image

15
  First developed in fall of 2005
  Available for download at http://www.cs.odu.edu/~fmccown/warrick/
  www2006.org – first lost website reconstructed (Nov 2005)
  DCkickball.org – first website someone else reconstructed without our help (late Jan 2006)
  www.iclnet.org – first website we reconstructed for someone else (mid Mar 2006)
  Internet Archive officially endorses Warrick (mid Mar 2006)

16 Warrick-related Publications
  Frank McCown, Norou Diawara, and Michael L. Nelson. Factors Affecting Website Reconstruction from the Web Infrastructure. JCDL 2007. June 2007. Vancouver, British Columbia, Canada.
  Catherine C. Marshall, Frank McCown, and Michael L. Nelson. Evaluating Personal Archiving Strategies for Internet-based Information. IS&T Archiving 2007. May 2007. Arlington, Virginia, USA.
  Frank McCown and Michael L. Nelson. Characterization of Search Engine Caches. IS&T Archiving 2007. May 2007. Arlington, Virginia, USA.
  Frank McCown, Joan A. Smith, Michael L. Nelson, and Johan Bollen. Lazy Preservation: Reconstructing Websites by Crawling the Crawlers. WIDM 2006. November 2006. Arlington, Virginia, USA.
  Frank McCown and Michael L. Nelson. Evaluation of Crawling Policies for a Web-Repository Crawler. HYPERTEXT 2006. August 2006. Odense, Denmark.

17 Search Engine APIs
  Frank McCown and Michael L. Nelson. Poster: Search Engines and Their Public Interfaces: Which APIs are the Most Synchronized? WWW 2007.
  Frank McCown and Michael L. Nelson. Agreeing to Disagree: Search Engines and their Public Interfaces. JCDL 2007.

18 Thank You Questions?

