IS IT REALLY THAT BAD? Verifying the extent of full-text linking problems Karen R. Harker, MLS, MPH Collection Assessment Librarian UNT Libraries.

1 IS IT REALLY THAT BAD? Verifying the extent of full-text linking problems Karen R. Harker, MLS, MPH Collection Assessment Librarian UNT Libraries

2 I find searching for journal articles and actually finding full articles to be very difficult. Sometimes it just links to another site and then to a fragment of an article. It's hard to find links to online articles; some links say they can't find it, but sometimes you can still locate it and I don't know why the main "find links" page doesn't bring it up. Sometimes, if a link is not provided to an article in the search results, it is very difficult to find. The article linker often comes up with no results even though it says the article is in UNT's collection. Frustrated with "Find Full-Text" when "it doesn't work; if it's not 100% perfect, there is really no point in offering the service."

3

4

5 What have you done?

6 Link-Checking: RANDOM Selection Cannot Be Predicted

7 What is required? Intermediate to advanced Excel (but NOT programming)

8 Knowledge About your collection About the problem

9 Clear questions

10 Think about... your problem your collection your link resolver your people

11 Brainstorm

12 Come away with…

13 What Kind of Research Questions? Just a few Be Specific Compared to what? What is important to you?

14 Such as… Are links from EBSCO more successful than links from Ovid? Is full-text linking better or worse compared to last year? Is the Serials Solutions 360 link resolver more likely to get to the full text from our key resources than EBSCO's? What is the chance that a client will get the full text of an article on the first click?

15 Start with the Results

16 Comparing One Source With Another
Sources   Full-Text   No Full-Text   Total
EBSCO     90          10             100
Ovid      85          15             100
Totals    175         25             200
Confidence Level: 0.90
Chi-Square Test: 0.161
Target Chi-Square: 1.00 - 0.90 = 0.10
Is Chi-Square test < Target? No
Is Ovid Significantly Different from EBSCO? No

17 Comparing One Source With the Average
Sources   Full-Text   No Full-Text   Total
EBSCO     90          10             100
Average   75          25             100
Totals    165         35             200
Confidence Level: 0.90
Chi-Square Test: 0.001
Target Chi-Square: 1.00 - 0.90 = 0.10
Is Chi-Square test < Target? Yes
Is EBSCO Significantly Different from the Average? Yes

18 Comparing One Target With The Expected or Ideal Rate
Targets                  Full-Text   No Full-Text   Total
EBSCO                    90          10             100
Expected or Ideal Rate   95          5              100
Totals                   185         15             200
Confidence Level: 0.90
Chi-Square Test: 0.022
Target Chi-Square: 1.00 - 0.90 = 0.10
Is Chi-Square test < Target? Yes
Is EBSCO Significantly Different from Ideal Rate? Yes

19 Random Sampling Review or background, depending on your viewpoint

20 Sampling Terms: Universe, Sampling Population, Sampling Frame, Sample. All Citations; Citations in databases to which we have access; to articles to which we have full-text access; Only Journal Articles.

21 Sample: All Citations; Citations to articles to which we have full-text access; Only Journal Articles.

22 Selection Methods. Non-probability (convenience sampling): the chance of being selected is not known. Probability: the probability of any one citation being selected is known.

23 Simple Random Sampling: Every citation that meets the criteria has an equal chance of being selected. See Demo. NOTE: Articles vary greatly by source, target and year.
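As a rough illustration of simple random sampling outside the demo, the following Python sketch draws citations without replacement so that each one in the frame has the same chance of selection. The citation IDs, the frame size, and the function name are hypothetical and not part of the presentation.

```python
import random

def simple_random_sample(citations, k, seed=None):
    """Draw k citations without replacement; every citation is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(citations, k)

# Hypothetical sampling frame of 500 citation IDs (not from the presentation).
frame = [f"citation-{i}" for i in range(1, 501)]
sample = simple_random_sample(frame, k=20, seed=42)
```

Passing a seed is optional; it only helps if the same draw needs to be reproduced later.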

24 Stratified Sampling: Every citation in discrete homogeneous groups has an equal chance of being chosen. Try Demo again… Useful to zero in on a possible problem. Stratify by source, target & year, but would be time consuming.
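A minimal sketch of stratified sampling, assuming each citation can be labelled with its stratum: citations are grouped first, then the same number is drawn at random from every group. The toy citations, the choice of source and year as strata, and the function name are assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(citations, stratum_of, per_stratum, seed=None):
    """Group citations into strata, then draw per_stratum from each group."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for citation in citations:
        strata[stratum_of(citation)].append(citation)
    picked = {}
    for name, members in strata.items():
        # Never ask for more citations than the stratum holds.
        picked[name] = rng.sample(members, min(per_stratum, len(members)))
    return picked

# Hypothetical citations as (source, year, id) tuples.
citations = [(src, year, n)
             for src in ("EBSCO", "Ovid")
             for year in (2010, 2011)
             for n in range(50)]
picked = stratified_sample(citations, lambda c: (c[0], c[1]), per_stratum=5, seed=1)
```

Here the strata are source and year only; adding target as a third key would multiply the number of groups, which is the time cost the slide warns about.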

25 Sampling Population Samples Selected from Each Stratum Strata

26 Cluster Sampling: When the sampling population naturally clusters (e.g. source and targets). The way they cluster doesn't affect your outcome. Divide population into these clusters. Randomly select the clusters to be a part of the sampling frame. Randomly select sample from selected clusters. Useful for very large populations.

27 Sampling Population Samples Selected from Selected Clusters Clusters

28 This Methodology Simple randomized cluster 1. Select a sample of ejournals (clusters) 2. Search each database for articles 3. Randomly select a citation (sample) 4. Test and record results Most useful for questions that are focused on the sources.

29 Other Questions, Other Designs Comparing link-resolvers: Matched-pair 1. Select a sample of ejournals 2. Using one of the link resolvers, search the source for articles in these ejournals. 3. Randomly select a citation 4. Test and record results 5. For the next link resolver, search each source for the same citation (the matched pair). 6. Test and record results.

30 Other Questions, Other Designs. For problems related to environment (browsers, location of user, etc.): Use the same method as the link-resolver, only change the browser or location. For problems related to targets: Use the same method, but randomly select ejournals from each target.

31

32 Using Excel to Help You Along Practical Applications

33 Before we begin… For those with Laptops, download files: Excel file: http://digital.library.unt.edu/ark:/67531/metadc96818/ PDF of Steps: http://digital.library.unt.edu/ark:/67531/metadc96827/ Or, just follow along…

34 I need to check how many? May be fewer than you'd think. Need to know: Sampling strategy; Kind of analysis; Expected rate; Chance of a title being indexed in the source; Number of databases or sources to examine. Educated Guess.

35

36 Selecting the Journal Titles (Clusters) 1. Download your ejournals list (you may want to limit it to only those used recently). 2. Randomly select the correct number of journals based on sample size. 3. Randomly assign each title to the databases or sources you will be searching. Excel Tricks: Remove Duplicates; Fill Cell: assigns a new ID number; Sampling method in Data Analysis: randomly selects IDs from your list; VLOOKUP: gets the titles for the selected IDs; RANDBETWEEN: randomly assigns each title to a source to be searched.
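For readers who would rather script this step, the Excel tricks above could be mirrored roughly as follows. This is only a sketch: the journal titles and the function name are made up, and `rng.sample` draws without replacement, which is a choice made here rather than a description of what the workbook does.

```python
import random

def select_and_assign(titles, sample_size, sources, seed=None):
    """Deduplicate titles, number them, sample IDs, and assign each a source."""
    rng = random.Random(seed)
    unique_titles = list(dict.fromkeys(titles))    # like Remove Duplicates
    ids = range(1, len(unique_titles) + 1)         # like filling in ID numbers
    chosen_ids = rng.sample(ids, sample_size)      # like the Sampling tool
    title_by_id = dict(zip(ids, unique_titles))    # like VLOOKUP
    # Like RANDBETWEEN: each chosen title gets one randomly chosen source.
    return [(i, title_by_id[i], rng.choice(sources)) for i in chosen_ids]

# Hypothetical ejournal list containing one duplicate.
titles = ["Journal A", "Journal B", "Journal A", "Journal C", "Journal D"]
plan = select_and_assign(titles, sample_size=3, sources=["EBSCO", "Ovid"], seed=7)
```

Each tuple in `plan` is (ID, title, source to search), which is the shape of the worklist the later slides step through.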

37 Search Source → If found: Random Number → Select Citation → Test Citation → Record Result → Next Title. If not found: Next Title.

38 Test the Sources 1. Login to the database 2. Search for articles in the first journal 3. If none are found, note this in your results and skip to next title. 4. If some are found, note the total number of articles.

39 Test Sources. If articles are found: 1. Sort the list by author last name (if possible) 2. Note the total number of articles found 3. Enter this number in the Sample Size Calculator worksheet (Random Article Selector) 4. Note the "Select this article" number 5. In the database, navigate to this article 6. Click on the Find Full-Text button
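The random article selection in steps 3 and 4 can be sketched in Python as below. How the worksheet actually computes its number is not shown in the slides, so this assumes it picks a whole number between 1 and the total found, with every position equally likely; the function name is hypothetical.

```python
import random

def pick_article_number(total_found, rng=random):
    """Return a random position from 1 to total_found, inclusive."""
    if total_found < 1:
        raise ValueError("no articles found; record this and skip to the next title")
    return rng.randint(1, total_found)

# Example: the database returned 137 articles for this journal.
position = pick_article_number(137, rng=random.Random(3))
```

The returned position is the article to navigate to in the sorted results list.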

40 Full-Text PDF!

41 Test Sources Note the success of that link in your tally sheet Rinse & repeat: For each journal in the list For each database to be tested

42 Search Source → If found: Random Number → Select Citation → Test Citation → Record Result → Next Title. If not found: Next Title.

43 Tips & Tricks: Search by ISSN, if possible. Display the most citations per page. If full-text article is in that database, skip title. This is a non-response, which lowers the Response Rate and raises the Sample Size.

44 Summarize the data: Count up all results in each source. Create ratios for each result (e.g. Full-text ratio): # Full-Text / # Titles Found. Average the ratios.
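The summary step can be sketched in a few lines of Python. The counts used here are the EBSCO, Ovid and ProQuest figures that appear on slide 49; the dictionary layout and the function name are assumptions made for illustration.

```python
from statistics import mean

def full_text_ratios(counts):
    """Map each source to (# Full-Text) / (# Titles Found)."""
    return {source: full_text / found
            for source, (full_text, found) in counts.items()
            if found}  # skip sources where no titles were found

# (full-text count, titles found) per source, taken from slide 49.
counts = {"EBSCO": (70, 91), "Ovid": (90, 113), "ProQuest": (87, 105)}
ratios = full_text_ratios(counts)
average_ratio = mean(ratios.values())  # unweighted average of the ratios
```

Averaging the ratios treats every source equally regardless of how many titles it returned, which is different from pooling all the counts into a single ratio.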

45 Example Data: Raw Counts

46 Example Data: Ratios

47 Test the Results: So you have a ratio. So what? What does it mean? Is it high? Low? Compared to what? Use the Chi-Squared test to compare the ratios. Excel: CHISQ.TEST(actual range, expected range). If the result is less than 0.10 (or 1.00 - Confidence Level), then the difference is statistically significant. This may be good or not good, depending on what you're comparing against.

48 Comparing Against an Ideal %
Actual range: the # of successes & # of failures
Expected range: the # of expected successes & # of expected failures
Example: CHISQ.TEST(B2:C2, B3:C3)
     A                                              B           C              D
1    Sources                                        Full-Text   No Full-Text   Total
2    EBSCO                                          90          10             100
3    Expected or Ideal Rate                         95          5              100
4    Totals                                         185         15             200
5    Chi-Square Test                                0.022
6    Significantly Different from Expected Count?   Yes
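To see what sits behind the Excel result, here is a minimal Python sketch of the same two-cell comparison. It assumes, as the slide figures suggest, that a one-row range of successes and failures is tested with 1 degree of freedom; the function name is hypothetical, and the shortcut through `math.erfc` is valid only for that 1-degree-of-freedom case.

```python
import math

def chisq_test_two_cells(actual, expected):
    """P-value comparing (successes, failures) with expected counts, 1 df."""
    if len(actual) != 2 or len(expected) != 2:
        raise ValueError("this sketch handles exactly two cells")
    # Chi-square statistic: sum of (actual - expected)^2 / expected.
    statistic = sum((a - e) ** 2 / e for a, e in zip(actual, expected))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2))

p_value = chisq_test_two_cells(actual=(90, 10), expected=(95, 5))
significant = p_value < 1.00 - 0.90  # target from a 0.90 confidence level
```

With the slide's counts, `p_value` comes out at about 0.022, which is below the 0.10 target, so `significant` is true.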

49
Sources to Test   Sources    Full-Text    No Full-Text   Total
1                 EBSCO      70           21             91
2                 Ovid       90           23             113
                  Totals     160          44             204
Chi-Square Test: 0.031632
Significantly Different? Yes
Sources to Test   Sources    Full-Text    No Full-Text   Total
3                 ProQuest   87           18             105
1                 EBSCO      70           21             91
                  Totals     157          39             196
Chi-Square Test: 0.032782
Significantly Different? Yes
Sources to Test   Sources    Full-Text    No Full-Text   Total
2                 Ovid       90           23             113
5                 Means      82.3333333   20.66666667    103
                  Totals     172.333333   43.66666667    216
Chi-Square Test: 0.322856
Significantly Different? No

50

51 Context is King: The value of the result depends on what you are measuring and comparing. If the difference between two sources is not significant, then they are statistically similar. If the difference between two link resolvers is significant, then one is better than the other. NOTE: This doesn't tell you by how much! Use your best judgment.

52 I am here to help you… Karen R. Harker 940-565-2688 karen.harker@unt.edu

