CATAR-Content Analysis Toolkit for Academic Research


1 CATAR-Content Analysis Toolkit for Academic Research
Outline: Introduction, Installation, Usage, Interpretation, Case Study
Yuen-Hsien Tseng, National Taiwan Normal University, 2018/04/25

2 Content Analysis: Introduction
Related subjects: bibliometrics, scientometrics, informetrics; content analysis in the social sciences
Related journals: JASIST, Scientometrics, Journal of Informetrics
Related conferences: ISSI (International Society for Scientometrics and Informetrics), STI (Science and Technology Indicators)

3 Content Analysis: Motivation
Prior-art search and analysis is expected to be done in half a day (from a vice president of the second-largest analog IC design company)
To know the past, avoid reinventing the wheel, and improve innovation
For strategic planning in S&T: to identify high-impact authors and institutions for speeches, cooperation, and advice, and for evaluation and budget distribution

4 Automatic Content Analysis
Long-term goal: automatic literature analysis, organization, and presentation, to form hypotheses for exploration, verification, and decision making
Related studies:
Structured abstracts in library science (1987)
Automated structured abstracts in biology (2007)
Automatic patent analysis (2004, NTCIR)
Sentiment analysis in literature (2010, STI)

5 Automatic Content Analysis - Tools (1/2)
CiteSpace: Chaomei Chen, Drexel University (2003); for detecting paradigm shifts in research
VOSviewer: Nees Jan van Eck and Ludo Waltman (2007), CWTS of Leiden University

6 Automatic Content Analysis - Tools (2/2)
"Science Mapping Software Tools: Review, Analysis, and Cooperative Study Among Tools", Cobo et al., JASIST, 2011
Compares nine tools (free and commercial): Bibexcel, VantagePoint, Sci2 Tool, …
None of them covers all the functions of the others
Scientometric analysis has a standard procedure (Börner et al., 2003)
CATAR was released in 2010 (developed since 2004)

7 CATAR Introduction: Content Analysis Toolkit for Academic Research
Yuen-Hsien Tseng
CATAR technical details:
Yuen-Hsien Tseng, Chi-Jen Lin, and Yu-I Lin, "Text Mining Techniques for Patent Analysis", Information Processing and Management, Vol. 43, No. 5, 2007, pp (indexed in ESI as a top 1% cited paper in 2017)
Yuen-Hsien Tseng and Ming-Yueh Tsay, "Journal clustering of Library and Information Science for subfield delineation using the bibliometric analysis toolkit: CATAR", Scientometrics, Vol. 95, No. 2, pp , May 2013

8 CATAR Analysis Functions
Overview analysis
Topic clustering based on:
bibliographic coupling
co-word analysis

9 CATAR Installation
Please check the link below for the latest updates:
The C:\CATAR folder contains:
C:\CATAR\src\ : programs for analysis
C:\CATAR\Source\ : data to be analyzed
C:\CATAR\Result\ : results after analysis
C:\CATAR\doc\ : intermediate results during analysis (for debugging only, not needed by end users)

10 Preparing Data for Analysis
Delineate the data (the most important step and the second most valuable part):
data from keyword searches
publication records in core journals
combined search results (journals + keywords + time limit)
data records verified by domain experts
Data sources ready for analysis by CATAR:
records from Web of Science
patents from USPTO

11 ISI WoS Publication Record
Only a subset of the fields (shown in red on the original slide) is used. Cited references (CR) are used in bibliographic coupling for topic clustering and in citation tracking.
FN ISI Export Format
VR 1.0
PT J
AU Tseng, SC
   Tsai, CC
AF Tseng, Sheng-Chau
   Tsai, Chin-Chung
TI On-line peer assessment and the role of the peer feedback: A study of high school computer course
SO COMPUTERS & EDUCATION
LA English
DT Article
DE interactive learning environments; secondary education; learning communities; improving classroom teaching; peer assessment
ID WORLD-WIDE-WEB; ASSESSMENT SYSTEM; HIGHER-EDUCATION; STUDENTS; THINKING; SCIENCE; SELF
AB The purposes of this study were to explore the effects and the validity of on-line peer assessment in high schools and …
C1 Natl Chiao Tung Univ, Inst Educ, Hsinchu 300, Taiwan.
   Natl Chiao Tung Univ, Ctr Teacher Educ, Hsinchu 300, Taiwan.
RP Tsai, CC, Natl Chiao Tung Univ, Inst Educ, 1001 Ta Hsueh Rd, Hsinchu 300, Taiwan.
EM
CR ROTH WM, 1997, SCI EDUC, V6, P373
   DOCHY F, 1999, STUD HIGH EDUC, V24, P331
NR 23
TC 2
PU PERGAMON-ELSEVIER SCIENCE LTD
PI OXFORD
PA THE BOULEVARD, LANGFORD LANE, KIDLINGTON, OXFORD OX5 1GB, ENGLAND
SN
J9 COMPUT EDUC
JI Comput. Educ.
PD DEC
PY 2007
VL 49
IS 4
BP 1161
EP 1174
DI /j.compedu
PG 14
SC Computer Science, Interdisciplinary Applications; Education & Educational Research
GA 218OF
UT ISI:
ER

12 Import Fields of WoS
AU: authors' names, e.g., Kainz, H; Hofstetter, H
TI: publication title, e.g., Adaption of the main waste water treatment …
SO: journal title, e.g., WATER SCIENCE AND TECHNOLOGY
DE: keywords given by the authors, e.g., large wastewater treatment plant
ID: identifiers given by WoK (Web of Knowledge) to describe the topics of the article
AB: publication's abstract
C1: authors' countries, e.g., USA, UK, …
CR: cited references, e.g., BALDI F, 1988, WATER AIR SOIL POLL, V38, P111
NR: number of references, e.g., 3
TC: times cited, e.g., 1
PY: publication year, e.g., 1996
SC: source categories, e.g., Environmental Sciences; Water Resources
UT: indexing key given by WoS, e.g., ISI:A1996VF
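As a hedged illustration (this is not CATAR's own parser), the following Perl sketch reads a WoS/ISI plain-text export like the record on the previous slide, assuming the usual layout: a two-character field tag at the start of each line, indented continuation lines, ER closing each record, and EF closing the file. The input file name is hypothetical.

```perl
#!/usr/bin/perl
# Minimal sketch of reading a WoS/ISI plain-text export (not CATAR's parser).
# Assumes: two-character field tags, indented continuation lines,
# "ER" ends a record, "EF" ends the file.
use strict;
use warnings;

my $file = shift @ARGV or die "usage: $0 savedrecs.txt\n";
open my $fh, '<', $file or die "cannot open $file: $!\n";

my (@records, %rec, $tag);
while (my $line = <$fh>) {
    chomp $line;
    last if $line =~ /^EF\s*$/;            # end of the whole export
    if ($line =~ /^ER\s*$/) {              # end of one record
        push @records, { %rec };
        %rec = ();
        next;
    }
    if ($line =~ /^([A-Z][A-Z0-9]) (.*)/) {            # new field, e.g. "TI ..."
        $tag = $1;
        push @{ $rec{$tag} }, $2;
    } elsif (defined $tag and $line =~ /^\s+(.*)/) {   # continuation line
        push @{ $rec{$tag} }, $1;
    }
}
close $fh;

# Example use: list publication year and title of every record.
for my $r (@records) {
    my $year  = $r->{PY} ? $r->{PY}[0] : '';
    my $title = join ' ', @{ $r->{TI} || [] };
    print "$year\t$title\n";
}
```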

13 Overview Analysis
Parse the data and save it into a DBMS for management, cross tabulation, and verification
Trend analysis: the trend indicator is the slope of a linear regression over the publication volume per year
Yuen-Hsien Tseng, Yu-I Lin, Yi-Yang Lee, Wen-Chi Hung, and Chun-Hsiang Lee, "A Comparison of Methods for Detecting Hot Topics", Scientometrics, Vol. 81, No. 1, Oct. 2009, pp
Command to be executed:
C:\CATAR\src>perl -s automc.pl -OOA SE ..\Source_Data\SE\data
where -OOA is the command option, SE is the folder name for the results, and ..\Source_Data\SE\data is the path to the data to be analyzed
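To make the trend indicator concrete, here is a small Perl sketch (with made-up yearly counts, not the SE data) that fits an ordinary least-squares line to the publication volume per year and reports its slope.

```perl
#!/usr/bin/perl
# Sketch of the trend indicator: the slope of an OLS line fitted to
# publication counts per year. Year/count pairs below are hypothetical.
use strict;
use warnings;

my %count_per_year = (2004 => 26, 2005 => 355, 2006 => 308,
                      2007 => 416, 2008 => 462, 2009 => 419);

my @years = sort { $a <=> $b } keys %count_per_year;
my $n = @years;
my ($sx, $sy, $sxy, $sxx) = (0, 0, 0, 0);
for my $y (@years) {
    my $c = $count_per_year{$y};
    $sx  += $y;        $sy  += $c;
    $sxy += $y * $c;   $sxx += $y * $y;
}
# slope = (n*sum(xy) - sum(x)*sum(y)) / (n*sum(x^2) - sum(x)^2)
my $slope = ($n * $sxy - $sx * $sy) / ($n * $sxx - $sx ** 2);
printf "trend slope: %.2f papers/year (positive means a growing topic)\n", $slope;
```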

14 DOS Command
Find and run cmd.exe in MS Windows
Change drive to C: with the command C:
Change folder to CATAR: cd \CATAR
Change the working directory to src: cd src
Absolute path: C:\CATAR\Source_Data\SE\data
Relative path: if the working directory is \CATAR\src, the path to the data is ..\Source_Data\SE\data

15 Overview Analysis Example
Results in: C:\CATAR\Result\SE\_SE_by_field.xls
Set / papers / search command:
#1  54   SO=(Journal of the Learning Sciences)
#2  640  SO=(Computers & Education)
#3  238  SO=(Science Education)
#4  187  SO=(Journal of Computer Assisted Learning)
#5  249  SO=(Journal of Research in Science Teaching)
#6  365  SO=(British Journal of Educational Technology)
#7  326  SO=(Educational Technology & Society)
#8  144  SO=(ETR&D-Educational Technology Research And Development)
#9  422  SO=(International Journal of Science Education)
#10      SO=(Research in Science Education)
#11 143  SO=(Innovations in Education and Teaching International)
#12 2,912  #1 or #2 or #3 or #4 or #5 or #6 or #7 or #8 or #9 or #10 or #11
Document Type=(Article)
Databases=SCI-EXPANDED, SSCI, A&HCI
Timespan=

16 Year Production: Top 8 Countries
Countries: USA, UK, TAIWAN, AUSTRALIA, CANADA, TURKEY, NETHERLANDS, SPAIN
2004: 12 3 1 6 4
2005: 138 69 36 38 16 14 29 15
2006: 139 63 31 25 18 19 13
2007: 173 70 61 43 28 20 21
2008: 204 72 108 44 34
2009: 198 71 84 42 24
2010: 7 2
Total: 870 352 328 141 116 114 98

17 Most Productive Authors: Top 10
Author / NC TC IF FC FTC FIF:
Tsai, CC   37 227 6.14 17.6 104.9 5.96
Roth, WM   18 61 3.39 7.7 25.7 3.34
Koper, R   15 60 4.00 3.8 21.4 5.63
Hwang, GJ  14 94 6.71 3.7 27.3 7.38
Valcke, M  13 165 12.69 4.3 53.4 12.42
Lee, O     12 93 7.75 3.2 23.0 7.19
Chang, CY  11 49 4.45 5.2 25.6 4.92
Huang, YM  42 3.82 3.6 12.8 3.56
Sadler, TD 110 10.00 4.7 48.6 10.34
Chang, KE  56 5.09 3.3 16.6 5.03
Example: for a paper with AU = Tseng, SC; Tsai, CC, fractional counting gives Tseng, SC: 0.5 and Tsai, CC: 0.5, while normal counting gives Tseng, SC: 1 and Tsai, CC: 1
NC = normal count: each co-author of a paper receives a full count of 1
FC = fractional count: the co-authors of a paper share a total count of 1
IF = TC/NC, FIF = FTC/FC
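The sketch below (with hypothetical records, not the SE data set) illustrates how the normal and fractional counts and the derived IF = TC/NC and FIF = FTC/FC could be computed for each author.

```perl
#!/usr/bin/perl
# Sketch of normal vs. fractional counting. The paper records are made up.
use strict;
use warnings;

# Each paper: its authors (as in the AU field) and its Times Cited (TC).
my @papers = (
    { authors => ['Tseng, SC', 'Tsai, CC'], tc => 2 },
    { authors => ['Tsai, CC'],              tc => 5 },
);

my (%nc, %fc, %tc, %ftc);
for my $p (@papers) {
    my @au = @{ $p->{authors} };
    my $k  = @au;                       # number of co-authors
    for my $a (@au) {
        $nc{$a}  += 1;                  # normal count: 1 per paper
        $fc{$a}  += 1 / $k;             # fractional count: 1/k per paper
        $tc{$a}  += $p->{tc};           # citations, normal attribution
        $ftc{$a} += $p->{tc} / $k;      # citations, fractional attribution
    }
}
for my $a (sort keys %nc) {
    printf "%-10s NC=%d FC=%.2f IF=%.2f FIF=%.2f\n",
        $a, $nc{$a}, $fc{$a}, $tc{$a} / $nc{$a}, $ftc{$a} / $fc{$a};
}
```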

18 Most Productive Institutes: Top 15
Institute / NC TC IF FC FTC FIF:
Natl Taiwan Normal Univ        61 220 3.61 45.6 157.4 3.45
Nanyang Technol Univ           52 217 4.17 37 149.2 4.03
Open Univ                      50 265 5.30 41.3 234.4 5.68
Natl Cent Univ                 46 276 6.00 29.2 164.1 5.62
Indiana Univ                   39 315 8.08 22.8 171.0 7.50
Natl Taiwan Univ Sci & Technol 35 212 6.06 22 117.8 5.35
Natl Cheng Kung Univ           34 108 3.18 27.4 90.3 3.30
Middle E Tech Univ             33 87 2.64 24.3 70.3 2.89
Florida State Univ             32 145 4.53 21.2 75.0 3.54
Curtin Univ Technol            31 85 2.74 18.9 51.2 2.71
Univ Georgia                   138 4.45 19.3 81.7 4.23
Natl Chiao Tung Univ           29 150 5.17 18.6 93.8 5.04
Univ London                    168 5.79 20.9 83.6 4.00
Arizona State Univ             28 104 3.71 18.4 62.8 3.41
Weizmann Inst Sci              27 153 5.67 20.7 121.3 5.86
Data are from the C1 field of each record: C1 Natl Chiao Tung Univ, Inst Educ, Hsinchu 300, Taiwan

19 Most Cited References
*NAT RES COUNC, 1996, NAT SCI ED STAND  245
LAVE J, 1991, SITUATED LEARNING LE  157
VYGOTSKY LS, 1978, MIND SOC DEV HIGHER  131
BROWN JS, 1989, EDUC RES, V18, P32  113
WENGER E, 1998, COMMUNITIES PRACTICE  109
*AM ASS ADV SCI, 1993, BENCHM SCI LIT  93
POSNER GJ, 1982, SCI EDUC, V66, P211  78
SHULMAN LS, 1986, EDUC RES, V15, P4  76
COHEN J, 1988, STAT POWER ANAL BEHA  70
SHULMAN LS, 1987, HARVARD EDUC REV, V57, P1  67
LEDERMAN NG, 1992, J RES SCI TEACH, V29, P331  63
*NRC, 1996, NAT SCI ED STAND
DRIVER R, 2000, SCI EDUC, V84, P287  61
DRIVER R, 1996, YOUNG PEOPLES IMAGES  59
MILLAR R, 1998, 2000 SCI ED FUTURE
LEMKE JL, 1990, TALKING SCI LANGUAGE
*NAT RES COUNC, 2000, INQ NAT SCI ED STAND  57
LINCOLN YS, 1985, NATURALISTIC INQUIRY  52
BROWN AL, 1992, J LEARN SCI, V2, P141
COLLINS A, 1989, KNOWING LEARNING INS, P453
Data are from the CR field of each record: CR ROTH WM, 1997, SCI EDUC, V6, P373

20 Most Cited Authors
Rank / AU / NC:
1 ROTH WM 411
2 *NAT RES COUNC 397
3 DRIVER R 395
4 JONASSEN DH 336
5 MAYER RE 323
6 VYGOTSKY LS 259
7 TSAI CC 250
8 CHI MTH 249
9 *AM ASS ADV SCI 246
10 LAVE J 242
11 LEDERMAN NG 230
12 BANDURA A 226
13 VOSNIADOU S 214
14 KUHN D 213
15 TABER KS 196
16 OSBORNE J 195
17 BROWN AL 184
18 SHULMAN LS 180
19 AIKENHEAD GS 178
20 TOBIN K 176
Data are from the CR field of each record: CR ROTH WM, 1997, SCI EDUC, V6, P373

21 Most Cited Journals
Rank / J9 / DF:
1 J RES SCI TEACH 4707
2 SCI EDUC 3368
3 INT J SCI EDUC 2927
4 COMPUT EDUC 1668
5 J LEARN SCI 899
6 J EDUC PSYCHOL 877
7 ETR&D-EDUC TECH RES 829
8 REV EDUC RES 825
9 J COMPUT ASSIST LEAR 737
10 BRIT J EDUC TECHNOL 717
11 COMPUT HUM BEHAV 622
12 LEARN INSTR
13 EDUC RES 618
14 COGNITION INSTRUCT 581
15 J EDUC COMPUT RES 562
16 EDUC PSYCHOL 523
17 STUDIES SCI ED 468
18 RES SCI EDUC 446
19 J CHEM EDUC 443
20 INSTR SCI 433
Data are from the CR field of each record: CR ROTH WM, 1997, SCI EDUC, V6, P373

22 Topic Clustering Procedure
Index construction (for fast analysis)
Similarity computation
Document clustering
Cluster label generation
Multi-stage clustering for topic trees
Multi-dimensional scaling (MDS) for the topic map
Cross tabulations of topics and other data

23 Index Building: Bibliographic Coupling (BC) and Co-word Analysis (CW)
Bibliographic coupling (BC): construct the BC matrix and normalize citation counts
Co-word analysis (CW): remove stop words (the, of, for, on, and, at, …); normalize terms (stemming, lemmatization, vocabulary control); key-term extraction (patented [Tseng, 2002, JASIST])
Build inverted files for fast computation later; a small sketch of the co-word steps follows
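A minimal Perl sketch of the co-word preprocessing steps, using a toy stemming rule as a stand-in for CATAR's patented key-term extraction: stop-word removal, crude term normalization, and an inverted file from terms to documents. The document texts are made up.

```perl
#!/usr/bin/perl
# Toy sketch of co-word indexing (not CATAR internals): stop-word removal,
# a crude stemming stand-in, and an inverted file (term => doc => frequency).
use strict;
use warnings;

my %stop = map { $_ => 1 } qw(the of for on and at a an in to is are);

my %docs = (
    D1 => 'Text mining techniques for patent analysis',
    D2 => 'Generic title labeling for clustered documents',
);

my %inverted;                                   # term => { doc_id => tf }
while (my ($id, $text) = each %docs) {
    for my $w (split /\W+/, lc $text) {
        next if $w eq '' or $stop{$w};
        $w =~ s/(?:ies|es|s)$// if length $w > 4;   # toy "stemming" only
        $inverted{$w}{$id}++;
    }
}

# Print the postings list of every term.
for my $term (sort keys %inverted) {
    my $postings = $inverted{$term};
    print "$term -> ",
          join(', ', map { "$_:$postings->{$_}" } sort keys %$postings), "\n";
}
```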

24 Similarity Computation
In co-word analysis a document is represented by its terms (term 1 … term T); in bibliographic coupling it is represented by its cited references (reference 1 … reference M). For example, T = 2529 terms and M = 9957 cited references for 318 EEPA papers.
Sim(A, B) = 2 × |S(A) ∩ S(B)| / (|S(A)| + |S(B)|)
The pairwise similarities form a document-by-document matrix (D1 … Dn by D1 … Dn).
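A small Perl sketch of the Dice similarity above, where S(X) is the term set (co-word) or cited-reference set (bibliographic coupling) of document X; the two term sets below are made up.

```perl
#!/usr/bin/perl
# Dice similarity between two documents' feature sets (illustrative data).
use strict;
use warnings;

sub dice {
    my ($set_a, $set_b) = @_;
    my %in_a   = map { $_ => 1 } @$set_a;
    my $common = grep { $in_a{$_} } @$set_b;          # |S(A) intersect S(B)|
    return 0 if @$set_a + @$set_b == 0;
    return 2 * $common / (@$set_a + @$set_b);
}

my @doc_a = qw(cluster label title document generic);
my @doc_b = qw(cluster document patent text mining);
printf "Sim(A, B) = %.3f\n", dice(\@doc_a, \@doc_b);  # 2*2/(5+5) = 0.400
```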

25 Topic Tree: Agglomerative Hierarchical Clustering (AHC)
Complete-link criterion. The dendrogram over documents D1-D17 is cut at a similarity threshold of 0.075 (on a similarity axis running from 0.3 down to 0.0), yielding 6 clusters.
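The following Perl sketch (an illustration, not CATAR's implementation) runs complete-link agglomerative clustering over five hypothetical documents with made-up pairwise similarities and stops merging at the 0.075 threshold mentioned above.

```perl
#!/usr/bin/perl
# Complete-link AHC sketch with a similarity threshold (illustrative only).
use strict;
use warnings;

my %sim = (
    'D1 D2' => 0.40, 'D1 D3' => 0.10, 'D1 D4' => 0.05, 'D1 D5' => 0.02,
    'D2 D3' => 0.12, 'D2 D4' => 0.04, 'D2 D5' => 0.03,
    'D3 D4' => 0.30, 'D3 D5' => 0.08, 'D4 D5' => 0.06,
);
sub sim { my ($x, $y) = sort @_; return $sim{"$x $y"} // 0 }

# Complete link: cluster-to-cluster similarity is the MINIMUM pairwise similarity.
sub cluster_sim {
    my ($c1, $c2) = @_;
    my $min;
    for my $a (@$c1) {
        for my $b (@$c2) {
            my $s = sim($a, $b);
            $min = $s if !defined $min or $s < $min;
        }
    }
    return $min;
}

my @clusters  = map { [$_] } qw(D1 D2 D3 D4 D5);   # start with singletons
my $threshold = 0.075;                             # dendrogram cut-off

while (@clusters > 1) {
    my ($bi, $bj, $best) = (0, 1, -1);
    for my $i (0 .. $#clusters - 1) {
        for my $j ($i + 1 .. $#clusters) {
            my $s = cluster_sim($clusters[$i], $clusters[$j]);
            ($bi, $bj, $best) = ($i, $j, $s) if $s > $best;
        }
    }
    last if $best < $threshold;                    # cut the dendrogram here
    my ($removed) = splice @clusters, $bj, 1;      # $bj > $bi, so $bi is unaffected
    push @{ $clusters[$bi] }, @$removed;
}

print scalar @clusters, " clusters: ",
      join(' | ', map { join(' ', @$_) } @clusters), "\n";
# prints: 3 clusters: D1 D2 | D3 D4 | D5
```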

26 Topic Tree Example (Chinese movie-news data)
1(7): 161 : 7 Docs. : 0.3478 (美國: 9.4)
  13 : : : 納尼亞傳奇 美國片
  55 : : : V怪客 美國片
  48 : : : 北國性騷擾 美國片
  1 : : : 惡狼ID 美國片
  32 : 3 Docs. : (影迷: 7.0, 美國: 2.4)
    14 : 2 Docs. : (影迷: 4.0, 絕命終結站: 3.5, 絕命: 3.5, 飛車: 2.8, 雲霄飛車: 2.8)
      11 : : : 奪魂鋸2 美國片
      27 : : : 絕命終結站3雲霄飛車驚魂
    16 : : : 偷穿高跟鞋 美國片
9(3): 28 : 3 Docs. : (傑克: 10.0, 李安: 8.9, 傑克基倫霍: 7.0, 基倫霍: 7.0, 希斯萊傑: 3.2)
  17 : 2 Docs. : (李安: 11.0, 傑克: 5.7, 斷背山: 4.9, 希斯萊傑: 4.0, 傑克基倫霍: 3.2)
    3 : : : 李安靠 斷背山重拾熱情
    7 : : : 斷背山 美國片
  21 : : : 鍋蓋頭 美國片
12(3): 74 : 3 Docs. : (奶油: 7.3, 絕配: 6.0, 料理: 5.1, 凱特: 4.9, 尼克: 3.2)
  58 : 2 Docs. : (番紅花: 6.3, 凱特: 6.0, 番紅花醬汁: 4.9, 尼克: 4.0, 鮮奶: 4.0)
    68 : : : 料理絕配 跟著男主角做義國菜
    71 : : : 料理絕配 跟著女主角做法國菜
  69 : : : 料理絕配 看電影學用餐禮儀
Legend: similarity score; cluster serial number and document count; cluster label terms; cluster ID (used in the next stage) and document count

27 Generation of Cluster Labels
Automatically extracting cluster terms as cluster labels
Yuen-Hsien Tseng, "Generic Title Labeling for Clustered Documents", Expert Systems With Applications, Vol. 37, No. 3, 15 March 2010, pp

28 Multi-Stage Clustering
Each stage is an AHC run: documents are clustered into concepts in stage 1, and the concepts are clustered into topics in stage 2. Outliers are items whose similarity falls below the threshold and therefore cannot be clustered.

29 Topic Map MDS (Multi-Dimensional Scaling)
Projecting high-dimensional similarities onto a two-dimensional similarity map for easier visualization. Example: topic map of US patents of MOST, with clusters 1. Chemistry, 2. Electronics and Semiconductors, 3. Generality, 4. Communication and Computers, 5. Materials, 6. Biomedicine.

30 Topic Map and Topic Tree
Example: carbon nanotube patents

31 Analysis by Bibliographic Coupling
Command example:
C:\CATAR\src>perl -s automc.pl -OBC SE ..\Source_Data\SE\SE.mdb
Results in C:\CATAR\Result\SE_BC:
*.html: topic tree
*all*.html: cross tabulation of topics
*.xls: cross tabulation of topics
*titles*.html: titles of each cluster

32 Analysis Based on Co-Word
Command example:
C:\CATAR\src>perl -s automc.pl -OCW SE ..\Source_Data\SE\SE.mdb
Results in C:\CATAR\Result\SE_CW:
*.html: topic tree
*all*.html: cross tabulation of topics
*.xls: cross tabulation of topics
*titles*.html: titles of each cluster

33 BC Analysis Example (Reasonable: 100%)
Threshold = 0.0
1(6): 34 : 6 Docs. : (cluster: 5.1, map: 3.0, min: 3.0, text: 2.1)
  12 : 4 Docs. : (cluster: 7.0, patent: 5.2, text: 3.7, generic: 2.6, title: 2.6)
    5 : 3 Docs. : (cluster: 5.0, generic: 3.1, title: 3.1, text: 2.4, document: 2.3)
      1 : 2 Docs. : (generic: 4.0, title: 4.0, cluster: 3.2, document: 3.1, correlation coefficient: 2.0)
        2 : ISI: : 2006:Toward generic title generation for clustered documents
        6 : ISI: : 2010:Generic title labeling for clustered documents
      3 : ISI: : 2007:Text mining techniques for patent analysis
    4 : ISI: : 2007:Patent surrogate extraction and evaluation in the context of patent mapping
  18 : 2 Docs. : (education: 4.0, content analysi: 2.0, content: 2.0, media: 2.0)
    7 : ISI: : 2010:Mining concept maps from news stories for measuring civic scientific literacy in media
    8 : ISI: : 2010:Trends of Science Education Research: An Automatic Content Analysis
2(3): 15 : 3 Docs. : (neural network: 3.1, quadratic: 2.3, sort: 2.3, perceptron: 1.7)
  2 : 2 Docs. : (quadratic: 3.0, sort: 3.0, perceptron: 2.3, winner-take-all: 1.4, constant-time: 1.4)
    13 : ISI:A1995QT : 1995:ON A CONSTANT-TIME, LOW-COMPLEXITY WINNER-TAKE-ALL NEURAL-NETWORK
    9 : ISI:A1992HU : 1992:SOLVING SORTING AND RELATED PROBLEMS BY QUADRATIC PERCEPTRONS
  10 : ISI:A1992HY : 1992:CONSTRUCTING ASSOCIATIVE MEMORIES USING HIGH-ORDER NEURAL NETWORKS
3(2): 14 : 2 Docs. : (automatic: 3.1, chinese: 1.4, text: 1.4, thesauru: 1.4)
  0 : ISI: : 2001:Automatic cataloguing and searching for retrospective data by use of OCR text
  1 : ISI: : 2002:Automatic thesaurus generation for Chinese documents
4(2): 3 : 2 Docs. : (code: 4.0, decoder: 1.4, fast: 1.4, reed-muller: 1.4)
  11 : ISI:A1993MA : 1993:DECODING REED-MULLER CODES BY MULTILAYER PERCEPTRONS
  12 : ISI:A1993MA : 1993:FAST NEURAL DECODERS FOR SOME CYCLIC CODES
5(1): 36 : 1 Docs. : 0 (hot: 2.0, detect: 2.0, comparison: 2.0, topic: 1.1, scientometric: 0.7)
  5 : ISI: : 2009:A comparison of methods for detecting hot topics

34 BC Analysis Example: 2nd Stage
Threshold = 0.0; Reasonable: 100%
1(2): 1 : 5 Docs. : (neural: 4.0, perceptron: 3.0, code: 2.4, decoder: 1.8, network: 1.8)
  1 : 15 : 3 Docs. : (neural network: 3.1, quadratic: 2.3, sort: 2.3, perceptron: 1.7)
  3 : 3 : 2 Docs. : (code: 4.0, decoder: 1.4, fast: 1.4, reed-muller: 1.4)
2(2): 2 : 8 Docs. : (automatic: 5.0, document: 4.0, text: 4.0, generation: 3.0, cluster: 1.8)
  0 : 34 : 6 Docs. : (cluster: 5.1, map: 3.0, min: 3.0, text: 2.1)
  2 : 14 : 2 Docs. : (automatic: 3.1, chinese: 1.4, text: 1.4, thesauru: 1.4)
3(1): 4 : 1 Docs. : 0 (hot: 2.0, detect: 2.0, comparison: 2.0, topic: 2.0, scientometric: 1.0)
  4 : 36 : 1 Docs. : 0 (hot: 2.0, detect: 2.0, comparison: 2.0, topic: 1.1, scientometric: 0.7)
(In each indented row, the second number is the cluster label ID from stage 1.)

35 BC Analysis Example: 2nd Stage

36 Co-Word Analysis Example
Reasonable: 60%-80%
1(5): 29 : 5 Docs. : (term: 19.0, document: 6.7, algorithm: 4.0)
  7 : 3 Docs. : (document: 12.2, generic: 7.7, cluster: 7.6, term: 7.4, algorithm: 6.0)
    2 : 2 Docs. : (cluster: 10.8, generic: 10.0, label: 7.0, title: 7.0, document: 5.6)
      2 : ISI: : 2010:Generic title labeling for clustered documents
      6 : ISI: : 2006:Toward generic title generation for clustered documents
    7 : ISI: : 2002:Automatic thesaurus generation for Chinese documents
  3 : 2 Docs. : (map: 7.7, patent: 5.4, term: 4.1, scientific: 4.0, new: 4.0)
    1 : ISI: : 2010:Mining concept maps from news stories for measuring civic scientific literacy in media
    4 : ISI: : 2007:Patent surrogate extraction and evaluation in the context of patent mapping
2(3): 19 : 3 Docs. : (automatic: 7.3, text: 6.9, analysi: 4.9, approach: 4.6, topic: 1.9)
  4 : 2 Docs. : (science: 7.4, analysi: 6.9, education: 5.4, science education: 5.4, research: 5.4)
    0 : ISI: : 2010:Trends of Science Education Research: An Automatic Content Analysis
    5 : ISI: : 2007:Text mining techniques for patent analysis
  8 : ISI: : 2001:Automatic cataloguing and searching for retrospective data by use of OCR text
3(2): 1 : 2 Docs. : 1.00 (network: 7.7, memory: 4.0, associative memory: 2.7, winner-take-all: 2.0)
  12 : ISI:A1992HY : 1992:CONSTRUCTING ASSOCIATIVE MEMORIES USING HIGH-ORDER NEURAL NETWORKS
  9 : ISI:A1995QT : 1995:ON A CONSTANT-TIME, LOW-COMPLEXITY WINNER-TAKE-ALL NEURAL-NETWORK
4(1): 30 : 1 Docs. : 0 (trend: 6.7, different: 5.0, better: 3.0, trend observation: 3.0, choice: 3.0)
  3 : ISI: : 2009:A comparison of methods for detecting hot topics
Note: some clustered documents share the common term "map"/"mapping" but use it with different interpretations.

37 Breakdown Trends of ICT in Education
Clusters (ID : size): Cluster 1 (68 : 993), Cluster 2 (104 : 464), Cluster 3 (22 : 237), Cluster 4 (85 : 139), Cluster 5 (97 : 55), Cluster 6 (51 : 83)
1990: 38 1 9 7
1991: 53 8 2 6
1992: 55 4 11
1993: 50 3
1994: 42 18 5
1995: 17 23
1996: 47 12 19
1997: 57 27 10
1998: 66 29
1999: 52 28 14
2000: 69 33 15
2001: 43 13
2002: 44
2003: 34
2004: 56 59
2005: 71 21
2006: 37
2007: 78 22 25
Cluster characterizations: mainstream topic; dying-out topics; hot topics during that period; topic with periodic attraction; promising topics (not yet mature)

38 Interpretation (1/2) The most valuable part of your analysis
Access file: data after parsing/loading, updatable for BC/CW analysis
Excel files: various cross tabulations for analysis
HTML files: topic trees
Results are under C:\CATAR\Result\ : the topic trees of stage n are in the folder whose name ends with Sn, and the topic maps of stage n are in the folder whose name ends with S(n+1)

39 Interpretation (2/2)
Accept the default parameters first, then try different parameters
Being interpretable is more important than being reasonable
Meaningful information may be scattered across the results of different stages
Domain experts are needed to help with interpretation and verification
Reference: Chaomei Chen (2010), how to choose parameters in CiteSpace:

40 Case Studies
Yulan Yuan, Ulrike Gretzel, and Yuen-Hsien Tseng*, "Revealing the Nature of Contemporary Tourism Research: Extracting Common Subject Areas through Bibliographic Coupling", International Journal of Tourism Research, Vol. 17, No. 5, pp. 417–431, Sep./Oct. 2015, DOI: /jtr.2004
Yuen-Hsien Tseng*, Chun-Yen Chang, M. Shane Tutwiler, Ming-Chao Lin, and James Barufaldi, "A Scientometric Analysis of the Effectiveness of Taiwan's Educational Research Projects", Scientometrics, Vol. 95, No. 3, pp , June 2013
Yuen-Hsien Tseng and Ming-Yueh Tsay, "Journal clustering of Library and Information Science for subfield delineation using the bibliometric analysis toolkit: CATAR", Scientometrics, Vol. 95, No. 2, pp , May 2013
Yueh-Hsia Chang, Chun-Yen Chang, Yuen-Hsien Tseng, "Trends of Science Education Research: An Automatic Content Analysis", Journal of Science Education and Technology, Vol. 19, No. 4, 2010, pp

41 Remarks
Start with the overview analysis, so that the WoS data are parsed into the database for later use
Follow with the BC and CW analyses
For non-WoS data, refer to:
C:\CATAR\Source_Data\movie\movie.mdb
C:\CATAR\Source_Data\eport\eport.mdb
Put your own data into the table TPaper in the database, following the meaning of the field names
Separate multiple values in a field with "; ", e.g., Chang, YH; Chang, CY; Tseng, YH (a small sketch of this convention follows)
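As a small illustration of the "; " convention for multi-value fields (the author values are the slide's own example; the code itself is just a sketch, not part of CATAR):

```perl
#!/usr/bin/perl
# Sketch of the multi-value convention for TPaper fields: several values in
# one cell are joined with "; ". Illustrative only, not part of CATAR.
use strict;
use warnings;

my @authors = ('Chang, YH', 'Chang, CY', 'Tseng, YH');
my $au_cell = join '; ', @authors;
print "$au_cell\n";                 # Chang, YH; Chang, CY; Tseng, YH

# Splitting such a cell back into individual values:
my @parsed = split /;\s*/, $au_cell;
print scalar @parsed, " values\n";  # 3 values
```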

