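National Health Insurance Research Database (NHIRD) Papers: Output Analysis, Research Directions, and Usage Demonstration. Tzeng-Ji Chen, Institute of Hospital and Health Care Administration, School of Medicine, National Yang-Ming University; Department of Family Medicine, Taipei Veterans General Hospital.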

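1 National Health Insurance Research Database (NHIRD) Papers: Output Analysis, Research Directions, and Usage Demonstration. Tzeng-Ji Chen, Institute of Hospital and Health Care Administration, School of Medicine, National Yang-Ming University; Department of Family Medicine, Taipei Veterans General Hospital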

2 Playing the visiting monk for a day at Taipei Medical University. Der Prophet gilt nichts im eigenen Land. Nullus propheta in patria. (A prophet has no honor in his own country.) An ass in Germany is a professor in Rome.

3 -

4

5 Materials: Data extracted from PubMed with the query: ("insurance, health"[MeSH Terms] OR "national health programs"[MeSH Terms] OR health insurance[TW] OR national health[TW] OR national insurance[TW] OR claims data*[TW] OR claim data*[TW] OR insurance claim*[TW] OR insurance data*[TW] OR administrative data*[TW] OR nationwide data*[TW] OR national data*[TW] OR NHIRD[TW] OR NHI[TW] OR BNHI[TW] OR population based[TW] OR population*[ti] OR nationwide[ti]) AND taiwan[All Fields] AND English[lang] AND 1996:2009[dp]. * Accuracy not guaranteed! Courtesy of Yu-Chun Chen, 2010
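A minimal sketch of how a query of this shape could be assembled and turned into a request URL for NCBI's E-utilities ESearch service. The helper names `build_query` and `esearch_url` are hypothetical, only three of the slide's terms are used, and nothing is sent over the network.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# NCBI E-utilities ESearch endpoint
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_query(topic_terms, place="taiwan[All Fields]",
                language="English[lang]", years="1996:2009[dp]"):
    """OR the topic terms together, then AND with place, language, and date range."""
    topic = "(" + " OR ".join(topic_terms) + ")"
    return " AND ".join([topic, place, language, years])

def esearch_url(term, retmax=0):
    """Build (but do not send) an ESearch URL; retmax caps the number of returned IDs."""
    return ESEARCH + "?" + urlencode({"db": "pubmed", "term": term, "retmax": retmax})

# A shortened subset of the slide's terms, for illustration only
terms = ['"insurance, health"[MeSH Terms]', "NHIRD[TW]", "claims data*[TW]"]
query = build_query(terms)
url = esearch_url(query)
```

Keeping the topic terms in a list makes it easy to add or drop synonyms and to rerun the search for a different publication window.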

6 Materials: Review of NHIRD Papers. 383 articles were included. Courtesy of Yu-Chun Chen

7 NHIRD Papers Grow Exponentially Courtesy of Yu-Chun Chen

8 NHIRD Papers Increase In Both Quantity and Quality Courtesy of Yu-Chun Chen

9 Cumulative Number of Papers Using NHIRD. Columns: Publication year; Cumulative no. of NHIRD studies; Cumulative no. of NHIRD studies indexed in JCR 2008; Cumulative no. of authors; Cumulative no. of study fields; Cumulative no. of journals publishing papers. Summary rows: Average 5-year annual growth rate (%) (a); Doubling time (year) (b). (a) Annual growth rate = (no. of studies in current year – no. of studies in previous year) / no. of studies in previous year. (b) Doubling time is estimated by a fitted exponential model.
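The two footnote definitions can be sketched in a few lines. The yearly counts below are invented for illustration and are not the figures from the slide. The slide does not say how the exponential model was fitted, so the log-linear least-squares fit here is an assumption.

```python
import math

def annual_growth_rates(counts):
    """Footnote (a): (current - previous) / previous for each consecutive pair."""
    return [(cur - prev) / prev for prev, cur in zip(counts, counts[1:])]

def doubling_time(years, counts):
    """Footnote (b): fit counts ~ A * exp(r * year) by least squares on
    log(counts), then return ln(2) / r."""
    n = len(years)
    logs = [math.log(c) for c in counts]
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    sxx = sum((x - mean_x) ** 2 for x in years)
    r = sxy / sxx  # fitted exponential growth rate per year
    return math.log(2) / r

# Hypothetical yearly study counts (not the slide's data)
years = [2005, 2006, 2007, 2008, 2009]
counts = [10, 20, 40, 80, 160]
rates = annual_growth_rates(counts)
td = doubling_time(years, counts)
```

With counts that double every year, each annual growth rate is 100% and the fitted doubling time is one year, which gives a quick sanity check on both formulas.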

10 Distribution of Study Topics: Top 10 subjects in MeSH. Column groups: N = 59, N = 329, N = 383, each with No., %, Rank. Subject categories (MeSH): [H02] Health Occupations; [E02] Therapeutics; [N03] Health Care Economics and Organizations; [N05] Health Care Quality, Access, and Evaluation; [N02] Health Care Facilities, Manpower, and Services; [H01] Natural Science Disciplines; [N04] Health Services Administration; [F04] Behavioral Disciplines and Activities; [I01] Social Sciences; [N06] Environment and Public Health. Courtesy of Yu-Chun Chen

11 Average IF in SCI Fields. Columns: SCI category; No. of articles; Average IF. Categories: HEALTH CARE SCIENCES & SERVICES; PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH; PSYCHIATRY; HEALTH POLICY & SERVICES; PHARMACOLOGY & PHARMACY; MEDICINE, GENERAL & INTERNAL; CLINICAL NEUROLOGY; OBSTETRICS & GYNECOLOGY; SURGERY; PEDIATRICS; CARDIAC & CARDIOVASCULAR SYSTEMS; ENDOCRINOLOGY & METABOLISM; GASTROENTEROLOGY & HEPATOLOGY; OPHTHALMOLOGY; IMMUNOLOGY; NEUROSCIENCES; RESPIRATORY SYSTEM; PERIPHERAL VASCULAR DISEASE; MEDICINE, RESEARCH & EXPERIMENTAL. Courtesy of Yu-Chun Chen

12 Productivity of authors: reproducible success? Author (no. of articles): Lin HC (99), Lee HC (36), Chou YJ (30), Lee CH (29), Chen TJ (25), Chou P (24), Hwang SJ (21), Xirasagar S (20), Chen CS (19), Chen YH (18), Huang N (18), Chou LF (16), Liu TC (16), Chang HJ (15), Lin CH (15), Chen YC (14), Tang CH (14), Huang WF (11), Wang JD (11), Yang CY (11). Second table, columns: No. of articles per author; No. of authors; Cum. % to all authors (%); Cumulative contribution to articles; Cumulative percentage to all research (%). Courtesy of Yu-Chun Chen

13 Status: 6 Apr 2011

14 Distribution of NHIRD Papers by Journal Impact Factor and Year. Columns (one per year): (n = 9), (n = 18), (n = 32), (n = 26), (n = 55), (n = 43), (n = 93), (n = 107). Rows by IF (2008): >= 103 [5-10) [3-5) [1-3) < NA

15 Social Network Analysis as a Tool to Visualize Flow of Information: Analysis of studies using health databases. Yu-Chun Chen, Institute of Medical Informatics, Heidelberg University, Germany. Sep 25, 2010

16 Collaboration Network: Chen YC et al. Scientometrics (2010). Taiwan’s NHIRD: administrative health care database as study object in bibliometrics

17 Collaboration network: 2000

18 Collaboration network: 2001

19 Collaboration network: 2002

20 Collaboration network: 2003

21 Collaboration network: 2004

22 Collaboration network: 2005

23 Collaboration network: 2006

24 Collaboration network: 2007

25 Collaboration network: 2008

26 Collaboration network: 2009

27 Collaboration network: 2009 (label)
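A collaboration network of this kind is usually built from the author lists of the papers: each author is a node, and two authors are linked whenever they share a paper. The sketch below shows that construction with the Python standard library. The author lists are hypothetical, and the method actually used for the published figures is not described in this transcript.

```python
from collections import Counter
from itertools import combinations

def coauthor_edges(papers):
    """Count how often each unordered pair of authors shares a paper."""
    edges = Counter()
    for authors in papers:
        for a, b in combinations(sorted(set(authors)), 2):
            edges[(a, b)] += 1
    return edges

def degree(edges):
    """Number of distinct co-authors for every author in the network."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {author: len(others) for author, others in neighbours.items()}

# Hypothetical author lists, one per paper
papers = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author B"],
    ["Author C", "Author D"],
]
edges = coauthor_edges(papers)
deg = degree(edges)
```

Restricting `papers` to those published up to a given year and rebuilding the edges would give one snapshot per year, in the spirit of the 2000 to 2009 sequence above.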

28 Design of Claims-Based Studies. Observational studies: Descriptive studies; Analytic studies - Ecological - Cross-sectional - Case-control - Cohort. Data processing: simple to complex. Clinical relevance: low to high.

29 Types of Study Designs / Computation Simple description Association / Relationship Complex computation (Data mining)

30 Simple Description Disease – Epidemiologic features of Kawasaki disease in Taiwan, – A nationwide survey on epidemiological characteristics of childhood Henoch-Schönlein purpura in Taiwan. – Prevalence and risks of chronic airway obstruction: a population cohort study in Taiwan. Drug – Utilization of hepatoprotectants within the National Health Insurance in Taiwan. – Demographics and patterns of acupuncture use in the Chinese population: the Taiwan experience. Person – Risks and causes of hospitalizations among physicians in Taiwan Specialty – Use frequency of traditional Chinese medicine in Taiwan. Sector – Patterns of ambulatory care utilization in Taiwan.

31 Association / Relationship A ↔ B (no temporal consideration) – Association between physician volume and hospitalization costs for patients with stroke in Taiwan: a nationwide population-based study. A ↔ B (temporal change) – Seasonal variations in urinary calculi attacks and the association with climate: a population-based study. A => B (temporal sequence) – Risk of extrapyramidal syndrome in schizophrenic patients treated with antipsychotics: a population-based study. – Sudden sensorineural hearing loss increases the risk of stroke: A 5-year follow-up study – Does elective caesarean section increase utilization of postpartum maternal medical care? * Control Group

32 Complex Computation Association rule mining – Application of a data-mining technique to analyze coprescription patterns for antacids in Taiwan. Frequent itemset mining – The prescriptions frequencies and patterns of Chinese herbal medicine for allergic rhinitis in Taiwan.
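To make the idea of association rule mining concrete, here is a small sketch that finds pairwise co-prescription rules by support and confidence. The prescriptions, drug names, and thresholds are hypothetical. The cited studies will have used dedicated mining software and larger itemsets, so this only illustrates the two measures.

```python
from itertools import combinations

def pair_rules(transactions, min_support=0.3, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) for item pairs that pass both thresholds."""
    sets = [set(t) for t in transactions]
    n = len(sets)
    items = sorted(set().union(*sets))

    def count(itemset):
        # Number of transactions containing every item in itemset
        return sum(1 for s in sets if itemset <= s)

    rules = []
    for a, b in combinations(items, 2):
        both = count({a, b})
        support = both / n  # share of all prescriptions containing both drugs
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = both / count({lhs})  # P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules

# Hypothetical prescriptions, each a list of drug classes
prescriptions = [
    ["antacid", "nsaid"],
    ["antacid", "nsaid", "antibiotic"],
    ["antacid", "antibiotic"],
    ["nsaid", "antacid"],
    ["antihistamine"],
]
rules = pair_rules(prescriptions)
directions = {(lhs, rhs) for lhs, rhs, _, _ in rules}
```

In this toy data every NSAID prescription also contains an antacid, so the rule from "nsaid" to "antacid" has confidence 1.0, while the reverse direction is weaker (0.75).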

33 Simple Description Disease – Epidemiologic features of Kawasaki disease in Taiwan, – A nationwide survey on epidemiological characteristics of childhood Henoch-Schönlein purpura in Taiwan. – Prevalence and risks of chronic airway obstruction: a population cohort study in Taiwan. Drug – Utilization of hepatoprotectants within the National Health Insurance in Taiwan. – Demographics and patterns of acupuncture use in the Chinese population: the Taiwan experience. Specialty – Use frequency of traditional Chinese medicine in Taiwan. Sector – Patterns of ambulatory care utilization in Taiwan. Journals: Pediatrics: IF 4.789, Ranking 2 / 86 (Pediatrics); Chest: IF 5.154, Ranking 4 / 40 (Respiratory system); Rheumatology: IF 4.136, Ranking 7 / 22 (Rheumatology)

34 Association / Relationship A ↔ B (no temporal consideration) – Association between physician volume and hospitalization costs for patients with stroke in Taiwan: a nationwide population-based study. A ↔ B (temporal change) – Seasonal variations in urinary calculi attacks and the association with climate: a population-based study. A => B (temporal sequence) – Risk of extrapyramidal syndrome in schizophrenic patients treated with antipsychotics: a population-based study. – Sudden sensorineural hearing loss increases the risk of stroke: A 5-year follow-up study – Does elective caesarean section increase utilization of postpartum maternal medical care? Journals: Clin Pharmacol Ther: IF 7.586, Ranking 9 / 219 (Pharmacology …); Med Care: IF 3.194, Ranking 5 / 62 (Health Care Sciences ...); Stroke: IF 6.499, Ranking 6 / 156 (Clinical Neurology); J Urology: IF 3.952, Ranking 9 / 57 (Urology …)

35 Complex Computation Association rule mining – Application of a data-mining technique to analyze coprescription patterns for antacids in Taiwan. Frequent itemset mining – The prescriptions frequencies and patterns of Chinese herbal medicine for allergic rhinitis in Taiwan. Allergy : IF 6.204, Ranking 2 / 17 (Allergy)

36 Usage Demonstration (prepared by Dr. Yu-Chun Chen)

37 NHIRD Datasets

38 This is what the NHI database looks like …

39 … and this is what it becomes

40

41 File Structures of Datasets to Process. Single file. Multiple files: – Of the same format – Of similar formats in different years – Of different formats, but connected through primary key / foreign key. Look-up table

42

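43 Main Tasks 1. Data transformation: READ 2. Variable processing: SELECT, FILTER, APPEND, UNION, JOIN, SORT 3. Result analysis: AGGREGATE, OUTPUT, STATISTICS

44 Statistical software packages; database management systems; programming languages

45 1. Data transformation 2. Variable processing 3. Result analysis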

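46 Data Transformation. 1. Data transformation: READ. Read the data from the source, convert them into a form suitable for analysis, and load them into the database. This usually goes together with data cleaning: records at the source that are unintegrated, not permitted, missing, or erroneous are put in order before loading (Garbage In, Garbage Out).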
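The read-and-clean step can be sketched as follows for a fixed-width text file. The field layout (`id`, `func_date`, `hosp_id` at the positions shown) and the sample lines are hypothetical. The real column positions come from the NHIRD codebooks.

```python
from datetime import datetime

# Hypothetical layout: (field name, start, end) as character offsets
LAYOUT = [("id", 0, 10), ("func_date", 10, 18), ("hosp_id", 18, 24)]

def parse_record(line):
    """Slice one fixed-width line into a dict; return None if the record is unusable."""
    if len(line.rstrip("\n")) < LAYOUT[-1][2]:
        return None  # truncated line
    rec = {name: line[start:end].strip() for name, start, end in LAYOUT}
    if not rec["id"]:
        return None  # missing identifier
    try:
        rec["func_date"] = datetime.strptime(rec["func_date"], "%Y%m%d").date()
    except ValueError:
        return None  # malformed or impossible date
    return rec

def load(lines):
    """Return the clean records and the number of rejected lines."""
    good, rejected = [], 0
    for line in lines:
        rec = parse_record(line)
        if rec is None:
            rejected += 1
        else:
            good.append(rec)
    return good, rejected

raw = [
    "A00000000120051201H00001\n",
    "A00000000220051332H00002\n",  # month 13: rejected
    "          20051215H00003\n",  # blank id: rejected
]
records, rejected = load(raw)
```

Counting the rejected lines rather than dropping them silently makes it possible to report how much of the source was discarded before analysis.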

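47 Variable Processing. 2. Variable processing: SELECT, FILTER, APPEND, UNION, JOIN, SORT. Link the raw information stored in separate tables (hospital level, drug classification, disease classification, catastrophic illness, and so on) by joining multiple tables. Data enrichment.

48 Using SAS as an example

49 Practical example (1): Given a cohort (ID_Cohort), we want to know its visits from 2000 to 2006 (hospital level, hospital location, visit date).

50 Practical example (2): Given a cohort (ID_Cohort), after applying the exclusion criteria, we want to know the date of the first visit between 2000 and 2006.

51 Practical example (3): Given a cohort (ID_Cohort), after removing the cases that do not qualify (exclusion criteria), we want to know the date and place of the last visit between 2000 and 2006 (hospital level, hospital location), the treatment period, and the drugs used during that period. A bit complicated.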

52

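53 SQL Syntax. SQL (Structured Query Language) was developed at IBM in the 1970s for defining, manipulating, querying, and controlling data in large databases. It is well suited to linking data and is available in nearly every database system (MS SQL Server, IBM DB2, Oracle, MySQL, and so on). SAS has included a dedicated data-manipulation module (PROC SQL) since version 6.0.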

54 What’s COOL in SQL? SELECT, FILTER, APPEND, UNION, SORT, JOIN. SQL is designed for MULTIPLE RELATED tables. JOIN: MERGE (in SAS) is a special case of JOIN (equi-join). Reads like English.

55 Practical example (1): Given a cohort (ID_Cohort), we want to know its visits from 2000 to 2006 (hospital level, hospital location, visit date).

56 SELECT hosp_cont_type, area_no, func_date FROM cd JOIN ID_Cohort ON cd.id = ID_Cohort.id JOIN HOSB2006 ON cd.hosp_id = HOSB2006.hosp_id Q: What are the visits of a given cohort (ID_Cohort): hospital level, hospital location, visit date?
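A runnable sketch of this three-table join, using Python's built-in sqlite3 module in place of SQL Server or SAS. The table and column names follow the slide. The rows, the identifiers `P1` to `P3`, and the code values are invented for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE cd (id TEXT, hosp_id TEXT, func_date TEXT);
CREATE TABLE ID_Cohort (id TEXT);
CREATE TABLE HOSB2006 (hosp_id TEXT, hosp_cont_type TEXT, area_no TEXT);
-- Hypothetical rows: P2 has a visit but is not in the cohort
INSERT INTO cd VALUES
  ('P1', 'H1', '2003-05-01'),
  ('P2', 'H2', '2004-01-15'),
  ('P3', 'H1', '2005-07-20');
INSERT INTO ID_Cohort VALUES ('P1'), ('P3');
INSERT INTO HOSB2006 VALUES ('H1', '1', '0101'), ('H2', '2', '0201');
""")

# Same shape as the slide's query: visits -> cohort -> hospital look-up table
rows = con.execute("""
SELECT h.hosp_cont_type, h.area_no, cd.func_date
FROM cd
JOIN ID_Cohort AS c ON cd.id = c.id
JOIN HOSB2006 AS h ON cd.hosp_id = h.hosp_id
ORDER BY cd.func_date
""").fetchall()
```

The inner join against `ID_Cohort` acts as the cohort filter, so the visit by `P2` drops out without any WHERE clause.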

57 Practical example (2): Given a cohort (ID_Cohort), after applying the exclusion criteria, we want to know the date of the first visit between 2000 and 2006.

58 SELECT cd.id, MIN(func_date) AS FirstVisit FROM cd JOIN ID_Cohort ON cd.id = ID_Cohort.id WHERE cd.id NOT IN ( SELECT id FROM excludeCriteria ) GROUP BY cd.id Q: For a given cohort (ID_Cohort), after applying the exclusion criteria, what is the date of the first visit?

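59 Practical example (3): Given a cohort (ID_Cohort), after removing the cases that do not qualify (exclusion criteria), we want to know the date and place of the last visit between 2000 and 2006 (hospital level, hospital location), the treatment period, and the drugs used during that period. A bit complicated.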

60 WITH tmpVisit AS ( SELECT cd.id, MIN(func_date) AS FirstVisit, MAX(func_date) AS LastVisit FROM cd JOIN ID_Cohort ON cd.id = ID_Cohort.id WHERE cd.id NOT IN ( SELECT id FROM excludeCriteria ) GROUP BY cd.id ) SELECT id, DATEDIFF(month, FirstVisit, LastVisit) AS Duration FROM tmpVisit …… Q: For a given cohort (ID_Cohort), after removing the cases that do not qualify (exclusion criteria), what are the date and place of the last visit (hospital level, hospital location), the treatment period, and the drugs used during that period?
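The first part of this query (exclusion, first and last visit, treatment duration) can be tried out with Python's sqlite3 module. This is a sketch under assumptions: the rows are invented, and because SQLite has no `DATEDIFF`, the duration is computed in days with `julianday` instead of in months as on the slide. The join to the hospital and drug tables is left out. The `id` columns are qualified as `cd.id`, since both `cd` and `ID_Cohort` carry an `id` column.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE cd (id TEXT, func_date TEXT);
CREATE TABLE ID_Cohort (id TEXT);
CREATE TABLE excludeCriteria (id TEXT);
-- Hypothetical rows: P2 meets an exclusion criterion
INSERT INTO cd VALUES
  ('P1', '2001-01-10'), ('P1', '2001-04-10'),
  ('P2', '2002-03-05'),
  ('P3', '2003-06-01'), ('P3', '2003-06-11');
INSERT INTO ID_Cohort VALUES ('P1'), ('P2'), ('P3');
INSERT INTO excludeCriteria VALUES ('P2');
""")

rows = con.execute("""
WITH tmpVisit AS (
  SELECT cd.id AS id,
         MIN(cd.func_date) AS FirstVisit,
         MAX(cd.func_date) AS LastVisit
  FROM cd
  JOIN ID_Cohort ON cd.id = ID_Cohort.id
  WHERE cd.id NOT IN (SELECT id FROM excludeCriteria)
  GROUP BY cd.id
)
SELECT id, FirstVisit, LastVisit,
       CAST(julianday(LastVisit) - julianday(FirstVisit) AS INTEGER) AS DurationDays
FROM tmpVisit
ORDER BY id
""").fetchall()
```

One caveat with `NOT IN`: if the exclusion table can contain NULL ids, the condition never evaluates to true, and `NOT EXISTS` is the safer form.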

61 SQL Also Works in SAS. SELECT hosp_cont_type, area_no, func_date FROM cd JOIN ID_Cohort ON cd.id = id_cohort.id JOIN HOSB2006 ON cd.hosp_id = HOSB2006.hosp_id Q: What are the visits of a given cohort (ID_Cohort): hospital level, hospital location, visit date?

62 SQL Also Works in SAS. PROC SQL; SELECT hosp_cont_type, area_no, func_date FROM cd JOIN ID_Cohort ON cd.id = id_cohort.id JOIN HOSB2006 ON cd.hosp_id = HOSB2006.hosp_id ; QUIT; Q: What are the visits of a given cohort (ID_Cohort): hospital level, hospital location, visit date?

63 Example: Prevalence analysis. PATTERNS OF TRADITIONAL CHINESE MEDICINE (TCM) USE IN PATIENTS WITH INFLAMMATORY BOWEL DISEASE (IBD): A POPULATION STUDY IN TAIWAN. Yu-Chun Chen, Fang-Pey Chen, Tzeng-Ji Chen, Li-Fang Chou, Shinn-Jang Hwang. Hepato-Gastroenterology 2008;55: [SCI]

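64 Research Objectives. What is the prevalence of inflammatory bowel disease (IBD) in Taiwan? How do IBD patients use traditional Chinese medicine (TCM)? What kinds of TCM treatment do IBD patients receive? Datasets used: – Registry for beneficiaries, registry for catastrophic illness patients, and TCM ambulatory care prescription and treatment detail files

65 National Health Research Institutes, National Health Insurance Research Database: TCM ambulatory care prescription and treatment detail files (CM_CD files) – [years elided], 228 files in total, 82 GB

66 Partial contents of a TCM ambulatory care prescription and treatment detail file. One file from December 2005: 444 MB – 1,550,000 records

67 Example: Prevalence of IBD in Taiwan 1. Data transformation 2. Variable processing 3. Result analysis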

68 Data Processing with SQL Server 2005

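69 SQL Server: Task 1. Data transformation. Convert the NHI data files into SQL tables and store them in the IBD_DATA database. BULK INSERT IBD_DATA..HV [import into] FROM 'xxxxx.dat' [from] WITH [options] ( BATCHSIZE = , FORMATFILE = 'file_format.fmt' )

70 SQL Server: Task 2. Variable processing. IBD patients: select records whose catastrophic illness code is 555.x or 556.x (as the numerator). SELECT id [select] FROM HV [from HV] WHERE [condition] LEFT(ACODE_ICD, 3) = '555' OR LEFT(ACODE_ICD, 3) = '556'

71 SQL Server: Task 2. Variable processing. Population: select all insured persons in 2004 (as the denominator). SELECT Pop.*, IBDpt.ID [select] FROM Pop [from Pop] LEFT OUTER JOIN IBDpt [join] ON Pop.id = IBDpt.id [matching on]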

72 SQL Server: Task 3. Result output. Compute the prevalence separately by sex and age group. SELECT sex, age, count(*) [select] FROM JoinTABLE [from JoinTABLE] GROUP BY sex, age [aggregate by group]
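Tasks 2 and 3 can be combined into one query that returns cases, population, and prevalence for each sex and age group. The sketch below runs it with Python's sqlite3 module on invented rows. It assumes `IBDpt` holds one row per patient, and it is not the query used in the published study.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Pop (id TEXT, sex TEXT, age TEXT);
CREATE TABLE IBDpt (id TEXT);
-- Hypothetical population of six insured persons, three with IBD
INSERT INTO Pop VALUES
  ('P1', 'F', '20-39'), ('P2', 'F', '20-39'),
  ('P3', 'M', '20-39'), ('P4', 'M', '20-39'),
  ('P5', 'M', '40-59'), ('P6', 'M', '40-59');
INSERT INTO IBDpt VALUES ('P1'), ('P5'), ('P6');
""")

rows = con.execute("""
SELECT Pop.sex, Pop.age,
       COUNT(IBDpt.id) AS cases,       -- counts only rows matched in IBDpt
       COUNT(*) AS population,         -- counts every insured person
       100000.0 * COUNT(IBDpt.id) / COUNT(*) AS per_100k
FROM Pop
LEFT OUTER JOIN IBDpt ON Pop.id = IBDpt.id
GROUP BY Pop.sex, Pop.age
ORDER BY Pop.sex, Pop.age
""").fetchall()
```

The left outer join keeps people without IBD in the denominator, and `COUNT(IBDpt.id)` skips their NULLs, which is what separates the numerator from `COUNT(*)`.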

73 SQL Server: Task 3. Result output. Compute the prevalence separately by sex and age group.

74 Results. Prevalence of IBD in Taiwan is 5.6 per 100,000; Male > Female. Women were more likely to use TCM than men (40.5% vs. 34.3%). 45.5% of patients had GI diagnoses at their TCM visits. Most of their TCM visits contained herbal remedies (90%).

75 I have a dream …

76

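77 The NHIRD research will enable the current generation of medical professionals in Taiwan to know the Amis better than the Amish. (The Chinese version of the pun: NHIRD research lets Taiwan's new generation of medical professionals know Taitung, in eastern Taiwan, better than the eastern United States.) * Amis: an indigenous people of Taiwan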

78 Some suggestions

79 How do we get started? Become familiar with the NHIRD codebooks and NHI regulations; think of research problems; read relevant literature; discuss with colleagues; find friends familiar with data processing; motivation and courage; tolerance and endurance.

80 Paper Production Flow: Idea, Method, Writing. Tools, Teams, Atmosphere, Infrastructure. Journal, Idea, English, Materials, Computing, Statistics.

81

82 Advertisement

83 Open-source P-Q-R Solutions to NHIRD Data Management Coming !

84 Thanks for Your Attention !

