Introductions and Course Overview Monday 30 January 2006 Bangkok Scientific Writing Workshop 30 January - 10 February 2006 Lecture 1.



2 Introductions and Course Overview Monday 30 January 2006 Bangkok Scientific Writing Workshop 30 January - 10 February 2006 Lecture 1

3 UCSF Participants
Name | Role | Dates
Sandy Schwarcz | Instructor | 30 Jan-10 Feb
George Rutherford | Instructor | 30 Jan-3 Feb
Tom Novotny | Instructor | 6-10 Feb
Sanny Chen | Data analyst | 31 Jan-10 Feb
Laura Packel (UCB) | Coordinator | at UCSF
UCSF, University of California, San Francisco; UCB, University of California, Berkeley

4 First Author | Title/Topic | UCSF Staff Dr. Huai Yang (China)? Dr. Som Leakhann (Cambodia) Dr. Kongxay Luangphengso (Laos) Dr. Has Phal Mony (Cambodia) Dr. Sarun Saramony (Cambodia) Dr. Yin Jiaxiang (China) ? Factors influencing client accessibility for VCCT in Cambodia Field evaluation of F1 Antigen enzyme-linked assay for serodiagnosis of animal plague in Yunnan, China Participants

5 First Author | Title/Topic | UCSF Staff Dr. Chanpheth Phothilath (Laos) Effect of education and condom promotion on knowledge among service women Dr. Trinh Thanh Thuy (Vietnam) Dr. Worawit Kitisak (Thailand) Dr. Chakrarat Pittayawonganon (Thailand) Retrospective evaluation of chest radiograph screening for TB among HIV+s Enhancing HIV/AIDS passive case surveillance through fungal OI diagnosis and reporting Tsunami injury/casualties Participants

6 First Author | Title/Topic | UCSF Staff Dr. Asadong Wannachak Epidemiology of Liver Cancer Dr. Panithee Thammawijaya Dr. Nattaphon Yampikulsakul Dr. Rapeepan Dejpichai Dr. Pawinee Doung-ngern Dr. Theerayudh Sukmee Dr. Kusak Bumrungsana Chronic mass hysteria from spiritual beliefs and teenage affection Effect of Cadmium on renal function Sero-prevalence among people living near confirmed H5N1 cases Assessment of early warning and rapid response in Thailand ? Participants (Thailand)

7 First Author | Title/Topic | UCSF Staff Dr. Chawanee Sintuvongsanon ? Dr. Chaiyaporn Suchatsoonthorn Dr. Warin Pongkankham Dr. Wandee Kongkaew Dr. Karoon Chanachai ???????? Participants (Thailand)

8 Why publish? Ethical obligation to subjects and society To have the greatest public health and clinical impact Really understand your topic Currency of academic research Future grant applications

9 Course objectives By the end of the workshop, you will hopefully have: A complete draft of your paper:  <3000 words, 3 tables, 1 figure, 20 references A template for writing future research papers The experience of peer review A timeline for submission and publication

10 Sections of a research paper Introduction Methods Results Discussion Acknowledgments References Tables Figures Title page -Long title -Short (running) title -Authors -Affiliations -Correspondence Abstract Key words

11 Structure of the Course 2 weeks in length  First week for lectures and drafting sections of paper  Last week for completing writing with mentors and for formal peer review Assigned reading and writing sections to be completed each day Peer review of each section first thing each morning in week 1 Individual mentoring, writing time, statistical consultation  Usually in the afternoon, and more time towards the end of the course Peer review of full articles  Last two days of course

12 Lecture topics - Week 1 Monday 30 January:  Course overview  Title, introduction, literature review, references  Group exercise – “Elevator test” Tuesday 31 January:  Choosing a statistical test  Methods Wednesday 1 February:  Results  Tables and figures

13 Lecture topics- Week 1 Thursday 2 February:  Discussion Friday 3 February:  Abstract  Authorship, title page, choosing a journal, instructions to authors, cover letter, submission

14 Lecture topics - Week 2 Monday 6 February  The peer review process Tuesday 7 February  Responding to reviewers’ comments Wednesday 8 February  Draft manuscript is due at 4:00 PM

15 Lecture topics - Week 2 Thursday 9 February  Peer reviews - Groups 1 and 2 Friday 10 February  Peer reviews - Group 3 and 4  Completed manuscript is due at 4:00 PM  Course wrap up and evaluation  Graduation

16 Additional course activities Every day:  One-on-one work with advisors/instructors  Team writing  Individual writing Statistical consultation and analysis

17 Additional course activities Tea x 2 Lunch

18 The Research Question Monday 30 January 2006 Lecture 2 Bangkok Scientific Writing Workshop 30 January - 10 February 2006

19 Learning to summarize research study or question in 1-2 sentences Forces author to understand and synthesize all the important elements of the study Valuable skill for communicating clearly with colleagues Applicable format to describe a research proposal, a study underway, or one that is completed

20 Examples Describing a study already completed: We present the results of a randomized controlled trial among HIV-uninfected Thai injection drug users that evaluated if a recombinant gp120 vaccine reduces the incidence of HIV infection. Describing a proposal: Using an observational longitudinal cohort design, we will determine whether HIV+ pregnant women who take vitamin supplements have improved pregnancy outcomes, compared to women who are not taking supplements.

21 Three elements of research summary statements 1. Study design  Trial – randomized, controlled, blinded (or not)  Cohort – longitudinal, cross-sectional, double, retrospective  Other sampling designs – cross sectional consecutive, convenience, chart review 2. Subjects Men, women, HIV infected/uninfected, place of recruitment (clinic, hospital, community, geographic area – India, Africa, US) 3. Primary variables  Predictor  Outcome

22 Examples of research questions We present the results of a year-long randomized controlled trial (1. study design) among 3000 HIV-uninfected Thai injection drug users (2. subjects) that evaluated if a recombinant gp120 vaccine (3. predictor) reduces the incidence of HIV infection (3. outcome). Class to identify 3 components: Using an observational longitudinal cohort design, we will determine whether HIV-infected pregnant women who take vitamin supplements have improved pregnancy outcomes, compared to women who are not taking supplements.

23 Class introductions and examples Everyone please introduce themselves WILL EACH PARTICIPANT PLEASE TELL THE CLASS: Your name A summary of your paper/study including the 3 main elements Areas needing most work – your goals for workshop

24 Getting Started Monday 30 January 2006 Bangkok Scientific Writing Workshop 30 January - 10 February 2006 Lecture 3

25 First: get organised The best papers are hinged on a primary finding and its significance  Identify and select which findings you want to present in this paper  Avoid including everything-- can write other papers to elaborate on other findings  Think of the “MPU”: minimal publishable unit Know the literature -- be a scholar

26 Second: be familiar with the specific manuscript structure of journals Obtain “Instructions for Authors” from the journal in which you wish to publish  Examples: Journal of the Medical Association of Thailand AIDS Lancet Read model papers:  From the journal where you’d like to publish your paper  On a topic similar to your paper

27 Third: make an outline Make an outline with major headings:  Introduction  Methods  Results  Discussion Use subheadings for Methods and Results Make lists of each major point to be addressed in the introduction and the discussion Keep manuscript parts together in one electronic and hard copy file

28 Example of an outline I. Introduction (general to specific) A. Men who have sex with men (MSM) exist in all countries and cultures B. MSM are severely affected by HIV/AIDS 1. Prior research in developing world has focused mostly on Brazil and Thailand 2. More recently, studies conducted in India C. However, almost nothing is known about MSM in Africa D. We implemented a survey of MSM in Uganda to gauge their level of risk for HIV

29 Fourth: start writing Fill in outline as sentences and paragraphs are written Be concise  Short sentences, short paragraphs  Use sub-headings to keep organized  Shorter papers have better chance of publication

30 Tips for writing well Start each paragraph with a topic sentence Flow: move smoothly between paragraphs  Thought of last sentence flows into thought of first sentence of next paragraph Avoid clichés  “Important”, “Significant”  “More research is needed” (unless you specifically say what is needed)

31 Tips for writing well Peer review at least once; twice is better Re-write & re-write & re-write At some point soon: “Out the door” If rejected, re-submit:  Every paper has a home  Reviewers can be biased and capricious  So can editors

32 Comments on general style: Use scientific English (or Thai!) Papers don’t have to be bland or boring Use concise language and sentences Imitate writing and language conventions of the field (psychology vs. clinical or public health writing) Use active voice (active verbs) as much as possible Stick to facts that can be documented, and avoid speculation Avoid the use of “I”. Limited use of “we” is OK. American vs. British conventions  Spelling: depends on the journal; be consistent  Laboratory values

33 Two orders for writing manuscripts

34 In what order should an article be written? 1. Results -- Put tables and data together first --  Use “working tables” to organize and understand data and relationships – too lengthy for publication, but useful for author  Helps to identify primary 1-2 findings of the paper  Rule of thumb: 3 tables and 1 figure for publication 2. Writing up results: follow order of the tables and figure  Describe subjects, distribution of demographics, main variables and main outcome (“univariate analyses”)  Bivariate analyses: association of predictors with main outcome  Multivariate and longitudinal analyses  Elaborate upon single most important finding  Sub-analysis of important groups and potential biases

35 In what order should an article be written? 3. Methods  Matches how you got Results (no more, no less) 4. Discussion  Primary important finding clearly stated first – punch line  Relevant other findings, confirmation of other studies, enhancing causal inference  Surprising, contradictory, unexpected findings  Limitations  Public health implications (HIV prevention or care)

36 In what order should an article be written? 5. Introduction If written last – allows you to lead reader appropriately  First identify the general issue (HIV epidemiology, prevention, care in Asia, Thailand)  Specific issue  What is missing in current knowledge  How this study will address holes in current understanding 6. References  20 is usually sufficient

37 In what order should an article be written? 7. Title  Title should reflect single main finding, or main point of study, and should be interesting 8. Abstract  Usually written last; falls more easily into place once results and discussion are written

38 Alternative order of writing Introduction first  Use background section of your research protocol  Use this system if you need to research the literature to understand importance or context of primary findings. Can help focus. Methods early on  Easy to write if already known  Helps you to recall exactly what was done in the study – particularly important if you didn’t implement or design the study. Clear understanding of methodology and its limitations is important for interpretation of results  Write this section if still waiting for analyses to be completed

39 Alternative order of writing Results, table and figures  Always construct tables first, before writing Discussion This is the method we’ll use

40 Tips for writer’s block Only work on a topic that you are interested in Just start.  Start filling in easy pieces  Don’t worry how it looks at first. It’s always easier to edit Stay here, no e-mail, no cell phones Write incrementally, by sentence, by paragraph, by section

41 Title, Title page, Introduction and References Lecture 4 Bangkok Scientific Writing Workshop 30 January - 10 February 2006 Monday 30 January 2006

42 Title page Title (today) Authors (more later) Author’s affiliations Corresponding author’s address Word count text, word count abstract Disclosures, conflicts of interest, funding, previous presentations


44 Types of titles Explanatory Interrogatory Declarative Cute and Catchy

45 The explanatory title Says exactly how the study was done  Study design  Main outcome  Main predictor  Study population, site Advantages  Most common, recognized, standard  Sometimes required by journal  Targets relevant audience with key title words Disadvantages  Can be boring, long

46 Examples of explanatory titles Descriptive HIV voluntary counseling and testing and HIV incidence in male injecting drug users in northern Thailand: evidence of an urgent need for HIV prevention Analytic Lack of association between human immunodeficiency virus type 1 antibody in cervicovaginal lavage fluid and plasma and perinatal transmission, in Thailand. Intervention The efficacy of fluconazole 600 mg/day versus itraconazole 600 mg/day as consolidation therapy of cryptococcal meningitis in AIDS patients.

47 The interrogatory title Poses the most important question Advantages  Catches interest  Focussed Disadvantages  You better answer the question in the paper!  Says little about the design  May not be allowed by journal

48 Examples of interrogatory titles HIV seroconversion among factory workers in Phnom Penh: Who is getting newly infected? Is there a heterosexual HIV epidemic in the United States? (Note: this paper did not answer the question) Are recent increases in sexual risk behaviour among older or younger men who have sex with men? Answer: Both. (Note: question and answer in title)

49 The declarative title Says the main finding as a simple sentence Advantages  No one misses the point  Interesting, provocative, focussed  Good for conference abstracts Disadvantages  May not be allowed by journal  Invites disagreement

50 Examples of declarative titles HIV infection may adversely affect clinical response to chloroquine therapy for uncomplicated malaria in children Deferral of blood donors with HIV risk factors saves lives and money in Zimbabwe Low socioeconomic status is associated with a higher rate of death in the era of highly active antiretroviral therapy, San Francisco

51 The cute and catchy title Uses a pun, humor, or trendy term Advantages  Catches attention, interesting, provocative  Good for conference abstracts Disadvantages  Glib, flippant, sometimes in bad taste  May not be allowed by journal

52 Examples of cute and catchy titles Cruising on the Internet highway The gay 90s: a review of research in the 1990s on sexual behaviour and HIV risk among men who have sex with men A tale of two futures: HIV and antiretroviral therapy in San Francisco

53 What kind of title is each one? Case-control study of risk factors for Penicillium marneffei infection in HIV-infected patients in northern Thailand (actual title) What are the risk factors for Penicillium marneffei infection in HIV-infected patients in northern Thailand? Contact with soil is a risk factor for Penicillium marneffei infection in HIV-infected patients in northern Thailand. Good penicillin and bad penicillin: risk factors for Penicillium marneffei infection in HIV-infected patients in northern Thailand

54 Class participation Volunteer to present your explanatory titles Volunteer to present your interrogatory titles Volunteer to present your declarative titles Volunteer to present your cute and catchy titles

55 Additional tips for titles Are the title and main research finding closely related? Is the title objective in tone?  If not declarative, can you back it up? Are special features of the study mentioned?  E.g., Randomized, population-based, unique population, new method

56 Introduction Think of the Introduction as 4 sentences: 1.The general situation 2.The specific situation 3.The gap in our knowledge of the specific situation 4.What you did to fill the gap

57 Introduction Build 4 sentences into 3-4 paragraphs 10-20 references Progress from general facts to specific facts End on how your study fits into progression:  Entirely new hypothesis  More rigorous methodology, higher order of study RCT, Population-based  Special, new population  New measure, test

58 Example of a 4-sentence introduction 1. General: HIV/AIDS care and prevention are rapidly expanding in the world, Asia, Thailand 2. Specific: Surveillance data, usually ANC-based, guide planning of care and prevention programs 3. Gap: Few routine data on men, non-pregnant women 4. How we fill gap: We analyzed trends in HIV prevalence at VCT sites to assess usefulness as surveillance tool

59 Expand to 4-paragraph introduction I. Introduction: A. General: Cambodia has a high level of STIs 1. Estimated numbers of STI 2. Impact on HIV transmission B. Specific: Low partner treatment hinders STI control 1. Partner treatment is standard part of STI management 2. Barriers to partner treatment 3. Little partner treatment is done in Cambodia C. Gap: Few studies of client-centered partner notification in Asia 1. Hong Kong: Patient-delivered partner treatment 2. Vietnam: Physician counseling D. How we filled gap: We conducted a controlled trial of a single-session client-centered partner notification counseling intervention

60 Example of Introduction Class participation: Attempt 4-sentence Introduction of your paper

61 References In the Internet age:  No longer need an exhaustive literature review for every paper  A scientific paper is not a doctoral dissertation!  Less need for bibliography-like references  Less need to “prove” you are a world expert, but you still need to show you understand the issues  20 journal papers are usually sufficient (10 are even better)

62 References in the Introduction Specific facts, assertions, assumptions Seminal studies Review papers on topic Model for your paper (be sure your study has not already been done) 10 to 15 are usually sufficient

63 References in the Methods Previous publications from the same study  Especially if more detailed New or unique measures, lab tests Previously validated questionnaires New or complex theoretical models Occasionally unusual statistical tests 0, 1 or 2 are usually sufficient NO REFERENCES IN THE RESULTS!

64 References in the Discussion Strengthen causal inference:  Consistency in studies with similar or different methods  Biological plausibility, coherence  Alternative explanations Contradicting studies 10 – 15 are sufficient (some already used in Introduction)

65 Hierarchy of references Recent, peer-reviewed journal articles Very recent conference abstracts Guidelines (from respected institutions, WHO, UNAIDS, CDC) Medical text books Websites (from respected institutions, factual) Reports (if easily obtained, official) Dissertations (hard for others to access) Newsletters, fact sheets, non-peer reviewed

66 Cautious use of references In press (add full reference before printing) Personal communication (person, date) Unpublished data by one of the co-authors Unpublished data by someone else

67 Avoid use of references From popular press Future publications:  Submitted but not accepted  Not yet submitted, in preparation  Not yet written References you don’t have handy (you will be asked to cite fully) References you have not read

68 Additional tips for references Don’t overdo it:  Too many for one fact or common knowledge Results section should not have references Line references up with corresponding facts within the sentence  Otherwise in chronological order Redo search if long lag in publication

69 Format for references Vancouver style, most common for biomedical journals van Griensven F, Thanprasertsuk S, Jommaroeng R, et al. Evidence of previously undocumented epidemic of HIV in men who have sex with men in Bangkok, Thailand. AIDS 2005; 19:521-526. Psychology journals tend to use different formats (AIDS and Behavior)
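
The journal-article pattern in the example above (authors, title, journal, year, volume, pages) can be sketched as a small formatter. This is only an illustration: the function name, its parameters, and the cut-off of three authors before "et al" are assumptions taken from the single example on the slide, and the author names below are placeholders. Always check the target journal's Instructions for Authors.

```python
def vancouver_reference(authors, title, journal, year, volume, pages, max_authors=3):
    """Build a Vancouver-type journal reference string.

    max_authors is an assumption: journals differ on how many authors
    are listed before "et al".
    """
    if len(authors) > max_authors:
        author_text = ", ".join(authors[:max_authors]) + ", et al"
    else:
        author_text = ", ".join(authors)
    title = title.rstrip(".")  # avoid a doubled full stop after the title
    return f"{author_text}. {title}. {journal} {year}; {volume}:{pages}."


# Placeholder reference, not a real publication.
example = vancouver_reference(
    ["Author A", "Author B", "Author C", "Author D"],
    "Example title",
    "Example Journal",
    2005,
    19,
    "1-10",
)
print(example)
```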

70 Format for references For now, use placeholder:  [Last name of first author, year; next, year]  E.g., [Baryarama, 2002; Kaharuza, 2003] Number using journal format, do last!  Superscript. 1-4,12  Bracket [1-4,12]. Parentheses (1-4,12).  Psychology journals (Baryarama & Kaharuza, 2003).
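
The "placeholder now, number last" step can be automated. The sketch below is one possible approach, not a tool used in the workshop: it assumes placeholders are written exactly as [Name, year; Name, year], numbers them in order of first appearance, and returns the ordered list of keys so the reference list can be built to match. It does not collapse runs such as 1,2,3 into 1-3.

```python
import re


def number_citations(text):
    """Replace [Author, year; Author, year] placeholders with bracketed numbers."""
    order = {}  # placeholder key -> reference number, in order of first use

    def replace(match):
        numbers = []
        for key in match.group(1).split(";"):
            key = key.strip()
            if key not in order:
                order[key] = len(order) + 1
            numbers.append(order[key])
        return "[" + ",".join(str(n) for n in numbers) + "]"

    # A placeholder is a bracket holding one or more items ending in a 4-digit year.
    pattern = r"\[([^\[\]]*?\d{4}(?:\s*;[^\[\]]*?\d{4})*)\]"
    numbered_text = re.sub(pattern, replace, text)
    reference_keys = sorted(order, key=order.get)
    return numbered_text, reference_keys


draft = ("HIV testing is expanding [Baryarama, 2002; Kaharuza, 2003]. "
         "Uptake varies by site [Kaharuza, 2003].")
numbered, refs = number_citations(draft)
print(numbered)
print(refs)
```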

71 Getting ready to write the Results Section: The “elevator test” Class participation—5 volunteers needed! You get into the elevator with your boss. He or she asks: “What did you find in that research study you did?” You have one minute before the two of you get off on his or her floor. Explain the single most important finding of your study in one minute.

72 Today’s homework Outline:  Major sections, major sub-headings, main points Medline literature search:  Find 5 - 10 key references Draft:  Title  Introduction (as 3-4 sentences or 3-4 paragraphs)

73 Additional activities today Lunch Work with mentors/analysts and write for remainder of the afternoon We will divide you into four peer review groups for tomorrow and give each person a partner Please give your partner your title and a copy of your introduction section at the end of the day today Please make six more copies of your title and introduction for tomorrow morning’s peer review

74 Methods Lecture 5 Bangkok Scientific Writing Workshop 30 January - 10 February 2006 Tuesday 31 January 2006

75 Methods Describe how you did the study with enough detail for the reader to judge the study’s strengths and weaknesses Can be used to repeat study if needed

76 Methods Four key points to communicate: 1.Study design 2.Subjects 3.Measurements 4.Analysis

77 Methods Do not present results  Exception: Psychology literature sometimes presents recruitment and subject description in Methods  Assume our papers will be biomedical (include description of subjects in Results)  But describe recruitment in methods here

78 Methods But, methods are linked to Results section  If you add data to the Results, be sure that how the data were collected is described in the Methods  If you decide not to include certain analyses in the Results, then drop their description from the Methods E.g. If a lengthy clinical exam and laboratory evaluation were part of the study, but you are not describing these results, you do not need to provide detail in the methods; they can be mentioned, however.

79 Methods 500 – 750 words 0, 1 or 2 references  Previous paper from same study  Validated questionnaire  New laboratory test  Complex theoretical model for behavioral studies

80 Methods Sub-sections with headings recommended Find model paper to follow sub-section headings Outline each sub-section Expand each sub-heading to a short paragraph

81 Minimal Methods sub-headings Subjects Measurements Analysis

82 Expanded Methods sub-headings Overall study design Setting Study subjects Study procedures and measurements Intervention (if any) Laboratory methods Data analysis Ethical considerations

83 1. Overall study design Pivotal sentence Think one sentence:  “We conducted an X study of the effect of Y on Z in a population of W in Q from year R to year S.” (sound familiar? Study description......) This sentence may appear at end of Introduction, instead

84 1. Overall study design Basic study designs:  Cross-sectional survey  Case-control study  Cohort study (longitudinal, retrospective, cross-sectional or showing BL data only)  Trial – Randomized, controlled  Before-after study Combinations (describe both) Special cases (more later)

85 1. Overall study design Elements to include:  Prospective or retrospective (usually understood by basic design)  Blinded  Randomized  Secondary analysis of data collected for other purpose  Descriptive, exploratory vs. hypothesis testing  Time frame – state specifically when the study was conducted, and the length of f/u

86 Examples of overall study design “We analyzed trends in the prevalence of HIV infection over a 9-year period (1992-2000) from a large database of rural Thai VCT clients tested at 4 established sites.”

87 Examples of overall study design Class participation: Is anyone unclear about their study design?

88 2. Setting Geography  County, urban, rural  Describe typical demographics of setting  Chance for a more interesting literary description – i.e. What kind of a place is Chiang Rai, Bantei Meanchay, Phuket? Facility  Hospital, clinic, VCT sites—private, public, large etc.  Can describe typical clientele

89 2. Examples of settings Geography: “The setting of our study is Phnom Penh, Cambodia, the country’s capital and largest city...” Facility: “The Phuket and the Takua Pa Hospitals are tertiary care facilities located in southern coastal Thailand near the areas most heavily damaged by the tsunami of 26 December 2004.”

90 2. Setting Setting may be included in Overall Study Design Setting may be included in Study Subjects

91 3a. Study subjects Who are they? Where do they come from? How did you sample them? How did you recruit them specifically? How did you enroll them?

92 3a. Study subjects Who did you enroll :  Inclusion criteria  Exclusion criteria  Don’t spell out as separate headings Time frame if relevant  Season may matter

93 3b. Study subjects Sampling design  Random sample- how randomized  Consecutive, convenience  Sub-sample of larger study  Special sampling procedures Venue-Day-Time Respondent-Driven Sampling

94 3c. Study subjects Procedures for:  Initial contact, or recruitment: came in through clinic; recruited through peer outreach; responded to advertisement  Enrollment: may be different from place of recruitment; enrollment implies that they met inclusion criteria  Consent: informed, signed or not  (IRB approvals)

95 3d. Study subjects Special case: Case-control subjects:  Describe larger population  Describe definition of cases  Describe selection of controls  Describe matching (or mention no matching)

96 3d. Study subjects Special case: Randomized controlled trial:  Describe larger population from which subjects were drawn  Randomization procedures  Blinding Subjects Interviewers Researchers Statisticians
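
When the Methods must describe "randomization procedures", it helps to be able to state exactly how the allocation list was generated. The sketch below shows one common procedure, permuted-block randomization; the slides do not say which procedure any study used, so the block size of 4, the two arm names, and the seed are illustrative assumptions.

```python
import random


def block_randomize(n_subjects, block_size=4, arms=("intervention", "control"), seed=2006):
    """Return an allocation list built from shuffled, balanced blocks."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # fixed seed so the list can be regenerated and audited
    per_arm = block_size // len(arms)
    allocations = []
    while len(allocations) < n_subjects:
        block = list(arms) * per_arm  # equal numbers of each arm within a block
        rng.shuffle(block)
        allocations.extend(block)
    return allocations[:n_subjects]


allocations = block_randomize(12)
for subject_number, arm in enumerate(allocations, start=1):
    print(subject_number, arm)
```

A Methods sentence based on this would name the block size, the allocation ratio, and who generated and held the list.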

97 4a. Measurements Describes how data were collected If a qualitative study, describe:  Type of interview (in-depth interviews, open-ended, semi-structured)  Focus group discussions: recorded, transcribed, how many per group  How long it took, who performed them, and where  Confidentiality, names, etc.

98 4b. Measurements If a quantitative study based on questionnaire data, describe the interview and questionnaire:  Interviewer-administered, self-administered, ACASI  How long did it take; where done  How developed – piloted, revised, translated, back translated?  Were any items standardized?  Was a follow-up questionnaire done – if so, did it differ?  Describe general questionnaire domains, for example: demographic characteristics, general health, HIV/AIDS knowledge and attitudes, risk behaviors

99 4c. Measurements Clinical evaluation  Physical exam- by whom, of what, including any particular measurements?  Treatment provided? Follow-up exam? Include only those measurements that are ultimately presented in Results

100 4d. Measurements Special situations  Data collected for other purposes, secondary analysis  Data abstraction procedures- for chart reviews; meta-analyses  Other data sources: Census Other studies Assumptions of models

101 5. Intervention Details of intervention as intended  Theory (e.g., behavioral)  Components, logistics  “Dose”, intensity of program Control activities Intervention and Control Activities may be separate sub-headings Results may describe what was actually delivered

102 6. Laboratory methods Often separate sub-heading Screening, confirmatory tests; may need to indicate parameters for a positive test (OD cut-off) Manufacturer of tests:  Product name (Company name, City, State or Country) References for new, experimental tests Indicate where performed – which lab

103 7a. Data analysis, statistical methods Discuss where and into what program data were entered; where stored; where analyzed  “Data were entered on site into Access, transferred to SAS, and evaluated for range and logic checks. The data were then transferred to the server at the TUC data management center for analysis using SAS version 9.1 (SAS Institute, Cary, NC).”

104 7b. Data analysis, statistical methods Focus on primary analysis Statistical tests in order of use in Results  Univariate- Distributions of variables were evaluated using means, SD, median, range, and proportions  Bivariate- Differences in proportions were evaluated using chi-square tests, differences in means using t-tests. Odds ratios were calculated with 95% confidence intervals using logistic regression
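
To make the bivariate step concrete, the sketch below computes a Pearson chi-square statistic and an odds ratio with a 95% confidence interval directly from a 2x2 table. Note the substitution: the slide names logistic regression, whereas this sketch uses the closed-form 2x2 calculation with a Wald (Woolf) interval on the log scale, which gives the same unadjusted odds ratio for a single binary predictor. The counts are invented for illustration.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Rows: exposed / unexposed. Columns: outcome yes / outcome no.
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald 95% confidence interval computed on the log scale."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Invented counts: 30 of 100 exposed and 15 of 100 unexposed have the outcome.
chi2 = chi_square_2x2(30, 70, 15, 85)
odds_ratio, ci_lower, ci_upper = odds_ratio_ci(30, 70, 15, 85)

# With 1 degree of freedom, a statistic above 3.84 corresponds to p < 0.05.
print(f"chi-square = {chi2:.2f}")
print(f"OR = {odds_ratio:.2f} (95% CI {ci_lower:.2f}-{ci_upper:.2f})")
```

For adjusted odds ratios with several predictors, a statistical package is needed; this closed form only covers the unadjusted two-group case.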

105 7c. Data analysis, statistical methods Variables  Identify primary predictor variables – particularly if collapsed, composite variable  Identify primary outcome variable  Scales – whether Cronbach-alphas used; factor analysis; new scales created or modified
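
If a scale is reported with Cronbach's alpha, the value comes from the item variances and the variance of the total score: alpha = k/(k-1) x (1 - sum of item variances / variance of totals), where k is the number of items. A minimal sketch with invented scores for three items and five respondents:

```python
import statistics


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: one list of scores per item, all in the same respondent order.
    """
    k = len(items)
    item_variances = [statistics.variance(scores) for scores in items]
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Invented data: 3 items answered by 5 respondents.
scale_items = [
    [2, 3, 4, 4, 5],
    [1, 3, 3, 4, 5],
    [2, 2, 4, 5, 5],
]
alpha = cronbach_alpha(scale_items)
print(f"Cronbach's alpha = {alpha:.2f}")
```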

106 7d. Data analysis, statistical methods Special analyses  Multivariate Which variables were included and criteria – associated in bivariate analysis, and p<.10  Stratification Male vs. female, young vs. old  Sub-group analyses  Analysis of potential biases Participants vs. non-participants Lost to follow-up vs. retained in longitudinal studies
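
The inclusion rule for the multivariate model (associated in bivariate analysis at p < .10) is easy to apply and document mechanically. A small sketch, with hypothetical variable names and p-values:

```python
def select_for_multivariate(p_values, threshold=0.10):
    """Return the variables whose bivariate p-value is below the threshold."""
    return sorted(name for name, p in p_values.items() if p < threshold)


# Hypothetical bivariate results.
bivariate_p = {
    "age_group": 0.03,
    "sex": 0.45,
    "injection_drug_use": 0.001,
    "education": 0.08,
}
selected = select_for_multivariate(bivariate_p)
print(selected)
```

Whatever threshold is used, the Methods should state it, along with any variables forced into the model regardless of their p-value.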

107 7e. Data analysis, statistical methods Additional considerations  Collapsing of variables, transformation  Power estimation prior to study (not common)  Consideration of statistical significance P < 0.05 P < 0.01 if many comparisons P < 0.1 for interactions, inclusion in model

108 8. Ethical considerations Also can be at end of Study Subjects or on cover page Approval by IRBs Special considerations:  Vulnerable populations (prisoners, minors)  Exempt or waiver of IRB (use of secondary data)  Waiver of informed consent

109 Examples of ethical considerations: simple “Human subjects review boards in Cambodia and the United States approved the study protocol.” – too vague Better: the IRB committees of the Cambodian National Institute of Public Health, the Centers for Disease Control and Prevention and UCSF reviewed and approved the study

110 Examples of ethical considerations: special population “Subjects aged 15 to 18 years were considered emancipated minors and able to consent to the study. The protocol for this study was reviewed, approved, and monitored by the ethical committees of the Mahidol University and the University of California, San Francisco.”

111 Examples of ethical considerations: exempt “The analyses presented in this report consisted only of secondary unlinked data analysis; no contact with human subjects occurred.” This statement would go at the end of the “Data Collection” section

112 Special methods situations (find models to follow) Randomized Controlled Trials  Special structure (CONSORT) Evaluation of diagnostic tests Mathematical modeling Secondary analysis of multiple data sources

113 Special methods situations (find models to follow) Review papers Meta-analysis and systematic reviews Cost-effectiveness analysis Non-biomedical journals

114 Today’s homework Draft (revise):  Methods (minimum outline sub-headings)  Revise title pages and introductions as needed  Please give a copy of your methods section to your partner and have six additional copies ready for peer review tomorrow morning

115 Statistical Analyses of Data Lecture 6 Bangkok Scientific Writing Workshop 30 January - 10 February 2006 Tuesday 31 January 2006

116 Steps in data analysis-1 Data collection Data entry Data cleaning These FIRST THREE STEPS ARE CRITICAL – your analysis will only be as good as the data that are collected- “garbage in, garbage out”

117 Data cleaning Should be ongoing – from initial data entry through analysis The earlier you clean your data, the better – can sometimes be too late  May need to recall the subject, talk to the interviewer or clinician, repeat lab tests Primary tools for data cleaning are range and logic checks- can be automatic if programmed into data entry system

118 Data cleaning For each measure, examine the data by:  Range checks (accuracy) - be sure there are no impossible values. Example – subjects with age of 99 years, 60 children etc.  Logic checks - do the data make sense? Look for discrepant results Consistency of answers between interviewers –Example: subject initially circumcised on baseline visit; on f/u described as uncircumcised –50 subjects marked as undergoing syphilis testing; have results for 53 subjects
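A range check and a logic check of the kind described above can be sketched in a few lines of Python. This is only an illustration: the records, the field names (`age`, `circumcised_baseline`, `circumcised_followup`) and the plausible age limits are all hypothetical, not taken from any workshop data set.

```python
# Hypothetical records; field names and age limits are assumptions for illustration.
records = [
    {"id": 1, "age": 34, "circumcised_baseline": True, "circumcised_followup": True},
    {"id": 2, "age": 199, "circumcised_baseline": False, "circumcised_followup": False},
    {"id": 3, "age": 27, "circumcised_baseline": True, "circumcised_followup": False},
]


def range_check(record, low=15, high=99):
    """Range check: is the age inside the plausible interval?"""
    return low <= record["age"] <= high


def logic_check(record):
    """Logic check: circumcised at baseline cannot become uncircumcised at follow-up."""
    return not (record["circumcised_baseline"] and not record["circumcised_followup"])


range_failures = [r["id"] for r in records if not range_check(r)]
logic_failures = [r["id"] for r in records if not logic_check(r)]
print("Range check failures:", range_failures)
print("Logic check failures:", logic_failures)
```

Checks like these are most useful when they run at data entry, so that a flagged record can still be resolved with the interviewer or clinician.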

119 Data cleaning Missing values  Look for missing values and missing files  What are reasons for missing values? Subject/ client didn’t respond Data weren’t recorded Data weren’t entered Something was lost – file, questionnaire etc. If numbers of missing values are large – need to resolve; if numbers are small and data set is large – can leave for now

120 Data cleaning How many of you have “clean” data? How have you determined that they are clean?

121 Data set preparation “Freezing” the database - the “dataset” - for analysis - no more cleaning or data entry Variable items should be labeled –not by questionnaire number; response codes should be labeled for ease of interpreting output Variables may need to be reformatted, new variables created; create composite variables Output should be labeled by date with a cover sheet indicating contents; file the output

122 Statistics made simple

123 Topics to be covered - 1 1.Basic “descriptive” statistics  Types of variables (2): Categorical (dichotomous) Continuous variables  Proportions /percents – categorical  Means, medians – continuous Standard deviations, range, confidence intervals

124 Topics to be covered - 2 2.Basic “analytical” statistics  Differences in proportions, between groups  Differences in means, between groups  Incidence, and incidence rates

125 Topics to be covered - 3 3.Measures of bivariate association  Relative risk  Odds ratios  Relative hazard  Correlation coefficients  Kaplan-Meier and survival curves

126 Topics to be covered - 4 4.Statistical tests and when to use them (depending on the predictor and outcome variables)  Dichotomous predictor/ dichotomous outcome (difference in proportions) Chi-square /z-statistic, Fisher’s exact test: differences in proportions  Dichotomous predictor/ continuous outcome (difference in means) T-test: differences in means  Continuous predictor and continuous outcome Linear regression

127 Topics to be covered - 5 5.Meaning of a p-value  How to present p-values 6.Multivariable analysis  Stratification  Multivariate logistic regression Statistical test of differences in association Presentation of results  Multivariate linear regression

128 Topics to be covered - 1 1.Basic “descriptive” statistics  Types of variables (2): Categorical (dichotomous) Continuous variables  Proportions /percents – categorical  Means, medians – continuous Standard deviations, range, confidence intervals

129 First step in analysis Descriptive statistics – frequencies of variables Important initial step in determining how clean your data set is Understanding your data

130 Types of variables Continuous  Quantitative intervals in order  Examples: Number of sexual partners Weight Age Categorical  Dichotomous (yes/no) (dead/alive, AIDS)  Nominal (name) – no order (race, marital status, occupation)  Ordinal (order) (WHO stages of HIV infection, levels of education) Continuous variables can be either distributed symmetrically (normally) or asymmetrically

131 Descriptive statistics: Univariate analysis Frequency distributions of all potential variables of interest Number of observations –missing values For categorical/dichotomous variables:  Proportions in each category of response  Point prevalence, surveillance– 95% CI For continuous variables  Mean, standard deviation  Median, range of values  95% confidence interval Create categorical variables as needed from continuous variables; create composite variables

132 How to describe the distribution of categorical (dichotomous) variables Number of observations = 250 Proportions- percents  Yes = 175/250 = 0.7 (70%)  No = 75/250 = 0.3 (30%) [Fig. 1. Number of subjects who received VCT testing, N = 250. x-axis: response category; y-axis: number in each category. Yes: N = 175 (70%); No: N = 75 (30%).]

133 Put initial data and reformatted dichotomous and categorical variables into working tables
Variable                                  N     % with missing (rounded)   N without missing   % without missing
Consent to VCT? (Var 1)                   260                              252
  Yes, test in home                       100   38.5% (39%)                                    40%
  Yes, test in center                     25    9.6% (10%)                                     10%
  Yes, intend to test                     72    26.9% (27%)                                    29%
  No, won’t test                          55    21.2% (21%)                                    22%
  Missing                                 8     3.8% (4%)                                      --
Among those who intend to test,
eventually tested? (Var 2)                72                               70
  Yes                                     50    69.4% (69%)                                    71%
  No                                      20    27.8% (28%)                                    29%
  Missing                                 2     2.8% (3%)                                      --
Received VCT (comp. var)                                                   250
  Yes                                     175                                                  70%
  No                                      75                                                   30%
  Missing                                 10

134 How should continuous variables be described?

135 Distribution of a theoretical continuous (interval) variable Number of observations (no. of subjects) = 100 Mean # partners =? Median = ? Mode = ? Range = ? How is the variable distributed? What is the relationship between mean, median, mode? No. of lifetime sexual partners Fig. Distribution of no. of lifetime sexual partners No. of subjects

136 Meaning of terms Mean (M) = average value Standard deviation (SD, $\sigma$) measures the variation in the values around the mean (M) for N observations, and estimates the spread in the population from which the sample was taken: $\sigma = \sqrt{\sum_i (y_i - M)^2 / (N - 1)}$ Standard error of the mean (SE) = $\sigma / \sqrt{N}$ Indicates how precisely the sample mean estimates the population mean; always smaller than the SD when N > 1 95% confidence interval for the mean = $M \pm 1.96\,\sigma / \sqrt{N}$
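A minimal sketch of these quantities in Python, using a made-up sample of ten values (the data are hypothetical). The 95% interval is computed as mean ± 1.96 × standard error, the usual large-sample form for a mean; with a sample this small a t-based multiplier would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample: number of lifetime partners reported by 10 subjects.
values = [2, 4, 4, 5, 6, 7, 8, 9, 10, 15]

n = len(values)
mean = statistics.mean(values)   # M, the average value
sd = statistics.stdev(values)    # sample SD, divides by N - 1
se = sd / math.sqrt(n)           # standard error of the mean

# Large-sample 95% confidence interval for the mean.
ci_low = mean - 1.96 * se
ci_high = mean + 1.96 * se

print(f"mean={mean}, SD={sd:.2f}, SE={se:.2f}, 95% CI=({ci_low:.2f}, {ci_high:.2f})")
```

The SD describes the spread of individual values, while the SE and the confidence interval describe how precisely the mean itself has been estimated.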

137 What descriptive statistics should be shown? Standard deviation ($\sigma$) – show if you want to indicate the spread of the variable in the sample/population Standard error (of the mean) = $\sigma / \sqrt{N}$ Generally less used, unless you want to show how “precise” the measurement is; or want to have a small SE – i.e. look good Confidence interval (e.g., 95%) = better way of showing precision of the estimate

138 Confidence intervals are used to estimate precision of a result Descriptive results: Means, proportions Analytic results: Relative risk, odds ratios To compare RR, OR The wider the confidence interval, the less precise the estimate; a narrower interval usually reflects a larger sample size e.g.: Prevalence of HIV in population is 15% (95% CI: 1%,30%) vs 15% (95% CI: 11%,19%)

139 Continuous variables: Non-normal distributions, skewed data Mean > median > mode: skewed by data at higher values Mode > median > mean: skewed by data at lower values Example: mean = 3.65, median = 3.0, mode = 2.0 [Fig. No. of MSM sexual partners in the last month in SF and Eureka; N=77. x-axis: no. of sexual partners; y-axis: no. of subjects; pink: SF, green: Eureka; mode, median and mean marked.]

140 What descriptive statistics should be shown for non-normally distributed or asymmetrical continuous variables? Median, with range: Range indicates the upper and lower values N=77 subjects in study of MSM in Eureka Median # partners = 3.0 (Range: 1-9) Mean = less appropriate Median, with 25th-75th percentiles (interquartile range): indicates that 50% of all values lie within this range, with 25% above and 25% below

141 Presentation of continuous variables Working tables
Variable (N=100): No. lifetime sex partners
  Mean (+/- SD)     10 (+/- 3); 95% CI (4-14)
  Median (range)    10 (1-19)
No. lifetime partners    N     (%)
  1                      1     1%
  2-5                    14    14%
  6-10                   40    40%
  11-15                  35    35%
  16-19                  10    10%

142 Continuous variables – initial evaluation Working tables
Variable      N     Mean +/- SD            Median     Range
Age, years    300   23.6 +/- 15.1 yrs      20.1 yrs   3-80
What would you do with these data?

143 Topics to be covered - 2 2.Basic “analytical” statistics  Differences in proportions, between groups  Differences in means, between groups  Incidence, and incidence rates

144 Bivariate analysis Comparison of distribution (%, mean) of predictor variables within outcome groups Comparison of distribution (%, mean) of outcome variables within predictor groups

145 2x2 table Useful presentation for many types of data
                      Outcome variable
Predictor variable    Present   Absent   Total
Present               a         b        a+b
Absent                c         d        c+d
Total                 a+c       b+d      N

146 Prevalence of outcome in dichotomous predictor groups Row %
                      Outcome variable
Predictor variable    Malaria +    Malaria -    Total
+HIV                  100 (67%)    50 (33%)     150
-HIV                  25 (25%)     75 (75%)     100
Total                 125          125          250
Can describe differences in prevalence: 67% of those with HIV had malaria, compared to 25% of those without HIV.

147 Differences in means or medians between 2 groups (either predictor or outcome) PredictorOutcome variable HIV+HIV- Mean # sexual partners, lifetime 20 (95% CI: 10-30) 5 (95% CI: 1-10) Among HIV+ subjects, mean # lifetime partners was 20, compared to HIV- subjects, in whom mean # lifetime partners was 5.

148 Incidence
                       Outcome
Predictor              Present   Absent   Total
Present (exposed)      a         b        a+b
Absent (unexposed)     c         d        c+d
TOTAL                  a+c       b+d      N
Incidence density (ID) is the overall incidence of the outcome in the sample: ID = (a+c)/N Incidence in exposed = a/(a+b) Incidence in unexposed = c/(c+d)

149 Incidence is a rate and is measured over time New HIV infections in 1 year
Predictor       HIV+   HIV-   Total
Intervention    1      99     100
Control         4      96     100
TOTAL           5      195    200
Denominators are standardized to person-years of exposure (usually per 100 person-years) Calculate total number of persons and total follow-up time for each Overall incidence (incidence density) = 5/200 person-years = 2.5/100 person-years = 2.5% per person-year
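The person-year arithmetic on this slide can be checked with a short Python sketch. It assumes, as the slide does, that each of the 100 subjects per arm contributes one full person-year of follow-up; the function name `incidence_per_100_py` is made up for this illustration.

```python
def incidence_per_100_py(new_cases, person_years):
    """Incidence rate expressed per 100 person-years of follow-up."""
    return 100 * new_cases / person_years


# Counts from the slide: 100 subjects per arm, each followed for 1 year.
intervention = incidence_per_100_py(new_cases=1, person_years=100)
control = incidence_per_100_py(new_cases=4, person_years=100)
overall = incidence_per_100_py(new_cases=5, person_years=200)

print(intervention, control, overall)
```

In a real cohort, follow-up time differs between subjects, so the denominator is the sum of each person's time at risk rather than the head count.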

150 Topics to be covered - 3 3.Measures of bivariate association  Relative risk  Odds ratios  Relative hazard  Correlation coefficients  Kaplan-Meier and survival curves

151 Strength of associations Need a way to indicate strength of association between predictor and outcome variables. This can be estimated by:  Risk ratio  Relative risk  Relative risk for incidence (hazard)  Odds ratio  Correlation coefficient Terms risk ratio, rate ratio, relative risk, relative hazard are often mixed and confused

152 Measures of association used are based on type of study
Study type                                   Measure of association
Cross-sectional                              Risk ratio
Case-control                                 Odds ratio
Cohort - longitudinal or cross-sectional     Relative risk; odds ratio
Experimental                                 Relative risk

153 Other measures of association
Measure of association                          Explanation
Incidence                                       Risk over time
Relative risk – hazard ratio                    Ratio of incidence between groups
Attributable risk or relative risk reduction    Difference in incidence between two groups
Population attributable risk                    Excess risk in a population due to a risk factor

154 Risk ratio is the association between dichotomous variables in cross-sectional studies
                      Outcome variable
Predictor variable    Present   Absent   Total
Present               a         b        a+b
Absent                c         d        c+d
Total                 a+c       b+d      N
Risk ratio = [a/(a+c)] ÷ [b/(b+d)] Risk of predictor in outcome group ÷ risk of predictor in group without outcome

155 Relative risk is the association between dichotomous variables in cohort studies and experiments
Predictor variable        Outcome variable at end of study
at beginning of study     Present   Absent   Total
Present                   a         b        a+b
Absent                    c         d        c+d
Total                     a+c       b+d      N
Relative risk = [a/(a+b)] ÷ [c/(c+d)] Risk of outcome in group with predictor ÷ risk of outcome in group without predictor

156 Relative risk Estimates the magnitude of an association: the probability that the outcome will occur given an exposure, compared to the probability in non-exposed persons RR = prevalence in exposed group ($I_e$) ÷ prevalence in non-exposed group ($I_o$) RR=1 No association RR>1 Risk of outcome increased RR<1 Risk of outcome decreased (protective)

157 Relative risk – example
                      Outcome variable
Predictor variable    Malaria +    Malaria -    Total
+HIV                  100 (67%)    50 (33%)     150
-HIV                  25 (25%)     75 (75%)     100
Total                 125          125          250
Relative risk: (100/150) / (25/100) = 0.67/0.25 = 2.7 HIV+ persons have 2.7 times the risk of malaria of HIV-negative persons.
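The relative risk in this example can be reproduced with a small Python function. The cell labels a, b, c, d follow the 2x2 layout used on the earlier slides; the function name is only illustrative.

```python
def relative_risk(a, b, c, d):
    """Relative risk from a 2x2 table.

    a, b = exposed with / without the outcome
    c, d = unexposed with / without the outcome
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


# HIV and malaria example: 100/150 HIV+ and 25/100 HIV- subjects had malaria.
rr = relative_risk(a=100, b=50, c=25, d=75)
print(round(rr, 2))  # 2.67
```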

158 Relative risk – used when evaluating incidence rates
                       Outcome
Predictor              Present   Absent   Total
Present (exposed)      a         b        a+b
Absent (unexposed)     c         d        c+d
TOTAL                  a+c       b+d      N
Incidence in exposed = a/(a+b) Incidence in unexposed = c/(c+d)

159 Relative risk – used when evaluating incidence rates N=250 subjects, malaria measured over 1-year period, no drop-out
                      Outcome variable
Predictor variable    Malaria +    Malaria -    Total
+HIV                  100 (67%)    50 (33%)     150
-HIV                  25 (25%)     75 (75%)     100
Total                 125          125          250
Relative risk of malaria incidence = (100 per 150 person-years) ÷ (25 per 100 person-years) = 0.67/0.25 = 2.7

160 Odds ratio is the association between dichotomous variables in case-control studies
                Outcome variable
Predictor       Present   Absent   Total
Present         a         b        a+b
Absent          c         d        c+d
Total           a+c       b+d      N
OR = ad/bc Odds Ratio: Odds of outcome in those with predictor ÷ Odds of outcome in those without predictor = (a/b) ÷ (c/d) = ad/bc Odds of outcome among those with predictor = number with outcome (a) ÷ number who don’t develop outcome (b)

161 Odds ratio PredictorOutcome variableTotal PresentAbsent Presentaba+b Absentcdc+d Totala+cb+dN Odds Ratio: Cohort study Odds of outcome in those with predictor/ Odds of outcome in those without predictor = (a/b) / (c/d) = ad/bc Case control: Odds of predictor in those with outcome/ Odds of predictors in those without outcome = (a/c) / (b/d) = ad/bc
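Both routes to the odds ratio (outcome odds by predictor, as in a cohort study, and predictor odds by outcome, as in a case-control study) reduce to ad/bc. The sketch below checks this in Python using the HIV and malaria counts from the relative risk example; the function names are made up for the illustration.

```python
def odds_ratio_cohort(a, b, c, d):
    """Odds of outcome in those with the predictor / odds in those without."""
    return (a / b) / (c / d)


def odds_ratio_case_control(a, b, c, d):
    """Odds of predictor in those with the outcome / odds in those without."""
    return (a / c) / (b / d)


def odds_ratio(a, b, c, d):
    """Cross-product form, ad/bc."""
    return (a * d) / (b * c)


cells = (100, 50, 25, 75)  # a, b, c, d from the HIV and malaria table
print(odds_ratio(*cells))
print(odds_ratio_cohort(*cells))
print(odds_ratio_case_control(*cells))
```

For these counts the odds ratio is 6.0 while the relative risk is about 2.7: when the outcome is common, the odds ratio lies further from 1 than the relative risk does.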

162 Interpretation of Odds ratios OR = 1.0 No effect OR >1.0 Effect, greater odds of outcome OR <1.0 Effect, lower odds of outcome

163 Precision of measurements - 95% confidence intervals Generally – indicates that if the study were repeated numerous times, 95% of the intervals calculated would contain the true value Symmetrical around the value for means and point estimates (proportions) (estimate ± 1.96 × standard error) Not symmetrical around the value for RR or OR (because calculated using log-values) When to Use: Means Point estimates /proportions OR -- also indicates statistical significance RR -- also indicates statistical significance
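To illustrate why the interval around a relative risk is not symmetrical, the sketch below computes a 95% CI on the log scale and then exponentiates the limits. It uses a common large-sample formula for the standard error of ln(RR); this is one standard approach, not necessarily the one used by any particular statistics package. The counts are the HIV and malaria example (a=100, b=50, c=25, d=75).

```python
import math


def relative_risk_ci(a, b, c, d, z=1.96):
    """Relative risk with a large-sample 95% CI computed on the log scale."""
    rr = (a / (a + b)) / (c / (c + d))
    # Standard error of ln(RR) for a 2x2 table.
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    low = math.exp(math.log(rr) - z * se_log_rr)
    high = math.exp(math.log(rr) + z * se_log_rr)
    return rr, low, high


rr, low, high = relative_risk_ci(100, 50, 25, 75)
print(f"RR = {rr:.2f} (95% CI: {low:.2f}, {high:.2f})")
print("Distance below RR:", round(rr - low, 2), "above RR:", round(high - rr, 2))
```

The upper limit sits further from the point estimate than the lower limit, and because the whole interval lies above 1.0 the association is statistically significant at p < 0.05.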

164 Correlation coefficients measure strength of association between two continuous variables If relationship is linear: $r = \text{slope} \times SD_{\text{predictor}} / SD_{\text{outcome}}$ r = correlation coefficient SD = standard deviations of predictor and outcome $r^2$ = the proportion of spread (variance) in one variable that can be explained by the other variable Example: Family income is correlated with years of education. If r = 0.9 and $r^2$ = 0.81, then 81% of spread in income (variance) can be explained by differences in education The closer $r^2$ is to 1.0, the stronger the association
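The relationship between the regression slope and the correlation coefficient can be verified numerically. The sketch below uses invented education and income figures (purely hypothetical) and computes r twice: directly, and as slope × SD of predictor ÷ SD of outcome.

```python
import math
import statistics

# Hypothetical data: years of education (predictor) and income in arbitrary units (outcome).
education = [6, 8, 10, 12, 14, 16]
income = [11, 15, 18, 25, 27, 33]


def slope_and_r(x, y):
    """Least-squares slope of y on x, and the Pearson correlation coefficient."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


slope, r_direct = slope_and_r(education, income)
r_from_slope = slope * statistics.stdev(education) / statistics.stdev(income)

print(f"slope={slope:.3f}, r={r_direct:.3f}, r from slope={r_from_slope:.3f}")
print(f"r squared={r_direct ** 2:.3f}")
```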

165 More complicated types of analyses for longitudinal studies Kaplan-Meier & survival analysis – time-to-event analysis K-M shows time to death, or proportion still alive Used when follow-up periods and drop out are different Survival analysis – shows proportion of persons free of the event (death or disease) over time

166 Kaplan-Meier analysis Proportion of total subjects remaining alive Time, months

167 Topics to be covered - 4 4.Statistical tests and when to use them (depending on the predictor and outcome variables)  Dichotomous predictor/ dichotomous outcome (difference in proportions) Chi-square /z-statistic, Fisher’s exact test: differences in proportions  Dichotomous predictor/ continuous outcome (difference in means) T-test: differences in means  Continuous predictor and continuous outcome Linear regression

168 Statistical tests measure the probability that an association (between predictor and outcome) at least as strong as the one observed would arise by chance alone

169 Basic statistical tests

170 Outcome variable (columns) by predictor variable (rows)
Predictor continuous, normally distributed:
  Outcome continuous, normally distributed: Correlation, linear regression, F test
  Outcome continuous, not normally distributed or ordinal with >2 categories: Spearman rank correlation
  Outcome nominal with >2 categories: Analysis of variance (F test)
  Outcome dichotomous: Logistic regression (likelihood ratio)
Predictor continuous, not normally distributed or ordinal with >2 categories (entries as listed):
  Spearman rank correlation; Kruskal-Wallis
Predictor nominal with >2 categories (entries as listed):
  Analysis of variance (F test); Kruskal-Wallis; Contingency table (Chi-squared)
Predictor dichotomous:
  Outcome continuous, normally distributed: Comparison of means (t test); ANOVA
  Outcome continuous, not normally distributed or ordinal with >2 categories: Wilcoxon rank sum
  Outcome nominal with >2 categories: Contingency table (Chi-squared)
  Outcome dichotomous: Chi-squared or z test
But the reality is a little more complicated…

171 Analysis of a dichotomous outcome variable by a dichotomous predictor variable Chi-squared or z-test is used for dichotomous predictors and outcomes, or for 2 x 3 (4,5) associations Tests differences in proportions Fisher’s exact test is used when the expected values in any cell are <5
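As a rough illustration, the Pearson chi-squared statistic for a 2x2 table can be computed by hand from observed and expected counts. The sketch below uses the HIV and malaria counts from the earlier slides, applies no continuity correction, and also checks the expected-count rule that would point to Fisher’s exact test. The function name is made up for this example.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) and expected counts."""
    n = a + b + c + d
    observed = [a, b, c, d]
    row_totals = [a + b, a + b, c + d, c + d]
    col_totals = [a + c, b + d, a + c, b + d]
    # Expected count in each cell = row total x column total / N.
    expected = [r * k / n for r, k in zip(row_totals, col_totals)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected


chi2, expected = chi_squared_2x2(100, 50, 25, 75)

# With 1 degree of freedom, P(chi-squared > x) equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

# Rule from the slide: prefer Fisher's exact test if any expected count is < 5.
use_fisher = min(expected) < 5

print(round(chi2, 2), p_value < 0.001, use_fisher)
```

In practice a statistics package would report the same statistic; the point here is only to show where the number comes from.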

172 Analysis of a dichotomous predictor variable and a continuous outcome variable Analyze by t-test Use the same test when you have a continuous predictor variable and a dichotomous outcome variable, i.e. it is used to test differences in means between categories ANOVA – test differences between multiple means

173 Analysis of an experiment/trial Analyze like a cohort study with RR Intention-to-treat analysis  Most conservative analysis  Include all subjects assigned to a treatment or control group, including those who never received the intervention Subgroup analysis  Subgroups should be identified before randomization

174 Topics to be covered - 5 5.Meaning of a p-value  How to present p-values 6.Multivariate analysis  Stratification  Multivariate logistic regression Statistical test of differences in association Presentation of results  Multivariate linear regression

175 Interpretation of p-value When you run a statistical test, you will obtain a p-value p stands for probability From a statistical point of view, you are testing the “null” hypothesis, that there is “no effect” A p-value of 0.05 means that, if there were truly no effect, an association at least as strong as the one seen would occur by chance about 5% of the time

176 p-values and statistical significance What p-values are “significant”? Is p=0.052 different from p=0.049? p=0.05 is a convention You can show actual p-values, because even if they are greater than 0.05, they can suggest that there is an association even if it is not “statistically significant” For p<.01 you don’t show actual values  p<.001, p<.0001 adequate

177 p-values and confidence intervals Confidence intervals (CI) can also show statistical significance of an effect size (such as RR, OR) A 95% CI that includes the value 1.0 indicates that the effect is not statistically significant; p>.05  RR = 1, no effect; RR < 1.0 (protective effect); RR > 1.0 (positive effect)

178 p-values and confidence intervals Examples

179 Table 6.26 Independent Predictors of Coronary Heart Disease Among 2124 Middle Aged Subjects
Predictor                                 Relative Risk*   95% Confidence Interval   p
Male                                      1.7              1.1-2.6                   .01
Age (per 10 yr)                           1.6              1.4-2.0                   <.0001
Serum cholesterol (per 20 mg/dL)          1.3              1.0-1.8                   .05
Systolic blood pressure (per 10 mm Hg)    2.0              1.1-3.6                   .02
Current Smoker (vs. never smoked)         3.0              1.7-5.4                   <.0001
*Relative risks approximated with odds ratios from logistic regression model.

180 Table 6.27 Univariate Predictors Not Associated with Lung Cancer (After Adjustment for Other Factors in Multivariate Models)
Predictor                    Univariate OR (95% CI)*   Multivariate OR (95% CI)*   Removed by
Thinness (<90% IBW)†         2.1 (1.3-3.1)             1.4 (0.8-2.5)               Subject’s smoking
Income (per $10,000)         0.8 (0.6-1.0)             1.0 (0.8-1.2)               Age
Spouse’s smoking (yes/no)    3.1 (1.5-6.2)             1.3 (0.7-2.2)               Subject’s smoking
Body weight (per 5 kg)       0.6 (0.4-0.8)             0.9 (0.7-1.2)               Disease stage
*Relative risks approximated with odds ratios. CI denotes confidence intervals. †IBW=ideal body weight
Which predictors are significant at p<.05 value?
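One way to work through the question on this slide is to check, for each odds ratio, whether its 95% confidence interval excludes 1.0. The Python sketch below applies that rule to the limits printed in Table 6.27; the dictionary layout and function name are only illustrative.

```python
# (lower, upper) 95% CI limits as printed in Table 6.27.
table = {
    "Thinness (<90% IBW)": {"univariate": (1.3, 3.1), "multivariate": (0.8, 2.5)},
    "Income (per $10,000)": {"univariate": (0.6, 1.0), "multivariate": (0.8, 1.2)},
    "Spouse's smoking (yes/no)": {"univariate": (1.5, 6.2), "multivariate": (0.7, 2.2)},
    "Body weight (per 5 kg)": {"univariate": (0.4, 0.8), "multivariate": (0.7, 1.2)},
}


def excludes_one(ci):
    """True if the whole interval lies above or below 1.0 (significant at p < .05)."""
    low, high = ci
    return low > 1.0 or high < 1.0


significant = {
    model: [name for name, cis in table.items() if excludes_one(cis[model])]
    for model in ("univariate", "multivariate")
}

print("Univariate:", significant["univariate"])
print("Multivariate:", significant["multivariate"])
```

Income is a borderline case: its univariate upper limit is printed as 1.0, so the rounded interval touches 1.0 and the rule above does not count it as significant.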

181 Inferring causality in observational studies Just because an association is “statistically significant” does not mean that the predictor variable has caused the outcome (this is referred to as “causality”) Example from class – association, causality?

182 Inferring causality in observational studies Strengthens concept of causality: Predictor variable precedes the outcome variable in time Strength of association Biological plausibility Association observed in different studies with different designs Strength of association increases as exposure to predictor increases (dose response)

183 Subgroup analysis Associations may be stronger in subgroups – example: stratify by gender/age/marital status etc. Subgroup analyses are frequently the most interesting and show the associations most clearly

184 Topics to be covered - 5 5.Meaning of a p-value  How to present p-values 6.Multivariable analysis  Stratification  Multivariate logistic regression Statistical test of differences in association Presentation of results  Multivariate linear regression

185 Strategies for confounding variables Definition of confounding variable? Examples:

186 Confounding variables Definition of confounding variable?  Factor that is associated with both the predictor of interest and the outcome Examples:  Gum chewing is associated with smoking, and therefore gum chewing appears to be associated with lung cancer. Actual relationship is between smoking and lung cancer  Age is associated with HIV infection. Older age is associated with greater number of sexual partners; age confounds the relationship between number of sexual partners and HIV infection.

187 Strategies for confounding variables In the analysis phase  Stratification – by confounder or variable  Statistical adjustments – through multivariate analysis

188 Stratification Separate participants into strata (or subgroups) by potential confounding variables (e.g., smokers and non-smokers) Advantages: can be done after data collection, is flexible and one can un-do stratification Disadvantages: lack of power (due to size of subgroups) and it is necessary to have measured the co-variates of interest
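Stratification can be illustrated with the gum chewing, smoking and lung cancer example. The counts in the Python sketch below are entirely invented to make the point: gum chewing looks harmful in the crude table, but the association disappears within each smoking stratum.

```python
def relative_risk(a, b, c, d):
    """a, b = exposed with / without outcome; c, d = unexposed with / without outcome."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical counts (a, b, c, d): exposure = gum chewing, outcome = lung cancer.
strata = {
    "smokers": (16, 64, 4, 16),
    "non-smokers": (1, 19, 4, 76),
}

# The crude table ignores smoking: add the strata cell by cell.
crude = tuple(sum(cells) for cells in zip(*strata.values()))

crude_rr = relative_risk(*crude)
stratum_rr = {name: relative_risk(*cells) for name, cells in strata.items()}

print("Crude RR:", round(crude_rr, 2))
for name, rr in stratum_rr.items():
    print(f"RR among {name}:", round(rr, 2))
```

In this made-up data set, gum chewers are mostly smokers, so the crude relative risk of about 2.1 reflects smoking rather than gum chewing; within each stratum the relative risk is 1.0.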

189 Statistical adjustment Variety of techniques can control for multiple confounding variables simultaneously Includes techniques like multivariate regression  Linear regression when the outcome variable is continuous  Logistic regression when the outcome variable is dichotomous (predictors can be continuous or categorical) Permits the full use of continuous variables Statistics (e.g., AOR, adjusted OR) are more or less difficult to understand

190 Multivariate logistic regression Various types of logistic regression models Multivariate logistic regression looks at the association of a particular predictor with the outcome, while simultaneously “controlling” for other predictors (while holding them constant) Include :  Variables for which you almost always want to control – e.g. age  Include variables that are significantly associated with the outcome in bivariate analysis  Some variables will not remain significant in multivariate model

191 Table 6.27 Univariate Predictors Not Associated with Lung Cancer (After Adjustment for Other Factors in Multivariate Models)
Predictor                    Univariate OR (95% CI)*   Multivariate OR (95% CI)*   Removed by
Thinness (<90% IBW)†         2.1 (1.3-3.1)             1.4 (0.8-2.5)               Subject’s smoking
Income (per $10,000)         0.8 (0.6-1.0)             1.0 (0.8-1.2)               Age
Spouse’s smoking (yes/no)    3.1 (1.5-6.2)             1.3 (0.7-2.2)               Subject’s smoking
Body weight (per 5 kg)       0.6 (0.4-0.8)             0.9 (0.7-1.2)               Disease stage
*Relative risks approximated with odds ratios. CI denotes confidence intervals. †IBW=ideal body weight

192 More complicated analysis of longitudinal studies Cox proportional hazards models  When using survival analysis, associations between predictors and outcomes are expressed as hazard ratios GEE models – longitudinal models with multiple time points and measurements

193 Summary Examine the distribution of each variable individually Analyze your primary hypothesis with bivariate analysis Measure the strength of association Calculate the statistics for the comparison (e.g., p value) Control for confounding variables

194 Results Lecture 7 Bangkok Scientific Writing Workshop 30 January - 10 February 2006 Wednesday 1 February 2006

195 General recommendations for Results Follow the sequence of Tables and Figures Follow the sequence of the Methods (or vice versa) Use sub-headings if complex or many secondary analyses Think five paragraphs  1000 to 1250 words, 4 – 5 pages

196 Results in five paragraphs 1.Study population: eligible and recruited (1a: follow-up if RCT or prospective study) 2.Key variables and primary outcome (univariate) 3.Primary hypothesis (bivariate) 4.Multivariate analysis (confounding factors, interactions) 5.Stratification, subanalyses, examination of biases, corroborative analyses

197 Results paragraph 1: recruitment Number of persons approached  Where  When Number of persons eligible  Why any excluded Number of persons enrolled  Reasons for refusal Differences between cases and controls Differences between intervention and control arms

198 Results paragraph 1: recruitment Example: “A total of 246,715 clients age 15 years or older had their first test performed at the 4 main branches over the period January 1992 to December 2000. From these clients, we excluded 44,974 who reported illness as a reason for testing to avoid selection bias due to higher prevalence of HIV infection (61% vs. 14%, respectively) and an increase in the proportion of such clients over time (14% to 19%). The proportions of clients excluded due to illness were similar across sites: 15% Kampala; 17% Jinja; 25% Mbarara; and 20% Mbale.”

199 Results paragraph 1a: follow-up Use only for cohort studies and experiments Number of persons lost to follow-up Reasons for loss to follow-up Any difference between those lost to follow-up and those who continued in the study? Remember that follow up is part of both Newcastle-Ottawa and CONSORT criteria

200 Results paragraph 1a: follow-up Example: “A total of 137 persons (50%) completed the follow-up interview. Overall follow-up did not differ by study arm assignment (p=0.22). Persons lost to follow-up in the intervention arm did not differ from persons lost to follow-up in the control arm with respect to gender, age, education, employment, marital status, and types of partners (all p values >0.05). However, persons who completed follow-up were more likely to be women than men (58% vs. 42%, p=0.008) and less likely to have regular sexual partners (43% vs. 55%, p=0.048)...”

201 Results paragraph 2: describe key variables and primary outcome Univariate results (Table 1) Focus written text on main findings  Relevant demographic characteristics  Describe any differences in cases vs. controls  Describe any differences in experiment vs. control (placebo)  Main outcome prevalence or incidence density (here or in paragraph 3)

202 Results paragraph 2: describe key variables and primary outcome Example: “Of 201,741 clients meeting the inclusion criteria, 49% were female, and 71% were younger than 30 years of age (Table 1). About half of these clients were single, and about one quarter were seeking premarital testing...” May also include in paragraph 2: “Overall, adjusted prevalence of HIV infection declined from 23% to 13%...”

203 Results paragraph 3: associations with outcome Bivariate associations with main outcome (Table 2) Focus written text on significant findings  Statistically significant  Clinically significant

204 Results paragraph 3: main associations with primary outcome Example: “Overall, adjusted prevalence of HIV infection declined from 23% to 13%, with a decrease from 17% to 9% among men (P < 0.001) and from 31% to 17% among women (P < 0.001) (Table 2)...” Expansion to additional main findings:...Among men, significant findings...Among women, significant individual vs. couples site, etc.

205 Results paragraph 4: multivariate analysis Independent associations with main outcome (Table 3) Main hypothesis, single most important finding Focus written text on ruling out confounders

206 Results paragraph 4: multivariate analysis Example: “Table 3 shows the results of the multivariate analysis comparing partner notification outcome by study arm controlling for potential confounding by partner types, gender, age, and employment status. Subjects allocated to the intervention counseling were significantly more likely to notify any partner compared to those in the control arm (OR 4.1, 95% CI 1.3 – 13.2).”

207 Results paragraph 5: sub-analyses Stratified analyses (Table 4)  Special sub-populations  Effect modification (interactions)  Focus written text on differences between sub-populations

208 Results paragraph 5: sub-analyses Temporal trends (Figure 1)  Focus written text on significant increases or decreases Ruling out biases Secondary aims

209 Examples of sub-analyses Class participation: What important sub- analysis might you include in your paper?

210 Additional tips for results Don’t mix Methods into Results  If you conduct a new analysis or sub-analysis, add into Methods Don’t mix Discussion into Results  No interpretation beyond self-evident  (Also, don’t introduce Results into Discussion, go back) Be clear and concise Double check numbers, do they add up?

211 Tables and Figures Lecture 8 Bangkok Scientific Writing Workshop 30 January - 10 February 2006 Wednesday 1 February 2006

212 Tables Why use tables? Tables present your information at a glance -- often readers will look only at tables and not read the Results section Therefore, a reader must be able to understand your study population and all your results (i.e. your paper) by looking at the tables. Compared to the text in the manuscript, tables:  Present data more compactly  Allow for side-by-side comparisons of data

213 Tables General approach Follows same format as the Results section Table 1: Descriptive characteristics of your population – distribution of demographic characteristics; can be divided by groups of interest (among males/females; HIV+/-; disclosure/nondisclosure). Even if descriptive characteristics not a major component of the paper, all readers want to know basic information about the population

214 Tables General approach Complicated studies – particularly randomized trials – need an initial figure illustrating recruitment, proportion eligible, refused, dropped out, randomized, followed Table 2: Initial bivariate results – relationships between predictors and outcome. Table 3: More detailed bivariate relationships, or multivariate results Table 4 or Figure: Multivariate, survival analyses etc. Figures, graphs: use only if they provide new information in a more visually dramatic or understandable manner.

215 Tables General approach -2 Rule of thumb: if you have fewer than 5 or 6 pieces of information to present in a table, consider putting it in the text instead On the other hand, do not use excessive detail or you will detract from your overall message Clearly label the rows & columns to assist the reader—particularly for figures, clearly label x and y axis.

216 Components of a table Title Row and column headings The rows themselves The data Footnotes

217 Table components - outline
Table 8.1. Descriptive title, such as “Structure of a Typical Table”.*
                                              Outcome variable name (COLUMNS)
Heading with predictor variable names         Present     Absent
(ROWS; e.g. demographics, STIs, behavior)     N (%)       N (%)
First variable (e.g. age) categories
  <20                                         Data        Data
  20-25                                       Data        Data
Second variable categories                    Data        Data
*Not all tables follow this format
The table should make sense without the text.

218 Table components THE NEED FOR “N” All tables must have: 1.A total N – in title, or elsewhere 2.Denominators must be evident – must be able to calculate values in table 3.If numbers don’t add up, need to explain

219 Table titles should be descriptive enough to tell reader what will appear in the table
Table 8.2. Poor titles and better alternatives
Poor: Characteristics of subjects
Better: Characteristics of the 54 men enrolled in the trial
Poor: Effects of treatment of hypertension and placebo groups
Better: Comparison of active treatment with diuretic therapy compared with placebo in 122 men
Poor: Predictors of quality of life
Better: Factors associated with differences in quality of life: multivariate models
Poor: Independent (p<.05) predictors of quality of life using logistic regression following step-wise selection procedures, using the criteria of reference 6
Better: Factors associated with differences in quality of life: Multivariate models

220 Headings
Table 8.3 Selected hemodynamic measurements (Mean +/- SD) at baseline and during follow-up in 58 subjects with hypertension
Measurement | Baseline | Week 1 of treatment* | Week 6 of treatment*
Heart rate (per minute) | 76 ± 12 | 68 ± 8 | 65 ± 7
Systolic blood pressure (mm Hg) | 162 ± 21 | 142 ± 18 | 138 ± 14
Diastolic blood pressure (mm Hg) | 96 ± 12 | 82 ± 10 | 80 ± 6
*All measures showed significant (p <.01) differences from baseline at weeks 1 and 6.
The headings should be informative; don’t make reader refer back to the text. Use a brief description. Column headings reflect the comparison of primary interest. Column headings should be distinctive; use italics or bold. Put units in parentheses (or separated by commas) immediately after row descriptions.

221 Table formatting Rules for table details will be determined by the journal --- look at tables published in the journal you have chosen for examples and follow that format. Keep footnotes to a minimum; use only for essential details and abbreviations. Order or number your footnotes from top to bottom and within a line, from left to right. Use these symbols *, †, ‡, §, ║,¶. Double these symbols if you need more **, ††, etc.
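The footnote rule above (six symbols in a fixed order, doubled once they run out) is easy to get wrong by hand in a long table. A small Python sketch of that sequence follows; the function name `footnote_marker` is an illustrative assumption, and the journal you have chosen may specify a different order.

```python
# Symbol order as listed on the slide
FOOTNOTE_SYMBOLS = ["*", "†", "‡", "§", "║", "¶"]

def footnote_marker(index):
    """Return the marker for the index-th footnote (0-based): *, †, ... ¶, then **, ††, ..."""
    repeats, position = divmod(index, len(FOOTNOTE_SYMBOLS))
    return FOOTNOTE_SYMBOLS[position] * (repeats + 1)

# Markers for the first eight footnotes, numbered top to bottom and left to right
print([footnote_marker(i) for i in range(8)])
```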

222 Table formatting, continued Put the percentage symbol (%) right next to the number if space permits, e.g. 25%. Align the numbers in each column by using a centering tab function or centering the cells in the table layout. Center the column headings over the columns. Cite all the tables in the manuscript text. Adding formatting details and niceties makes it easier on the reviewer – always a good thing.

223 Types of tables Tables that list information rather than data, e.g. describing testing algorithms, treatment algorithms Tables listing characteristics of sample • Distribution of characteristics: %, % in categories, mean, median • Distribution of variables can also be given in relationship to an outcome variable, and therefore be “bivariate”; this provides more information in 1 table, rather than 2 tables

224 Types of tables Tables showing associations of predictor and outcomes: • Row percents – if showing proportion of outcome within predictor categories • Column percents – if showing distribution of characteristics between outcome groups (intervention, control; trial; case-control studies)
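The difference between row and column percents is only a difference of denominator, which a short calculation makes concrete. This is a minimal Python sketch with hypothetical counts (not data from any study in this workshop); rows are predictor categories and columns are outcome groups, and the function names are illustrative assumptions.

```python
def row_percents(table):
    """Percent of each row falling in each column (denominator = row total)."""
    return {r: {c: round(100 * n / sum(cols.values()), 1) for c, n in cols.items()}
            for r, cols in table.items()}

def column_percents(table):
    """Percent of each column falling in each row (denominator = column total)."""
    col_totals = {}
    for cols in table.values():
        for c, n in cols.items():
            col_totals[c] = col_totals.get(c, 0) + n
    return {r: {c: round(100 * n / col_totals[c], 1) for c, n in cols.items()}
            for r, cols in table.items()}

# Hypothetical counts: rows are the predictor, columns are the outcome
counts = {
    "Age <25": {"outcome": 20, "no outcome": 180},
    "Age 25+": {"outcome": 60, "no outcome": 240},
}
print(row_percents(counts))     # proportion with the outcome within each age group
print(column_percents(counts))  # age distribution within each outcome group
```

With these counts, the row percent says 10% of the younger group had the outcome, while the column percent says 25% of those with the outcome were in the younger group. The two answer different questions, so the table heading should say which one is shown.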

225 Table of lists, descriptions of methods
Infection | Biological Specimen | Test Used | Treatment
Chlamydia | Urine | PCR (Roche) | Azithromycin 1 gm PO
Gonorrhea | Urine, Urethral discharge | Smear & culture; urine PCR (Roche) | Cefixime 400 mg PO
Syphilis | Serology, Ulcer swab | VDRL/TPHA, RPR, PCR (Roche Multiplex) | 2.4 MU IM benzathine penicillin
Herpes simplex 2 | Serology, Ulcer swab | HSV2 IgG (Focus Tech), PCR (Roche Multiplex) | Acyclovir 400 mg tid for 10 days
H. ducreyi | Ulcer swab | PCR (Roche Multiplex) | Injectable ceftriaxone 250 mg IM
Hepatitis B | Serum | HBsAg |
HIV 1&2 | Serum | ELISA (Lab systems)/WB | Referral for management of OI

226 Tables of distribution of values and frequency
Table 8.5. Frequency of pathogens in 840 women with lower urinary tract infections.
Type of Pathogen | Frequency (%)
Gram-negative bacteria | 63
Escherichia coli | 35
Proteus mirabilis | 12
Gram-positive bacteria | 26
Staphylococcus aureus | 13
Enterococcus species | 9
Other bacteria | 11
Classroom Exercise: What are the problems with this table?

227 Table 8.5. Frequency of pathogens in 840 women with lower urinary tract infections.
Type of Pathogen | N | % of total
Gram-negative bacteria | 529 | 63%
  Escherichia coli | 294 | 35%
  Proteus mirabilis | 101 | 12%
Gram-positive bacteria | 218 | 26%
  Staphylococcus aureus | 110 | 13%
  Enterococcus species | 76 | 9%
Other bacteria | 92 | 11%
Suggestions for revisions: can offset numbers and percentages

228 Identify deficiencies in following table:
Table 8.6. Characteristics of the subjects.
Male | 594 (49.75%)
Female | 600 (50.25%)
Age | 64.47 ± 5.23
History of diabetes | 103 (8.63%)
History of CHD | 56 (4.69%)
Body weight | 7.41 ± 7.3
Shoe size | 9.2 ± 2.1
Calories per month | 62,125.4 ± 15,781.2
Problems: 1. Vague title 2. No column headings 3. No Total N provided 4. Both male and female categories not needed 5. Decimal to 1/100th not needed 6. Why is shoe size included? 7. Decimal point for calories per mo not needed 8. Why calories per month? 9. Should values be given by gender rather than for whole sample?

229 Suggested improved table Table 8.6. Characteristics of the 1194 Subjects Enrolled in the Better Eating Trial (BET). CharacteristicN (%) Male 594 (50) History of diabetes 103 (9) History of coronary heart disease 56(5) Age (yr)64 ± 5 Body weight (kg)74 ± 7 Calories per day2,070 ± 530 *Plus-minus values are means ±SD Text could read: similar numbers of men & women in study; 33% of subjects were over 65 years old; 25% were more than 10 kg above ideal body weight; most were free of chronic medical problems.

230 Table 8.7. Characteristics of the 1194 Subjects Enrolled in the Better Eating Trial (BET). CharacteristicPercentage of Mean ± SD Female50% History of diabetes9% History of coronary heart disease5% Age (yr)64 ± 5 Body weight (kg)74 ± 7 Calories per day2,070 ± 530 If actual numbers really don’t matter, an acceptable alternative is to show only the percentages and the means.

231 Stratify the subjects into groups if there are important differences between the groups
Table 8.8. Characteristics of 1194 subjects enrolled in the Better Eating Trial (BET), by gender.
Characteristic | Men (n=594) | Women (n=600)
Age (yr) | 62 ± 5 | 66 ± 6
Body weight (kg) | 80 ± 6 | 68 ± 8
History of diabetes (N, %) | 40 (7) | 63 (10)
History of coronary heart disease (N, %) | 38 (7) | 18 (3)
*Plus-minus values are means ±SD
Differences should also be pointed out in the text: Men were more than twice as likely to have a history of heart disease, and diabetes was 40% more common among women.

232 Results from a randomized trial – stratify by study groups
Table 8.9. Characteristics of 1194 subjects enrolled in the Better Eating Trial (BET), by randomization status
Characteristic | Special Diet (n=797) | Control (n=397) | p
Age (yr) | 64 ± 5 | 65 ± 6 | 0.35
Body weight (kg) | 74 ± 6 | 73 ± 6 | 0.42
History of diabetes | 8% | 9% | 0.26
History of coronary heart disease | 5% | 4% | 0.64
*Plus-minus values are means ±SD. P-values refer to differences in distribution of characteristics between the two groups
Percentages may be easier to follow, especially if the numbers in each study group vary a lot. Describe the differences and the statistical test to which the p-values refer.

233 Tables that compare groups When you compare groups you are presenting either of two types of information 1.The measurements or characteristics of the groups 2.The differences between the groups You need to decide which is more important because it will determine how you design your table

234 Table 8.10. Demographic profile and relationship to HIV status among high risk men in Mumbai (N=1901, 15% HIV+ overall). * p<0.05 for the difference HIV% between categories CharacteristicOverall N % HIV + % Age, yrs -16-25 -26-35 -36+ 1155 754 417 50% 32% 18% 11% 20% * 12% Married69730%15% Education <4 yrs 4-9 10+ 815 1028 447 37% 44% 19% 16%* 14% 11% Lives in slum/footpath Flat/chawl 1633 692 70% 30% 15%* 11% Long term migrant161470%14%

235 Table 8.10. Demographic profile and relationship to HIV status among high risk men in Mumbai (N=1901, 15% HIV+ overall). * p<0.05 for the difference HIV% between categories CharacteristicColumn N Adds up to total N in sample Column % Adds up to 100% Row % HIV + Age, yrs 16-25 26-35 36+ N=category 1155 754 417 Denominator= total N 50% 32% 18% Denominator = Row N 11% 20% * 12% Married69730%15% Education <4 yrs 4-9 10+ 815 1028 447 37% 44% 19% 16%* 14% 11%

236 When you want to emphasize the predictor variables themselves, give the column %
Table 8.11. Characteristics of 112 subjects enrolled in study of TB among patients in Mulago Hospital.
Characteristic (unit) | HIV+ (n=51) | HIV- (n=66) | p
Age, mean (yr) | 32 ± 8 | 23 ± 6 | <.001
Male | 55% | 45% | >.05
HSV2 infection | 70% | 32% | <.001
History of TB | 50% | 20% | <.05
*HSV2 = Herpes simplex virus 2 detected by IgG Focus Technologies
It is clear that the 2 types of subjects, HIV+ and HIV-, are different. Only a p-value is needed to show that the differences are statistically significant.

237 Emphasis on the comparison between groups (e.g. in a randomized trial)
Table 8.12. Effect of intensive vacuuming on pulmonary function at 6 months in the Vacuum Away Dust (VAD) study.
Measurement (unit) | Vacuum (n=60) | Control (n=57) | Vacuum-Control Difference (95% CI)* | p
Forced expiratory volume, 1 sec. (L) | 2.0 ± 0.6 | 1.6 ± 0.8 | 0.4 (0.1, 0.7) | <.01
Peak expiratory flow (L/min) | 290 ± 80 | 260 ± 120 | 30 (5, 55) | <.02
Prednisone dose (mg/day) | 10 ± 15 | 14 ± 12 | 4 (-2, 6) | >0.15
*CI = confidence interval.
When the emphasis is on the differences between the groups, the reader also needs to know whether the difference is significant, a measure of the effect size (in this study, the difference between means), and how precise it is.
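To show where a "difference (95% CI)" column can come from, here is a minimal Python sketch that recomputes the forced expiratory volume row of Table 8.12 from the group means, SDs and sample sizes. The slide does not say how its intervals were calculated, so the unpooled normal approximation (z = 1.96) used here is an assumption, and the function name is illustrative.

```python
import math

def mean_difference_ci(mean1, sd1, n1, mean2, sd2, n2, z=1.96):
    """Difference in means (group 1 minus group 2) with a normal-approximation CI."""
    diff = mean1 - mean2
    # Standard error of the difference between two independent means
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return diff, diff - z * se, diff + z * se

# Forced expiratory volume: vacuum 2.0 ± 0.6 (n=60), control 1.6 ± 0.8 (n=57)
diff, lower, upper = mean_difference_ci(2.0, 0.6, 60, 1.6, 0.8, 57)
print(f"{diff:.1f} ({lower:.1f}, {upper:.1f})")
```

Under this assumption the result rounds to 0.4 (0.1, 0.7), which agrees with the value shown in the table. The interval conveys precision; the difference itself is the effect size.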

238 Presenting multivariate results
Table 8.13. Independent predictors of coronary heart disease among 2124 middle-aged subjects.
Predictor | Relative risk* | 95% Confidence Interval | p
Male | 1.7 | 1.1-2.6 | .01
Age (per 10 yr) | 1.6 | 1.4-2.0 | <.0001
Serum cholesterol (per 20 mg/dL) | 1.3 | 1.0-1.8 | .05
Systolic blood pressure (per 10 mm Hg) | 2.0 | 1.1-3.6 | .02
Current Smoker (vs. never smoked) | 3.0 | 1.7-5.4 | <.0001
*Relative risks approximated with odds ratios from logistic regression model.
Use meaningful terms such as relative risk and provide units for the predictor values. Units sometimes need to be spelled out (e.g. current vs never smoker) and sometimes can be implied e.g. men compared to women.
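Statistical packages usually report logistic regression output as a coefficient and standard error per single unit of the predictor, so presenting "per 10 yr" or "per 20 mg/dL" requires rescaling before exponentiating. The Python sketch below illustrates the arithmetic; the coefficient and standard error are hypothetical values chosen for illustration, not output from the study behind Table 8.13.

```python
import math

def odds_ratio_ci(beta, se, unit=1.0, z=1.96):
    """Odds ratio and CI for a `unit`-sized increase in a predictor.

    beta and se are the logistic regression coefficient and its standard
    error per 1 unit of the predictor (e.g. per year of age).
    """
    point = math.exp(beta * unit)
    lower = math.exp((beta - z * se) * unit)
    upper = math.exp((beta + z * se) * unit)
    return point, lower, upper

# Hypothetical coefficient for age in years, presented "per 10 yr"
or_, lower, upper = odds_ratio_ci(beta=0.047, se=0.009, unit=10)
print(f"Age (per 10 yr): {or_:.1f} ({lower:.1f}-{upper:.1f})")
```

Choosing a unit that is clinically meaningful keeps the estimate away from uninformative values such as 1.05 per year.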

239 What should be left out of a table Don’t include everything that was measured. Pick out the important items and make your point. However, don’t make this determination just by what was statistically significant. This is misleading. To avoid accusations of multiple-hypothesis testing, have a few pre-specified hypotheses and indicate what they are. Report on these.

240 Checklist for tables 1.Is the title sufficiently descriptive without being too much/too long? 2.Do the rows and columns line up neatly? Is each column centered under its heading? Are there denominators for the column headings? Are the headings bolded or italicized? Do the row characteristics (predictor variables) have units? 3.Are there any unneeded data, repeated N’s, excessive precision, or ambiguous abbreviations? Ask yourself: Do I need it?

241 Checklist for tables 4.Do I need it in such glorious detail? Do I need to abbreviate it? 5.Is the meaning of every item obvious without referring to the text? 6.After you have completed all of your tables, ask yourself: Can two or more tables be combined? 7.Are all the tables cited in the text? Are they cited in order?

242 Figures

243 Why use figures? “One picture is worth a thousand words” But use caution and common sense  Figures are time consuming  Good at conveying overall effects but poor at conveying specific measurements  If details matter, use a table instead or put the exact values in the text --- figures can only show a few results  A poor figure is worse than no figure at all

244 Common types of figures Photographs Diagrams Data presentations Maps

245 Photographs Never assume the reader will recognize anything in a photograph Label everything that is relevant, using arrows, asterisks and common abbreviations Unless the scale of the photograph is obvious, include a ruler or indicate the magnification or reduction in the figure’s legend

246 Photographs Photographs are relatively expensive to publish and hard to include in an electronic version of the paper Make sure the photograph is really needed and adds to the paper To check clarity of the photograph, photocopy it –assume it will be copied over and over as your paper is passed around

247 Example of an unlabeled photograph: What is anatomic site? What exactly is the photograph showing?

248 Diagrams as figures Appropriate diagrams might include the flow of subjects in a study, complicated sampling schemes, or a genetic pedigree But keep it simple, err on the side of simplicity rather than thoroughness Use smaller fonts for the less important items or details Consider getting professional help -- good desktop publishing skills can make a diagram look professional and clear

249 As a general rule sampling schemes are displayed vertically Figure 8.1 Sampling scheme for the study 2311 subjects contacted 1485 agree to participate 750 control735 intervention 709 complete study699 complete study 691 ineligible 135 refuse 36 die or dropout 41 die or drop out RANDOMIZATION

250 SUI with FSW HIV/STI >10 FSW partners Any unprotected sex With FSW Anal sex with FSW aOR=1.4* OR=2.2* OR = 3.1* OR=1.5* aOR=1.1 aOR=1.7* aOR=0.8 aOR: adjusted OR from multivariable analysis * p-value <0.05 OR: unadjusted OR of SUI to individual risk factors Figure 8.2. Mediation analysis of relationship of sex under the influence of alcohol (SUI) to HIV/STIs FSW = female sex worker

251 Measurement algorithms can be displayed on a horizontal time axis Figure 8.3. Timing of study measurements Initial interview; Consent obtained Repeat EKG Exercise testRepeat EKG Enrollment 1992-93 First visit March-June 1994 Second visit Summer 1995 Study end December 1996

252 Figures that present numerical data These types of figures are the hardest to do well. They can be very effective if done well, but need to ask if really needed. Use if overall pattern is more important than actual values (1 picture worth 1000 words). Figures should have a minimum of four data points. Anything less can be placed in the text.

253 Figures connecting data points measured on several occasions in the same subject Figure 8.4. Heart rate in beats per minute by day of treatment in 5 patients. Multiple measurements on multiple occasions would be hard to demonstrate in a table or in the text.

254 Effects in different groups or at different times Figure 8.5. Differing effects of treatment with successolol in patients with low renin & high renin hypertension. The filled diamonds are the means; the bars are the 95% confidence intervals

255 Always check figures for potential mis- interpretation – Beware of lines that cross Figure 8.6. Blood glucose (red diamonds) and serum insulin levels ( yellow squares) versus time. Note: eye is drawn to crossover at day 3 & 4 even though just a coincidence

256 Consider re-drawing with different scale to avoid problem of crossing lines Figure 8.6. Blood glucose (diamonds) and serum insulin levels (squares) versus time.

257 Figures can be used to illustrate a lack of association Figure 8.7. Lack of association between thyroid-stimulating hormone (TSH) and glucose in patients at weight-loss clinic r = -0.03 Bars jumping up and down, a tangle of lines or scattered dots can be effective, but be sure to indicate your interpretation – i.e. no association

258 Types of numerical figures Pie charts Scatter plots Bar graphs Line graphs

259 Pie Charts Figure 8.8. Causes of neuropathy in primary care patients. Avoid using in written manuscripts Effective for oral/powerpoint or poster abstract presentations Data usually better presented in another format Use text if only a few slices ; table if several slices or more Can only use pie charts to show mutually exclusive categories

260 Table 8.9. Condom use with male partners among men who have sex with men (N=431)

261 Table 8.10. Gender of lifetime sexual partners among male participants in Mumbai intervention trial (N=1901). Sex only with men n=11 (0.5%) Sex with females n=1892 (99%) Sex with men n=431 (23%) Sex with hires n=373 (20%) Sex with Females (n=1892) 90% FSW partner 35% Casual partner 60% Girlfriend 30% Married MSM n=431 MSH n=185 MSF n=1285

262 Scatter Plots Figure 8.11. Correlation between height and weight in 10 subjects. r = 0.76 Scatter plots can easily show the correlation or lack of correlation between 2 variables Showing the correlation coefficient (r) is helpful too
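The r value printed on a scatter plot is the Pearson correlation coefficient. As a reminder of what it measures, here is a minimal Python sketch that computes it from paired measurements. The heights and weights are hypothetical values made up for illustration; they are not the data behind Figure 8.11.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data for 10 subjects
heights_cm = [150, 155, 160, 165, 170, 175, 180, 185, 190, 195]
weights_kg = [52, 60, 55, 68, 63, 75, 70, 82, 78, 90]
print(f"r = {pearson_r(heights_cm, weights_kg):.2f}")
```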

263 Bar graphs Valuable for displaying results by categories of subjects, e.g. men & women, or before & after Use to provide more dramatic illustration of differences between groups Most useful when the absolute value of the outcome variable is most important (rather than the confidence interval) Need to choose how to display the pattern of the data

264 Figure 8.12. Likelihood of admission to an intensive care unit by age and gender Results easily displayed in 2 dimensions Compare values for men and women next to each other, for each age group

265 Rearrangement Figure 8.13. Likelihood of admission to an intensive care unit by age and gender. Rearranged so that the taller bar stands to the right - visually easier to interpret

266 Too much rearrangement Figure 8.14. Likelihood of admission to an intensive care unit by age and gender. 3 dimensions: age, gender and probability. Looks cluttered and is un-necessarily complicated.

267 Figure 8.15. Sex under the influence of alcohol with a female sex worker (n=1743).

268 Crosshatches or lined bars to distinguish categories Figure 8.16. Annual risk of hepatoma by age and alcohol consumption. Make sure patterns are clearly different – otherwise confusing; may be hard to distinguish; colors may be better but not many journals will print in color

269 Can use colors to distinguish categories Figure 8.17. Annual risk of hepatoma by age and alcohol consumption Colors are effective for PowerPoint presentations and posters; often cannot be used for publications

270 Stacked bar graphs Figure 8.18. Site of death among persons 65 years of age or older in the U.S and in Canada, 1988 When the pattern is not clear, using designs to distinguish sections can be helpful. No more than four or five sections. Make sure patterns are different enough. Stacked bar graphs may be confusing – be sure to explain in legend and text

271 When to use stacked bar graphs Figure 8.19. Proportions of U.S. graduating medical students in 1975, 1985, and 1995 choosing primary care specialties. This is not so easy to read. Consider changing to stacked bar graphs -- work well when the category totals add to 100%

272 Figure 8.19. Proportions of U.S. graduating medical students in 1975, 1985, and 1995 choosing primary care specialties. A stacked bar graph makes the point better than the previous bar graph –visually can compare each category more easily

273 Line Graphs Figure 8.20. Blood pressure in 10 subjects treated with ineffectivipine. Reader will be confused into looking for a pattern or hidden message when there isn’t one — except to show no change. Might be useful for investigator to understand data.

274 Survival Curves Figure 8.21. Recurrence-free survival of cancer patients: intervention and control groups during 6-year follow-up. Intervention (N) 152 123 110 86 51 24 12 Control(N) 148 110 98 72 63 29 10 Survival curves show proportion surviving at various time points, also known as Kaplan-Meier Curves.
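For readers who have not computed a survival curve by hand, the following Python sketch shows the Kaplan-Meier product-limit calculation on a tiny hypothetical data set. It is a teaching sketch under stated assumptions (made-up follow-up times, no confidence intervals); a real analysis would use a statistical package.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time where an event occurred.

    times  -- follow-up time for each subject
    events -- 1 if the event (e.g. recurrence) was observed, 0 if censored
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Subjects still under observation just before time t
        at_risk = sum(1 for ti in times if ti >= t)
        # Events observed exactly at time t
        d = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        if d > 0:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
    return curve

# Hypothetical follow-up (years) for six subjects; 0 marks a censored subject
times = [1, 2, 2, 3, 4, 5]
events = [1, 0, 1, 1, 0, 0]
for t, s in kaplan_meier(times, events):
    print(f"year {t}: {s:.2f}")
```

The `at_risk` count at each time point is the same quantity as the row of N values printed beneath the curve in Figure 8.21; it tells the reader how much data supports each part of the curve.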

275 Box and Whisker Plot Figure 8.22. Mean, median, 25th and 75th percentiles, and range of creatinine clearance by age of the subjects. Useful for describing the distribution of the data. Shows the range (whiskers), mean (filled circle), median (horizontal line) and 25th & 75th percentiles (box).
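The quantities drawn in such a plot can be computed directly with Python's standard `statistics` module. The sketch below is illustrative only: the creatinine clearance values are hypothetical, and the "inclusive" quartile method is one of several conventions, so software packages may give slightly different box edges.

```python
import statistics

def box_plot_summary(values):
    """Return the summary statistics drawn in a box-and-whisker plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),                # lower whisker (range)
        "q1": q1,                          # bottom of box (25th percentile)
        "median": median,                  # horizontal line
        "q3": q3,                          # top of box (75th percentile)
        "max": max(values),                # upper whisker (range)
        "mean": statistics.mean(values),   # filled circle
    }

# Hypothetical creatinine clearance values (mL/min) for one age group
summary = box_plot_summary([60, 70, 80, 90, 100, 110, 120])
print(summary)
```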

276 Maps Figure 8.23. Five year survival among persons with AIDS by census tract, San Francisco, California, 1996 - 2001. Maps are useful to show geographic distribution of outcome variables

277 Figure legends and text 1.Overcrowding is undesirable, but inadequate documentation is worse. Make sure your figure has a legend and that labels describe the x and y axes and the bars or lines in the figure. 2.Remember, figures in your article may be reproduced and used as a slide or handout by others. 3.The legend shouldn’t give away the results. The text should complement and expand on the information given in the legend. 4.Avoid ambiguous abbreviations; readers should understand the point of the figure at a glance.

278 Checklist for figures 1.Is the figure necessary, helpful? 2.Does every figure make its point clearly? If not, have you tried alternative versions? 3.Are the axes, lines, bars and points labeled? Are the scales correct? 4.Does each figure have a legend? 5.Are the figures numbered and do they appear in the text in that order? 6.Does the text complement the information in the figures?

279 By end of today Revise Study Description and 2 Main findings (Not your title): • Study design • Subjects • Predictor and outcome Analysis Plan: • Obtaining data, cleaning data? • Running frequencies • Basic comparisons? • Sub-analyses • Multivariates?

280 Today’s homework Prepare study description and main findings Draft:  Primary tables and figures  Results section  Revise title page, introduction and methods as needed

281 Study description and main findings (not your title) Study design, study subjects, outcome and predictors • Example: This is a cross-sectional study of risk factors for melioidosis among 100 survivors of the 2004 tsunami (could add geographical site, part of larger study, etc.) Main finding(s): 1. Melioidosis was diagnosed in 5% of survivors 2. Spending >2 hours in water was associated with risk of melioidosis (RR = 2.4, p=0.03)
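A main finding stated as a relative risk should be traceable to the underlying counts. The Python sketch below shows the calculation; the counts are hypothetical and chosen only to illustrate the arithmetic, since the example above does not give the 2x2 table behind its RR of 2.4.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed

# Hypothetical counts: 12 of 100 exposed and 5 of 100 unexposed developed disease
rr = relative_risk(12, 100, 5, 100)
print(f"RR = {rr:.1f}")
```

Keeping the counts at hand makes it easy to give both the percentages and the RR in the Results section, so readers can check the denominators.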

282 Discussion Lecture 9 Bangkok Scientific Writing Workshop 30 January - 10 February 2006 Thursday 2 February 2006

283 Template for discussion Mission accomplished!  The single most important finding Not only that…  Secondary findings  Confirms or refutes other published studies Mea culpa  Limitations  But, redemption! Wrap up and conclusions  Public health implications

284 Mission accomplished! The first sentence of Discussion  “We found…” The “elevator test”

285 Mission accomplished! The outcome of the RCT:  Did the intervention work or not? The central hypothesis of your original proposal Answer the question posed by “interrogatory title”

286 Mission accomplished! Strengthen causal inference:  Rule out alternative interpretations Chance (significance, large study) Confounding (controlled for) Bias (later, Mea Culpa)  Bradford-Hill’s criteria for causality

287 Mission accomplished! Bradford-Hill’s criteria for causal inference (1965):  Cause before effect (cohort or cross-sectional?)  Biological plausibility (reason why? Theory, biology)  Consistency with other studies (triangulation)  Strength of association (magnitude of effect)  Dose-response (or, analogous dose-response)  Randomised controlled study Does your study meet these criteria? If not, do other studies?

288 Examples of “Mission accomplished!” sentence “Our assessment shows that despite Thailand’s remarkable success in controlling the HIV epidemic among the general population, the HIV prevalence among MSM was found to be surprisingly high.”

289 Examples of “Mission accomplished!” sentence(s) “Thailand has a well-developed public health infrastructure that provides residents with more than 90% of their health care. The MOPH response to the December 26 tsunami was rapid and effective at mitigating the health consequences of the tsunami among survivors…Health assessments conducted 1 week after the tsunami indicated that, despite a huge influx in the number of patients, the medical system was intact and functioning effectively.”

290 Class participation: “Mission accomplished!” Give a one sentence summary of your most important result….

291 Not only that… “We also found…” Sub-group findings, effect modifiers of single most important finding:  Men vs. women  Young vs. old Secondary questions, findings Unexpected findings (place in literature)  Contradict other studies, conventional wisdom

292 Examples of “Not only that…” sentences “Consistent with research conducted in the Western world, anal intercourse and increased sexual activity were the main risk factors for HIV infection.” “As seen in other disasters, rapid health assessments can identify immediate health needs and help prioritize public health interventions.”

293 Class participation: “Not only that…”

294 Mea culpa “We recognize limitations of our study…” Confess, come clean  No study is without potential bias  No study is perfectly executed  No study is definitive Avoid criticism early on by acknowledging study limitations Road to redemption

295 Mea culpa Start with single biggest threat to internal validity  Differential loss to follow-up  Participation bias Explain (if you can):  Likely size of this bias  Likely direction of this bias

296 Mea culpa Address common problems and biases if they are a particular concern in your study:  Sample size, power (when no association)  Incomplete responses, data quality  Self-reported behavior, recall bias  Causality in cross-sectional study  Unmeasured and unknown confounders  External validity, representativeness  Alternative interpretations, explanations  Not enough money…

297 Mea culpa… and redemption! “However, we do not feel this bias is likely to…” How you did your best to address the bias in the design and analysis Other evidence that bias is not likely to change primary conclusion Other studies had worse biases

298 Example of primary “Mea culpa” sentences “Our findings are subject to several limitations. First, VCT clientele may not be representative of the general Thai population.” “The primary limitation to interpreting our data is that only half of persons enrolled completed follow-up.”

299 Examples of “Mea culpa” sentences (with redemption) “Second, our data were drawn from only 4 major towns in northern Thailand and do not represent the whole country. However, the fact that our data were comparable with ANC data suggests that our major findings of declining prevalence are not likely to have been affected significantly by such differences.”

300 Class participation: “Mea culpa….redemption”

301 Wrapping it up Don’t end on a sour note! “Despite these potential limitations…” Big picture, extrapolation Public health implications  HIV prevention  HIV care Clinical practice implications Setting the future research agenda (be specific)

302 Wrapping it up Sometimes the “wrap up” is added on as a separate section:  Conclusions  Recommendations  Program Implications

303 Examples of “wrap up” sentences “The high HIV prevalence found among MSM in Thailand coincides with reports of previously undocumented epidemics of HIV infection among MSM in China, Cambodia, and Indonesia and of ongoing HIV transmission among MSM in the Western world. The continuing spread of HIV among MSM highlights the urgent need for more effective behavioral and biomedical interventions to halt the spread of HIV infection in this population.”

304 Examples of “wrap up” sentences “Despite limitations of our study, we believe that the addition of a single client-centered counseling session increased the delivery of partner notification services overall and improved self-reported success in referring partners to treatment.”

305 Class participation: “Wrap up”

306 Additional tips for the Discussion Stick to your data and your findings  Do not speculate on causes that are not suggested by your data  But, OK to offer new hypotheses Do not include new study results  All findings must be in Results  Go back and include them Avoid “More research is needed…”  Unless you say very specifically what is needed

307 Additional tips for the Discussion Do not simply repeat results Use words rather than numbers or statistics Avoid promising future papers or studies

308 Additional tips for the Discussion OK to strike uncertain tone if uncertain Avoid bragging (well, a little is OK)  First ever, first in Asia, Thailand  First controlled study Avoid clichés Do not end on a sour note!

309 Today’s homework Draft (revise):  Discussion section  Continue to work on other parts of your manuscript and analysis

310 Abstract Lecture 10 Bangkok Scientific Writing Workshop 30 January - 10 February 2006 Friday 3 February 2006

311 General considerations for abstracts Leave until last • Should fall into place if all other sections are done • Things change along the way Abstracts for manuscripts are simpler than abstracts for conferences Abstracts appear in MEDLINE • Make very easy to read • Numbers and statistics for key findings

312 General considerations for abstracts Check format for journal Word count  500, 250, sometimes as short as 100 Two types of abstracts  Unstructured  Structured
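Word limits are easy to check automatically while drafting. A minimal Python sketch follows; it counts whitespace-separated words, which is an assumption, since journals and submission systems may count hyphenated terms or numbers differently. The default limit of 250 is one of the typical values mentioned above.

```python
def within_word_limit(abstract, limit=250):
    """Return (word count, True if the abstract is within the limit)."""
    count = len(abstract.split())
    return count, count <= limit

draft = "We assessed factors associated with HIV testing among injecting drug users."
count, ok = within_word_limit(draft, limit=250)
print(count, ok)
```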

313 Unstructured abstracts No background or 1 sentence only 1 sentence Methods (study design, study population) 2-3 sentence Results (primary hypothesis, main findings, avoid Mea Culpa unless very big) 1-2 sentence Discussion (main outcome [Mission Accomplished, possibly wrap up])

314 Example of an unstructured abstract “HIV voluntary counseling and testing (VCT), an important strategy for HIV prevention and care, has been available in all government hospitals in Thailand since 1992. We assessed factors associated with HIV testing, its uptake, and estimates of HIV incidence after HIV testing among male northern Thai injecting drug users (IDUs) admitted for inpatient drug treatment. Participants were interviewed about risk behaviors and HIV testing history before VCT was provided as part of the study. Of 825 IDUs who participated, 36% reported a prior HIV test. Factors associated with prior HIV testing in multiple logistic regression analysis included higher education and having >1 lifetime sex partner. Needle sharing was not associated with prior HIV testing. Of the 298 men with a prior test, 80% reported a negative result on their last prior HIV test, of whom 28% tested positive in our study, leading to an estimated incidence rate of 10.2 per 100 person-years. Fifty-nine percent of the IDUs who reported a prior HIV test stated that they did not receive pre- and/or posttest counseling. HIV incidence among IDUs remains high despite having VCT. Extending HIV prevention and harm reduction programs is urgently needed for IDUs in the region.” Word count: 197

315 Structured abstracts Follow journal format Basic:  Objective: not complete sentence  Methods  Results  Conclusion Alternative: Conform to journal examples  Design, Setting, Main Outcome Measure RCT format

316 Example of a structured abstract BACKGROUND: The Ministry of Public Health (Thailand), MoPH, has had a program called National Access to Antiretroviral Program for People who have AIDS (PHA) or "NAPHA", to offer free antiretroviral drugs (ARV), which are locally produced in Thailand, to any HIV-1 infected patients with CD4<200 since 2002. This program may increase usage of ARV therapy and the emergence of HIV-1 drug resistance. OBJECTIVES: To monitor HIV-1 ARV drug resistant codon mutation in Thailand before and after the "NAPHA" program. MATERIALS AND METHODS: EDTA blood samples were collected from 542 HIV-1 infected subjects, who received ARV therapy in 1999 and 2001-2003, and perinatal chemoprophylaxis in 1998 and 2000. HIV-1 pol nucleotide sequences were analyzed. RESULTS: The percentage of drug resistant detection from the ARV therapy group in 1999 and 2001-2003 were 12.14 (34/280), 10.23 (9/88), 86.96 (20/23) and 57.55 (61/106), respectively. Of 332 NRTI drug resistant codon mutation, 226 (68.07%) were thymidine analogue mutations (TAMs). The percentage of TAMs detection in 1999 and 2001-2003 were 7.14 (20/280), 9.09 (8/88), 56.52 (13/23) and 43.34 (46/106), respectively. Of 105 NNRTI drug resistant codon mutation, 95 (90.48%) were related to nevirapine drug resistance. CONCLUSION: Thailand may need more appropriate monitoring of drug resistance in the free ARV therapy program to protect the future usage of drugs by minimizing the emergence of drug resistance.

317 Key words Follow abstract Select the 3 to 5 that will land your paper into the hands of the right audience Standard words in MeSH headings

318 Today’s homework Draft:  Abstract  Key words  Revise other parts of your manuscript

319 Authorship Lecture 11 Bangkok Scientific Writing Workshop 30 January - 10 February 2006 Friday 3 February 2006

320 Authorship The “currency” of research But, a source of hurt feelings  Recognition of collaborators  Cultural differences

321 Authorship Potential problems  Omission of those who merit authorship (or should have been offered the opportunity)  Inclusion of those who do not merit authorship  Order of authorship Clarify authorship as early as possible  But, don’t stymie productivity  PI or mentor should shield you

322 Authorship Journals are cracking down  Frowning on non-contributors  Frowning on “ghost authors”  Frowning on “gift authorship” Usually up to 6 authors acceptable  More than 8 may require written explanation  Some require written statement of roles

323 Criteria for authorship International Committee of Medical Journal Editors  Established in 1978 in Vancouver  Established common criteria for publication of scientific articles in health  Established clear criteria for authorship in 1988

324 Authorship criteria (JAMA) Each author can swear, in writing:  Unique, previously unpublished  Can provide the data to publishers  Agree corresponding author can edit Each author approves final manuscript Each author:  Contributed to conception, design, analysis, interpretation  Put pen to paper, or major editing  Provided statistical expertise, obtained funding, logistical support, supervision

325 Written justification of authorship “MH Katz participated in the planning and analysis of the data and wrote the paper. SK Schwarcz, TA Kellogg, and W McFarland participated in the planning and analysis of the data and edited the paper. J Klausner participated in the planning and analysis of the sexually transmitted disease data and edited the paper. JW Dilley participated in the planning and analysis of the anonymous testing data and edited the paper. S Gibson participated in the planning and analysis of the community survey data and edited the paper.”

326 Authorship rank Best: First and *corresponding = responsible for paper 2nd best: Last, “senior author”, PI, “grandfather of ideas” 3rd best: Second 4th best: Third, then drops off from here (only 3 authors then “et al” in many reference formats) 5th best: Fourth and so on according to contribution Worst: Next to last *Corresponding author is responsible for paper: Can be anyone - Adds prestige, but responsibility

327 Alternatives to authorship Group authorship  Provides a means to add many authors  “…for the Young Men’s Survey Group” Acknowledgements  For those who do not meet authorship criteria but who contributed

328 By end of today Turn in to Sandy and Sanny - electronic copies  Study description above  Title  Introduction, Methods  Reference List  Drafts of Tables—Hard Copies Please  Do you have both Working Tables and Final Tables?  Draft Results section  Draft Discussion section

329 Homework for weekend By Monday  Make sure you have working tables  Revise tables for manuscript itself  Consider need for figures and prepare them – not everyone will need figures  Read section on statistical analysis – review. Consider what statistical tests and comparisons need to be done  Continue to revise all four sections as needed

330 Timeline Next week  Monday Putting it all together  Choosing a journal  Tuesday – writing up the Discussion  Wednesday – longitudinal data analysis (David Glidden)  Thursday/Friday – Abstract, submission to journal, pick a journal End of next week  Complete draft to be finished and turned in

331 Putting it all together Lecture 12 Bangkok Scientific Writing Workshop 30 January - 10 February 2006 Monday 6 February 2006

332 Choosing a scientific journal Field: Biomedical, psychological, social science, statistical Focus: Disease-focused (e.g., AIDS) or general audience? Audience: International or home country? Competition: Competitive or very likely to be published? Timing: Quick or long wait? Sequence:  Aim high and go lower.  Or, go for the “sure thing” Luck

333 Choosing a scientific journal Check the references section in your proposal to see what journals have published similar articles Ask your preceptor, professors, boss Check word count, length requirements  Full article of original research  Brief  Data letter  Letter to the editor Sponsored supplements

334 Biomedical, general audience, international, highly competitive New England Journal of Medicine British Medical Journal JAMA Lancet Science Nature PLoS Medicine

335 Public Health, epidemiology, infectious diseases, very competitive American Journal of Public Health American Journal of Tropical Medicine and Hygiene American Journal of Epidemiology Bulletin of the World Health Organization Clinical Infectious Diseases Epidemiology and Infection International Journal of Epidemiology Journal of Infectious Diseases Lancet Infectious Diseases (primarily reviews) Sexually Transmitted Diseases Sexually Transmitted Infections Transactions of the Royal Society of Tropical Medicine and Hygiene Tropical Medicine and International Health

336 Other specialty journals Pediatrics Transfusion Family planning, obstetrics and gynecology Social science (Social Science & Medicine) Free access journals (BMC Public Health, BMC Infectious Diseases) Health Policy and Planning, Journal of Health Policy Virology

337 Southeast Asian general medical and specialty journals Journal of the Medical Association of Thailand Journal of Public Health (Bangkok) Southeast Asian Journal of Tropical Medicine and Public Health Southeast Asian Journal of Social Science

338 HIV/AIDS-focused journals AIDS (number 1) Journal of Acquired Immune Deficiency Syndromes AIDS and Behavior AIDS Education and Prevention AIDS Care AIDS Research and Human Retroviruses AIDS Patient Care and STDs International Journal of STD & AIDS

339 Impact factor Counting references to rank the use of scientific journals was reported in 1927 by Gross and Gross. The term “impact factor” was used in Science Citation Index (SCI) in 1963. This led to a byproduct, Journal Citation Reports (JCR), and a burgeoning literature using bibliometric measures.

340 Impact factor Science Citation Index (SCI) is a publication related to Journal Citation Reports, publishing the “impact factor” of journals since 1963. The “impact factor” is calculated as the number of citations received in one year by articles the journal published in the previous two years, divided by the number of articles the journal published in those two years.
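
The two-year impact factor ratio can be sketched in a few lines of Python. This is a minimal illustration of the usual definition (citations received in one year by items from the previous two years, divided by the number of those items). The function name and the journal figures are hypothetical and do not come from Journal Citation Reports.

```python
def impact_factor(citations_this_year: int, articles_prev_two_years: int) -> float:
    """Two-year impact factor.

    citations_this_year: citations received this year by articles the
        journal published in the previous two years.
    articles_prev_two_years: number of articles the journal published
        in those two years.
    """
    if articles_prev_two_years <= 0:
        raise ValueError("the journal must have published at least one article")
    return citations_this_year / articles_prev_two_years


# Hypothetical journal: 200 articles in 2002 and 300 in 2003,
# which together were cited 1,250 times during 2004.
example = impact_factor(1250, 200 + 300)
print(f"2004 impact factor: {example:.3f}")  # 2004 impact factor: 2.500
```

The denominator counts articles, not citations, so a journal that publishes few but heavily cited articles, such as a review journal, can outrank a much larger one.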

341 Medical journals with the highest impact factor (2004)
Rank  Journal  Impact factor
1  Annual Review of Immunology  52.431
2  CA - A Cancer Journal for Clinicians  44.515
3  New England Journal of Medicine  38.570
4  Nature Reviews. Cancer  36.557
5  Physiological Reviews  33.918
6  Nature Reviews. Molecular Cell Biology  33.170
7  Reviews of Modern Physics  32.771
8  Nature Reviews. Immunology  32.695
9  Nature  32.182
10  Science  31.853
11  Annual Review of Biochemistry  31.538
12  Nature Medicine  31.223
13  Cell  28.389
14  Nature Immunology  27.586
15  JAMA  24.831

342 106 articles published in 58 journals indexed in PubMed with Thailand AND HIV, 2005 AIDS (10) Southeast Asian Journal of Tropical Medicine and Public Health (8) Journal of Acquired Immune Deficiency Syndromes (7) AIDS Care (6) Journal of the Medical Association of Thailand (6) American Journal of Tropical Medicine and Hygiene (3) Antiviral Therapy (3) Plus two each from:  AIDS Research and Human Retroviruses (2)  Health Care for Women International (2)  Health Policy (2)  International Journal of Epidemiology (2)  International Journal of Tuberculosis and Lung Disease (2)  Journal of Clinical Microbiology (2)  Pediatric Infectious Disease Journal (2)

343 Cover letter Accompanies manuscript to editor Official letterhead Address for correspondence Increasingly done by e-mail Consent form signed by all authors

344 Cover letter: 1st paragraph “Dear Dr. Patterson: Please find enclosed four copies of a manuscript entitled “The role of the Mediterranean diet in the prevention of protease inhibitor-associated lipodystrophy in Croatia” for consideration for publication.”

345 Cover letter: 2nd paragraph Paragraph 2: Why this paper will be of interest to your readers “We feel the paper will be of particular interest to your readers as it addresses a unique experiment of nature that may have widespread clinical applicability. Few studies of protease inhibitor-associated lipodystrophy have been able to evaluate the diet of patients.”

346 Cover letter: additional Suggested reviewers and why (if allowed) Not submitted elsewhere Co-authors meet criteria All co-authors sign (typically they will all sign a form that the journal provides) “We hope our paper will receive favorable consideration for publication in AIDS.”

347 Maximizing the chance of publication Short Clear, concise writing  Easier to accept one more short paper  Easier to communicate most important point Conforms to normal conventions, format  Introduction, Methods, Results, Discussion When returned for revision, careful, thoughtful, diplomatic response to every single peer reviewer comment

348 Today’s work, homework Draft:  Cover letter to journal editor  Continue revisions of all sections

349 Journal review and responding to reviewers’ comments Lecture 13 Bangkok Scientific Writing Workshop 30 January - 10 February 2006 Monday 6 February 2006

350 Under review Usually 2 reviewers, sometimes 3, plus editor Online submission indicates status  Sent to review or not If under review for greater than 4 months, contact journal  Reviewer late  Lost article  Editor can’t decide

351 Types of comments from reviewers Major concern due to:  Flaws in design, analysis, interpretation  Confounding, often unmeasured  Evidence of other studies Minor concern:  Include alternative view  Include key reference Editorial, style, grammar

352 Frustrating reviewers’ comments Appear not to have grasped main point Appear not to have carefully read Read too carefully! Long comments Not constructive  No suggestions for what to change  Nothing can be done  “The study the authors should have done…” Pedantic, showing off Have their own agenda  Consider appeal to editor

353 Editor’s decision letter Never say: “Great, we’ll take it.” “Favorably disposed pending minor revisions…” “Cannot accept in present form, but will consider with revisions that address the reviewers’ comments…” “Willing to consider resubmission with major revisions…” “Unfortunately, your paper did not receive a high enough priority rating…”

354 Responding to reviewers’ comments Resubmit if any positive response Address every single point, number by number and say exactly how you have changed the text, table or figure

355 Responding to reviewers’ comments Concede all “easy points”  Additional references  Include alternative points of view  Grammar, style Polite, contrite, diplomatic Avoid rebuttals if possible

356 Letter responding to reviewers’ comments Paragraph 1: “We are delighted that AIDS & Behavior will consider publication of our paper pending satisfactory revisions as suggested by the reviewers.”

357 Letter responding to reviewers’ comments Paragraph 2: “We have given careful consideration to all the reviewers’ comments and have done our best to address them all. The following is a point by point explanation of how we have addressed the concerns and revised our manuscript.”

358 Letter responding to reviewers’ comments Addressing Major Concerns: Reviewer #1. “1. The authors should address the question of whether HIV seroconversion is associated with amphetamine use or drug use in general.” “Following the reviewer’s suggestion, we constructed a variable for ‘any drug use’. Persons with any drug use had elevated risk for unprotected sex (RR=2.3, 95% CI 1.2 – 4.4) compared to non-drug users. For persons who used amphetamine (with or without other drugs) the association with HIV seroconversion was even further elevated (RR 3.0, 95% CI 1.4 – 6.5). These new results suggest that amphetamine use is more strongly associated with HIV seroconversion than drug use in general. We have added these results to page 13 as….”

359 Letter responding to reviewers’ comments Addressing Minor Concerns: Reviewer #2. “10. Finally, on a minor point, the authors speak of ‘amphetamine use during sex.’ The phrase ‘sex during amphetamine use’ might be better.” “We have changed the phrasing to ‘sex during amphetamine use’.”

360 Letter responding to reviewers’ comments Addressing Minor Concerns: 8. Page 3, line 12, we have deleted the word “seductively”. 9. Page 5, line 12 and References, we have added the citation by Mermin et al. as #19 and made corresponding changes to the numbering.

361 Letter responding to reviewers’ comments Final paragraph: “We thank the reviewers for their thoughtful comments. With these revisions, we feel the paper has been substantially improved. We hope it will receive favorable consideration for publication in …”

362 Peer reviewing Lecture 14 Bangkok Scientific Writing Workshop 30 January - 10 February 2006 Wednesday 8 February 2006

363 Peer review: purpose An organized process to give and receive constructive feedback  Early on for major modification before wasting too much time  At end to maximize chance of publication  In this course we only have time for the latter Better to find issues now among friends than later among critics New ideas from a fresh perspective

364 Peer review: purpose NOT: A test of reviewer’s ability to tear apart your work NOT: A test of your ability to defend what you did NOT: A time to say, “The study you should have done is...”

365 Format for reviewer Start with a one sentence statement of the central finding of the paper  To be sure you understand the paper Provide three strengths of the paper or study  Everyone is sensitive Select three concerns or issues to discuss  Other minor points can be written on draft or discussed later  Address grammar and style by written comments

366 Format for reviewers Types of critique:  Things that appear incorrect (wrong analysis, internal contradictions, interpretation of findings)  Sections that are unclear  Sections that need more detail With each major criticism, provide a concrete solution or suggestions Reviewer 1: Limit verbal commentary to 15 minutes Reviewer 2: Don’t repeat, only add something new

367 Format for author Listen  Resist the urge to defend what you did Take notes and use what is helpful  Some ideas will be excellent  Sometimes, the reviewer missed the point (but so might other readers, clarify) OK to disagree with reviewer  But, no need to argue  Silently reject and move on

368 When the author should speak Optional two minute introduction that calls attention to particular areas you are having problems with Provide clarification only if the reviewer specifically asks To ask for clarification of a comment At end, to ask for more feedback on a specific area of difficulty To thank the reviewer

369 Format for chair To keep track of procedures and time To stop unproductive arguments Prerogative to comment on paper as well Prerogative to allow others present to comment To summarize major points made To thank both reviewer and writer

370 Course wrap up, status of manuscripts, next steps and course evaluation Bangkok Scientific Writing Workshop 30 January - 10 February 2006 Friday 10 February 2006
