2013 Internship Projects & Abstracts

2013 Presentations

Prognostic Utility of Single-Nucleotide Polymorphisms in Inflammatory Arthritis

Callear A¹, Foth W¹, Strenn R², Brilliant M¹, Schrodi S¹
¹Center for Human Genetics and ²Biomedical Informatics Research Center
Research area: Genetics

Amy Callear
University of Pittsburgh

Background: The inflammatory arthritis conditions ankylosing spondylitis (AS), psoriatic arthritis (PsA), and rheumatoid arthritis (RA) are difficult to diagnose, and delayed diagnosis is associated with increased disease burden. Many single-nucleotide polymorphisms (SNPs) that confer susceptibility to these conditions have been identified, sparking interest in their potential diagnostic use. The purpose of this study was to investigate the prognostic utility of a panel of these genetic markers by using machine learning models to classify confirmed inflammatory arthritis cases as AS, PsA, or RA.

Methods: Seventy-three participants of the Personalized Medicine Research Project (PMRP) with inflammatory arthritis (AS: n=26, PsA: n=7, RA: n=40) were genotyped for 37 SNPs found to be associated with AS, PsA, or RA in genome-wide association studies. Using Weka, a machine learning software package, classification models were first developed on a training set of 10,000 samples created from SNP frequencies reported in the literature. The performance of several classification models, including neural networks, decision trees, support vector machines, and Naïve Bayes, was assessed, with feature selection performed for each. The best model, Naïve Bayes applied to 29 SNPs, was evaluated on the test set of PMRP participants. Predictive performance was assessed according to the accuracy and the area under the ROC curve (AUC).
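The training step described above (simulating samples from published SNP frequencies, then fitting a Naïve Bayes classifier) can be sketched roughly as follows. This is a minimal illustration, not the Weka workflow used in the study: the allele frequencies, the three-SNP panel, the Hardy-Weinberg sampling, and all function names are assumptions made for the example.

```python
import math
import random

# Hypothetical risk-allele frequencies per condition for three SNPs.
# Illustrative values only; they are NOT taken from the study.
ALLELE_FREQ = {
    "AS":  [0.80, 0.20, 0.30],
    "PsA": [0.30, 0.75, 0.25],
    "RA":  [0.25, 0.30, 0.80],
}

def simulate_genotype(freqs, rng):
    """Draw a risk-allele count (0, 1, or 2) per SNP, assuming Hardy-Weinberg equilibrium."""
    return [sum(rng.random() < p for _ in range(2)) for p in freqs]

def simulate_set(n_per_class, rng):
    """Build a labeled sample set with n_per_class simulated cases per condition."""
    return [(simulate_genotype(f, rng), label)
            for label, f in ALLELE_FREQ.items() for _ in range(n_per_class)]

def train_naive_bayes(samples):
    """Estimate log priors and Laplace-smoothed log P(genotype | class) for each SNP."""
    n_snps = len(samples[0][0])
    labels = sorted({label for _, label in samples})
    counts = {c: [[1, 1, 1] for _ in range(n_snps)] for c in labels}  # add-one smoothing
    totals = {c: 0 for c in labels}
    for genotype, label in samples:
        totals[label] += 1
        for i, g in enumerate(genotype):
            counts[label][i][g] += 1
    model = {}
    for c in labels:
        log_prior = math.log(totals[c] / len(samples))
        log_like = [[math.log(k / (totals[c] + 3)) for k in snp] for snp in counts[c]]
        model[c] = (log_prior, log_like)
    return model

def predict(model, genotype):
    """Return the class with the highest posterior log score."""
    def score(c):
        log_prior, log_like = model[c]
        return log_prior + sum(log_like[i][g] for i, g in enumerate(genotype))
    return max(model, key=score)

def accuracy(model, samples):
    return sum(predict(model, g) == label for g, label in samples) / len(samples)

rng = random.Random(0)
model = train_naive_bayes(simulate_set(1000, rng))
test_accuracy = accuracy(model, simulate_set(200, rng))
print(f"accuracy on a simulated test set: {test_accuracy:.3f}")
```

Because the sketch tests on data drawn from the same simulated frequencies it trained on, its accuracy says nothing about performance on real participants, where genotype frequencies may differ from the literature values.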

Results: The accuracy of the Naïve Bayes model on the training set was 78.05% and the AUC was 0.884. The accuracy of the model applied to the test set of PMRP participants was 61.64% and the average AUC was 0.635 (AS: AUC=0.608, PsA: AUC=0.665, RA: AUC=0.648), perhaps reflecting poor concordance between test and training sets. The AUC was statistically significant (p≤0.05).

Conclusion: Despite statistical significance, this SNP panel was not highly effective in classifying inflammatory arthritis types in this population. Use of a larger SNP array could potentially enhance the predictive performance.

Geospatial Factors for Lyme Disease Risk in Wisconsin

Campbell CC, Schotthoefer AM PhD
Integrated Research and Development Laboratory
Research area: Infectious Diseases 

Christopher Campbell
University of Minnesota, Twin Cities

Background: Over the past twelve years, the number of human Lyme disease cases in Wisconsin has greatly increased, particularly in the north-eastern counties of the state. The pattern of infections and the recent surge in cases have not been extensively studied. To better understand the geographic risk of Lyme disease, we analyzed environmental factors to determine conditions associated with infections.

Methods: We identified 4573 laboratory-confirmed cases of Lyme disease in the Marshfield Clinic Health Care System in Wisconsin from 2000 through 2011; of these, 3296 had addresses that could be mapped in ArcGIS 10.1. Publicly available environmental layers were evaluated in our analysis. We used 2010 U.S. Census data to calculate incidence rates and generate controls. The cases and controls associated with the layers were analyzed with logistic regression in SAS 9.3.

Results: The fragmentation factors tested (e.g., road density) had little or no effect, accounting for ~10% of total variation in cases. Persons living beyond 1 km from a highway and in wooded regions were 1.59 (95% confidence interval: 1.38, 1.84) times more likely to test positive for Lyme disease than those in wooded areas and within 1 km of a highway. Land cover, bedrock type, soil temperature, and soil order were successful at determining geographic risk, correctly modeling ~70% of cases. Notably, forest edge and open-to-low development were associated with high risk, and highly developed regions were associated with lower risk. Frigid soils and the orders spodosols and alfisols also conferred higher risk.

Conclusions: This study confirmed that living within the edges of forest substantially increases risk of Lyme disease, as suggested by other studies. Additionally, this study shows that features outside of the immediate home environment (e.g., >1 km) may strongly influence risk.

Influenza and Workplace Productivity Loss in a Community Cohort of Working Adults

Gajewski A, Sundaram M, Belongia E, Van Wormer J.
Center for Clinical Epidemiology and Population Health
Research area: Epidemiology 

Anna Gajewski
Emory University

Background: Acute respiratory illnesses (ARIs) cost the U.S. tens of billions of dollars annually in direct medical care, but indirect ARI-associated costs are predicted to exceed this figure. Influenza plays a significant role in workplace productivity losses because of its widespread occurrence, severe symptom profile, and variable seasonal vaccine effectiveness. However, no studies to date have compared laboratory-confirmed influenza cases to other ARIs in terms of short-term impact on workplace absenteeism (time away from work) and presenteeism (impairment while at work).

Methods: An analysis was conducted using data from employed participants in the 2012-13 Rapid Analysis of Influenza Vaccine Effectiveness (VE) study. Multiple linear regression was used to test the association between influenza status at the time of VE study enrollment and overall workplace productivity loss during the 1-2 week period following ARI symptom onset. Workplace productivity loss (0-100%) was measured using a modified Work Productivity and Activity Impairment questionnaire.
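As a rough illustration of how absenteeism and presenteeism combine into one 0-100% figure, the sketch below uses the overall work impairment formula from the standard Work Productivity and Activity Impairment questionnaire. The study used a modified questionnaire, so treating its scoring as identical is an assumption; the function name and example inputs are hypothetical.

```python
def productivity_loss(hours_missed, hours_worked, impairment_0_to_10):
    """Overall work impairment (0-100%), following the standard WPAI scoring.

    Assumption: the study's modified questionnaire may have been scored differently.
    """
    scheduled = hours_missed + hours_worked
    if scheduled == 0:
        raise ValueError("no scheduled work hours in the recall period")
    absenteeism = hours_missed / scheduled      # share of work time missed
    presenteeism = impairment_0_to_10 / 10      # impairment while at work
    # Time at work is discounted by presenteeism; time away counts as fully lost.
    return 100 * (absenteeism + (1 - absenteeism) * presenteeism)

# Hypothetical participant: missed 20 of 40 hours, rated impairment 5 out of 10.
example_loss = productivity_loss(20, 20, 5)
print(example_loss)  # 75.0
```

Under this scoring, missed hours count in full while hours worked are only partly discounted, which is one way absenteeism can dominate the overall figure.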

Results: Total productivity loss was 69% for participants with influenza and 61% for participants with other ARIs (p=0.004). After adjusting for sex, week of symptom onset, and smoking, influenza was significantly associated with an 8.5% increase in workplace productivity loss (p=0.002). Sensitivity analyses on absenteeism and presenteeism outcomes indicated that missed work days were the principal driver of workplace productivity loss in the influenza-positive group.

Discussion: Influenza was associated with workplace productivity loss above that observed in individuals with non-influenza ARIs. This additional productivity loss in the influenza group was primarily attributable to hours absent from work. More research is needed to better understand the full economic implications and how much variability there is between influenza seasons. The findings suggest that productivity loss from ARIs, including influenza, is responsible for a significant portion of the overall economic burden of ARIs.

Efficient Manipulation and Retrieval of Large-Scale Next Generation Sequencing (NGS) Data

Kuhn A, He M.
Biomedical Informatics Research Center
Research area: Biomedical Informatics 

Andy Kuhn
University of Wisconsin

Background: A new generation of technologies allows sequencing and genotyping of personal genomes at a speed and accuracy unimaginable a few years ago. These technologies have already caused an explosion of data on human genome diversity and data on how interactions between the entire genome and non-genomic factors affect health. The most common data generated in next-generation sequencing (NGS) are stored in variant call format (VCF) files. One VCF file can contain hundreds of gigabytes of genetic information, which makes it difficult to manipulate and retrieve information. We developed a method to manipulate and retrieve multiple large-scale VCF files efficiently.

Methods: We developed a program to simulate a number of VCF files for evaluating the performance of manipulating multiple VCF files. We then incorporated Tabix into our program to compress and index the VCF files. We also developed a program, with both a user-friendly interface and a command-line mode, to load multiple VCF files and to manipulate and retrieve genetic information and quality scores from them. To test efficiency, we compared the performance of processing 500 and 1000 VCF files on a BIRC server.

Results: The running time for retrieving a variant and its quality scores from 1000 VCF files was 0.193 milliseconds (ms), while the running time for retrieving a variant and its quality scores from 500 VCF files was 0.083 ms.

Conclusions: The comparison shows that the running time for manipulating a large number of VCF files is similar to the time needed for a small number of VCF files. The developed method can therefore be used to manipulate thousands of VCF files, the size of a real dataset in an NGS study, in a reasonable time.
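The general idea behind indexed retrieval can be sketched in a few lines. This is not Tabix, which operates on compressed files with its own index format; it is a simplified stand-in, with hypothetical file names and simulated records, showing why a sorted index lets each lookup avoid scanning a whole file.

```python
import bisect

def build_index(records):
    """records: list of (chrom, pos, qual) tuples. Returns sorted keys and aligned quality scores."""
    ordered = sorted(records)
    keys = [(chrom, pos) for chrom, pos, _ in ordered]
    quals = [qual for _, _, qual in ordered]
    return keys, quals

def lookup(index, chrom, pos):
    """Binary-search one file's index; return the quality score or None if the variant is absent."""
    keys, quals = index
    i = bisect.bisect_left(keys, (chrom, pos))
    if i < len(keys) and keys[i] == (chrom, pos):
        return quals[i]
    return None

def lookup_all(indexes, chrom, pos):
    """Query the same position across every indexed file."""
    return {name: lookup(idx, chrom, pos) for name, idx in indexes.items()}

# Five simulated "files", each with variants every 7 bases on chr1 (hypothetical data).
indexes = {
    f"sample_{k}.vcf": build_index([("chr1", p, 30.0 + k) for p in range(100, 100000, 7)])
    for k in range(5)
}
hits = lookup_all(indexes, "chr1", 107)
print(hits)
```

With this layout, the cost of a query grows with the number of files but only logarithmically with the number of variants per file.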

A Longitudinal Study of Adverse Drug Effects in Patients with Fibromyalgia

Lau PK, Burmester JK, Berg DL, Mazza JJ, Schmelzer JR, Yale SH.
Clinical Research Center
Research area: Clinical Research 

Patrick Lau
Johns Hopkins University

Background: Fibromyalgia (FM) is a clinical syndrome that includes chronic widespread musculoskeletal pain and other symptoms such as sleep disturbances and depression. Treatment of FM depends on patient symptoms, physician experience and preferences, and knowledge of adverse drug effects (ADEs). This study examined ADEs associated with drugs currently used in clinical practice for treatment of FM.

Methods: We retrospectively examined the records of 250 patients diagnosed with FM by a rheumatologist in 2011-2012 and abstracted person-level data on medications used to treat FM and on ADEs. An ADE was defined as an unfavorable and unintended response to a drug. We calculated the proportion of ADEs by drug using the patient as the observation unit, their exposure to the drug, and ADE occurrence, summing across all patients with the same drug exposures. Similarly, we developed person-level data to compare the likelihood of an ADE from FM drugs between patients who had and had not experienced ADEs to non-FM drugs before their FM diagnosis by a rheumatologist.

Results: Among the drugs prescribed to at least 10 patients, those most frequently associated with ADEs were Milnacipran (40.0%), Pregabalin (29.5%), Duloxetine (25.6%), Celecoxib (25.0%), and Venlafaxine (25.0%). Drugs with the lowest frequencies of ADEs were Diclofenac Epolamine (0.0%), Baclofen (0.0%), Lidocaine (0.0%), Hydrocodone-Acetaminophen (0.0%), and Ibuprofen (0.0%). Our preliminary analyses suggest that patients who experience a prior ADE to a non-FM drug may have a greater risk of experiencing an ADE from FM treatment (38.98% vs. 29.17%, p = 0.15).

Conclusions: Patients with a history of an ADE to a non-FM drug should be closely monitored for ADEs after FM treatment. One consideration for clinicians when prescribing medications to treat FM should be the risk for ADEs, measured roughly in this study with relative frequencies.

Analysis of the Relationship between Obstructive Sleep Apnea and Venothromboembolic Events

Olson EM, Burmester JK, Berg RL, Boero JA, Karanjia PN, Mazza JJ, Schmelzer JR, Yale SH.
Clinical Research Center, Marshfield Clinic Research Foundation, Marshfield, WI.
Research area: Clinical Research 

Emily Olson
St. Olaf College

Background: Obstructive Sleep Apnea (OSA) is characterized by pauses or decreases of air flow during sleep. This condition is associated with a higher risk of cardiovascular disease including stroke and myocardial infarction. Although previous studies investigated the relationship between OSA and coagulopathy, limited information is available regarding whether OSA is a risk factor for venothromboembolic events (VTE). This study examined the VTE incidence in relation to OSA severity and treatment compliance.

Methods: This study retrospectively reviewed the electronic medical records of 619 Marshfield Clinic patients with a documented history of (a) a polysomnogram indicating OSA and (b) one or more VTE, in addition to records for 100 OSA control patients with no recorded VTE. Patients were excluded if the diagnostic polysomnogram was nonexistent or prior to 2000, or VTE diagnostic dates were nonspecific. For the 292 evaluable patients who met the diagnostic criteria, diagnosis dates determined the time between the OSA diagnosis and VTEs. OSA treatment compliance was established based on provider comments. Other abstracted data included BMI, polysomnographic variables (including Apnea/Hypopnea Index, AHI), age, gender, and hypercoagulable states.

Results: The Kaplan-Meier plot showed that the median years to VTE was 5.8 for non-compliant patients (95% confidence interval: 3.3-7.4) and 7.3 years for compliant patients (95% confidence interval: 5.9-11.3). The trend for increased VTE rate among non-compliant individuals was not significant (p=0.496). After adjusting for gender, age, BMI and AHI (a measure of OSA severity) in a proportional hazards regression model, only age was highly significant (p=0.001). AHI had a slight influence (p=0.054). Lack of statistical significance may be due to a small sample size.
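For readers unfamiliar with how a median time-to-event is read off a Kaplan-Meier curve, here is a minimal sketch of the product-limit estimator. The follow-up times and event flags are invented for illustration and have no connection to the study data; the function names are likewise hypothetical.

```python
def kaplan_meier(durations, events):
    """Product-limit survival estimate.

    durations: follow-up time in years for each patient.
    events: True if a VTE was observed at that time, False if the patient was censored.
    Returns [(time, survival)] at each distinct event time.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        removed = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            removed += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

def median_time(curve):
    """First event time at which estimated survival drops to 0.5 or below (None if never)."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Hypothetical cohort of six patients; False marks censored follow-up.
curve = kaplan_meier([2, 3, 4, 5, 6, 8], [True, False, True, True, False, True])
print(median_time(curve))  # 5
```

Censored patients leave the risk set without lowering the survival estimate, which is why the median can differ from the simple median of observed event times.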

Conclusions: Neither OSA severity nor treatment compliance significantly changed VTE incidence. More data on confounding variables are required before OSA can be shown to be an independent risk factor for VTE.

Predictive Text Keyboard Applications in a Healthcare Setting

Omage S, Mahnke A, Lin S.
Biomedical Informatics Research Center
Research area: Interactive Clinical Design 

Stephanie Omage
University of Queensland School of Medicine

Background: Mobile technological devices such as tablets are gradually being introduced for healthcare activities such as data entry. Fast and accurate data entry is desirable. Typing with these devices may require learning to use touchscreen/onscreen keyboards. We evaluated the effectiveness, efficacy and satisfaction of using default and predictive text keyboards on a Samsung tablet.

Methods: Medical/clinical note-taking personnel participated in a one-time usability evaluation lasting approximately 30 minutes. This usability test was conducted in a simulated environment on a Samsung Android tablet using both default and predictive keyboards. Participants were asked to take on the role of a medical provider typing de-identified sample clinical notes. Effectiveness (accuracy/error rates) and efficacy (words per minute) of each keyboard were recorded and measured using a Morae recorder and the ExamDiff Pro comparison tool. Baseline (or typical) typing speed was measured on a Fujitsu convertible laptop computer. The sequence of the keyboard test (default, predictive or Fujitsu) was randomized. Participants’ subjective experiences, including satisfaction, were queried using surveys and an exit interview.

Results: Under IRB approval, 15 subjects were recruited. Median typing speed (in words per minute, WPM) was faster with the default keyboard (17.9) than with the predictive keyboard (13.3; p=0.03). Percentage error was 3.81 for the predictive text keyboard and 3.50 for the default keyboard (p>0.05). Eight of the 15 participants preferred the predictive keyboard over the default when queried through the survey.

Conclusions: Although participants were not comfortable with predictive texting initially, many felt they could learn to use it and become more efficient with it. A longitudinal study following participants could monitor whether the effectiveness, efficacy and satisfaction of using predictive keyboards change over time.

Formal Usability Evaluation of a Touch-Based Tooth Charting Application

Sorenson AD, Schwei KM, Mahnke AN, Acharya A.
Biomedical Informatics Research Center
Research area: Dental Informatics 

Adam Sorenson
University of Wisconsin

Background: Clinical computing in dentistry is increasingly prevalent. Studies demonstrate a need for intuitive clinical information systems in dental practices. The dental informatics team at Marshfield Clinic had previously developed a prototypical "Intelligent Tooth Charting System" (iTooth) for tablet use, enabling dental providers to record tooth status using touch-based gestures. The purpose of this study was to evaluate the usability of iTooth through feedback from dental providers.

Methods: Nine participants were recruited, including dentists, hygienists, and dental assistants. During one-on-one sessions, the moderator gave participants basic instruction on iTooth. Participants were video and audio recorded as they completed typical tooth charting tasks using a think-aloud protocol. Pre- and post-test interviews and surveys measuring the system usability scale (SUS) score and Net Promoter Score (NPS) were administered. Participants completed a desirability word selection exercise. Sessions were reviewed and analyzed using Morae Manager (TechSmith).

Results: iTooth had a SUS score of 67.75 and an NPS of zero. Task time varied from 10 to 120 seconds. Average percent completion without assistance varied from 10% to 100% among all tasks. Usability problems were identified, including participants’ inability to locate a particular icon (78% of participants), confusion resulting from charting statuses (44% of participants), and 50% recognition failure for the root canal gesture. The most commonly selected words describing the system were "efficient," "useful," "usable," and "new."

Conclusions: An NPS of zero is considered neutral and a SUS score of 67.75 translates to the grade of "C", meaning the current prototype has average usability. Results will direct iTooth’s future development, improving usability and functionality. Suggested improvements include revised icon placement for easier location, grouping of "pathology" and "treatment planned" charting statuses, and enhanced symbol recognition. Feedback illustrated that iTooth may be an exciting alternative to traditional mouse-computer interfaces in the future of dental charting.
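The SUS and NPS values in this abstract are derived from questionnaire responses by standard scoring rules, sketched below. The example responses are hypothetical and are not the study's data; the sketch assumes the usual 10-item SUS with 1-5 ratings and a 0-10 likelihood-to-recommend question for NPS.

```python
def sus_score(responses):
    """Score one participant's 10 SUS items (each rated 1-5) on the 0-100 scale."""
    if len(responses) != 10:
        raise ValueError("SUS has exactly 10 items")
    total = 0
    for item, rating in enumerate(responses, start=1):
        # Odd-numbered items are positively worded, even-numbered items negatively worded.
        total += (rating - 1) if item % 2 else (5 - rating)
    return total * 2.5

def net_promoter_score(ratings):
    """Percent promoters (9-10) minus percent detractors (0-6) on a 0-10 rating."""
    promoters = sum(r >= 9 for r in ratings)
    detractors = sum(r <= 6 for r in ratings)
    return 100 * (promoters - detractors) / len(ratings)

# Hypothetical participant who mildly agrees with positive items and mildly disagrees with negative ones.
example_sus = sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2])
print(example_sus)  # 75.0

# Hypothetical ratings from nine participants: equal promoters and detractors give an NPS of zero.
example_nps = net_promoter_score([10, 9, 9, 8, 7, 7, 6, 5, 3])
print(example_nps)  # 0.0
```

An NPS of zero therefore does not mean every participant was neutral; it means promoters and detractors were balanced.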

Radiographic Evaluation of Stillbirth: What Does it Contribute?

Swenson EE, Schema LS, McPherson EW.
Department of Medical Genetics Services
Research area: Clinical Research 

Erica Swenson
University of Wisconsin School of Medicine and Public Health

Background: The Wisconsin Stillbirth Service Program (WiSSP) was established to provide comprehensive stillbirth evaluation and promote research into causes of stillbirth. The WiSSP protocol has always included whole body radiographs. However, as prenatal ultrasound quality has improved, most information previously acquired from radiographs has been obtained prenatally. Several recent stillbirth evaluation protocols have omitted X-rays or limited them to select cases. Using the 2799 cases evaluated through WiSSP, we evaluated the utility of whole body radiographs in making diagnoses in the stillborn population.

Methods: Through a computerized search of the WiSSP data, radiographic anomalies were identified in 517/2032 cases with radiographs. For these 517 cases, radiographs were compared with data including perinatal and family history, clinical examination, photographs, autopsy, placental pathology, and laboratory results to determine what role the radiograph played in assigning cause of fetal death.

Results: 234/517 (45%) anomalous radiographs were sufficient to make a diagnosis. 30/2032 (1.5%) radiographs resulted in a diagnosis that would otherwise have been missed, and 204/2032 (10.0%) provided confirmation of a diagnosis, for an overall diagnostic yield of 11.5%. 197/517 (38%) helped identify abnormalities contributing to fetal death. Conditions recognized on radiographs included hydrops (165), skeletal dysplasias (25), Mendelian conditions other than skeletal dysplasias (38), chromosomal conditions (95), sporadic birth defects (116), multiple congenital anomalies (43), and non-genetic conditions such as fetal sepsis (12). Only 16/517 (3%) did not help assign cause of death, 15/517 (3%) were misleading, and 56/517 (11%) were incidental anomalies that did not explain fetal death.
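The yield figures above follow directly from the reported counts, as this short check shows. The variable and function names are chosen for the example; only the counts come from the abstract.

```python
TOTAL_RADIOGRAPHS = 2032   # cases with radiographs
ANOMALOUS = 517            # radiographs with anomalies

def pct(numerator, denominator, digits=1):
    """Percentage rounded to the given number of decimal places."""
    return round(100 * numerator / denominator, digits)

new_diagnoses = 30         # diagnoses that would otherwise have been missed
confirmed_diagnoses = 204  # diagnoses confirmed by the radiograph
diagnostic = new_diagnoses + confirmed_diagnoses

yield_new = pct(new_diagnoses, TOTAL_RADIOGRAPHS)
yield_confirmed = pct(confirmed_diagnoses, TOTAL_RADIOGRAPHS)
yield_overall = pct(diagnostic, TOTAL_RADIOGRAPHS)
share_of_anomalous = pct(diagnostic, ANOMALOUS, digits=0)

print(yield_new, yield_confirmed, yield_overall, share_of_anomalous)
```

Note that the 45% figure uses the 517 anomalous radiographs as its denominator, while the 11.5% overall yield uses all 2032 radiographs.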

Conclusion: Radiographs are useful for recognizing skeletal dysplasias, other birth defects involving bones, and soft tissue abnormalities. Radiographs provided an overall diagnostic yield comparable to other parts of the stillbirth evaluation and are especially valuable in the absence of autopsy and photographs. With the possible exception of cases where a definitive cause of death is known immediately, radiographs should remain a routine part of stillbirth evaluation.

Developing a Reference Framework for Cross-Mapping between Different Dental Diagnostic Terminologies

Vesel MK, Shimpi NA, Acharya A.
Biomedical Informatics Research Center
Research area: Dental Informatics

Megan Vesel
Northern Michigan University

Background: Unlike in medicine, there are no nationally standardized diagnostic terminology sets (DTS) for dentistry. International Classification of Disease (ICD) codes cover some dental diagnoses; however, they are not granular enough. A more granular, dental-specific DTS, EZCodes, was created by Harvard University. The American Dental Association has also developed a DTS for dentistry called SNODENT. Our objective was to develop a reference framework between ICD and EZCodes and to understand the coverage of the diagnostic terms across these DTSs.

Methods: Cross-mapping was achieved by looking for semantic and/or syntactic matching terms between ICD-9-to-ICD-10, ICD-9-to-EZCodes, and ICD-10-to-EZCodes. Mapping between each term was coded as complete, partial or no match. Once all terms were mapped between two DTSs, we categorized the type of relationship as one-to-one, one-to-many, or many-to-one. All mapping was done using online references, and searching for direct or related matches. Cross-mapping was then reviewed by a dentist for accuracy.
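The relationship categories described above can be derived mechanically once the matched term pairs are recorded. The sketch below is one possible way to do that, not the procedure the authors used (their mapping was done manually with online references); the codes are placeholders rather than real ICD or EZCodes entries, and the "many-to-many" label is an extra case added for completeness.

```python
from collections import defaultdict

def classify_relationships(pairs):
    """Label each (source_code, target_code) match by how many partners each side has."""
    targets_of = defaultdict(set)
    sources_of = defaultdict(set)
    for source, target in pairs:
        targets_of[source].add(target)
        sources_of[target].add(source)

    labels = {}
    for source, target in pairs:
        many_targets = len(targets_of[source]) > 1
        many_sources = len(sources_of[target]) > 1
        if many_targets and many_sources:
            labels[(source, target)] = "many-to-many"  # not a category named in the abstract
        elif many_targets:
            labels[(source, target)] = "one-to-many"
        elif many_sources:
            labels[(source, target)] = "many-to-one"
        else:
            labels[(source, target)] = "one-to-one"
    return labels

# Placeholder codes: "A*" stands for terms in one terminology, "B*" for terms in the other.
matches = [("A1", "B1"), ("A2", "B2"), ("A2", "B3"), ("A3", "B4"), ("A4", "B4")]
relationship = classify_relationships(matches)
for pair, label in relationship.items():
    print(pair, label)
```

Source terms that never appear in the match list would be the "no match" cases counted separately in the results.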

Results: Overall, 36% of terms matched in ICD-9-to-EZCodes; of these, 35% matched completely and 65% matched partially. ICD-10-to-EZCodes had a higher share of matching codes at 47%, of which 74% matched completely and 26% matched partially. ICD-9-to-ICD-10 showed 95% matching, with 70% complete and 30% partial matches.

Conclusions: This study developed a reference framework for cross-mapping between different dental diagnostic terminologies. Our results confirm the ICD-9-to-ICD-10 cross-mapping had the largest amount of complete and partial matches and lowest amount of no matches, as expected when comparing related terminologies. Being a detailed dental diagnostic vocabulary, EZCodes had the largest percentage of no matches to ICD-9 and ICD-10. Additionally, ICD-10 was more in depth than ICD-9, explaining why ICD-10 more closely matched the granularity of EZCodes, with fewer no matches in ICD-10-to-EZCodes than in ICD-9-to-EZCodes.