Beyond the Comfort Zone: Traditional Statistical Programmers Embrace R to Expand their Toolkits
In the pharmaceutical industry, traditional statistical programmers have long relied on proprietary software to perform data analysis tasks. However, in recent years, there has been a growing interest in open-source tools like R, which offer a range of benefits including flexibility, reproducibility, and cost-effectiveness.
In this presentation, we will explore the ways in which statistical programmers in the pharmaceutical industry are embracing R to expand their toolkits and improve their workflows, including data visualization and the generation of Analysis Data Model (ADaM) datasets.
One key challenge in using R to generate ADaM datasets is bridging the gap between open-source R packages (e.g., admiral, metacore, metatools, and xportr from Pharmaverse) and the company’s internal resources. We will discuss strategies for overcoming this challenge and for integrating these packages into a company’s existing infrastructure, including the development of in-house R packages and the provision of internal template scripts and use cases.
Overall, this presentation will provide examples of how R can be used as a powerful complement to traditional statistical programming languages such as SAS. By embracing R, statistical programmers can expand their toolkits, collaborate across the industry to tackle common issues, and, most importantly, provide value to their organizations and the wider industry.
Pronouns: she/her
Vancouver, BC, Canada
Kangjie Zhang is a Lead Statistical Analyst at Bayer within the Oncology Data Analytics team. She uses SAS and R for statistical analysis and reporting, supporting clinical trial studies and facilitating the transition from SAS to R for clinical submissions. With a passion for open-source projects, she has contributed to multiple R packages. Before joining the pharma industry, she worked as a Data Analyst at CHASR ([Canadian Hub for Applied and Social Research](https://chasr.usask.ca/index.php)) and the Saskatoon Police Station, using R for data collection, manipulation, and predictive modeling.