Dror Berel

Stop Building, Start Assembling

Afternoon Keynote, 1:15 - 1:45 PM

After 15+ years as a data science consultant spanning academia and public health, I’ve witnessed spectacular pipeline failures that haunt my dreams: spaghetti code architectures, unnecessarily complex solutions, and bizarre attempts to rebuild standard tools from scratch. These experiences taught me a counterintuitive truth: the most sophisticated data solution is the one you didn’t have to build at all. This talk challenges the “build from scratch” mentality plaguing data science teams and presents a radical alternative: stop building, start assembling. By leveraging mature, battle-tested open-source ecosystems, teams achieve what custom solutions rarely deliver—maintainability, transparency, and scalability with a fraction of the effort.

I’ll share real-world transformations, including how we replaced a costly closed-source vendor tool (3-day runtime) with 4 lines of open-source code that runs instantly. You’ll learn practical strategies for choosing the right tools over reinventing them, and discover why simplicity and clarity consistently outperform complexity.

Key Takeaways:

Why your “sophisticated” custom pipeline is probably a liability
How to identify when existing tools solve your “unique” problem
Real examples of dramatic simplification wins
Practical approaches you can implement immediately

Perfect for data scientists, engineers, and team leads tired of maintaining fragile, over-complicated systems who want to build solutions that actually last.

Pronouns: he/him

Seattle, WA, USA

Dror Berel is a statistical consultant with over 20 years of work experience in both academia and industry. He loves using R for (almost) everything. He works as an independent consultant, solving business problems and scaling analytical tools for diverse data domains, leveraging both traditional machine learning and causal inference along with modern approaches. The data domains he specializes in include genomic biomarker discovery and clinical data reporting (CDISC, ADaM, TLGs). His services also include authoring Real World Evidence data analyses and documentation; writing statistical plans; power analysis; design of experiments; biomarker discovery; and developing R/Shiny apps (with Golem, Rhino, Teal, and others) and REST APIs.