The European Anaphylaxis Registry is a unique database that gathers anaphylaxis cases along with high-quality clinical data. It is maintained by NORA e.V. I’m lucky to be working on this challenging dataset as a data scientist.
My job in the project
My work on the European Anaphylaxis Registry started in 2015. I first looked into sex differences in anaphylaxis severity. Next, I investigated the influence of aggravating factors on the severity of anaphylaxis.
What I have learned
This project was a demanding and complex task that required me to incorporate new tools into my daily practice. I had to reorganise my analytic workflow to use Git daily, recording all changes made to the files. Collaborating with team members who preferred to work with Word files was also a challenge. Nevertheless, I managed to write a reproducible paper entirely in R and, with the “knitr” package, convert it into the final Word documents.
I have learned how to work with bibliographies in R and how to divide analyses and reports into separate files that can then be used to produce the final manuscript. During this project I had to start writing my own package with helper functions around the logistic regression outputs. I also ran into scalability problems, which were resolved with over 50 custom R scripts that manage the big data analysis.
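To give an idea of what a helper around logistic regression output can look like, here is a minimal sketch in base R. It is not the code from my package: the function name `odds_ratio_table`, the variable names and the simulated data are all made up for illustration, and the intervals are simple Wald intervals from `confint.default()`. The last step compares two nested models with a likelihood-ratio test via `anova()`.

```r
# Hypothetical helper (not the actual package code): turn a fitted logistic
# regression into a table of odds ratios with Wald confidence intervals.
odds_ratio_table <- function(model, level = 0.95) {
  stopifnot(inherits(model, "glm"), family(model)$family == "binomial")
  est <- coef(model)                            # coefficients on the log-odds scale
  ci  <- confint.default(model, level = level)  # Wald intervals, also log-odds
  data.frame(
    term       = names(est),
    odds_ratio = exp(unname(est)),              # exponentiate to get odds ratios
    ci_lower   = exp(ci[, 1]),
    ci_upper   = exp(ci[, 2]),
    row.names  = NULL
  )
}

# Simulated toy data, NOT registry data; all variable names are assumptions.
set.seed(1)
n <- 500
toy <- data.frame(
  sex      = factor(sample(c("female", "male"), n, replace = TRUE)),
  exercise = rbinom(n, 1, 0.3)
)
toy$severe <- rbinom(
  n, 1,
  plogis(-1 + 0.4 * (toy$sex == "male") + 0.8 * toy$exercise)
)

# Two nested models: sex only, then sex plus one potential aggravating factor.
fit_sex  <- glm(severe ~ sex, data = toy, family = binomial)
fit_full <- glm(severe ~ sex + exercise, data = toy, family = binomial)

or_table <- odds_ratio_table(fit_full)
print(or_table)

# Likelihood-ratio test comparing the nested models.
lrt <- anova(fit_sex, fit_full, test = "Chisq")
print(lrt)
```

Keeping such a function in its own file means every table in the manuscript is produced the same way, and a change to the interval method only has to be made once.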
- Logistic regression analysis
- Comparing logistic regression models
- Odds ratios and relative risk differences
- Code externalization
- Polishing ggplot figures until they are publication ready
- It is possible to collaborate with multiple authors on a manuscript written directly in knitr, but all changes must be introduced into the source files each time, which can be a daunting task
- Using Git is essential so as not to get lost along the way
- Having a strict structure for the analysis in future projects
- Analysis and manuscript write-ups need a strictly defined file structure
- Minimizing code snippets in the manuscript file increases its readability.
- All functions should be in separate files
- Wrapping a paper into an R package might be a good idea for future projects.
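As an illustration of keeping code out of the manuscript file, the sketch below shows one way code externalization can be set up with knitr. The file name `analysis.R`, the chunk labels and the simulated data are assumptions made for this example, not the layout of the actual project. The script runs as ordinary R; the `## ---- label` lines are the markers that `knitr::read_chunk()` uses to split a script into named chunks.

```r
# analysis.R (hypothetical file name): all computation lives here, split into
# labelled chunks. To plain R the "## ---- label" lines are ordinary comments.

## ---- simulate-data
# Simulated stand-in data, NOT registry data.
set.seed(42)
cases <- data.frame(
  severe = rbinom(200, 1, 0.35),
  sex    = factor(sample(c("female", "male"), 200, replace = TRUE))
)

## ---- fit-model
fit <- glm(severe ~ sex, data = cases, family = binomial)

## ---- report-or
# Odds ratio for male vs. female (female is the reference level).
or_male <- unname(exp(coef(fit)["sexmale"]))
round(or_male, 2)

# In the manuscript (.Rmd) a setup chunk would call
#   knitr::read_chunk("analysis.R")
# after which an empty chunk that carries only a label, e.g. {r fit-model},
# is filled with the code of the same label from this file.
```

The manuscript then contains mostly prose and chunk labels, which is easier to read and to diff in Git than a file where text and long code blocks are interleaved.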