Data Engineer
- United Kingdom
- Permanent
My client, a biotech company harnessing artificial intelligence to transform the drug discovery and development process with a focus on immunology and inflammation, is searching for a Data Engineer to develop and enhance their data stack.
As a Data Engineer, you will play a critical role in generating, processing, storing, and providing access to multimodal datasets related to immune-mediated diseases. Your role will involve creating technical solutions to enhance the quality, accessibility, and usability of multimodal datasets, enabling teams across the company to conduct impactful research and drive innovation.
You will join the R&D Team and report directly to the Chief Scientific Officer (CSO).
Your Responsibilities:
- Data Generation: Discover and evaluate publicly available immuno-inflammation datasets, including clinical and molecular data, and develop automated solutions for their collection and integration.
- Data Processing: Design, build, and maintain scalable bioinformatics pipelines to automate the curation, cleaning, and preparation of the Scienta Lab data portfolio, ensuring data integrity and reliability for downstream analysis.
- Data Annotation: Establish and maintain high-quality dataset annotations, ensuring they are comprehensive, accurate, and aligned with internal standards.
- Data Documentation: Manage and maintain thorough documentation for all datasets, including metadata, provenance, and usage guidelines, adhering to industry best practices to ensure reproducibility.
- Data Visualization: Develop visualizations and feasibility studies to assess datasets and support business decisions. Provide interactive dashboards and tools for intuitive data exploration and actionable insights.
- Coding Collaboration: Partner with the technical team to support data modeling efforts, promote best practices in software engineering, and ensure seamless dataset integration into analytical workflows.
- Cross-Team Collaboration: Work closely with business, scientific, and data science teams to ensure datasets are accessible, well-documented, and meet quality standards. Serve as the primary point of contact for dataset-related inquiries and technical support.
Your Profile:
- Experience: 4+ years of experience in computer science, bioinformatics, computational biology, or a related field.
- Data Expertise: Strong understanding of omics datasets (e.g., transcriptomics, proteomics) and clinical data structures.
- Data Interpretation: Proven ability to analyze complex datasets and create effective data visualizations.
- Pipeline Development: Hands-on experience in developing and optimizing bioinformatics pipelines using workflow management systems (e.g., Snakemake, Nextflow).
- Programming: Strong programming skills in Python, with a solid grasp of software engineering best practices.
- Teamwork: Proven ability to collaborate effectively with biologists, researchers, and software engineers in a multidisciplinary environment.
- Problem-Solving: A solution-oriented mindset with a strong sense of service to meet project and team needs.