Senior Data Engineer (with Python)

  • Red Sky
  • Remote job

Job description

Red Sky is a key part of the Tar Heel Capital Pathfinder fund (www.thcpathfinder.com), which was launched in late 2016. Our idea is to build projects with a global perspective and a particular emphasis on modern IT solutions. Every year we introduce new, revolutionary products to the market. We support great ideas and promote talent. We help build innovative startups not only by providing financial resources, but above all through substantive support.


We operate in accordance with the venture building formula: we create projects based on the experience of experts, including teams of programmers, designers, project managers, HR, administration and marketing, ready to develop and support promising startups.


NaturalAntibody is seeking a Senior Data Engineer to work on its computational antibody drug discovery product portfolio.


Antibodies are natural proteins of the immune system tasked with identifying noxious molecules for elimination. This extraordinary molecular-recognition capacity has been harnessed for drug discovery, with multiple antibody-based blockbuster drugs on an ever-growing market. Antibody-based therapies are typically developed using arduous experimental protocols. Computational approaches now hold the promise of accelerating this drug development, and this is the focus of our company.


NaturalAntibody is a company specializing in the development of computational methods for antibody-based drug discovery. Our goal is to understand the biology of antibody molecules, their therapeutic context and how such knowledge can be translated into improved antibody therapy design. We pursue this goal by collecting, generating and analysing antibody data, with the end goal of applying our findings to antibody discovery.


Responsibilities

As a Senior Data Engineer you will design and work on our data and analytics stacks. You will contribute to our data stack by analysing our existing databases and creating novel datasets. You will use this data to improve our analytics stack by creating suitable computational models that address pertinent needs in antibody-based therapy development. The work will be a combination of software development and research, so you should be well suited to tackling open-ended challenges independently. The work will bring you into close collaboration with leading experts in the field of drug discovery in the pharmaceutical industry, so communication and teamwork skills are very important.


Here are just a few examples of potential tasks or activities in this role:

  • Designing Big Data pipelines in line with good practices such as Infrastructure as Code (IaC), with high availability and security in mind.
  • Data collection, curation and maintenance for the existing data stack and novel databases.
  • Analysis and benchmarking of the existing models in the analytics stack.
  • Development of novel computational models for antibody drug discovery.
  • Research into the biology of antibodies and their therapeutic context.
  • Liaising with clients from the industry.

Requirements

The successful candidate should have:

  • Expertise in handling large datasets, preferably biological data (e.g. Next Generation Sequencing, Proteomics, Protein Structures).
  • Programming skills in Python and experience with tools designed for Big Data processing (terabytes of data), such as Spark and Apache Airflow.
  • Experience in designing cost-efficient data pipelines using AWS tools such as AWS EMR, AWS Glue, Step Functions etc.
  • Knowledge of Infrastructure as Code (IaC) tools such as Terraform or CloudFormation.
  • A high level of self-discipline: as a Data Engineer you will be responsible for making meaningful decisions about the project’s course based on your insights.
  • Full proficiency in English (mandatory).

Nice to have:

  • A Master's-level degree in computer science, statistics, data science, bioinformatics or a similar field. A PhD would be a strong plus.
  • Prior work in Immunoinformatics is a strong plus.
  • Hands-on expertise in applied statistical methods – knowledge of machine learning is a plus.

Remuneration:

  • 14 000 - 18 000 PLN net B2B

In addition to remuneration, we offer:

  • Work on an innovative project with a direct impact on its development
  • Access to unique knowledge within the organization and cooperation with outstanding experts and business partners from around the world
  • 100% financing for participation in industry conferences throughout Europe
  • Budget for books, training and other materials
  • Private medical care package (Medicover), Multisport card, group insurance
  • Integration events in the spirit of #redskyteamspirit


Additional information:

  • Methodology: Kanban
  • Technical stack:
    • Python
    • PySpark
    • AWS EMR, AWS Glue, S3, ECS
    • DocumentDB (MongoDB)
    • Terraform