Interviewing project grant awardee Kevin Arvai

Today we’re interviewing Kevin Arvai. Kevin is a bioinformatician with an interest in personal genetic data. He was awarded a project grant to build a project that will bring genotype imputation to the Open Humans community.

Kevin, please give our blog readers a quick introduction about who you are!

I am a data scientist at a clinical genetics company in Maryland. My background and formal education are in biology; however, I completed a master’s degree in computational biology and bioinformatics. Like many, I’m riding the wave of data that our generation has found itself immersed in by competing in data science competitions and contributing to “open-” (source, science, data) projects. I’m particularly interested in machine learning and human genetics, but I’m looking forward to learning new skills by building Imputer.

When and how did you come to Open Humans?

I came to Open Humans in February 2018 after working on a project with the Director of Research, Bastian, at a hackathon hosted by NCBI.

Have you been involved in any projects on Open Humans so far, either as a participant or even running your own?

Not only is this my first project working with Open Humans, this is my first project as part of an open source community. Open Humans was a welcoming and collaborative group of people that encouraged my ideas, so it seemed like a perfect fit to start contributing.

Your project Imputer was awarded one of the Open Humans project grants. Can you explain to us what the project is about?

The goal of Imputer is to provide users with a more comprehensive picture of their genome. Direct-to-consumer genetics companies, like 23andMe, only genotype a small fraction of the genome. Researchers are finding new genetic locations associated with traits and diseases at a rapid pace. Users might be interested in knowing their genotype status for these new associations, but the locations may be in regions that direct-to-consumer tests are not genotyping. Imputer leverages the vast amount of genotype data made available by the 1000 Genomes Project and by the Haplotype Reference Consortium to provide Open Humans users with genotype estimates at additional locations in their genome.
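
(Editor’s aside: for readers curious about what “estimating a genotype from a reference panel” can look like, here is a deliberately tiny Python sketch. It is not Imputer’s code. Real imputation tools use statistical models over very large panels of phased haplotypes; this toy simply finds the reference haplotypes that best match the alleles a user does have and lets them vote on the missing one. The panel, the function name `impute_site`, and its parameters are all made up for illustration.)

```python
from collections import Counter

# Hypothetical toy reference panel: each string is one haplotype,
# one character per site (positions 0..4). This is not real data.
REFERENCE_PANEL = [
    "ACGTA",
    "ACGTA",
    "ATGCA",
    "GTGCC",
]


def impute_site(observed, target, panel=REFERENCE_PANEL):
    """Estimate the allele at an untyped site (toy nearest-haplotype vote).

    observed: dict mapping typed site index -> observed allele
    target:   index of the untyped site to estimate
    Returns (allele, fraction of the best-matching haplotypes carrying it).
    """
    # Score each reference haplotype by how many typed sites it matches.
    scores = [sum(hap[i] == a for i, a in observed.items()) for hap in panel]
    best = max(scores)
    # Keep only the closest haplotypes and let them vote on the target site.
    votes = Counter(hap[target] for hap, s in zip(panel, scores) if s == best)
    allele, count = votes.most_common(1)[0]
    return allele, count / sum(votes.values())


# A user genotyped at sites 0 and 3 only; estimate the allele at site 1.
allele, support = impute_site({0: "A", 3: "T"}, target=1)
print(allele, support)
```

In this toy panel, two haplotypes match both observed alleles and both carry “C” at site 1, so the estimate is “C” with full support among the matching haplotypes. With fewer typed sites the vote becomes less certain, which is why imputed genotypes are best read as estimates rather than measurements.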

How did you come up with the idea behind Imputer?

Imputer grew out of a long conversation over lunch with Bastian.

Is there anything important that we didn’t cover so far that you’d like to add?

I’d like to encourage others who are “interested in, but anxious about” contributing to open source projects to take the leap! If you’ve found this post, Open Humans is a great place to start!

Has Kevin’s encouragement motivated you to take action? The Open Humans project grants are ongoing, and you can apply for one too!

