File upload & “data types”

Last week I finalized and publicly shared the File Uploader activity on Open Humans.1

This is a standard “activity” in Open Humans. Our structure is modular: in theory, anyone can use the APIs to create an activity that does the same thing. (In this case, the activity is created and administered by us. Yes, it’s open source.)

As such, using the File Uploader requires going to a different site, which is linked on the activity page.

What’s the “File Uploader” for?

This is intended as a replacement for “Data Selfies”: a generic approach for file upload to Open Humans.

There are a variety of personal data files that people might want to upload in order to (1) share them with research and citizen science projects, and/or (2) run personal data analyses on them using our notebooks tool.

Indeed, the number of potential data sources is effectively unlimited, and tracking which type of data is in a file is important (and tricky): this is critical for authorizing and sharing data with other activities, and for identifying data when running personal data analysis.

About “Data Types”

“File Uploader” adds something new that “Data Selfies” lacked: data types.

Historically, Open Humans primarily understood “what” a file is according to its “data source”, that is, the activity that added the data (e.g. “23andMe Uploader” or “AncestryDNA Uploader”). This had two issues: (1) a single activity might generate more than one “type” of data, and (2) different activities might have data of the same “type”, and people might want to manage that in a unified way (e.g. “any personal genetic data”).

On top of that: it’s weird overhead to create a new website for each new type of file upload!

So “DataTypes” was created! Don’t see a DataType you need? You can create one yourself…

How to create a new DataType

  1. Go to the DataTypes page:
  2. Click “Add datatype”
  3. Add information (e.g. “parent datatype” if any, how it’s acquired or created, format info)
  4. Mark “uploadable” if this is something the File Uploader should support

Data Types created in this way (i.e. “uploadable”) are immediately available as an option on the File Uploader.
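
To make the idea concrete, here is a minimal sketch of what a DataType record and the “uploadable” filter could look like. This is only an illustration: the `DataType` class, its field names, the `uploader_options` helper, and the example entries are all hypothetical, not the actual Open Humans schema or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataType:
    """Hypothetical stand-in for a DataType record (not the real schema)."""
    name: str
    description: str = ""
    parent: Optional[str] = None  # name of the parent datatype, if any
    uploadable: bool = False      # should the File Uploader offer it?


def uploader_options(datatypes):
    """Return the names of datatypes marked uploadable, sorted for display."""
    return sorted(dt.name for dt in datatypes if dt.uploadable)


# Made-up example entries, for illustration only.
registry = [
    DataType("Genetic data", "Any personal genetic data"),
    DataType("Genotyping data", "Raw genotype file",
             parent="Genetic data", uploadable=True),
    DataType("Sleep data", "Sleep tracker export", uploadable=True),
]

print(uploader_options(registry))
```

In this sketch, a datatype that is not marked “uploadable” (like the generic parent) still exists for organizing things, but it would not show up as an upload option.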

What’s left?

So far, the utility of “data type” isn’t fully implemented. Next things to do are probably…

  1. Search Personal Data Notebooks according to “data type”. Right now it’s only possible to search according to “data source”. (Hopefully this is easy to do!)
  2. Enable requesting “data type” for activities. Currently activities can only request authorization for specific “data sources”. This is probably “hard” to add (the logic and interface get complicated), but it should be possible for a project to request a type of data (e.g. “sleep data”) in a generic way.
  3. Retire old file uploader projects. As mentioned above, it’s silly to have separate projects for each one. (But doing this smoothly might require completing step #2 above.)
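
As a rough sketch of what item 2 might involve: a request for a data type would presumably need to match files from any source, including files tagged with a child of the requested type. The code below assumes a simple parent lookup table and file records with `source` and `datatype` fields. All of these names and example files are hypothetical, and this is not how Open Humans currently works.

```python
def descendants(type_name, parents):
    """Return type_name plus every datatype that descends from it."""
    matched = {type_name}
    changed = True
    while changed:  # keep sweeping until no new child types are found
        changed = False
        for child, parent in parents.items():
            if parent in matched and child not in matched:
                matched.add(child)
                changed = True
    return matched


def files_matching(type_name, files, parents):
    """Names of files whose datatype is type_name or one of its descendants."""
    wanted = descendants(type_name, parents)
    return [f["name"] for f in files if f["datatype"] in wanted]


# Hypothetical lookup: datatype name -> parent datatype name (or None).
parents = {
    "Genetic data": None,
    "Genotyping data": "Genetic data",
    "Sleep data": None,
}

# Made-up files from different data sources.
files = [
    {"name": "genome.txt", "source": "23andMe Uploader", "datatype": "Genotyping data"},
    {"name": "ancestry.txt", "source": "AncestryDNA Uploader", "datatype": "Genotyping data"},
    {"name": "sleep.csv", "source": "File Uploader", "datatype": "Sleep data"},
]

print(files_matching("Genetic data", files, parents))
```

Here a request for “Genetic data” picks up files from both uploaders without naming either data source, which is the unified handling that data types are meant to allow.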

1 This was initially created a long time ago! I was very absent during the pandemic, and I’m trying to get back into things, dusting off old stuff we meant to finish. 😅
