I recently finished a new artwork — called Genetic Portraits — which is a series of microscope photographs of laser-etched glass that data-visualize a person’s genetic traits.
I specifically developed this work as an experimental piece for the Bearing Witness: Surveillance in the Drone Age show. I wanted to look at an extreme example of how we have freely surrendered our own personal data for corporate use. In this case, 23andMe provides a (paid) extensive genetic sequencing package. Many people, including me, have sent in saliva samples to the company, which they then process. From their website, you can get a variety of information, including their projected likelihood that you might be prone to specific diseases based on your genetic traits.
Following my line of inquiry with other projects such as Data Crystals and Water Works, where I wrote algorithms that transformed datasets into physical objects, this project processes an individual’s genetic sequence to generate vector files, which I later use to laser-etch microscope slides. The full project details are here.
Concept + Material
I began my experiment months earlier, before the project was solidified, by examining the effect of laser-etching on glass underneath a microscope. This stemmed from conversations with some colleagues about the effect of laser-cutting materials. When I looked at the etched glass underneath a microscope, I saw amazing results: an erratic universe accentuated by curved lines. Even with the same file, each etching is unique. The glass cracks in different ways. Digital fabrication techniques still result in distinct analog effects.
When the curators of the show, Hanna Regev and Matt McKinley, invited me to submit work on the topic of surveillance, I considered how to leverage various experiments of mine, and came back to this one, which would be a solid combination of material and concept: genetic data etched onto microscope slides and then shown at a macro scale: 20” x 15” digital prints.
Surrendering our Data
I had so many questions about my genetic data. Is the research being shared? Do we have ownership of this data? Does 23andMe even ask for user consent? As many articles point out, the answers are exactly what we fear. Their user agreement states that “authorized personnel of 23andMe” can use the data for research. This official-sounding text simply means that 23andMe decides who gets access to the genetic data I submitted. 23andMe is not unique: other gene-sequencing companies have similar provisions, as the article suggests.
Some proponents suggest that 23andMe is helping the research front, while still making money. It’s capitalism at work. This article in Scientific American sums up the privacy concerns. Your data becomes a marketing tool and people like me handed a valuable dataset to a corporation, which can then sell us products based on the very data we have provided. I completed the circle and I even paid for it.
However, what concerns me even more than 23andMe selling or using the data (after all, I did provide my genetic data, fully aware of its potential use) is the statistical accuracy of genetic data. Some studies have reported a Eurocentric bias in the data, and the FDA has also battled with 23andMe regarding the health data they provide. The majority of the data (with the exception of Bloom’s Syndrome) simply wasn’t predictive enough. Too many people had false positives with the DNA testing, which not only causes worry and stress but could lead to customers taking pre-emptive measures such as getting a mastectomy if they mistakenly believe they are genetically predisposed to breast cancer.
A deeper look at the 23andMe site shows a variety of charts that make it appear like you might be susceptible (or immune) to certain traits. For example, I have lower-than-average odds of having “Restless Leg Syndrome”, which is probably the only neurological disorder that makes most people laugh when hearing about it. My genetic odds of having it are simply listed as a percentage.
Our brains aren’t very good with probabilistic models, so we tend to inflate and deflate statistics. Hence one of the many problems with false positives.
And, as I later discovered, from an empirical standpoint, my own genetic data strayed far from my actual personality. Our DNA simply does not correspond closely enough to reality.
Data Acquisition and Mapping
From the 23andMe site, you can download your raw genetic data. The resulting many-megabyte file is full of rsid data and the actual allele sequences.
Isolating useful information from this was tricky. I cross-referenced some of the rsids used for common traits from 23andMe with the SNP database. At first I wanted to map ALL of the genetic data. But, the dataset was complex — too much so for this short experiment and straightforward artwork.
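As a rough illustration of that isolation step, here is a minimal Python sketch rather than the code I actually used. It assumes the raw download is a tab-separated text file with `#` comment lines and four columns (rsid, chromosome, position, genotype). The function name `parse_raw_genotypes`, the sample rows and the rsids in them are all made up for the example.

```python
def parse_raw_genotypes(lines, wanted_rsids=None):
    """Return {rsid: genotype} from raw-data lines.

    Assumes four tab-separated columns: rsid, chromosome, position, genotype.
    Lines starting with '#' are treated as comments and skipped.
    """
    genotypes = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # skip anything that does not match the assumed layout
        rsid, _chromosome, _position, genotype = fields
        if wanted_rsids is not None and rsid not in wanted_rsids:
            continue
        genotypes[rsid] = genotype
    return genotypes


# Hypothetical sample rows; these rsids are placeholders, not real markers.
SAMPLE = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAA\n"
    "rs0000002\t2\t2000\tAG\n"
    "rs0000003\t3\t3000\tGG\n"
    "i0000004\t4\t4000\t--\n"
)

# Keep only the handful of rsids chosen for the artwork.
selected = parse_raw_genotypes(
    SAMPLE.splitlines(), wanted_rsids={"rs0000001", "rs0000003"}
)
print(selected)
```

Filtering while reading keeps the many-megabyte file from ever being held in memory as a whole; only the few selected markers survive.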
Instead, I worked with some specific indicators that correlate to physiological traits such as lactose tolerance, sprinter-based athleticism, norovirus resistance, pain sensitivity, the “math” gene and cilantro aversion: 15 in total. I avoided genes that might correlate to various general medical conditions like Alzheimer’s and metabolism.
For each trait I cross-referenced the SNP database with 23andMe data to make sure the allele values aligned properly. This was arduous at best.
There was also a limit on physical space for etching the slide, so having more than 24 marks or etchings on one plate would be chaotic. Through days of experimentation, I found that 12-18 curved lines would make for compelling microscope photography.
To map the data onto the slide, I modified Golan Levin’s decades-old Yellowtail Processing sketch, which I had been using as a program to generate curved lines onto my test slides. I found that he had developed an elegant data-storage mechanism that captured gestures. From the isolated rsids, I then wrote code which gave weighted numbers to allele values (e.g. AA = 1, AG = 2, GG = 3, depending on the rsid).
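The weighting can be pictured as a small lookup table per rsid. The sketch below is Python rather than Processing, and it is only an illustration: the table `ALLELE_WEIGHTS`, the function `allele_weight`, the placeholder rsids and the choice to return 0 for unknown values are all assumptions, not the values used in the artwork.

```python
# Hypothetical weight tables, one per rsid. Keys are genotypes with their
# letters in alphabetical order so that "AG" and "GA" resolve to the same entry.
ALLELE_WEIGHTS = {
    "rs0000001": {"AA": 1, "AG": 2, "GG": 3},
    "rs0000002": {"CC": 1, "CT": 2, "TT": 3},
}


def allele_weight(rsid, genotype, weights=ALLELE_WEIGHTS, default=0):
    """Look up the weight for a genotype at a given rsid.

    Unknown rsids and unlisted genotypes (for example a no-call such as "--")
    fall back to `default`.
    """
    table = weights.get(rsid, {})
    key = "".join(sorted(genotype))  # normalize letter order: "GA" -> "AG"
    return table.get(key, default)


print(allele_weight("rs0000001", "GA"))
print(allele_weight("rs0000002", "TT"))
```

Keeping a separate table per rsid matters because the same letters mean different things at different positions, which is why the weights depend on the rsid.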
Based on the rsid numbers themselves, my code generated (x, y) anchor points and curves with the allele values changing the shape of each curve. I spent some time tweaking the algorithm and moving the anchor points. Eventually, my algorithm produced this kind of result, based on the rsids.
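To give a feel for this kind of mapping, here is a hedged Python sketch that turns an rsid and a weight into an SVG path, a vector format a laser cutter workflow can typically consume. It is not the modified Yellowtail code. The digit-based placement in `anchor_point`, the quadratic Bezier in `curve_path`, the 750 x 250 canvas and every default value are invented for illustration.

```python
def anchor_point(rsid, width=750.0, height=250.0):
    """Derive a deterministic (x, y) anchor from the digits of an rsid.

    Assumes the rsid contains at least one digit.
    """
    number = int("".join(ch for ch in rsid if ch.isdigit()))
    x = (number % 1000) * width / 1000.0
    y = ((number // 1000) % 1000) * height / 1000.0
    return x, y


def curve_path(rsid, weight, length=120.0, bend=30.0):
    """Return an SVG quadratic Bezier path; a larger weight bends it further."""
    x0, y0 = anchor_point(rsid)
    x1, y1 = x0 + length, y0
    cx, cy = x0 + length / 2.0, y0 + bend * weight  # control point sets the bulge
    return f"M {x0:.1f} {y0:.1f} Q {cx:.1f} {cy:.1f} {x1:.1f} {y1:.1f}"


def svg_document(paths, width=750, height=250):
    """Wrap path strings in a minimal SVG document."""
    body = "\n".join(
        f'  <path d="{d}" fill="none" stroke="black"/>' for d in paths
    )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">\n{body}\n</svg>'
    )


# Placeholder rsid, paired with a weight of 2.
print(svg_document([curve_path("rs0200400", 2)]))
```

Because the anchor comes only from the rsid, the same marker lands in the same place on every portrait, and only the curve's shape differs from person to person.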
The question I always get asked about my data-translation projects is about legibility. How can you infer results from the artwork? It’s a silly question, like asking a Kindle engineer to analyze a Shakespeare play. A designer of data-visualization will try to tell a story using data and visual imagery.
My research and work focus on deep experimentation with the formal properties of sculpture (or physical forms) based on data. I want to push boundaries of what art can look like, continuing the lineage of algorithmically-generated work by artists such as Sol LeWitt, Sonya Rapoport and Casey Reas.
Is it legible? Slightly so. Does it produce interesting results? I hope so.
But, with this project, I’ve learned so much about genetic data — and even more about the inaccuracies involved. It’s still amazing to talk about the science that I’ve learned in the process of art-making.
Each of my 5 samples looks a little bit different. This is the mapping of actual genetic traits of my own sample and that of one other volunteer named “Nancy”.
GENETIC TRAITS FOR SCOTT (ABOVE)
GENETIC TRAITS FOR NANCY (BELOW)
We both share a number of genetic traits such as the “empathy” gene and the same result for the curly-hair trait. The latter seems correct: both of us have remarkably straight hair. I’m not sure about the empathy part. Neither one of us is lactose intolerant (also true in reality).
But the test-accuracy breaks down on several specific points. Nancy and I do have several differences including athletic predisposition. I have the “sprinter” gene, which means that I should be great at fast-running. I also do not have the math gene. Neither one of these is at all true.
I’m much more suited to endurance sports such as long-distance cycling and my math skills are easily in the 99th percentile. From my own anecdotal standpoint, except for well-trodden genetics like eye color, cilantro aversion and curly hair, the 23andMe results often fail.
The genetic data simply doesn’t seem to support the physical results. DNA is complex. We know this: it is non-predictive. Our genotype results in different phenotypes and the environmental factors are too complex for us to understand with current technology.
Back to the point about legibility. My artwork is deliberately non-legible based on the fact that the genetic data isn’t predictive. Other mapping projects such as Water Works are much more readable.
I’m not sure where this experiment will go. I’ve been happy with the results of the portraits, but I’d like to pursue this further, perhaps with scientists who would be interested in collaborating around the genetic data.
FOUR FINAL SLIDE ETCHINGS (BELOW)