Genetics and Health

Genetics Interview #17: Stew of Flags and Lollipops

by Hsien Hsien Lei, PhD on September 21st, 2006

If I were giving career advice to my son (who’s only four years old, by the way), I would tell him to consider going into informatics. And if I were really pushy, I’d suggest bioinformatics. With computing power increasing exponentially and the internet offering up overwhelming amounts of information, we need people who can figure out a way to organize it all so the rest of us can actually deal with it. One such person is Stew (pseudonym) of Flags and Lollipops and postgenomic. I’m glad he took time out of his busy schedule working at Nature in the web publishing department to do this genetics interview for us!

1. You work in bioinformatics, which I think is the glue that holds the genome revolution together. What kind of role do you think bioinformatics plays?

I’d agree: modern day genetics relies on vast quantities of data that you couldn’t begin to navigate or process efficiently without software of some sort. Nowadays sequence ‘search engines’ like BLAST and genome browsers like Ensembl are standard tools for genetics researchers. On an even more basic level, without sequence alignment algorithms there’d be no complete genomes to search or browse in the first place.

That’s the data processing side of bioinformatics. It’s also got a role to play in creating new data from the old. By doing clever things with existing information you can, for example, take a novel gene, feed it through machine learning algorithms and get back a predicted function based on the sequences of genes that have already been studied, or model a particular process in a cell, or predict which point mutation out of many on a particular gene is most likely to be responsible for causing some disease.
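To make the idea of predicting function from already-studied sequences concrete, here is a minimal sketch. It is not a method Stew describes: it swaps in a very simple nearest-neighbour comparison on shared k-mers (short subsequences) in place of a real machine learning pipeline, and the function names and labels are invented for illustration.

```python
def kmers(seq, k=3):
    """Return the set of length-k substrings found in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Share of k-mers common to both sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict_function(novel_seq, annotated, k=3):
    """Borrow the label of the most similar annotated sequence.

    `annotated` is a list of (sequence, label) pairs; the labels are
    whatever annotation you already trust (an assumption of this sketch).
    Returns (label, similarity), or (None, -1.0) if `annotated` is empty.
    """
    novel = kmers(novel_seq, k)
    best_label, best_score = None, -1.0
    for seq, label in annotated:
        score = jaccard(novel, kmers(seq, k))
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score
```

Real tools use far richer features and proper alignment, but the shape is the same: compare the unknown gene with known ones and carry over the annotation of the closest match.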


2. What kinds of genetics projects are you involved in and what are your specific contributions?

There’s always day-to-day stuff - “bread and butter” bioinformatics - like converting data from one format to another. This sounds pretty simple, but it inevitably involves bringing together two different databases or genome releases. (Refinements to the human genome map get released occasionally, with bits of sequence added or taken away, so when bringing two bits of data together you have to be sure that they refer to the same release.) I think people would be surprised at how much time in bioinformatics can be taken up with just getting different programs to talk to one another. Don’t get me started on the number of “unique” identifiers that a gene can have.
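The two pitfalls mentioned here, mismatched genome releases and multiple identifiers for one gene, can be guarded against in a few lines. The sketch below is only an illustration: the `Feature` record, the alias table and every identifier in it are hypothetical, not taken from any real database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    gene_id: str   # identifier as it appears in the source database
    release: str   # genome release the coordinates refer to
    chrom: str
    start: int
    end: int


# Hypothetical alias table: every identifier here is invented for illustration.
ALIASES = {"DB1:0001": "GENE_A", "DB2:g_a": "GENE_A", "DB1:0002": "GENE_B"}


def canonical_id(gene_id, aliases=ALIASES):
    """Map a database-specific identifier to one shared name, if known."""
    return aliases.get(gene_id, gene_id)


def merge_features(left, right):
    """Group features from two sources by canonical gene name.

    Raises ValueError if the inputs mix genome releases, because
    coordinates from different releases are not directly comparable.
    """
    left, right = list(left), list(right)
    releases = {f.release for f in left + right}
    if len(releases) > 1:
        raise ValueError(f"mixed genome releases: {sorted(releases)}")
    merged = {}
    for feature in left + right:
        merged.setdefault(canonical_id(feature.gene_id), []).append(feature)
    return merged
```

Failing loudly on a release mismatch is a deliberate choice in this sketch: silently combining coordinates from two releases is the kind of error that is very hard to spot later.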

The last lab that I worked in was interested in finding the genes underlying certain human diseases. That was really interesting - I liked being involved in something of such direct, obvious benefit to people.

Part of this involved data processing: dealing with genotyping results for large groups of patients and controls, for example.

Another part was looking at how software might help with some of the problems that researchers in this area need to contend with. Nowadays finding the gene involved in simple, Mendelian disorders is relatively straightforward, but it’s a totally different story with complex disorders like diabetes or schizophrenia where it’s the interaction between many different genes and the environment which results in disease.

Researchers can pinpoint areas which they suspect contain a gene involved in the disease, but typically each area in question is huge - containing tens to hundreds of genes. It’d take far too long (and cost $$$) to look at each one in turn, so traditionally at this point scientists would have to go back to literature and sequence databases to draw up a shortlist based on clues like expression patterns and Gene Ontology (GO) terms.

We wrote some programs to automate this process as far as possible - processing and comparing large amounts of data is something that computers are very good at, obviously.
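The programs themselves aren’t shown in the interview, so what follows is only a guess at the general shape of such a tool, not a reconstruction of it. It assumes each candidate gene comes with a set of GO terms and a set of tissues where it is expressed, and ranks genes by how many of those overlap with what is already suspected about the disease. All gene names, term labels and the scoring rule are made up.

```python
def shortlist(candidates, disease_terms, disease_tissues, top_n=5):
    """Rank candidate genes by overlap with disease-related clues.

    `candidates` maps a gene name to a dict with two sets:
    "go_terms" and "tissues". The score is simply the number of
    shared GO terms plus the number of shared tissues (an
    assumption of this sketch, not an established scoring scheme).
    Genes with no overlap at all are left off the list.
    """
    scored = []
    for gene, info in candidates.items():
        go_hits = len(info["go_terms"] & disease_terms)
        tissue_hits = len(info["tissues"] & disease_tissues)
        score = go_hits + tissue_hits
        if score > 0:
            scored.append((score, gene))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [gene for _, gene in scored[:top_n]]
```

A region with a hundred genes can be passed in as one dictionary, and the result is a short ordered list to check by hand, which is the time-consuming step this kind of automation is meant to shorten.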

3. For people interested in bioinformatics as a career, what would be your advice?

There’s not really a traditional route into bioinformatics, so don’t worry too much about getting specific qualifications beyond a basic science degree. Enthusiasm and a willingness to learn should be enough to get you started. Having said that, you will obviously need to know how to write programs.

I’d suggest that your first steps should be:

If you’re a biologist, learn Perl. In the short-medium term you’ll need (and appreciate) it.

If you’re a computer scientist, check out the NCBI bookshelf and learn some molecular biology basics.

In either case you might feel a bit overwhelmed at first - don’t worry about it. You’ll pick things up as you go along.

4. The amount of genetic data we’re gathering is growing exponentially. What do you think is the best approach to getting it all under control?

The thing to remember about all this data is that typically it’s just raw, unprocessed stuff. For working scientists to get the most out of it the data needs to be analyzed and put into context.

That means having computer systems powerful enough to store and process everything - some people are interested in using the Grid (a ‘new’ technology that, put very simply, involves networking many computers together and sharing the processing load) for this.

Unfortunately it’s difficult to get funding to maintain a small database, and in any case small databases don’t, as a rule, generate publications. Therefore it’ll fall to the big players (UCSC, the NCBI and Ensembl) to store - and provide an interface to - the end results, at least in the longer term. This isn’t a bad thing: it lets scientists access all of the data about a particular protein / gene / location in one place.

5. As our personal genomes get sequenced, what measures should the average citizen take to ensure his/her privacy?

Here in the UK this is something that researchers are taking very seriously - not least because of initiatives like the UK BioBank, which is collecting the genetic profiles of thousands of people.

The key thing would be to make sure that you know at the point of collection where your data is going to be used. Whoever is doing the collecting has an obligation to get your informed consent - “informed” being the operative word here. You’re the one in control - say no if you’re uncomfortable with anything.

Patient data is typically anonymized in research projects: anything that could plausibly identify you is removed. Unfortunately levels of anonymity vary - what makes you anonymous to somebody off the street just looking for your name doesn’t necessarily make you anonymous to somebody who knows other things about you (age? weight? gender?) searching through the records.
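One common way to put a number on this risk is to count how many records share the same combination of “other things” such as age band or gender; a record that is alone in its group is easy to pick out even without a name. The sketch below illustrates that idea (often called k-anonymity). It is not something described in the interview, and the field names are hypothetical.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of attribute values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def at_risk(records, quasi_identifiers, k=2):
    """Return the records whose combination is shared by fewer than k people.

    `quasi_identifiers` lists the fields an outsider might already know
    (an assumption you have to make per dataset).
    """
    sizes = group_sizes(records, quasi_identifiers)
    return [
        r for r in records
        if sizes[tuple(r[q] for q in quasi_identifiers)] < k
    ]
```

Running `at_risk` with a longer list of fields will usually flag more records, which is the point made above: the more somebody already knows about you, the less anonymous the same dataset becomes.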

Still, don’t get too worried. In general genetics researchers go to great lengths to fulfil their legal and ethical responsibilities. There are far more pressing privacy intrusions to be concerned about: supermarkets giving your purchase behaviour to your bank or internet companies making your search history public, for example. :)

Thanks, Stew. You’re truly on the cutting edge of the genome revolution. Exciting times!!
