Monthly Archives: October 2012

Mapping genes to pathways using biomaRt

One common task in RNA-seq analysis is to map significantly regulated genes to pathways. Pathway information is available from various databases, e.g. Reactome and UniProt. The code below shows how to map genes of interest (as reported by Cufflinks/cummeRbund) to human pathway annotations in UniProt, taken here from the GO terms prefixed with 'P:' on each UniProt entry.


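library(cummeRbund)
library(biomaRt)

# Load the cuffdiff output from the current working directory.
data <- readCufflinks()
gene_list <- genes(data)
gene_diff_data <- diffData(gene_list)

# Lookup table from gene_id to gene_short_name.
gene_list <- features(gene_list)
gene_list <- subset(gene_list, select=c('gene_id', 'gene_short_name'))

# Keep only the genes flagged as significant.
sig_gene_data <- subset(gene_diff_data, significant == 'yes')

# Connect to the UniProt mart (listMarts() shows which marts are currently available).
unimart <- useMart('unimart', dataset='uniprot')

# Start from an empty table with named columns; results are appended per gene.
pathway.data <- data.frame(gene_name=character(0), go_name=character(0),
 stringsAsFactors=FALSE)

for (i in unique(sig_gene_data$gene_id)) {
 # getBM() expects a vector of values, so extract the column rather than a data frame.
 gene_name <- as.character(subset(gene_list, gene_id == i)$gene_short_name)
 tmp <- getBM(attributes=c('gene_name', 'organism', 'go_name'),
 filters='gene_name',
 values=gene_name,
 mart=unimart)
 # Keep human entries only, then the GO terms prefixed with 'P:'.
 tmp <- subset(tmp, organism == 'Homo sapiens',
 select=c('gene_name', 'go_name'))
 tmp <- tmp[grep('^P:', tmp$go_name), ]
 pathway.data <- rbind(pathway.data, tmp)
}
write.table(pathway.data, 'sig_gene_pathway.csv', sep=',', row.names=FALSE)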
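
The filtering step inside the loop can be tried without a BioMart connection. The sketch below is only an illustration: the annotations table is a made-up stand-in for a getBM() result (the gene names and GO terms are placeholders, not real annotations), and filter_bp_terms is a hypothetical helper name. It assumes that the 'P:' prefix marks biological-process terms, as in UniProt's GO cross-references.

```r
# Toy stand-in for a getBM() result; all values are placeholders.
annotations <- data.frame(
  gene_name = c("GENE_A", "GENE_A", "GENE_A", "GENE_B"),
  organism  = c("Homo sapiens", "Homo sapiens", "Mus musculus", "Homo sapiens"),
  go_name   = c("P:example process 1", "F:example function",
                "P:example process 2", "C:example component"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows for one species whose GO term
# starts with 'P:' and return only the two columns of interest.
filter_bp_terms <- function(df, species = "Homo sapiens") {
  keep <- df$organism %in% species & grepl("^P:", df$go_name)
  df[keep, c("gene_name", "go_name"), drop = FALSE]
}

pathway_terms <- filter_bp_terms(annotations)
print(pathway_terms)
```

Wrapping the two subsetting steps in one function makes it easy to check the filter on a small table before running the full loop against the remote mart.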

 

 


About me

I am a computational biologist. My work is mainly focused on RNA-seq studies and computational modeling of cell signaling pathways.