Monthly Archives: October 2012

Mapping genes to pathways using biomaRt

One common task in RNA-seq analysis is to map significantly regulated genes into pathways. Pathway information are available from various databases, e.g. reactome, uniprot, etc. The code below shows how to map genes of interest (as reported by cufflinks/cummeRbund) to human pathways in uniprot.


library(cummeRbund)
data <- readCufflinks()
gene_list <- genes(data)
gene_diff_data <- diffData(gene_list)
gene_list <- features(gene_list)
gene_list <- subset(gene_list, select=c('gene_id', 'gene_short_name'))
sig_gene_data <- subset(gene_diff_data, (significant=='yes'))
library(biomaRt)
unimart = useMart('unimart', dataset='uniprot')
pathway.data <- data.frame('gene_name', 'go_name')
for (i in sig_gene_data$gene_id) {
 gene_name = subset(gene_list, gene_id == i, select='gene_short_name')
 tmp <- getBM(attributes=c('gene_name','organism', 'go_name'),
 filters=c('gene_name'),
 values=gene_name,
 mart=unimart)
 tmp <- subset(tmp, tmp$organism=='Homo sapiens',
 select=c('gene_name', 'go_name'))
 tmp <- tmp[grep('^P:', tmp$go_name),]
 colnames(tmp) <- colnames(pathway.data)
 pathway.data <- rbind(pathway.data, tmp)
}
write.table(pathway.data, 'sig_gene_pathway.csv', sep=',', row.names=FALSE)

 

 

Advertisements