July 4, 2012 / tninja1980msn

20120704.org

Festival travel

SLU family

  • Nothing, not interesting

Tacoma

  • Air show – very cool
  • Sea Park – nice

Shopping

  • Macy's: clothing
  • Great Wall: food
  • Walmart: bedding

June 29, 2012 / tninja1980msn

Start learning Clojure

Install

Books

  • [ ] Joy of Clojure
  • [ ] Seven languages in seven weeks – Reading now
  • [ ] Clojure programming
  • [ ] Clojure in Action

Feeling

  • Simple: code is data
  • Functional programming
  • High performance
  • Cross platform, based on jvm
  • Lisp based
  • Compojure: web framework to learn

Fun

  • Yes. It is worth learning.

May 14, 2012 / tninja1980msn

miR2net.org

Build the graph and show some basic statistics

Load the data and create the graph using the “igraph” package

The vertex type distribution

# basic statistics of this graph
# vertex types: names containing 'hsa-' are microRNAs, everything else is a gene
library(stringr); library(plyr); library(ascii)
vertex <- V(g)
types <- data.frame(type=ifelse(str_detect(vertex$name, 'hsa-'), 'microRNA', 'Gene'))
counts <- transform(ddply(types, .(type), summarise, n=length(type)), percent=sprintf('%.1f%%', n*100/sum(n)))
print(ascii(counts, include.rownames=F, digits=0, caption='regulator type distribution'), type='org')
regulator type distribution

| type     | n    | percent |
|----------+------+---------|
| Gene     | 9532 | 98.1%   |
| microRNA | 180  | 1.9%    |
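
The same tally can be sketched in Python with the standard library alone. The vertex names below are made-up placeholders, not the real data; the only rule taken from the R code above is that names containing 'hsa-' count as microRNA and everything else as Gene.

```python
from collections import Counter

def type_distribution(names):
    """Count vertices per type and format each share as a percentage string."""
    counts = Counter('microRNA' if 'hsa-' in name else 'Gene' for name in names)
    total = sum(counts.values())
    return {t: (n, '%.1f%%' % (n * 100 / total)) for t, n in counts.items()}

# hypothetical vertex names, for illustration only
example_names = ['MYC', 'TP53', 'hsa-miR-21', 'E2F1']
for vertex_type, (n, percent) in sorted(type_distribution(example_names).items()):
    print(vertex_type, n, percent)
```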

Edge type distribution

# edge types
library(plyr)
library(ascii)
counts <- transform(ddply(reg2tar, .(type), summarise, n=length(type)), percent=sprintf('%.1f%%', n*100/sum(n)))
print(ascii(counts, include.rownames=F, digits=0, caption='regulation type distribution'), type='org')
regulation type distribution

| type     | n     | percent |
|----------+-------+---------|
| mir2gene | 516   | 2.7%    |
| tf2gene  | 18455 | 95.3%   |
| tf2mir   | 403   | 2.1%    |

Render the network

Render MYC network

# generate the subgraph
nodes.name <- c('MYC')
# igraph before 0.6 numbers vertices from 0, hence the "- 1"; drop it on igraph 0.6 or later
nodes.id <- which(V(g)$name %in% nodes.name) - 1
neighbor.nodes <- neighbors(g, v=nodes.id)
g.sub <- subgraph(g, c(nodes.id, neighbor.nodes))

# plot it: style each vertex by its name in the subgraph, not in the full graph
library(stringr)
is.mir <- str_detect(V(g.sub)$name, 'hsa-')
plot(g.sub, layout=layout.fruchterman.reingold, vertex.size=ifelse(is.mir, 3, 6), vertex.label=V(g.sub)$name, vertex.color=ifelse(is.mir, 'pink', 'lightblue'), edge.color=ifelse(E(g.sub)$coef > 0, 'red', 'green'))

http://tninja1980msn.files.wordpress.com/2012/05/wpid-mycnetwork.pdf

Render TP53 network

# generate the subgraph
nodes.name <- c('TP53')
# igraph before 0.6 numbers vertices from 0, hence the "- 1"; drop it on igraph 0.6 or later
nodes.id <- which(V(g)$name %in% nodes.name) - 1
neighbor.nodes <- neighbors(g, v=nodes.id)
g.sub <- subgraph(g, c(nodes.id, neighbor.nodes))

# plot it: style each vertex by its name in the subgraph, not in the full graph
library(stringr)
is.mir <- str_detect(V(g.sub)$name, 'hsa-')
plot(g.sub, layout=layout.fruchterman.reingold, vertex.size=ifelse(is.mir, 3, 6), vertex.label=V(g.sub)$name, vertex.color=ifelse(is.mir, 'pink', 'lightblue'), edge.color=ifelse(E(g.sub)$coef > 0, 'red', 'green'))

http://tninja1980msn.files.wordpress.com/2012/05/wpid-tp53network.pdf
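
Both renderings rest on the same step: take one gene, collect its direct neighbors, and keep only the edges among them. A minimal Python sketch of that step, using made-up edges rather than the real data. It treats edges as undirected, which is an assumption and may differ from what neighbors() returns on a directed graph.

```python
def neighborhood_subgraph(edges, center):
    """Return (nodes, edges) of the subgraph induced by `center` and its direct neighbors."""
    nodes = {center}
    for source, target in edges:
        if source == center:
            nodes.add(target)
        elif target == center:
            nodes.add(source)
    # keep every edge whose two endpoints both survive, like an induced subgraph
    kept = [(s, t) for s, t in edges if s in nodes and t in nodes]
    return nodes, kept

# hypothetical regulator -> target pairs, for illustration only
example_edges = [('MYC', 'hsa-miR-17'), ('E2F1', 'MYC'),
                 ('hsa-miR-17', 'E2F1'), ('TP53', 'CDKN1A')]
nodes, kept = neighborhood_subgraph(example_edges, 'MYC')
print(sorted(nodes))
print(kept)
```

Note that the edge between the two neighbors is kept as well, so the picture shows how the neighbors regulate each other and not only how they connect to the center.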

Render the whole network

I could not do it: rendering ran out of RAM on my machine (3-4 GB).

Potential solutions:

  • Build an R-to-Cytoscape pipeline to render it on a low-RAM machine; this will take some time.
  • Use a machine with more memory, for example a large-memory Amazon EC2 instance.
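
One possible starting point for the Cytoscape route is to write the edge list to a SIF file, where each line is source, interaction type and target separated by tabs, so that the rendering happens outside R. A minimal Python sketch: the edge triples and the file name are hypothetical, and only the three interaction labels come from the edge type distribution above.

```python
import os
import tempfile

def write_sif(edges, path):
    """Write (source, interaction, target) triples as tab-separated SIF lines."""
    with open(path, 'w') as handle:
        for source, interaction, target in edges:
            handle.write('%s\t%s\t%s\n' % (source, interaction, target))

# hypothetical edges, for illustration only
example_edges = [('MYC', 'tf2gene', 'E2F1'),
                 ('MYC', 'tf2mir', 'hsa-miR-17'),
                 ('hsa-miR-17', 'mir2gene', 'E2F1')]
sif_path = os.path.join(tempfile.gettempdir(), 'mir2net_example.sif')
write_sif(example_edges, sif_path)
```

Writing one line per edge keeps memory use flat on the exporting side; whether Cytoscape can then lay out the full network on the same machine is a separate question.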

May 11, 2012 / tninja1980msn

my first post

I am good!

May 10, 2012 / tninja1980msn

showoff2.org

Let’s do a linear regression

x <- runif(1000) * 100
y <- x * 5 + rnorm(1000)
fit <- lm(y ~ x)
library(ascii)
print(ascii(summary(fit)), type='org')

|             | Estimate | Std. Error | t value | Pr(> \vert t \vert ) |
|-------------+----------+------------+---------+----------------------|
| (Intercept) | -0.03    | 0.06       | -0.43   | 0.67                 |
| x           | 5.00     | 0.00       | 4408.99 | 0.00                 |
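
The estimates in the table come from ordinary least squares. As a cross-check in plain Python, the closed-form slope and intercept can be computed directly. This is a sketch of the textbook formula on freshly simulated data, not a reproduction of what lm does internally, and the seed is arbitrary.

```python
import random

def least_squares(xs, ys):
    """Return (intercept, slope) minimizing the squared error of y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

random.seed(0)  # arbitrary seed so the run is repeatable
xs = [random.uniform(0, 100) for _ in range(1000)]
ys = [5 * x + random.gauss(0, 1) for x in xs]
intercept, slope = least_squares(xs, ys)
print('intercept %.2f, slope %.2f' % (intercept, slope))
```

With the same data-generating rule as the R code (slope 5, unit noise), the slope should land very close to 5 and the intercept close to 0.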

Let’s do a PCA

x <- runif(1000) * 100
y <- x * 5 + rnorm(1000)
z <- runif(1000); w <- rnorm(1000)
df <- cbind(x, y, z, w)
p <- prcomp(t(df))
plot(p)

http://tninja1980msn.files.wordpress.com/2012/05/wpid-pca.png?w=595

May 10, 2012 / tninja1980msn

showoff.org

Do some text-mining work

import nltk

s = 'I love my wife pengpeng.'
# print() with parentheses runs on both Python 2 and Python 3
print(nltk.pos_tag(s.split(' ')))

[('I', 'PRP'), ('love', 'VBP'), ('my', 'PRP$'), ('wife', 'NN'), ('pengpeng.', 'NNP')]

Plot a histogram

x <- rnorm(100)
hist(x)

http://tninja1980msn.files.wordpress.com/2012/05/wpid-test.png?w=595

Do a regression (scatter plot with a smoothed fit)

x <- runif(1000)
y <- x^2 * 3 + x * 5 + rnorm(1000)
library(ggplot2)
g <- ggplot(data.frame(x, y), aes(x=x,y=y)) + geom_point() + geom_smooth()
print(g)

http://tninja1980msn.files.wordpress.com/2012/05/wpid-lm.png?w=595

November 2, 2010 / tninja1980msn

Running Galaxy in a production environment

Source: http://bitbucket.org/galaxy/galaxy-central/wiki/Config/ProductionServer

Use a clean environment
wget http://bitbucket.org/ianb/virtualenv/raw/tip/virtualenv.py
python virtualenv.py --no-site-packages galaxy_env

Disable the developer settings
in universe_wsgi.ini:
debug = False
use_interactive = False
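
A quick way to confirm that both settings are really off is to parse the file and check them. The sketch below works on an inline example string rather than a real file, and it assumes the two keys live in an [app:main] section; adjust the section name if your universe_wsgi.ini is laid out differently.

```python
import configparser

def developer_settings_left_on(ini_text, section='app:main'):
    """Return the names of developer settings that are not set to False."""
    parser = configparser.RawConfigParser()  # Raw: leave any %(...)s values uninterpreted
    parser.read_string(ini_text)
    left_on = []
    for key in ('debug', 'use_interactive'):
        # a missing key is reported too, since its default is not known here
        if not parser.has_option(section, key) or parser.getboolean(section, key):
            left_on.append(key)
    return left_on

# hypothetical fragment of universe_wsgi.ini, for illustration only
example_ini = """
[app:main]
debug = False
use_interactive = True
"""
print(developer_settings_left_on(example_ini))
```

An empty list means both developer settings are disabled.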

Switch to a database server (postgres suggested)

sudo apt-get install postgresql

Follow the suggestions here to create the galaxy database.

Then set the connection string in universe_wsgi.ini:

database_connection = postgres:///galaxy?host=/var/run/postgresql

Using a proxy server

Follow the instructions here: http://bitbucket.org/galaxy/galaxy-central/wiki/Config/ApacheProxy
