R: how to write a loop to get a matrix?
Thanks for the wonderful solution suggested by diliop to my previous question:
How to compute pair-wise "sequence similarity score" for ~1000 proteins?
To build upon that answer, I tried to write a loop to compute the pair-wise "sequence similarity score" for 1000 proteins with the following code.
for (i in 1:1000){ score <- score(pairwiseAlignment(seqs[[i]]$seq, seqs[[i+1]]$seq, substitutionMatrix=BLOSUM100, gapOpening=0, gapExtension=-5))}
However, it is difficult for me to convert each score into a data.frame. How can I list out the scores automatically, like this?
seq1 seq2 score
seq1 seq3 score
seq1 seq4 score
....
seq1000 seq1000 score
Could an expert give me more hints on how to do this for 1000 x 1000 proteins?
This appears to be a task you can do with expand.grid and apply:
seqs <- c("seq1","seq2","seq3")
dat <- expand.grid(seqs, seqs, stringsAsFactors=FALSE)
dat
apply(dat, 1, function(seq) paste(seq[1], seq[2], sep="--") )
#[1] "seq1--seq1" "seq2--seq1" "seq3--seq1" "seq1--seq2" "seq2--seq2" "seq3--seq2" "seq1--seq3"
#[8] "seq2--seq3" "seq3--seq3"
Admittedly there is duplication of effort if the function returns the same value for f(seq1,seq2) and f(seq2,seq1), so if you wanted to gain efficiency you could limit the rows passed as the first argument to apply:
datr <- dat[dat[,1] > dat[,2] , ]
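As a small check of what that row filter keeps, here is a sketch using the three placeholder names from above. The comparison with combn() is my own addition, not part of the original answer: it enumerates the same unordered pairs directly, without building the full grid first.

```r
seqs <- c("seq1", "seq2", "seq3")
dat  <- expand.grid(seqs, seqs, stringsAsFactors = FALSE)

# Keep one row per unordered pair; the strict ">" also drops self-pairs.
datr <- dat[dat[, 1] > dat[, 2], ]

# combn() returns one pair per column, so transpose to get one pair per row.
pairs <- t(combn(seqs, 2))

# Both approaches should yield n * (n - 1) / 2 rows.
n <- length(seqs)
nrow(datr) == n * (n - 1) / 2
nrow(pairs) == n * (n - 1) / 2
```

Note that ">" on character vectors compares strings, not the numbers inside them, so the ordering is lexical. That does not matter here, since any strict ordering keeps each unordered pair exactly once.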
So if you made such a restricted-row dataframe, datr, then perhaps:
datr$score <- apply(datr, 1, function(seq) {
    score(pairwiseAlignment(seq[1], seq[2], substitutionMatrix=BLOSUM100, gapOpening=0, gapExtension=-5))
})
(Not knowing the arguments in that last line, I cannot test it. You should learn to put real data in your examples and to list the required packages with library or require calls.)
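Since the question asks for both a listing and a matrix, here is a minimal, self-contained sketch of the whole round trip in base R. Everything in it is an assumption made for illustration: the three toy sequences are invented, and sim_score is a hypothetical stand-in (negative edit distance from adist) used only so the example runs without extra packages. In real use you would replace its body with the alignment score call from the question.

```r
# Toy sequences standing in for the real protein data (hypothetical values).
seqs <- c(seq1 = "MKTAYIAK", seq2 = "MKTAYLAK", seq3 = "MQTAYIAR")

# Placeholder similarity: negative edit distance from base R's adist().
# Swap in the real alignment score here.
sim_score <- function(a, b) -as.numeric(adist(a, b))

# One row per unordered pair of sequence names.
pairs <- as.data.frame(t(combn(names(seqs), 2)), stringsAsFactors = FALSE)
names(pairs) <- c("seq1", "seq2")

# mapply walks the two name columns in parallel and looks up each sequence.
pairs$score <- mapply(function(a, b) sim_score(seqs[[a]], seqs[[b]]),
                      pairs$seq1, pairs$seq2, USE.NAMES = FALSE)

# Fill a symmetric matrix from the long table, then score the diagonal.
n <- length(seqs)
m <- matrix(NA_real_, n, n, dimnames = list(names(seqs), names(seqs)))
m[cbind(pairs$seq1, pairs$seq2)] <- pairs$score
m[cbind(pairs$seq2, pairs$seq1)] <- pairs$score
diag(m) <- mapply(sim_score, seqs, seqs)
```

Here pairs is the "seq1 seq2 score" listing and m is the square matrix. Looking sequences up by name keeps the names in the table while passing the actual sequence strings to the scoring function. Mirroring the scores across the diagonal is only valid when the score is symmetric in its two arguments.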