Weighted k-word matches: a sequence comparison tool for proteins

Jing, Junmei; Wilson, Susan; Burden, Conrad

Date

2011

Authors

Jing, Junmei

Wilson, Susan

Burden, Conrad

Publisher

Australian Mathematical Society

Abstract

The use of k-word matches was developed as a fast alignment-free comparison method for DNA sequences in cases where long-range contiguity has been compromised, for example, by shuffling, duplication, deletion or inversion of extended blocks of sequence. Here we extend the algorithm to amino acid sequences. We define a new statistic, the weighted word match, which reflects the varying degrees of similarity between pairs of amino acids. We compute the mean and variance, and simulate the distribution function, for various forms of this statistic for sequences of independent and identically distributed letters. We present these results and a method for choosing an optimal word size. The efficiency of the method is tested using simulated evolutionary sequences, and the results are compared with BLAST.
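As a rough illustration of the idea in the abstract, the sketch below counts exact k-word matches between two sequences and then computes one possible weighted variant. This record does not give the paper's definition of the weighted word match, so the function `weighted_d2`, the product rule used to combine per-letter scores, and the scoring function `toy_score` (with its single amino acid similarity group) are all assumptions made for illustration only, not the authors' method or a real substitution matrix.

```python
from collections import Counter
from math import prod


def kword_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """Number of exact k-word matches between two sequences.

    Counts all position pairs (i, j) where the k-word starting at i in
    seq_a equals the k-word starting at j in seq_b.
    """
    counts_a = kword_counts(seq_a, k)
    counts_b = kword_counts(seq_b, k)
    return sum(n * counts_b[word] for word, n in counts_a.items())


def weighted_d2(seq_a, seq_b, k, score):
    """Hypothetical weighted variant (an assumption, not the paper's formula).

    Each pair of k-words contributes the product of per-letter similarity
    scores, so a 0/1 identity score reproduces the exact-match count d2.
    """
    total = 0.0
    for i in range(len(seq_a) - k + 1):
        for j in range(len(seq_b) - k + 1):
            total += prod(score(seq_a[i + t], seq_b[j + t]) for t in range(k))
    return total


# Toy similarity group, for illustration only (not a real substitution matrix).
SIMILAR = {"I", "L", "V"}


def toy_score(a, b):
    """1.0 for identical letters, 0.5 for two letters in SIMILAR, else 0.0."""
    if a == b:
        return 1.0
    if a in SIMILAR and b in SIMILAR:
        return 0.5
    return 0.0


def identity_score(a, b):
    """0/1 indicator of an exact letter match."""
    return 1.0 if a == b else 0.0
```

For example, `d2("MKVL", "MKIL", 2)` sees only the shared word `MK`, whereas `weighted_d2("MKVL", "MKIL", 2, toy_score)` also gives partial credit to the word pairs `KV`/`KI` and `VL`/`IL`, because V and I are treated as similar under the toy scoring.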

URI

http://hdl.handle.net/1885/22183

Collections

ANU Research Publications

Source

ANZIAM Journal

Type

Journal article

Restricted until

2037-12-31

Downloads

File

01_Jing_Weighted_k-word_matches:_a_2011.pdf (3.45 MB)
