LSCluster, a large-scale sequence clustering and aligning software for use in partial identity mapping and splice-variant analysis

Husi, H., Skipworth, R.J., Fearon, K.C.H. and Ross, J.A. (2013) LSCluster, a large-scale sequence clustering and aligning software for use in partial identity mapping and splice-variant analysis. Journal of Proteomics, 84, pp. 185-189. (doi:10.1016/j.jprot.2013.04.006)

Full text not currently available from Enlighten.

Abstract

Many sequence analyses and multiple sequence alignment tools are widely used in biological research and are well described. However, large-scale proteome-wide analysis to identify potential splice-variants, describe the sequence differences compared to a progenitor sequence and cluster those sequences into individual groups for further analysis is a difficult task with the tools available, and a desktop-based, stand-alone search engine with the capabilities to align and cluster thousands of sequences and present the output in a deprecated format has been lacking. We have developed a novel software named LSCluster (Large-Scale CLUSTERing) which allows users to group tens of thousands of sequences based on sequence alignments or partial identity mapping, and can be used specifically for the detection of splicing variants and other pairs of sequences sharing identical fragments. One of the unique features of LSCluster is its ability to display the alignment output as a deprecated string thereby listing only differences in aligned sequences. The software (current version 2.0) is freely available through the PADB (Proteomic Analysis DataBase) initiative at www.PADB.org.

Biological significance: Large-scale proteome-wide analysis to identify potential splice-variants, describe the sequence differences compared to a progenitor sequence and cluster those sequences into individual groups for further analysis is a difficult task with the tools presently available. This work introduces a desktop-based, stand-alone search engine with the capabilities to align and cluster thousands of sequences and present the output in a deprecated format. We have developed a novel software named LSCluster (Large-Scale CLUSTERing) which allows users to group tens of thousands of sequences based on sequence alignments or partial identity mapping which can be used specifically for the detection of splicing variants and other pairs of sequences sharing identical fragments. One of the unique features of LSCluster is the ability to display the alignment output as a deprecated string listing only differences in aligned sequences. The software (current version 2.0) is freely available through the PADB (Proteomic Analysis DataBase) initiative at www.PADB.org.
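The idea of listing only the differences between aligned sequences can be illustrated with a minimal Python sketch. It is not LSCluster's implementation or its actual output format, neither of which is described in this record. The function name `diff_string`, the `position:reference>variant` notation, the `=` marker for identical sequences, and the example sequences are all hypothetical. The sketch assumes the two sequences have already been aligned to equal length, with `-` marking gaps.

```python
def diff_string(reference: str, variant: str) -> str:
    """Summarise a pairwise alignment by listing only the differing columns.

    Both inputs must already be aligned (equal length, '-' for gaps).
    The notation used here is an illustrative assumption, not LSCluster's format.
    """
    if len(reference) != len(variant):
        raise ValueError("aligned sequences must have equal length")

    runs = []      # half-open (start, end) column ranges where the sequences differ
    start = None   # start column of the run currently being extended, if any
    for i, (r, v) in enumerate(zip(reference, variant)):
        if r != v:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(reference)))

    # Report 1-based alignment columns; "=" means no differences were found.
    parts = [f"{s + 1}:{reference[s:e]}>{variant[s:e]}" for s, e in runs]
    return ";".join(parts) if parts else "="


# Hypothetical progenitor and variant, already aligned.
print(diff_string("MKTAYIAKQR", "MKTAY--KQK"))
```

Adjacent differing columns are merged into one entry, so a skipped region such as a missing exon would appear as a single run rather than as one entry per residue.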

Item Type: Articles
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Fearon, Prof Kenneth and Husi, Dr Holger
Authors: Husi, H., Skipworth, R.J., Fearon, K.C.H., and Ross, J.A.
Subjects: Q Science > QA Mathematics > QA76 Computer software; Q Science > QH Natural history > QH301 Biology
College/School: College of Medical Veterinary and Life Sciences > Institute of Cardiovascular and Medical Sciences
Journal Name: Journal of Proteomics
ISSN: 1874-3919
ISSN (Online): 1876-7737
