CSE 6406: Bioinformatics Algorithms

April 2008 Semester

Starting from 10-th May 2008

Quick Links:     Contact/Course teacher:     Books     Schedule and Course syllabus     Announcements     Marks Distribution     Study Topics     Web resource     Presentation     Project     Final Exam     Bonus/Open Problems     Exercise

 

Contact/Course teacher:

Dr. Masud Hasan

Assistant Professor

Department of CSE, BUET

Office: CSE 657

Phone: PABX: 8614640-44, 9665650-80, Ext. 7738

Email: masudhasan "at" cse "dot" buet "dot" ac "dot" bd

Course web: http://203.208.166.84/masudhasan/cse6406_April_08.html

 

Books:

  1. An Introduction to Bioinformatics Algorithms, by Neil C. Jones and Pavel A. Pevzner, MIT Press, 1st Edition, 2004, [We will follow this book, collect it. Indian Edition should be available at Nilkhet (try "Memory Books")]

  2. Computational Molecular Biology: An Algorithmic Approach, by Pavel A. Pevzner, MIT Press, 1st Edition, 2000.

Schedule and Course syllabus:

Time: Tuesday 5:00pm to 8:00pm

Location: MML (Multimedia Lab), 3rd floor, CSE Department

 

We will try to cover the following topics as per the following tentative sequence:

 

Date, Class Topics Presentation schedule
Lecture 1, 13/05/2008 Course policies, Introduction [1: Ch 3, 2: Ch 13, slides]  
Lecture 2, 20/05/2008 Restriction Mapping: Partial Digest Problem [1: Ch 4], Double Digest Problem [2: Ch 2.2, 2.3, slides]  
Lecture 3, 27/05/2008 Motif Finding [1: 4.4-4.10, Ch 5.5, 12.2, slides-1,   slides-2]  
Lecture 4, 03/06/2008 Genome Rearrangement: Sorting Permutations [1: Ch 5.1-5.4, slides]  
Lecture 5, 10/06/2008 Sequence Alignment [1: Ch 6.4-6.6, 6.8-6.9, Cormen: 15.4]  
Lecture 6, 17/06/2008 No Lecture

Masum, Saddam

Lecture 7, 24/06/2008 DNA Sequencing [1: Ch 8.3-8.8] Rifat, Anupom, Jamil, Mahbub
Lecture 8, 01/07/2008 Protein Sequencing and Identification [1: Ch 8.10-8.15] Bayzid, Mahmudul, Faysal, Nahiduzzaman
Lecture 9, 08/07/2008

Combinatorial Pattern Matching [1: Ch 9.1-9.5]

Guest Lecturer: Dr. Mohammad Sohel Rahman

Swakkhar, Tanvir, Ahsan
15/07/2008 No class, Summer vacation  
Lecture 10, 22/07/2008 No Lecture Kuntal, Rasel, Salahuddin, Sayed
Lecture 11, 29/07/2008 Clustering and Trees (Ch 10.1-10.2, 10.4 (follow slides)) Anindya, Rajkumar, Mahmuda, Mohsina
Lecture 12, 05/08/2008 No Lecture Atif, Foyez, Shahrear, Rakib, Musa, Ashis
Lecture 13, 12/08/2008 No Lecture Mashfiqui, Shuvabrata, Tariq, Mehrab, Tanim
Lecture 14, 19/08/2008 No Lecture Nahiyan, Amit, Tanvirul Islam, Tanvir Anwar
Final Exam, 02/09/2008, 5:00pm Final Exam  

 

 

Announcements:

Marks distribution:

Active participation in the class: 20%

Select a study topic and give a presentation on that: 20%

Write a project on that topic: 20%

Final exam: 40%

Bonus: depending upon the level of work in finding any new result on your research topic or an open problem mentioned in the class.

 

Study topics:

Possible topics: [In primary stage, more topics and more details to come...] [You need GhostView to view ps files]

  1. [Saddam] Polynomial algorithm for PDP by Moris Nivat.

  2. [Tanvir Anwar] The Point Placement Problem: This geometric problem is very much related to restriction mapping problem: Papers:

  3. [Masum] Other Mapping Techniques: Study Double Digest Problem in more detail, study other mapping techniques such as Optical Mapping and Probed Partial Digest Problem, find recent related results. Start with book [2], Chapter 2, and the references there in.

  4. [Jamil, Mahbub] Simple and faster algorithms for sorting signed permutation by reversals. Try to find out why it is difficult to achieve an O(n) or O(nlogn) time algorithm for this problem. The second and third papers below compute the reversal distance, not the sequence of reversals, in subquadretic (i.e., o(n2) time). Papers:

  5. [Mahmudul] Why sorting unsigned permutations by reversals is NP-hard but that for signed permutations is polynomial? What is the glitch? Read the following by Caprara and the polynomial papers, as necessary, from the immediate above project..

  6. [Faysal] Sorting by Transpositions: Can you say why it is so far difficult to determine the complexity of sorting by transpositions, while the problem of sorting by reversals has already been settled down. Read the following two papers, of which the first one is the best approximation result on transpositions and the second one is NP-Hardness of reversals.

  7. [Nahiduzzaman] Sorting by Short Swap: Other than reversals and transpositions, there are many other operations that have been considered for sorting permutations, and short swap is one of them. It is not well-studied. Few results include these papers:

  8. [Swakkhar] Mathematical Challenges from Genomics and Molecular Biology [best suitable for a student taking Combinatorial Optimization course]: Read the following two papers and explain the mathematical challenges in bioinformatics.

  9. [Anindya, Rajkumar] Approximation algorithms on Protein Folding in different models

  10. [Bayzid] Haplotype

  11. [Ahsan] Approximation algorithms for subtree distances between phylogenies

  12. [Rasel] Recombination in Phylogenetic Network

  13. [Salahuddin] New Techniques for Phylogenetic Tree Reconstruction

  14. [Kuntal] BLAST, FASTA

  15. [Rifat] Bioinformatics Software

  16. [Sayed] Biological databases

  17. [Mohsina] Biological databases

  18. [Anupom] AI related algorithms in Bioinformatics

  19. [Mahmuda] Sorting permutations by more than one rearrangement operation

  20. [Atif] Finding motifs with gaps.

  21. [Foyez] Data mining

  22. [Shahrear] Finding segregated sites in DNA sequences

  23. [Mashfiqui] Protein folding

  24. [Rakib] Repetitions and regularities in weighted sequence

  25. [Shuvabrata] Parallel algorithms in bioinformatics problems

  26. [Tariq] DNA computing

  27. [Mehrab] Phylognetic tree

  28. [ABM Musa] Cache Oblivious algorithms for Bioinformatics

  29. [Ashis] Clustering in bioinformatics

  30. [Tanim] System Biology

  31. [Nahian] Service oriented systems for bioinformatics

  32. Research fields, other than the algorithms, in Bioinformatics: Database, System Bioinformatics, etc.

Presentation [20%]:

Follow the guideline below in your presentation.

Project [20%]:

Project will be the written form of what you find on your topic. It must be written in LaTeX. Report size is expected to be within10 to 15 pages with 11 size fonts and single spacing. You can collect latex installation files for Windows from me (~450 MB).

 

Web resources: [more to come...]

Active participation in the class [20%]:

You will get this mark when you actively participate in the class, where by an "active participation" means you come to the class in time, turn your mobile off, attentively listen, ask (logical) questions, give comments, etc. It is not necessary that you ask questions or give comments all the time; even, your body language without any verbose participation may indicate you are attentive in the class. In your final exam some marks may be reserved for questions that will judge whether you actively participated in the class or not.

 

 

Final exam [40%]:

Syllabus: See the schedule and course syllabus table.

 

Bonus/Open Problems:

If you can find any new result on your/others topic or solve any problem mentioned in the class you will get some bonus marks on top of your total marks.

Solved! By Saddam Hossain, see his solution here, Saddam gets 2 marks bonus. Congratulation to Saddam!

Ahsanur Rahman found some sources which shows that a 2(1 + ln n)-approximation is possible for this problem. Ahsan gets a partial mark, 1. Congratulation to Ahsan.

Solved! By Md. Abu Sayeed Mondol. See his solution here. It's so easy! Abu Sayeed gets 2 marks bonus. Congratulation to Abu Sayeed!

 

Excercise: