Combined lecture-practical blocks:

1. Introduction to Machine Learning (ML), Deep Learning (DL) and AI in Biology
Introductory session. Basic terminology and general principles behind AI, ML, and DL. Introduction to general-purpose AI tools used in biology (ChatGPT, Perplexity, Copilot, Grok, Gemini, NightCafe). Introduction to Python, Jupyter, and Colab for ML/DL in biology.

2. Data in Biology: Types, Challenges and Preprocessing
Principles of handling different types of biological data (DNA/RNA sequence data, protein sequence data, microscopy images, large pools of text). Data normalization and cleaning, handling of outliers and missing data. Data anonymization. Data preparation for model training and execution.

3. Supervised Learning Algorithms
Supervised learning principles, data annotation, feature selection. Common supervised algorithms (logistic regression, random forest). Annotating a dataset and training a supervised model on the annotated data. Applying the trained model to a test dataset.

4. Unsupervised Learning Algorithms
Principles and areas of application for unsupervised learning. Common unsupervised algorithms (k-means, principal component analysis). Noise2Void. Application of unsupervised learning algorithms for pattern discovery.

5. Model Evaluation and Validation
Evaluation of ML/DL models and validation of obtained results. Metrics for assessing ML/DL models. Statistical testing of the results, indicators of overfitting, indicators of reliable generalization.

6. Generative Models in Biology
Variational Autoencoders and Generative Adversarial Networks in biology. Large language models (LLMs) and their use in biological data processing.
Seminar 1: sharing personal experience in using AI for studies and research.

7. Convolutional Neural Networks in Bioimaging
AI and DL in microscopy data analysis. Convolution filters and local patterns. U-Net neural network, YOLO object detection. Adaptive optics. Cell detection and classification with Cellpose and QuPath.

8. Recurrent Neural Networks and Sequence Models
DNA/RNA sequence classification. Network memory. Long short-term memory, Gated Recurrent Unit. Practical exercise in predicting human splice sites (SpliceAI) and transcription factor binding (DeepBind).

9. AI in Protein Structure Prediction and Drug Discovery
Prediction of protein structures, binding interfaces, and protein complexes. Synthetic ligands and drug discovery. AlphaFold and RoseTTAFold for protein structure analysis.

10. AI in Proteomics and Protein Function Prediction
AI applications for determining protein functional activity, interactions, and localization. Practical exercise in predicting protein subcellular localization and creating synthetic images.

11. Natural Language Processing and Text Mining in Biology
Natural language processing vs. text mining in a biological context. Available large text databases (PubMed, UniProt). Classical text mining vs. deep learning-based approaches. Text mining with BioBERT.

12. Ethical Concerns and Future of AI in Biology
Data ownership, sensitive data, energy consumption. Hallucinating AI. Future directions, consciousness, and technological singularity.
Seminar 2: showcase of AI use in participants' own research.