CDS DS 595

AI Methods for Science

Boston University · Spring 2026

About the Course

AI methods are increasingly central to how science gets done, spanning simulation, experiment, theory, and observation. This course aims to equip students with the methods to understand and carry out research at the intersection of AI and the natural sciences. Topics include probabilistic inference, neural networks that encode physical symmetries and domain knowledge, generative models for scientific data, and simulation-based inference. While framed in terms of scientific applications, the methods discussed extend well beyond scientific research, with broad applicability across industry and general AI R&D.

A major focus of the course is on large language models and their emerging role in science. As LLMs become more capable of scientific reasoning and autonomous operation, understanding how to evaluate, adapt, and collaborate with these systems is becoming essential. We explore what it means to work alongside AI scientists, and how to critically assess both their capabilities and their limitations.

Applications are drawn from domains including physics, materials science, and biology. The course involves three assignments emphasizing method design and critical analysis in collaboration with AI tools, plus a final project in which teams finetune an LLM to elicit a scientific capability.

Learning Objectives

By the end of this course, students will be able to:

Logistics

Lecture: Mon/Wed 12:20–1:35pm, CAS 218
Discussion: Tue 11:15am–12:05pm, MUG 205
Instructor: Siddharth Mishra-Sharma (smishras@bu.edu)
TF: Wanli Cheng (cwl1997@bu.edu)
Office Hours: Tue 3–5pm or by appointment, CDS 1528
TF Office Hours: Mon/Wed 2:15–3:15pm or by appointment, CDS 14th floor Green Corner

Resources

There is no required textbook. Many readings reference Understanding Deep Learning by Simon J.D. Prince (MIT Press, 2023); the PDF is available on the website. Other readings are drawn from research papers and online resources.

Schedule and Optional Reading

This schedule will change as the course progresses.

Week 1
L1
Wed Jan 21
Science in the Era of Computation
Course intro and logistics; Historical and philosophical perspectives
Week 2
L2
Mon Jan 26
Reasoning Under Uncertainty
Bayesian inference; Fitting a model to data; Model selection
L3
Wed Jan 28
Framing Scientific Problems as ML Tasks
Classification, regression, inference, generation, compression, anomaly detection, ...
D1
Tue Jan 27
Lab 1: JAX and Bayesian Inference
Week 3
L4
Mon Feb 2
Learning by Sampling and Optimization
MCMC; Monte Carlo methods; Variational inference
L5
Wed Feb 4
Building Blocks of Learned Representations
Neural networks primer
Assignment 1 out
D2
Tue Feb 3
Lab 2: Hamiltonian Monte Carlo
Week 4
L6
Mon Feb 9
Encoding Scientific Structure in Neural Networks I
How data and theory inform NN architectures; CNNs
L7
Wed Feb 11
Encoding Scientific Structure in Neural Networks II
Graphs and locality; GNNs; Sequence and time-series models
D3
Tue Feb 10
Lab 3: Training Neural Networks
Week 5
Mon Feb 16
No class — Presidents' Day
L8
Tue Feb 17
Encoding Scientific Structure in Neural Networks III
Symmetry-preserving neural networks
L9
Wed Feb 18
Learning Distributions from Data I
Density estimation; Latent variable models; VAEs
Assignment 1 due; Assignment 2 out
Week 6
L10
Mon Feb 23
Cancelled — Snowstorm
L11
Wed Feb 25
Learning Distributions from Data II
Diffusion
D4
Tue Feb 24
Lab 4: Variational Autoencoders
Week 7
L12
Mon Mar 2
Guest Lecture: Ameya Daigavane
AI + bio
L13
Wed Mar 4
Learning Distributions from Data III
Flow matching; applications
Assignment 2 due; Assignment 3 out
D5
Tue Mar 3
Lab 5: Diffusion
Spring Recess
Mar 7–15
Spring Recess — No classes
Week 8
L14
Mon Mar 16
Differentiating Through Scientific Simulators
Differentiable programming
L15
Wed Mar 18
Learning Through Exploration
Reinforcement learning and search
D6
Tue Mar 17
Lab 6: Differentiable Programming
Week 9
L16
Mon Mar 23
Inverting Simulators I
Simulation-based inference
L17
Wed Mar 25
Inverting Simulators II
Simulation-based inference; applications to physics and cosmology
Assignment 3 due
D7
Tue Mar 24
Lab 7: Reinforcement Learning
Week 10
L18
Mon Mar 30
From Specialized to General Intelligence; Scaling
From task-specific, to domain-specific, to general scientific agents
Final project out
L19
Wed Apr 1
Guest Lecture: Gaia Grosso
AI + particle collider physics
D8
Tue Mar 31
Lab 8: Simulation-Based Inference
Week 11
L20
Mon Apr 6
Quantifying and Predicting LLM Scientific Capabilities
Evaluations and forecasting for scientific R&D tasks
L21
Wed Apr 8
LLM Building Blocks
Attention; Transformers; Compute
D9
Tue Apr 7
Final project work
Week 12
L22
Mon Apr 13
Teaching LLMs to Science
Training and eliciting scientific capabilities
L23
Wed Apr 15
Learning Unified Representations Across Scientific Modalities
Foundation models for science
Final proposal due Fri Apr 17
D10
Tue Apr 14
Final project work
Week 13
Mon Apr 20
No class — Patriots' Day
L24
Wed Apr 22
Frontiers
Cutting-edge topic chosen by class
D11
Tue Apr 21
Final project work
Week 14
L25
Mon Apr 27
Being a Human Scientist
Research in an AI-driven scientific landscape
L26
Wed Apr 29
Final Project Presentations
D12
Tue Apr 28
Final project work
Finals
Mon May 4
Finals — No exam
Final project due

Discussion Sections

Tuesdays 11:15am–12:05pm in MUG 205.

Week    Date         Topic                               Notes
-       Tue Jan 20   No discussion                       First day of classes
2       Tue Jan 27   Lab 1: JAX and Bayesian Inference   Due Wed Jan 28
3       Tue Feb 3    Lab 2: Hamiltonian Monte Carlo      Due Wed Feb 4
4       Tue Feb 10   Lab 3: Training Neural Networks     Due Wed Feb 11
-       Tue Feb 17   No discussion                       Substitute Monday schedule
6       Tue Feb 24   Lab 4: Variational Autoencoders     Due Wed Feb 25
7       Tue Mar 3    Lab 5: Diffusion                    Due Wed Mar 4
-       Mar 7–15     No discussion                       Spring Recess
8       Tue Mar 17   Lab 6: Differentiable Programming   Due Wed Mar 18
9       Tue Mar 24   Lab 7: Reinforcement Learning       Due Wed Mar 25
10      Tue Mar 31   Lab 8: Simulation-Based Inference   Due Wed Apr 1
11–14   Apr          Final project work                  Proposal due Apr 17, Report due May 4

Topics Not Covered

Due to time constraints, this course does not cover several areas in AI for science, including neural operators, physics-informed learning, surrogate modeling, symbolic regression, causal inference, interpretability methods, experimental design, active learning, and recent AI-for-math developments (e.g., LLM-guided theorem proving). Some of these may be covered in later weeks as the course evolves.

Assessment

Discussion Labs 20%
Assignment 1 15%
Assignment 2 15%
Assignment 3 15%
Final Project 35%
Total 100%

Discussion Labs

Weekly in-class labs reinforce lecture material through hands-on programming. Students work through a notebook during discussion, exploring implementations and comparing results. Labs are graded on participation and completion, and are due by end of day on the Wednesday following the lab.

Assignments

Three assignments develop skills in method design and critical analysis. AI tools may be used freely, but the analysis and interpretation require critically engaging with what was produced. The discussion labs build foundational skills for these assignments.

Final Project

Teams of 2–3 identify a scientific capability that current large language models struggle with, then finetune a language model to improve that capability. This is a two-stage project:

Timeline

Deliverable       Out          Due
Discussion Labs   Tuesdays     Wednesday following lab
Assignment 1      Wed Feb 4    Wed Feb 18
Assignment 2      Wed Feb 18   Wed Mar 4
Assignment 3      Wed Mar 4    Wed Mar 25
Final Project     Mon Mar 30   Proposal: Fri Apr 17; Report: Mon May 4

Policies

Attendance

Regular attendance in lectures is expected. Please notify the instructor of planned absences.

Late Work

Late submissions are not accepted without prior arrangement. Extensions may be granted for documented emergencies.

Collaboration

Discussion of concepts and approaches is encouraged. However, all submitted code and written work must be your own. When collaborating, you must acknowledge your collaborators.

AI Tools

Learning to work effectively with AI is itself a course objective. Use AI tools freely to explore ideas, debug code, and deepen understanding. Focus on building genuine competence—understanding why something works, not just that it works. Disclose AI assistance in submissions, including its form and extent. See also the CDS GAIA policy.

Academic Conduct

All students are expected to read and abide by the BU Academic Code of Conduct. Plagiarism includes copying or restating work or ideas of another person or AI software without citing the source. In computing coursework, this includes sharing code, reusing code across courses without permission, and uploading assignments to external sites. Please review the examples of plagiarism provided by the BU Computer Science department. All suspected cases of plagiarism will be reported to the Academic Dean.

Accommodations

Boston University is committed to providing reasonable accommodations to students with documented disabilities. Students seeking accommodations should contact Disability & Access Services (25 Buick Street, Suite 300; 617-353-3658) as early as possible in the semester. A new Faculty Accommodation Letter (FAL) must be requested each semester; DAS will send this directly to instructors.

Religious Observance

Students observing a religious holiday during regularly scheduled class time are entitled to an excused absence. Please notify the instructor in advance to make arrangements for any missed work.

Recordings

Recording of lectures requires instructor permission. Students approved for recording as an accommodation must limit use to personal study and may not share recordings.