CDS DS 595

AI Methods for Science

Boston University · Spring 2026

About the Course

AI methods are increasingly central to how science gets done, spanning simulation, experiment, theory, and observation. This course aims to equip students with the methods to understand and carry out research at the intersection of AI and the natural sciences. Topics include probabilistic inference, neural networks that encode physical symmetries and domain knowledge, generative models for scientific data, and simulation-based inference. While framed in terms of scientific applications, the methods discussed extend well beyond scientific research, with broad applicability across industry and general AI R&D.

A major focus of the course is large language models and their emerging role in science. As LLMs become more capable of scientific reasoning and autonomous operation, understanding how to evaluate, adapt, and collaborate with these systems is becoming essential. We explore what it means to work alongside AI scientists and how to critically assess both their capabilities and their limitations.

Applications are drawn from domains including physics, materials science, and biology. The course involves two assignments emphasizing method design and critical analysis in collaboration with AI tools, plus two projects: a midterm applying AI methods to a scientific problem, and a final project finetuning an LLM to elicit a scientific capability.

Learning Objectives

By the end of this course, students will be able to:

Logistics

Lecture: Mon/Wed 12:20–1:35pm, CAS 218
Discussion: Tue 11:15am–12:05pm, MUG 205
Instructor: Siddharth Mishra-Sharma (smishras@bu.edu)
TF: Wanli Cheng (cwl1997@bu.edu)
Office Hours: Tue 3–5pm or by appointment, CDS 1528
TF Office Hours: Mon/Wed 2:15–3:15pm or by appointment, CDS 14th floor Green Corner

Resources

There is no required textbook. Many readings reference Understanding Deep Learning by Simon J.D. Prince (MIT Press, 2023); the PDF is available on the website. Other readings are drawn from research papers and online resources.

Schedule and Optional Reading

This schedule will change as the course progresses.

Week 1
L1 | Wed Jan 21 | Science in the Era of Computation | Course intro and logistics; Historical and philosophical perspectives

Week 2
L2 | Mon Jan 26 | Reasoning Under Uncertainty | Bayesian inference; Fitting a model to data; Model selection
L3 | Wed Jan 28 | Framing Scientific Problems as ML Tasks | Classification, regression, inference, generation, compression, anomaly detection, ...
D1 | Tue Jan 27 | Lab 1: JAX and Bayesian Inference

Week 3
L4 | Mon Feb 2 | Learning by Sampling and Optimization | MCMC; Monte Carlo methods; Variational inference
L5 | Wed Feb 4 | Building Blocks of Learned Representations | Neural networks primer | Assignment 1 out
D2 | Tue Feb 3 | Lab 2: Hamiltonian Monte Carlo

Week 4
L6 | Mon Feb 9 | Encoding Scientific Structure in Neural Networks I | How data and theory inform NN architectures; CNNs
L7 | Wed Feb 11 | Encoding Scientific Structure in Neural Networks II | Graphs and locality; GNNs; Sequence and time-series models
D3 | Tue Feb 10 | Lab 3: Training Neural Networks

Week 5
Mon Feb 16 | No class (Presidents' Day)
L8 | Tue Feb 17 | Encoding Scientific Structure in Neural Networks III | Symmetry-preserving neural networks
L9 | Wed Feb 18 | Learning Distributions from Data I | Density estimation; Latent variable models; VAEs; Energy-based models | Assignment 1 due; Assignment 2 out

Week 6
L10 | Mon Feb 23 | Learning Distributions from Data II | Diffusion models; Flow matching
L11 | Wed Feb 25 | Learning Distributions from Data III | More diffusion; applications
D4 | Tue Feb 24 | Lab 4: Diffusion Models

Week 7
L12 | Mon Mar 2 | Guest Lecture: Ameya Daigavane | AI + bio | Midterm project out
L13 | Wed Mar 4 | Differentiating Through Scientific Simulators | Differentiable programming | Assignment 2 due
D5 | Tue Mar 3 | Midterm project intro + SCC setup

Spring Recess
Mar 7–15 | No classes

Week 8
L14 | Mon Mar 16 | Learning Through Exploration | Reinforcement learning and search
L15 | Wed Mar 18 | Inverting Simulators I | Simulation-based inference
D6 | Tue Mar 17 | Lab 5: Differentiable Programming

Week 9
L16 | Mon Mar 23 | Inverting Simulators II | Simulation-based inference; applications to physics and cosmology
L17 | Wed Mar 25 | Discovering Equations from Data | Symbolic regression
D7 | Tue Mar 24 | Midterm project work

Week 10
L18 | Mon Mar 30 | Guest Lecture | AI + astro
L19 | Wed Apr 1 | Guest Lecture: Gaia Grosso | AI + particle collider physics | Midterm project due; Final project out
D8 | Tue Mar 31 | Lab 6: Simulation-Based Inference

Week 11
L20 | Mon Apr 6 | From Specialized to General Intelligence; Scaling | From task-specific, to domain-specific, to general scientific agents
L21 | Wed Apr 8 | Quantifying and Predicting LLM Scientific Capabilities | Evaluations and forecasting for scientific R&D tasks
D9 | Tue Apr 7 | Final project work

Week 12
L22 | Mon Apr 13 | LLM Building Blocks | Attention; Transformers; Compute
L23 | Wed Apr 15 | Teaching LLMs to Science | Training and eliciting scientific capabilities | Final proposal due Fri Apr 17
D10 | Tue Apr 14 | Final project work

Week 13
Mon Apr 20 | No class (Patriots' Day)
L24 | Wed Apr 22 | Learning Unified Representations Across Scientific Modalities | Foundation models for science
D11 | Tue Apr 21 | Final project work

Week 14
L25 | Mon Apr 27 | Frontiers | Cutting-edge topic chosen by class
L26 | Wed Apr 29 | Being a Human Scientist | Research in an AI-driven scientific landscape
D12 | Tue Apr 28 | Final project work

Finals
Mon May 4 | No exam | Final project due

Discussion Sections

Tuesdays 11:15am–12:05pm in MUG 205.

Week | Date | Topic | Notes
- | Tue Jan 20 | No discussion | First day of classes
2 | Tue Jan 27 | Lab 1: JAX and Bayesian Inference (Starter) | Due Wed Jan 28
3 | Tue Feb 3 | Lab 2: Hamiltonian Monte Carlo (Starter) | Due Wed Feb 4
4 | Tue Feb 10 | Lab 3: Training Neural Networks | Due Wed Feb 11
- | Tue Feb 17 | No discussion | Substitute Monday schedule
6 | Tue Feb 24 | Lab 4: Diffusion Models | Due Wed Feb 25
7 | Tue Mar 3 | Midterm project intro + SCC setup | -
- | Mar 7–15 | No discussion | Spring Recess
8 | Tue Mar 17 | Lab 5: Differentiable Programming | Due Wed Mar 18
9 | Tue Mar 24 | Midterm project work | Midterm due Apr 1
10 | Tue Mar 31 | Lab 6: Simulation-Based Inference | Due Wed Apr 1
11–14 | Apr | Final project work | Proposal due Apr 17, Report due May 4

Topics Not Covered

Due to time constraints, this course does not cover several areas in AI for science, including neural operators, physics-informed learning, surrogate modeling, causal inference, interpretability methods, experimental design, active learning, and recent AI-for-math developments (e.g., LLM-guided theorem proving). Some of these may be covered in later weeks as the course evolves.

Assessment

Assignment 1 | 15%
Assignment 2 | 15%
Discussion Labs | 10%
Midterm Project | 25%
Final Project | 35%
Total | 100%

Discussion Labs

Weekly in-class labs reinforce lecture material through hands-on programming. Students work through a notebook during discussion, exploring implementations and comparing results. Labs are graded on participation and completion and are due at the end of the day on the Wednesday following the lab.

Assignments

Two assignments develop skills in method design and critical analysis. AI tools may be used freely, but the analysis and interpretation require critically engaging with what was produced. The discussion labs build foundational skills for these assignments.

Midterm Project

Teams of 2–3 conduct a mini research project applying methods from the first half of the course (inference, architectures, generative models) to a scientific problem. Choose from a suggested list or propose your own. The deliverable is a ~4-page workshop-style paper plus code.

Final Project

Teams of 2–3 identify a scientific capability that current large language models struggle with, then finetune a language model to improve that capability. This is a two-stage project:

Timeline

Deliverable | Out | Due
Discussion Labs | Tuesdays | Wednesday following lab
Assignment 1 | Wed Feb 4 | Wed Feb 18
Assignment 2 | Wed Feb 18 | Wed Mar 4
Midterm Project | Mon Mar 2 | Wed Apr 1
Final Project | Wed Apr 1 | Proposal: Fri Apr 17; Report: Mon May 4

Policies

Attendance

Regular attendance in lectures is expected. Please notify the instructor of planned absences.

Late Work

Late submissions are not accepted without prior arrangement. Extensions may be granted for documented emergencies.

Collaboration

Discussion of concepts and approaches is encouraged. However, all submitted code and written work must be your own. When collaborating, you must acknowledge your collaborators.

AI Tools

Learning to work effectively with AI is itself a course objective. Use AI tools freely to explore ideas, debug code, and deepen understanding. Focus on building genuine competence—understanding why something works, not just that it works. Disclose AI assistance in submissions, including its form and extent. See also the CDS GAIA policy.

Academic Conduct

All students are expected to read and abide by the BU Academic Code of Conduct. Plagiarism includes copying or restating work or ideas of another person or AI software without citing the source. In computing coursework, this includes sharing code, reusing code across courses without permission, and uploading assignments to external sites. Please review the examples of plagiarism provided by the BU Computer Science department. All suspected cases of plagiarism will be reported to the Academic Dean.

Accommodations

Boston University is committed to providing reasonable accommodations to students with documented disabilities. Students seeking accommodations should contact Disability & Access Services (25 Buick Street, Suite 300; 617-353-3658) as early as possible in the semester. A new Faculty Accommodation Letter (FAL) must be requested each semester; DAS will send this directly to instructors.

Religious Observance

Students observing a religious holiday during regularly scheduled class time are entitled to an excused absence. Please notify the instructor in advance to make arrangements for any missed work.

Recordings

Recording of lectures requires instructor permission. Students approved for recording as an accommodation must limit use to personal study and may not share recordings.