Welcome to the SAMPL website. Please continue reading for a more detailed description of what SAMPL is, the current challenges in SAMPL4, and a look back at previous SAMPLs.

The SAMPL experiment

SAMPL is an attempt at prospectively testing protein and small molecule modeling. Ideally this would consist of experiments conceived to distinguish between competing ideas or methods, and perhaps over time we shall get there. For now, this is an analysis of methods applied to data not seen by participants, a 'blind' assessment. We make no claims against the limitations of such an attempt, believing it more important to light a candle than curse the darkness. Blind tests also help us avoid the tendency to bias our theories and approaches toward known answers, and they provide a more realistic "real world" setting for methods.

With SAMPL we intend to avoid "who won, who lost". On such a small sample size, no such statistically sound pronouncement could be made anyway. Rather, we see this as an opportunity for groups to test their methods, learn from the experience and share lessons learned.


Building on a series of successful blind challenges for computational chemistry, the SAMPL4 challenge will have hydration free energy, host-guest, and binding free energy prediction components. As with previous SAMPL challenges, the opportunity for blind predictions is available, and the SAMPL project will culminate with a science meeting in September 2013.

In the host-guest prediction component, participants will be provided with the chemical structures of the host molecule and a set of guest molecules. In the present round, we will focus on two hosts. The first is CB7, which avoids the complexities encountered in SAMPL3 for a cucurbituril derivative with four carboxyl groups. The guests will be commercially available compounds expected to span a range of affinities. For experimental convenience, affinities will be measured through competitive binding experiments (via NMR, potentially supplemented by ITC), which yield relative binding affinities. The second host is a basket-shaped octa-acid host studied by the Gibb lab; the experimental data will be similar (spanning 9 guests, all charged carboxylic acids), but with affinities measured by ITC. This portion of the challenge is now active and can be downloaded here.
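Since the competitive binding experiments report relative rather than absolute affinities, participants may find it convenient to compare predictions on a free energy scale. The following is a minimal sketch, not part of the challenge materials, of the standard thermodynamic relation ΔΔG = -RT ln(Ka,A / Ka,B). The function name, the choice of kcal/mol, and the default temperature of 298.15 K are assumptions made for illustration; the actual experimental temperature should be taken from the challenge data.

```python
import math

# Gas constant in kcal/(mol*K); the unit choice is an assumption for this sketch.
GAS_CONSTANT_KCAL = 1.987204e-3


def relative_binding_free_energy(ka_ratio, temperature_k=298.15):
    """Return ddG = dG(A) - dG(B) in kcal/mol for guests A and B.

    ka_ratio is Ka(A) / Ka(B), the ratio of association constants.
    A negative result means guest A binds more tightly than guest B.
    The default temperature is an assumed value, not taken from the challenge.
    """
    if ka_ratio <= 0:
        raise ValueError("ka_ratio must be positive")
    return -GAS_CONSTANT_KCAL * temperature_k * math.log(ka_ratio)


if __name__ == "__main__":
    # Hypothetical example: guest A binds ten times more strongly than guest B.
    print(relative_binding_free_energy(10.0))
```

A tenfold difference in association constant corresponds to roughly 1.4 kcal/mol at room temperature, which gives a sense of the accuracy needed to rank guests correctly.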

Protein-ligand binding prediction will focus on a set of HIV integrase inhibitors with affinities in the micromolar range. Nearly 400 compounds have been tested, and roughly 100 of these are known to bind with ~50 new high-resolution crystal structures available. Challenge aspects will focus on three main categories for binding: (1) virtual screening, where participants attempt to identify the binders out of a small library; (2) binding mode/affinity prediction, where participants attempt to predict binding modes or binding modes and affinities; and (3) affinity prediction, where participants attempt to predict affinities (or relative affinities) given binding modes; the number of compounds in this last component will be quite small. Participation in challenge aspect (3) precludes participation in (1) or (2) unless the aspects are done sequentially. We are grateful to Avexa Ltd, Australia for the data. This portion of the challenge begins shortly.

Hydration free energy prediction will focus on a dataset of around 50 molecules with hydration free energies curated by J. Peter Guthrie, as in previous SAMPL challenges. Details on this aspect of the challenge are available here and the data is available for download by clicking the "My account" button on the left pane.

All elements of the SAMPL challenge are now open, including submissions.

SAMPL4 predictions are due FRIDAY, Aug. 16.

The SAMPL4 meeting will take place September 20, 2013, at Stanford University. For workshop registration, please use this link; see the history page for additional details.

SAMPL in the Future

We hope SAMPL continues to be a useful and regular event. There are many possible variants we would like to try in future assessments, such as providing partial information, e.g. a subset of actives or binding modes, as is more typical of an industrial project setting. Additional physical properties, such as tautomer ratios, pKas, and thermodynamic data, could also be sought. Of course, these efforts rely on the availability of appropriate, blinded data, which is often difficult to obtain. We strongly believe that a blinded assessment component in our field will help us judge progress, and we hope you both agree and can contribute.

SAMPL has traditionally been run by OpenEye; however, as of 2012-2013, it is transitioning to being run by an outside group with logistical support from OpenEye. Currently, this is a rather ad hoc group, but going forward we hope to form a SAMPL steering committee that will plan future SAMPL challenges strategically.

Kind Regards,
Matt Geballe, Geoff Skillman & Anthony Nicholls (2007-2011 organizers)
David Mobley, John Chodera, Tom Peat, Terry Stouch, Vijay Pande, and Mike Gilson (2012-2013 organizers)
OpenEye Scientific Software, Inc.

