About

The Machine Learning and the Physical Sciences workshop aims to provide an informal, inclusive, and leading-edge venue for research and discussions at the interface of machine learning (ML) and the physical sciences. This interface spans (1) applications of ML in the physical sciences (ML for physics), (2) developments in ML motivated by physical insights (physics for ML), and, most recently, (3) the convergence of ML and the physical sciences (physics with ML), which raises questions about what scientific understanding means in the age of complex, AI-powered science, and what roles machine and human scientists will play in developing scientific understanding in the future.

Recent years have seen a tremendous increase in cases where ML models are used for scientific data processing and discovery, and, similarly, in instances where tools and insights from the physical sciences are brought to the study of ML models. The harmonious co-development of the two fields is no surprise: ML methods have had great success in learning complex representations of data that enable novel modeling and data-processing approaches in many scientific disciplines. Indeed, in some sense, ML and physics are concerned with a shared goal of characterizing the true probability distributions of nature. As ML and physical science research becomes more intertwined, questions naturally arise about what scientific understanding is when science is performed with the assistance of complex and highly parameterized models. Taken to the extreme: if an ML model developed for a scientific task demonstrates robustness and generalizability but lacks interpretability in terms of the existing scientific knowledge base, is this still a useful scientific result?

The breadth of work at the intersection of ML and the physical sciences is answering many important questions for both fields while opening up new ones that can only be addressed by a joint effort of both communities. By bringing together ML researchers and physical scientists who apply and study ML, we expect to strengthen the much-needed interdisciplinary dialogue, introduce exciting new open problems to the broader community, and stimulate the production of new approaches to solving challenging open problems in the sciences. Invited talks from leading individuals in both communities will cover the state-of-the-art techniques and set the stage for this workshop, which will also include contributed talks selected from submissions. The workshop will also feature an expert panel discussion on "Philosophy of Science in the AI Era", focusing on topics such as scientific understanding in the age of extremely complex ML models, automating science via machines, and ML models as a source of inspiration for scientific discoveries. Finally, there will be multiple community-building activities, such as a voluntary mentorship opportunity and round-table discussions on curated topics, to foster connection building and facilitate knowledge sharing across disciplines, backgrounds, and career stages.

NeurIPS 2022

The Machine Learning and the Physical Sciences 2022 workshop will be held on December 3, 2022, as part of the 36th annual conference on Neural Information Processing Systems (NeurIPS).

Speakers and panelists

Call for papers

Important dates (all times AoE)

  • Submission Deadline: September 22
  • Review Deadline: October 8
  • Accept/Reject Notification Date: October 15
  • Workshop: December 3

In this workshop, we aim to bring together physical scientists and machine learning researchers who work at the intersection of these fields, i.e., those applying machine learning to problems in the physical sciences (physics, chemistry, mathematics, astronomy, materials science, biophysics, and related sciences) or using physical insights to understand and improve machine learning techniques.

We invite researchers to submit work particularly in the following areas or areas related to them:

  • ML for Physics: Applications of machine learning to physical sciences including astronomy, astrophysics, cosmology, biophysics, chemistry, climate science, earth science, materials science, mathematics, particle physics, or any related area;
  • Physics in ML: Strategies for incorporating prior scientific knowledge into machine learning algorithms, as well as applications of physical sciences to understand, model, and improve machine learning techniques;
  • ML in the scientific process: Machine learning model interpretability for obtaining insights into physical systems; automating multiple elements of the scientific method for discovery and for the operation of experiments;
  • Any other area related to the subject of the workshop, including but not limited to probabilistic methods that are relevant to physical systems, such as deep generative models, probabilistic programming, simulation-based inference, variational inference, causal inference, etc.

Contributed Talks

Several accepted submissions will be selected for contributed talks. Contributed talks may be given in person or remotely, depending on the preference of the presenter.

Posters

Accepted work will be presented as posters during the workshop. In parallel with the in-person poster session, we will also host a virtual poster session in GatherTown. Authors of submitted papers will be able to indicate their preference for an in-person or a virtual presentation. Furthermore, to facilitate viewing presentations across time zones, the authors of each accepted paper will have the opportunity to submit a 5-minute video that summarizes their work.

If the number of posters that can be presented in person is limited by the available physical space, a subset of works will be selected for virtual presentation. We will try to keep the authors' preferences for in-person or virtual poster presentations in mind during this selection. The remaining posters can be presented during the virtual poster session and through the 5-minute videos that will be uploaded to the workshop website.

Important note for work that will be/has been published elsewhere

All accepted works will be made available on the workshop website. This does not constitute an archival publication or formal proceedings; authors retain full copyright of their work and are free to publish their extended work in another journal or conference. We allow submission of works that overlap with papers that are under review or have been recently published in a conference or a journal, including physical science journals. However, we do not accept cross-submissions of the same content to multiple workshops at NeurIPS. (Check the list of accepted workshops this year.)

Submission instructions

Submit your work on the submission portal.

  • Submissions should be anonymized short papers (extended abstracts) of up to 4 pages in PDF format, typeset using the NeurIPS style.
  • The authors are required to include a short statement (approximately one paragraph) about the potential broader impact of their work, including any ethical aspects and future societal consequences, which may be positive or negative. The broader impact statement should come after the main paper content. The impact statement and references do not count towards the page limit.
  • Appendices are highly discouraged, and reviewers will not be required to read beyond the first 4 pages and the impact statement.
  • A workshop-specific modified NeurIPS style file will be provided for the camera-ready versions, after the author notification date.
  • Workshop organizers retain the right to reject submissions for editorial reasons: for example, any paper exceeding the page limit or omitting the broader impact statement will be desk-rejected.
  • Submissions will be kept confidential until they are accepted and until authors confirm that they can be included in the workshop. If a submission is not accepted, or withdrawn for any reason, it will be kept confidential and not made public.

Review process

Submissions that follow the submission instructions correctly (i.e., are not rejected for editorial reasons, such as exceeding the page limit or missing the impact statement) are sent for peer review. Below are some key points about this process that are shared with reviewers and authors alike. Authors are expected to consider these when preparing their submissions and when deciding whether to apply for the reviewer role.

  • Papers are limited to 4 pages. Appendices are accepted but highly discouraged; reviewers will not be required to read the appendices.
  • There will be multiple reviewers for each paper.
  • Reviewers will be able to state their confidence in their review.
  • We will provide an easy-to-follow template for reviews so that both the pros and the cons of the submission can be highlighted.
  • Reviewers will select their field of expertise so that each submission has reviewers from multiple fields. During the matching process, the same list of subject fields is used for submissions and reviewer expertise in order to maximize the quality of reviews.
  • Potential conflicts of interest based on institution and author collaboration are addressed through the CMT review system.
  • Criteria for a successful submission include novelty, correctness, relevance to the intersection of ML and the physical sciences, and promise of future impact. Negative or null results that add value and insight are welcome.
  • There will be no rebuttal period. Minor flaws will not be the sole reason to reject a paper. Works in progress at an advanced stage of completion are welcome.

More detailed guidelines for reviewers will follow.

We would also appreciate your skills as a reviewer! Please consider applying for that role here.

Guidelines and best practices

These guidelines and best practices were derived in part from the NeurIPS Conference checklist. We consider them more a guide than a checklist. We also wish to remind authors and reviewers that (a) we welcome papers on works in progress and (b) reviewers may use this as a guideline when reviewing submissions.

Our guidelines here supersede the NeurIPS conference checklist.

  1. Claims
    1. Do the main claims made in the abstract and introduction accurately reflect the paper's contributions and scope?
      1. Claims in the paper should match theoretical and experimental results in terms of how much the results can be expected to generalize.
      2. The paper's contributions should be clearly stated in the abstract and introduction, along with any important assumptions and limitations. It is fine to include aspirational goals as motivation as long as it is clear that these goals are not attained by the paper.
  2. Limitations of the work
    1. Did you describe the limitations of your work?
    2. Point out any strong assumptions and how robust your results are to violations of these assumptions (e.g., independence assumptions, noiseless settings, model well-specification, asymptotic approximations only holding locally). Reflect on how these assumptions might be violated in practice and what the implications would be.
    3. Reflect on the scope of your claims, e.g., if you only tested your approach on a few datasets or did a few runs. In general, empirical results often depend on implicit assumptions, which should be articulated.
    4. Reflect on the factors that influence the performance of your approach. For example, a facial recognition algorithm may perform poorly when image resolution is low or images are taken in low lighting. Or a speech-to-text system might not be able to be reliably used to provide closed captions for online lectures because it fails to handle technical jargon.
    5. We understand that authors might fear that complete honesty about limitations might be used by reviewers as grounds for rejection. It is worth keeping in mind that a worse outcome might be if reviewers discover limitations that aren't acknowledged in the paper. In general, we advise authors to use their best judgment and recognize that individual actions in favor of transparency play an important role in developing norms that preserve the integrity of the community. Reviewers will be specifically instructed to not penalize honesty concerning limitations.
    6. We again remind authors and reviewers that we welcome papers on works in progress in advanced stages of completion.
  3. Theoretical Results
    1. Did you state the full set of assumptions of all theoretical results?
      1. All assumptions should be clearly stated or referenced in the statement of any theorems.
    2. Did you include mathematical proofs of theoretical results?
      1. The proofs can either appear in the main paper or in the appendix, but if they appear in the supplemental material, authors are encouraged to provide a short proof sketch to provide intuition. Proofs that are essential to justify the claims in the submission should appear in the main body of the submission.
  4. Existing Literature
    1. You are encouraged to discuss the relationship between your results and related results in the literature.
  5. Experimental Results
    1. Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the appendix or as a URL)?
    2. Main experimental results include your new method and baselines. You should try to capture as many of the minor experiments in the paper as possible. If only a subset of experiments is reproducible, you should state which ones are.
    3. While we encourage release of code and data, we understand that this might not be possible, so "no because the code is proprietary" is an acceptable answer.
    4. At submission time, to preserve anonymity, remember to release/link to anonymized versions of code packages/repositories.
  6. Training and computing details
    1. Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)?
    2. The full details can be provided with the code, but the important details should be in the main paper.
    3. Did you report error bars (e.g., with respect to the random seed after running experiments multiple times)?
      1. Answer "yes" if you report error bars, confidence intervals, or statistical significance tests for your main experiments. (A minimal sketch of reporting error bars over random seeds appears after this list.)
    4. Did you include the amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)?
      1. Ideally, you would provide the compute required for each of the individual experimental runs as well as the total compute.
      2. Note that your full research project might have required more compute than the experiments reported in the paper (e.g., preliminary or failed experiments that didn't make it into the paper). The total compute used may be harder to characterize, but if you can do that, that would be even better.
  7. Existing Assets
    1. If you are using existing assets (e.g., code, data, models) or curating/releasing new assets…
      1. If your work uses existing assets, did you cite the creators?
        1. Cite the original paper that produced the code package or dataset.
        2. Remember to state which version of the asset you're using.
        3. If possible, include a URL.
      2. Did you mention the license of the assets?
        1. State the name of the license (e.g., CC-BY 4.0) for each asset.
      3. If you scraped data from a particular source (e.g., website), you should state the copyright and terms of service of that source.
  8. Releasing Assets
    1. If you are releasing assets, you should include a license, copyright information, and terms of use in the package. If you are repackaging an existing dataset, you should state the original license, as well as the one for the derived asset (if it has changed).
    2. If you cannot find this information online, you are encouraged to reach out to the asset's creators.
  9. New Assets
    1. Did you include any new assets either in the appendix or as a URL?
      1. At submission time, remember to anonymize your assets, e.g., by creating an anonymized URL.
      2. If you cannot release an asset (e.g., because it contains proprietary information), state the reason.
    2. Did you discuss whether and how consent was obtained from people whose data you're using/curating?
      1. For example, if you collected data from/via crowdsourcing, did your instructions to crowdworkers explain how the data would be used?
      2. Even if you used an existing dataset, you should check how the data was collected and whether consent was obtained. We acknowledge this might be difficult, so please try your best; the goal is to raise awareness of possible issues that might be ingrained in our community.
    3. Did you discuss whether the data you are using/curating contains personally identifiable information or offensive content?
      1. There are some settings where the existence of this information is not necessarily bad (e.g., swear words occur naturally in text). This question is just to encourage discussion of potentially undesirable properties.
      2. Explain how you checked this (e.g., with a script, manually on a sample, etc.).
  10. Ethics and Broader Impacts Statement
    1. Have you read the ethics review guidelines and ensured that your paper conforms to them?
      1. Please read the ethics review guidelines.
    2. You are encouraged to use a CO2 emissions tracker and to consider the environmental impact of your model/calculations. See, for example, the experiment impact tracker (Henderson et al.), the ML CO2 impact calculator (Lacoste et al.), and CodeCarbon. (An illustrative CodeCarbon sketch appears after this list.)
    3. Did you discuss any potential positive or negative societal impacts of your work?
      1. Examples of negative societal impacts include potential malicious or unintended uses (e.g., disinformation, generating fake profiles, surveillance), environmental impact (e.g., training huge models), fairness considerations (e.g., deployment of technologies that could further disadvantage historically disadvantaged groups), privacy considerations (e.g., a paper on model/data stealing), and security considerations (e.g., adversarial attacks).
      2. We expect many papers to be foundational research and not tied to particular applications, let alone deployments, but being foundational does not imply that research has no societal impacts. If you see a direct path to any negative applications, you should point it out, even if it's not specific to your work. In a theoretical paper on algorithmic fairness, you might caution against overreliance on mathematical metrics for quantifying fairness and give examples of ways this can go wrong. If you improve the quality of generative models, you might point out that your approach can be used to generate Deepfakes for disinformation. On the other hand, if you develop a generic algorithm for optimizing neural networks, you do not need to mention that this could enable people to train models that generate Deepfakes faster.
      3. Consider different stakeholders that could be impacted by your work. It is possible that research benefits some stakeholders while harming others. Pay special attention to vulnerable or marginalized communities.
      4. Consider possible harms that could arise when the technology is being used as intended and functioning correctly, harms that could arise when the technology is being used as intended but gives incorrect results, and harms following from (intentional or unintentional) misuse of the technology.
      5. If there are negative societal impacts, you should also discuss any mitigation strategies (e.g., gated release of models, providing defenses in addition to attacks, mechanisms for monitoring misuse, mechanisms to monitor how a system learns from feedback over time, improving the efficiency and accessibility of ML).
      6. For more information, see this unofficial guidance from last year and other resources at the broader impacts workshop at NeurIPS 2020.
  11. Anonymization
    1. At submission time, to preserve anonymity, remember to release anonymized versions of your code and assets, e.g., via anonymized URLs.
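
To make the error-bar guideline in item 6 concrete, the following is a minimal sketch of one way to report a metric averaged over several random seeds. It is illustrative only: run_experiment is a hypothetical placeholder for your own training and evaluation pipeline, and the seed list and metric values are arbitrary.

    import numpy as np

    def run_experiment(seed: int) -> float:
        """Hypothetical placeholder: run training/evaluation with the given
        seed and return a scalar metric (e.g., test accuracy)."""
        rng = np.random.default_rng(seed)
        return 0.90 + 0.01 * rng.standard_normal()  # stand-in result

    # Repeat the main experiment over several random seeds ...
    seeds = [0, 1, 2, 3, 4]
    scores = np.array([run_experiment(s) for s in seeds])

    # ... and report the mean together with a measure of spread across seeds.
    mean = scores.mean()
    std = scores.std(ddof=1)              # sample standard deviation
    stderr = std / np.sqrt(len(scores))   # standard error of the mean

    print(f"metric = {mean:.3f} +/- {std:.3f} (std over {len(seeds)} seeds)")
    print(f"metric = {mean:.3f} +/- {stderr:.3f} (standard error of the mean)")

Stating the number of seeds and whether the reported interval is a standard deviation, a standard error, or a confidence interval makes the error bars interpretable for reviewers.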
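
Similarly, for the emissions-tracking suggestion in item 10, the sketch below shows one way to wrap a run in CodeCarbon's EmissionsTracker. It is a sketch under assumptions: the codecarbon package is installed (pip install codecarbon), train_model is a hypothetical placeholder for your own training code, and the exact tracker options should be checked against the CodeCarbon documentation.

    # Minimal sketch, assuming `pip install codecarbon`; `train_model` is a
    # hypothetical stand-in for your own training code.
    from codecarbon import EmissionsTracker

    def train_model():
        """Hypothetical workload whose emissions we want to estimate."""
        return sum(i * i for i in range(10_000_000))  # stand-in computation

    tracker = EmissionsTracker(project_name="ml4ps-example")
    tracker.start()
    try:
        train_model()
    finally:
        emissions_kg = tracker.stop()  # estimated emissions in kg CO2-equivalent

    print(f"Estimated emissions: {emissions_kg:.6f} kg CO2eq")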

Organizers

For questions and comments, please contact us at ml4ps2022@googlegroups.com.

Steering Committee

Sponsors

Sponsors are welcome. Please contact us.