
Some Opinions on

Reproducibility in ML

Hugo Larochelle
Google Brain
These opinions might change...

… during this talk


Why do we want reproducible machine learning research?

● As a consumer of research
○ To know what new knowledge our next hypothesis can build on
○ To build systems that are more complex than what I could build
● As a producer of research
○ So that my research has real, tangible impact
○ To instill trust in my work in general

If it isn’t reproducible, it might just as well not exist!


What does reproducible machine learning mean?

Option 1:

It’s research with results that can be reproduced


by another researcher entirely based on the paper

● Requires a measurable notion of matched results (confidence intervals!)


● Might favour research on simple methods
● Might be hard to achieve if other researcher has less compute resources
● Confirming reproducibility requires a lot of work, which won’t get done
without proper incentives
What does reproducible machine learning mean?

Option 1:

It’s research with results that can be reproduced


by another researcher entirely based on the paper

● We should aim for more!


● Requires a measurable notion of matched results (confidence intervals!)
● Might favour research on simple methods
● Might be hard to achieve if other researcher has less compute resources
● Confirming reproducibility requires a lot of work, which won’t get done
without proper incentives
What does reproducible machine learning mean?

Option 2:

It’s research that comes with open source code,


which reproduces the results from the paper

● Still requires a statistical notion of matching results (see the sketch below)


● Easier to publish complex methods (all details are in the code)
● More robust to variations in resources between groups
● Confirming reproducibility is trivial (assuming sufficient resources)
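As an aside on the "statistical notion of matching results" above, here is a minimal sketch of one way it could be operationalized: re-run the method over several random seeds and check whether the reported score falls inside a bootstrap confidence interval of the re-run mean. All numbers below are hypothetical placeholders, not results from any actual paper.

```python
# Minimal sketch: does a reported result "match" a re-run, statistically?
# The scores and the reported mean are hypothetical placeholders.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical: test accuracy of a re-implementation over 10 random seeds.
rerun_scores = np.array([0.912, 0.905, 0.917, 0.909, 0.921,
                         0.908, 0.915, 0.903, 0.918, 0.911])
reported_mean = 0.914  # hypothetical number taken from the paper

# Bootstrap a 95% confidence interval for the mean of the re-run scores.
boot_means = [rng.choice(rerun_scores, size=len(rerun_scores), replace=True).mean()
              for _ in range(10_000)]
low, high = np.percentile(boot_means, [2.5, 97.5])

matched = low <= reported_mean <= high
print(f"re-run mean: {rerun_scores.mean():.3f}, 95% CI: [{low:.3f}, {high:.3f}]")
print("reported result falls inside the CI" if matched
      else "reported result falls outside the CI")
```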
Should we have explicit seals of reproducibility?

● Should it be part of the publication process?


○ We can hardly get reviewers to read papers, so forget about code
○ Other disciplines don’t even have a notion of source code

● Should we explicitly make the confirmation of reproducibility a


publishable result?
○ That’s what other disciplines do
○ But seems like a waste of time when code could just be released
Should we even have formal seals of reproducibility?

● Informal processes might be sufficient


○ One’s research needs reproducibility to have impact
○ Reputation travels fast with social media
○ Paper can have impact even if difficult to reproduce
■ Remember the “black art” of neural networks?
■ No seal vs. with seal = SGD vs. batch GD?
Should we even have formal seals of reproducibility?

● What if code runs a test-set-overfitted solution?


○ Should be acceptable to report numbers from a new hyper-parameter
search using published code (see the sketch below)
○ Even if we don’t, this would become apparent on any new dataset

● But wouldn’t hurt to have soft incentives for publishing code


○ Why not make it only optional to compare with previous work that
doesn’t have code for reproducibility (just like arXiv results)?
○ At the very least, matching a published non-open-sourced baseline
with a new method that is open-sourced should be sufficient
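As a rough illustration of "reporting numbers from a new hyper-parameter search using published code", here is a minimal sketch of a small random search driving a hypothetical train_and_evaluate entry point; the function, search space, and budget are assumptions standing in for whatever a released codebase actually exposes.

```python
# Minimal sketch: re-run a hyper-parameter search on top of published code
# and report the resulting numbers. `train_and_evaluate` is a hypothetical
# stand-in for the released training script; the search space is a placeholder.
import random

def train_and_evaluate(learning_rate, weight_decay):
    """Stand-in for the published code; returns validation accuracy."""
    # In practice this would call the released code on the real dataset.
    return random.random()

random.seed(0)
best_config, best_score = None, float("-inf")
for _ in range(20):  # small random-search budget
    config = {
        "learning_rate": 10 ** random.uniform(-5, -2),
        "weight_decay": 10 ** random.uniform(-6, -3),
    }
    score = train_and_evaluate(**config)
    if score > best_score:
        best_config, best_score = config, score

print(f"best config: {best_config}, validation accuracy: {best_score:.3f}")
# The numbers reported would then come from this search, rather than from
# the original paper's (possibly test-set-overfitted) result.
```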
Maybe we should never assume published results are true!

● Under that assumption, it makes sense to:


○ Report possibly different numbers (e.g. new hyper-parameter
search) using published code
○ Not require comparing with published results that aren’t easily
verifiable (i.e. that require substantial effort to reproduce)
○ Or only require to match performance of non-open-sourced
methods, since that’s still tangible progress for the community
(at least now there’s available code to reach that performance)
Reproducibility, Trust and Transparency

● If we perfectly trusted each other’s research, we wouldn’t look for formal


seals of reproducibility

● One way of favoring trust is transparency

● What if we performed our research completely in the open, from start to


finish?
A solution: AI-ON

AI-ON - Massively Collaborative Machine Learning Research


● Work in progress, put forth by Gabriel Pereyra (DeepMind) & François
Chollet (Google Brain)
● Inspired by the Polymath Project
● Open-source the entire research process
○ Currently, code is open-sourced only after publication
○ In AI-ON, discussions, code and experiment contributions are made
public as the project progresses
http://ai-on.org/
Success story:
Separating Overlapping
Chromosomes With
Deep Learning
A solution: AI-ON

How would it work?


● We are collecting research proposals from ML labs
○ proposals should be substantial enough
(think early draft of a paper, without the experiments section)
● Each proposal would have a public Github repository
● Proposers provide guidance for anyone interested in contributing
● Once a project is ready for publication, the proposer helps write the
paper
● Authorship is determined by contribution history
A solution: AI-ON

Should help with...


● Reproducibility: thanks to open-source code and the possibility of
recruiting contributors specifically to run reproducibility experiments
● Education in ML: would provide examples for junior students to learn
how to do research (much like OpenReview for reviewing)
● Credit assignment: beyond author order, it’d be possible to trace back
“who did what”
● Diversity in ML: becomes possible to involve researchers from any
institution, education level or background
● Reproducibility (bis): this might just be the right research model for
extensive studies of previously published baselines
A solution: AI-ON

FAQ
● Aren’t there risks of getting one’s research ideas “scooped”?
○ Not if other researchers can also be involved and collaborate
○ Since the proposal is public, it wouldn’t look too good for the “scoopers”
● Aren’t there risks of flag planting?
○ It’s not much of a flag if it’s just a (non-citable) proposal with no results
○ Again, not if other researchers can also be involved and collaborate
(proposals are an invitation to be co-authors!)
A solution: AI-ON

We have proposals (or commitments) from:


● Yoshua Bengio (University of Montreal)
● Kyunghyun Cho (NYU)
● Alexander Rush (Harvard)
● Me! (Google Brain)

Please consider involving your lab!


You may reach out to Gabriel (pereyra@google.com) and François
(fchollet@google.com)
A solution: AI-ON

Examples of projects
● You have a new library and want it to grow; combine that with an
extensive study or reproducibility study
● A "dataset paper" where several benchmarks could be implemented
● A new project your lab would normally do internally but could "open up"
● A method you've already published that is applicable more broadly, where
you need help running additional experiments
● A conference paper that you'd like to convert into a long journal paper
Please consider involving your lab!
You may reach out to Gabriel (pereyra@google.com) and François
(fchollet@google.com)
Conclusion

Open source is key to better reproducibility


● We should figure out the right incentives for open sourcing code

We should think about open-sourcing the whole research process


● AI-ON is a promising step in this direction
● We need “buy-in” from research labs and influencers!
● It’s not just for reproducibility; it’s also about greater equality of opportunity
Thank you!
