ML6 at NeurIPS 2020
December 14, 2020

The Neural Information Processing Systems (NeurIPS) conference is a meeting that revolves around machine learning and computational neuroscience. Held every year in December, it aims to develop novel computational and statistical strategies for information processing and to understand the mechanisms of information processing in the brain. Each year, a selection of topics is put forward that forms the basis for several workshops, poster presentations, invited talks by guest lecturers, and expo sessions held by some of the industry’s biggest and most influential companies.

In this year’s edition, we — Jules Talloen and Thomas Uyttenhove, both Machine Learning Engineers at ML6 — each had a paper accepted at a NeurIPS workshop. For this reason, we attended the virtual NeurIPS 2020 conference and documented the trends and relevant topics we encountered. This blog post goes over ML6’s workshop contributions and takes a closer look at some of the talks and demonstrations that caught our eye, divided into three main topics: cool tech, COVID-19, and ethics, privacy, and AI for good.

ML6 Contributions

For one, NeurIPS provides informal, cutting-edge venues for discussing current topics in the form of smaller meetings or workshops.

In this section, we present the two workshops to which we contributed: Beyond Backpropagation, for Jules’ paper “PyTorch-Hebbian: facilitating local learning in a deep learning framework”, and Machine Learning for Health, for Thomas’ paper “Interpretable Epilepsy Detection in Routine, Interictal EEG Data using Deep Learning”.

Beyond Backpropagation: Novel Ideas for Training Neural Architectures

Is backpropagation the ultimate tool on the path to achieving artificial intelligence as its success and widespread adoption would suggest?

Since its discovery, many have questioned the biological plausibility of backpropagation. The limitations of backpropagation have motivated people to investigate alternative learning paradigms. The Beyond Backpropagation workshop aims to accommodate discussions towards novel ideas for training neural architectures by combining perspectives from engineering, machine learning and neuroscience.

Jules’ paper, titled “PyTorch-Hebbian: facilitating local learning in a deep learning framework”, introduces a novel machine learning framework for local learning, a type of learning that more closely resembles how humans learn. The deep neural networks known today have come a long way since their origins in ‘brain-inspired computing’. However, since then the understanding of the human brain has also improved. One thing that remains certain is that the human brain is far more power- and data-efficient at computing than deep learning on digital computers. Although the mysteries of the human brain are far from solved, several learning mechanisms are well known. A lot of the efficiency of learning is the result of self-organization: learning rules that work locally to adapt the network to the data it typically receives without any supervision or global reward. This type of local learning is now made much more accessible by Jules’ novel machine learning framework.
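To make the idea of local learning concrete, here is a minimal, framework-free sketch of one classic local rule, Oja’s rule. This is our own toy illustration, not PyTorch-Hebbian’s actual API: each weight update uses only the activity of the two neurons it connects, with no global error signal being propagated.

```python
import random

# Toy sketch of a local (Hebbian) learning rule: Oja's rule. Unlike
# backpropagation, each weight update uses only the activity of the
# two neurons it connects -- no global error signal is propagated.
def oja_step(w, x, lr=0.01):
    """One Oja's-rule update: w += lr * y * (x - y * w)."""
    y = sum(wi * xi for wi, xi in zip(w, x))  # post-synaptic activity
    return [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]

random.seed(0)
w = [0.5, 0.5]
for _ in range(2000):
    s = random.gauss(0, 1)
    x = [s, 0.5 * s]              # inputs correlated along (1, 0.5)
    w = oja_step(w, x)

# Without any supervision, the weight vector self-organizes toward the
# data's principal direction (1, 0.5), normalized to unit length.
norm = sum(wi * wi for wi in w) ** 0.5
```

Note how nothing in the update looks beyond the neuron’s own inputs and output, which is exactly what makes such rules biologically plausible and cheap to compute.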

Machine Learning for Health: Advancing Healthcare for All


The Machine Learning for Health (ML4H) workshop aims to bring together machine learning researchers, clinicians, and healthcare data experts to advance research into cutting-edge machine learning approaches that improve patient outcomes. This year, the focus of the workshop was on making sure such sophisticated and powerful tools are accessible not only to highly specialized caregivers but to healthcare workers on a global scale, thus enabling healthcare for all. Key topics highlighted during this year’s installment were public health, fairness, clinical practice, and epidemiology (which should come as no surprise). Additionally, a lot of attention went into explainable machine learning.

In the context of public health and clinical practice, Thomas’ paper on “Interpretable Epilepsy Detection in Routine, Interictal EEG Data using Deep Learning” looks into the application of machine learning for epilepsy detection. Epilepsy is a common brain disease that affects over 50 million people worldwide, most of whom live in low- and middle-income countries. Epileptics are three times as likely to die prematurely due to complications that arise during the infamous epileptic seizures, yet proper diagnosis and treatment would allow 70% of patients to live seizure-free. Whereas the detection of seizures has been successfully “solved” using machine learning, detecting epilepsy when the patient is not having a seizure has received far less attention. Responding to this gap in epilepsy research, this work proposes a machine learning approach that surpasses the diagnosis rate of the previous best method. Additionally, the proposed approach allows visualizing how such diagnoses are made, so that the model’s decisions can be validated for more widespread adoption in clinical settings.

Cool Tech

In this section, we present the coolest tech that caught our eye during the conference. First, we discuss three expo talks by some of NeurIPS’ biggest sponsors: Zalando Research on GAN applications for fashion generation, Apple on their newest M1 chip, and Facebook CTRL-Labs on a novel wrist sensor for immersive hand control. To end this section, we describe an interesting, more technical talk by Anthony M. Zador on the genomic bottleneck.

GAN Applications in Fashion Article Design and Outfit Rendering

by Zalando Research

When a user is looking for a particular fashion item, they would typically need to provide a textual description in order to search for it. A lot of information is lost in this modality change. The customer may not know the right fashion terms that apply to the article in mind. In order to avoid these modality changes, Zalando proposes Generative Search. With Generative Search, the user is presented with some example clothing images. They can combine two of these images to generate a new image that matches what they are looking for. This image is then used to search through Zalando’s database.
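Zalando did not detail the mechanics in this talk, but a common way to “combine two images” with a generative model is to interpolate their latent codes. The sketch below illustrates the idea with toy vectors; the latents (and the encoder/generator they stand in for) are hypothetical.

```python
# Hypothetical sketch of "combining two images" in a generative model's
# latent space. The latent vectors below are toy stand-ins for what an
# encoder would produce for two example articles the user selected.
def lerp(z1, z2, alpha):
    """Linearly interpolate between two latent vectors."""
    return [(1 - alpha) * a + alpha * b for a, b in zip(z1, z2)]

z_shirt = [0.9, 0.1, 0.4]    # toy latent for example image 1
z_jacket = [0.2, 0.8, 0.6]   # toy latent for example image 2

# A 50/50 blend gives a new latent; a generator would decode it into a
# query image, which is then matched against the catalogue by
# embedding distance instead of by textual description.
z_query = lerp(z_shirt, z_jacket, 0.5)
```

The appeal of searching in this space is that it sidesteps the lossy image-to-text modality change described above: the query never has to be put into words.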

The presented Proof-of-Concept of Zalando’s Fashion Renderer.

Additionally, Zalando demonstrated their Fashion Renderer. Given some clothing images and a body pose, the tool generates a new image of a fashion model in the specified pose, wearing the given clothing items. This tool allows users to easily create their own outfits and visualize what they would look like when worn.

Accelerated Training with ML Compute on M1-Powered Macs

by Apple

Apple’s new M1-chip. (source: Apple)

Last month, Apple announced their all-new M1 chip: a powerful chip specifically designed for Macs. With an integrated 8-core CPU combining high-efficiency and high-performance cores, the M1 chip offers a low-latency yet high-bandwidth experience. Additionally, it features a powerful machine learning accelerator and a high-performance 8-core GPU that offer a large leap forward in supporting machine learning solutions.

In light of these improved machine learning capabilities, Apple has released an accompanying machine learning framework with their latest installment of macOS, Big Sur. The framework, called ML Compute, provides low-level APIs for building computational graphs and defining training loops. Performance is further optimized by distributing the workload between CPU and GPU, given the available hardware.

A Mac-optimized version of the popular machine learning framework TensorFlow (version 2.4) enables the training of TensorFlow graphs directly on the GPU of both M1-based and Intel-based Macs — previously, training on Macs was limited to the CPU. Moreover, a significant speedup of 3.3x to 7.0x and power savings of 3.5x to 8x can be noted when opting for M1-based over Intel-based Macs. These claims regarding the performance increase offered by the new hardware and software are promising, to say the least.

Building Neural Interfaces: When Real and Artificial Neurons Meet

by Facebook CTRL-Labs

When humans interact with technology, we tend to have a pretty high bandwidth for input. We are capable of processing enormous amounts of information. However, our output is typically limited to typing on a small screen, or keyboard at best. At Facebook CTRL-Labs, they tried to bypass these bottlenecks and built a direct Brain-Computer Interface (BCI).

The motor intent sensor, presented by Facebook CTRL-Labs, is worn around the wrist and predicts the current hand position.

They propose a non-invasive neural interface, worn around the forearm. The device contains a model that has been trained to reconstruct motor intention, what the user wants their hand to be doing. Using these intents, it is possible to determine the current hand position. The talk demonstrated how the technology can be used for immersive control in virtual reality, seamless navigation through interfaces and typing without a keyboard.

The Genomic Bottleneck: A Lesson from Biology

by Anthony M. Zador, Neuroscientist, Cold Spring Harbor Laboratory

The genomic bottleneck explores how general intelligence in nature is attributed to the wiring of the brain rather than learning during life. (source: PsyPost)

Throughout the years, the pursuit of general intelligence has often been inspired by how brains work in the real world. However, contrary to most machine learning approaches available today, general intelligence among animals and humans is largely due to highly structured brain connectivity; only a small part can be attributed to clever learning algorithms. This structure is constructed from genomic information present in the DNA of animals and humans, which is the result of thousands to millions of years of evolution. Interestingly enough, the genomic information itself is many orders of magnitude smaller than the actual complex wiring of the brain, meaning that genomes can only contain compressed rules for this wiring.

In this presentation, Zador explains how his research explores this concept with what is called the genomic bottleneck algorithm. In this algorithm, a so-called G-network compresses a working solution by learning an approximation of its weight matrix. The G-network thus acts as a sort of regularizer that minimizes the error on data prior to learning while penalizing the network’s complexity. Using this approach, highly compressed models were still able to perform well on non-trivial problems (MNIST, CIFAR, SVHN) while enabling more rapid learning that transfers well to similar, related problems.
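The parameter arithmetic behind such compression can be sketched in a few lines. The low-rank factorization below is our own illustrative analogy, not Zador’s actual G-network:

```python
# Illustrative parameter count for a "genomic bottleneck": storing two
# small factors instead of the full wiring matrix. This low-rank
# analogy is our own sketch, not Zador's actual G-network.
n, m, k = 1000, 1000, 10         # full matrix size vs. bottleneck rank

full_params = n * m              # direct wiring: 1,000,000 weights
bottleneck_params = k * (n + m)  # compressed rules: 20,000 numbers

compression = full_params / bottleneck_params  # 50x fewer parameters
```

As in the genome, the bottleneck cannot store the wiring explicitly; it can only store rules whose expansion reproduces it approximately.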

COVID-19

COVID-19 was one of the big topics of NeurIPS 2020. (source: Sciensano)

An — unsurprisingly — recurring topic was the application of AI and machine learning to dealing with the current pandemic. Companies worldwide have responded to a call to action across all facets of research and technology regarding COVID-19, among them many AI companies. During NeurIPS’ expo sessions, IBM Research and BenevolentAI presented their efforts towards combating the coronavirus over the past months.

AI against COVID-19

by IBM Research

A dashboard for easy exploration of global measures against the COVID-19 spread. (source: IBM Research)

In their presentation, IBM discusses the research they have conducted since the start of the pandemic. For one, they applied natural language processing (NLP) techniques to Wikipedia pages that document the efforts of governments worldwide towards stopping the spread of the disease. As a result, they enabled comparative studies of how different countries tackled the pandemic and which measures proved most effective.

Moreover, they developed a tool called Deep Search that enables data-driven discovery through scalable knowledge ingestion. With optical character recognition (OCR), text documents are transformed into machine-digestible components. These, in turn, are subjected to NLP to extract entities and relationships for the construction of a knowledge graph. Based on this graph, research ideas can automatically be supported with suitable data. A similar approach was used to analyze scientific papers and extract key insights for the automatic answering of COVID-related questions.
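The knowledge-graph step can be sketched as follows: extracted (entity, relation, entity) triples are indexed so that questions can be answered by graph lookup. Deep Search uses trained NLP models; the hard-coded triples below are illustrative examples only.

```python
from collections import defaultdict

# Toy sketch of the knowledge-graph step: (entity, relation, entity)
# triples, as an NLP pipeline might extract them from papers, indexed
# for lookup. The triples are illustrative, not Deep Search output.
triples = [
    ("remdesivir", "inhibits", "RdRp"),
    ("RdRp", "part_of", "SARS-CoV-2"),
    ("baricitinib", "inhibits", "AAK1"),
]

graph = defaultdict(list)
for subj, rel, obj in triples:
    graph[subj].append((rel, obj))

# Query: which targets does a given compound act on?
hits = graph["baricitinib"]  # [('inhibits', 'AAK1')]
```

Once entities and relations live in one graph, “supporting a research idea with data” becomes a matter of traversing paths between the entities the idea mentions.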

IBM’s COVID-19 Q&A tool. (source: IBM Research)

Lastly, they proposed a drug candidate explorer that finds suitable molecules for novel targets — such as COVID-19. Current methods for doing this often cost millions of dollars, can take over 10 years, have to comply with many constraints, and must find a needle in the haystack of more than 10⁶⁰ possible molecule configurations. Additionally, in the case of COVID-19, there is very little data to base research on. In response to these challenges, IBM’s CogMol (Controlled Generation of Molecules) machine learning pipeline uses generative models trained on molecule and protein sequence embeddings to automate this process and generate unique, diverse, and chemically valid drug candidates for COVID-19.

How we leverage ML and AI to develop life-changing medicines — a case study with COVID-19

by BenevolentAI

Just like IBM Research with their CogMol project, BenevolentAI saw an opportunity in the current drug discovery model. On top of the high cost and development time, they also identified high risks, high inefficacy for the best-selling drugs, and many diseases that are still untreated. For this reason, BenevolentAI aims to apply machine learning and AI to tackle the failing drug discovery and development process as a whole. Their proposed drug discovery pipeline consists of four components: knowledge acquisition and representation, target identification, precision medicine, and molecular design. It was this approach that led to the discovery of the rheumatoid arthritis drug Olumiant (baricitinib) as a highly successful treatment for COVID-19 patients who require supplemental oxygen.

BenevolentAI’s drug discovery pipeline successfully predicted the benefit of the rheumatoid arthritis drug Olumiant (baricitinib) for COVID-19 patients requiring supplemental oxygen. (source: Olumiant)

During knowledge acquisition, NLP is performed to extract entities and textual links from text-derived data (such as proteins, genes, literature, and patents), ingested data (structured machine understandable form of biological information), and patient-level data (such as omics, patents, clinical reports, and diseases). This data is then combined and embedded into a knowledge graph, allowing rapid construction of datasets for bespoke use-cases.

Disease and drug targets are then identified by finding missing relations between diseases and compounds that could potentially cure them. Missing genes that have a high probability of resolving particular diseases are selected for more detailed studies and experiments. This methodology has already shown high performance in predicting the clinical failure and success of particular genes in clinical trials.

Latent variable models (such as sparse factor analysis and auto-encoders) are further applied to differentiate between patient-specific molecular mechanisms of diseases. This enables scientists to determine the drug variants best matching the different disease variants.

Finally, AI is used to quickly and more easily explore the vast chemical space of molecules that can potentially be made into medicines.

Ethics, Privacy, and AI for Good

Another big topic that was frequently discussed was that of ethical AI, how privacy plays an increasingly important role in how machine learning is applied today, and how AI can be deployed for the good of the people. In this section, we will highlight these topics by means of four of the invited talks by guest lecturers: Charles Isbell’s presentation on reducing bias in machine learning approaches, the vision of Chris Bishop on further partnering machine learning and human expertise, a talk on the invisible workers of AI by Saiph Savage, and Shafi Goldwasser’s lecture on pursuing robustness, verification, and privacy.

You Can’t Escape Hyperparameters and Latent Variables: Machine Learning as a Software Engineering Enterprise

by Charles Isbell, Dean of Computing, Georgia Tech

A biased GAN attempts to reconstruct Obama’s facial features.

In this talk, Charles addresses the growing issue of bias in the machine learning research field. Bias stems from decisions and these decisions happen along the entire pipeline, from the data to the algorithms. We need to consider the ethical and moral consequences of these decisions. Consequently, we need to take the long view and think about the future and implications of the systems we build. Bias is all around us and hard to get rid of. The goal is to detect bias and reduce it. Even when it is the data that’s biased, it’s still up to us to build systems that can deal with that and reduce the bias.

Charles proposes several approaches to combat bias. One ‘easy’ method is to use representative data or augment the data. This is, however, not always possible. Other techniques involve a more rigorous approach to machine learning. We can take inspiration from software engineering and system modeling. Rather than trying to explain black box models, we need to build transparent models. We need to define proper objective functions that take into account moral and ethical implications. For this, we need to include people that are actually aware of these consequences. More diverse teams are needed.
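One concrete instance of the ‘representative data’ idea, sketched under our own assumptions rather than as Isbell’s exact method, is inverse-frequency reweighting: give under-represented groups proportionally more weight so each group contributes equally to training.

```python
from collections import Counter

# Sketch of inverse-frequency reweighting: give each group a weight
# inversely proportional to its size so all groups contribute equally
# to the training loss. A minimal illustration, not a full method.
labels = ["A"] * 90 + ["B"] * 10          # a skewed dataset

counts = Counter(labels)
n, k = len(labels), len(counts)

# Each group's total weight becomes n / k, regardless of its size.
weights = {g: n / (k * c) for g, c in counts.items()}

total_A = weights["A"] * counts["A"]      # ~50.0
total_B = weights["B"] * counts["B"]      # ~50.0
```

Reweighting only addresses one source of bias (group imbalance in the data); as the talk stresses, decisions along the rest of the pipeline need the same scrutiny.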

The Real AI Revolution

by Chris Bishop, Lab Director, Microsoft Research Cambridge

While the creation of Artificial General Intelligence (AGI) would be truly revolutionary, such an achievement still seems to lie well ahead of us. Meanwhile, another profound AI revolution is already unfolding and is set to transform almost every aspect of our lives. Chris calls this the Real AI Revolution.

“For the last forty years we have programmed computers; for the next forty years we will train them”.

Machine learning is becoming ubiquitous; it is solving many problems that could not be solved before. Furthermore, it possesses the ability to customize solutions to each specific instance.

Chris continues his talk by addressing the need for a deep partnership between machine learning experts and those who bring domain knowledge. Today we have systems that perform very well on a broad range of applications. We now need to think about a lot more than the machine learning algorithms themselves. One of the new challenges is the partnership between machine learning and human expertise.

“AI will not replace radiologists, but radiologists who do not use AI will be replaced by those who do.” — Dr. Felix Nensa

A Future of Work for the Invisible Workers in AI

by Saiph Savage, Director, Human Computer Interaction Lab

While AI has gained unprecedented exposure as it is successfully being integrated in an increasing number of industries, much less attention goes to the human workers required to do tasks that the AI cannot do independently — think of labelling vast amounts of data, transcribing audio, categorizing content, etc. These so-called invisible workers are often underpaid and are faced with limited career growth. As such, these people represent a new underclass on a global scale.

AI’s invisible workers are identified as a new underclass. (source: Wikimedia Commons)

A framework called Value Sensitive Design is proposed to evolve towards a future in which these invisible workers are empowered rather than exploited. The framework takes into account all stakeholders’ needs, values, and experiences to enable skill growth, hourly wage increases, and easier transitions to new creative jobs for AI workers. By employing techniques to encourage solidarity and entrepreneurship among the invisible workers, the model was shown to drive positive social change. For a future in which AI does good, it must empower people to create change, help each other, and coordinate to produce collective action.

Robustness, Verification, Privacy: Addressing Machine Learning Adversaries

by Shafi Goldwasser, Director of the Simons Institute for the Theory of Computing

Recently, the field of machine learning has met with a lot of skepticism and distrust. In light of this, Shafi offers a cryptographer’s perspective on three important aspects of machine learning: verification, privacy, and robustness.

The need for verification of machine learning models is best illustrated with an example. In 2018, the California legislature passed Senate Bill 10 to end cash bail and replace it with a risk assessment system. Based on statistics, the system decides whether a suspect should be released or detained until trial. This led to a debate regarding the fairness of the system. The opponents’ argument boiled down to a lack of trust in the machine learning model:

The pervasive use of algorithmic risk assessment complicates or undermines many bedrock democratic values, institutional standards of accountability, and constitutional principles.

Shafi posits that, in order to deal with these concerns, a computer scientist should ask:

  • Who builds the ML algorithms?
  • Who verifies that the ML code is correct?
  • How to ensure that the algorithms used are the same ones that have been verified?

A second important aspect of machine learning is privacy. The power of machine learning comes from data. In turn, this data typically originates from individuals, whose privacy needs to be guaranteed. This becomes a real challenge for training at scale, which requires large datasets and data fusion; individual-level privacy is the main barrier. You can learn a lot about an individual by looking at their “anonymized” data: past research has already shown that it is possible to trace supposedly anonymous records back to individual identities simply by combining data at scale.
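The linkage risk can be made concrete with a toy example: two datasets that look harmless in isolation re-identify a person once joined on quasi-identifiers. All records below are fictional.

```python
# Toy linkage attack: a medical dataset with names removed is joined
# with a public record on quasi-identifiers (zip code + birth year),
# re-identifying a patient. All records are fictional.
medical = [  # "anonymized": no names
    {"zip": "9000", "birth_year": 1985, "diagnosis": "epilepsy"},
    {"zip": "1000", "birth_year": 1970, "diagnosis": "diabetes"},
]
voter_roll = [  # public record that does contain names
    {"name": "A. Nonymous", "zip": "9000", "birth_year": 1985},
    {"name": "B. Someone", "zip": "1000", "birth_year": 1992},
]

linked = [
    (v["name"], m["diagnosis"])
    for m in medical for v in voter_roll
    if (m["zip"], m["birth_year"]) == (v["zip"], v["birth_year"])
]
# linked == [('A. Nonymous', 'epilepsy')]
```

Neither dataset leaks anything on its own; the join does, which is exactly why data fusion at scale makes individual-level privacy the main barrier.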

The third and final topic covered is robustness. A machine learning model must be robust against clever manipulations of inputs by an adversary. If not taken into account, such manipulations can cause misclassifications and fool applications (cf. “How to help/fool an object detector”). Currently, the most common approach is to define a class of domain-specific attacks and prove adversarial robustness against it, e.g. robustness to rotations or translations of an image. As cryptographers consider worst-case adversaries without restricting their strategies, this approach does not suffice. Instead, they allow the learner to abstain, rejecting inputs from distributions it has not seen. However, when the learner chooses not to reject, it has to be correct. This way, the model will not attempt to provide an answer to adversarial inputs.
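The abstention idea can be sketched with a simple confidence threshold. This is our own minimal illustration, not Goldwasser’s formal construction:

```python
# Sketch of a learner with a reject option: it only answers when its
# confidence is high, and abstains in the uncertain region where an
# adversarial or out-of-distribution input could fool it.
def classify_or_abstain(score, threshold=0.9):
    """score: model confidence in [0, 1] that the input is class 1."""
    if score >= threshold:
        return 1
    if score <= 1 - threshold:
        return 0
    return "abstain"  # uncertain region: refuse to answer

print(classify_or_abstain(0.97))  # 1
print(classify_or_abstain(0.05))  # 0
print(classify_or_abstain(0.60))  # abstain
```

The contract is asymmetric: abstaining is always allowed, but any answer the learner does give must be correct, so adversarial inputs land in the reject region rather than in a wrong answer.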

Closing Remarks

You can find the official information about the conference, the complete schedule of this year’s edition, an overview of the proceedings, and more on the official NeurIPS website.
