Building a coral segmentation model using sparse data
The ocean is an often overlooked yet essential part of our planet, and a bastion against the threat of climate change. Often described as the cities of the ocean, coral reefs are essential to its ecosystems, supporting both marine life and ocean vegetation. However, climate change and human-caused pollution aren’t being kind to our precious reefs, with projections that we will lose almost all of them by 2050.
So what can we do?
Aside from mitigating causes such as climate change and pollution, research will be essential in the attempt to save the coral reefs. One startup supporting researchers in this area is Reef Support, which is creating software to aid researchers in decision making. Together with FruitPunch AI, they wondered whether AI could play a role, and organized a challenge to see if AI could further aid researchers in protecting the coral reefs. Through this challenge, we at ML6 got in contact with them and decided to give it a shot as one of our Christmas Projects.
The goal is to segment images of corals in order to automatically determine the coral coverage of each image. Sounds like an image segmentation problem, straightforward enough. Sadly, as is often the case in the real world, the data is less straightforward. In an ideal world, the data would come with image segmentation masks as labels, as in the image below.
However, the data we’re working with is the Seaview dataset. The dataset contains about 1.1 million high-resolution images of coral, of which about 11 thousand are annotated. These annotations, as shown in the image below, are individual pixels manually labelled by experts, each indicating the class it belongs to (coral, algae, sponge, …).
So the main challenge of this project is to use these sparse point annotations in a smart way, such that we can still segment an image without having actual segmentation masks to train on.
The rolling window classifier
A first idea that came to mind was to use the labels we had not to train a segmentation model, but a classification model. Every single annotation can serve as a training example for image classification: for each image, we crop a square around each annotated pixel and label the crop with that pixel's class. As we have about 50 annotations per image, this can result in over 500,000 input images to train a classifier.
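The cropping step could look roughly like the sketch below. It is only an illustration under assumptions: the function name `crop_patches`, the `(row, col, label)` annotation format, the patch size of 64 and the reflect padding at the image border are all hypothetical choices that the post does not specify.

```python
import numpy as np


def crop_patches(image, points, patch_size=64):
    """Crop a square patch centred on each annotated pixel.

    image:  H x W x C array.
    points: iterable of (row, col, label) tuples (assumed annotation format).
    patch_size: hypothetical value, not taken from the original project.
    """
    half = patch_size // 2
    # Reflect-pad so annotations close to the border still give full-size patches.
    padded = np.pad(image, ((half, half), (half, half), (0, 0)), mode="reflect")
    patches, labels = [], []
    for row, col, label in points:
        # In padded coordinates the annotated pixel sits at (row + half, col + half),
        # so this window places it at index (half, half) of the patch.
        patches.append(padded[row:row + patch_size, col:col + patch_size])
        labels.append(label)
    return np.stack(patches), np.array(labels)
```

The resulting `patches` and `labels` arrays can then be fed to any standard image classifier.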
This model could then theoretically go over the entire image as a rolling window, classifying each pixel. However, we once again have to live with the fact that we live in the real world, and that compute time is finite. As such, to save compute time, the rolling window uses a certain step size (such as 16 in this case).
The result is then that each block of 16 x 16 pixels in the image is attributed to a class, resulting in a segmentation mask.
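A minimal sketch of such a rolling window is shown below. It assumes a hypothetical `classify_patch` callable that maps one patch to an integer class id; the patch size and the choice to classify the patch centred on each block are illustrative assumptions rather than the project's actual settings.

```python
import numpy as np


def rolling_window_mask(image, classify_patch, patch_size=64, step=16):
    """Classify one patch per step x step block and paint the block with the result.

    classify_patch: hypothetical callable, patch (patch_size x patch_size x C) -> int.
    """
    height, width = image.shape[:2]
    half = patch_size // 2
    # Reflect-pad so patches centred near the border keep their full size.
    padded = np.pad(image, ((half, half), (half, half), (0, 0)), mode="reflect")
    mask = np.zeros((height, width), dtype=np.int64)
    for top in range(0, height, step):
        for left in range(0, width, step):
            # Centre of the current block, clipped to stay inside the image.
            cy = min(top + step // 2, height - 1)
            cx = min(left + step // 2, width - 1)
            patch = padded[cy:cy + patch_size, cx:cx + patch_size]
            # Slicing past the image edge is clipped by NumPy automatically.
            mask[top:top + step, left:left + step] = classify_patch(patch)
    return mask
```

The number of classifier calls grows with `(height / step) * (width / step)`, which is why halving the step size roughly quadruples the run time.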
When the classifier is of sufficient performance, the result of this method looks quite promising, but also has its disadvantages.
Advantages of this method:
Efficient use of the annotation data.
Relatively simple yet effective.
Disadvantages of this method:
Result is quite rough and “blocky”, though this can be mitigated with smoothing techniques (and isn’t that much of an issue, as the main goal is to calculate coral coverage).
Can take quite a long time if the step size is small.
Random patches here and there are misclassified and don’t make much sense.
The semi-supervised method
Another method, for which the idea is credited to Maks Kulicki who also worked on the challenge, is a combination of unsupervised segmentation and a classifier as used earlier.
The idea of this method is to first use an unsupervised technique (such as SLIC) to determine the possible segments in the image. Next, a patch is cropped around the center of each of these segments and fed through the classifier.
The result is that each of the predetermined segments is attributed to a class, resulting in a segmentation mask.
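As a rough sketch, the per-segment classification could be written as follows. It assumes the segments are already available as an integer label map (for example the output of scikit-image's `skimage.segmentation.slic`), and it reuses a hypothetical `classify_patch` callable; taking the mean pixel position as the segment "center" is also an assumption.

```python
import numpy as np


def classify_segments(image, segments, classify_patch, patch_size=64):
    """Assign one class per segment by classifying a patch around its centroid.

    segments: H x W integer label map, e.g. from an unsupervised method like SLIC.
    classify_patch: hypothetical callable, patch -> int class id.
    """
    half = patch_size // 2
    padded = np.pad(image, ((half, half), (half, half), (0, 0)), mode="reflect")
    mask = np.zeros(segments.shape, dtype=np.int64)
    for segment_id in np.unique(segments):
        region = segments == segment_id
        rows, cols = np.nonzero(region)
        # Centroid of the segment; for oddly shaped segments it may lie outside them.
        cy, cx = int(rows.mean()), int(cols.mean())
        patch = padded[cy:cy + patch_size, cx:cx + patch_size]
        # The whole segment inherits the class of this single patch.
        mask[region] = classify_patch(patch)
    return mask
```

Only one classifier call is needed per segment, which explains the speed-up, but it also means a single wrong prediction mislabels the entire segment.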
Advantages of this method:
Significantly faster than the rolling window method.
Smoother segmentation result.
Disadvantages of this method:
Less robust. If the classifier gets a segment wrong, the whole segment is mislabelled, so a big error is made at once.
In general less performant than the rolling window method.
The hybrid method
Looking at the previous two methods, they both have their advantages and disadvantages. So naturally the thought came: Can we combine the two?
The idea that we applied was as follows: use the rolling window to get an initial segmentation “suggestion”. Next, the unsupervised segmentation divides the image into segments. Instead of using the classifier to classify each segment directly, the class of a segment is now based on the result of the rolling window method. If the percentage of pixels in a segment that are classified as coral is greater than a certain threshold (e.g. 30%), the segment is considered to be coral.
As a result, the performance went up significantly (both based on subjective visual assessment, and in the metrics).
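The thresholding step of the hybrid method can be sketched as below, together with the coverage computation that is the end goal. The function names, the convention that class id 1 means coral, and the requirement that segment labels are non-negative integers are assumptions made for this illustration.

```python
import numpy as np


def hybrid_mask(window_mask, segments, coral_class=1, threshold=0.3):
    """Mark a segment as coral if enough of its pixels were coral in the window mask.

    window_mask: H x W class ids from the rolling window method.
    segments:    H x W non-negative integer label map from unsupervised segmentation.
    """
    is_coral = (window_mask == coral_class).astype(float)
    n_segments = int(segments.max()) + 1
    flat = segments.ravel()
    # Pixels per segment, and coral pixels per segment.
    total = np.bincount(flat, minlength=n_segments)
    coral = np.bincount(flat, weights=is_coral.ravel(), minlength=n_segments)
    fraction = coral / np.maximum(total, 1)  # guard against unused label ids
    # Look up each pixel's segment decision to get a boolean coral mask.
    return (fraction > threshold)[segments]


def coral_coverage(mask):
    """Fraction of the image covered by coral according to a boolean mask."""
    return float(mask.mean())
```

The threshold trades off missed coral against false positives and is one of the hyperparameters that would need tuning.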
While the methods proposed are definitely not yet perfect, groundwork has been laid on which we can build to refine the methods, to hopefully improve the work of coral researchers as much as possible.
For the next steps, we will be working together with the student organisation Everest Analytics. As this aligns with their goal of letting students work on data & machine learning projects with societal added value, this seemed like a perfect opportunity to take up this project with them. As ML6, we will be guiding them in trying to refine the earlier proposed solutions. Possible avenues are:
Improving the classifier
Experimenting with different unsupervised segmentation methods
Experimenting with preprocessing (such as color enhancement, for which the people at FruitPunch have already done some good work)
Speeding up the classifier (which will allow the rolling window method to be more fine-grained)
Fine-tuning the methods (there are lots of hyperparameters, such as the step size of the rolling window, input size of the classifier, …)