15 February 2022, 13:45
Location: Technische Universiteit Delft
-------
At ML6 we think it is important to give you the opportunity to learn, grow and thrive while working with the best of the best in a dynamic environment. We gladly help you take your first steps into the professional world and towards the career of your dreams. We are looking for diverse talent who share our values and want to make an impact that matters.
Take a look at the opportunities below and apply if you are interested. Please mention which internship you are applying for.
The goal of this project is to estimate, based on known heuristics, the effect that using a custom language model is expected to have compared with an existing one. The current best approach is to compare the overlap of n-grams from your domain-specific texts with those from the texts that the existing solution was trained on, and to set some (arbitrary) cut-off (e.g., if the n-gram overlap is under 30%, we explore using a custom language model). However, this method is neither very quantitative nor very rigorous.
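To make the current heuristic concrete, here is a minimal sketch of an n-gram overlap check. It is an illustration only, not ML6's implementation: the function names, the whitespace tokenisation, the choice of bigrams and the definition of overlap (the share of distinct domain n-grams that also occur in the reference texts) are all assumptions. Only the 30% cut-off comes from the description above.

```python
def ngrams(text, n=2):
    """Return the set of distinct word n-grams in a text (naive whitespace tokenisation)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_overlap(domain_texts, reference_texts, n=2):
    """Fraction of distinct domain n-grams that also occur in the reference texts."""
    domain = set().union(*(ngrams(t, n) for t in domain_texts))
    reference = set().union(*(ngrams(t, n) for t in reference_texts))
    if not domain:
        return 0.0  # assumption: no domain n-grams counts as no overlap
    return len(domain & reference) / len(domain)


def should_explore_custom_model(overlap, cutoff=0.30):
    """Apply the (arbitrary) cut-off: low overlap suggests exploring a custom model."""
    return overlap < cutoff


if __name__ == "__main__":
    domain_texts = ["the tenant shall pay rent"]
    reference_texts = ["the tenant shall leave"]
    overlap = ngram_overlap(domain_texts, reference_texts)
    print(overlap, should_explore_custom_model(overlap))
```

The sketch also shows why the heuristic is hard to defend: the result depends on the tokenisation, the value of n and the cut-off, none of which is tied to the actual performance difference between the two models.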
During this internship, you will:
The goal of this project is to leverage different NLP techniques to arrive at an algorithm that can accurately highlight anomalous words and/or sentences in a document, i.e., ones that you wouldn’t expect to appear there. This system would likely also exploit the repetitive nature of certain types of documents (e.g., rental contracts are 90% the same because they need a certain legal structure). If successful, such an algorithm could have very impactful use cases in sectors such as legal and insurance. The concrete approach would be to develop such a system on legal documents, but we are open to suggestions if there is another field that interests you more and where it could also have a big impact.
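As a rough illustration of how the repetitive nature of such documents could be exploited, the sketch below flags sentences that have no close counterpart in a set of reference documents of the same type. It is a naive baseline under stated assumptions, not the approach the project prescribes: the function names, the regex-based sentence splitting, the token-set Jaccard similarity and the 0.5 threshold are all hypothetical choices.

```python
import re


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def jaccard(a, b):
    """Jaccard similarity between the word sets of two sentences."""
    tokens_a = set(re.findall(r"\w+", a.lower()))
    tokens_b = set(re.findall(r"\w+", b.lower()))
    union = tokens_a | tokens_b
    if not union:
        return 0.0
    return len(tokens_a & tokens_b) / len(union)


def flag_anomalous_sentences(document, reference_documents, threshold=0.5):
    """Return (sentence, best_similarity) for sentences unlike any reference sentence."""
    reference_sentences = [
        s for doc in reference_documents for s in split_sentences(doc)
    ]
    flagged = []
    for sentence in split_sentences(document):
        best = max((jaccard(sentence, ref) for ref in reference_sentences), default=0.0)
        if best < threshold:  # hypothetical threshold, would need tuning
            flagged.append((sentence, best))
    return flagged


if __name__ == "__main__":
    references = [
        "The tenant pays rent monthly. The landlord maintains the property.",
    ]
    contract = (
        "The tenant pays rent monthly. "
        "Payment is accepted exclusively in vintage stamps."
    )
    for sentence, score in flag_anomalous_sentences(contract, references):
        print(f"{score:.2f}  {sentence}")
```

A word-overlap baseline like this treats any rephrasing as an anomaly, which is part of what the NLP techniques explored during the internship would need to handle.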
During this internship, you will:
The goal of this project is to develop an algorithm that highlights the most important information in a document. However, how we define important information requires some creativity (e.g., it could mean sentences that summarise the text, sentences that are unexpected, etc.). Concretely, we see a major use case for legal texts, where we can also exploit the repetitive nature of such documents (e.g., rental contracts are 90% the same because they need a certain legal structure), but we are open to suggestions if there is another field that interests you more and where it could also have a big impact.
During this internship, you will:
For this use case, we are interested in analysing a table tennis game in real time. The challenge with table tennis is that it’s a very fast-paced sport: the players are close to each other and the ball can move quite fast. Another challenge is the ball itself, which is very small and appears even smaller on screen because the camera needs to be quite far away to capture the table and players. At the moment, the TTNet paper by osai.ai seems to offer the most promising approach for solving these challenges.
During this internship, you will: