The general trend towards automation means that more and more use cases involving large amounts of data are being automated. The reasons are clear: these tasks are often repetitive, time-consuming and prone to human error, which makes them good candidates for automation. One such task remains largely manual, however: reading. We can’t automate reading, but we can make it faster by highlighting the key information in a text.
The goal of this project is to develop an algorithm that highlights the most important information in a document. How we define "important information" requires some creativity (e.g., it could mean sentences that summarise the text, sentences that are unexpected, etc.). Concretely, we see a major use case in legal texts, where we can also exploit the repetitive nature of such documents (e.g., rental contracts are 90% the same because they need a certain legal structure). We are open to suggestions, though, if another field interests you more and the approach could have a big impact there too.
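To give a flavour of the "unexpected sentences" idea, here is a minimal sketch, not the project's actual method. It assumes we have a few reference documents of the same type (for example other rental contracts) and flags every sentence whose best word-overlap (Jaccard) score against the reference sentences falls below a threshold. The function names, the naive sentence splitter and the threshold value are all hypothetical choices made for illustration.

```python
import re


def split_sentences(text):
    # Naive splitter (an assumption): break on whitespace after ., ! or ?
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(sentence):
    # Lower-cased set of alphanumeric words in the sentence
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    # Word-set overlap in [0, 1]; two empty sets count as identical
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def unexpected_sentences(document, reference_docs, threshold=0.5):
    # Collect the word sets of every sentence in the reference documents
    reference = [tokens(s) for doc in reference_docs for s in split_sentences(doc)]
    highlighted = []
    for sentence in split_sentences(document):
        words = tokens(sentence)
        # Similarity to the closest reference sentence (0.0 if there are none)
        best = max((jaccard(words, ref) for ref in reference), default=0.0)
        if best < threshold:
            highlighted.append(sentence)
    return highlighted


reference_docs = [
    "The tenant shall pay rent monthly. The landlord shall maintain the property."
]
document = (
    "The tenant shall pay rent monthly. "
    "The tenant may keep two llamas in the garden."
)
result = unexpected_sentences(document, reference_docs)
```

In this toy example the standard rent clause matches a reference sentence exactly and is skipped, while the llama clause has little overlap with anything in the reference and is highlighted. A real solution would likely need better sentence splitting and a more robust similarity measure.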
During this internship, you will:
The duration of the internship is flexible and depends on the candidate's preferences and the project requirements. The typical duration is 6 to 8 weeks; for this specific project, the preferred duration is 6 weeks.