You will join our Data Team, working closely with a group of approximately five data experts. This highly collaborative team focuses on pushing the boundaries of generative AI, natural language processing, and privacy-preserving machine learning for legal solutions.
Felix, our Director of AI & Data Engineering, will guide you through your journey at Noxtua. With deep expertise in AI systems, Felix leads with a passion for innovation and a collaborative approach, ensuring every team member thrives.
Design, build, and optimize end-to-end ETL pipelines for legal data from multiple jurisdictions, covering ingestion, validation, cleaning, transformation, chunking, embedding, and loading into vector databases.
Work extensively with XML-based legal data feeds: parse, validate, normalize, and transform complex XML structures into scalable internal schemas and unified document formats.
Develop and maintain data models and storage schemas that support continuously updated datasets while ensuring consistency, scalability, and accuracy across diverse sources and large data volumes.
Coordinate data handover and integration from multiple internal and external data providers, including official sources, APIs, and web scraping pipelines, ensuring reliable and timely updates.
Implement and continuously refine metadata enrichment strategies to maximize searchability, ranking quality, and relevance of legal information in vector databases.
Build and maintain high-performance search and retrieval infrastructure enabling agent-based systems to call search functions and retrieve the most relevant legal information efficiently.
Explore and integrate generative AI techniques to enhance data processing workflows, such as structured field extraction, metadata generation, and document normalization.
Experiment with different embedding and chunking strategies and evaluate their results.
Conduct database performance benchmarking and tuning to ensure efficient query execution and scalability.
Collaborate with product, AI, and legal domain experts to deliver high-quality, reliable data solutions.
Noxtua is Europe’s sovereign Legal AI. This legally competent AI covers the entire spectrum of legal text work – from information gathering (research) and analysis of complex issues (understanding) to document creation (drafting). The legally compliant AI meets the professional, criminal, and data protection requirements for lawyers (e.g. Section 203 German Criminal Code, Section 43e German Federal Code for Lawyers) and is certified according to BSI C5, TISAX, and ISO 27001, 9001, 27018, 27017, and 42001. The tech company Noxtua has formed exclusive partnerships with leading European publishing houses from Germany, Austria, Switzerland, Poland, the Czech Republic, and Slovakia for the Legal AI Workspaces Beck-Noxtua, MANZ-Noxtua, Swiss-Noxtua, Beck-Noxtua Poland, Beck-Noxtua Czech Republic, and Beck-Noxtua Slovakia.
Founded in 2017 in the German capital as a result of a research project by Dr. Leif-Nissen Lundbæk and Professor Dr. Michael Huth at Oxford University and Imperial College London, the European legal tech company has many years of experience in developing GDPR-compliant AI solutions and now has offices in Paris, Berlin, Zagreb, and Munich. Strategic partners, including Germany’s leading legal publisher C.H.BECK and the leading law firms CMS and Dentons, have invested around 81 million euros in the European scaleup as part of its Series B.