Engineering · Full-time

Senior AI Data Engineer (RAG / Retrieval / Python)

Remote (Ukraine, Europe)

Build and maintain production AI data pipelines and retrieval-driven workflows for a funded AI/data platform. You'll own the data engineering layer that powers RAG and LLM-integrated features at scale.

We’re hiring a Senior AI Data Engineer to join a client-facing team building an AI/data platform with heavy external data ingestion, non-standard data engineering, and retrieval-driven workflows.

This is not classic BI/reporting or data warehouse work, and it’s not a pure backend API role. The system ingests large volumes of non-standard external data, processes it through AI/LLM workflows, and makes it retrievable at scale. If you’ve built production RAG pipelines or vector-based retrieval systems and you’re comfortable operating in an ambiguous startup environment, this role is for you.

What you’ll do

Build and maintain data ingestion pipelines for structured and unstructured external data
Design and support retrieval pipelines for AI/LLM workflows, including chunking, embedding, indexing, and metadata enrichment
Develop Python backend services (FastAPI) that expose data to the broader platform
Own data flows across Postgres, object storage, vector search, and related stores
Improve platform reliability and retrieval performance as data volume grows
Collaborate closely with software engineers and AI teammates to evolve the system

The environment

The team is small and operates with minimal process overhead. The codebase is production-facing, and engineers own outcomes end-to-end — not just their individual tickets.

What we're looking for

5+ years of commercial software/data engineering experience
Strong commercial experience with Python
Hands-on experience building data pipelines and ingestion workflows
Hands-on experience with AI/LLM retrieval systems — RAG pipelines, vector search, embedding-based retrieval, or document ingestion / chunking / indexing workflows
Experience building or maintaining Python backend services with FastAPI or a similar framework
Experience with AWS data/cloud infrastructure
Experience with unstructured or semi-structured data
Strong SQL and practical data modeling skills
Ability to work independently in ambiguous product environments
Strong written and spoken English — all technical documentation and client reviews are in English

Nice to have

Production experience with vector databases such as Pinecone, pgvector, Qdrant, Weaviate, OpenSearch vector search, or FAISS
Experience with graph databases or connected-data modeling, such as Neo4j or Amazon Neptune
Experience with scraping-heavy or connector-heavy ingestion systems
Experience with LangChain, LangGraph, Haystack, LlamaIndex, or similar orchestration frameworks
Experience with Terraform
Experience supporting retrieval quality, latency, and production reliability
Experience with reranking, hybrid retrieval, or evaluation of retrieval quality
Experience with AI agent workflows or tool-calling systems
Experience with data governance, permissions, or enterprise knowledge access

What you'll get

Direct client access — work straight with technical decision-makers, no PM or agency layer in between
Long-term engagements with well-funded SaaS and regulated-enterprise clients
Senior-only environment: every peer has 5+ years of commercial experience and a track record we've vetted
Real production AI work on modern stacks — not legacy maintenance or POC theater
Fully remote, flexible hours, B2B contractor model with transparent rates
Growth path into tech-lead and architect roles on multi-year client engagements

Don't see the right role?

We occasionally hire for roles we haven't posted yet. Send your background to career@insoftex.com and we'll keep it on file.
