Multimodal AI Data Infrastructure
The only open source Python library that provides incremental storage, transformation, indexing, and orchestration of your multimodal data.
From data chaos to production apps:
Unify Storage and Orchestration
Build multimodal AI apps and agentic workloads with a few lines of code without losing flexibility. Ship to production in days, not months.
The Anatomy of a Multimodal AI Agent
Explore the code behind Pixelbot, built on Pixeltable. See how declarative data infrastructure simplifies workflows like RAG, tool use, and multimodal data handling.
pxt.create_dir("agents", if_exists="ignore")

# === DOCUMENT PROCESSING ===
documents = pxt.create_table(
    "agents.collection",
    {
        "document": pxt.Document,
        "uuid": pxt.String,
        "timestamp": pxt.Timestamp,
        "user_id": pxt.String
    },
)
chunks = pxt.create_view(
    "agents.chunks",
    documents,
    iterator=DocumentSplitter.create(
        document=documents.document,
        separators="paragraph",
        metadata="title, heading, page"
    ),
)
chunks.add_embedding_index(
    "text",
    string_embed=sentence_transformer.using(
        model_id=config.EMBEDDING_MODEL_ID
    ),
)

@pxt.query
def search_documents(query_text: str, user_id: str):
    sim = chunks.text.similarity(query_text)
    return (
        chunks
        .where((chunks.user_id == user_id) & (sim > 0.5))
        .order_by(sim, asc=False)
        .select(
            chunks.text,
            source_doc=chunks.document,
            sim=sim
        )
        .limit(20)
    )

# === IMAGE PROCESSING ===
images = pxt.create_table(
    "agents.images",
    {
        "image": pxt.Image,
        "uuid": pxt.String,
        "timestamp": pxt.Timestamp,
        "user_id": pxt.String
    },
)
images.add_computed_column(
    thumbnail=pxt_image.b64_encode(
        pxt_image.resize(images.image, size=(96, 96))
    ),
)
images.add_embedding_index(
    "image",
    embedding=clip.using(model_id=config.CLIP_MODEL_ID),
)

# ... and so on for Video, Audio, Memory, Chat History, Personas, Image Generation ...

# === AGENT WORKFLOW DEFINITION ===
tools = pxt.tools(
    functions.get_latest_news,
    functions.fetch_financial_data,
    search_video_transcripts,
)

tool_agent = pxt.create_table(
    "agents.tools",
    {
        "prompt": pxt.String,
        "timestamp": pxt.Timestamp,
        "user_id": pxt.String,
        "initial_system_prompt": pxt.String,
        "final_system_prompt": pxt.String,
        "max_tokens": pxt.Int,
        "temperature": pxt.Float,
    },
)

# === DECLARATIVE WORKFLOW WITH COMPUTED COLUMNS ===
# Step 1: Initial LLM Reasoning (Tool Selection)
tool_agent.add_computed_column(
    initial_response=messages(
        model=config.CLAUDE_MODEL_ID,
        system=tool_agent.initial_system_prompt,
        messages=[{"role": "user", "content": tool_agent.prompt}],
        tools=tools,
        tool_choice=tools.choice(required=True),
    ),
)

# Step 2: Tool Execution
tool_agent.add_computed_column(
    tool_output=invoke_tools(tools, tool_agent.initial_response)
)

# Step 3: Context Retrieval (Parallel RAG)
tool_agent.add_computed_column(
    doc_context=search_documents(tool_agent.prompt, tool_agent.user_id)
)
# ... other context retrieval steps ...

# Step 7: Final LLM Reasoning (Answer Generation)
tool_agent.add_computed_column(
    final_response=messages(
        model=config.CLAUDE_MODEL_ID,
        system=tool_agent.final_system_prompt,
        messages=tool_agent.final_prompt_messages,
    ),
)

# Step 8: Extract Final Answer Text
tool_agent.add_computed_column(
    answer=tool_agent.final_response.content[0].text
)
Declarative AI pipeline: tables → embeddings → RAG → tools. One file replaces hundreds of lines of traditional ML code.
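To make the retrieval step concrete, here is a minimal plain-Python sketch of what the `search_documents` query expresses: filter by user, keep matches above a similarity threshold, sort by descending similarity, and cap the result count. It does not use Pixeltable. The `Chunk` class, the hand-written vectors, and `cosine_similarity` are illustrative assumptions standing in for the embedding index.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for one row of the chunks view."""
    text: str
    user_id: str
    vector: list[float]  # toy embedding; a real index would compute this


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_chunks(chunks, query_vector, user_id, threshold=0.5, limit=20):
    """Filter by user, keep scores above threshold, sort descending, truncate."""
    scored = [
        (cosine_similarity(c.vector, query_vector), c)
        for c in chunks
        if c.user_id == user_id
    ]
    hits = [(sim, c) for sim, c in scored if sim > threshold]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [{"text": c.text, "sim": sim} for sim, c in hits[:limit]]


# Toy data: three chunks for one user, one for another.
chunks = [
    Chunk("refund policy", "alice", [1.0, 0.0]),
    Chunk("shipping times", "alice", [0.6, 0.8]),
    Chunk("unrelated note", "alice", [0.0, 1.0]),
    Chunk("refund policy (bob's copy)", "bob", [1.0, 0.0]),
]
results = search_chunks(chunks, query_vector=[1.0, 0.0], user_id="alice")
```

With this toy data, only the first two of alice's chunks clear the threshold, and bob's chunk is never scored because the user filter runs first.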
Declarative. Multimodal. Incremental.
Pixeltable automates storage, orchestration, incremental computation, & model execution. Focus on logic, not infrastructure.
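Incremental computation means derived values are computed only for rows that do not have them yet, instead of re-running the whole pipeline. The toy class below illustrates the idea in plain Python. `ToyTable` and its methods are hypothetical and only mimic the behavior; this is not how Pixeltable is implemented.

```python
class ToyTable:
    """Hypothetical in-memory table with computed columns (illustration only)."""

    def __init__(self):
        self.rows = []       # each row is a dict of column name -> value
        self.computed = {}   # column name -> function(row) -> value
        self.calls = 0       # counts how many times any computed function ran

    def add_computed_column(self, name, fn):
        """Register a computed column and backfill it for existing rows."""
        self.computed[name] = fn
        for row in self.rows:
            self._fill(row, name, fn)

    def insert(self, new_rows):
        """Append rows, computing derived columns for the new rows only."""
        for values in new_rows:
            row = dict(values)
            for name, fn in self.computed.items():
                self._fill(row, name, fn)
            self.rows.append(row)

    def _fill(self, row, name, fn):
        self.calls += 1
        row[name] = fn(row)


table = ToyTable()
table.insert([{"raw_text": "hello"}, {"raw_text": "multimodal data"}])
table.add_computed_column("word_count", lambda r: len(r["raw_text"].split()))
calls_after_backfill = table.calls   # one call per existing row

table.insert([{"raw_text": "one more row"}])
calls_after_insert = table.calls     # only the new row triggered a call
```

The call counter makes the point: inserting a third row costs one extra function call, not three.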
1. Unified Data Foundation
Natively manage diverse data types (images, videos, audio, docs, embeddings) without duplication. Persistent, versioned tables. Eliminate separate DBs/stores.
import pixeltable as pxt

# Create a directory for your tables
pxt.create_dir('demo_project')

# Define table with image and text columns
img_table = pxt.create_table(
    'demo_project.images',
    {
        'input_img': pxt.Image,
        'raw_text': pxt.String  # For UDF example
    }
)

# Insert data (paths or URLs and text)
img_table.insert([
    {
        'input_img': 'image1.jpg',
        'raw_text': 'Text for image 1'
    },
    {
        'input_img': 'image2.png',
        'raw_text': 'Text for image 2'
    }
])
Unify Storage and Orchestration
pip install pixeltable
→ Your entire AI data stack
Reduction in pipeline complexity
Simplify your AI data pipelines with declarative processing
Faster development cycles
Accelerate your ML development with automated workflows
Lower infrastructure costs
Deploy serverless functions when you need them, all without leaving Pixeltable
* Performance metrics based on typical use cases and internal benchmarks.
Build Production-Ready AI Applications
Accelerate your multimodal workflows with unified data infrastructure for AI.
Computer Vision
Automate complex CV workflows with unified data management and declarative Python.
frames.add_computed_column(
    objects=yolox(frames.frame)
)
RAG & Semantic Search
Build reliable RAG systems with auto-synced multimodal indexes, simplifying vector DB management.
docs.add_embedding_index(
    'content', embedding=clip
)
Build AI Agents Faster
Unified infrastructure for agent data, state, and tools. Focus on agent logic, not plumbing.
@pxt.udf
def agent_tool(query: str):
    return process(query)
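The agent pattern here comes down to registering plain Python functions as tools and dispatching model-issued tool calls to them by name. A minimal sketch in plain Python follows. The `TOOLS` registry, the `tool` decorator, and the shape of `tool_calls` are assumptions made for illustration; they are not Pixeltable's API or any LLM provider's API.

```python
from typing import Callable

TOOLS: dict[str, Callable[..., object]] = {}  # hypothetical tool registry


def tool(fn):
    """Register a function under its own name so it can be invoked by name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def word_count(text: str) -> int:
    return len(text.split())


@tool
def shout(text: str) -> str:
    return text.upper()


def invoke_tool_calls(tool_calls):
    """Run each requested call; unknown tool names become error entries."""
    outputs = []
    for call in tool_calls:
        fn = TOOLS.get(call["name"])
        if fn is None:
            outputs.append({"name": call["name"], "error": "unknown tool"})
        else:
            outputs.append({"name": call["name"], "result": fn(**call["arguments"])})
    return outputs


# Assumed shape of a model response that requests two tools.
tool_calls = [
    {"name": "word_count", "arguments": {"text": "build agents faster"}},
    {"name": "missing_tool", "arguments": {}},
]
outputs = invoke_tool_calls(tool_calls)
```

Returning an error entry for an unknown tool, rather than raising, lets the calling loop pass the failure back to the model as context for its next step.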
A New Kind of Multimodal AI Database
Start building with Pixeltable today
Join ML engineers and data scientists using Pixeltable to build powerful multimodal AI applications with unified data management and orchestration.