Think of how web developers use relational databases like Postgres or Snowflake. The database provides a simple table interface while handling complex operations behind the scenes:
- File storage and indexing
- Transaction management
- Query optimization
- Converting declarative queries into file operations
This frees developers to focus on application logic and data modeling rather than infrastructure plumbing.
Pixeltable’s Approach
Pixeltable follows a similar philosophy but extends it to AI workflows. It provides a familiar table interface that encompasses both data and transformation logic:
- You control:
  - What transformations to apply
  - Which models to use
  - How to structure your algorithms
- Pixeltable handles:
  - Orchestration
  - Transactional storage
  - Data retrieval
  - Incremental updates
The interface is easily extensible through user-defined functions (UDFs).
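Here is a minimal sketch of that mechanism (the word_count function and the docs table are purely illustrative, not part of Pixeltable): a UDF is an ordinary Python function registered with the @pxt.udf decorator, after which it can be used like any built-in function in a computed column.
import pixeltable as pxt

@pxt.udf
def word_count(text: str) -> int:
    # Any plain Python function can become a table-level operation
    return len(text.split())

docs = pxt.create_table('docs', {'text': pxt.String})
docs['n_words'] = word_count(docs.text)  # computed column backed by the UDF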
Video Processing Example
This approach shines with video processing. Pixeltable includes built-in functionality for:
- Video frame extraction
- Audio separation
- Frame sampling at specified rates
- Automated metadata extraction
You can extend this through:
- Custom object detection models
- Integration with specialized video processing libraries
For example:
import pixeltable as pxt
from pixeltable.iterators import FrameIterator
from pixeltable.ext.functions.yolox import yolox  # import path may vary by Pixeltable version

# Create a video table
videos = pxt.create_table('videos', {'video': pxt.Video})

# Create a frames view with automatic frame extraction (one frame per second)
frames = pxt.create_view(
    'frames',
    videos,
    iterator=FrameIterator.create(video=videos.video, fps=1)
)

# Add object detection as a computed column
frames['detections'] = yolox(frames.frame, model_id='yolox_s')
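Once the detection column exists, you can read results back with Pixeltable's standard table operations. A brief sketch, assuming the tables defined above (select, limit, and collect are Pixeltable's regular query methods):
# Retrieve frames together with their detection results
results = frames.select(frames.frame, frames.detections).limit(10).collect()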
Image Processing Example
The same pattern applies to image processing. Pixeltable provides:
- Image loading and storage
- Basic transformations (resize, crop, rotate)
- Efficient batch processing
- Automatic format handling
You can extend with:
- Custom computer vision models
- Image segmentation algorithms
- Feature extraction pipelines
- Integration with deep learning frameworks
For example:
import pixeltable as pxt
from pixeltable.functions.huggingface import clip_image

# Create an image table
images = pxt.create_table('images', {'image': pxt.Image})

# Add CLIP embeddings for similarity search
images['embedding'] = clip_image(images.image, model_id='openai/clip-vit-base-patch32')

# Create an embedding index on the new column
images.add_embedding_index('embedding', metric='cosine')
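A quick usage sketch (the file paths are placeholders): inserting rows into the table computes the embedding column automatically, and the results can be read back with an ordinary query.
# Inserting images triggers embedding computation automatically
images.insert([{'image': '/path/to/photo1.jpg'}, {'image': '/path/to/photo2.jpg'}])

# Read back images together with their embeddings
images.select(images.image, images.embedding).collect()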
Key Benefits
- Simplified Data Management:
  - No need to manage frame extraction scripts
  - Automatic handling of video/image formats
  - Built-in versioning and lineage tracking
- Efficient Processing:
  - Intelligent caching of processed frames
  - Incremental updates for model outputs
  - Parallel processing where possible
- Declarative Interface:
  - Express complex video/image pipelines in simple table operations
  - Chain transformations easily
  - Query processed results efficiently
- Development Focus:
  - You maintain full control over domain-specific logic where your product adds value
  - Pixeltable handles the undifferentiated heavy lifting of data operations
  - Focus on innovation rather than infrastructure
For example, a complete pipeline might look like this:
import pixeltable as pxt
import PIL.Image
from pixeltable.iterators import FrameIterator
from pixeltable.ext.functions.yolox import yolox  # import path may vary by Pixeltable version

# Define custom processing logic
@pxt.udf
def custom_video_analytics(frame: PIL.Image.Image) -> dict:
    # Your specialized analysis code goes here; this placeholder just records frame dimensions
    return {'width': frame.width, 'height': frame.height}

# Manage data
videos = pxt.create_table('videos', {'video': pxt.Video})
frames = pxt.create_view('frames', videos,
                         iterator=FrameIterator.create(video=videos.video, fps=1))

# Add computed columns
frames['detections'] = yolox(frames.frame, model_id='yolox_s')  # Object detection
frames['analytics'] = custom_video_analytics(frames.frame)      # Custom analysis
This pipeline automatically handles:
- Frame extraction and caching
- Parallel processing where possible
- Incremental updates when new videos are added (see the sketch after this list)
- Efficient storage and retrieval
- Search capabilities
While you focus on:
- Detection algorithms
- Analytics logic
- Model selection and tuning
- Business-specific requirements
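As a concrete sketch of that incremental-update behavior (the file path is a placeholder), adding a new video to the base table is all it takes; frame extraction, object detection, and your custom analytics run automatically for just the new rows.
# New rows flow through the entire pipeline without recomputing existing results
videos.insert([{'video': '/path/to/new_video.mp4'}])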
The result is a more efficient development process that lets you focus on your core competencies while Pixeltable handles the complex data management infrastructure.