
COGNIZ

GATEWAY TO DATA AUTOMATION RESEARCH

Profile

Join date: Oct 15, 2019

Posts (13)

Jan 4, 2026 ∙ 32 min

Building an Advanced OCR System on Diverse Documents with DeepSeek and Gemma

Optical Character Recognition (OCR) has grown from simple text extraction to understanding complex documents. In this post, we’ll explore how to train two cutting-edge OCR models – DeepSeek-OCR and Gemma 3 – using PyTorch on a personal, air‑gapped server. We’ll cover the unique challenges of OCR on handwritten notes, printed forms, and invoices, why DeepSeek and Gemma are ideal in secure low-resource settings, how to set up and train them offline, and how to evaluate their performance....

Nov 23, 2025 ∙ 4 min

JEPA World Models: Innovative Predictive Learning Across Images, Video, and Agents

Joint-Embedding Predictive Architectures (JEPAs) are a family of models that learn by predicting high-level features rather than pixels. They unify image-based learning (I-JEPA), video-based learning (V-JEPA), and general predictive world models for autonomous agents. If you zoom out a bit, modern self‑supervised vision methods mostly fall into two categories: Invariance-based: given two augmented views of the same image, force the encoder to produce almost identical embeddings, and push...

Jul 27, 2025 ∙ 5 min

Universal App Launcher: Build Once, Use Everywhere

When every AI app wires directly to multiple tools through separate integrations, the result is brittle prompts and duplicated glue code. The solution is MCP (Model Context Protocol), a standard client–server contract. Each Host ships one MCP Client, and each capability lives behind an MCP Server. New pairings require zero new glue. It ensures safer tool use, portable integrations, and faster iteration. MCP Overview An architecture diagram depicting the infrastructure prior to MCP...

Srijon Mandal

Admin
