AI / ML Platform

Qimta — BOQ AI Pricing Engine

B2B construction pricing platform for Saudi Arabia and the GCC. A RAG-powered AI model extracts BOQ line quantities, matches each line against 19,699+ verified products built from manufacturer specs, and stays current with market pricing through scheduled web scraping pipelines that refresh its knowledge base.

🇸🇦 Saudi Arabia & GCC · 🤖 Self-Training AI Model · ✓ Live in Production · 2024 – 2025


🧠

Self-Training AI Model — No Manual Updates Needed

Scheduled scrapers collect data from manufacturer websites, catalogs, and price databases and update the RAG knowledge base automatically, keeping pricing accuracy at 99.9% without human intervention.

Project Overview

Qimta is a B2B SaaS platform that solves one of the biggest bottlenecks in construction procurement: pricing every BOQ (Bill of Quantities) line accurately and instantly. Traditional pricing requires days of manual research across supplier catalogs — Qimta's AI does it in under 60 seconds.

I contributed to the backend data pipeline and AI model infrastructure — building web scraping modules, data normalization pipelines, and PostgreSQL schemas for storing and querying the product knowledge base that powers the RAG engine.

AI Pipeline Architecture

🕷️

Web Scraper: Python scrapers collect specs and prices from manufacturer sites.

🔄

Normalize & Store: Data is cleaned, structured, and indexed in a PostgreSQL vector database.

🧠

RAG Engine: The AI retrieves the most relevant products for each BOQ line item.

💰

Instant Pricing: A BOQ is priced in <60 seconds with 99.9% accuracy.

🔁

Self-Update: The scraper re-runs on a schedule to keep the knowledge base current.
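The retrieval step in this flow can be sketched as a nearest-neighbour lookup over product vectors. This is a minimal illustration under stated assumptions, not the production code: `embed` is a toy character-trigram stand-in for a real embedding model, and the catalog entries are made up for the example.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: character trigram counts."""
    cleaned = " ".join(text.lower().split())
    return Counter(cleaned[i:i + 3] for i in range(len(cleaned) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, catalog: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Return the k catalog entries most similar to the BOQ line text."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(item)), item) for item in catalog]
    return sorted(scored, reverse=True)[:k]


# Hypothetical catalog entries, for illustration only.
CATALOG = [
    "PVC conduit pipe 25 mm heavy gauge",
    "Copper cable 4 core 16 mm2 XLPE",
    "Gypsum board 12.5 mm moisture resistant",
]

top = retrieve("25mm PVC conduit", CATALOG, k=1)
print(top[0][1])
```

In a real RAG pipeline the retrieved candidates would then be passed to the LLM as context for the final match; that step is omitted here.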

Key Features & Contributions

Web Scraping Infrastructure: Built Python scrapers targeting manufacturer product pages, catalogs, and spec sheets, handling pagination, dynamic JS content, and anti-scraping measures.

RAG Model Integration: Retrieval-Augmented Generation pipeline connecting the vector knowledge base to the LLM for accurate BOQ line matching against manufacturer-verified specs.

PostgreSQL Knowledge Base: Designed relational + vector schemas for 19,699+ products with full-text search, category indexing, and version tracking for spec changes.

Data Normalization Pipeline: Python ETL pipeline that cleans, deduplicates, and standardizes scraped product data into a consistent schema for AI consumption.

Scheduled Self-Training: A cron-based scheduler re-triggers scraping and reindexing automatically, so the AI reflects the latest manufacturer pricing without manual updates.

BOQ Dashboard & Quote Management: User dashboard for uploading BOQ files, reviewing AI-generated pricing, tracking quote statuses, and managing construction projects.

Multi-Brand Product Matching: The AI cross-references specifications across multiple brands to find the best-matching product for each BOQ line: not just the cheapest, but the most spec-accurate.

Export & Reporting: Priced BOQs are exported to Excel/PDF with full product details, brand references, unit prices, and totals, ready for procurement teams.
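As an illustration of the Data Normalization Pipeline item above, here is a minimal sketch of how scraped records might be standardized so that equivalent listings compare equal. The field names (`name`, `length`, `price`), the unit table, and the sample records are all hypothetical; the real schema is not described on this page.

```python
import re
from dataclasses import dataclass

# Hypothetical unit table: factor converting each unit to millimetres.
TO_MM = {"mm": 1.0, "cm": 10.0, "m": 1000.0, "in": 25.4}


@dataclass(frozen=True)
class Product:
    name: str
    length_mm: float
    price_sar: float


def parse_length_mm(raw: str) -> float:
    """Parse strings such as '3 m' or '3000mm' into millimetres."""
    match = re.fullmatch(r"\s*([\d.]+)\s*([a-zA-Z]+)\s*", raw)
    if not match or match.group(2).lower() not in TO_MM:
        raise ValueError(f"unrecognised length: {raw!r}")
    return float(match.group(1)) * TO_MM[match.group(2).lower()]


def parse_price(raw: str) -> float:
    """Drop currency labels and thousands separators, e.g. 'SAR 1,250.50'."""
    return float(re.sub(r"[^\d.]", "", raw))


def normalize(record: dict) -> Product:
    """Collapse whitespace, unify casing, and convert units and prices."""
    name = " ".join(record["name"].split()).title()
    return Product(name, parse_length_mm(record["length"]), parse_price(record["price"]))


# Two made-up listings of the same product from different sources.
raw_records = [
    {"name": "  pvc   conduit pipe ", "length": "3 m", "price": "SAR 1,250.50"},
    {"name": "PVC Conduit Pipe", "length": "3000mm", "price": "1250.5"},
]

# Frozen dataclasses are hashable, so exact duplicates collapse in a set.
unique = {normalize(r) for r in raw_records}
print(len(unique))
```

Normalizing before comparison means exact duplicates fall out for free; near-duplicates with genuinely different wording still need similarity matching.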

Results & Scale

19,699+ Verified Products in Knowledge Base

1B+ Manufacturer Tech Specs Indexed

<60s Full BOQ Priced by AI

99.9% Pricing Accuracy Rate

RAG: Self-Training AI Architecture

GCC: Saudi Arabia + Gulf Countries

Technical Challenges

Scraping dynamic manufacturer websites at scale

Many manufacturers use JavaScript-heavy product catalogs, rate limiting, and anti-bot measures. Solved with headless browser scraping (Playwright/Selenium), rotating proxies, and respectful rate-limiting with exponential backoff.
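The backoff part of this approach can be sketched with the standard library alone. This is an illustrative sketch, not the project's scraper: `TransientError`, `fetch_with_backoff`, and the default delays are assumptions, and the `fetch` callable stands in for whatever Playwright or Selenium call loads the page.

```python
import random
import time
from typing import Callable


class TransientError(Exception):
    """Hypothetical marker for retryable failures (e.g. HTTP 429 or 503)."""


def fetch_with_backoff(
    fetch: Callable[[], str],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch(), doubling the wait after each transient failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fetch()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads retries out so parallel workers do not sync up.
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")


# Simulated page load that is rate limited twice, then succeeds.
calls = {"n": 0}


def flaky_fetch() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("rate limited")
    return "<html>ok</html>"


waits: list[float] = []
page = fetch_with_backoff(flaky_fetch, sleep=waits.append)  # record waits instead of sleeping
print(page, len(waits))
```

Injecting `sleep` as a parameter keeps the example fast to run and makes the retry schedule easy to inspect.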

Data quality and deduplication across sources

The same product scraped from multiple sources often had inconsistent naming, different units, and spec variations. Built a fuzzy matching + embedding similarity pipeline to deduplicate and canonicalize product records before storage.
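A minimal sketch of how the two signals might be combined is shown below. The thresholds, the rule that either signal is enough, and the three-dimensional vectors are all assumptions made for illustration; `difflib.SequenceMatcher` stands in for whichever fuzzy matcher was actually used, and real embeddings would come from a model.

```python
import math
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Fuzzy string similarity in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    return dot / norm if norm else 0.0


def is_duplicate(a: dict, b: dict, name_thr: float = 0.85, emb_thr: float = 0.95) -> bool:
    """Treat records as duplicates if either signal is strong enough."""
    return (
        name_similarity(a["name"], b["name"]) >= name_thr
        or cosine(a["embedding"], b["embedding"]) >= emb_thr
    )


def deduplicate(records: list[dict]) -> list[dict]:
    """Greedy pass: the first record seen becomes the canonical one."""
    canonical: list[dict] = []
    for rec in records:
        if not any(is_duplicate(rec, kept) for kept in canonical):
            canonical.append(rec)
    return canonical


# Made-up records; the vectors are placeholders for real embeddings.
records = [
    {"name": "PVC Conduit 25mm", "embedding": [0.90, 0.10, 0.00]},
    {"name": "PVC conduit 25 mm", "embedding": [0.88, 0.12, 0.01]},
    {"name": "Conduit, rigid PVC, dia 25", "embedding": [0.90, 0.11, 0.00]},
    {"name": "Copper cable 16mm2", "embedding": [0.10, 0.90, 0.20]},
]

canonical = deduplicate(records)
print([r["name"] for r in canonical])
```

The second record is caught by name similarity and the third by embedding similarity, which is the case that string matching alone would miss. The greedy pass is quadratic, so a catalog of this size would likely need blocking or an approximate nearest-neighbour index in front of it.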

Keeping the model current without full retraining

Full model retraining on every price update would be prohibitively expensive. Solved using a RAG architecture: the retrieval layer updates its vector index incrementally for new or changed products, while the LLM component remains static, giving fresh pricing without retraining costs.
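One way to picture the incremental update is to fingerprint each product record and re-embed only those whose fingerprint changed since the last run. The class below is a sketch under that assumption; `IncrementalIndex`, `fingerprint`, and `toy_embed` are hypothetical names, and the in-memory dictionaries stand in for the real vector store.

```python
import hashlib
import json
from typing import Callable


def fingerprint(product: dict) -> str:
    """Stable content hash of a product record."""
    payload = json.dumps(product, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class IncrementalIndex:
    """In-memory stand-in for a vector index that supports partial updates."""

    def __init__(self, embed: Callable[[dict], list[float]]) -> None:
        self._embed = embed
        self._hashes: dict[str, str] = {}
        self.vectors: dict[str, list[float]] = {}

    def sync(self, products: dict[str, dict]) -> list[str]:
        """Re-embed only new or changed products and return their ids."""
        updated = []
        for pid, product in products.items():
            digest = fingerprint(product)
            if self._hashes.get(pid) != digest:
                self.vectors[pid] = self._embed(product)
                self._hashes[pid] = digest
                updated.append(pid)
        return updated


# Toy embedding function that records how often it is called.
embed_calls: list[str] = []


def toy_embed(product: dict) -> list[float]:
    embed_calls.append(product["name"])
    return [float(len(product["name"])), float(product["price_sar"])]


index = IncrementalIndex(toy_embed)
snapshot = {
    "p1": {"name": "PVC conduit 25 mm", "price_sar": 12.5},
    "p2": {"name": "Copper cable 16 mm2", "price_sar": 48.0},
}
first = index.sync(snapshot)       # initial load: both products are embedded
snapshot["p2"]["price_sar"] = 51.0
second = index.sync(snapshot)      # only the changed product is re-embedded
print(first, second, len(embed_calls))
```

The language model never appears in this loop, which is the point of the design: freshness comes from the index, not from retraining.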

Project Details

Status: ✓ Live

Category: AI / ML SaaS

Market: 🇸🇦 KSA + GCC

Timeline: 2024 – 2025

Role: Backend & AI Pipeline

Platform: qimta.com