Career
Principal Engineer (First Employee): daash.co
May 2021 - present (5 years)
5+ years building large-scale web data systems (1B+ signals/day, 60M+ requests/day). Founded and led data engineering and frontend teams, architecting end-to-end infrastructure from crawling to customer-facing product.
Key achievements:

Web Crawling Infrastructure
- Founded and led Data Acquisition team (5 engineers)
- Built distributed crawling system processing 1B+ web signals/day with 98% uptime over 4 years
- Wrote 50+ high-performance Scrapy crawlers (50k+ pages/min) with bot-evasion capabilities, including a broad-crawl system spanning 10k+ domains
- Architected serverless AWS infrastructure (Terraform, Docker, GitHub Actions CI/CD) with 1–3 minute deployments per crawler
- Designed custom ML pipeline (agglomerative clustering + KMeans) that reduced scrape volume by 80%, cutting costs while preserving informational value
- Developed statistical sampling methodology saving up to 90% of scrape volume, strengthening crawler defensibility

Data Platform & Analytics
- Founded and managed data engineering team (10 engineers)
- Architected bronze/silver/gold medallion ETL pipeline (Python, SQL, Athena/S3) in a GitHub monorepo, normalizing unstructured web data through precompute layers with full PyTest coverage
- Achieved 10–15 minute end-to-end pipeline loads with transparent, auditable data management (write-audit-publish framework)
- Built time-series models (Python pandas, operationalized in SQL) generating production sales estimates, validated against ground truth via MAE and Pearson correlation

Frontend & Product
- Led frontend team (2 engineers)
- Independently architected and shipped a production analytics dashboard in Next.js 15 (App Router) / React 19 (TypeScript, Tailwind, ShadCN, TanStack Query, Zustand) with a DAL/DTO architecture plus API routes, deployed on Vercel
- Implemented full CRUD (DynamoDB), federated data queries (Athena), and user auth (AWS Cognito), achieving sub-2s latency per user query
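The statistical sampling bullet above can be illustrated generically. This is a hedged sketch, not the production methodology: the page inventory, signal distribution, and 10% sampling rate are all invented. The idea is to scrape a random sample and extrapolate the total with a confidence interval:

```python
# Illustrative only: sample 10% of a domain's pages and extrapolate the
# total signal count instead of scraping everything. All numbers here
# are synthetic; the real methodology is not shown in this document.
import random
import statistics

random.seed(0)
pages = list(range(10_000))                          # full page inventory
signals = {p: random.randint(0, 5) for p in pages}   # synthetic per-page signal counts

sample = random.sample(pages, k=len(pages) // 10)    # "scrape" only 10% of pages
counts = [signals[p] for p in sample]

mean = statistics.fmean(counts)
se = statistics.stdev(counts) / len(counts) ** 0.5   # standard error of the mean
estimate = mean * len(pages)                         # extrapolated total signals
half_width = 1.96 * se * len(pages)                  # 95% CI half-width on the total
```

With uniform random sampling, the estimate's relative error shrinks roughly with the square root of the sample size, which is what makes a 90% volume cut defensible.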
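The write-audit-publish framework mentioned under Data Platform & Analytics follows a general three-step discipline. As a minimal file-based sketch (the production pipeline targets Athena/S3; these function names, checks, and paths are hypothetical): write to a staging location, audit before exposure, then publish atomically so readers never see a partial or failing load:

```python
# Hypothetical write-audit-publish sketch over a simple JSON "table".
# The audit checks and file layout are illustrative assumptions.
import json
import os
import tempfile

def audit(rows):
    # Example checks: non-empty load, no null primary keys.
    return len(rows) > 0 and all(r.get("id") is not None for r in rows)

def write_audit_publish(rows, table_path):
    staging = table_path + ".staging"
    with open(staging, "w") as f:
        json.dump(rows, f)                  # 1) write to staging only
    if not audit(rows):                     # 2) audit before exposure
        os.remove(staging)
        raise ValueError("audit failed; published table left unchanged")
    os.replace(staging, table_path)         # 3) atomic publish (rename)

path = os.path.join(tempfile.mkdtemp(), "gold.json")
write_audit_publish([{"id": 1, "sales": 42}], path)
```

A failed audit leaves the previously published data untouched, which is what makes the pipeline auditable and safe to rerun.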
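The sales-estimate validation bullet names MAE and Pearson correlation as its metrics. A minimal pure-Python sketch of those two metrics (the example numbers are illustrative, not production data):

```python
# Illustrative implementations of the two validation metrics named
# above: mean absolute error and Pearson correlation coefficient.
import math

def mae(est, truth):
    """Mean absolute error between estimates and ground truth."""
    return sum(abs(e - t) for e, t in zip(est, truth)) / len(est)

def pearson(x, y):
    """Pearson correlation coefficient between two series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

estimates = [105.0, 198.0, 310.0]   # model output (illustrative)
actuals = [100.0, 200.0, 300.0]     # ground truth (illustrative)
error = mae(estimates, actuals)     # average miss in the same units as sales
corr = pearson(estimates, actuals)  # 1.0 means perfectly linear agreement
```

Using both together is the usual practice: MAE measures absolute accuracy in sales units, while Pearson correlation checks that the estimates track the direction and shape of the ground truth.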