About
Who I am
I am a passionate Software Engineer and Machine Learning Developer with over 7 years of experience designing and building scalable software systems, intelligent web applications, and AI-powered solutions. I specialize in crafting robust full-stack applications using React, Next.js, Node.js, Laravel, and AWS, and developing machine learning models for predictive analytics, NLP, and computer vision.
Services
My Specializations
Software Engineering / Full-Stack Development
I build scalable, efficient, and secure software end to end, from front-end user interfaces to back-end systems and cloud infrastructure. With expertise in JavaScript, TypeScript, Node.js, Laravel, React, Vue.js, and AWS, I design and develop applications that combine strong architecture with great user experiences. Whether it’s a web platform, internal tool, or enterprise system, I focus on clean code, maintainability, and performance.
Machine Learning
I create intelligent systems that learn from data to deliver actionable insights and automation. My experience spans data preprocessing, model training, deployment, and integration, using frameworks like TensorFlow, PyTorch, and Scikit-Learn. From predictive analytics and computer vision to natural language processing, I aim to bridge software engineering with AI to solve real-world problems effectively.
Featured AI Projects
Featured Projects
Built a Retrieval-Augmented Generation (RAG) chatbot that answers NASA policy queries with 100% citation accuracy from 212 official documents.
Key Insights:
- Semantic search (OpenAI embeddings) outperforms keyword matching by 40% in relevance.
- gpt-4o-mini + ChromaDB delivers 2.8s avg latency and zero hallucinations via grounded retrieval.
- Page-level citations (e.g., N_PR_9420_001A.pdf#page=4) ensure auditability and trust.
- Conversational memory enables follow-up questions with full context.
Impact: Reduces policy lookup time from hours to seconds, supporting onboarding, compliance, and knowledge retention.
Tech: Python, LangChain, Streamlit, OpenAI, ChromaDB
NASA Policy Assistant: RAG-Powered
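The retrieval-with-citation idea behind the NASA assistant can be illustrated with a minimal sketch. It is not the project's code: a bag-of-words counter stands in for the OpenAI embeddings, a plain list stands in for ChromaDB, and the file names, chunk texts, and the helper names embed, cosine, and retrieve are all hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=1):
    """Return the k best-matching chunks, each paired with a page-level citation."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c["text"])), reverse=True)
    return [(c["text"], f'{c["source"]}#page={c["page"]}') for c in ranked[:k]]


# Hypothetical chunks for illustration only; not taken from the real document set.
chunks = [
    {"source": "example_policy_a.pdf", "page": 4,
     "text": "records must be retained for seven years"},
    {"source": "example_policy_b.pdf", "page": 2,
     "text": "travel requests require supervisor approval"},
]

top_text, citation = retrieve("how long are records retained", chunks)[0]
```

Because every chunk carries its source file and page number, the citation can be built from retrieval metadata instead of being generated by the language model.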
An exploratory data analysis project that uncovers hidden patterns in customer purchasing behavior using association rule mining and frequent itemset discovery.
Tech Stack: MLxtend, Pandas, NetworkX, Matplotlib, Plotly
Key Features: Apriori algorithm, association rules, network visualization
Business Impact: Optimizes product placement, cross-selling strategies, and inventory management
Top Cross-Selling Opportunities:
- Organic D'Anjou Pears → Organic Bananas (2.5x lift)
- 32% of pear buyers also purchase organic bananas
- Strategic placement recommendation: Co-locate these premium organic fruits
Strongest Purchase Patterns:
- Organic Fuji Apple → Regular Bananas (36.7% confidence)
- Cross-category association between organic and conventional produce
- Digital recommendation engine opportunity
High-Frequency Combinations:
- Organic Avocado → Banana (1.86% support)
- Affects ~60,000 monthly transactions when scaled
- Ideal for promotional bundling and inventory coordination
Retail Market Basket Analysis
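The support, confidence, and lift figures quoted for the basket analysis follow the standard association-rule definitions. A minimal plain-Python sketch of those definitions is shown here. The project itself lists MLxtend for this step; the helper rule_metrics and the toy baskets are hypothetical and are not the project's data.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Assumes both items appear in at least one basket; otherwise a division
    by zero would occur.
    """
    n = len(transactions)
    # Count baskets containing each side of the rule, and both together.
    n_a = sum(1 for t in transactions if antecedent in t)
    n_c = sum(1 for t in transactions if consequent in t)
    n_both = sum(1 for t in transactions if antecedent in t and consequent in t)

    support = n_both / n           # share of all baskets containing both items
    confidence = n_both / n_a      # share of antecedent baskets that also hold the consequent
    lift = confidence / (n_c / n)  # confidence relative to the consequent's base rate
    return support, confidence, lift


# Toy baskets for illustration only.
baskets = [
    {"pears", "bananas"},
    {"pears", "bananas", "milk"},
    {"pears", "milk"},
    {"bananas"},
    {"milk"},
    {"bread"},
    {"bread", "milk"},
    {"bananas", "bread"},
]

support, confidence, lift = rule_metrics(baskets, "pears", "bananas")
```

A lift above 1 means the two items appear together more often than their individual frequencies would predict, which is what makes a rule a candidate for co-location or bundling.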
A comprehensive real estate valuation system that accurately predicts California property prices using advanced feature engineering and ensemble machine learning methods. This production-ready solution demonstrates the application of data science to real-world business problems in the real estate sector.
Tech Stack: Scikit-learn, XGBoost, Feature-engine, Matplotlib, Streamlit
Key Features: Extensive feature engineering, hyperparameter tuning, interactive price calculator
Business Impact: Enables accurate property valuation for buyers, sellers, and investors
Top Predictive Features
- Median Income (25% impact): Strongest single predictor, with a 0.69 correlation to house prices
- Geographic Coordinates (34% combined impact): Latitude and longitude capturing location premium
- Room Characteristics (12% impact): Average rooms and room-to-bedroom ratios
- Coastal Proximity (8% impact): Distance-to-coast as premium location indicator
- House Age (10% impact): Non-linear relationship with property values
Advanced House Price Prediction Engine
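Two of the ideas in the price prediction write-up, a ratio feature and a correlation between a feature and the target, can be sketched in plain Python. This is only an illustration under assumed names: pearson and rooms_per_bedroom are hypothetical helpers, and the numbers are toy values, not the California housing data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rooms_per_bedroom(avg_rooms, avg_bedrooms):
    """Engineered ratio feature: rooms available per bedroom."""
    return avg_rooms / avg_bedrooms


# Toy values for illustration only (perfectly linear on purpose).
incomes = [2.0, 4.0, 6.0, 8.0]
prices = [1.0, 2.0, 3.0, 4.0]

r = pearson(incomes, prices)
ratio = rooms_per_bedroom(6.0, 1.5)
```

A correlation like this only measures linear association, which is one reason a tree-based model and engineered features can still add value for non-linear effects such as house age.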
An end-to-end machine learning solution that predicts customer attrition for telecom companies. This interactive tool helps businesses identify at-risk customers and implement proactive retention strategies.
- Tech Stack: Python, Scikit-learn, XGBoost, Pandas, Streamlit
- Key Features: Binary classification, feature importance analysis, probability scoring
- Business Impact: Reduces customer acquisition costs by identifying retention opportunities
Critical Risk Factors Identified
- Contract Type: Month-to-month customers have a 42% churn rate vs 3% for two-year contracts
- Payment Method: Electronic check users show a 45% churn rate, the highest among all payment methods
- Internet Service: Fiber optic customers churn at 42% vs 19% for DSL customers
- Tenure Impact: New customers (0-12 months) have a ~50% churn rate vs <10% for 5+ year customers
- Service Bundles: Customers with fewer than 2 additional services are 3x more likely to churn
Customer Churn Prediction Dashboard
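Segment-level churn rates like those in the risk factor list come from a simple group-and-divide calculation. A minimal sketch follows, assuming each customer record is a dict with a boolean "churned" field; the helper churn_rate_by and the sample records are hypothetical and are not the project's dataset.

```python
from collections import defaultdict


def churn_rate_by(records, key):
    """Return {segment value: fraction of customers in that segment who churned}."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in records:
        totals[row[key]] += 1
        if row["churned"]:
            churned[row[key]] += 1
    return {segment: churned[segment] / totals[segment] for segment in totals}


# Toy records for illustration only.
customers = [
    {"contract": "month-to-month", "churned": True},
    {"contract": "month-to-month", "churned": False},
    {"contract": "month-to-month", "churned": True},
    {"contract": "month-to-month", "churned": False},
    {"contract": "two-year", "churned": False},
    {"contract": "two-year", "churned": False},
]

rates = churn_rate_by(customers, "contract")
```

The same function works for any categorical column, such as payment method or internet service, by changing the key argument.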
portfolio
Featured Software Projects
contact
Let's Work Together!
hello@brightwilliamsboakye.com