Dimas Wahyu Saputro

Our Work

Jun 5, 2025
Assistant Intelligence Recruitment and Auto-checker (AIRA)

Agentic Chatbot AI Engineer Work Together
I developed AIRA, an AI-powered tool designed to streamline the hiring process for HR teams. Manually screening large numbers of CVs is a significant bottleneck, so I built AIRA to tackle it in two ways. First, it sifts through a bulk upload of CVs and recommends the top candidates based on the required skills. Second, it includes a document honesty checker that compares an application against its supporting documents and returns a validity score indicating how consistent they are. This dual functionality saves time, reduces manual effort, and helps HR make more accurate, data-driven decisions.
- Technologies Used: Gemini, LangChain, LangGraph, Pinecone, PyTorch, Docker, Streamlit
- View on GitHub
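
The candidate-recommendation step can be illustrated with a minimal sketch. It is not AIRA's actual implementation: it swaps the LLM and vector search for plain keyword matching, and the function names `skill_coverage` and `rank_candidates` are hypothetical. Each CV is scored by the fraction of required skills it mentions, and the highest-scoring CVs are returned.

```python
import re


def skill_coverage(cv_text: str, required_skills: list[str]) -> float:
    """Return the fraction of required skills mentioned in the CV text.

    Assumption: a skill counts as present if it appears as a whole word or
    phrase, case-insensitively. A real system would likely use embeddings.
    """
    if not required_skills:
        return 0.0
    text = cv_text.lower()
    hits = 0
    for skill in required_skills:
        # Lookarounds keep "java" from matching inside "javascript".
        pattern = r"(?<!\w)" + re.escape(skill.lower()) + r"(?!\w)"
        if re.search(pattern, text):
            hits += 1
    return hits / len(required_skills)


def rank_candidates(cvs: dict[str, str], required_skills: list[str], top_k: int = 3):
    """Return the top_k (name, score) pairs, best score first, ties by name."""
    scored = [(name, skill_coverage(text, required_skills)) for name, text in cvs.items()]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]
```

Keyword overlap is easy to audit but misses synonyms (for example "Postgres" versus "PostgreSQL"), which is one reason a production tool would lean on semantic search instead.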

Mar 20, 2025
Automated Data Talent Recommendation System

AI Engineer Chatbot Agentic AI Work Together Automation Competition
To solve the challenges of manual, slow, and potentially biased internal talent requests, I co-developed an intelligent recommendation system. This web-based application, built with Streamlit, uses an LLM to understand project requirements in natural language and automatically matches them against a database of available data talent. The system goes beyond simple matching by integrating with Microsoft Power Automate to automatically generate official request documents and send multi-channel notifications via Teams, Email, and WhatsApp. The solution significantly streamlined the workflow, cutting selection time by 75% and increasing matching effectiveness.
- Technologies Used: LLM, Streamlit, Microsoft Power Automate, Apilogy API, SharePoint

Jul 1, 2024
LaTeX Template for Data Science Theses

Solo Dev Academic Quest Research
I developed a comprehensive LaTeX template to help Data Science students at Institut Teknologi Sumatera streamline their thesis writing. This project solves the common problem of inconsistent formatting and allows students to focus on their research content rather than on document styling.
- Technologies Used: LaTeX, Overleaf, GitHub Actions
- View on GitHub

Jun 1, 2024
MBKM EvalAuto: Streamlined Mentor Evaluation System

Data Engineer Automation Python Solo Dev Bangkit Academy
To address a key pain point for mentors in the Kampus Merdeka (MBKM) program, I created MBKM EvalAuto. This automation tool eliminates the tedious and error-prone task of manually transferring student evaluation scores from Google Sheets to the official MBKM platform. By automating this process, the tool significantly reduces the time and effort required for mentors, while ensuring greater accuracy and consistency in the evaluation process. It was a practical solution to a real-world problem I observed, and it was adopted by over 200 mentors to streamline their workflow.
- Technologies Used: Python, Requests, Google Colab, Google Sheets API
- View on GitHub

Dec 31, 2023
End-to-End Ecommerce ELT Pipeline

Data Engineer Python ELT Work Together Data Modeling Data Visualization
For this project, I built a complete data infrastructure from the ground up, mirroring a real-world ecommerce environment. I managed diverse data sources and orchestrated an end-to-end ELT (Extract, Load, Transform) pipeline. The process involved extracting raw data, loading it into a PostgreSQL data warehouse, and then running automated transformations using Pandas and dbt. To ensure reliability, I orchestrated the entire workflow with Apache Airflow and even implemented a Telegram bot for real-time notifications on pipeline status. Finally, I brought the data to life by creating insightful dashboards in Metabase. The entire system was containerized using Docker and Docker Compose for easy deployment and scalability.
- Technologies Used: Apache Airflow, dbt, PostgreSQL, Metabase, Docker, Pandas, Telegram API
- View on GitHub
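
The load-then-transform ordering that defines ELT can be shown in a small, self-contained sketch. Everything here is an assumption for illustration: the in-memory SQLite database stands in for the PostgreSQL warehouse, the SQL statement stands in for the dbt and Pandas transformations, and the sample rows and function names are hypothetical. The last helper only builds a request URL for the Telegram Bot API `sendMessage` method; it does not send anything.

```python
import sqlite3
from urllib.parse import urlencode

# Hypothetical extracted records: (order_id, customer, amount as raw text).
RAW_ORDERS = [
    ("o1", "alice", "19.99"),
    ("o2", "bob", "5.00"),
    ("o3", "alice", "10.01"),
]


def run_elt(rows):
    """Load raw rows unchanged, then transform them inside the database."""
    conn = sqlite3.connect(":memory:")  # stand-in for the warehouse
    # Load: keep the data exactly as extracted (amounts are still text).
    conn.execute("CREATE TABLE raw_orders (order_id TEXT, customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)
    # Transform: cast and aggregate after loading, using the warehouse engine.
    conn.execute(
        """
        CREATE TABLE customer_revenue AS
        SELECT customer,
               ROUND(SUM(CAST(amount AS REAL)), 2) AS revenue,
               COUNT(*) AS orders
        FROM raw_orders
        GROUP BY customer
        """
    )
    result = conn.execute(
        "SELECT customer, revenue, orders FROM customer_revenue ORDER BY customer"
    ).fetchall()
    conn.close()
    return result


def telegram_send_url(token: str, chat_id: str, text: str) -> str:
    """Build (but do not call) a Bot API sendMessage URL for a status update."""
    query = urlencode({"chat_id": chat_id, "text": text})
    return f"https://api.telegram.org/bot{token}/sendMessage?{query}"
```

Keeping the raw table untouched means a transformation bug can be fixed and rerun without extracting the data again, which is the main practical argument for ELT over ETL.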

Nov 23, 2023
Using Fashion-MNIST Dataset for Decision Making

Work Together Academic Quest Data Analyst Deep Learning Research
This project aimed to find the optimal balance between performance and cost for an image classification task. Using the Fashion-MNIST dataset, I experimented with three different Convolutional Neural Network (CNN) architectures, analyzing the trade-offs between their test accuracy and training time. The analysis concluded that while more complex models yielded higher accuracy, a simpler model offered the best value. I also evaluated various cloud deployment options, ultimately recommending Streamlit Cloud as the most efficient choice for this use case.
- Technologies Used: Python, CNNs, Streamlit, Google Cloud

Nov 22, 2023
Student Satisfaction Survey on Campus Facilities

Work Together Academic Quest Data Analyst Research
I collaborated on a research project to evaluate student satisfaction with campus facilities at ITERA. We designed and implemented a survey using a two-stage cluster random sampling method to gather representative data from Data Science students. My work involved analyzing the collected data using statistical methods like the Kruskal-Wallis test and Spearman correlation to test hypotheses. We also performed text analysis on open-ended feedback, creating a word cloud that clearly visualized key concerns, such as a need for better classroom air conditioning. The final report provided the university with actionable, data-driven recommendations.
- Technologies Used: Statistical Analysis, Survey Sampling, R/Python, Text Mining
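
For readers unfamiliar with Spearman correlation, the following is a minimal standard-library sketch of the statistic, not the code used in the study. It ranks both variables (tied values share their average rank, which matters for Likert-style survey answers) and then computes the Pearson correlation of the ranks. The function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)
```

Because it works on ranks rather than raw values, the coefficient suits ordinal survey scales where the distance between "satisfied" and "very satisfied" is not meaningful.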

Aug 23, 2023
Text Analytics for Regulatory Alignment

Data Scientist App Work Together Document Analytics Machine Learning Competition
Navigating complex legal frameworks is a major challenge for businesses. I built this tool to analyze and visualize the alignment of legislation within Indonesia's investment sector. The application allows a user to upload multiple legal documents as PDFs and, using TF-IDF and cosine similarity, it calculates how closely related the documents are. The results are displayed in an interactive heatmap, providing an intuitive way to identify consistencies and discrepancies across the regulatory landscape.
- Technologies Used: Streamlit, Scikit-learn, Pandas, Plotly, PyPDF2
- View on GitHub
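
The TF-IDF and cosine-similarity step can be sketched with the standard library alone. The project itself lists Scikit-learn, so this is a stand-in rather than the tool's code: the tokenizer, the smoothed IDF formula, and the function names are assumptions chosen for illustration. The output is a square matrix of pairwise similarities, which is the kind of data a heatmap would display.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document."""
    counts_per_doc = [Counter(tokenize(doc)) for doc in docs]
    n_docs = len(docs)
    doc_freq = Counter()
    for counts in counts_per_doc:
        doc_freq.update(counts.keys())  # each document counts a term once
    # Smoothed IDF (an assumption): rare terms get higher weight, never zero.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in counts.items()} for counts in counts_per_doc]


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_matrix(docs):
    """Pairwise cosine similarity of the documents' TF-IDF vectors."""
    vectors = tfidf_vectors(docs)
    return [[cosine(a, b) for b in vectors] for a in vectors]
```

Documents that share no terms score 0.0 and identical documents score 1.0, so the matrix diagonal is always 1.0 for non-empty texts.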

Jun 29, 2023
EmoJournal: Speech Emotion Recognition for Mental Health

AI Engineer Work Together App Deep Learning Bangkit Academy
Recognizing the growing need for accessible mental health tools, I led the development of EmoJournal. This project explores how speech analysis can identify early warning signs of mental health challenges that might otherwise go unnoticed. As the Project Manager and a Machine Learning Engineer, I guided the team in building a system that analyzes the acoustic features of a user's voice to detect emotional states. We believe that by leveraging AI and machine learning, we can create more empathetic and proactive healthcare solutions.
- Technologies Used: TensorFlow, Python, Deep Learning, Speech Recognition APIs
- View Project Details, View on GitHub

Nov 5, 2022
COVID-19 Data Visualization in Indonesia

Solo Dev Data Visualization Academic Quest
During the height of the COVID-19 pandemic, I took on the challenge of making complex public health data accessible and understandable. Using a large dataset from Kaggle, I developed a series of interactive dashboards in Tableau to visualize the spread of the virus across Indonesia. The goal was to transform raw statistics into clear, eye-catching visuals (graphs, diagrams, and geo-mapping) that let viewers track trends and understand the situation at a glance.
- Technologies Used: Tableau
- View Visualization