Dimas Wahyu Saputro

Our Work

  • Assistant Intelligence Recruitment and Auto-checker (AIRA)

    I developed AIRA, an AI-powered tool designed to streamline the hiring process for HR teams. Manually screening large numbers of CVs is a significant bottleneck, so I built AIRA to tackle it in two ways. First, it sifts through a bulk upload of CVs and recommends the top candidates based on the required skills. Second, it includes a document honesty checker that compares an application against its supporting documents and produces a validity score that indicates how consistent they are. This dual functionality saves time, reduces manual effort, and helps HR make more accurate, data-driven decisions.

    • Technologies Used: Gemini, LangChain, LangGraph, Pinecone, PyTorch, Docker, Streamlit
    • View on GitHub

  • Automated Data Talent Recommendation System

    To solve the challenges of manual, slow, and potentially biased internal talent requests, I co-developed an intelligent recommendation system. This web-based application, built with Streamlit, uses an LLM to understand project requirements written in natural language and automatically matches them against a database of available data talent. The system goes beyond simple matching by integrating with Microsoft Power Automate to generate official request documents and send multi-channel notifications via Teams, email, and WhatsApp. The solution streamlined the workflow, cutting selection time by 75% and increasing matching effectiveness.

    • Technologies Used: LLM, Streamlit, Microsoft Power Automate, Apilogy API, SharePoint

  • LaTeX Template for Data Science Theses

    I developed a comprehensive LaTeX template to help Data Science students at Institut Teknologi Sumatera streamline their thesis writing. This project solves the common problem of inconsistent formatting and allows students to focus on their research content rather than on document styling.


  • MBKM EvalAuto: Streamlined Mentor Evaluation System

    To address a key pain point for mentors in the Kampus Merdeka (MBKM) program, I created MBKM EvalAuto. This automation tool eliminates the tedious and error-prone task of manually transferring student evaluation scores from Google Sheets to the official MBKM platform. By automating this process, the tool significantly reduces the time and effort required for mentors, while ensuring greater accuracy and consistency in the evaluation process. It was a practical solution to a real-world problem I observed, and it was adopted by over 200 mentors to streamline their workflow.

    • Technologies Used: Python, Requests, Google Colab, Google Sheets API
    • View on GitHub

  • End-to-End Ecommerce ELT Pipeline

    For this project, I built a complete data infrastructure from the ground up, mirroring a real-world ecommerce environment. I managed diverse data sources and orchestrated an end-to-end ELT (Extract, Load, Transform) pipeline. The process involved extracting raw data, loading it into a PostgreSQL data warehouse, and then running automated transformations using Pandas and dbt. To ensure reliability, I orchestrated the entire workflow with Apache Airflow and even implemented a Telegram bot for real-time notifications on pipeline status. Finally, I brought the data to life by creating insightful dashboards in Metabase. The entire system was containerized using Docker and Docker Compose for easy deployment and scalability.

    • Technologies Used: Apache Airflow, dbt, PostgreSQL, Metabase, Docker, Pandas, Telegram API
    • View on GitHub
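
The extract, load, and transform order described above can be illustrated with a minimal sketch. This is not the project's code: SQLite (from the Python standard library) stands in for the PostgreSQL warehouse, a plain SQL statement stands in for the dbt and Pandas transformations, and the Airflow orchestration and Telegram notifications are left out. The table names (`raw_orders`, `customer_revenue`) and the sample rows are hypothetical.

```python
import sqlite3


def extract():
    """Extract: return raw order rows (hypothetical sample data, all strings)."""
    return [
        ("o-1", "alice", "2", "15.50"),
        ("o-2", "bob", "3", "10.25"),
        ("o-3", "alice", "1", "4.00"),
    ]


def load(conn, rows):
    """Load: store the raw rows untouched, so transformation happens in the warehouse."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders "
        "(order_id TEXT, customer TEXT, quantity TEXT, unit_price TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", rows)


def transform(conn):
    """Transform: cast the raw text columns and aggregate revenue per customer."""
    conn.execute("DROP TABLE IF EXISTS customer_revenue")
    conn.execute(
        """
        CREATE TABLE customer_revenue AS
        SELECT customer,
               SUM(CAST(quantity AS INTEGER) * CAST(unit_price AS REAL)) AS revenue
        FROM raw_orders
        GROUP BY customer
        """
    )


def run_pipeline(conn):
    """Run the three steps in ELT order and return the transformed rows."""
    load(conn, extract())
    transform(conn)
    conn.commit()
    query = "SELECT customer, revenue FROM customer_revenue ORDER BY customer"
    return conn.execute(query).fetchall()
```

Calling `run_pipeline(sqlite3.connect(":memory:"))` runs the whole flow against an in-memory database. Loading the raw rows before casting them is what distinguishes ELT from ETL: the warehouse keeps an untouched copy, and the transformation can be rerun without extracting again.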

  • Using Fashion-MNIST Dataset for Decision Making

    This project aimed to find the optimal balance between performance and cost for an image classification task. Using the Fashion-MNIST dataset, I experimented with three different Convolutional Neural Network (CNN) architectures, analyzing the trade-offs between their test accuracy and training time. The analysis concluded that while more complex models yielded higher accuracy, a simpler model offered the best value. I also evaluated various cloud deployment options, ultimately recommending Streamlit Cloud as the most efficient choice for this use case.

    • Technologies Used: Python, CNNs, Streamlit, Google Cloud

  • Student Satisfaction Survey on Campus Facilities

    I collaborated on a research project to evaluate student satisfaction with campus facilities at ITERA. We designed and ran a survey using a two-stage cluster random sampling method to gather representative data from Data Science students. My work involved analyzing the collected data with statistical methods such as the Kruskal-Wallis test and Spearman correlation to test our hypotheses. We also performed text analysis on the open-ended feedback, creating a word cloud that clearly visualized key concerns, such as the need for better classroom air conditioning. The final report provided the university with actionable, data-driven recommendations.

    • Technologies Used: Statistical Analysis, Survey Sampling, R/Python, Text Mining
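
For readers unfamiliar with the two rank-based methods named above, here is a minimal plain-Python sketch of both statistics. It is an illustration only, not the analysis code used in the study. The function names are hypothetical, the Kruskal-Wallis statistic is computed without a tie correction, and no p-values are derived.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic for several samples (no tie correction)."""
    if any(len(group) == 0 for group in groups):
        raise ValueError("every group needs at least one observation")
    pooled = [value for group in groups for value in group]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted_sum = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_sum += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n_total * (n_total + 1)) * weighted_sum - 3 * (n_total + 1)
```

Both statistics work on ranks rather than raw values, which is why they suit ordinal survey responses such as satisfaction scales.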

  • Text Analytics for Regulatory Alignment

    Navigating complex legal frameworks is a major challenge for businesses. I built this tool to analyze and visualize the alignment of legislation within Indonesia's investment sector. The application allows a user to upload multiple legal documents as PDFs and, using TF-IDF and cosine similarity, it calculates how closely related the documents are. The results are displayed in an interactive heatmap, providing an intuitive way to identify consistencies and discrepancies across the regulatory landscape.

    • Technologies Used: Streamlit, Scikit-learn, Pandas, Plotly, PyPDF2
    • View on GitHub
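
The TF-IDF and cosine similarity step described above can be sketched as follows. The project itself lists Scikit-learn for this. The version below is a plain-Python stand-in written for illustration, with a simple regex tokenizer and a smoothed IDF weight, and all function names are hypothetical. PDF text extraction and the heatmap are left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document, using a smoothed IDF."""
    token_lists = [tokenize(doc) for doc in documents]
    n_docs = len(token_lists)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * vec_b.get(term, 0.0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(weight * weight for weight in vec_a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(documents):
    """Pairwise cosine similarity between all documents, as a list of rows."""
    vectors = tfidf_vectors(documents)
    return [[cosine_similarity(a, b) for b in vectors] for a in vectors]
```

Passing a list of document texts to `similarity_matrix` returns a square, symmetric matrix with ones on the diagonal, which is the shape a heatmap expects. Documents that share no terms score zero.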

  • EmoJournal: Speech Emotion Recognition for Mental Health

    Recognizing the growing need for accessible mental health tools, I led the development of EmoJournal. This project explores how speech analysis can identify early warning signs of mental health challenges that might otherwise go unnoticed. As the Project Manager and a Machine Learning Engineer, I guided the team in building a system that analyzes the acoustic features of a user's voice to detect emotional states. We believe that by leveraging AI and machine learning, we can create more empathetic and proactive healthcare solutions.


  • COVID-19 Data Visualization in Indonesia

    During the height of the COVID-19 pandemic, I took on the challenge of making complex public health data accessible and understandable. Using a large dataset from Kaggle, I developed a series of interactive dashboards in Tableau to visualize the spread of the virus across Indonesia. The goal was to transform raw statistics into clear, eye-catching visuals (graphs, diagrams, and geographic maps) that let viewers track trends and understand the situation at a glance.