© 2014 - Designed by M. Altıntaş

Intro

Hi! I am Mucahit

I received my Ph.D. from the Department of Computer and Informatics Engineering at Istanbul Technical University, where I also worked as a research assistant. During my Ph.D., I spent almost a year as a visiting researcher at the University of Manchester. I am a member of the ITU Natural Language Processing (NLP) Research Group and the Marie Curie Alumni Association. My research interests include natural language processing (NLP), large language models (LLMs), data mining, dependency parsing, active learning, document understanding and deep learning. I have been working as a senior AI specialist in a research and development center since November 2022.

Question & Answer

Why is there such a page?
Actually, I drafted this page in 2014, but because I couldn't decide what content to put on it, it remained one of the dusty files on my computer. From time to time, I needed such a page in order to share some things quickly, but I always postponed it because there was not enough time or it seemed like more trouble than it was worth. Fortunately, the page is live now. Although the content still needs enrichment, it is not bad in terms of design. I hope its content will evolve over time as well. I should also mention that, in terms of design, it is an organic product. Let's hope for the best.
Do the pictures on the page belong to you?
Yes, I can say that some of them belong to me. I want to use some of the pictures I took as backgrounds here. After all, we need to spread beautiful things ;)
Where is the music from?
I can say that it is music I like from all over the world.
What did you study for your PhD?
Dependency parsing, a syntactic analysis approach, aims to identify the relations between the words that form a sentence. It is used as a fundamental component that improves performance on many natural language processing (NLP) problems such as machine translation, information extraction, semantic analysis and question answering. While transition-based dependency parsing techniques offer a rich feature set and low computational cost, their parsing performance falls behind graph-based dependency parsing approaches owing to error propagation. On the other hand, the key problem with graph-based parsing is determining how to broaden the set of features in order to improve performance. Some previous studies have attempted to expand the feature scope of graph-based algorithms, but their proposed methods require sophisticated components or computationally expensive inference approaches. In this research, we provide an alternative technique for improving the graph-based dependency parser by equipping a bi-affine layer with augmented global and local features. Our experimental results show that the proposed enhancements increase dependency parsing performance. Furthermore, we outperformed the current state-of-the-art studies in both Turkish and English, attaining unlabeled (UAS) and labeled (LAS) attachment scores of 82.60% and 76.33% in Turkish, and 93.27% and 91.34% in English, respectively. A. Cüneyd Tantuğ, Ph.D., supervised this study. You can find more detail in our article.
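To make the two reported metrics concrete, here is a minimal Python sketch of how UAS and LAS are commonly computed: UAS counts the tokens whose predicted head is correct, and LAS additionally requires the correct relation label. The function name `attachment_scores`, the toy sentence and its labels are illustrative assumptions and are not taken from the study.

```python
from typing import List, Tuple

# Each token is described by (head_index, relation_label); head 0 is the artificial root.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages over a list of tokens."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted analyses must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty analysis")
    # UAS: only the head has to match.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS: both the head and the label have to match.
    full_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return 100.0 * head_hits / n, 100.0 * full_hits / n


# Hypothetical analysis of "She reads books quickly".
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (2, "advmod")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod"), (3, "advmod")]
uas, las = attachment_scores(gold, pred)
print(f"UAS={uas:.2f}% LAS={las:.2f}%")
```

In this toy case three of the four heads are correct, but only two tokens have both the correct head and the correct label, so LAS is lower than UAS.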
What did you study for your master's?
My master's thesis was on the automatic classification of support tickets using artificial intelligence techniques, and even on automatically replying to some of them. The loss or reduction of face-to-face interaction between customers and staff negatively affects customer satisfaction. In today's competitive world, organizations and companies need to improve their online support systems by increasing the quality of support services, responding quickly and minimizing procedural steps. Furthermore, better allocation and efficient management of valuable support resources are critically important for reducing costs. Many support systems in use do not meet these demands effectively. In this study, issue tickets are automatically assigned to the relevant person or unit in the support staff using an online active learning technique. We aim to minimize manual effort and human error while improving customer satisfaction and service quality. The results obtained enable better allocation and more effective use of valuable support resources. A. Cüneyd Tantuğ, Ph.D., supervised this study.
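As a rough illustration of the active learning idea, the sketch below decides whether a ticket can be assigned automatically or should be sent to a human for labelling. It uses margin-based uncertainty sampling, which is an assumption made for this example; the query strategy actually used in the thesis is not described here. The function name `route_ticket`, the threshold value and the unit names are hypothetical.

```python
from typing import Dict, Tuple


def route_ticket(probs: Dict[str, float], margin_threshold: float = 0.2) -> Tuple[str, bool]:
    """Return (best_unit, needs_human_label).

    probs maps each support unit to the classifier's probability for one ticket.
    The ticket is flagged for manual labelling when the gap between the two most
    likely units is smaller than margin_threshold (an assumed, tunable value).
    """
    if not probs:
        raise ValueError("at least one unit probability is required")
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    best_unit, best_p = ranked[0]
    second_p = ranked[1][1] if len(ranked) > 1 else 0.0
    return best_unit, (best_p - second_p) < margin_threshold


# A confident prediction is assigned automatically.
confident = route_ticket({"network": 0.90, "accounts": 0.07, "hardware": 0.03})
# An ambiguous prediction is sent to a human, and the new label can update the model.
ambiguous = route_ticket({"network": 0.45, "accounts": 0.40, "hardware": 0.15})
print(confident, ambiguous)
```

In an online setting, the labels collected for the ambiguous tickets would be fed back to the classifier incrementally, so that manual effort is spent only where the model is unsure.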
Could you explain your project on the estimation of HS codes of products in international trade?
The HS code (Harmonized System code) is a standardized numerical method for classifying traded goods. It was developed by the World Customs Organization (WCO) and is widely used for identifying items in international trade. Each HS code is usually six digits long; however, several nations add extra digits for further categorization. In this project, an assistant tool was developed using language models and vector databases to suggest product-specific HS codes for Türkiye.
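The six-digit structure can be sketched in a few lines of Python. In the Harmonized System, the first two digits identify the chapter, the first four the heading and the first six the subheading; anything beyond six digits is treated here as a national extension. The helper name `split_hs_code` and the sample code are illustrative assumptions and are not part of the project.

```python
from typing import Dict


def split_hs_code(code: str) -> Dict[str, str]:
    """Split an HS code into chapter, heading, subheading and national extension."""
    # Accept common written forms such as "0901.21" or "0901 21".
    cleaned = code.replace(".", "").replace(" ", "")
    if not (cleaned.isascii() and cleaned.isdigit()) or len(cleaned) < 6:
        raise ValueError("an HS code needs at least six digits")
    return {
        "chapter": cleaned[:2],
        "heading": cleaned[:4],
        "subheading": cleaned[:6],
        "national_extension": cleaned[6:],  # empty for a plain six-digit code
    }


# Illustrative codes: one at the international level, one with extra national digits.
international = split_hs_code("0901.21")
extended = split_hs_code("0901.21.00.00.00")
print(international)
print(extended)
```

A suggestion tool can use such a split to compare candidates level by level, for example checking whether two suggested codes at least agree on the chapter or heading.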
Could you tell me about your project on analysis of phone calls?
This project is designed to empower contact centers and customer service teams by enhancing their support capabilities. Using advanced speech-to-text and speaker diarization algorithms, the system transcribes calls, identifying and segmenting individual speakers for greater clarity and analysis. Key features include automated topic recognition, which detects the main subjects of conversation, and keyword alerting, which allows the system to flag pre-defined terms that might indicate customer sentiment, urgency or compliance issues. By leveraging natural language processing (NLP) and machine learning, this project not only helps agents respond more promptly but also facilitates post-call analysis, identifying areas for improvement. As a result, it contributes to improved customer satisfaction, optimized operational efficiency and the continuous upskilling of customer service agents.
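The keyword alerting step can be sketched as follows, assuming the speech-to-text and diarization stages have already produced a list of (speaker, text) segments. The function name `find_alerts`, the segment format and the keyword list are hypothetical; the production system may work quite differently.

```python
import re
from typing import Dict, List, Tuple

Segment = Tuple[str, str]  # (speaker, transcribed text)


def find_alerts(segments: List[Segment], keywords: List[str]) -> List[Dict[str, object]]:
    """Return one alert per (segment, keyword) match, using whole-word, case-insensitive search."""
    patterns = {
        kw: re.compile(r"\b" + re.escape(kw) + r"\b", re.IGNORECASE) for kw in keywords
    }
    alerts: List[Dict[str, object]] = []
    for index, (speaker, text) in enumerate(segments):
        for kw, pattern in patterns.items():
            if pattern.search(text):
                alerts.append({"segment": index, "speaker": speaker, "keyword": kw})
    return alerts


# Hypothetical diarized transcript and pre-defined alert terms.
segments = [
    ("agent", "Hello, how can I help you today?"),
    ("customer", "I want to cancel my subscription, this is urgent."),
]
alerts = find_alerts(segments, ["cancel", "urgent", "refund"])
print(alerts)
```

Keeping the speaker with each alert makes it possible to tell whether a flagged term was said by the customer or by the agent, which matters for compliance checks.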
What about the Document Layout Extraction System (DoLES)?
The automatic comprehension of a document's content is critical for accelerating official processes and minimizing error-prone human labor. This study employs a self-attention based multi-modal neural network design. Both the text content obtained by OCR (Optical Character Recognition) and the visual segments of the corresponding parts of the document are used to assign a label to each component (token, logo, signature, etc.) in the document. To create the training set, the open source Label Studio platform (https://labelstud.io/) is employed. The developed algorithm can accurately extract layouts with a 95% F1 score. This project is overseen by Türkiye's Ministry of Industry.
There is also a news tracker system among your projects. Could you explain it?
Keeping up with current legislation and news on a specific topic and communicating it to the relevant customers is a job that requires serious time and effort. This project aims to automate this job. Relevant content is accessed through web-scraping methods, and the relevance of the content is determined with NLP methods. Selenium, large language models, a Flask API and SMTP are used in this project.
Could you tell me about your project called 'AI Powered Document Management System'?
The artificial intelligence-supported archive management system we created can correctly identify over a hundred different types of documents with a 97% success rate. A document's class is determined using both optical character recognition (OCR) output and the document's visual image.
Do you apply reinforcement learning in your projects?
Yes. The error propagation problem is seen in many artificial intelligence solutions that involve sequential decisions. The transition-based dependency parsing method is subject to error propagation due to its greedy nature. Some recent studies have shown that reinforcement learning can be used to deal with error propagation. I apply it to transition-based dependency parsing methods. The goal of this research is to create solutions that are resistant to error propagation by utilizing reinforcement learning, an advanced machine learning method. A. Cüneyd Tantuğ, Ph.D., is the supervisor of this study.
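Error propagation in a greedy transition-based parser can be illustrated with a small Python sketch. It uses the arc-standard transition system, which is an assumption made for this example (the transition system used in the research is not stated here), and it replays fixed action sequences instead of a learned model. The function name `run_arc_standard` and the toy sentence are hypothetical.

```python
from typing import Dict, List


def run_arc_standard(n_words: int, actions: List[str]) -> Dict[int, int]:
    """Apply arc-standard actions and return {dependent: head}; index 0 is the root."""
    stack: List[int] = [0]
    buffer: List[int] = list(range(1, n_words + 1))
    heads: Dict[int, int] = {}
    for action in actions:
        if action == "SHIFT":
            if not buffer:
                raise ValueError("SHIFT needs a non-empty buffer")
            stack.append(buffer.pop(0))
        elif action == "LEFT-ARC":
            # The second item on the stack becomes a dependent of the top item.
            if len(stack) < 2 or stack[-2] == 0:
                raise ValueError("LEFT-ARC is not allowed in this state")
            dependent = stack.pop(-2)
            heads[dependent] = stack[-1]
        elif action == "RIGHT-ARC":
            # The top item becomes a dependent of the item below it.
            if len(stack) < 2:
                raise ValueError("RIGHT-ARC is not allowed in this state")
            dependent = stack.pop()
            heads[dependent] = stack[-1]
        else:
            raise ValueError(f"unknown action: {action}")
    return heads


# Toy sentence "She(1) reads(2) books(3)" with gold heads {1: 2, 2: 0, 3: 2}.
gold_actions = ["SHIFT", "SHIFT", "LEFT-ARC", "SHIFT", "RIGHT-ARC", "RIGHT-ARC"]
# The same sequence, except that the third decision is wrong.
faulty_actions = ["SHIFT", "SHIFT", "RIGHT-ARC", "SHIFT", "RIGHT-ARC", "RIGHT-ARC"]

gold_heads = run_arc_standard(3, gold_actions)
faulty_heads = run_arc_standard(3, faulty_actions)
wrong_words = sorted(w for w in gold_heads if faulty_heads[w] != gold_heads[w])
print(gold_heads, faulty_heads, wrong_words)
```

In this toy case a single wrong action changes the parser state, so the later actions, although unchanged, attach the remaining words to the wrong heads and all three words end up incorrect. Reinforcement learning is one way to train a parser to recover from such states rather than only following gold sequences.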
Could you tell me about your project on extracting the important keywords for a query in text?
Artificial intelligence approaches improve the quality of human life in a variety of ways. In this project, an assistant is created to find the keywords relevant to a supplied query within patient notes, in order to support physicians in the treatment process. The proposed approach can discover 87 percent of the relevant terms annotated by human experts in the NBME Clinical Patient Notes dataset.
What is the Extended Version of Iterative Averaging Baseline Correction Method?
Baseline correction is a pre-processing technique used to deal with data from sensors, such as breath flow volume. In this work, the iterative averaging approach for determining the baseline is extended for use with a developed wearable piezoelectric airflow transducer sensor. Yi Li, Ph.D., supervised this study.
You studied activity recognition with wearable sensors embedded in a textile structure. Could you tell me more about it?
In this study, all phases of activity recognition (design, data acquisition and model development) are handled. The essential requirements for designing a sensor-based activity recognition system (measurement frequency, number of subjects, sensor locations and sensor types) are explored. Data acquisition procedures and the collected data are introduced. Different feature domains (spectrogram, wavelet and time) and deep learning models (CNN, LSTM and hybrids of them) are examined. Recognition performance comparable to the state of the art is obtained. This study has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No 644268. Yi Li, Ph.D., and Senem Kurşun Bahadır, Ph.D., supervised this study.
