Research Areas

👤

Person Re-Identification

🌍

Geo-Localization

📝

Scene Text Recognition

🖼️

Image Captioning

🧬

Mitosis Detection

🔬

Nuclei Instance Segmentation

📚

Semi-Supervised Learning

Person Re-Identification

Person Re-Identification (ReID) focuses on the task of matching individuals across multiple images or camera views. Despite significant advancements in deep learning, existing ReID models often struggle to generalize effectively beyond their training environments due to domain shifts, variations in lighting, camera angles, and occlusions.

My research is dedicated to addressing these challenges by developing models that emphasize robust generalization, making them more adaptable to real-world scenarios. This includes techniques such as improving feature representations, leveraging domain adaptation methods, and enhancing the robustness of models to unseen variations.

The ultimate goal is to bridge the gap between laboratory results and real-world applications, enabling practical deployment.
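The matching step at the core of ReID can be illustrated with a minimal sketch. It assumes each image has already been reduced to a feature vector by some model and ranks a gallery of candidates by cosine similarity to a query. The function names, identifiers, and toy vectors are hypothetical and are not taken from any specific model described here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def rank_gallery(query, gallery):
    """Return (identity, similarity) pairs sorted from best to worst match.

    `gallery` maps a hypothetical identity label to its feature vector.
    """
    scored = [(gid, cosine_similarity(query, emb)) for gid, emb in gallery.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional features; real ReID embeddings are much larger.
query = [0.9, 0.1, 0.0]
gallery = {
    "cam2_person_a": [0.8, 0.2, 0.1],
    "cam2_person_b": [0.0, 1.0, 0.0],
    "cam3_person_c": [0.1, 0.0, 0.9],
}
ranking = rank_gallery(query, gallery)
best_match = ranking[0][0]
```

In this toy setup the first entry of `ranking` is the gallery vector that points in nearly the same direction as the query. Domain shift shows up in practice as features from a new camera no longer ranking the correct identity first.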

Geo-Localization

Geo-localization is a critical area in computer vision that focuses on determining the geographical location of an image. My research aims to address the challenges in visual place recognition (VPR), including variations in lighting, weather, and seasonal changes, as well as the limitations of GPS in certain scenarios like autonomous navigation in space or disaster areas.

Traditional datasets often lack dense coverage or fail to represent the diverse real-world conditions effectively. I work on developing solutions that leverage high-resolution visual data paired with precise location metadata to bridge these gaps. My goal is to enhance the generalization and robustness of VPR models, enabling them to perform consistently under varying conditions.
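One common way to quantify how well a model localizes an image is to measure the great-circle distance between the predicted and true coordinates and report the fraction of images localized within a tolerance. The sketch below assumes predictions are given as latitude/longitude pairs; the function names, the 25 km tolerance, and the coordinates are illustrative assumptions, not an evaluation protocol taken from this research.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def fraction_within(predicted, actual, threshold_km):
    """Fraction of predictions that fall within `threshold_km` of the true location."""
    if not actual:
        return 0.0
    hits = sum(
        1
        for (plat, plon), (alat, alon) in zip(predicted, actual)
        if haversine_km(plat, plon, alat, alon) <= threshold_km
    )
    return hits / len(actual)


# Toy coordinates: the first prediction is a few metres off, the second is one degree off.
predicted = [(0.0, 0.0), (0.0, 1.0)]
actual = [(0.0, 0.0001), (0.0, 0.0)]
score = fraction_within(predicted, actual, threshold_km=25.0)
```

Reporting this fraction separately for different lighting, weather, or seasonal subsets is one way to expose the robustness gaps described above.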

Scene Text Recognition

Scene Text Recognition (STR) focuses on extracting textual information embedded within natural scene images, playing a vital role in applications like document analysis and automated translation systems. Despite its importance, STR remains a challenging task due to the diversity of text styles, orientations, and backgrounds in real-world images.

My work in STR aims to address several key challenges: improving the recognition of irregular and non-horizontal text arrangements, enhancing the handling of noisy, low-resolution images, and developing methods that generalize across languages. Current models often struggle with linguistic variations, diacritic characters, and diverse font styles, making them less applicable to non-English languages or real-world conditions.

To tackle these issues, my research emphasizes the creation of robust datasets and advanced models capable of handling multi-language text, diverse fonts, and environmental conditions. This includes exploring novel synthetic data generation techniques and innovative recognition architectures tailored for scene text's unique challenges.

Image Captioning

Image captioning involves generating descriptive text for images, enabling machines to interpret and describe visual data. This technology is crucial for applications such as assistive tools, autonomous systems, and image-based search engines.

My research focuses on three challenges: the lack of annotated datasets for underrepresented languages such as Turkish, noisy image contexts, and linguistic diversity. Using techniques such as vision transformers and text decoders, I aim to extend image captioning models to support Turkish alongside other languages.

By addressing these challenges, my work contributes to developing models that generate syntactically accurate and contextually meaningful captions, making them more versatile and applicable to diverse real-world scenarios.

Mitosis Detection

Mitosis detection is crucial in the field of digital pathology, as it plays a vital role in diagnosing and prognosing cancer. This task involves identifying mitotic cells within histopathological images, which is challenging due to the high variability in cell morphology, overlapping cells, and the presence of cells that mimic mitotic features.

My research focuses on addressing these challenges by developing robust and generalizable models. Key objectives include reducing false positives, tackling domain shift caused by variations in scanners, staining protocols, and tissue types, and ensuring high performance on out-of-domain datasets.

Utilizing advanced techniques like domain adaptation, deep learning architectures, and comprehensive datasets, I aim to enhance the accuracy and reliability of mitosis detection systems. These improvements are essential for integrating computer-aided diagnostic tools into clinical workflows, ultimately aiding pathologists in achieving faster and more consistent diagnoses.
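To make the false-positive objective concrete, the sketch below shows one way detections might be scored: each predicted cell centre is greedily matched to the nearest unmatched annotated centre within a pixel tolerance, and precision, recall, and F1 follow from the counts. The tolerance value, function names, and coordinates are hypothetical; actual benchmarks define their own matching rules.

```python
import math


def match_detections(predicted, ground_truth, max_distance=7.5):
    """Greedily match predicted centres to annotated centres.

    Returns (true_positives, false_positives, false_negatives).
    `max_distance` is an assumed pixel tolerance, not a benchmark value.
    """
    unmatched = list(ground_truth)
    true_positives = 0
    for px, py in predicted:
        best_index, best_distance = None, max_distance
        for index, (gx, gy) in enumerate(unmatched):
            distance = math.hypot(px - gx, py - gy)
            if distance <= best_distance:
                best_index, best_distance = index, distance
        if best_index is not None:
            unmatched.pop(best_index)  # each annotation can be matched only once
            true_positives += 1
    false_positives = len(predicted) - true_positives
    false_negatives = len(unmatched)
    return true_positives, false_positives, false_negatives


def f1_score(tp, fp, fn):
    """F1 computed directly from detection counts."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Toy example: two correct detections, one spurious detection, one missed cell.
predicted = [(10, 10), (50, 52), (200, 200)]
annotated = [(12, 11), (50, 50), (120, 80)]
tp, fp, fn = match_detections(predicted, annotated)
f1 = f1_score(tp, fp, fn)
```

Computing these counts per scanner or per tissue type, rather than only in aggregate, is one way to see where domain shift inflates false positives.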

Nuclei Instance Segmentation

Nuclei instance segmentation is a critical task in the analysis of histopathology images, enabling the accurate identification and quantification of individual cell nuclei. This process is essential for understanding tissue morphology and aiding in disease diagnosis, particularly in cancer studies.

My MSc thesis focused on addressing the challenges in nuclei instance segmentation, including the detection of closely clustered nuclei and handling domain variations across different datasets. The study proposed novel methodologies like HR-YOLO, optimized for small object detection, and YOLO-U, a fusion of semantic segmentation and object detection techniques.

This research introduced standardized training and testing protocols, ensuring fair comparisons of segmentation models across widely used datasets such as MoNuSeg, CoNSeP, and CPM17. These contributions aim to bridge the gap between algorithmic advancements and their practical applications in medical image analysis.

Semi-Supervised Learning

Semi-supervised learning (SSL) bridges the gap between supervised and unsupervised learning by leveraging both labeled and unlabeled data during training. This approach is particularly beneficial in scenarios where acquiring labeled data is expensive or time-consuming, while unlabeled data is abundant.

My research in SSL focuses on domain adaptation, tackling the challenge of training models that generalize effectively across diverse and unseen domains. This includes developing robust pseudo-labeling techniques and ensemble learning strategies to improve model reliability and performance in real-world applications.
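A basic form of pseudo-labeling combined with ensembling can be sketched as follows: the class probabilities from several models are averaged for each unlabeled sample, and only samples whose averaged confidence reaches a threshold receive a pseudo-label. This is a generic illustration under assumed names and an assumed threshold of 0.9, not the specific techniques developed in this research.

```python
def average_ensemble(member_probs):
    """Average per-sample class probabilities over several models.

    `member_probs[m][i][c]` is model m's probability of class c for sample i.
    """
    n_models = len(member_probs)
    n_samples = len(member_probs[0])
    averaged = []
    for i in range(n_samples):
        n_classes = len(member_probs[0][i])
        averaged.append(
            [sum(model[i][c] for model in member_probs) / n_models for c in range(n_classes)]
        )
    return averaged


def select_pseudo_labels(probabilities, threshold=0.9):
    """Return (sample_index, class_index) for samples with confident predictions."""
    selected = []
    for index, probs in enumerate(probabilities):
        confidence = max(probs)
        if confidence >= threshold:
            selected.append((index, probs.index(confidence)))
    return selected


# Toy predictions from two hypothetical models on three unlabeled samples.
model_a = [[0.98, 0.02], [0.60, 0.40], [0.05, 0.95]]
model_b = [[0.96, 0.04], [0.30, 0.70], [0.03, 0.97]]
averaged = average_ensemble([model_a, model_b])
pseudo_labels = select_pseudo_labels(averaged, threshold=0.9)
```

Here the second sample is left unlabeled because the two models disagree, so its averaged confidence stays below the threshold. Filtering out such samples is the main reason to combine ensembling with pseudo-labeling.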

Serdar YILDIZ
