İlkay Şafak Baytar

Statistician

Specializing in data engineering, analytics, and machine learning

About Me

As a statistics graduate with a strong technical background in data engineering, data analytics, and machine learning, I specialize in large-scale data processing, ETL pipelines, data visualization, and modeling. I actively work with technologies such as SQL, Python, Apache NiFi, ClickHouse, Kafka, and Podman to develop solutions for data management and analytics. Passionate about enhancing data-driven decision-making processes and creating value for businesses, I continuously improve my expertise in data engineering and data science. In the long run, I aim to specialize as a Machine Learning Engineer or Data Scientist, focusing on deep learning, statistical modeling, and big data analytics.

Skills

SQL

Python

Apache NiFi

ClickHouse

Kafka

Podman

ETL

Data Visualization

Machine Learning

Statistical Modeling

Education

Hacettepe University

Ankara, Turkey

B.Sc. in Statistics; GPA: 2.83/4.00

Sep 2019 - Jun 2024

Developed a strong foundation in statistical analysis, probability theory, and data modeling
Gained practical experience in data analysis using R and Python
Completed coursework in machine learning, time series analysis, and statistical computing

Work Experience

Data Engineer Intern

Burgan Bank - Istanbul, Turkey

Sep 2024 - Present

Developed and maintained ETL pipelines using Apache NiFi
Designed and containerized data processing workflows using Podman & Docker
Optimized ClickHouse queries for large-scale analytics
Integrated Flask-based APIs with internal data infrastructure
Conducted data validation, schema enforcement, and transformation
Collaborated with cross-functional teams for data-driven insights

Apache NiFi

ClickHouse

Podman

Docker

Flask

SQL

Data Analytics Intern

Arçelik Global A.Ş. - Istanbul, Turkey

Nov 2023 - Mar 2024

Assisted in data preprocessing and ETL pipeline development
Developed interactive dashboards using Power BI
Conducted SQL-based data extraction and transformation
Supported data pipeline automation
Utilized Python (Pandas, NumPy) and R for data visualization and statistical analysis

Power BI

SQL

Python

Pandas

NumPy

Projects

Lung Cancer EDA and Prediction

Streamlit Web Application

06/2024

Conducted exploratory data analysis (EDA) on a lung cancer dataset, identifying key factors and their relationships.
Developed machine learning models to predict lung cancer risk, using Naive Bayes, clustering, and PCA.
Created a Streamlit web application for interactive data visualization and risk prediction, improving accessibility for users.

Python

Streamlit

Machine Learning

EDA

Data Analytics Challenge

KPMG

04/2023

Conducted a case study on opening a coffee shop, using socio-economic and rental cost data from Istanbul.
Provided strategic recommendations for coffee shop placement, contributing to business planning and decision-making.

Data Analytics

Geospatial Analysis

Business Strategy

Istanbul Solar Panel Data

Time Series Analysis

01/2023

Analyzed Istanbul solar panel data using seasonal decomposition, regression, and ARIMA models to forecast energy generation.
Applied data analysis and model development skills to enhance predictive accuracy and computational efficiency.

Time Series Analysis

ARIMA

Regression

Energy Forecasting

Data Visualization For COVID Data

Shiny Web Application

05/2022

Developed a web application for visualizing COVID-19 data, providing an interactive platform for data exploration and analysis.
Implemented four ggplot2 plots with variable selectors, along with data tables for different continents.

Shiny

ggplot2

Data Visualization

Get in Touch

Contact Information

ilkaybaytar@gmail.com

Istanbul, Turkey

GitHub