👋

Hello, I'm Mohamed Hamed.
Software Engineer.

2+ years of hands-on experience in building modern, performant, and reliable web applications.

About

I'm a software engineer with 2+ years of experience. I enjoy building reliable websites and apps, and my focus is creating robust, user-centric solutions.

I am also deeply interested in contributing to open-source projects and exploring new technologies that push the boundaries of the web.

Projects

SNCF DelayFlow — TGV Delay Analytics Pipeline

End-to-end ETL pipeline processing TGV punctuality data (SNCF Open Data) from 3 sources (~18,000 records) with PySpark. Includes Spark SQL transformations, Parquet storage, a Random Forest Regressor (Scikit-learn) for delay prediction, and an interactive Streamlit/Plotly dashboard that diagnoses root causes across the infrastructure, traffic, and rolling stock dimensions.

Python, PySpark, Spark SQL, Scikit-learn, Parquet, Streamlit, Plotly, SQLite

GitHub

DataHub PFAS — Environmental Data Warehouse & BI

ETL pipeline over 929,452 measurements (CNRS Data Hub): filtering, unit conversion, and PubChem API enrichment. Star schema in PostgreSQL 15 + PostGIS with 4 materialised views and 8 quality checks. MongoDB collection with a 2dsphere index (~36,000 GeoJSON polygons), a Flask API, a Leaflet.js map, and a Looker Studio connector.

Python, PostgreSQL, PostGIS, MongoDB, Flask, Leaflet.js, Looker Studio, PubChem API

GitHub

Skills

Python
Java
SQL
Bash
Apache Spark
PySpark
Apache Airflow
dbt
Scikit-learn
PostgreSQL
PostGIS
MongoDB
MariaDB
SQLite
Tableau
Streamlit
Plotly
Looker Studio
AWS S3
Docker
Kubernetes
Terraform
GitLab CI/CD
GitHub Actions

Work Experience

Data Engineer (Apprenticeship)

LCL

Sep 2025 – Sep 2026

Contributed to the design and development of an ETL pipeline (credit subscription propensity scoring): multi-source customer data ingestion at scale via Apache Spark (Java), physical modelling and technical specifications for an ML prediction model.
Data quality checks on ingested data (schema validation, duplicates, missing values); SQL query optimisation; flow orchestration with Airflow and transformations with dbt.
DAG monitoring in production, POCs on new data sources, technical documentation, and code reviews in an Agile setting.

Python, Java, Apache Spark, PySpark, PostgreSQL, MongoDB, Airflow, dbt, Tableau, Docker, GitLab CI/CD

Full-Stack Developer

Software Savants

Nov 2023 – Aug 2024

Integration of a national health API: eligibility verification and management of reimbursement workflows triggered on submission.
Implementation of the HL7 FHIR standard for medical data interoperability.
Set up CI/CD pipelines (GitLab) for automated deployments on short product cycles.

Python, Frappe, MariaDB, HL7 FHIR, REST API, GitLab CI/CD

Education

Master 2 in Big Data

Université Claude Bernard Lyon 1

Sep 2024 – Sep 2026

Big Data, Cloud Computing, Distributed Databases, Software Architecture, Virtualisation, Real-Time Web Applications, Security.

Bachelor's Degree in Computer Science

Université de Nouakchott

Jan 2021 – Jul 2023

Algorithms, Data Structures, Web Development, Databases.

Open Source Contributions

PMD
3 merged PRs fixing false positives in static analysis rules — two in Java (UnusedLocalVariable false positive for pattern variables in braceless for-each, and a parser failure on switch expressions inside super() calls) and one in Apex. Released in PMD 7.20.0 and 7.21.0.
Checkstyle
2 PRs under active review: a bug fix for a false positive in IndentationCheck triggered by new operator expressions (#18686), and an enhancement to surface the expected line separator in NewlineAtEndOfFileCheck violation messages (#18972).
WeasyPrint
Added support for the CSS text-transform property in the HTML/CSS-to-PDF rendering engine (PR #2590, merged in v68.0).

Contact

Feel free to reach out for collaboration or just to say hi!

Email Me
