
Hello, I'm Mohamed Hamed.
Software Engineer.

2+ years of hands-on experience building modern, performant, and reliable web applications.

About

I'm a software engineer with 2+ years of experience. I enjoy building reliable websites and apps, with a focus on robust, user-centric solutions.

I am also deeply interested in contributing to open-source projects and exploring new technologies that push the boundaries of the web.

Projects

SNCF DelayFlow — TGV Delay Analytics Pipeline

End-to-end ETL pipeline that processes TGV punctuality data (SNCF Open Data) from 3 sources (~18,000 records) with PySpark. Includes Spark SQL transformations, Parquet storage, a Random Forest Regressor (Scikit-learn) for delay prediction, and an interactive Streamlit/Plotly dashboard that diagnoses root causes across infrastructure, traffic, and rolling stock.

Python, PySpark, Spark SQL, Scikit-learn, Parquet, Streamlit, Plotly, SQLite

DataHub PFAS — Environmental Data Warehouse & BI

ETL pipeline over 929,452 measurements (CNRS Data Hub): filtering, unit conversion, and PubChem API enrichment. Star schema in PostgreSQL 15 + PostGIS with 4 materialised views and 8 quality checks. MongoDB 2dsphere index over ~36,000 GeoJSON polygons, a Flask API, a Leaflet.js map, and a Looker Studio connector.

Python, PostgreSQL, PostGIS, MongoDB, Flask, Leaflet.js, Looker Studio, PubChem API

Skills

  • Python
  • Java
  • SQL
  • Bash
  • Apache Spark
  • PySpark
  • Apache Airflow
  • dbt
  • Scikit-learn
  • PostgreSQL
  • PostGIS
  • MongoDB
  • MariaDB
  • SQLite
  • Tableau
  • Streamlit
  • Plotly
  • Looker Studio
  • AWS S3
  • Docker
  • Kubernetes
  • Terraform
  • GitLab CI/CD
  • GitHub Actions

Work Experience

Data Engineer (Apprenticeship)

LCL

Sep 2025 – Sep 2026

  • Contributed to the design and development of an ETL pipeline for credit subscription propensity scoring: multi-source customer data ingestion at scale with Apache Spark (Java), plus physical modelling and technical specifications for an ML prediction model.
  • Implemented data quality checks on ingested data (schema validation, duplicates, missing values), optimised SQL queries, orchestrated flows with Airflow, and built transformations with dbt.
  • Monitored DAGs in production, ran POCs on new data sources, wrote technical documentation, and took part in code reviews in an Agile team.
Python, Java, Apache Spark, PySpark, PostgreSQL, MongoDB, Airflow, dbt, Tableau, Docker, GitLab CI/CD

Full-Stack Developer

Software Savants

Nov 2023 – Aug 2024

  • Integrated a national health API: eligibility verification and management of reimbursement workflows triggered on submission.
  • Implemented the HL7 FHIR standard for medical data interoperability.
  • Set up CI/CD pipelines (GitLab) for automated deployments on short product cycles.
Python, Frappe, MariaDB, HL7 FHIR, REST API, GitLab CI/CD

Education

Master 2 in Big Data

Université Claude Bernard Lyon 1

Sep 2024 – Sep 2026

Big Data, Cloud Computing, Distributed Databases, Software Architecture, Virtualisation, Real-Time Web Applications, Security.

Bachelor's Degree in Computer Science

Université de Nouakchott

Jan 2021 – Jul 2023

Algorithms, Data Structures, Web Development, Databases.

Open Source Contributions

  • PMD

    3 merged PRs fixing issues in static analysis: two in Java (an UnusedLocalVariable false positive for pattern variables in braceless for-each loops, and a parser failure on switch expressions inside super() calls) and one in Apex. Released in PMD 7.20.0 and 7.21.0.

  • Checkstyle

    2 PRs under active review: a bug fix for a false positive in IndentationCheck triggered by new operator expressions (#18686), and an enhancement to surface the expected line separator in NewlineAtEndOfFileCheck violation messages (#18972).

  • WeasyPrint

    Added support for the CSS text-transform property in the HTML/CSS-to-PDF rendering engine (PR #2590, merged in v68.0).

Contact

Feel free to reach out for collaboration or just to say hi!

Email Me