Data Science · Machine Learning · Nashville, TN

Tianrun "Echo" Yu

MS Data Science candidate at Vanderbilt University — building machine learning systems that turn raw data into real-world decisions.

97% ML Accuracy · 4+ DS Projects · 3+ Yrs Experience
LinkedIn: linkedin.com/in/tianrun-echo-yu-373bb9273 · Email: echoytr@outlook.com · Phone: (+1) 206-637-5492
01

About

I'm a Data Science graduate student at Vanderbilt University's Data Science Institute with a mathematical foundation from the University of Washington (BA Mathematics). I thrive at the intersection of rigorous analysis and practical impact — turning complex datasets into systems that businesses can actually use.

From training MobileNetV3 classifiers on 1M+ field-service images to fine-tuning Whisper on LIGO gravitational-wave data, I've learned to move fast on ambiguous problems while keeping an eye on production readiness.

Outside of data, I bring the adaptability and grit from years of bartending and event service — skills that translate directly into stakeholder communication, tight deadlines, and cross-functional teamwork.

I'm currently open to full-time data science and ML engineering roles. Based in Nashville, TN — open to hybrid and remote opportunities.

Education
Vanderbilt University
Master of Science — Data Science
Aug 2024 – May 2026 · GPA 3.94
University of Washington
Bachelor of Arts — Mathematics
Sep 2020 – Jun 2024 · GPA 3.77
Open to Work
Nashville, TN · Remote · Hybrid
Full-time · Data Science & ML Engineering
02

Experience

May 2025 – Aug 2025
myTickets
Data Science Intern
  • Built a daily ticket-price forecasting pipeline (14-day horizon) reading 1M+ listings/day from MotherDuck, training per-event Ridge models and writing predictions to predictions_data for front-end use; scheduled as a Kaggle notebook.
  • Cut memory pressure by modeling at the event level (~100K events) and optimizing dtypes; smoothed outliers to stabilize predictions.
  • Prototyped a global SGDRegressor using section/row/venue/category features on Colab Pro+ (50GB RAM); documented limitations and next steps after poor initial accuracy.
  • Reduced search query latency from ~20s to 2–4s through SQL query optimization, speeding up customer lookups.
  • Led handoff: recommended new MotherDuck DB under admin ownership, transferred Kaggle/Colab secrets, and delivered full system diagram + model documentation.
Python · pandas · Ridge Regression · MotherDuck · SQL
Sep 2024 – Feb 2025
Vanderbilt DSI
LIGO Project — Data Scientist
  • Fine-tuned the Whisper audio transformer with DoRA (updating only 0.13% of parameters) to classify gravitational wave signals vs. noise in LIGO detector data.
  • Achieved 81% accuracy on simulated LIGO data; correctly classified 59/66 real binary black hole events from LIGO's O3 run.
  • Generated spectrograms and built classification heads for multi-class glitch identification across 9 LIGO noise artifact types.
PyTorch · Whisper · DoRA · Transformers · Signal Processing
Oct 2023 – Jan 2024
Accenture · Strategy & Consulting · Shanghai
Data Analytics Intern
  • Designed optimization models (Genetic Algorithm & Simulated Annealing) for Airbus warehouse operations, reducing pick-path distances by 100+ meters vs. baseline.
  • Simulated 20+ randomized pick-lists, generating optimal routes within a 3D shelf-space layout with defined start/end points; benchmarked GA vs. SA trade-offs (path quality vs. runtime).
  • Created client-facing materials on 5G-enabled Smart Health and Industrial IoT use cases; provided strategic recommendations for adapting smart manufacturing across European & American markets.
Genetic Algorithm · Simulated Annealing · Optimization · IIoT · 5G · Python
03

Projects

Jan 2026 – Apr 2026
FusionSite Smart Photo Verification
Vanderbilt DSI Capstone · FusionSite
Built an end-to-end service verification system combining GPS route validation and ML image classification for portable toilet service operations, processing 1M+ images from an Azure SQL database.
97% accuracy · MobileNetV3-Small · 0.9931 ROC-AUC
PyTorch · MobileNetV3 · Streamlit · OCR · Azure SQL · Haversine GPS
Jan 2025 – May 2025
SymTrain ML Capstone
Vanderbilt DSI · SymTrain
Built end-to-end win/loss and time-to-close prediction models from HubSpot CRM exports using Random Forest and histogram-based gradient boosting (HistGB), tuned with grid search and guarded by leakage controls. Delivered a Streamlit app with persona-based clustering.
Dockerized · Feature Store · Timezone-safe Pipeline
Python · Scikit-Learn · Streamlit · Docker · HubSpot
Fall 2024 – Spring 2025
Nissan Part Price Prediction
Vanderbilt DSI · Nissan North America
Developed an XGBoost model (R² = 0.92) to predict automotive part prices, improving supplier selection efficiency. Cleaned 13,000+ part records and identified ~$60K/year in labor savings.
92% accuracy · R² = 0.92 · $60K/yr savings
XGBoost · ETL · Streamlit · Machine Learning
Sep 2024 – Feb 2025
LIGO Gravitational Wave Detection
Vanderbilt DSI · LIGO Collaboration
Fine-tuned the Whisper audio transformer on LIGO detector data for binary classification of gravitational wave events vs. noise. Correctly classified 59/66 real black hole merger events from the O3 observing run.
81% accuracy (simulated data) · 59/66 Real Events · 9 Glitch Classes
Whisper · DoRA · PyTorch · Signal Processing
Oct 2023
Food Waste App · DubHacks
UW DubHacks Hackathon
Led a two-day hackathon sprint to develop a campus food waste reduction app with real-time leftover availability, push notifications for interested users, a Python backend, and a Figma prototype with a full working demo.
Built in 48 Hours · Full Demo + Figma Portfolio
Python · Figma · Database · Hackathon
+ More projects in progress
04

Skills

Machine Learning
  • PyTorch / TensorFlow
  • Scikit-Learn / XGBoost
  • Deep Learning / CNNs
  • Computer Vision / OCR
  • Optimization Algorithms
  • Google Vertex AI
Languages & Tools
  • Python / SQL / R
  • Java / MATLAB / LaTeX
  • Jupyter / Streamlit
  • Docker / Git
  • Excel / Tableau / Figma
  • Google Big Data
Data Engineering
  • MotherDuck / DuckDB
  • Azure SQL
  • MongoDB / Redis
  • ETL Pipeline Design
  • Data Visualization
  • SQL Query Optimization
Soft Skills
  • Analytical Thinking
  • Stakeholder Presentation
  • Cross-functional Collaboration
  • Clear Communication
  • Adaptability / Problem Solving
  • Chinese (Native) · English (Fluent)
05

Contact

"Let's turn your data into something that actually works."