
| General Information
The SMT Data Challenge is an advanced data competition where students analyze real-world player-tracking data. Projects are open-ended, emphasizing process, relevance, creativity, and communication rather than purely quantitative analysis. The Data Challenge has become a top recruiting ground for MLB teams: more than 25% of past participants have been hired by professional teams or sports companies.
IF YOU WANT TO STAND OUT AS A JOB CANDIDATE, THIS IS WHERE TO START!
| Eligibility
- The SMT Data Challenge is open to STUDENTS ONLY.
- Must be 18 years or older and enrolled for both Spring and Fall 2025 semesters.
- Students may participate individually or in teams (max: 4 students). There is also an option for individuals looking for team members.
| Project Guidelines & Submissions
PROJECT TOPIC
“Good” baseball is all about decision-making. Did events on the field go as planned? This year, we challenge you to use tracking data to infer and analyze player or team “intent.” Possible approaches include:
• Determining who should field a ball and assigning responsibility for misplays
• Evaluating when a fielder should make a play or not
• Analyzing whether a fielder or baserunner moves as intended
• Investigating what an advanced scout can deduce about a team’s strategy
MLB teams have asked to see:
• Effective use of data visualization: compelling graphics that advance storytelling
• Proficiency with “messy” data: effective cleaning, sound analysis, and justification of what is included or excluded
SUBMISSION REQUIREMENTS | Deadline August 1
- Technical Paper in PDF format (max: 2000 words)
- Analysis Code
- Results in CSV format
| Judging
Judges from academia, industry, journalism, and sports will evaluate submissions based on creativity, relevance, methodology, storytelling and communication to technical and non-technical audiences.
| Finalist Showcase
Three finalist teams will be invited to a Showcase Weekend at SMT headquarters in Durham, NC (Mid-Nov, date TBD) to present to a panel of judges.
Optional analysis demo/Q&A sessions:
- May 8 – Introduction
- May 22 – Analysis
- June 5 – Data Visualization
- July 3 – Technical Writing
Virtual office hours available: May 6 – July 31
| Key Dates
April 30: Registration Deadline
May 5: Earliest date data may be available
Aug 1: Submission Deadline
Oct 6: Finalists Announced
Mid-Nov: Finalist Presentations