◆ Program Reflection · IST 782 · Spring 2026

Looking Back to Move Forward

A reflection on completing the 4+1 accelerated M.S. Applied Data Science program at Syracuse University — what I built, what I learned, and the kind of data scientist I am becoming.

◆ Portfolio Presentation Video


Author: Mian Muhammad Abdul Hamid
Program: M.S. Applied Data Science
Concentrations: AI & Big Data
Path: 4+1 Accelerated
Graduation: May 2026
Institution: Syracuse iSchool

Where I Started: The 4+1 Decision

I did not arrive at the M.S. Applied Data Science program through the conventional route. I completed my B.S. in Information Management and Technology at Syracuse University's iSchool in May 2025 — with concentrations in Information Security, Project Management, and Data Analytics, a 3.7 major GPA, and four Dean's List recognitions — and enrolled immediately in the accelerated 4+1 path that allowed me to begin graduate coursework while still finishing my undergraduate degree.

That decision was deliberate. I had spent four years building a foundation in information systems, security, and project leadership, and I could see clearly that the next layer of capability I needed was in applied data science and machine learning. The 4+1 program was not the easy path — it meant compressing what many students spread across two years into a single intensive year, often carrying graduate-level coursework simultaneously with undergraduate capstone obligations. But it also meant that by May 2026, I would hold both degrees from one of the country's strongest information schools, with a skill set that bridges IT systems, security thinking, and data science practice.

Looking back, the intensity of the accelerated format turned out to be one of its greatest strengths. There was no gradual ramp-up, no semester to settle in before the real work began. I had to operate at graduate level immediately, bringing my undergraduate foundation with me and building on it in real time. That pressure was productive. It forced me to integrate knowledge across domains rather than compartmentalize it, and it accelerated the development of judgment — knowing when to apply which tool, which questions to ask before building a model, and how to manage time and scope when both are limited.

The 4+1 program taught me something no single course could: how to hold complexity across multiple domains simultaneously, and how to move fast without cutting corners on rigor.

What I Expected — And What Actually Happened

When I enrolled in the graduate program, my expectations were specific: deepen my Python skills, understand machine learning at a level beyond introductory exposure, and learn how to work with large-scale, real-world datasets in ways that produced actionable results. Those expectations were met. But the program gave me considerably more than I had anticipated.

My undergraduate concentrations in information security and project management had trained me to think in terms of systems, risk, and process. I knew how to identify vulnerabilities, manage stakeholders, and document procedures. What I did not yet know — what the graduate program systematically built — was how to translate large, messy datasets into insight that actually changes decisions. That translation process turned out to be far more nuanced than I expected. It is not just about running the right model. It is about understanding what question you are actually trying to answer, who needs the answer, what they will do with it, and what the costs of different kinds of errors are.

The biggest surprise was the degree to which communication is a core technical skill in data science. I expected the hard work to be in the modeling and the code. In practice, some of the most demanding and professionally consequential work was in explaining findings to audiences who did not share my technical background — and doing so in a way that was both accurate and genuinely persuasive. Every project in this portfolio forced me to practice that translation, and I am a better data scientist for it.

The second surprise was how much ethical reasoning is embedded in what look like purely technical decisions. My security background had already sensitized me to questions of access, privacy, and risk. But the graduate program extended that sensibility into machine learning specifically. When you build a classifier that affects whether a restaurant gets inspected more frequently, or a predictive model that influences resource allocation across neighborhoods, the choices you make about features, thresholds, and evaluation metrics are ethical choices. They are not just technical ones. Recognizing that — and designing accordingly — is something I now consider a professional responsibility.

The Projects That Defined My Education

Predicting NYC Restaurant Inspection Outcomes (IST 707 · Spring 2026)

My Applied Machine Learning course project was the most intellectually demanding work I completed in this program — and the most satisfying. The project proposes a predictive system that identifies NYC restaurants at high risk of receiving poor health inspection grades before their annual inspection, using historical violation patterns, cuisine type, borough location, and neighborhood-level context as input features. The central design argument is a shift from reactive, calendar-driven inspection scheduling to proactive, risk-based resource allocation.

What made this project technically interesting was the requirement I placed on myself to treat interpretability as a core design constraint rather than a nice-to-have. It is relatively straightforward to build a classifier that performs well on held-out data. It is considerably harder to build one whose predictions can be explained to a restaurant owner who wants to understand why they were flagged, or to a health department official who needs to justify prioritizing one neighborhood's establishments over another. SHAP values, feature importance analysis, and careful model selection were all shaped by that interpretability requirement, and working through those tradeoffs taught me more about machine learning in practice than any textbook treatment of the algorithms.
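The interpretability workflow described above can be sketched in miniature. The example below is illustrative only: it uses scikit-learn's permutation importance as a model-agnostic stand-in for the SHAP analysis, and the feature names and synthetic data are assumptions, not the project's actual pipeline.

```python
# Illustrative sketch: ranking features by permutation importance, a
# model-agnostic interpretability technique in the same spirit as SHAP.
# Features and data are synthetic; names are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([
    rng.poisson(2, n),       # prior_violations (hypothetical)
    rng.integers(0, 5, n),   # borough index (hypothetical)
    rng.integers(0, 10, n),  # cuisine index (hypothetical)
])
# By construction, the outcome depends mostly on prior violations.
y = (X[:, 0] + rng.normal(0, 1, n) > 2.5).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# Permutation importance: how much does test accuracy drop when each
# feature is shuffled? A large drop means the model relies on it.
result = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
names = ["prior_violations", "borough", "cuisine"]
for name, imp in sorted(zip(names, result.importances_mean), key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```

A ranking like this is the starting point for the explanation an inspector or restaurant owner would actually receive: not a probability alone, but which factors drove it.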

The project also crystallized my understanding of what "actionable insight" actually means in an applied context. A model that predicts outcomes is only useful if someone can act on those predictions. That means designing the output for the stakeholder, not for the data scientist. The shift from a reactive inspection model to a proactive one is a policy argument as much as a technical one, and I had to be able to make both arguments coherently.

Two Decades of Student Aid: Is NY's TAP Keeping Up? (IST 737 · Spring 2026)

The TAP financial aid visualization project was where I came to understand what it means to design data communication for an audience that is not composed of data scientists. New York State's Tuition Assistance Program is the state's largest financial aid program, and the question of whether it has kept pace with rising tuition costs over two decades has direct implications for students across the state — including students like me. I attended Syracuse University as an Our Time Has Come Scholar, which gave me a particular investment in understanding how financial aid systems work and, critically, who they are failing to serve.

Building the Tableau dashboard required continuous attention to the person on the other side of the screen: a student's parent trying to understand eligibility, a policy advocate looking for evidence of equity gaps, a legislative staffer preparing testimony. Every design decision — which comparisons to highlight, how to use color to signal disparity, where to place annotations that make the story legible — was made in service of that audience's understanding, not my own satisfaction with the visualization.

Analyzing Hazardous Cosmetic Chemical Disclosures in California (IST 652 · Fall 2025)

This was my first fully independent large-scale data project at the graduate level, and it was where I first experienced the complete data science life cycle from start to finish. The California Safe Cosmetics Program requires manufacturers to disclose ingredients known or suspected to cause cancer, birth defects, or reproductive harm — producing a dataset of over 114,000 records spanning 13 years of regulatory reporting.

The cleaning process alone was a substantial technical undertaking: removing more than 70,000 duplicate and missing records, engineering product-chemical relationship features, standardizing company names and category labels across 13 years of inconsistent data entry, and reducing 22 raw features to 13 key analytical variables. This was not glamorous work. But it was foundational, and it taught me that the quality of any analysis is almost entirely determined by the quality of the cleaning that precedes it.
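The shape of that cleaning work can be illustrated in pandas. The column names, standardization rules, and toy records below are hypothetical stand-ins, not the project's actual schema or logic.

```python
# Illustrative pandas cleaning sketch: drop missing records, standardize
# company names, and deduplicate. All column names and rules are hypothetical.
import pandas as pd

raw = pd.DataFrame({
    "company": ["Acme Corp.", "ACME CORP", "Beta Labs", "Beta Labs", None],
    "product": ["Lipstick A", "Lipstick A", "Shampoo B", "Shampoo B", "Cream C"],
    "chemical": ["Titanium dioxide"] * 2 + ["Toluene"] * 2 + [None],
    "report_year": [2012, 2012, 2015, 2015, 2019],
})

cleaned = (
    raw.dropna(subset=["company", "chemical"])   # remove missing records
       .assign(company=lambda d: d["company"]    # standardize name variants
               .str.upper()
               .str.replace(r"[.,]", "", regex=True)
               .str.strip())
       .drop_duplicates(                         # remove exact duplicates
           subset=["company", "product", "chemical", "report_year"]))

print(cleaned)
print(len(raw) - len(cleaned), "records removed")
```

The real pipeline applied the same pattern, standardize, then deduplicate, then prune, across 114,000+ records and 13 years of inconsistent entry.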

The key finding — that products containing higher numbers of flagged chemicals were significantly more likely to be discontinued — was analytically interesting, but what stayed with me was the broader implication. Regulatory transparency creates a record. That record is only useful if someone reads it and acts on it. Most consumers have no awareness that the products they apply daily contain ingredients flagged for reproductive harm. The California Safe Cosmetics Program makes the data available. What I did was make the patterns in that data visible.

Earlier Projects

Predictive Energy Usage Modeling (Dec 2024)

Before the three portfolio projects above, one of the most technically rich projects I completed was a predictive energy demand modeling system built in R. I developed and compared regression, random forest, and SVM models to forecast residential energy demand across counties, with the SVM performing best at R² = 0.633 and MAE = 0.240. I engineered time-series and weather-adjusted features to support grid demand planning, and built an interactive R Shiny dashboard enabling real-time scenario testing — including simulations of +5°C climate-change conditions across service territories.
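The model-comparison workflow looked roughly like the following. The original work was in R; this sketch mirrors the same compare-by-R²-and-MAE pattern in scikit-learn on synthetic data, so the features, models, and numbers are illustrative only.

```python
# Illustrative model-comparison sketch on synthetic demand data
# (the original project used R; features here are hypothetical).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

rng = np.random.default_rng(1)
n = 500
temp = rng.normal(15, 8, n)      # hypothetical temperature feature
hour = rng.integers(0, 24, n)    # hypothetical time-of-day feature
# Demand rises as temperature departs from a comfort point (~18 C).
demand = 0.05 * (temp - 18) ** 2 + 0.1 * hour + rng.normal(0, 1, n)
X = np.column_stack([temp, hour])

X_tr, X_te, y_tr, y_te = train_test_split(X, demand, random_state=1)
models = {
    "linear": LinearRegression(),
    "random_forest": RandomForestRegressor(random_state=1),
    "svr": SVR(),
}
scores = {}
for name, m in models.items():
    pred = m.fit(X_tr, y_tr).predict(X_te)
    scores[name] = (r2_score(y_te, pred), mean_absolute_error(y_te, pred))
    print(f"{name}: R2={scores[name][0]:.3f}, MAE={scores[name][1]:.3f}")
```

Because the synthetic relationship is nonlinear in temperature, the nonlinear models outperform plain linear regression here, the same kind of gap that led the project to favor the SVM.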

This project pointed me clearly toward the kind of work I want to do after graduation. The intersection of data science and energy infrastructure — understanding how demand patterns shift seasonally and climatically, and how predictive modeling can inform grid planning — is directly aligned with my professional goals. It connected organically to my NYCHA electricity consumption analysis, where I worked with 15 years of real New York City public housing billing data (258,000+ meter records), identifying borough-level cost inequalities and infrastructure stress points.

NBA Salary Value Prediction (May 2025)

My undergraduate capstone built a multiple linear regression model in PySpark to predict NBA player salaries based on performance metrics including three-point makes, points per game, and minutes per game. After cleaning and merging two cross-source datasets covering 100+ players, the model identified players underpaid by up to $11.4 million and overpaid by up to $7.9 million. This project was where I first worked at meaningful scale with a distributed computing framework, and where I developed confidence in the end-to-end pipeline from data acquisition to model interpretation.
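The under/overpaid logic reduces to a residual analysis: compare each player's actual salary to the model's prediction. The sketch below shows that idea with scikit-learn on synthetic numbers; the original ran as multiple linear regression in PySpark, and nothing here reflects real player data.

```python
# Illustrative residual-analysis sketch: players whose actual salary falls
# furthest below the model's prediction are flagged as "underpaid".
# Data is synthetic; the original project used PySpark on real data.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
n = 120
ppg = rng.uniform(5, 30, n)    # points per game (synthetic)
mpg = rng.uniform(10, 38, n)   # minutes per game (synthetic)
salary = 0.8 * ppg + 0.3 * mpg + rng.normal(0, 2, n)  # synthetic $M

X = np.column_stack([ppg, mpg])
model = LinearRegression().fit(X, salary)
residual = salary - model.predict(X)  # negative => paid less than predicted

underpaid = np.argsort(residual)[:5]  # five most underpaid player indices
print("most underpaid indices:", underpaid)
```

The sign and magnitude of each residual translate directly into the "underpaid by up to $X" framing, which is what made the result legible to a non-technical audience.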

How I Demonstrate Each Program Learning Outcome

Outcome 1 — Collecting, Storing, and Accessing Data: Across my graduate projects, I have worked with public government datasets from the California Safe Cosmetics Program, New York State Open Data, and the NYC Department of Health and Mental Hygiene; cross-source sports datasets merged from NBA.com and Basketball Reference; and internal utility data simulations. In each case, data acquisition required navigating real-world messiness — inconsistent formats, missing records, and multi-source merges.

Outcome 2 — Creating Actionable Insight: Every project has been oriented toward a real stakeholder and a real decision. For restaurant inspections: health inspectors allocating limited resources. For TAP: policymakers needing evidence of equity gaps. For cosmetic chemicals: consumers and regulators understanding what disclosure data reveals. For energy modeling: utility planners needing climate-adjusted demand forecasts. The common thread is consistently asking not just what the data says, but who needs to know it and what they should do differently.

Outcome 3 — Visualization and Predictive Models: Visualization and modeling are not separate skills — they are two tools in service of the same goal: making data understandable and actionable. A predictive model that cannot be explained is limited in its usefulness; a visualization not grounded in rigorous analysis is misleading. The projects in this portfolio hold both standards simultaneously.

Outcome 4 — Python and R Programming: Python is my primary language, used for the full pipeline from data acquisition through modeling and visualization. R is my secondary language for statistical modeling, time-series analysis, and Shiny development. I also work with SQL, HTML, CSS, JavaScript, and PySpark. Beyond specific languages, the graduate program built programming judgment — knowing when a simple script will do and when a sophisticated pipeline is warranted.

Outcome 5 — Communicating Insights to Broad Audiences: My background in interfaith community leadership has given me unusual preparation here. As President of SAIL since 2021, I have led discussions across communities that do not share frameworks or vocabulary, which is the same problem data science communication poses. The skill in both cases is translating across difference: making something that is clear to you accessible to someone who does not share your context.

Outcome 6 — Ethics in Data Science: My undergraduate concentration in information security built a foundation in thinking about systems from an adversarial and ethical perspective. The graduate program extended that into machine learning. The features I choose, the threshold I set, the metric I optimize — these are all places where values are encoded, often invisibly. I approach these decisions with awareness that they matter beyond the model's test-set performance.

Leadership, Service, and the Data That Lives Between Numbers

As President of SAIL since August 2021, I have facilitated bi-weekly discussions among campus religious and spiritual communities, organized Ramadan iftars in collaboration with Hillel, the Center for Jewish Life, and the Muslim Students Association, and worked to build the institutional trust that makes cross-community collaboration possible. As Interfaith Engagement Coordinator at Hendricks Chapel, I led 20+ cross-functional initiatives and organized four recurring interfaith service events mobilizing 100+ students — using feedback loops and attendance data to inform improvements in service delivery.

Interfaith leadership and data science have more in common than it might appear. The core challenge in both is making meaning across difference — helping people with different frameworks find common ground, and making complex information accessible to people who do not share your vocabulary. The patience, the genuine curiosity, and the discipline of listening before explaining — these are not soft skills adjacent to data science. They are the skills that make data science useful in the real world.

My work in non-emergency medical transport through SU Ambulance added another dimension: operational precision under pressure. Conducting 100+ mobility-assistance transports, coordinating with hospitals and urgent care facilities, maintaining real-time communication with dispatch — this is a real-time logistics optimization problem with human consequences. It has kept me grounded in what it means for systems to work reliably when the stakes are not abstract.

Where I Am Going: Energy, Infrastructure, and What Comes Next

My professional direction has become increasingly clear. I want to work at the intersection of data science and energy infrastructure — specifically in the kind of large-scale utility and grid management context that organizations like National Grid operate in every day. Several of my peers work there, and my mentor has been with the company for six years. The NYCHA electricity consumption project, the energy demand modeling work in R, and my ongoing interest in infrastructure inequalities all point in the same direction.

What I bring to this domain is not just technical competence in machine learning and data analysis. It is a perspective shaped by information security — thinking about systems in terms of risk, resilience, and the consequences of failure — combined with a genuine commitment to equity in how infrastructure serves different communities. The finding from the NYCHA project that certain boroughs face systematically higher costs and infrastructure stress is not just analytically interesting. It is an operational and policy problem that data science can help address.

Beyond energy specifically, I intend to continue developing capabilities in natural language processing, and in designing end-to-end data projects that go beyond analysis to produce genuinely deployable solutions. The gap between "I built a model" and "I built something people actually use" is one I want to close.

What This Portfolio Represents

This portfolio is not a retrospective on finished work. It is a statement of capability at a specific moment in time — the moment of completing an accelerated path that compressed two degrees into five continuous years of academic development. The three projects collected here represent the range of what I have learned to do: machine learning and model interpretability; visualization designed for public-interest audiences; large-scale data cleaning and exploratory analysis; and ethical awareness applied consistently across all three.

What has stayed constant across all of this work is the question I bring to every dataset: who does this affect, and what should someone do differently because of what the data shows? Data is not neutral. What we do with it — how carefully we clean it, how honestly we analyze it, how clearly we communicate it, and how thoughtfully we consider who it affects — is a reflection of what we value as practitioners.

I am becoming a data scientist who not only knows how to work with data, but thinks seriously about how it affects the people and systems around it. That is the most important thing this program gave me — and the standard I intend to hold myself to in everything that comes next.

Word count: ~3,000 words  ·  Spring 2026  ·  IST 782 DS/AI Portfolio  ·  Syracuse University iSchool