Posts

Module 7: R Objects S3 vs. S4

For this assignment, I used the built-in dataset mtcars from R (so I didn’t have to download anything). First I loaded it and checked the first few rows to confirm it worked.
Step 1. Data
mtcars is a data frame (32 rows × 11 columns). Since it’s a standard R dataset, it already has a class, and many functions work with it out of the box.
Step 2. Can a generic function be assigned to this dataset? If not, why?
A generic function is a function that chooses which method to run based on the class of the object you pass in (like print(), summary(), or plot()). For mtcars, generic functions already work because mtcars has class "data.frame" (and also behaves like a list under the hood). For example, summary(mtcars) runs the summary.data.frame() method automatically. If I tried to use a generic function that has no method for a data frame, it would either fall back to a default method or, if no default exists, throw an error. That’s basically the “why not” case: the obj...
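The dispatch behavior described in this excerpt can be sketched in a few lines of R. This is a minimal illustration, not the code from the original assignment: the generic describe() and its two methods are hypothetical names invented for the example, while class(), is.list(), summary(), and UseMethod() are base R.

```r
# Load the built-in dataset and inspect how R sees it
data(mtcars)
cls <- class(mtcars)        # "data.frame"
is_list <- is.list(mtcars)  # TRUE: a data frame is a list of columns

# summary() is an S3 generic; for a data frame it dispatches
# to summary.data.frame()
s <- summary(mtcars)

# Hypothetical generic (illustration only, not part of base R)
describe <- function(x, ...) UseMethod("describe")

# Fallback used when no class-specific method exists
describe.default <- function(x, ...) {
  "no class-specific method, using the default"
}

# Method chosen automatically for objects of class "data.frame"
describe.data.frame <- function(x, ...) {
  sprintf("data frame with %d rows and %d columns", nrow(x), ncol(x))
}

print(describe(mtcars))  # "data frame with 32 rows and 11 columns"
print(describe(1:3))     # "no class-specific method, using the default"
```

If describe.default() were not defined, the call describe(1:3) would stop with a "no applicable method" error, which is the failure case mentioned above.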

Matrix Operations in R

In this assignment, I practiced basic matrix operations in R, including addition, subtraction, and creating special matrices using the diag() function. These skills are important for understanding linear algebra concepts in data analysis and statistics. The topics connect to the matrix operations discussed in The Art of R Programming and the coding practices recommended in R Packages.
Question 1: Consider A=matrix(c(2,0,1,3), ncol=2) and B=matrix(c(5,2,4,-1), ncol=2). a) Find A + B b) Find A - B
First, I created the two matrices. Matrix addition and subtraction are done element by element, as long as both matrices have the same dimensions.
Question 2: Use the diag() function to build a matrix of size 4 with the following values on the diagonal: 4, 1, 2, 3.
Next, I created a 4×4 matrix with values 4, 1, 2, and 3 on the diagonal. The diag() function places the given va...
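The two questions above can be worked through as follows. The matrix definitions come from the assignment text; the variable names sum_AB, diff_AB, and D are illustrative choices, and this sketch is not necessarily the code used in the original post.

```r
# Question 1: matrix() fills column by column, so
# A has rows (2, 1) and (0, 3); B has rows (5, 4) and (2, -1)
A <- matrix(c(2, 0, 1, 3), ncol = 2)
B <- matrix(c(5, 2, 4, -1), ncol = 2)

# Element-wise addition and subtraction (both matrices are 2 x 2)
sum_AB <- A + B    # rows: (7, 5) and (2, 2)
diff_AB <- A - B   # rows: (-3, -3) and (-2, 4)

# Question 2: a 4 x 4 matrix with 4, 1, 2, 3 on the diagonal
# and zeros everywhere else
D <- diag(c(4, 1, 2, 3))

print(sum_AB)
print(diff_AB)
print(D)
```

Because R fills matrices by column, printing A and B before combining them is a quick way to confirm the layout matches what the question intends.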

Student Loan Program Trends in the United States (2007–2025)

Written Report
This project examines how federal student loan programs in the United States have changed over time, with a focus on borrowing amounts, the number of student recipients, and the growing cost burden on individual borrowers. Although student debt is often discussed as a single issue, most people are not aware that federal loans come from different programs, each with its own history and trends. The goal of this analysis is to understand which programs are expanding, which are disappearing, and whether students today are receiving more borrowed dollars per person than they did in the past.
The idea for this project was influenced by several existing visual resources, including the Federal Student Aid dashboard, DataUSA’s educational finance reports, and journalist-designed college affordability graphics. These examples revealed how powerful visual storytelling can be in making complex financial data easier to understand. However, man...

Social Network Visualization Using Python

For this assignment, I created a simple undirected social network graph using Python, NetworkX, and Plotnine. The goal was to build a small network, convert it into a DataFrame, and visualize it using a ggplot2-style plotting library.
Process & Reflection
To start, I generated a random graph with 10 nodes using the gnp_random_graph() function in NetworkX. This part worked really well: NetworkX makes it extremely easy to build graph structures and assign attributes like labels. I labeled each node from A through J and used the spring layout algorithm to position everything. Converting the graph into two pandas DataFrames (one for nodes and one for edges) turned out to be helpful because Plotnine works best when the data is structured in a tidy format.
The visualization step was surprisingly smooth. Since Plotnine is based on ggplot2, the syntax felt familiar: I used geom_segment to draw the connections, geom_point for th...

Module 11: Dot-Dash Plot in lattice

For this week’s visualization, I recreated the dot-dash plot using the lattice package in R. This type of plot was mentioned in Dr. Piwek’s discussion of Edward Tufte’s design principles, which emphasize simplicity and data-focused graphics. My version uses the built-in mtcars dataset, plotting miles per gallon (mpg) against car weight (wt). Each point represents a car, while the small dashes (created with panel.rug()) along the x- and y-axes show the data distribution without needing full histograms.
The minimalist style follows Tufte’s “data-ink ratio” idea, using only the essential visual elements to highlight the relationship between variables. The clean background and limited color make it easy to focus on the pattern of the points rather than decorative elements. I like this design because it feels both simple and professional, showing how much you can communicate visually with very little clutter. ...
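A dot-dash plot like the one described can be sketched with lattice as below. Only the dataset, the mpg and wt variables, and the use of panel.rug() come from the post; the colors, point style, and axis labels are assumptions, so the result may differ from the original figure.

```r
library(lattice)

# Scatter plot of mpg against weight, with marginal rug dashes
# standing in for full histograms
p <- xyplot(
  mpg ~ wt,
  data = mtcars,
  xlab = "Weight (1000 lbs)",
  ylab = "Miles per gallon",
  panel = function(x, y, ...) {
    # Points: one per car (styling here is an assumption)
    panel.xyplot(x, y, pch = 16, col = "black", ...)
    # Dashes along both axes showing each variable's distribution
    panel.rug(x = x, y = y, col = "grey40")
  }
)

# lattice objects are drawn when printed
print(p)
```

Writing a custom panel function is what lets the points and the rug share one panel, which keeps the plot to the essentials.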

Module 10: Time Series and Visualization

For this module, I learned how to create and customize visualizations in RStudio using ggplot2, focusing on patterns that change over time. I worked with two datasets, the famous Nathan’s Hot Dog Eating Contest results from FlowingData and the built-in economics dataset in R, to practice turning numbers into something visual and meaningful.
Hot Dog Contest Data
I started with the hot dog eating contest data that tracked how many hot dogs the winner ate each year from 1980 to 2010. Using R, I made a basic bar chart that showed each year’s result, then highlighted the years where a new world record was set. The original version used gray and dark red bars, but I changed the colors and title to make it more my own.
In my customized version, I used a clean theme, blue and light gray colors, and angled the x-axis labels so they’re easier to read. The chart makes it obvious how eating records exploded in the early 2000s when Ta...

Multivariate Visualization Using mtcars

For this assignment, I used the built-in mtcars dataset in R because it includes several numerical and categorical variables that describe car performance. I wanted to explore how a car’s weight, horsepower, and number of cylinders affect its fuel efficiency (miles per gallon). In my visualization, the x-axis represents weight, the y-axis represents miles per gallon, color encodes cylinders, point size shows horsepower, and shape indicates transmission type (automatic or manual).
The plot reveals clear relationships: heavier cars tend to have lower fuel efficiency, and those with more cylinders and higher horsepower are the least efficient. Manual cars generally achieve slightly better mileage than automatics of similar weight. These overlapping patterns demonstrate how multivariate visualization can highlight trade-offs between multiple design factors that wouldn’t be visible in a single-variable plot. When designing the graph, I applied three of Antony Hortin’s ...