UH CRIME
An analysis of crime at the University of Houston
The Project
Using publicly available data, the group will collect and analyze data relating to crime on the UH campus over the past 4-5 years. All of the data will be sourced from records obtained through freedom of information requests filed by each of the three group members.
CAMPUS SAFETY
An overall analysis

Where the other two articles will be in-depth examinations of specific aspects of crime at the University of Houston, this piece will take a broader approach to the data in order to determine what overarching trends may be present. Factors such as victims' gender and ethnicity will be accounted for in order to determine whether certain groups are more at risk than others. In addition, the goal is to create an easy-to-understand visual display of this data, likely in the form of a campus map overlay detailing locations of relatively high crime, along with any relevant statistics derived from the data.

For our overall analysis of crime on campus, we requested a detailed report of all crimes committed in and around campus over a four-year period beginning in 2018 and ending in October 2022.
Thus far, this data has been sourced via a Texas Public Information Act (PIA) request to the information desk at the University of Houston. The data came in the form of over 100 PDFs, which will ultimately need to be scraped before they can be analyzed efficiently.
Considering the size and scope of the data collected, uploading it is not feasible at this time. However, once it has been cleaned and the relevant information extracted, the data will be compiled into an easy-to-understand graphic or chart.
A cursory exploration of the information collected tells us that this dataset will provide insight into crime frequency and its change over time, COVID-19’s impact on crime rates, and crime prevalence in specific areas of UH.
All things considered, this data was relatively easy to collect. After we made the PIA request, the data was compiled and sent within two weeks. The exchange with the public information officials was both cordial and efficient.
While the format it was provided in is less than ideal, the data can easily be refined into a usable format with commonly available digital tools.
The public information official who eventually fulfilled the request was Kelly Hill Wilson (khwilson@central.uh.edu).
Data collection
The data I received through my PIA request required extensive cleaning before it could be analyzed. It came as Excel sheets converted into PDF files, which made it much harder to load into a program for analysis. In addition, instead of supplying the crime data by month, the public information office sent over the weekly downloads of the 60-day crime report that is posted online.
This presented an initial challenge, as many of the PDFs overlapped in the dates they covered, requiring a thorough search through each one to weed out repeated dates. Ultimately, the easiest solution was to take the latest 60-day crime report for each month and enter the data into one spreadsheet for the entire year.
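The selection step can be sketched in a few lines of Python. This is only an illustration, not the process actually used: the function name and the download dates are hypothetical placeholders, and it assumes each PDF can be tagged with the date it was downloaded.

```python
from datetime import date

def latest_report_per_month(report_dates):
    """Keep only the most recent report date within each calendar month."""
    latest = {}
    for d in report_dates:
        key = (d.year, d.month)
        if key not in latest or d > latest[key]:
            latest[key] = d
    # Return the chosen dates in chronological order.
    return [latest[key] for key in sorted(latest)]

# Hypothetical weekly download dates (placeholders, not the real files).
weekly_downloads = [
    date(2021, 3, 1), date(2021, 3, 8), date(2021, 3, 29),
    date(2021, 4, 5), date(2021, 4, 26),
]
selected = latest_report_per_month(weekly_downloads)
```

Because each report covers 60 days, the reports kept for neighboring months still overlap, so repeated entries remain in the combined spreadsheet after this step.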
After this step came cleaning. Initially, the intent was to eliminate duplicates manually. However, after realizing that each report had a unique case number, I was able to use OpenRefine to filter out and delete duplicate reports automatically.
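The same idea, keeping the first row seen for each case number, can be sketched in plain Python. This is not the OpenRefine workflow itself; the column names and case numbers below are made-up placeholders.

```python
def drop_duplicate_cases(rows, key="case_number"):
    """Keep the first row seen for each case number and skip later repeats."""
    seen = set()
    unique = []
    for row in rows:
        case_id = row[key].strip()
        if case_id in seen:
            continue
        seen.add(case_id)
        unique.append(row)
    return unique

# Hypothetical rows; field names and values are placeholders.
rows = [
    {"case_number": "2021-0001", "offense": "Theft", "location": "Library"},
    {"case_number": "2021-0002", "offense": "Burglary", "location": "Garage"},
    {"case_number": "2021-0001", "offense": "Theft", "location": "Library"},
]
unique_rows = drop_duplicate_cases(rows)
```

Keying on the case number alone assumes that a repeated number always refers to the same report, which is what made the automatic approach possible here.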
With the duplicates eliminated, another issue quickly became apparent. Many crimes and the areas in which they were committed were recorded in slightly different formats, leading Excel to count them as separate crimes or areas. This required an extensive amount of clustering. However, after clustering the data by fingerprint and proximity, I could merge the majority of these variants and arrange the data legibly.
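For readers unfamiliar with fingerprint clustering, here is a simplified Python approximation of the idea: lowercase each label, strip punctuation, then sort and de-duplicate the words, so that variants of the same label collapse to one key. It is a sketch rather than OpenRefine's exact implementation, and the sample labels are invented.

```python
import string
from collections import defaultdict

def fingerprint(value):
    """Build a normalized key: trim, lowercase, drop punctuation, sort unique words."""
    cleaned = value.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)

def cluster(values):
    """Group the original labels that share the same fingerprint key."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return dict(groups)

# Invented labels for illustration only.
labels = ["Theft - Bicycle", "bicycle theft", "Bicycle Theft ", "Burglary of Vehicle"]
clusters = cluster(labels)
```

Each cluster can then be reviewed and merged under a single spelling. Misspellings do not share a fingerprint, which is where a proximity-based method helps.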

DORMS
For those who are living on campus, the realities of safety at UH


To properly assess safety on campus, I needed to learn how UNsafe it is. Since my focus is on-campus safety for those who live here at the University, I first wanted to examine daily crime to get an idea of what is "typical."
As you can see from the data above, 10 out of 11 rape victims were assaulted in campus housing. This data needs to be readily available to all students who are thinking about living on campus.
I am specifically interested in the crimes happening to students who live on campus, so I narrowed my search to residential spaces and heavily populated areas on campus.
I used data-scraping software to pull the data into a spreadsheet and then created PivotTables so I could narrow my search further and see only the applicable information.
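What a PivotTable does here, counting incidents by location and offense and then filtering to the rows of interest, can be sketched in Python as well. The field names, locations, and records below are hypothetical placeholders, not entries from the crime log.

```python
from collections import Counter

# Hypothetical records; real field names and values would come from the scraped log.
records = [
    {"location": "Residence Hall A", "offense": "Theft"},
    {"location": "Residence Hall A", "offense": "Theft"},
    {"location": "Residence Hall B", "offense": "Burglary"},
    {"location": "Parking Garage", "offense": "Theft"},
]

def pivot_counts(rows, row_field, col_field):
    """Count incidents for every (row_field, col_field) pair, like a PivotTable."""
    return Counter((r[row_field], r[col_field]) for r in rows)

def filter_rows(counts, keyword):
    """Keep only the pairs whose first field contains the keyword (case-insensitive)."""
    return {k: n for k, n in counts.items() if keyword.lower() in k[0].lower()}

counts = pivot_counts(records, "location", "offense")
housing = filter_rows(counts, "residence hall")
```

Filtering on a keyword only works if housing locations are labeled consistently, so the labels should be cleaned first.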
When requesting information, I asked for crime reports or complaints of crimes on campus. I then narrowed my request to reports made by students who live in campus housing.
I requested the crime logs from the University's Police Department, which were promptly given to me by the UH Public Information Official, Kelly Wilson. Mrs. Wilson also sent me a copy of the annual crime report for 2021-2022. I also reached out to the student housing office but have yet to receive a response.
Data Collection
The open-source data I used was the UHPD daily crime log and the data from the UH Annual Safety report.
The data came as a table on a webpage, which I had to scrape. I used a Google Chrome extension called Data Miner, in which I created my own "scrape recipe" to capture exactly the data I needed and exclude what I did not.
The information received through FOIA requests came in the form of a PDF. I used Tabula to scrape the data into a spreadsheet. I was then able to navigate through all the data in the Excel sheet by creating different tables and filters.
How does it compare?
UH compared to other, similar universities.

The University of Houston is one of many college campuses located in our diverse city. This project will require deep research into a few of the neighboring universities in order to compare crime data. We will be able to analyze the different crime rates while also learning which particular crimes are the most prevalent on college campuses around Houston. All universities included in this portion of the project are public universities, just like UH. By interviewing safety officials on different campuses, we aspire to positively impact students around the city. This work will help us determine whether there is an underlying issue that can be further explained.
Updated Blog via Darian Ellis:
It is my responsibility to compare crime data at UH to that of other universities here in the city. I would like to compare which crimes are most prevalent and when they are committed. I was able to access the Daily Crime Log for Texas Southern University on their website. It was provided as a PDF file, so it needed to be scraped. Tabula is the program I am most familiar with that allows scraping from a PDF. The data found here can help me answer the following questions:
- Where is the most crime taking place?
- Who is being targeted?
- What time and day is crime happening?
- Are cases being closed or remaining “active”?
I sent my initial FOIA request to the University of Houston-Downtown Police Department. I requested the crime logs for July 2018 through June 2022. I received the information from Police Corporal Tabitha Rivera, who manages records and evidence for the on-campus police department. She was prompt, and I received a response within two weeks. The data will help me compare crime at the downtown campus with crime at UH. I can combine the total number of crimes at the downtown campus with the size of the institution to compare crime rates and the percentage change each year.
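The arithmetic behind that comparison can be sketched as follows. The crime counts and enrollment figures are placeholders chosen for easy math, not actual numbers from either campus, and the per-1,000-students scale is an assumption.

```python
def rate_per_1000(crime_count, enrollment):
    """Crimes per 1,000 enrolled students, so campuses of different sizes compare fairly."""
    return crime_count * 1000 / enrollment

def percent_change(previous, current):
    """Year-over-year change as a percentage of the previous value."""
    return (current - previous) / previous * 100

# Placeholder figures for illustration only; not real data.
campus_years = {
    "year_1": {"crimes": 120, "enrollment": 15000},
    "year_2": {"crimes": 90, "enrollment": 15000},
}

rate_1 = rate_per_1000(**{"crime_count": campus_years["year_1"]["crimes"],
                          "enrollment": campus_years["year_1"]["enrollment"]})
rate_2 = rate_per_1000(campus_years["year_2"]["crimes"],
                       campus_years["year_2"]["enrollment"])
change = percent_change(rate_1, rate_2)
```

With these placeholder values, the rate falls from 8 to 6 crimes per 1,000 students, a 25 percent decrease. Comparing rates rather than raw totals keeps a larger campus from looking more dangerous simply because it has more students.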
This information can help us understand how campus location and population size affect student safety.
DATA COLLECTION
The open-source data that I used came from the daily crime log on Texas Southern University's website. Unfortunately, it is provided as a PDF file, which was a bit challenging to scrape and convert to a spreadsheet. When I tried Tabula, I was met with an error message. As a journalist, I knew that I couldn't just accept this as a dead end. I am unfamiliar with TSU's facilities, but it is important to me to speak with someone at the campus police department so that I can receive the necessary files in Excel form.
The FOI request proved to be successful. I was given the daily crime logs from 2018-2022. However, the data was presented in PDF form. Luckily, with the help of my classmates, I was able to scrape the data and paste it into an Excel file.
After successfully scraping the data, I was able to assess how "dirty" the data was. Unfortunately, there were a lot of spelling mistakes in the descriptions of the crimes/violations committed. I have been going through each year and correcting each mistake so that I can create accurate visualizations. I now understand how important it is to be diligent and thorough when analyzing the data provided. A small oversight can negatively impact your work.
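The corrections described above were made by hand. As one possible alternative, a sketch using Python's standard difflib module can suggest a correction by matching each description against a list of accepted spellings. The accepted list, the sample misspellings, and the 0.8 similarity cutoff are all assumptions for illustration.

```python
import difflib

# Hypothetical list of accepted offense descriptions.
CANONICAL = ["THEFT", "BURGLARY", "CRIMINAL MISCHIEF", "ASSAULT"]

def normalize_description(description, canonical=CANONICAL, cutoff=0.8):
    """Return the closest accepted spelling, or the cleaned input if nothing is close."""
    cleaned = description.strip().upper()
    matches = difflib.get_close_matches(cleaned, canonical, n=1, cutoff=cutoff)
    return matches[0] if matches else cleaned

# Invented misspellings for illustration.
raw = ["Thefft", "BURGLERY", "criminal mischeif", "Trespass"]
corrected = [normalize_description(d) for d in raw]
```

Automatic matching can merge descriptions that only look alike, so any suggested corrections should still be checked by eye before they feed a visualization.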


Safety on campus is important to any student at UH, but perhaps more so for those who live on campus. This project discloses the safest and most dangerous spots on campus, which crimes happen most often, where they happen, and who they happen to. It will go into detail on the different student housing options and how safe they are, crime rates for places on campus, and some facts based on the data I have received. It goes in depth on real data to give you all the information you need to know in a way that is easy to digest.
