Danielle Durán, M.S.
1 min read · Jan 8, 2021

Yes, no problem - this is both interesting + timely. I was thinking about your article more last night, and I wonder if you might actually want to cluster models by city. The reasoning is that the unit of analysis is somewhat obscured here. The shooting dataset is at the incident level, and incidents are nested within city and state (and nation, by extension, although that's an n=1). The demographic data are currently at the state level, but you could pull in city-specific demographics.

Because the demographic data are aggregated, the unit of analysis de facto becomes the state (or the city, if you pull in city demographics). The model then tries to find relationships between variables at the incident level, where relationships are not as clear as they could be because the levels of analysis are mixed.

What I'd recommend is actually modelling by city: then the unit of analysis is the city, and the question is what factors affect the number of incidents per city. The alternative would be to look up the neighborhood where victims of shootings resided or where shootings occurred, and pull in census tract-level demographics based on that location (demographics at the shooting location would probably be more informative, but that depends on having an address for where the incident occurred). THEN you have a model which truly uses the incident as the unit of analysis. So there are a couple of options here, depending on your goal! Like I said, great work and looking forward to the follow-up :)
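To make the city-level option a bit more concrete, here is a rough sketch of the reshaping step: collapse the incident rows into one row per city, then attach city-level demographics. Everything in it is an assumption for illustration - the field names, the placeholder cities, the made-up numbers, and the `build_city_table` helper are all hypothetical, not taken from your dataset.

```python
from collections import Counter

# Hypothetical incident-level records: one row per shooting.
incidents = [
    {"id": 1, "city": "City A", "state": "TX"},
    {"id": 2, "city": "City A", "state": "TX"},
    {"id": 3, "city": "City B", "state": "TX"},
    {"id": 4, "city": "City C", "state": "CO"},
    {"id": 5, "city": "City B", "state": "TX"},
    {"id": 6, "city": "City B", "state": "TX"},
]

# Hypothetical city-level demographics keyed by (city, state).
# The numbers are made up purely for illustration.
city_demographics = {
    ("City A", "TX"): {"population": 1_000_000, "median_income": 70_000},
    ("City B", "TX"): {"population": 2_000_000, "median_income": 55_000},
    ("City C", "CO"): {"population": 500_000, "median_income": 65_000},
    ("City D", "CO"): {"population": 100_000, "median_income": 80_000},
}


def build_city_table(incidents, demographics):
    """Return one row per city: incident count, rate, and demographics."""
    # Count incidents per (city, state) so same-named cities stay separate.
    counts = Counter((row["city"], row["state"]) for row in incidents)
    table = []
    # Iterate over the demographics so cities with zero incidents are kept.
    for (city, state), demo in sorted(demographics.items()):
        n = counts.get((city, state), 0)
        table.append({
            "city": city,
            "state": state,
            "incidents": n,
            "incidents_per_100k": 100_000 * n / demo["population"],
            **demo,
        })
    return table


city_table = build_city_table(incidents, city_demographics)
for row in city_table:
    print(row)
```

Each row of `city_table` is now a city, so the outcome (number of incidents) and the predictors (demographics) sit at the same level. A count model could then be fit on this table, with population as the exposure. Keeping the zero-incident cities in matters, otherwise the model only sees places where something happened.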

Written by Danielle Durán, M.S.

Statistician. Co-founder. ESG Advocate.
