2025 & 2026 Elections

  • Thread starter: nycfan
  • Replies: 785
  • Views: 26K
  • Politics 
There are social media posts that can be linked to individuals. While it's still an estimate, I think it's fair to say that people tweeting how Kamala is the devil were not voting for her.
So you want to take social media posts, which are small in sample relative to total actual voters, and are often anonymous or can very easily be faked… and try to bridge that to voter registration and turnout records in a meaningful way.

That sounds like bad data you got brewing there, but knock yourself out.
 
Data brokers do this all the time. It's a billion-dollar business. Why do you think Google gives away its browser, OS and email for free? It's the targeting. And the targeting knows who you are unless you are careful not to reveal it.

I'm not the data science grad student who published that piece. I don't know what he did or didn't do, or what data he did or didn't use. I'm just saying there are more tools out there than you think.
 
Data brokers TRY to do this all the time. If they were successful at tying voter data to social media at scale, we’d have much more accurate polling and election insights.

I’m very aware of the tools and how targeting works, and if you think individual social media posts can at this point be scraped and linked to voter registration data in a meaningful way, i.e. at scale, then no, they’re not yet as sophisticated as you think. The best you can likely do with that is some thin modeling, which is likely what this guy did.

And I’m certain he didn’t pay walled gardens like Google or Meta for the data needed to even attempt what you’re suggesting (which is exorbitantly priced for a grad student or even an Ivy League graduate program, especially in the current climate).

Back to my original point — this data is modeled and likely a flimsy model. That doesn’t mean it’s worthless, but take it just as he states it to be — an estimate.
 
I'm not well versed in this area so I won't argue the point, except to say that targeting tools would be more or less useless for polling, which requires a random sample.

I'm not saying you can target everyone, only that it could help what that guy is doing. And maybe I'm wrong about that last part.
 
Interesting, but keep in mind these are still modeled projections of some kind, since nobody knows how anyone voted other than through exit polls and other self-reporting, which are unscientific and small in sample. Who anyone actually voted for is not public record.

Demographics are available, and registration and turnout numbers are available publicly, but there’s no surefire way to link those to actual votes for this candidate or that candidate.

Looks like this guy’s a data science grad student at Harvard. I’m guessing there’s reasonable methodology behind it, but either way he is right to use the term “estimate,” especially since pollsters have been getting so much wrong lately.
Having looked at what the tool does -- I mean, it's pretty basic. Other than the demographics, all of the data is directly available. Precinct level vote totals are a matter of record. I thought he was drilling down further to the neighborhood level and that's where social media might help. I know there are data files at the neighborhood level but I don't know how accurate they are.

I suspect the model is pretty good. Assuming correlations between areas are relatively stable over time, basically the system is just solving a huge matrix of equations. There will be a set of precincts P, and a set of precinct correlations PC(n, m). No idea if n and m are comprehensive across all precincts or if trivial correlations are weeded out. Census data gives breakdowns on racial composition at a granular level. There's also precinct change over time, which would help determine deltas -- i.e. if a precinct goes from 50% white to 30% white, you can correlate that with different vote totals and again use correlations to fill in the gaps.
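
To make the "use demographics to fill in the gaps" idea concrete, here is a minimal sketch of the simplest version I know of, a two-group ecological regression. To be clear, this is my assumption about the general flavor of approach, not what the tool actually does. The function name and the precinct numbers are made up for illustration.

```python
def ecological_regression(white_share, dem_share):
    """Fit dem_share = intercept + slope * white_share by ordinary least squares.

    Hypothetical helper: each list has one entry per precinct.
    white_share comes from census-style composition data,
    dem_share from the public precinct-level vote totals.
    """
    n = len(white_share)
    mean_x = sum(white_share) / n
    mean_y = sum(dem_share) / n
    sxx = sum((x - mean_x) ** 2 for x in white_share)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(white_share, dem_share))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Extrapolate the fitted line to an all-nonwhite precinct (x = 0)
    # and an all-white precinct (x = 1) to estimate each group's support.
    return {
        "nonwhite_support": intercept,
        "white_support": intercept + slope,
    }


# Made-up precincts where white voters back the candidate at 30%
# and nonwhite voters at 80%, with no noise.
white_share = [0.2, 0.4, 0.6, 0.8]
dem_share = [0.8 - 0.5 * x for x in white_share]

estimates = ecological_regression(white_share, dem_share)
print(estimates)
```

With noise-free made-up data the fit recovers the group rates it was built from. Real precincts are nowhere near that clean, and the method assumes each group votes the same way in every precinct, which is exactly why the result is still an estimate.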

It's still a model, but I suspect it's pretty good. The Needle is at least pretty good.
 